November 06, 2004

Language Quiz II: The answer

See below. (Patrick Hall pointed out that for people who don't get to the quiz on the first day, it's unfair to put the answer on the blog's index page. So don't read the rest of this post unless you want the answer.)

The language is Somali.

The text is the title and the first two sentences from this page (in standard Somali orthography):

Colaadda gurigu waa mid waxyeelleynaysa qoyska oo dhan.
Haddaad u malaynaysid in dumarku badanaaba halis u yihiin colaad am waxyeelo uga imaaneysa banaanka ka baxsan ammaanka guriga, waad ku qaldantahay.
Waxay u badan tahay in dumarka lagu weeraro guryahooda ayna weeraraan dadka ay la nool yihiin.

whose English version is here:

Domestic violence hurts the whole family.
If you think women are most at risk from violence out on the streets away from the safety of home, you’d be mistaken.
Women are more likely to be attacked in their own houses by people they live with.

The rest of the Somali audio version can be found here in mp3 form.

The European relevance is to the recent killing of Theo van Gogh, in retaliation for his film on Muslim treatment of women, co-produced with Dutch MP Ayaan Hirsi Ali, who is Somali by origin. The five-page note tacked to van Gogh's body with a knife was addressed to her, though the killers were North Africans rather than Somalis.

In 1998, I taught a field methods course in which we worked on Somali, and I was somewhat taken aback by one of the proverbs that we learned:

 Naag   ha             kaga     jirto       guri  ama  god  
 woman [optative AUX] in-in  stay+FEM+OPT   home  or   grave

"A woman's place is in the home or in the grave".

I was never quite sure how to square that one with another Somali proverb: Kunka koodi kownaka guurso; "A thousand assignations, one marriage."

Interesting grammatical note: the wordform kaga is a preposition cluster, in this case (I believe) equivalent to ku+ku, that is, two copies of the preposition ku meaning "in, into, on, at, with (by means of)". As John Saeed explains in his Somali Reference Grammar (p. 206, 2nd edition): "Where there are several locative prepositions in a sentence they all occur before the verb and merge into a cluster, with some accompanying sound changes". More on this (from the same 1998 field methods course) is here.

I'll post something more about Somali phonology, phonetics and orthography at some point -- for now I should just say that the letter "c" is used for a voiced pharyngeal "fricative" (where the scare quotes mean that it is often frictionless).

[Update: several people got this one -- often by interesting techniques.

For example, Jonathan Mayhew at Bemsha Swing used broad phonetic transcription and Google -- as he wrote by email:

Olad de guruga, o a mitwa fui elenis´ hoiska odan.
Hadad uma laysit, in dumarka padanaava, haleese ulaheen ...

Is it Somali? A google search for the word "ulaheen" brings up texts in this language. Also, Somali has tones, double vowels, glottal stops, and complex diphthongs--all of which I imagine I'm hearing here.

I'm probably way off. It's fun to guess anyway. Mitwa could be a Hindi word, Elenisa the first name of a Brazilian woman.

This was an example of true poetic inspiration, since the actual sequence that Jonathan transcribes as "haleese ulaheen" is "halis u yihiin" ("they are most in danger") in standard Somali orthography, and in IPA roughly /halis u jihi:n/. The fact that "ulaheen" turns up pages in Somali is a happy accident -- it's a completely different word, in which the "ee" is pronounced more like French é.

Chris Waigl got the right answer too, by more conventionally linguistic means, and documented her efforts in interesting detail here.

Stefano Taschini, who contributed the audio for the first quiz, emailed without explanation that

My very wild guess is Somali.

and of course his guess was absolutely right. I'm impressed.]

[Update #2: Sauvage Noble has an excellent transcription and much interesting discussion, but was not able to come up with an ID for the language.]

[Update #3: Stefano Taschini emailed his method:

First, I used the "Amazing Slow Downer" to listen to the audio clips at half speed. What struck me is the retroflexed 'd' at the end of the first clip. Since that sound is often transcribed in the Latin alphabet as 'dh' when occurring in Hindi, I thought to look up the word 'odhan' on Google, and that returned a number of pages in Somali. What I could find about that language afterwards seemed to be consistent with the clips.

As with Jonathan Mayhew's method, this one involved a happy accident as one step: "qoyska oo dhan" means "the whole family" -- literally "family-the REL complete" = "the family, all of it"; the word "odhan" is the verb "to say", which is completely unconnected, as far as I know, and doesn't occur in that phrase at all...

The key thing, though, was that once the hypothesis of Somali was on the table, Stefano was able to find evidence by Googling texts for the identity of the language!]


Posted by Mark Liberman at November 6, 2004 10:53 PM