October 12, 2006

Two new things from Europe

I was going to post about my latest discovery in the "12 most powerful words" saga -- an Akkadian tablet from 1200 BC, which attributes the list to an unspecified Sumerian haruspex. But I ran into problems embedding the cuneiform font, so instead, I'll interrupt our regularly-scheduled linguistic mythbusting to point you to two new things from the far side of the Atlantic.

The first is a cute web app: Wortschatz. The authors (from the University of Leipzig) have processed and indexed text corpora in 17 languages in various ways, and for each word, their web application will show you (quoting their help page):

  • Frequency information (both absolute and relative to the most frequent word in the corpus)
  • Sample sentences containing this word.
  • Cooccurrences: Other words, occurring significantly often together with the given word; either as immediate left or right neighbor, or within the same sentence. They are ordered by significance with respect to the log likelihood measure.
  • The strongest cooccurrences (at sentence level) are visualized as a semantic map.
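For readers curious what "ordered by significance with respect to the log likelihood measure" might look like in practice, here is a minimal sketch. It assumes Dunning's G² statistic computed over sentence-level counts; the help page quoted above doesn't say which variant Wortschatz actually uses, so this may well differ from their implementation. The function names (`g2`, `cooccurrences`) and the toy sentences are made up for illustration.

```python
import math
from collections import Counter


def g2(k_ab, k_a, k_b, n):
    """Log-likelihood ratio (G^2) for a 2x2 table of sentence counts.

    k_ab: sentences containing both words
    k_a, k_b: sentences containing each word
    n: total number of sentences
    (Assumed formulation; not taken from the Wortschatz documentation.)
    """
    observed = [
        k_ab,                  # both words
        k_a - k_ab,            # first word only
        k_b - k_ab,            # second word only
        n - k_a - k_b + k_ab,  # neither word
    ]
    # Counts expected if the two words occurred independently.
    expected = [
        k_a * k_b / n,
        k_a * (n - k_b) / n,
        (n - k_a) * k_b / n,
        (n - k_a) * (n - k_b) / n,
    ]
    # Cells with an observed count of zero contribute nothing to the sum.
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


def cooccurrences(sentences, word):
    """Rank words that share sentences with `word`, most significant first."""
    n = len(sentences)
    token_sets = [set(s.lower().split()) for s in sentences]
    doc_freq = Counter(w for toks in token_sets for w in toks)
    joint = Counter(
        w for toks in token_sets if word in toks for w in toks if w != word
    )
    scored = []
    for other, k_ab in joint.items():
        # Keep only pairs seen together more often than chance predicts.
        if k_ab * n > doc_freq[word] * doc_freq[other]:
            scored.append((other, g2(k_ab, doc_freq[word], doc_freq[other], n)))
    return sorted(scored, key=lambda pair: -pair[1])


if __name__ == "__main__":
    toy = ["strong tea", "strong tea", "weak coffee", "weak coffee"]
    print(cooccurrences(toy, "tea"))
```

Whitespace tokenization and sentence-level windows are simplifications here; the neighbor-based cooccurrences mentioned in the list would need counts over adjacent word pairs instead.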

You can download the (relational) databases and the browser used to access them, as well as a recent paper discussing the application: Quasthoff, U.; M. Richter; C. Biemann, "Corpus Portal for Search in Monolingual Corpora", Proceedings of the Fifth International Conference on Language Resources and Evaluation, LREC 2006, Genoa, pp. 1799-1802.

The second is a new book on political discourse. But this time it's not about Republicans as strict parents, or Democrats as latte-sipping elitists; it's about French presidential candidates as -- well, you'll have to judge for yourself. The book, by computational-linguist-blogger Jean Véronis and his colleague Louis-Jean Calvet, is "Combat Pour l'Elysée: Paroles des prétendants".

The publisher's blurb (translated from the French):

Politics has been, since the dawn of time, a matter of playing with words. Plato was already castigating the sophists, those ancient "communications consultants" who sold their linguistic services to the politicians of the day, teaching them how to bend the Truth to their ends. Things have hardly changed. In this pre-election period, when potential candidates are trying to convince and seduce us, Louis-Jean Calvet and Jean Véronis decide to take them at their word and dissect their speech. Political doublespeak, sound bites, witticisms and slips of the tongue, Web rumors: all are systematically put through the sieve of their linguistic and computational analyses. A new way of understanding politics, this book, at once serious and hilarious, offers a cruel and original portrait of the various contenders for the presidential election.

And the authors have a book blog.

Posted by Mark Liberman at October 12, 2006 07:14 AM