April 20, 2007

The postmodern web

This post is brought to you courtesy of the free wifi at Phoenix Sky Harbor International Airport, where I'm waiting for my flight back to Philadelphia after the end of the NSF/JISC Repositories Workshop. The wifi here at the airport is not only free, it's also pretty fast -- and both dimensions are a stark contrast to the wifi at the Hyatt Regency where the workshop was held, which cost $12.27 a day and was painfully slooow -- though not as totally molassified as the wifi at the Westin hotel at SFO, where the DARPA GALE P.I. meeting was held back in March. Anyhow, props to PHX.

One of the more interesting features of the Repositories workshop was a paper by Malcolm Hyman and Jürgen Renn, "From Research Challenges of the Humanities to the Epistemic Web". Perhaps as a reminder of the cultural norms of the past, Jürgen chose to pass the paper out in actual paper form, and then to read all 14 single-spaced pages out loud from beginning to end. Round about page five, a passage caught my attention:

If the Web only knew what it is talking about, it would understand itself much better -- but who is going to teach it? Evidently, the potential of the Web as a universal representation of human knowledge and communication would be greatly enhanced if its sites could "speak" to each other in the sense of recognizing if two figures refer to the same date whatever the format or if two texts refer, say, to soccer whether the word occurs or not. The Web's semiotic connectivity would thus be transformed into a semantic connectivity. One of the strategies of adding meaning to data is using metadata establishing the data's significance, for instance by referring to ontologies offering common frames of semantic reference. Historically, attempts of creating such a second world of meaning with a claim to universal validity are familiar from the Catholic Church and the Soviet Union. They typically involve a great deal of central planning and are characterized by the rule of technocrats as well as the incapacity to cope with developmental dynamics. Providing meaning to the Web with the help of metadata created by expert groups and committees ultimately amounts to an Orwellian vision of the Web in which adopting Newspeak is obligatory for being part of the accepted community. Natural language works differently in that meaning emerges rather than being predefined -- as dangerous as that may be for any classificatory order. It would thus be better to learn from natural language: One of the reasons for the power of natural language in representing and furthering human thinking is its inbuilt reflexivity: natural language is its own metalanguage. The separation of language and metalanguage makes it to a certain degree possible to fix the semantics clarifying communication and avoiding paradoxes but severely limits the potential of creating new meaning.
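For readers who haven't been following the Semantic Web wars, the "metadata referring to ontologies" strategy that Malcolm and Jürgen have in their sights looks roughly like the following. This is my own made-up sketch in Python, not anything from their paper: the predicate URIs are invented, and only the DBpedia identifier is the kind of thing the ontology crowd actually publishes.

    # A hypothetical sketch of the "metadata referring to ontologies" approach:
    # each statement is a (subject, predicate, object) triple whose terms come
    # from shared vocabularies, so two pages using different surface forms end
    # up pointing at the same identifiers. The example.org URIs are invented.

    from datetime import date

    TOPIC = "http://example.org/ontology#topic"          # invented ontology term
    PUBLISHED = "http://example.org/ontology#published"  # invented ontology term

    page_a = [
        ("http://example.org/pageA", TOPIC,
         "http://dbpedia.org/resource/Association_football"),
        ("http://example.org/pageA", PUBLISHED, date(2007, 4, 20)),  # "20 April 2007"
    ]
    page_b = [
        ("http://example.org/pageB", TOPIC,
         "http://dbpedia.org/resource/Association_football"),
        ("http://example.org/pageB", PUBLISHED, date(2007, 4, 20)),  # "04/20/07"
    ]

    # Because both pages commit to the same vocabulary, "same topic" and "same
    # date, whatever the format" become equality tests on identifiers rather
    # than guesses about strings.
    print(page_a[0][2] == page_b[0][2], page_a[1][2] == page_b[1][2])  # True True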

But wait, you're thinking, ontologies are so 2001 -- the real solution is folksonomies! Well, Malcolm and Jürgen are right there with you:

Web 2.0 is the protestant vision of the Semantic Web: where central authorities have failed in mediating between the real world of data and the transcendental world of meaning, smaller, self-organized groups feel that they are called upon to open direct access to this transcendental world in terms of their own interpretations of the Great Hypertext. The traditional separation between providers/priests and clients/laymen is thus modified in favor of a new social network in which meaning is actually created bottom up. The unrealistic idea of taxonomies inaugurated by top-down measures is being replaced by the more feasible enterprise of "folksonomies" spread by special interest groups. As their scope remains, however, rather limited and the separation between data and metadata essentially unchallenged, the chances for developing such a social network into a knowledge network fulling [sic] coping with the real world of data are slim.
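And here, for contrast, is the Web 2.0 alternative -- again my own illustrative sketch, with del.icio.us-style free tagging standing in for the "special interest groups":

    # The folksonomy counterpart, equally hypothetical: no shared ontology, just
    # free-form tags assigned bottom-up by whoever bookmarks the page. Meaning
    # "emerges" from patterns of co-occurrence rather than from a predefined
    # vocabulary -- which is also why its scope stays local.

    from collections import Counter

    bookmarks = {
        "http://example.org/pageA": ["soccer", "worldcup", "sport"],
        "http://example.org/pageB": ["football", "calcio", "sport"],
        "http://example.org/pageC": ["knitting", "howto"],
    }

    # Nothing here *declares* that "soccer" and "football" mean the same thing;
    # the best a folksonomy can do is notice that they keep turning up with the
    # same neighbors (here, "sport").
    tag_counts = Counter(tag for tags in bookmarks.values() for tag in tags)
    print(tag_counts.most_common(3))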

All this theology reminds me, again, of the old joke about a conversation between a native of Belfast and an American:

Belfast: Are you a Protestant or a Catholic?
U.S.: Well, neither one, actually. As it happens, I'm a Jew.
Belfast: All right, but are you a Protestant Jew or a Catholic Jew?

Ecumenical fairness also raises obvious questions about other approaches to finding meaning in web data -- the Islamic view, the Hindu view, the Buddhist view, and so on. In any case, I hope that their paper finds its way onto the web -- even the old-fashioned (pagan?) html web without any sort of cultic overlay -- so that you can learn about the creed that Malcolm and Jürgen are evangelizing, the "Epistemic Web", which they (perhaps optimistically?) call "Web 3.0". I'd explain it to you, but I'd need further theological study or a special revelation to understand it fully myself. Here are the bullet points from the two key sections of the outline that Jürgen put on the workshop web site:

Creating a universe of knowledge on the Web that parallels human knowledge
Turning (private) reading into the (public) creation of information
Allowing all data to be metadata and all documents to be windows into the universe of knowledge

Moving from servers and browsers to interagents that allow people to interact with information
Replacing browsing and searching with projecting and federating information
Enabling automated federation through an extensible service architecture
Extending current hypertext architecture with granular addressing and enriched links

The trouble is, I believe that I understand what they're saying, but I don't think I know what it means.
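For what it's worth, here is my best guess at what "granular addressing and enriched links" might amount to in practice -- pure speculation on my part, with every name and offset below invented:

    # A purely speculative sketch: a link that addresses a character range inside
    # a document and carries a typed relation, instead of a bare pointer to a
    # whole page.

    enriched_link = {
        "source": "http://example.org/my-commentary",  # the annotating document
        "target": "http://example.org/some-paper",     # the document addressed
        "start": 10234,                                 # character offset where the span begins
        "end": 10710,                                   # character offset where it ends
        "relation": "comments-on",                      # what the link claims
    }

    def project(link, fetch):
        """'Project' just the addressed span, given some way to fetch the target's text."""
        text = fetch(link["target"])
        return text[link["start"]:link["end"]]

    # e.g. project(enriched_link, my_fetch_function), where my_fetch_function
    # maps a URL to the text of the document it names.

If that is anywhere near what they mean, the unit of reference stops being the page and becomes the span, and the link itself carries metadata -- which at least gives "allowing all data to be metadata" something concrete to attach to.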

Posted by Mark Liberman at April 20, 2007 10:49 AM