Tuesday, May 31, 2005

Discussion about my semantic web comments is going on over at Danny's place.

Here's my response to some of these :

On Microformats: Yep, I was using the term loosely, for any kind of home-rolled XML (and maybe not even XML). So I don’t necessarily want to restrict it to “rel” tags.

Reiterating my emphasis :

And I want to go beyond a debate about upper-case vs. lower-case semantic web if all that means is whether the format is XML-RDF vs. some other representation which can probably be translated into RDF. This does seem to me to be a rather pointless argument about file-formats.

I want to emphasize the distinction I made in the original post because this appears to be the real heart of the argument, and something that’s genuinely theoretically interesting.

That is, the distinction between documents whose elements get their semantics primarily from the fact that they are *in* those documents, vs. documents whose elements get their semantics primarily from the fact that they are linked to an external ontology.

Let’s call these “document-semantic” and “ontology-semantic” approaches rather than lower-case and upper-case SemWeb.

Now there is a practical question about which strategy to prefer. Document-semantics is, in a sense, what we’ve always had. And part of the pitch for the SW was that this was wrong and that ontology-semantics was a “better” way. It was going to make it easier (and more likely) for information from different sources to be combined than ad-hoc custom scraping of document-semantic documents would.

Without that claim, the SemWeb seems to me to be nothing more than a file-format.

Now, the value of ontology-semantics really only kicks in when you want to meaningfully combine data from documents without knowing (or caring) what kind of document they’re in.
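To make that concrete, here is a minimal sketch of what "combining without caring about the document" might look like. It assumes each source has already been reduced to (subject, property, value) triples whose properties are shared URIs; the two sample sources, the example.org identifiers, and the `merge` helper are all made up for illustration, and only the two FOAF property URIs refer to a real vocabulary.

```python
from collections import defaultdict

# Shared property identifiers from the FOAF vocabulary.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_MBOX = "http://xmlns.com/foaf/0.1/mbox"

# Hypothetical triples pulled from two unrelated kinds of document.
blog_triples = [
    ("http://example.org/people/alice", FOAF_NAME, "Alice"),
]
photo_site_triples = [
    ("http://example.org/people/alice", FOAF_MBOX, "mailto:alice@example.org"),
]


def merge(*sources):
    """Combine triples from any number of sources, ignoring where they came from."""
    merged = defaultdict(dict)
    for triples in sources:
        for subject, prop, value in triples:
            # Values are grouped by subject and property only; the kind of
            # document that supplied them plays no part in the result.
            merged[subject].setdefault(prop, set()).add(value)
    return merged


combined = merge(blog_triples, photo_site_triples)
```

The point of the sketch is that `merge` never needs a per-document parser. With document-semantics, the equivalent step would be one custom scraper per document type, each carrying its own knowledge of what the elements mean.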

This combination of elements from different documents is what Shirky called “syllogism” and said wouldn’t be very useful in practice - because the document is the more important context. Half the alleged rebuttals are just trying to say that Shirky is wrong because that isn’t the aim. Nevertheless, I can’t see any other reason to prefer ontology-semantics over document-semantics.

On dependency on centralized servers (and other substrates) :

Shelley has an interesting point about the “centralization” of tagging. I agree about del.icio.us and flickr, but I’m not sure it goes for Technorati tags. I think these are embedded in people’s blogs, and Technorati is just a “scutter”. If it disappeared tomorrow, and someone found it useful to write a new scutter, the data would still be there.
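As a rough illustration of how little a replacement scutter would need, here is a sketch that pulls tags out of a blog post's HTML. It assumes the tags are marked up as links carrying rel="tag", with the tag name taken from the last path segment of the link; the `RelTagScutter` class, the `extract_tags` helper, and the sample post are hypothetical, and a real crawler would also need to fetch pages and follow links.

```python
from html.parser import HTMLParser


class RelTagScutter(HTMLParser):
    """Collects tag names from <a rel="tag" href="..."> links in a page."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rel_values = (attributes.get("rel") or "").split()
        href = attributes.get("href")
        if "tag" in rel_values and href:
            # Assumption: the tag name is the last path segment of the link.
            self.tags.append(href.rstrip("/").rsplit("/", 1)[-1])


def extract_tags(html):
    """Return the tag names found in one HTML document."""
    scutter = RelTagScutter()
    scutter.feed(html)
    return scutter.tags


# Hypothetical blog post fragment: one tag link and one ordinary link.
sample_post = (
    '<p>Thoughts on the <a href="http://example.org/tag/semweb" rel="tag">semweb</a>, '
    'written by <a href="http://example.org/about">me</a>.</p>'
)
tags = extract_tags(sample_post)
```

Nothing here depends on any particular aggregator being alive: the tag data sits in the post itself, and any program that knows the convention can read it.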

Also, I’m not sure this has any semantic significance. You might as well argue that English has a lousy semantics because if all English speakers died, the words wouldn’t have meaning. But all symbol systems need to be physically instantiated. That’s not an argument against their meaningfulness.

Shelley also says : “Technology is just moving parts and a bit of syntax.”

I think most theories of language assume meaning is embedded in practice, and by analogy a semantic web’s meaning would be embedded in the practice of the software that uses it. Nothing has semantics without the moving parts.
