Writing Semantic Markup

Jared Spool

September 9th, 2005

In case you missed it, Josh and Richard MacManus have another installment of their Web 2.0 column out on Digital Web:

Let’s take a closer look. Consider the following text:

Web 2.0 Design: Bootstrapping the Social Web
By Richard MacManus & Joshua Porter

Humans can instantly recognize this as a title and authors of a work, in this case a column here at Digital Web Magazine. We know this because of past experience. We’ve seen similar things before. It is apparent that the first line is a title and the second line is two authors. Given this information, humans are able to act on it in a meaningful way. For instance, you could answer someone if they asked you “Who wrote that?”

Machines, with their rigid information processing capabilities, need everything spelled out for them. To be able to do something useful with this title and byline, a machine would need to be able to parse it correctly. It would need to know that the number (2.0) in the first line is part of the title and shouldn’t be interpreted as a numeric value, that the spaces around it separate words from each other, and that the second line is made up of two names and not one. In other words, a machine would need to be able to do algorithmically what we humans do almost without thinking.

This would work amazingly well, and is very possible even today, except that the syntax of titles and bylines changes from person to person and from usage to usage. What if I changed my first name to just be the initial “J”? Or misspelled it? Humans would still understand the endless permutations. Machines, though, unless programmed for every single possible permutation, cannot reliably make the same decisions that we can. The human ability to adapt and interpret is special.
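To make the brittleness concrete, here is a minimal sketch in Python of the kind of parser a machine might use on plain text. The function name `parse_title_and_byline` and its rules (a two-line block, a byline that starts with "By ", authors joined by " & ") are assumptions made for illustration only; they do not come from the column.

```python
import re


def parse_title_and_byline(text):
    """Naively split a two-line title/byline block into a title and author names.

    Hypothetical rules, chosen only for illustration:
    - the block has exactly two non-empty lines
    - the second line starts with "By "
    - authors are separated by " & "
    Returns None when the text does not fit those rules.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if len(lines) != 2:
        return None
    title, byline = lines
    match = re.fullmatch(r"By (.+)", byline)
    if match is None:
        return None
    authors = [name.strip() for name in match.group(1).split(" & ")]
    # The title is kept as a string, so "2.0" is never read as a number.
    return {"title": title, "authors": authors}


original = "Web 2.0 Design: Bootstrapping the Social Web\nBy Richard MacManus & Joshua Porter"
variant_and = "Web 2.0 Design: Bootstrapping the Social Web\nBy Richard MacManus and J. Porter"
variant_case = "Web 2.0 Design: Bootstrapping the Social Web\nby R. MacManus & J. Porter"

print(parse_title_and_byline(original))      # two authors found
print(parse_title_and_byline(variant_and))   # one "author": "Richard MacManus and J. Porter"
print(parse_title_and_byline(variant_case))  # None: lowercase "by" does not match
```

The first call handles the exact syntax the parser was written for. The second and third are variations a human reads without effort, yet one silently merges the two names into a single author and the other is rejected outright. Each new permutation would need its own rule, which is the problem the excerpt describes.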
