Michael K. Bergman has written a very useful technological history of the web, which also presents a thoroughly argued vision of its future. If, like me, you think the vision of a fully semantic web is premature, then his vision of an intermediary state called the Structured Web will resonate. It’s a more realistic, but still very challenging, vision of the coming Web 3.0.
He defines it in the following way:
“The structured Web is object-level data within Internet documents and databases that can be extracted, converted from available forms, represented in standard ways, shared, re-purposed, combined, viewed, analyzed and qualified without respect to originating form or provenance.
The structured Web is more in that it pertains to any data formalism in use on the Web and includes the notion of extracting structure from uncharacterized content, by far the largest repository of potentially useful information on the Web. Yet the structured Web is also less because its ambition is solely to get that data into an interoperable framework and to forgo the full objectives of the ‘Semantic Web.’ In that regard, my concept of the structured Web is perhaps closest to the idea of Linked Data, though with less insistence on “correct” RDF and with specific attention to structure extraction from uncharacterized content.”
He then argues why it is an appropriate intermediate goal:
“If nothing else, the reality of the past 15 years shows us that the Web is a “dirty,” chaotic place. If HTML coding can be screwed up, it will. If loopholes in standards and protocols exist, they will be exploited. If there is ambiguity, all interpretations become possible, with many passionately held. Innovation and unintended uses occur everywhere.
This should not be surprising, and experienced Web designers, scientists and technologists should all know this by now. There can be no disconnect between workable standards and approaches and actual use in the “wild.” Nuanced arguments over the subtleties of standards and approaches are bound to fail. Robustness, simplicity and forgiveness must take precedence over elegance and theoretical completeness.
While there has been obvious growth in the sophistication of Web sites and the underlying technologies that support them, we see continued use of obsolete approaches that clearly should have been abandoned long ago (such as Web-safe colors, small displays, older browser versions, Web pages parked on some servers that have not been modified or looked at by their original authors in a decade, etc.). We also see slow uptake for clearly “better” new approaches. And we also sometimes see explosive uptake of approaches and ideas that seemingly come out of nowhere.
We also see that those approaches that enjoy the greatest success — blogging, tagging, microformats, RSS, widgets, for example, come most recently to mind — are those that the “citizen” user can easily and readily embrace. HTML was pretty foreign at first, but now millions of users modify their own code. Millions of users of various CMS systems and Firefox have learned how to install plug-ins and add-ins and modify CSS themes and use administration consoles.
So, my observation and argument is not that we must always choose what is mindless and unchallenging. But my argument is that we must accept real-world diversity and seek simplicity, robustness and clarity for what is new.
After nearly a decade of standards work, the basis for beginning the transition to the semantic Web is in place. But that vision itself sometimes appears too demanding, too intimidating. The vision at times appears all too unreachable.
Of course, this perception is wrong. Measured over many years, perhaps some decades, the vision of the semantic Web is reachable. Much remains to be worked on and understood regarding this vision in terms of mediating and resolving semantic heterogeneities, capturing and expressing world views through formal ontologies, making inferences between these views, and establishing trust and authoritativeness. And those challenges do not yet address the even more-exciting prospects of intelligent and autonomous agents.
Rather, the rationale for the structured Web is to tone down the vision, stay with the here and now, focus on what is achievable today. And what is achievable today is very great.
Why This Series on the ‘Structured Web’?
Though version numbers for the Web are silly, with ‘Web 3.0’ for the semantic Web possibly being the silliest of all, such attempts do speak to the need for providing handles and language for capturing the dynamic change, diversity and complexity of the Web.
Today, right now, and all around us, a fundamental transition is taking place in the Web from a document-centric to a data-centric environment. A confluence of standards, advocacies, and previous trends are fueling this transition. Since the practical building blocks already exist, we will see this structured Web unfold before us at amazing speed.
The concept of the structured Web is thus narrower and less ambitious in scope than the ‘Semantic Web.’ The structured Web is merely a transitional step on the journey to the vision of the semantic Web, albeit one that can be fully realized today with current technologies and current understandings.
The purpose of this new series is thus to give prominence to this transition and to highlight the pragmatic, practical building blocks available to contribute to this transition. By somewhat shifting boundary definitions, the idea of the structured Web also aims to give more prominence to the importance of usability and structure extraction from semi-structured and unstructured content. These, too, are exciting areas with much potential.”
This is an altogether rich vision of the history and future of the web, which I recommend reading in full.