XARK 3.0

  • Xark began as a group blog in June 2005 but continues today as founder Dan Conover's primary blog-home. Posts by longtime Xark authors Janet Edens and John Sloop may also appear alongside Dan's here from time to time, depending on whatever.

Xark media

  • ALIENS! SEX! MORE ALIENS! AND DUBYA, TOO! Handcrafted, xarky science fiction, lovingly typeset for your home printer!





Thursday, June 30, 2011




If the semantic data is generated, it will still need to be sold somehow. Otherwise it will be a repeat of the huge human effort to put information on the WWW from 1992 to 1998, which Google could then index and profit from fabulously. But those who organized all the links that PageRank is based on got nothing.

Now, if you start charging for data, who would be the paying customer for it? And how will you charge for the data? What is going to be the first killer application that will generate enough value to get everyone onboard?


Hi Aleks.

First, the machine-readable data you collect and archive won't all be published for free in machine-readable form. That's one difference between what I imagine and the general concept of a Semantic Web.

Second, I suggest that what gets embedded in the hypertext of semantically enhanced stories isn't the RDF triple that describes an individual statement per se, but a URI to a definition in a declared directory. Once you've stored public information in a directory, you have separate sources for human readable information (stories) and machine readable information (data sets). So when a bot shows up at my directory of machine-readable data, I'll have a much better set of options for how I'll respond to that bot than Web 1.0 webmasters did.

Third, and most importantly, who pays for data? Anyone who saves money or makes money by using data. So there will be data sets that are valuable to Realtors, government agencies, insurance companies, attorneys, bike retailers, venture capitalists, restaurant publicists, etc.

Not all journalism that produces data will produce salable data sets. And many of today's standard journalistic practices won't produce complete and reliable data sets by themselves. I suspect the first companies to attempt this will go through some growing pains as they learn how to efficiently and profitably collect and package data for specific markets.

I suspect that this will lead to some job titles that don't exist today.

But if we're giving away the stories and selling the data structures, we'll be creating value in products that cannot be so easily copied and remixed. We'll be producing unique journalistic assets that other companies can't plagiarize. And finally, we'll be producing a new kind of trust between information assemblers and information users.

Is there a killer app in all that? I suspect there are several. The first is a new kind of browser, based on APIs that are licensed by the owners of directories of meaning.
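To make the second point above a little more concrete, here is a minimal Python sketch of the idea: the story carries only a URI into a directory, and the directory owner decides what each visitor gets back. Everything named here is an assumption made for illustration (the example.org URIs, the `Definition` record, the `link_story` and `respond` helpers, the client names); none of it describes an existing product or standard.

```python
from dataclasses import dataclass, field

# Placeholder base address for a hypothetical "directory of meaning".
BASE = "https://example.org/directory/"


@dataclass(frozen=True)
class Definition:
    """One directory entry (hypothetical structure)."""
    uri: str
    label: str                                 # human-readable name, shown to anyone
    data: dict = field(default_factory=dict)   # machine-readable facts, the salable part


# A one-entry directory, invented for this example.
DIRECTORY = {
    BASE + "place/city-hall": Definition(
        uri=BASE + "place/city-hall",
        label="City Hall",
        data={"type": "Place", "address": "1 Example Street"},
    ),
}


def link_story(text: str, mentions: dict[str, str]) -> str:
    """Wrap each mentioned phrase in a link to its directory URI.

    Naive string replacement: the story embeds a pointer to the
    definition, not the underlying data itself.
    """
    for phrase, uri in mentions.items():
        text = text.replace(phrase, f'<a href="{uri}">{phrase}</a>')
    return text


def respond(uri: str, client: str, licensed: set[str]) -> dict:
    """Decide what a visiting client gets for a directory URI.

    Unlicensed clients see only the label; licensed clients also
    receive the machine-readable data.
    """
    definition = DIRECTORY.get(uri)
    if definition is None:
        return {"found": False}
    response = {"found": True, "uri": definition.uri, "label": definition.label}
    if client in licensed:
        response["data"] = dict(definition.data)
    return response
```

Under these assumptions, `link_story("The council met at City Hall.", {"City Hall": BASE + "place/city-hall"})` produces a free, human-readable story with a link in it, while `respond(...)` hands the address and type only to a client on the licensed list. The split between the two functions is the point: stories can circulate freely while access to the data set stays under the directory owner's control.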

Jay Rosen

"Here's a better question: Why are industry leaders still so obsessed with trivial paywall news, and so seemingly disinterested in everything else?"

Because in the guise of preserving the newsroom they are actually trying to preserve the relevance of the knowledge they've accumulated about how to run a newspaper. The paywall stands the best chance of doing that, and so it is framed as the best chance of preserving the newsroom. The other possible courses of action have at least as good a chance of working, or a better one, but they all require industry leaders to acquire new knowledge. Or to put it another way: to lose mastery before they can gain a future. They're trying to avoid that. Paywalls whisper: you can!
