So you're on an ocean liner and it sinks. Step No. 1 is: Tread water. Step No. 2: Grab the first floating thing that happens by.
That's where the newspaper industry is located today -- desperately grabbing at whatever debris is available, looking for one thing (or several smaller things) with sufficient buoyancy to support its ponderous, monopoly-bloated weight. And there's nothing wrong with that. When you're drowning, stop drowning first and THEN think about how to get to dry land.
But clinging to wreckage isn't a plan. It isn't even survival. And sadly, most of the people writing about this tremendous change simply can't imagine any alternative to grabbing some still-floating piece of the original ocean liner and hanging on like grim death. We're basically squabbling over which wreckage is the best wreckage (pay-to-read news, with or without a rational argument in its favor, is the current flavor of the month).
I've been writing about the inevitability of this change for some time, and I'm now officially fed up with the daily round of nostalgic, whiny defeatism. You know the tune:

"Nothing lasts forever. I grew up in the era of tinny AM radios and 45 rpm records. I've worked for an afternoon paper that went under, the scrappy Washington Star. Maybe serious journalism will reinvent itself in new and unexpected forms. But if everything goes electronic, I'll always miss the feel of newsprint."
Oh, please. Is that all that's left? Really? Some intramural competition to see which print pundit can write the most moving elegy to a self-mythologized press corps? Makes me want to shake them and shout "SNAP OUT OF IT, MAN!"
The path to an abundant and meaningful future isn't backwards or sideways -- it's ahead, into the new. Howard Kurtz was right that "Lack of Vision Is To Blame for Newspaper Woes," but that deficit isn't just historical -- it's ongoing. So in case you've missed it, here's my best candidate for a hopeful future for professional journalism:
OK, That Was A Tease
Before I pitch my idea, I want to make sure you get the context first, and it begins with Kurzweil's Law:
- Evolution applies positive feedback in that the more capable methods resulting from one stage of evolutionary progress are used to create the next stage. Each epoch of evolution has progressed more rapidly by building on the products of the previous stage.
- Evolution works through indirection: evolution created humans, humans created technology, humans are now working with increasingly advanced technology to create new generations of technology. As a result, the rate of progress of an evolutionary process increases exponentially over time.
- Over time, the "order" of the information embedded in the evolutionary process (i.e., the measure of how well the information fits a purpose, which in evolution is survival) increases.
Restated? Things speed up. Exponentially (that link is a video, by the way, and I recommend you grok its message, if not memorize its details, before you publish another thought about the future of media).
Got a problem? Be human: Build a tool
Also included in Kurzweil's Law is an important thought about information. There is not only more of it being generated, there's also more signal amidst the static.
That's counter-intuitive, since the first thing you encounter in the networked world is the incomprehensible level of static that's now available to anyone with an ISP. Consequently, print-centric writers are forever talking about how much garbage is available in the blogosphere (thereby demonstrating an astounding ability to miss the point) as a rhetorical prelude to asserting mass-media journalists' value as information processors.
That proposition was largely true in the 20th century, when most information was analog. Reporters told you what a thing was like. Editors decided what things deserved description. It was a highly profitable system with some excellent features, but it was a business based on scarcity and the monopolies of one-way communications channels.
Those limited channels have been superseded by the Web, which is why the pace of information creation is now increasing exponentially. In the language of engineering, our traditional mass-media communications system is now failing because it doesn't scale to the size of the new world it's trying to describe.
This acceleration isn't likely to be reversed (absent a global catastrophe, of course), which is why pay-to-read plans based on creating artificial scarcity are doomed. The new information environment is one of clutter, not scarcity, and you don't deal with clutter -- not in your home, not on the Web -- by ignoring it.
The important insight here? Once you build tools that extract meaningful order from clutter, you haven't just reduced clutter -- you've created something new and immensely valuable.
I propose that the future of the professional press rests upon building and deploying that immensely valuable thing.
From documents to data structures
As I wrote last month, one key to the future is to Own Your Data. This isn't copyright advice: What I'm really saying is we have to begin learning how to add value to the information we collect, and then put that information into a thoughtful structure to retain and expand that value.
I know that idea doesn't instantly make sense to most people, so here's an example:
The old way:
Dan the reporter covers a house fire in 2005. He gives the street address, the date and time, who was victimized, who put it out, how extensive the fire was and what investigators think might have caused it. He files the story, sits with an editor as it's reviewed, then goes home. Later, he takes a phone call from another editor. This editor wants to know the value of the property damaged in the fire, but nobody has done that estimate yet, so the editor adds a statement to that effect. The story is published and stored in an electronic archive, where it is searchable by keyword.
The new way:
Dan the reporter covers a house fire in 2010. In addition to a street address, he records a six-digit grid coordinate that isn't intended for publication. His word-processing program captures the date and time he writes in his story and converts it to a Zulu time signature, which is also appended to the file.
As he records the names of the victimized and the departments involved in putting out the fire, he highlights each first reference for computer comparison. If the proper name he highlights has never been mentioned by the organization, Dan's newswriting word processor prompts him to compare the subject to a list of near-matches and either associate the name with an existing digital file or approve the creation of a new one.
When Dan codes the story subject as "fire," his word processor gives him a new series of fields to complete. How many alarms? Official cause? Forest fire (y/n)? Official damage estimate? Addresses of other properties damaged by the fire? And so on. Every answer he can't provide is coded "Pending."
Later, Dan sits with an editor as his story is reviewed, but a second editor decides not to call him at home because he sees the answer to the damage-estimate question in the file's metadata. The story is published and archived electronically, along with extensive metadata that now exists in a relational database. New information (the names of victims, for instance) automatically generates new files, which are retained by the news organization's database but not published.
And those information fields Dan coded as "Pending"? Dan and his editors will be prompted to provide that structured information later -- and the prompting will continue until the data set is complete.
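To make that less abstract, here's a minimal sketch of what Dan's structured story record might look like under the hood. Every field name, type and value below is an assumption for illustration -- not a proposed standard:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

PENDING = None  # answers the reporter can't supply yet stay flagged for follow-up


@dataclass
class FireStoryRecord:
    """One structured record behind a 'fire' story (all fields illustrative)."""
    slug: str
    street_address: str                      # published
    grid_coordinate: str                     # six-digit grid ref, not for publication
    filed_at_utc: datetime                   # "Zulu" timestamp, captured automatically
    victims: list[str] = field(default_factory=list)
    responding_departments: list[str] = field(default_factory=list)
    alarms: Optional[int] = PENDING
    official_cause: Optional[str] = PENDING
    forest_fire: Optional[bool] = PENDING
    damage_estimate_usd: Optional[int] = PENDING

    def pending_fields(self) -> list[str]:
        """The questions the newsroom still owes its own database."""
        return [name for name, value in vars(self).items() if value is PENDING]


story = FireStoryRecord(
    slug="elm-st-house-fire",
    street_address="214 Elm St.",
    grid_coordinate="873241",
    filed_at_utc=datetime.now(timezone.utc),
    victims=["Jane Doe"],
    responding_departments=["Pleasantville FD"],
    alarms=2,
)
print(story.pending_fields())  # ['official_cause', 'forest_fire', 'damage_estimate_usd']
```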
Why this matters
The 2005 story can be found by archive search, but reacquiring every story listed under the search term "fire" and sorting the results for relevance is labor-intensive and inaccurate. Consequently, the story's commercial value approaches zero.
On the other hand, the 2010 "story" is only a subset of a much more complex and valuable data set, which exists within a data structure that allows its information to be retrieved accurately and reconfigured in useful ways.
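To see the difference in practice, here's a toy example (invented table, columns and numbers) of the kind of question the 2010 data set answers in a single query -- a question the 2005 archive can only answer by paying someone to re-read clips:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fires (
        slug TEXT, street_address TEXT, grid_coordinate TEXT,
        filed_at_utc TEXT, alarms INTEGER, damage_estimate_usd INTEGER
    )
""")
conn.executemany(
    "INSERT INTO fires VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("elm-st-house-fire", "214 Elm St.", "873241", "2010-03-04T02:15:00Z", 2, 180000),
        ("mill-rd-barn-fire", "77 Mill Rd.", "871239", "2010-06-12T19:40:00Z", 1, 45000),
    ],
)

# "How much reported fire damage did we cover in 2010, and where?"
# One pass over structured records, instead of re-reading every clip
# the archive search returns for the keyword "fire".
for address, damage in conn.execute(
    "SELECT street_address, damage_estimate_usd FROM fires "
    "WHERE filed_at_utc LIKE '2010%' ORDER BY damage_estimate_usd DESC"
):
    print(address, damage)
```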
Traditionally, news organizations viewed this kind of metadata coding as a library (or, in newsroom jargon, "morgue") function. Its value? Improved reporting quality on future stories, without a quantifiable payoff. Consequently, such improvements were ignored, if not actively resented. Why bother improving your information structure if there's no payoff for the effort?
But in my 2010 example, the structure of this information is the news organization's primary product. Yes, the story is "given away" both in print and online (a misnomer: the news industry has ALWAYS given away news -- it's a loss-leader that supports our core business: renting your attention to advertisers). But the semi-structured data set that comprises the totality of the news organization's reporting has intrinsic commercial value to any person or entity that benefits from relevant, useful information.
Who might pay for access to a data set that includes the fire information included in my example? Well, insurance companies, for starters, but perhaps also attorneys, the Red Cross, real estate agencies, marketing companies, private detectives, specific vendors, etc.
And as a newspaper editor with access to that resource, could I build and curate a data tool that my readers might be willing to pay to use? Sure thing: I could create a mashup of public safety, educational, real estate and political information that could give dynamic "quality of life" grades to towns, neighborhoods and individual streets. And so on.
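As a back-of-the-envelope illustration of that mashup idea -- with invented categories, weights and numbers, since the real product would be built from the actual data sets -- it might boil down to something like this:

```python
# Invented inputs standing in for three of the structured data sets described above.
incidents_per_1k = {"Riverside": 4.2, "Oak Hill": 1.1}     # public-safety records
school_score     = {"Riverside": 71,  "Oak Hill": 88}      # education records
price_trend_pct  = {"Riverside": -3.0, "Oak Hill": 2.0}    # real-estate records


def quality_of_life_grade(neighborhood: str) -> str:
    """Blend the three data sets into a single letter grade (weights are made up)."""
    score = (
        0.4 * (100 - 10 * incidents_per_1k[neighborhood])   # fewer incidents = better
        + 0.4 * school_score[neighborhood]
        + 0.2 * (50 + 5 * price_trend_pct[neighborhood])     # rising values = better
    )
    for threshold, grade in ((80, "A"), (70, "B"), (60, "C"), (50, "D")):
        if score >= threshold:
            return grade
    return "F"


for hood in ("Riverside", "Oak Hill"):
    print(hood, quality_of_life_grade(hood))   # e.g. Riverside D, Oak Hill A
```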
Restated: If there's a demand for information in a useful form and you can provide it accurately and cheaply, then you have a business. Potentially a lucrative business.
That's owning your data. I call this outcome, for lack of a better term, the Informatics Scenario.
Capture the value in your workflow
Since the real value is in the totality of your data, and the value of each individual piece of information is marginal, the most obvious consequence is that going back into the archives to add structure isn't likely to be cost-effective. The value improves slowly over time as you collect new information, then accelerates as the data sets become statistically significant.
Since cost-efficiency is important, the foundation of the Informatics Scenario requires a reporting, editing and database workflow that integrates good data collection principles into the process of newsgathering and editing. This means that we'll need to invent word-processing tools that interact with writers and editors in helpful ways, such as automating some functions (like the Zulu conversion I suggested above) and streamlining others (orienting a real-time map by street address to aid the writer in setting the grid coordinate).
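The Zulu conversion, at least, is trivial to automate today. A small sketch, assuming the writing tool knows the reporter's local time zone:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+


def zulu_timestamp(local_dt: datetime) -> str:
    """Convert a timezone-aware local datetime to an ISO 8601 'Zulu' (UTC) string."""
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A story filed at 9:42 p.m. Eastern Daylight Time...
filed = datetime(2010, 5, 11, 21, 42, tzinfo=ZoneInfo("America/New_York"))
print(zulu_timestamp(filed))  # ...becomes 2010-05-12T01:42:00Z
```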
Today we use multiple layers of editing to improve "copy." That system will eventually evolve into one that puts primary value on assuring the integrity of the data structure, because the company's primary asset will be the completeness and reliability of its records. This will put more pressure on the quality and integrity of the original newsgathering, and that's great news for citizens.
We'll refine the list of what we choose to capture in our info-structure in response to the questions that have the most commercial value. Some subjects may come with lots of data fields to complete -- others, just a few. But over time it's likely that we'll develop systems that not only capture the most valuable data, but do so in ways that are interoperable across organizations and platforms.
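What might "interoperable" mean in practice? One purely hypothetical approach: a shared, versioned list of required fields per subject code that every participating system validates against before a record enters the common pool:

```python
# Hypothetical shared schema: subject code -> fields every member must supply.
CONSORTIUM_SCHEMA_V1 = {
    "fire": {"street_address", "grid_coordinate", "filed_at_utc",
             "alarms", "official_cause", "damage_estimate_usd"},
    "school_board": {"district", "meeting_date_utc", "votes"},
}


def missing_fields(subject: str, record: dict) -> set[str]:
    """Fields a contributed record still owes the shared pool (illustrative check)."""
    required = CONSORTIUM_SCHEMA_V1.get(subject, set())
    return {f for f in required if record.get(f) in (None, "", "Pending")}


record = {
    "street_address": "214 Elm St.", "grid_coordinate": "873241",
    "filed_at_utc": "2010-03-04T02:15:00Z", "alarms": 2,
    "official_cause": "Pending", "damage_estimate_usd": None,
}
print(sorted(missing_fields("fire", record)))  # ['damage_estimate_usd', 'official_cause']
```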
Result? The likely outcome of this trend will be a consortium of newsgathering organizations that share identical data structures and agree to abide by a common set of transparent, mutually agreed-upon quality standards. The benefit to the news organizations that participate? Profit-sharing from the sales of enormous, immensely valuable data sets. The benefits to society? Profound.
The benefit to the press? An expanding future.
Why journalists hate this
My first reporting job wasn't for a newspaper, but for NATO. My armored cavalry troop drove jeeps along the borders of East Germany and Czechoslovakia and watched for activity on the other side of the fences. When we spotted something interesting, we recorded it in a highly structured way that could be accurately and quickly communicated over a two-way radio, to be transcribed by specialists at our border camp and relayed to intelligence analysts in Brussels.
Since the audience for this reporting was composed entirely of intelligence experts, and since the ultimate value of such trivia is its ability to be stored in ways that might eventually indicate a pattern, my ability to communicate information accurately and quickly was prized. My ability as a storyteller? Utterly insignificant.
A print journalist is supposed to do both things well, but truth be told, if you can't tell a good story in a compelling way, your print-reporting career is toast. Weak reporter? We'll coach you up. Fundamentally clueless as a writer? Consider another line of work.
Journalism is a profession for storytellers, and our newsroom culture celebrates romantic myths that are generally hostile to structure. We enjoy jockeying with authority, poking bureaucrats and annoying anal-retentive city editors. Few journalists are good with numbers, and we don't see that as a weakness. It's all part of a rebellious "ink-stained wretch" identity that hasn't reflected reality in at least a generation, if in fact it ever did.
So I understand my curmudgeonly colleagues when they scoff behind my back at the word "metadata." They don't see its value, so they mock it. The beancounters? I expect even less from them. And the newspaper management class? Don't get me started.
That's why I don't expect newspapers to lead this charge. It's far more likely that television, or a web-only start-up, will take the lead. What's left of the newspaper industry will follow suit once it has exhausted every other possibility. Because that's just how they roll.
The vision, then
Getting to the Informatics Scenario requires interim steps and supposes some developments that haven't yet occurred. It supposes that there's no paid-content future for news and opinion, and that the combination of traditional and exotic advertising concepts will remain an important revenue stream -- but one that's insufficient to fund a stable and meaningful professional press in the long term.
Still, I expect it to develop, if only because we are entering a global economy that will run on information the same way the Industrial Revolution ran on coal. An efficient information economy requires better raw materials than the low-grade schlock our profession currently generates, so it's only a matter of time before market forces align in ways that force some kind of change.
It means journalists will need to learn to think in terms of data structures (if you're really not sure what that means, take a look at this example -- just understand it's still in the draft stage) and storytellers will have new tools at their disposal. Journalism schools will have to change their curricula. News organizations will have to hire and promote different people. And so on.
Are there drawbacks? Sure, leading off with privacy and equity questions. But these aren't show-stoppers.
I wish I could say that we'll get to this future smoothly. I suspect we'll lurch there instead, and that means more trauma in the near future. But it's like what Harvey Milk used to tell his political supporters during far darker times: You Gotta Give 'Em Hope.
Well, here's hope. Now do something with it.
I asked this on Twitter as well, but have you given any thought to defining an RDF namespace to create a standard for marking up information within an article? A couple of friends (@kenkeiter and @stevenwalling on Twitter) see a lot of potential and started drafting a spec last Friday. Thanks in advance.
Posted by: Daniel | Monday, May 11, 2009 at 17:46
Phenomenal post, especially the point that journalists' value no longer comes from being eloquent historians, but efficient distillers of gobs of information.
Posted by: Paul Balcerak | Monday, May 11, 2009 at 17:48
Daniel: I replied to your Tweet before I saw this comment. To repeat it, I've never done anything hands-on with RDF, so all I know is the overview. I would very much like to see your thoughts on this and help out if I can be helpful.
Paul: Thanks. I think we should still aspire to be eloquent historians, but we have to accept useful structure if we want to be relevant. It's like learning AP Style: It doesn't make you a great journalist, but it's the price of poker.
Posted by: Dan | Monday, May 11, 2009 at 18:46
Also on the RDF theme: I'd written previously that the primary output of news organizations should be a news flavor of XML, until a friend suggested RDF might be a better option, and I think this was the reason -- the ability to mark up subjects within the natural language "story."
I still tend to think in terms of XML, though, with info floating around the story.
And I deliberately didn't go into the distinction between the data companies would reserve and the data they would publish. That gets into the larger Semweb story, and this post was already goat-choking long.
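Purely as a sketch -- I'm no RDF hand, so treat the vocabulary (and the use of the rdflib package) as assumptions -- the kind of triples I'm picturing might look something like this:

```python
from rdflib import Graph, Literal, Namespace, URIRef   # assumes the rdflib package
from rdflib.namespace import RDF

NEWS = Namespace("http://example.org/news/vocab#")      # made-up vocabulary
story = URIRef("http://example.org/stories/elm-st-house-fire")

g = Graph()
g.add((story, RDF.type, NEWS.FireStory))
g.add((story, NEWS.streetAddress, Literal("214 Elm St.")))
g.add((story, NEWS.officialCause, Literal("unattended candle")))
g.add((story, NEWS.damageEstimateUSD, Literal(180000)))

print(g.serialize(format="turtle"))   # recent rdflib versions return a str here
```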
Posted by: Dan | Monday, May 11, 2009 at 18:53
Here's a critique of this post, but I can't reply to it there, so I'm bringing it back here. Brandon's thought is that I want to "replace journalism with informatics."

My takeaway: I don't want to replace journalism with informatics; I want to add informatics to journalism and use it to sustain and improve what we do.

He's right that some things aren't as well adapted to an informatics model... but some level of structure improves everything. Plus, adding informatics-based products doesn't preclude writing about any subject. It just gives you money to do so.
As for evolution, that's Kurzweil.
Posted by: Dan | Monday, May 11, 2009 at 19:18
Overall: genius.
In a data-driven society, there's definite value in accurate, *structured* data-sets. That's why companies like Comscore have a business.
I think this is one of many forks that journalism is going to take going forward, and probably one of the more lucrative ones. I also think, however, that the informatics model will not be thought of as "journalism," as much as other models that arrive (such as high-end storytelling that charges the readers for access).
The internet will support a number of models that work at various scales. Unlike the former system, where economics limited broadcasting options to certain mediums, the internet allows pretty much anyone to try pretty much anything and scale it to the level that works.
Also, re: "artificial scarcity":
Once an article is posted to the internet, it is *abundant* - the initial cost is irrelevant, because each additional (perfect) copy is free and easy to create. "Artificial scarcity" is: artificially limiting the number of people who can see it, in the hopes that you can then charge for it.
Posted by: Jason Preston | Tuesday, May 12, 2009 at 18:43
I'm relatively new to this subject, Dan, so I'm hoping you'll indulge me a little.... If the secret for journalists is to "own your data" because the data has value, then why would anyone give it away to the local news hounds? Sure, you could cobble together a data set on house fires that you could then sell to insurance companies. But informatics on local restaurants, schools, businesses, et al -- why wouldn't these enterprises want to control their own information too?
Put another way: We're all already involuntarily giving away gobs of information about who we are, what we buy, what we search for & click on & read. Google & Friends have access to that information, but journalists don't. Now you're suggesting journalists scoop up more data to sell -- but this time getting the goods will require individuals & organizations to *voluntarily* divulge those goods. ... I'd talk to a reporter if I felt like he was serving some public good. But if the people formerly known as ink-stained wretches now show up at my door scraping for data to sell to Allstate -- well, that's a game I'm not really interested in playing.
Again: If the information has become so valuable, why would anyone voluntarily give it away?
Thanks in advance for stepping me through this one.
Posted by: Alan M. | Wednesday, May 13, 2009 at 00:27
This is an absolutely fundamental question, and it's something Dave Winer and others have been asking for several years now. If news organizations want to charge for content, shouldn't they also pay for information? These are the equity questions I referenced in passing.
I don't have one great answer, but several smaller ones:
First, the value of each bit of information is so marginal as to be difficult to meter. What has value is the structure that people add to large pools of information.
Second, much of the information in my example is public information, generated by public sources. It has to be provided to anyone, which means anyone can collect it and structure it for any purpose. Some of this info is already quasi-structured, but much of it isn't.
So one answer to your specific question is that when I come around your door asking questions about the fire and your experience of it, that's really not the data that Allstate cares to buy. That's the semi-structured/natural language information that belongs in a story.
Allstate and State Farm, etc., want to know property values and damages and causes and square footages and responding departments and whether there were smoke detectors, etc., and these companies need the sources for that info to be official sources, not some sobbing fire victim in her pajamas.
Because -- and this is an important clarification -- most of the potential clients of these commercial data products are already collecting much of this information, or paying specialists to do it for them. The business model here is most likely going to be based on a news organization's ability to do that job for LOTS of clients, better and more efficiently.
But that really doesn't address the deeper part of your question: Why should anyone volunteer information to what is a commercial endeavor? This question has always been relevant, but it's even more directly so in this case, because I'm proposing that content has value independent of its advertising value.
So your comment hits the target: There MUST be a public good from this, and it ought to be a better answer than just the promise of "better coverage of your community."
I think I could offer that via all sorts of free information tools, plus higher-value information tools you'd get via subscription/membership, etc.
As for local restaurants, schools and businesses wanting to own their data too: They can! And if it's cost-effective to do so, they should! The issue here is that reporting and editing are expensive, and the value of these products only emerges once you've collected a whole bunch of data.
In other words, collecting, structuring, organizing and publishing information is an important 21st century business, and anyone can join in. But most entities will choose to outsource that function because the revenues won't justify the overhead.
News organizations are a decent candidate to pick up this role because they're already paying that overhead cost in order to provide analog reporting. If you add informatics journalism to the workflow, you add a revenue stream with only a marginal increase in costs.
There are other groups who could compete. If I were a local Chamber of Commerce, I would "own my data" and use the proceeds to reduce membership dues. And if I were the local news org, I would probably buy a subscription to that Chamber data with a license to use it in my data products.
See how this works (in my head, at least)? Hope that helps.
Posted by: Dan | Wednesday, May 13, 2009 at 09:34
That helps a lot, Dan. Thanks. ... It leaves me wondering: Given that the value of a database increases with its size, could news organizations possibly generate enough data by piggybacking on what reporters would collect over time? I doubt it.
Re: aggregating data from the Chamber of Commerce et al-- that would mean serving up information pre-structured by others, which is a business that begs to be 'disintermediated' by another business not bogged down by the costs of gleaning meaning & telling stories based on that data. I can't imagine news orgs want to be that (bloodied) middle man yet again.
Posted by: Alan M. | Wednesday, May 13, 2009 at 11:11
P.S. I should add that I generally like the idea of mining the value of everything a journalist gathers while reporting a story, and structured data could certainly be part of that new business equation. But I keep thinking we're looking in the wrong direction. Journalists keep wrestling with the product we're delivering instead of the people we're delivering it to.
I know the word "community" is hardly a new one in these future of journalism discussions, but imagine you're standing in front of that community -- say, the hundreds of thousands of people who still get the Washington Post. The microphone is in your hands, all eyes & ears are on you. What do you say to help that community cohere?
The great lesson of Obama, I think, is to talk less about the product you're selling and more about the big story you're telling. By contrast, Hillary did what Bill did: she ran a retail campaign -- Social Security for elderly voters; college scholarships for the kids; trade concessions for the unions. Aggregate those niches, she believed, and a base would be built. Obama certainly didn't ignore this retail sell, but he wrapped it inside a narrative. "In the year of America's birth, in the coldest of months, a small band of patriots huddled by dying campfires on the shores of an icy river...." he said at his inauguration. Rendering that big picture was a keystone of his campaign, and the view was spectacular enough to draw a lot of us in. "Hey," you think, "the story he's telling -- that's MY story too. I'm not watching this drama, I'm *living* it."
I feel like journalists are stuck playing Hillary's game, zooming in on the hyperlocal but failing to complement that tight focus with an inspiring wide-angle shot.
Quick story: Famous non-fiction writer is sitting on a plane. His seatmate recognizes him, chats him up, and says: "My daughter wants to be a writer. Any advice?" The famous writer says: "She should do three things. First, read a lot and see how writers write. Second, she should travel, get out of her comfort zone, and discover the way other people see the world. Third, and perhaps most important, she should figure out who she is and write from within that; otherwise all she'll be doing is passing along information, and that's something of which the world is in no great need."
Journalists tell stories, and we need a big one.
What's our Story?
Posted by: Alan M. | Wednesday, May 13, 2009 at 13:15
Your ideas about narrative and its value are good ones. Please understand I'm not dismissing them, just writing THIS post about a specific thing, which is a possible revenue stream.
There are all sorts of things I would tell that audience if I stepped in front of that microphone. And I'd start by making it clear that the old system, the one that you and I were raised in as professionals, had failed them.
That's the biggest difference between the way I think now and the way I thought back in the day. I used to think that the old system had failed JOURNALISTS.
Posted by: Dan | Wednesday, May 13, 2009 at 14:27
Thanks, Dan, for the insights on structured data, and for the details on how your thinking has evolved on all this stuff.
BTW: After reading the Xarker Manifesto (quite an impressive document), I'd love to know what you'd say in front of those 890,000 subscribers to the (Sunday) Washington Post. You on stage, microphone in hand, for one speech to rally the troops. Thirty minutes. No holds barred. I'd certainly buy a ticket.
Posted by: Alan M. | Wednesday, May 13, 2009 at 23:30
Dan,
While I was still trying to digest this post, I ran headlong into Dave Winer's piece urging Google to wake up to Twitter, because
"the place people turn to for news is shifting. It never was Google, that wasn't something it ever did well. But it is something Twitter does, and at this point it doesn't do it very well. But the path is very clear, the information they need now flows through their servers. They just have to figure out the user interface. They will eventually figure it out. That's the half of the problem that Google already knows how to solve. But Google doesn't have the users. None of its products have the kind of flow that Twitter has, nor the growth that Twitter has. That's what Google has to get busy building. Once Twitter is delivering the news search that Google can't, it will be way too late."
Got me thinking that Google/Twitter would/could immediately satisfy the scale problem, and with some smart data organizing behind Twitter, start assembling the data into useable (sale-able?) format. Maybe.
Don't know if it'd be journalism. Don't actually know if this makes any sense. But I was really struck by the parallel logics here and over at Winer's place.
Posted by: Steve K | Thursday, May 14, 2009 at 01:16
Steve, I think that's because there are lots of people working out of the same logic tool kit. Sometimes I think my main purpose in life right now is translating that logic to people who are taking their first steps outside the newsroom world.
Like most visionaries, Dave is probably as right as he is wrong. But the visionary game is more like hitting a pitched ball than free-throw shooting. You're a lousy shooter if your percentage is below .800. You're a great hitter if your career percentage hovers around .300. Dave hits with both percentage and power, so I pay attention to what he's thinking.
Twitter has an opportunity to "own" real-time search, because they literally own the channel in which the communication takes place. Google doesn't. The limitation with Twitter is the Twitterstream -- it's huge, but it's a tiny portion of the whole.
Alan: Gracias. That might be a post in the near future... just gotta get through a bunch of meatspace to-do items first...
Posted by: Dan | Thursday, May 14, 2009 at 09:18
Dan: I've been pushing this tune for a few years, but this is one of the more erudite posts on the topic. It's not just about data, it's about the structured data. I've got this post saved :) I'm sure I'll be sitting down and making the editors and publishers I work with read it. Slowly. Twice.
Posted by: Brad King | Wednesday, May 20, 2009 at 23:28
Hey, you should check out this essay I wrote in 2006:
http://www.holovaty.com/writing/fundamental-change/
It's very similar to what you've written here, in that it advocates that journalists structure their data.
I'm a journalist and developer who has been doing this sort of work since 2002. I've been blogging/presenting about it for a number of years now, and there's been some uptake at a couple of news organizations but nothing large-scale. I did a presentation at The Guardian last year, which you can read about at http://www.guardian.co.uk/media/pda/2008/jun/06/futureofjournalismadrianh -- and check out the photo there, which coincidentally captures my slide that talks about the granular bits of data within a police story (an example very similar to your fire example).
Also, you might be interested in EveryBlock.com, my latest effort in this area -- structured news at the sub-neighborhood level in selected American cities.
Anyway, it's great to see this philosophy getting some more attention -- really nice essay!
Posted by: Adrian Holovaty | Thursday, May 21, 2009 at 11:26
Adrian:
Dude. Of course I know who you are. I'm your geek fanboy. Thanks for the read.
Posted by: Dan | Thursday, May 21, 2009 at 16:22