STC2008 – Mining Web 2.0 Content for Enterprise Gold

Most definitions of Web 2. 0 are illustrative, but Michael Priestly prefers text.

He’ll pick 2 core Web 2.0 concepts for today’s talk – wikis and mashups to discuss, but there’s also blogs, tagging, social networking that could also be mined.

Wiki’s problems
Content is unstructured, you don’t know if it contains the elements of, say, a tutorial, because there’s no validation.
Content is non-standard
Content is tangled – links are easy, but selecting just a subset of wiki content results in broken links.

Problems with mashups – sources of content are standards, can’t share mashup definitions

Sum of it all – wikis don’t mash well.
You just get faster creation of silo’d content, faster creation of redundant content, faster creation of more content that you can’t reuse.
So true – “If we want others to collaborate with us on content, we usually make them use our tool.”

Scenarios he has done or is doing at IBM:
Create DITA, publish to wiki

Create DITA, feed to wiki-make those DITA pages non-editable. Example: tech support database when answer eventually moves into product docs with stamp of approval
Example: One Laptop Per Child working on collecting Wikipedia articles out of DITA to let teachers make custom curriculum that small, lightweight, portable.

Create DITA, migrate to wiki (with roundtrip in mind). Migrate to DITA is more difficult because of version history tracking.
Throw away formerly semantic content, unfortunately. Funny comparison to archeology dig – why did our predecessors bold this text? It must have had some meaning? About something? Here, the example is porting previous releases’ scenarios.

Create wiki, publish to DITA – wiki redirects edit actions to the CMS, which houses DITA, then republishes the DITA XML to wikitext using an XSLT transform. Invision is doing something like this where you edit the wiki page in a DITA editor, store it back to DITA, publish it to the wiki page. Also Web Works Publisher will publish source to wiki text (although I don’t know about getting back to DITA).

Or: native DITA wiki: portable content – move content in and out

with standardized sources, you can dependably point a tool at a wiki and get reliable source.
with added semantics, ou could make customizable travel guides in PDF format from Google maps, travel sites, combined together.

Common source for multiple wikis based on: audience, products, or platforms
This scenario provides a forum for comments on source (this is basically what Lisa Dyer is doing at Lombardi software).

When they engaged with the community while creating the content, there was a lot more activity – people wanted to “watch’ the superstars create content.

Portable content means repeatable collaboration.
Just one tool will not cut it – insist on standard-compliant tools. Blog about it, ask about it on wikis, log requirements on sourceforge – this isn’t just for vendors selling but also for the open source community. When you get something working, share your experiences with others.

IBM has a Custom Content Assembler in beta that you can try out. It uses Lotus product docs as source and you can build your own custom guides, and then choose to publish to PDF or HTML.

The conflict between structure and collaboration is solvable – use DITA as a common currency.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s