Tag Archives: IBM

STC2008 – Mining Web 2.0 Content for Enterprise Gold

Most definitions of Web 2.0 are illustrative, but Michael Priestley prefers text.

He picked two core Web 2.0 concepts to discuss in today’s talk – wikis and mashups – but blogs, tagging, and social networking could also be mined.

Problems with wikis
  • Content is unstructured – you don’t know whether a page contains the elements of, say, a tutorial, because there’s no validation.
  • Content is non-standard.
  • Content is tangled – links are easy, but selecting just a subset of wiki content results in broken links.
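
The validation point is worth making concrete. A DITA task topic guarantees the pieces a tutorial needs, where a wiki page guarantees nothing. Here’s a minimal sketch in Python – the element names follow the DITA task model, but the check itself is my crude stand-in for real DTD validation, and the sample topic is invented:

```python
import xml.etree.ElementTree as ET

# A minimal DITA task topic -- the structure the DITA DTD would enforce,
# which a plain wiki page cannot guarantee.
dita_task = """\
<task id="install-widget">
  <title>Installing the widget</title>
  <taskbody>
    <prereq><p>Download the installer.</p></prereq>
    <steps>
      <step><cmd>Run the installer.</cmd></step>
      <step><cmd>Restart the application.</cmd></step>
    </steps>
  </taskbody>
</task>
"""

def has_task_elements(xml_text):
    """Crude structural check: does this topic contain the pieces a
    task/tutorial needs? (A real toolchain would validate against the
    DITA DTD or schema instead.)"""
    root = ET.fromstring(xml_text)
    return (root.tag == "task"
            and root.find("title") is not None
            and root.find("./taskbody/steps/step/cmd") is not None)

print(has_task_elements(dita_task))  # True
```

Point the same check at arbitrary wikitext and there is simply nothing to validate against – which is the “no validation” problem in a nutshell.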

Problems with mashups – sources of content aren’t standardized, and you can’t share mashup definitions.

The sum of it all – wikis don’t mash well.
You just get faster creation of siloed content, faster creation of redundant content, and faster creation of content that you can’t reuse.
So true – “If we want others to collaborate with us on content, we usually make them use our tool.”

Scenarios he has done or is doing at IBM:
Create DITA, publish to wiki

Create DITA, feed to wiki, making those DITA pages non-editable. Example: a tech support database where an answer eventually moves into the product docs with a stamp of approval.
Example: One Laptop Per Child is working on collecting Wikipedia articles using DITA to let teachers make custom curricula that are small, lightweight, and portable.

Create DITA, migrate to wiki (with roundtrip in mind). Migrating back to DITA is more difficult because of version history tracking.
Unfortunately, you throw away formerly semantic content. Funny comparison to an archeology dig – why did our predecessors bold this text? It must have had some meaning... about something? Here, the example is porting scenarios from previous releases.

Create wiki, publish to DITA – the wiki redirects edit actions to the CMS, which houses the DITA, then republishes the DITA XML to wikitext using an XSLT transform. Invision is doing something like this, where you edit the wiki page in a DITA editor, store it back to DITA, and publish it to the wiki page. WebWorks Publisher will also publish source to wikitext (although I don’t know about getting back to DITA).
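
To make that XSLT step concrete, here’s a rough Python equivalent of such a transform – a hypothetical element-to-markup mapping from a DITA concept topic to MediaWiki-style wikitext, not IBM’s or anyone’s actual stylesheet:

```python
import xml.etree.ElementTree as ET

def dita_to_wikitext(xml_text):
    """Flatten a DITA concept topic into MediaWiki-style markup.
    Only title, p, and ul/li are handled -- enough to show the shape
    of the transform, not a complete mapping."""
    root = ET.fromstring(xml_text)
    blocks = ["== %s ==" % root.findtext("title", "").strip()]
    body = root.find("conbody")
    if body is not None:
        for el in body:
            if el.tag == "p":
                blocks.append("".join(el.itertext()).strip())
            elif el.tag == "ul":
                blocks.append("\n".join(
                    "* " + "".join(li.itertext()).strip()
                    for li in el.findall("li")))
    return "\n\n".join(blocks)

topic = """<concept id="about-wikis">
  <title>About wikis</title>
  <conbody>
    <p>Wikis make collaboration easy.</p>
    <ul><li>Fast editing</li><li>Built-in history</li></ul>
  </conbody>
</concept>"""

print(dita_to_wikitext(topic))
```

The hard direction is the reverse one – wikitext back to valid DITA – which is exactly why the version-history and roundtrip problems above keep coming up.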

Or: a native DITA wiki – portable content that you can move in and out.

With standardized sources, you can dependably point a tool at a wiki and get reliable source.
With added semantics, you could make customizable travel guides in PDF format from Google Maps and travel sites combined together.

Common source for multiple wikis based on: audience, products, or platforms
This scenario provides a forum for comments on source (this is basically what Lisa Dyer is doing at Lombardi software).
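
Conditional processing is what makes that single-sourcing by audience, product, or platform work. In DITA you’d filter with a ditaval file and profiling attributes such as audience; the sketch below fakes that filtering in a few lines of Python (the topic content and audience values are invented):

```python
import xml.etree.ElementTree as ET

def filter_for_audience(xml_text, audience):
    """Drop any element profiled for a different audience -- a toy
    stand-in for ditaval filtering in the DITA Open Toolkit."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            aud = child.get("audience")
            if aud is not None and aud != audience:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

source = """<topic id="setup"><title>Setup</title><body>
  <p>Install the product.</p>
  <p audience="administrator">Configure LDAP before first login.</p>
  <p audience="user">Ask your administrator for a login.</p>
</body></topic>"""

print(filter_for_audience(source, "user"))  # admin-only step is gone
```

Run the same source twice with different audience values and you get two different wikis from one set of files – the scenario described above.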

When they engaged with the community while creating the content, there was a lot more activity – people wanted to “watch” the superstars create content.

Portable content means repeatable collaboration.
Just one tool will not cut it – insist on standards-compliant tools. Blog about it, ask about it on wikis, log requirements on SourceForge – this isn’t just for vendors but also for the open source community. When you get something working, share your experiences with others.

IBM has a Custom Content Assembler in beta that you can try out. It uses Lotus product docs as source and you can build your own custom guides, and then choose to publish to PDF or HTML.

The conflict between structure and collaboration is solvable – use DITA as a common currency.

What does DITA have to do with wiki?

We tackled this question and then some at the January Central Texas DITA User Group meeting. I’m a little tardy in writing up my notes and thoughts about the presentation, but it went really well, and I appreciate all the attendees’ participation as well. We had a high school teacher in the audience, and I applaud him for wanting to learn more about DITA to pass that knowledge on to high school students.

I brought along my XO laptop since I was talking about my work with wiki.laptop.org and Floss Manuals and found some more Austin-based XO fans, so that was a great side benefit to me as well.

One of Ben’s answers to the question “What does DITA have to do with wiki?” is “Maybe nothing.” Love it!

Ben introduced another triangle of choices – you have likely heard of “cheap/good/fast, pick two.” How about “knowledge/reuse/structure, pick two”?

I have to do some thinking about that one and his perception of the limitations and tradeoffs offered by those choices or priorities. Reuse and structure are particularly difficult to pair, but they also give you the most payoff. Structure and knowledge are another likely pair, but it could be difficult to find subject matter experts who can also organize their writing in a very structured manner, and writers who know DITA really well and also have specific content knowledge may be equally hard to find. His workaround for the difficulty you’d face while trying to build a structured wiki is a sluice box – raw, unstructured data is the top input, some sort of raw wiki is the next filter, and the final, tightest filter of all is a topic-oriented wiki.

Sluice box, by Tara, http://flickr.com/people/wheatland/

My take on the question is that there are three potential hybrid DITA-wiki combinations, and at this presentation Chris Almond introduced the fourth that I have seen, which interestingly uses DITA as an intermediate storage format.

The three DITA-wiki combination concepts I’ve seen are:

  • Wikislices – using a DITA map to keep up with wiki “topic” (article) changes. Michael Priestley is working on this for the One Laptop Per Child project (OLPC).
  • DITA Storm – a web-enabled DITA editor, but not very wiki-like. However, with just the addition of History/Revision and Discussion tabs and an RSS feed, you could get some nice wiki features going with that product. Don Day had an interesting observation that sometimes when you add too many wiki features to a web page, you can hardly tell what’s content and where to edit it. I’d agree with that assessment.
  • DITA-to-wikitext XSLT transform – but no round trip; writers determine what content goes back to the DITA source. Lisa Dyer will describe this content flow in the February session.
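
The wikislice idea in the first bullet reduces to something simple: a DITA map acting as the table of contents for a set of wiki articles. A sketch – the map and article names here are invented examples, not from the OLPC project:

```python
import xml.etree.ElementTree as ET

# Wikislice sketch: a DITA map as the table of contents for a set of
# wiki articles. A tool could walk this list to fetch each article or
# check it for changes.
ditamap = """<map title="Water cycle wikislice">
  <topicref href="Evaporation" navtitle="Evaporation"/>
  <topicref href="Condensation" navtitle="Condensation">
    <topicref href="Cloud" navtitle="Cloud"/>
  </topicref>
</map>"""

def slice_articles(map_xml):
    """Return the wiki article names the map pulls together,
    in reading order (nested topicrefs included)."""
    root = ET.fromstring(map_xml)
    return [ref.get("href") for ref in root.iter("topicref")]

print(slice_articles(ditamap))  # ['Evaporation', 'Condensation', 'Cloud']
```

The map supplies the hierarchy and selection that the wiki itself lacks, which is the whole appeal of the hybrid.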

The slides are available on slideshare.net. Here are the slides that Ben Allums, Ragan Haggard, and I used.

Here are Chris Almond’s slides and his blog entry about the presentation. I described Chris’s project to Stewart Mader of wikipatterns.com and he blogged about our presentation as well at his blog ikiw.org.

Two panels on wikis and structured authoring such as DITA

There are two upcoming Central Texas DITA User Group meetings that you don’t want to miss if you’re looking into wikis for documentation.
Jan. 23, 2008

Ben Allums, Quadralay – wiki.webworks.com

Chris Almond, IBM – internal wiki

Anne Gentle, OLPC – wiki.laptop.org and www.flossmanuals.net

Ragan Haggard, Sun – www.opends.org/wiki

Feb. 21, 2008

David Cramer, Motive – internal wiki

Lisa Dyer, Lombardi – internal wiki

Alan Porter, Quadralay – wiki.webworks.com

The January panel will talk about models for information development in a wiki framework – a couple of case studies with a demo of each system to illustrate use cases/workflow/high-level architecture. We’ll have a discussion of how these models might empower our professional community.

The February panel has some experience implementing a wiki framework for DITA and also single sourcing wikis – so they can offer a from-the-trenches look at the building blocks (transforms, feature and process requirements, lessons learned from running the project).

Ragan Haggard presented Delivering Open Source Technical Documentation via a Wiki at the San Antonio STC chapter this month, and his slides are available for download. My favorite slide is number 17 – and I have his permission to quote it verbatim here.

Why not resist this fad?

Removing barriers to input from SMEs greatly improves the documentation.

These docs will get even better with feedback and input from real users.

We writers have no less control over the content than before.

A wiki has as much or as little structure as you impose on it, the same as a book.

I don’t think this level of collaboration is a fad.

We should have a lot to talk about and perhaps even a homework assignment between the two sessions.

Are structured authoring and wiki opposing forces?

It was one of those light-bulb-type discussions. Ideas popping and synapses firing. I had lunch with Chris Almond and Don Day this past week, discussing the potential authoring of wiki articles using DITA. We went through possible workflows, from a web-based DITA editor to authoring in another tool and merely using DITA as an intermediary, transforming to wikitext.

If you know about DITA, you recognize Don Day as the chair of the OASIS DITA Technical Committee. Our lunch companion was Chris Almond, an innovative, forward-thinking project manager. From him, I got the sense that internally at IBM, there is a perception of DITA as a technical-writer-only tool to have in your toolkit. Chris coordinates and manages the authoring of IBM’s RedBooks, a very popular and technical set of documentation – not product documentation, but guides that show users how to implement a specific set of products integrated together. He’s coordinating teams to do the scenario-based writing that applies the product in real-world situations. Many techpubs teams are striving toward use cases and scenario writing, and RedBooks are a great model for how to do it well. I know we tried to emulate it at BMC Software, and Bill Gearhart has an article about “Scenarios and Minimalism” in the CIDM Newsletter that discusses scenario and case study authoring.

Chris is trying to figure out how their current writing methodology and processes can be preserved while also enhancing the tools used and improving the resulting connections after the deliverable is written. Currently they engage with teams of authors to outline scenarios using mindmapping software and then divide up the actual writing assignments according to each author’s experience with the scenario. When Chris described their process, I immediately thought of JoAnn Hackos’ and Dan Ortega’s suggestions to have field personnel contribute scenarios to a product’s wiki.

How do you actually empower the teams to write these wiki articles and assemble them into a useful (maybe book-like) wiki? Another question to Chris was, how do you layer an outline or table of contents on to the wiki, and then test and fold in any changes that wiki contributors make?

After at least an hour of discussion, I’m not sure we ever came up with the correct toolset. Or rather, there was a toolchain that could be used – certainly doable, but not the ideal he wanted to get to. I suppose one ideal is a DITA-based wiki with a web editor interface that would change editor strictness based on the author’s permissions. Authors who knew DITA and were most comfortable writing structured task, reference, and concept topics would get an XML-validating editor, and authors who preferred more free-form writing would just use a rich-text editor that was nothing more than HTML headings, paragraphs, and lists underneath, like what you use in Drupal, WordPress, or Blogger.

Suddenly it became apparent to me (though I won’t and don’t speak for Chris and Don) that some people are more determined to keep the editing quick and easy, even if it sacrifices the structuring and vetting step that structured authoring gives you.

This realization gives me a sense that there are two camps in technical documentation. There’s the “quick web” folks who connect easily and author easily, and then there’s the “structured quality” camp that requires more thoughtful testing and time spent on task analysis and information architecture. Also, the types of information that these authors are trying to capture are opposed in some senses. Then I thought a diagram might help.

wiki-fast-easy <-----------+-----------> DITA-difficult-tested

high-level scenarios <-----+-----> detailed dialog boxes

With such an easily diagrammed moment, you’d think I’d have an answer for the process we could use for a DITA-based wiki. Unfortunately it’s not quite refined, but I feel a step closer to understanding why this process is difficult to create – because the process is paved with these tradeoffs and apparent compromises and decisions you have to make along the way.