Category Archives: tools

STC2008 – Mining Web 2.0 Content for Enterprise Gold

Most definitions of Web 2.0 are illustrative, but Michael Priestley prefers text.

He’ll discuss two core Web 2.0 concepts in today’s talk, wikis and mashups, but blogs, tagging, and social networking could also be mined.

Problems with wikis
Content is unstructured: you don’t know if it contains the elements of, say, a tutorial, because there’s no validation.
Content is non-standard.
Content is tangled: links are easy, but selecting just a subset of wiki content results in broken links.

Problems with mashups: sources of content aren’t standardized, and you can’t share mashup definitions.

Sum of it all – wikis don’t mash well.
You just get faster creation of silo’d content, faster creation of redundant content, faster creation of more content that you can’t reuse.
So true – “If we want others to collaborate with us on content, we usually make them use our tool.”

Scenarios he has done or is doing at IBM:
Create DITA, publish to wiki

Create DITA, feed to wiki, and make those DITA pages non-editable. Example: a tech support database where an answer eventually moves into the product docs with a stamp of approval.
Example: One Laptop Per Child working on collecting Wikipedia articles out of DITA to let teachers make custom curriculum that’s small, lightweight, and portable.

Create DITA, migrate to wiki (with roundtrip in mind). Migrating to DITA is more difficult because of version history tracking.
Throw away formerly semantic content, unfortunately. Funny comparison to archeology dig – why did our predecessors bold this text? It must have had some meaning? About something? Here, the example is porting previous releases’ scenarios.

Create wiki, publish to DITA – wiki redirects edit actions to the CMS, which houses DITA, then republishes the DITA XML to wikitext using an XSLT transform. Invision is doing something like this where you edit the wiki page in a DITA editor, store it back to DITA, publish it to the wiki page. Also Web Works Publisher will publish source to wiki text (although I don’t know about getting back to DITA).
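To make the “republish DITA to wikitext” step concrete, here is a minimal sketch in Python rather than XSLT. It is not the transform from the talk: it only assumes a very simple DITA concept topic (title, paragraphs, one level of bulleted list) and MediaWiki-style markup, and the function name dita_to_wikitext is made up for illustration.

```python
import xml.etree.ElementTree as ET

def dita_to_wikitext(dita_xml: str) -> str:
    """Convert a very simple DITA concept topic to MediaWiki-style wikitext.

    Illustrative only: handles <title>, <p>, and <ul>/<li> inside <conbody>.
    """
    root = ET.fromstring(dita_xml)
    lines = []

    title = root.find("title")
    if title is not None:
        title_text = "".join(title.itertext()).strip()
        lines.append("= " + title_text + " =")

    body = root.find("conbody")
    for block in (body if body is not None else []):
        if block.tag == "p":
            lines.append("".join(block.itertext()).strip())
        elif block.tag == "ul":
            for item in block.findall("li"):
                lines.append("* " + "".join(item.itertext()).strip())

    return "\n".join(lines)

# Hypothetical sample topic, not taken from any real documentation set.
sample = """<concept id="intro">
  <title>Portable content</title>
  <conbody>
    <p>Move content in and out.</p>
    <ul><li>wikis</li><li>mashups</li></ul>
  </conbody>
</concept>"""

print(dita_to_wikitext(sample))
```

A real roundtrip would also need to carry links, tables, and the semantic elements that make DITA worth keeping, which is exactly where the hard part lies.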

Or: native DITA wiki: portable content – move content in and out

With standardized sources, you can dependably point a tool at a wiki and get reliable source.
With added semantics, you could make customizable travel guides in PDF format from Google Maps and travel sites, combined together.

Common source for multiple wikis based on: audience, products, or platforms
This scenario provides a forum for comments on source (this is basically what Lisa Dyer is doing at Lombardi software).

When they engaged with the community while creating the content, there was a lot more activity – people wanted to “watch” the superstars create content.

Portable content means repeatable collaboration.
Just one tool will not cut it – insist on standard-compliant tools. Blog about it, ask about it on wikis, log requirements on sourceforge – this isn’t just for vendors selling but also for the open source community. When you get something working, share your experiences with others.

IBM has a Custom Content Assembler in beta that you can try out. It uses Lotus product docs as source and you can build your own custom guides, and then choose to publish to PDF or HTML.

The conflict between structure and collaboration is solvable – use DITA as a common currency.


Adding Google Analytics to your Author-it generated HTML pages

I’m learning about Author-it’s HTML templates today, and how to insert Google Analytics code (or any other code, really, such as adding an automatically updating variable for “Last modified by” with user or date information.)

But my task today was to insert Google Analytics code. (As a prerequisite note, we already have all our documentation available on an external site at

First, I created a Gmail account for our department. Next, I created a Google account. Then, I went to the Google Analytics page and signed up for an account there, entering the name of our externally-accessible documentation site.

At the end of the sign-up process, Google gives you JavaScript code to place directly above the closing </body> tag. Fortunately, the way that Author-it sets up the HTML templates, all of your Author-it topic data is inserted at the point where the <aitdata> tag appears in your HTML template.

The HTML templates are typically stored in C:\Program Files\AuthorIT V4\Data\Templates\Plain HTML, although other types of HTML templates such as DHTML and HTML Help templates are also available. These are the files where I found the Google Analytics code needed to be added:

  • I edited the body_template.htm file and located the <aitdata> tag. I copied the code from the Google Analytics page and pasted it below the <aitdata> tag.
  • I edited the html_frameset.htm file and added the Google Analytics code in the <head> area, as instructed by the Google Analytics help topic “What should I know about using Analytics with Framed sites?” As a side note, every heading in that help is written as a question. Fascinating.

Now, republish the HTML from your Author-it topics and your Google Analytics code is available on each page. After about 24 hours we started collecting data.
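If you have several templates to update, the paste step can be scripted. The following is only a sketch in Python, not something Author-it provides: TRACKING_SNIPPET is a placeholder for the code Google gives you, and insert_before_body_close is a made-up helper name. It works on a string so you can try it before touching real template files.

```python
import re

# Placeholder only: paste the real snippet from your Google Analytics account here.
TRACKING_SNIPPET = "<!-- Google Analytics snippet goes here -->"

def insert_before_body_close(html: str, snippet: str) -> str:
    """Insert snippet on its own line just before the last closing body tag."""
    if snippet in html:
        return html  # already inserted, so leave the template alone
    matches = list(re.finditer(r"</body\s*>", html, flags=re.IGNORECASE))
    if not matches:
        raise ValueError("no closing </body> tag found")
    pos = matches[-1].start()
    return html[:pos] + snippet + "\n" + html[pos:]

# A tiny stand-in for body_template.htm; the real template has more markup.
template = "<html><body>\n<aitdata>\n</body></html>"
updated = insert_before_body_close(template, TRACKING_SNIPPET)
print(updated)
```

Because the snippet lands after <aitdata> and before </body>, it ends up at the bottom of every published topic page. Back up the template folder before running anything like this against real files.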


Let me know your experiences using Google Analytics to monitor your user assistance site traffic – what metrics are you seeking? Are there any conversion goals we should set up? One metric I am considering is trying to monitor how often the Word .doc files are downloaded. Does anyone have tips or tricks for us?

Update: I found this blog entry, Tracking document downloads in Google Analytics, and it contains hints at what I need to do to track our Word document downloads. However, I think that this article from the Google Analytics Help, How do I track files (PDF, AVI, or WMV) that are downloaded from my site? contains the method I’ll try first.

Secrets of altering Altoid tins

I’ve been doing quite a few photo projects lately, but mostly with digital items. I did my family photo card this year with a Photoshop layout similar to the ones on, for example. The digital environment is fun, but I still enjoy the paper arts as well, scrapbooking, stamping, and card making, and I generally love altered ordinary items.

This post shows my first attempt at altering Altoid tins. I’m putting miniature scrapbooks inside of the Altoid tins for some friends and family members this year. Hopefully they’re not reading this blog post before they get their gift!

Photo collage of altered Altoid tins

For me, the secrets are in getting the dimensions of the rectangles for the top, bottom, and insides of the tin. I say it’s 2.25″ by 3.5″ but you have an extra eighth of an inch to work with on the longer side.

I cut paper rectangles 2 1/8th inch by 3 1/2 inch, and then used a corner rounder I bought for $8 at Michael’s. A 3 7/8th inch long side also works if you want to cover more but you have to glue precisely.

For some of the tins, I cut a 1/2 inch by 12 inch strip of paper to cover the side, although you could use 11 inch paper length and still get all the way around the tin. Ribbon also works well, so I used that for some of the circumferences of the tins.

With my foam brush I put a thin layer of Mod Podge on the tin and put the paper rectangles on, then smoothed out any air bubbles. I didn’t need to do anything to the tin itself except wash it out; no sanding required.

I bought the tins from an Austinite on Craigslist – over 30 tins for $3. I had all the supplies in my growing scrapbooking and stamping collection except for that corner rounder.

And if all those supplies are not around your house but you want to get your hands on some altered Altoid tins, pop over to Those designs blow mine out of the water!

Ebooks showdown Kindle vs. XO laptop from OLPC

I’m learning more about ebooks thanks to some recent inquiries related to my work for One Laptop Per Child (OLPC) to collaborate on the kid’s user guide. I’ve been so busy with that I’ve barely had time to blog. I’m learning so much it’ll add to the blog entries later.

But I have been noodling on the fact that Amazon’s Kindle and the OLPC XO are at the exact same price – $400 – and that the XO can be used as an ebook reader (PDFs preferred.)

I’m going to try to do a comparison here, but please realize I’m no ebook reader expert nor have I owned one in the past. So I really have no business writing this at all other than my innate curiosity and love of researching and then presenting information.

On to the interesting comparison – let’s look at other parameters, and feel free to suggest your own. Like I say, I have no business writing an ebook comparison so do pitch in where you see fit.

Amazon’s Kindle ebook reader versus the One Laptop Per Child XO in ebook mode

| Comparison item | Kindle | XO |
|---|---|---|
| Dimensions | 7.5″ x 5.3″ x 0.7″, weighs 10.3 ounces | 9.5″ x 9.0″ x 1.25″, weighs about 3 pounds |
| Price | $400, free shipping | $400, about $25 shipping, $200 is a tax-deductible donation, so slight discount depending on your US tax bracket, I suppose |
| Content | DRM content, can’t read PDFs unless you have a Windows PC and convert them first. | Mostly PDF, but plans for more format support. With the built-in browser (Browse Activity), many reading materials are available such as Project Gutenberg. |
| Usability | Robert Scoble has a harsh usability review of the Kindle. | Robert Nagle I hope will review it in the future, but he has a great review of the possibilities as an ebook reader for kids. And I like the review written by a 12 year old. |
| Battery | 30 hours | 4-5 hours |
| Wireless connectivity | Uses a wireless cellular network it calls Whispernet to deliver your Kindle content. It’s EVDO. | Uses 802.11b and comes with a free one-year subscription to T-Mobile wireless service. |
| Screen | monochrome: 600 x 800 pixels (167 dots per inch) | monochrome: 1200 x 900 pixels (200 dots per inch), color: 1024×768 perceived (it’s complicated, see the hardware specs PDF.) |
| Warranty | one year | 30 days |

Gizmodo already put the Kindle up against the Sony Reader in this online poll. I realize that the OLPC XO is intended to be a kids’ laptop, and it’s not really fair to pit it against the Kindle because that’s not the design of the device (nor the intentions of the project behind the device.) But your $400 might be well spent here.

By the way, I read about expected XO delivery information from the new XO User’s Facebook Group (50 members and growing). If you order now, you’ll get your laptop around the same time as a child in Afghanistan, Cambodia, Haiti, Mongolia or Rwanda – in early 2008. See Shipping information on

Author-it and converting UTF-16 to UTF-8

In trying to modify multiple Author-it topics (okay, 5,119 topics) with variable assignments, I have had to work with the XML output that Author-it exports.

To export to XML, you select a topic or multi-select topics, then right-click on the selection and choose XML > Save to file.

Turns out, Author-it outputs its XML encoded with UTF-16, but apparently most Windows applications understand UTF-8. When I tried to open my freshly exported XML file, XML Copy Editor gave me an error.

So I had to discover how to convert the XML from UTF-16 encoding to UTF-8 (and you can’t just open it in Notepad on Windows and change the 16 to 8, there are other embedded characters indicating the encoding, and, well, it’s encoded.)

First, I used the identity transform documented many places, my favorite place being the XSLT Cookbook, to convert the Author-it output to UTF-8. Here’s the XSLT code for that:

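<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!--Write the result as UTF-8-->
  <xsl:output encoding="utf-8"/>
  <!--Identity transform: copy every attribute and node unchanged-->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>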

I just ran the above transform against the AuthorIT Objects.xml file I exported, using the Instant Saxon XSLT processor.
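If you don’t have an XSLT processor handy, the same re-encoding can be sketched in a few lines of Python. This is an alternative to the transform, not what I ran, and the function name utf16_xml_to_utf8 is made up. It shows why editing the 16 to an 8 in Notepad isn’t enough: the bytes themselves have to be decoded and re-encoded, and the encoding label in the XML declaration has to change along with them.

```python
import re

def utf16_xml_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 XML bytes as UTF-8 and update the XML declaration."""
    # Decoding with "utf-16" uses the byte order mark to pick the byte order
    # and drops the mark from the resulting text.
    text = data.decode("utf-16")
    # Rewrite only the encoding value in the leading XML declaration.
    text = re.sub(
        r'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2',
        r'\1\2utf-8\2',
        text,
        count=1,
    )
    return text.encode("utf-8")

# Small made-up sample; a real export would be read with open(path, "rb").
sample = '<?xml version="1.0" encoding="UTF-16"?><root>caf\u00e9</root>'.encode("utf-16")
converted = utf16_xml_to_utf8(sample)
print(converted.decode("utf-8"))
```

For a real file you would read the bytes in binary mode, pass them through the function, and write the result to a new file so the original export stays untouched.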

Then, I wanted to remove all <VariableAssignment> elements, effectively removing an entire node. Again, the identity transform (or copy transform) was effective. And, I learned that I had to identify the AuthorIT namespace thanks to this excellent helper article, Handling Default Namespaces on

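<?xml version="1.0" encoding="UTF-8"?>
<!--The xmlns:ait value is empty in this listing; fill in the namespace URI declared on the root element of your exported XML-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:ait="">

  <!--Match everything using identity copy-->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!--Remove all the VariableAssignment nodes-->
  <xsl:template match="ait:VariableAssignment">
    <xsl:comment>Removed VariableAssignment element</xsl:comment>
  </xsl:template>

</xsl:stylesheet>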

This transform is pretty dangerous, though, because it’s taking away all the VariableAssignment elements, and you might have valuable metadata stored that you would quickly blow away. So use with care.
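One way to take some of the danger out is to count what would be removed before removing anything. Here is a rough Python sketch of that dry-run idea. The namespace URI and the Objects, Topic, and Text element names are invented for the example, so substitute whatever your own export contains. Unlike the transform above, it does not leave a comment behind in place of each removed element.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for illustration; use the one in your own export.
AIT_NS = "http://example.com/authorit"
TAG = "{" + AIT_NS + "}VariableAssignment"

def count_and_remove(root: ET.Element, tag: str, dry_run: bool = True) -> int:
    """Count elements named tag; remove them only when dry_run is False."""
    found = 0
    # Snapshot the tree first so removing children does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                found += 1
                if not dry_run:
                    parent.remove(child)
    return found

# Made-up sample document standing in for an Author-it export.
sample = f"""<Objects xmlns="{AIT_NS}">
  <Topic><VariableAssignment>a</VariableAssignment><Text>keep</Text></Topic>
  <Topic><VariableAssignment>b</VariableAssignment></Topic>
</Objects>"""

root = ET.fromstring(sample)
print(count_and_remove(root, TAG))                 # dry run: count only
print(count_and_remove(root, TAG, dry_run=False))  # actually remove
print(count_and_remove(root, TAG))                 # count again after removal
```

If the dry-run count is much higher than you expected, that is your cue to look at what those assignments hold before blowing them away.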

This workaround is likely also useful for DITA and XHTML files, because Author-it outputs its DITA and XHTML files as UTF-16. I haven’t investigated its usefulness for those areas, but a quick search on the Author-it-users group on Yahoo revealed that sometimes people want UTF-8 rather than UTF-16. So, I hope this helps.

Q and A about Author-it and DITA – guest post from Mike Stockman

I read this recent conversation on the author-it-users Yahoo Group with interest. I haven’t had a need to author DITA topics with Author-it, so I have to rely on others for information on how it works. With Mike Stockman’s and Tony Watkin’s permission, I’ve written their Q&A as a blog entry.

Tony: What are your experiences with using Author-it for DITA output?

Mike: The DITA output is partial. That is, they don’t support a lot of DITA features, so that most table definitions are not passed through, bookmaps aren’t supported (although ditamaps are), reflinks aren’t supported, syntax diagrams aren’t supported, and so on. However, it’s definitely usable, so that AIT puts out the four DITA topic types (task, reference, concept, and base) properly structured, index entries are supported, tables and images are handled correctly, and so on. The best test would be to publish to DITA and see whether the results are what you’re expecting.

Tony: Is there a mechanism provided by AuthorIT for being able to search within the DITA XML generated output afterwards from within the browser when accessing the DITA output directly from the browser (i.e. not via AuthorIT)?

Mike: There is no mechanism provided by AIT for viewing DITA output. Once you have the DITA output, you can either view the XML code, or transform it into something more viewable, such as XHTML or PDF. Grab the DITA Open Toolkit, available at <> for all of the tools you’ll need to transform the DITA into something else.

Tony: Also, does AuthorIT just output XML when publishing DITA or does it also produce corresponding XHTML?

Mike: AuthorIT publishes DITA by first publishing to XHTML, and then transforming that to DITA. It does not, however, leave the corresponding XHTML behind, so you can’t view it. Your choices for viewing XHTML would be to either publish from AuthorIT to XHTML directly, or transform the DITA to XHTML.

Mike’s final comments

AuthorIT’s DITA is not a fully-developed DITA solution, so don’t expect it to be. Instead, AuthorIT’s DITA output is great when you have a need for single-sourced output to multiple formats, such as if you needed Word, XHTML, *and* DITA. Where I work, for example, we use AuthorIT’s Word output for our printed docs and PDF, and use the DITA output to create our online help. If you need the advanced features of DITA that AuthorIT doesn’t support, I’d suggest going to an all-DITA authoring environment and avoiding AuthorIT altogether.

Community support – don’t think of yourself as a customer but as a member of a movement

I’ve signed up for the Give 1 Get 1 program for One Laptop Per Child, and just received the email today, November 12, 2007, with the link to the site,

I read the terms and conditions with interest because I am seriously considering purchasing a laptop either for my son, who is four, or for his classroom of four-year-olds. Plus, I’ve been volunteering to help with their end-user documentation.

I’d love to buy one for every classroom at my son’s preschool but that’ll take some fundraising. I’ll boldly propose here that you can contact me if you’re interested in buying enough for a small preschool in Austin, Texas in addition to kids in least developed countries around the world.

I absolutely LOVE the spirit of the support statement. It reads as follows:

Neither OLPC Foundation nor One Laptop per Child, Inc. has service facilities, a help desk or maintenance personnel in the United States or Canada. Although we believe you will love your XO laptop, you should understand that it is not a commercially available product and, if you want help using it, you will have to seek it from friends, family, and bloggers. One goal of the G1G1 initiative is to create an informal network of XO laptop users in the developed world, who will provide feedback about the utility of the XO laptop as an educational tool for children, participate in the worldwide effort to create open-source educational applications for the XO laptop, and serve as a resource for those in the developing world who seek to optimize the value of the XO laptop as an educational tool. A fee based tech support service will be available to all who desire it. We urge participants in the G1G1 initiative to think of themselves as members of an international educational movement rather than as “customers.”

I’ve been working on documentation for the XO laptop in the wiki at and then taking the wiki content over to an Author-it instance. I’ll write more later about a wiki-based workflow, especially with translation in mind, and we are putting a process in place. Please, feel free to edit that page or contact me if you are interested in contributing.

Personally, the most difficult part so far has been my limited ability with design and layout. I have grand visions but feel my layout skills are inadequate for a kid- and parent-friendly look within Word. Nonetheless, it is an exciting time to be a small part of such an influential project.

I’m one of the friends, family, and bloggers who is willing to help with the XO laptop. So I urge you to go to and put your U$399 to good use.