Documentation using XML

May 2, 2006 @ 17:05 | In Programming

This article describes the tools I used at my last job to generate the documentation for a big project (15+ people, 3+ years). If you think the documentation process of your current project could be improved, have a look. I hope this information is useful to somebody.

Motivation

Documentation is an important part of software engineering that is often overlooked. Most of the time, the documentation of a project consists only of metainformation extracted from the source code with tools like Doxygen. Although this is useful as an API reference, it is not enough as technical documentation for a software architecture. This article describes a methodology for generating technical documents that live outside the source code, although it is a good idea to reference these documents from the code itself, for example with a link in the header of a source file.

A good documentation system should have the following characteristics:

  • Open Format. An open format guarantees continuity over time and accessibility across platforms. It also makes integration with your own tools easier than a proprietary format, which is usually not fully documented.
  • Well-defined Structure. The system should impose a structure on the documents to guarantee uniformity across all of them. This avoids ending up with some documents that have a changelog, a version or an author and others that do not.
  • Clear separation between content and presentation. In the same way that we, as programmers, organize our code to separate interface from implementation, a good documentation system should separate the content from its presentation. How the content is formatted, laid out and displayed should not be the concern of the author.

After playing with several options, I found the perfect combination of standards that satisfies the requirements described above: XML + DTD + XSLT.

An example of a document generated with this system can be found here (the document is in Spanish). I have packaged all the tools described in this article (with the sample included) in a file that can be downloaded from here (I have only included the executables for Windows; users of other operating systems should download the tools from their respective sites).

Using XML for Documentation

The methodology explained here uses XML to describe the contents of a technical document. To constrain the structure of the XML, another standard is used: DTD (Document Type Definition). A DTD defines the document structure as a list of legal elements (using a notation similar to BNF). This way, when writing a technical document, we are forced to follow a fixed structure.

So, the content is described in XML. To generate a representation of that content, another standard is used: XSLT. XSLT (XSL Transformations) transforms XML documents into other formats, such as XHTML. We will create XHTML documents here, but you could produce other formats, such as PDF. The generated XHTML can be viewed with any web browser, and its final look can be customized with CSS.

There is already a standard with the characteristics described above: DocBook. DocBook is a very popular set of tags for describing books, articles, and other prose documents, particularly technical documentation. DocBook allows you to create complex documents, and you may not want that complexity for your team. You can define a subset of DocBook and use that instead (in fact, the example given here is a subset of DocBook). Using a subset lets you reuse all the tools that already exist for managing DocBook documents while keeping your documents simple and uniform.

List of tools used

This is a list of the tools you will need (I have included all of them in the package mentioned above):

  • To edit an XML document under the restrictions imposed by a DTD, you can use Vex, an editor for XML documents. It is implemented in Java (so it is not as fast as a native application), and the last version I used (1.2.1) has some bugs (mostly visual ones). Even so, I have used it to create dozens of documents without serious problems.
  • xmllint is a tool to check that an XML document is well-formed and that it is a valid instance of a DTD.
  • xsltproc is the tool used to apply XSLT transformations to XML documents.

Example

Sample files for a simple technical format (with 2 levels of sections) are given here: a DTD, an XSL stylesheet and a CSS file.

First, the DTD, which describes the structure of the document.

TechArticle.dtd
<!--
  TechArticle.dtd
  Jesus de Santos Garcia
  A DTD for simple technical articles, inspired by Gentoo (www.gentoo.org) guides
  This DTD is a subset from docbookx.dtd v4.4
-->

<!ENTITY % inline.class
    "superscript|subscript|emphasis|code|filename|ulink|email">
<!ENTITY % block.class
    "important|warning|note|example|orderedlist|
    itemizedlist|mediaobject|informaltable">

<!ELEMENT article (title, subtitle?, articleinfo, sect1+)>

<!ELEMENT title (#PCDATA|%inline.class;)*>
<!ELEMENT subtitle (#PCDATA|%inline.class;)*>

<!ELEMENT articleinfo (abstract, revhistory)>

<!ELEMENT abstract (para)>

<!ELEMENT revhistory (revision+)>
<!ELEMENT revision (revnumber, date, authorinitials, revremark)>
<!ELEMENT revnumber (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT authorinitials (#PCDATA)>
<!ELEMENT revremark (#PCDATA)>

<!ELEMENT sect1 (title, sect2+)>
<!ELEMENT sect2 (title?, para+)>

<!ELEMENT para (#PCDATA|%block.class;|%inline.class;)*>

<!ELEMENT simpara (#PCDATA)>

<!-- INLINEs -->

<!ELEMENT superscript (#PCDATA|%inline.class;)*>
<!ELEMENT subscript (#PCDATA|%inline.class;)*>

<!ELEMENT emphasis (#PCDATA|%inline.class;)*>
<!ATTLIST emphasis role (bold) #IMPLIED>

<!ELEMENT code (#PCDATA)>

<!ELEMENT filename (#PCDATA)>

<!ELEMENT ulink (#PCDATA)>

<!ATTLIST ulink url CDATA #REQUIRED>

<!ELEMENT email (#PCDATA)>

<!-- BLOCKs -->

<!ELEMENT mediaobject (imageobject)>
<!ELEMENT imageobject (imagedata)>

<!ELEMENT imagedata EMPTY>
<!ATTLIST imagedata fileref CDATA #REQUIRED>

<!ELEMENT important (simpara)*>
<!ELEMENT warning (simpara)*>

<!ELEMENT note (simpara)*>

<!ELEMENT orderedlist (listitem)+>
<!ELEMENT itemizedlist (listitem)+>
<!ELEMENT listitem (para)+>

<!ELEMENT example (title, programlisting)>

<!ELEMENT programlisting (#PCDATA)>

<!ELEMENT informaltable (tgroup)>
<!ATTLIST informaltable frame CDATA #FIXED "none">

<!ELEMENT tgroup (colspec*, thead?, tbody)>

<!ATTLIST tgroup cols CDATA #REQUIRED>

<!ELEMENT colspec EMPTY>
<!ATTLIST colspec colname CDATA #IMPLIED>
<!ATTLIST colspec align (left|right|center) #IMPLIED>

<!ELEMENT thead (row+)>
<!ELEMENT tbody (row+)>
<!ELEMENT row (entry+)>
<!ELEMENT entry (#PCDATA|%inline.class;)*>
<!ATTLIST entry namest CDATA #IMPLIED>

<!ATTLIST entry nameend CDATA #IMPLIED>
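
As an illustration, a minimal document that should validate against this DTD might look like the following sketch. The title, file name, initials and text are invented for the example; only the element and attribute names come from the DTD above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article SYSTEM "TechArticle.dtd">
<article>
  <title>Render Queue Design</title>
  <subtitle>Hypothetical sample document</subtitle>
  <articleinfo>
    <abstract>
      <para>Describes how draw calls are sorted before submission.</para>
    </abstract>
    <!-- The DTD requires at least one revision with all four fields -->
    <revhistory>
      <revision>
        <revnumber>1.0</revnumber>
        <date>2006-05-02</date>
        <authorinitials>JS</authorinitials>
        <revremark>First draft.</revremark>
      </revision>
    </revhistory>
  </articleinfo>
  <sect1>
    <title>Overview</title>
    <!-- A sect1 must contain at least one sect2; sect2 only accepts para -->
    <sect2>
      <title>Goals</title>
      <para>The queue lives in <filename>RenderQueue.cpp</filename> and is
        <emphasis role="bold">not</emphasis> thread safe.</para>
      <para>
        <note><simpara>Notes, warnings and lists go inside a para.</simpara></note>
      </para>
    </sect2>
  </sect1>
</article>
```

Assuming the file is saved as sample.xml (a hypothetical name) next to TechArticle.dtd, running `xmllint --noout --valid sample.xml` should print nothing if the document is valid, and report the offending element otherwise.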

Once the XML document is written, we create a representation of it; HTML is used here. Because our format is a subset of DocBook, we can use the XSLT stylesheets distributed with DocBook. Those stylesheets can be customized to your specific needs. A good reference for DocBook XSL customization can be found in this book. Here, a page header and a page footer are added.

TechArticle.xsl
<?xml version='1.0'?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:import href="docbook.xsl"/>

<xsl:template name="user.header.content">
<p class="header"><img src="logo.jpg"></img></p>

</xsl:template>

<xsl:template name="user.footer.content">
<p class="footer"><small>&#x00A9;
  <a href="http://entland.homelinux.com/blog/">
   http://entland.homelinux.com/blog/</a>
  </small></p>
</xsl:template>

</xsl:stylesheet>

Next is the CSS applied to display our sample HTML document.

TechArticle.css
* {
margin: 0;
padding: 0;
}

BODY {
margin: 10px 0;
background: #f0f0ff;
color: black;

text-align: center;
min-width: 1024px;
}

div.article {
color: black;
background: white;
font-family: sans-serif, Verdana, Arial, Helvetica;
font-size: 10pt;
margin: 0 auto;
width: 1024px;
border: solid 1px gray;
padding: 20px;
text-align: left;
}

h2.title {
font-family: "Trebuchet MS", "Bitstream Vera Serif", Utopia, "Times New Roman", times, serif;
font-size: 21pt;
font-weight: bold;
text-align: center;
}

h3.subtitle {
margin: 5px 10px;
font-size: 11pt;
text-align: center;
}

h3.subtitle i {
font-style: normal;
}

div.revhistory
{
text-align: center;
}

div.revhistory table {
font-size: 8pt;
background-color: #f0f0f0;
color: black;

width: 50%;
border-collapse: collapse;

border-style: solid;
border-width: 3px;
border-color: gray;

margin: 30px auto;
}

div.revhistory table td {
padding: 5px;
border-style: solid;
border-width: 1px;
border-color: gray;

}

div.revhistory table tr td[colspan="3"] {
background-color: white;
color: black;
}

div.revhistory table th {
padding: 5px;
border-style: none;
border-width: 1px;
text-align: center;
font-variant:small-caps;
color: #fff;
background: #5a3aca;
}

div.informaltable
{
text-align: center;
}

div.informaltable table
{
font-size:90%;
color: #333;
background-color:#fff;
border-collapse:collapse;
text-align:left;
margin: 5px auto;
}

div.informaltable table th {
border-right: 1px solid #fff;
padding-left:5px;
font-variant:small-caps;
color: #fff;
background: #5a3aca;
}

div.informaltable td, div.informaltable th {
padding: 5px;
}

div.informaltable tbody tr td {

border-bottom: 1px solid #009;
}

p
{
margin: 15px 5px;
padding: 0;
}

div.abstract
{
margin: 5px 0px;
}

div.abstract p.title {
font-size: 11pt;
}

div.toc
{
margin: 20px 0px;
}

div.toc p
{
font-size: 11pt;
}

div.toc dl
{
margin: 5px 10px;
}

.sect1 h2.title {
font-family: sans-serif, Verdana, Arial, Helvetica;
font-size: 15pt;
font-weight: bold;
margin: 15px 0px;
text-align: left;
}

.sect2 h3.title {
font-size: 13pt;
color: #5a3aca;
font-weight: bold;
font-family: sans-serif, Verdana, Arial, Helvetica;
margin: 15px 0px;
}

div.warning {
background-color: #ffbbbb;
padding: 0;
margin: 0;
}

div.note {
background-color: #bbffbb;
padding: 0;
margin: 0;
}

div.important {
background-color: #ffffbb;
padding: 0px 0px;
margin: 0px 0px;
}

div.warning p, div.note p, div.important p{
margin: auto 0;
margin-left: 100px;
padding: 5px;
}

div.warning h3.title, div.note h3.title, div.important h3.title {
float: left;
font-size: 10pt;
color: #5a3aca;
font-weight: bold;
margin: auto;
margin-right: 15px;
padding: 5px;
}

code {
font-family: monospace, "Courier New";
color: #0000ff;
font-size: 10pt;
}

img {
display: block;
margin: 0 auto;
text-align: center;
}

ol
{
margin: 5px;
margin-left: 20px;
}

ul
{
margin: 5px;
margin-left: 20px;
list-style-type: square;
}

ul p, ol p
{
margin: 5px 0;
padding: 0;
}

li
{
margin: 5px;
}

div.example {
background: #f0f0f0;
}

div.example p {
margin: 0px 0;
padding: 0;
}

div.example p.title {
color: white;
background: #5a3aca;
padding: 2px;
}

pre {
margin-top: 0;
margin-bottom: 0;
padding: 5px;
font-family: monospace, "Courier New";
color: #000000;
font-size: 9pt;
overflow: auto;
}
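
The listings above do not show how the generated HTML picks up the CSS file. One possible way, sketched here under the assumption that the stock DocBook XSL HTML stylesheets are in use, is to set their html.stylesheet parameter in a small wrapper stylesheet. The wrapper file name TechArticle-css.xsl is hypothetical, and the package linked above may well do this differently.

```xml
<?xml version="1.0"?>
<!-- TechArticle-css.xsl (hypothetical name): wraps TechArticle.xsl -->
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Reuse the header and footer customization shown earlier -->
  <xsl:import href="TechArticle.xsl"/>

  <!-- DocBook XSL parameter: emits a link to this CSS file in the HTML head -->
  <xsl:param name="html.stylesheet">TechArticle.css</xsl:param>

</xsl:stylesheet>
```

With such a wrapper in place, a command along the lines of `xsltproc -o mydoc.html TechArticle-css.xsl mydoc.xml` (file names are placeholders) should produce an HTML page that references TechArticle.css, provided docbook.xsl can be found where TechArticle.xsl imports it from.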

The three files described above are all you need to start creating technical documents for your team. You will probably want to create a new project in your source repository to hold the XML documents written by your team, and to integrate the HTML generation into your nightly builds. Another option is to set up a web server that stores all the XML files and creates the HTML on demand. For Apache, there are plugins like this.

That is all. Please don’t hesitate to send suggestions, criticism, etc.




