November 2008


Hello PlanetKDE

Since I occasionally do write about KDE and Qt development here, I took the liberty of adding myself to the planet. My contributions to KDE so far are limited to a few patches here and there. I’d like to contribute more, but most of the time my work puts an end to that. I am however taking up studies in January, which will hopefully result in more spare time for KDE.

I’ve added a tag to my blog for posts that should be picked up, so you’ll only see KDE-related posts, with maybe a couple of general technology-related ones every now and then, if you don’t mind :)

Sorry about the somewhat ugly hackergotchi, it was the only picture I had available at the moment, and my photo editing skills aren’t top notch exactly.

As an avid reader of PlanetKDE, it’s nice to now be in the boat with the rest of you guys and gals!

The last few days I’ve been working on setting up a MediaWiki installation for the libxml2 project. One thing that I wanted to do was add a tag extension that would allow editors to write e.g. <api>xmlNode</api> and get a link to the API documentation for the xmlNode structure at the libxml2 web site. This turned out to be trivial enough and I soon had it working.

One small problem arose though. I installed the SyntaxHighlight_GeSHi extension to allow source code to be marked up and highlighted using <source />, and my <api /> tag extension would not work inside a <source /> as another extension was already in effect. Not only that: even if it had worked, having to mark up all occurrences of libxml2 API symbols inside source code would be cumbersome, and it would be hard to discern the source code when editing the wiki markup with all those <api /> tags littered about.

So I figured I’d just patch the SyntaxHighlight_GeSHi extension to do automatic identification and linkifying of libxml2 API symbols instead. It turned out to be easier than I thought. The API symbols for libxml2 can be obtained as an XML file by running the doc/apibuild.py script in the libxml2 source tree. This file has the following format (condensed example excerpt):

<?xml version="1.0" encoding="ISO-8859-1"?>
<api name='libxml2'>
  <files>
    <file name='DOCBparser'>
      <exports symbol='docbParserInputPtr' type='typedef'/>
      <exports symbol='docbParserCtxt' type='typedef'/>
      ...
    </file>
    <file name='HTMLparser'>
      <exports symbol='htmlDefaultSubelement' type='macro'/>
      <exports symbol='htmlElementAllowedHereDesc' type='macro'/>
      ...
    </file>
    ...
  </files>
</api>

So I first made the following PHP script that takes this XML and generates a PHP associative array definition with [symbol] => [url] mappings.

<?php
/*
 * This script reads 'libxml2-api.xml' from the current directory and
 * generates a PHP array definition on standard output.
 *
 * The array will have the format (symbol => url, ... ).
 *
 * Usage: php
 */
 
$apibase = 'http://xmlsoft.org/html'; // The base for the API URLs.
 
// Open input. XMLReader::open() returns FALSE on failure, so test its
// return value; the reader object itself never compares equal to FALSE.
$xml = new XMLReader();
if (!$xml->open('libxml2-api.xml')) {
  echo "Unable to open libxml2-api.xml\n";
  exit(1);
}
 
$file = NULL;
$symbol = NULL;
$type = NULL;
 
echo "<?php\n\n\$libxml2ApiSymbols = array (\n";
while ($xml->read()) {
  if ($xml->nodeType == XMLReader::ELEMENT && $xml->name == 'file')
    while($xml->moveToNextAttribute())
      if ($xml->name == 'name')
        $file = $xml->value;
  if ($xml->nodeType == XMLReader::ELEMENT && $xml->name == 'exports')
    while($xml->moveToNextAttribute())
      if ($xml->name == 'symbol')
        $symbol = $xml->value;
      elseif ($xml->name == 'type')
        $type = $xml->value;
  if ($file != NULL && $symbol != NULL && $type != NULL) {
    echo "  \"$symbol\" => \"$apibase/libxml-$file.html#$symbol\",\n";
    $symbol = NULL;
  }
}
echo ");\n\n?>";
 
$xml->close();
?>
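
To illustrate what the generated mapping looks like, here is a minimal sketch that builds the same kind of array in memory. It is not the script above: it assumes SimpleXML instead of the streaming XMLReader, and it is fed a tiny inline sample modelled on the excerpt rather than the real libxml2-api.xml. The variable names $sample and $symbols are made up for the example.

```php
<?php
// Inline sample modelled on the condensed excerpt; NOT the real API file.
$sample = <<<XML
<api name='libxml2'>
  <files>
    <file name='DOCBparser'>
      <exports symbol='docbParserInputPtr' type='typedef'/>
    </file>
    <file name='HTMLparser'>
      <exports symbol='htmlDefaultSubelement' type='macro'/>
    </file>
  </files>
</api>
XML;

$apibase = 'http://xmlsoft.org/html'; // Same URL base as in the script.
$symbols = array();

// Assumption: SimpleXML is fine here because the sample is tiny; the
// script above streams the file with XMLReader instead.
$api = simplexml_load_string($sample);
foreach ($api->files->file as $file) {
  foreach ($file->exports as $export) {
    $name = (string) $export['symbol'];
    // Same URL scheme as the script: libxml-<file>.html#<symbol>
    $symbols[$name] = "$apibase/libxml-{$file['name']}.html#$name";
  }
}

var_export($symbols);
```

Each exported symbol ends up as one key, pointing at the anchor with the same name on the page for the file that exports it.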

I ran this script against the libxml2-api.xml from libxml2 2.7.2 and put the output into extensions/Libxml2ApiSymbols.php. Then it was just a matter of patching the SyntaxHighlight_GeSHi/SyntaxHighlight_GeSHi.class.php slightly to make it do automatic detection and linkifying of API symbols. I also took the opportunity to patch it to default to the C language when highlighting, as it will be the most common one at the libxml2 wiki. Below is the full diff.

--- SyntaxHighlight_GeSHi.class.php.orig        2008-11-11 17:13:09.000000000 +0100
+++ SyntaxHighlight_GeSHi.class.php     2008-11-11 17:56:36.000000000 +0100
@@ -1,5 +1,7 @@
 <?php
 
+require_once('extensions/Libxml2ApiSymbols.php');
+
 class SyntaxHighlight_GeSHi {
 
        /**
@@ -21,6 +23,7 @@
         * @return string
         */
        public static function parserHook( $text, $args = array(), $parser ) {
+    global $libxml2ApiSymbols;
                self::initialise();
                $text = rtrim( $text );
                // Don't trim leading spaces away, just the linefeeds
@@ -29,7 +32,8 @@
                if( isset( $args['lang'] ) ) {
                        $lang = strtolower( $args['lang'] );
                } else {
-                       return self::formatError( htmlspecialchars( wfMsgForContent( 'syntaxhighlight-err-language' ) ) );
+      $lang = 'c'; // Default to C language
+                       //return self::formatError( htmlspecialchars( wfMsgForContent( 'syntaxhighlight-err-language' ) ) );
                }
                if( !preg_match( '/^[a-z_0-9-]*$/', $lang ) )
                        return self::formatError( htmlspecialchars( wfMsgForContent( 'syntaxhighlight-err-language' ) ) );
@@ -67,6 +71,11 @@
                                $out = str_replace( "\n", '', $out );
                        // Register CSS
                        $parser->mOutput->addHeadItem( self::buildHeadItem( $geshi ), "source-{$lang}" );
+      // Add libxml2 API links
+      if ($lang == 'c')
+        foreach ($libxml2ApiSymbols as $symbol => $url) {
+          $out = preg_replace("/$symbol([^a-zA-Z0-9_])/", "<a class=\"libxml2-symbol\" href=\"$url\">$symbol</a>$1", $out);
+        }
                        if ( $enclose === GESHI_HEADER_NONE ) {
                                return '<span class="'.$lang.' source-'.$lang.'"> '.$out . '</span>';
                        } else {
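
One caveat with the loop in the patch: the pattern only checks the character after the symbol, so a symbol at the end of a longer identifier (say my_xmlNode) is linked too, and every later iteration rescans the links inserted by earlier ones. A possible stricter variant, as a sketch only and not part of the patch, walks over every identifier once and looks it up in the table. The function name linkifySymbols and the two-entry table are hypothetical. Like the original, it works on the highlighted HTML, so tag and attribute names are visited as well, but they are left alone unless they happen to be in the table.

```php
<?php
// Hypothetical two-entry table for illustration; the real one is the
// generated extensions/Libxml2ApiSymbols.php.
$libxml2ApiSymbols = array(
  'xmlNode'    => 'http://xmlsoft.org/html/libxml-tree.html#xmlNode',
  'xmlNodePtr' => 'http://xmlsoft.org/html/libxml-tree.html#xmlNodePtr',
);

// Link every whole identifier that is a known symbol, in a single pass.
function linkifySymbols($out, array $symbols) {
  return preg_replace_callback(
    '/\b[A-Za-z_][A-Za-z0-9_]*\b/', // one complete C-style identifier
    function ($m) use ($symbols) {
      $name = $m[0];
      if (!isset($symbols[$name])) {
        return $name; // not an API symbol, leave untouched
      }
      return '<a class="libxml2-symbol" href="'
           . htmlspecialchars($symbols[$name]) . '">' . $name . '</a>';
    },
    $out
  );
}

echo linkifySymbols('xmlNodePtr cur = (xmlNodePtr) my_xmlNode;',
                    $libxml2ApiSymbols), "\n";
```

Because the lookup is an array access rather than one preg_replace call per symbol, this also avoids running thousands of regular expressions over each highlighted block.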

The last thing I did was a bit of styling of the links in the MediaWiki:Geshi.css:

a.libxml2-symbol {
  color: #a06060;
  text-decoration: none;
}
a.libxml2-symbol:visited {
  color: #a06060;
}

The result can be seen here. Hope someone finds this useful. By the way, if anyone feels like helping out with the wiki, please register and edit away!

My somewhat unsung heroes

Right now, this very weekend, the developers of KOffice, a free software office suite, are gathering at the KDAB offices in Berlin for a final meeting before their long-anticipated 2.0 release. Among the attendees are both developers and marketing people. The meeting will have two tracks: a technical track for discussing the remaining technical issues with KOffice that need to be solved before the release, and a marketing track for discussing how to present the release and how to handle people’s expectations of it, something that is very important after such a long and hard development cycle.

These people have been working relentlessly on this release, with a release cycle that spans years. Along the way much has happened. Endless design discussions have taken place on the mailing list and on IRC. There have been disagreements among developers, but also reconciliation. The developers are all motivated by different things, and they have all put in a tremendous amount of free time in making this release come out as good as possible. I’ve tried to help out a little bit by putting in a few patches here and there, but at the end of the day, I find the amount of work and the dedication shown by the core KOffice team simply amazing.

In my eyes the KOffice project is one of the more important ones within the KDE sphere. An office suite is something that almost any computer user has to use one day or another, and in the Free Software world the only real alternative at the moment is the OpenOffice.org suite. Other KDE projects such as Amarok, which are nowhere near the size or scope of KOffice, have gotten a huge amount of attention, and rightfully so, don’t get me wrong. KOffice, however, hasn’t really.

Anyway, to conclude this post: the KOffice developers are my heroes, and however the 2.0 release turns out, I think they deserve a big Thank You!

Let’s start off this post with two apologies:

  1. I’m sorry that I haven’t blogged in quite a while. I do have quite a backlog of ideas, and I hope I’ll have the time and energy to write them soon.
  2. I’m sorry that the images in the last post are all in full size and not scaled down, making the page extremely slow to load. The server was missing the GD library used by Wordpress for image scaling. This has now been fixed, but unfortunately the images already uploaded remain full size. I guess I have to start writing more to push the offending post off the front page (see apology number 1).

So let’s go over to what this post is actually about.

I was recently tasked with translating parts of a user manual (DocBook XML) from English to Swedish. Not the most exciting task, but I don’t mind doing it and the pay was OK. Anyone who knows me knows that my editor of choice is the fantastic Vim text editor.

Now, while working on the translation I realized that in addition to having my working copy open in a window, I’d also like to have the unchanged original at hand. This is because, during translation, you delete a chunk of text to be translated, keep it in your short-term memory, and then translate it from your head. This is where things can go wrong: human memory is known to be volatile and cannot be trusted, and when it fails, having the original right there next to your working copy is very handy.

So what I first did was to split Vim vertically and open the original in a window to the right of my working copy using :vnew orig/foo.xml. This worked fine at first, but I found myself having to constantly switch over to the window with the original to scroll it down as the translation progressed, and it cost me quite some time. With a set amount of money for this job, losing time was the last thing I wanted. If only there was a way to make my right window, the one showing the unchanged original, scroll in sync with the left window showing my working copy. And of course there is, this is Vim after all! Just type :set scrollbind in each of the two windows (the option is local to a window) and their scrolling will be bound to each other.

This left me happy for a while as I continued the translation. But as I got further on and more and more of my working copy was translated, the lines in the two windows got skewed, so I found myself having to switch off scroll binding, switch over to the right window and manually scroll it to compensate for the skew, then switch scroll binding back on. I needed to find a way to save time on this compensation monkey work, and the solution was simple: have Vim automatically turn off scroll binding as the right window is entered, and automatically turn it back on as it is left again.

The magic incantation (to be typed in the left window) goes like this:

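:set scb | exe 'au WinEnter right_win set noscb' | au WinLeave right_win set scb

This is a pretty self-explanatory set of commands, but here’s how it works:

  1. Turn on scroll binding in the current window (scb is shorthand for scrollbind).
  2. When the window showing right_win is entered, turn scroll binding off there. Vim windows have no names of their own; the pattern is matched against the file name of the buffer shown in the window, so right_win stands for the original file, e.g. */orig/foo.xml.
  3. When that window is left again, turn scroll binding back on.

The first :au is wrapped in :exe because :au treats everything after it, including the |, as part of the command to run. Without the wrapper, the second :au would be re-added every time the right window is entered.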

This allows me to go on translating in the left window, and when the skew between the two windows gets too bad, all I need to do is switch over to the right window and do some compensation with Up/Down and then switch back, no messing around with turning scroll binding off and on.

Now some people would say: why don’t you just duplicate the row to be translated in your working copy and then delete the original once you’ve translated the copy? Some people might prefer to work like that, but I don’t. The reason, which is kind of the whole point of this post, is that I’m very forgetful, and I believe this to be a common problem. With that approach I’m sure to leave a trail of forgotten original lines in the working copy. In fact, the translation I was doing was a continuation of someone else’s work, and in several places I found such “leftovers” from the previous translator.

Let’s end this with a hint for people who use overlapping buffers in the same window but still want to use this trick: you should be looking at BufEnter / BufLeave instead.

A pretty long post for such a little thing, but it felt kind of good writing something here again.

Bye ’til next time!