Question

I am looking for a reasonable way to populate DocBook tables from XML files. The goal is a DocBook file that contains only a minimal reference to the data needed. When the DocBook file is processed into the final publication, each reference should be replaced with data retrieved from the XML file.

Specific Example

Below is a specific example to illustrate this further. It is fairly detailed because my first attempt at asking this question was too vague.

source-document.docbook

<?xml version="1.0" encoding="utf-8"?>
<article xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="en">
    <info><title/></info>
    <table><title/><tgroup cols="2"><tbody>
        <row>
            <entry>good in comparative</entry>
            <entry>
                <phrase role="populateme">
                    <phrase>good</phrase>
                    <phrase>ADJ COMP</phrase>
                </phrase>
            </entry>
        </row>
        <row>
            <entry>good in superlative</entry>
            <entry>
                <phrase role="populateme">
                    <phrase>good</phrase>
                    <phrase>ADJ SUPL</phrase>
                </phrase>
            </entry>
        </row>
    </tbody></tgroup></table>
</article>

source-database.xml

<?xml version="1.0" encoding="utf-8"?>
<database>
    <row>
        <cell>good</cell>
        <cell>ADJ POST</cell>
        <cell>good</cell>
    </row>
    <row>
        <cell>better</cell>
        <cell>ADJ COMP</cell>
        <cell>good</cell>
    </row>
    <row>
        <cell>best</cell>
        <cell>ADJ SUPL</cell>
        <cell>good</cell>
    </row>
</database>

processing

The Makefile contains a recipe that produces publication.pdf from source-document.docbook and source-database.xml. (Currently my tools of choice are xsltproc and fop, but others can be suggested.)

publication.pdf

A normal PDF publication generated from DocBook, with the following substitutions:

<phrase role="populateme">
    <phrase>good</phrase>
    <phrase>ADJ COMP</phrase>
</phrase>

The above produces "better" instead of "goodADJ COMP".

<phrase role="populateme">
    <phrase>good</phrase>
    <phrase>ADJ SUPL</phrase>
</phrase>

The above produces "best" instead of "goodADJ SUPL".

final remark

<phrase role="populateme"><phrase>ref</phrase><phrase>ref2</phrase></phrase>

The above "syntax" is very cumbersome, but I could not yet think of anything better that is valid DocBook.

Preliminary thoughts about solution

XInclude tags

  • pros: xml technique
  • cons: poor XPointer support; the solution would probably be cumbersome, if possible at all

xslt preprocessing transformation

  • pros: xml technique
  • cons: XSLT is quite confusing; it might even be impossible to achieve this with XSLT?

python preprocessing script

  • pros: possibly simplest solution to achieve this?
  • cons: inability to achieve this with xml's own mechanisms

something else?

Any input on which approach I should take, and why, is welcome, as are full code examples.

Solution

Here is an XSLT stylesheet:

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:db="http://docbook.org/ns/docbook"
            exclude-result-prefixes="db"
            version="1.0">

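  <!-- Load the lookup data once. The relative path is resolved against
       the location of this stylesheet. -->
  <xsl:variable name="database" select="document('source-database.xml')"/>

  <!-- Identity template: copy everything that has no more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the content of any entry that holds a "populateme" placeholder. -->
  <xsl:template match="db:entry[db:phrase[@role='populateme']]">

    <xsl:element name="entry" namespace="http://docbook.org/ns/docbook">
      <!-- Find the row whose third cell equals the first inner phrase and whose
           second cell equals the second inner phrase; output its first cell. -->
      <xsl:value-of select="$database//row[cell[3] = current()/db:phrase/db:phrase[1]
                            and cell[2] = current()/db:phrase/db:phrase[2]]/cell[1]"/>
    </xsl:element>

  </xsl:template>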
</xsl:stylesheet>

The stylesheet performs a lookup in source-database.xml. When it is applied to source-document.docbook, the following result document is produced:

<article xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="en">
  <info><title/></info>
  <table><title/>
  <tgroup cols="2">
    <tbody>

      <row>
        <entry>good in comparative</entry>
        <entry>better</entry>
      </row>

      <row>
        <entry>good in superlative</entry>
        <entry>best</entry>
      </row>

    </tbody>
  </tgroup>
  </table>
</article>

This document (let's call it publication.docbook) can then be turned into a PDF (publication.pdf).
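
As a rough sketch of the whole chain with the xsltproc and fop tools mentioned in the question, the three steps could be driven from a small Python script like the one below. Several things here are assumptions and not part of the original setup: the stylesheet above is saved as populate.xsl (a made-up name) next to source-database.xml, the intermediate file is called publication.fo, and the namespaced DocBook XSL FO stylesheet lives at the path shown, which differs between systems.

```python
import subprocess

# Assumption: install location of the namespaced DocBook XSL FO stylesheet.
# Adjust this path for your system.
DOCBOOK_FO_XSL = "/usr/share/xml/docbook/stylesheet/docbook-xsl-ns/fo/docbook.xsl"


def build_commands(document="source-document.docbook",
                   lookup_xsl="populate.xsl",  # hypothetical file name
                   fo_xsl=DOCBOOK_FO_XSL):
    """Return the three command lines of the pipeline, in order."""
    return [
        # 1. Resolve the placeholders with the lookup stylesheet.
        ["xsltproc", "--output", "publication.docbook", lookup_xsl, document],
        # 2. Turn the populated DocBook into XSL-FO.
        ["xsltproc", "--output", "publication.fo", fo_xsl, "publication.docbook"],
        # 3. Render the XSL-FO file as a PDF.
        ["fop", "-fo", "publication.fo", "-pdf", "publication.pdf"],
    ]


def run_pipeline(commands):
    """Run each command and stop at the first one that fails."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(build_commands())
```

The same three commands map directly onto Makefile recipes, with publication.docbook and publication.fo as intermediate targets.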

I think it is something like this that you are looking for. Am I right?
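
For comparison with the "python preprocessing script" option listed in the question, the same lookup could be sketched with Python's standard library alone. This is only an illustration under the assumptions of the example files: each database row holds exactly the three cells shown (inflected form, tag, base form), and each placeholder holds exactly two inner phrases (base form, tag). The function names and the output file name publication.docbook are chosen here for illustration. Like the stylesheet, it replaces the whole content of the entry; placeholders without a matching row are left untouched.

```python
import xml.etree.ElementTree as ET

DB_NS = "http://docbook.org/ns/docbook"
NS = {"db": DB_NS}


def build_index(database_root):
    """Map (base form, tag) to the inflected form, using the cell order of the example."""
    index = {}
    for row in database_root.iter("row"):
        cells = [(cell.text or "").strip() for cell in row.findall("cell")]
        if len(cells) >= 3:
            form, tag, base = cells[0], cells[1], cells[2]
            index[(base, tag)] = form
    return index


def populate(article_root, index):
    """Replace each populateme placeholder with its looked-up text; return the count."""
    replaced = 0
    # Materialise the list first so the tree is not modified while being traversed.
    for entry in list(article_root.iter(f"{{{DB_NS}}}entry")):
        placeholder = entry.find("db:phrase[@role='populateme']", NS)
        if placeholder is None:
            continue
        keys = [(p.text or "").strip() for p in placeholder.findall("db:phrase", NS)]
        if len(keys) != 2:
            continue
        form = index.get((keys[0], keys[1]))
        if form is None:
            continue  # no matching row: keep the placeholder as it is
        entry.remove(placeholder)
        entry.text = form
        replaced += 1
    return replaced


if __name__ == "__main__":
    # Write DocBook elements with a default namespace instead of an ns0: prefix.
    ET.register_namespace("", DB_NS)
    index = build_index(ET.parse("source-database.xml").getroot())
    document = ET.parse("source-document.docbook")
    populate(document.getroot(), index)
    document.write("publication.docbook", encoding="utf-8", xml_declaration=True)
```

The dictionary plays the role of the XPath predicate in the stylesheet: the pair (base form, tag) is the key, and the first cell of the matching row is the value.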

Licensed under: CC-BY-SA with attribution