Question

I'm currently modifying the offline-dokuwiki[1] shell script to fetch the latest documentation for an application, so that it can be embedded automatically within instances of that application. This works quite well, except that in its current form it grabs three versions of each page:

  1. The full page including header and footer
  2. Just the content without header and footer
  3. The raw wiki syntax

I'm actually only interested in version 2. It is linked from the main pages by an HTML <link> tag in the <head>, like so:

<link rel="alternate" type="text/html" title="Plain HTML" 
href="/dokuwiki/doku.php?do=export_xhtml&amp;id=documentation:index" /> 

Its URL is the same as that of the main wiki page, except that the query string contains 'do=export_xhtml'. Is there a way of instructing wget to download only these versions, or to automatically append '&do=export_xhtml' to any links it follows? If so, this would be a great help.

[1] http://www.dokuwiki.org/tips:offline-dokuwiki.sh (author: samlt)


Solution

DokuWiki also accepts the do parameter as an HTTP header. You could run wget with the option --header "X-DokuWiki-Do: export_xhtml".
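
For example, a minimal sketch of such an invocation (the recursion options and the wiki URL below are illustrative placeholders; adapt them to whatever offline-dokuwiki.sh already uses):

# Ask DokuWiki to render every fetched page as the plain XHTML export.
# The header accompanies every request, so no link rewriting is needed.
wget --recursive --level=inf --page-requisites --convert-links \
     --header "X-DokuWiki-Do: export_xhtml" \
     "http://www.example.com/dokuwiki/doku.php?id=documentation:index"

Because the header is sent with every request, each page should come back in the content-only export form without having to append '&do=export_xhtml' to the crawled links.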

Licensed under: CC-BY-SA with attribution