Question

Is there a way to convert all Linux man pages to plain text, HTML, or Markdown?

I need to do this for every man file I have installed on my system.

Solution

Yes. To convert a single page, for example the man page for man itself:

zcat /usr/share/man/man1/man.1.gz  | groff -mandoc -Thtml

If you want every page installed on your machine, iterate over them. For a different output format (plain text, for example), use a different output device (the -T argument), such as -Tutf8 or -Tascii.
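
As a rough illustration of the device switch, here is a minimal Python sketch that maps a few format names to groff devices and converts one page. The function names and the FORMAT_DEVICES table are made up for this example; html, utf8 and ascii are standard groff devices, and the sketch assumes groff is on the PATH.

```python
import gzip
import subprocess

# Hypothetical mapping from a friendly format name to a groff -T device.
FORMAT_DEVICES = {"html": "html", "text": "utf8", "ascii": "ascii"}


def groff_command(fmt):
    """Build the groff argv for the requested output format."""
    return ["groff", "-mandoc", "-T" + FORMAT_DEVICES[fmt]]


def convert_page(gz_path, fmt="html"):
    """Decompress one .gz man page and return groff's output as bytes."""
    with gzip.open(gz_path, "rb") as handle:
        source = handle.read()
    result = subprocess.run(groff_command(fmt), input=source,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


if __name__ == "__main__":
    import sys
    # Same page as the zcat example above, rendered as text instead of HTML.
    sys.stdout.buffer.write(convert_page("/usr/share/man/man1/man.1.gz", "text"))
```

Only the -T value changes between formats; the rest of the pipeline stays the same.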

In case the iteration itself was the real problem, you can use:

OUT_DIR=...

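# Run this from the man root (for example /usr/share/man);
# the directory layout is mirrored under OUT_DIR.
find . -name '*.gz' | while IFS= read -r i; do
    dname=$(dirname "$i")
    mkdir -p "$OUT_DIR/$dname"
    zcat "$i" | groff -mandoc -Thtml > "$OUT_DIR/$i.html"
done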
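
The same loop can be sketched in Python, which makes it a little easier to record pages that fail to convert. This is only an illustration under assumptions: output_path and convert_tree are names invented here, the two paths in the usage part are placeholders, and groff is assumed to be installed.

```python
import gzip
import subprocess
from pathlib import Path


def output_path(page, man_root, out_dir):
    """Mirror the page's location under out_dir and append .html."""
    relative = Path(page).relative_to(man_root)
    return Path(out_dir) / (str(relative) + ".html")


def convert_tree(man_root, out_dir):
    """Convert every *.gz page under man_root; return the pages that failed."""
    failed = []
    for page in sorted(Path(man_root).rglob("*.gz")):
        target = output_path(page, man_root, out_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        try:
            with gzip.open(page, "rb") as handle:
                source = handle.read()
        except OSError:  # unreadable file, broken symlink, or not really gzip
            failed.append(page)
            continue
        result = subprocess.run(["groff", "-mandoc", "-Thtml"], input=source,
                                stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            failed.append(page)
            continue
        target.write_bytes(result.stdout)
    return failed


if __name__ == "__main__":
    # Both paths are placeholders for this sketch.
    for page in convert_tree("/usr/share/man", "man-html"):
        print("failed:", page)
```

As in the shell version, a page such as man1/ls.1.gz ends up at man1/ls.1.gz.html under the output directory.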

Other tips

Running man -k '' lists the names of all available man pages, which may be better than locating and decompressing the raw man page files with find and zcat. The man command also has an option, -T, --troff-device[=DEVICE], that can generate HTML for a given section and page name. The following bash script therefore converts every man page available on your Linux system into an HTML file:

man -k '' | while read -r sLine; do
    # Each line looks like: name (section) - description
    sName=$(echo "$sLine" | cut -d' ' -f1)
    sSection=$(echo "$sLine" | cut -d')' -f1 | cut -d'(' -f2)
    echo "converting ${sName}(${sSection}) to ${sName}.${sSection}.html ..."
    man -Thtml "${sSection}" "${sName}" > "${sName}.${sSection}.html"
done
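
The cut-based parsing above relies on the exact spacing of the man -k output. As a hedged alternative, the Python sketch below parses each line with a regular expression and skips lines that do not match. The helper names are invented for this example, and the line format ("name (section) - description") is what man-db typically prints; other implementations may differ.

```python
import re

# man-db prints lines such as "ls (1)   - list directory contents".
LINE_RE = re.compile(r"^(?P<name>\S+)\s*\((?P<section>[^)]+)\)\s+-\s+(?P<desc>.*)$")


def parse_apropos_line(line):
    """Return (name, section) for one `man -k` line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("section")


def html_filename(name, section):
    """Build the output file name, replacing '/' so it stays one path component."""
    return "{}.{}.html".format(name.replace("/", "_"), section)


if __name__ == "__main__":
    import subprocess
    listing = subprocess.run(["man", "-k", ""], stdout=subprocess.PIPE,
                             universal_newlines=True, check=True).stdout
    for line in listing.splitlines():
        parsed = parse_apropos_line(line)
        if parsed is None:
            continue  # skip lines that are not in the expected format
        name, section = parsed
        with open(html_filename(name, section), "wb") as out:
            subprocess.run(["man", "-Thtml", section, name], stdout=out)
```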

On an intranet without Internet access, where online man page services are unavailable, serving these files from a static HTTP server such as Nginx with autoindex on is a good option: you can browse the index and search it with Ctrl+F.
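
If Nginx is not at hand, Python's built-in http.server can stand in for a quick test, since it also produces a directory listing. Note that this swaps in a different server from the one suggested above and is only a sketch: make_server is a name invented here, and the directory and port are placeholders.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory, host="127.0.0.1", port=8000):
    """Create (but do not start) a server that lists and serves files in directory."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)


if __name__ == "__main__":
    server = make_server("man-html")  # placeholder directory with the HTML files
    print("serving on http://%s:%d/" % server.server_address)
    server.serve_forever()
```

This is intended for local browsing only; for anything shared across the intranet, a real web server such as the Nginx setup described above is the better choice.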

man -Hfirefox ls

opens the man page for "ls" directly in Firefox.

Probably the best way to do this in code rather than with an application is to use pandoc: https://pandoc.org

You can even convert strings between markup formats inline, for example with pypandoc in Python:

import pypandoc

# With an input file: it will infer the input format from the filename
output = pypandoc.convert_file('somefile.md', 'rst')

# ...but you can overwrite the format via the `format` argument:
output = pypandoc.convert_file('somefile.txt', 'rst', format='md')

# alternatively you could just pass some string. In this case you need to
# define the input format:
output = pypandoc.convert_text('#some title', 'rst', format='md')
# output == 'some title\r\n==========\r\n\r\n'

This works for me:

man --html=cat gcc > gcc.htm

To convert a man page, I use:

zcat "/usr/share/man/man1/${PROGRAM}.1.gz" | manly > "out.html"

To display a man page directly as HTML, I use:

oman "${PROGRAM}"

I recommend trying Pandoc:

$ pandoc --from man --to html < input.1 > output.html

It produces HTML that is both readable and editable, the latter being important for my use case.

It can also produce a lot of other formats such as Markdown, which is nice when you're not sure which format you want to commit to yet.
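
To apply the same command to installed pages, which are usually gzip-compressed, the source can be decompressed first and passed to pandoc on standard input. The Python sketch below shows one way to do that; the function names are made up for this example, and it assumes a pandoc version with the man reader is on the PATH and that the page is encoded in a form pandoc accepts (it expects UTF-8).

```python
import gzip
import subprocess


def pandoc_command(target_format):
    """Build a pandoc argv that reads man source on stdin and writes to stdout."""
    return ["pandoc", "--from", "man", "--to", target_format]


def convert_with_pandoc(path, target_format="markdown"):
    """Convert one man page (optionally gzip-compressed) and return the result as text."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        source = handle.read()
    result = subprocess.run(pandoc_command(target_format), input=source,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout.decode("utf-8")


if __name__ == "__main__":
    # The path is only an example; any installed page would do.
    print(convert_with_pandoc("/usr/share/man/man1/man.1.gz", "markdown"))
```

Switching the target from "markdown" to "html" reproduces the command shown earlier.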

There is a comment on the question that says Pandoc cannot convert from man, but that seems to be out of date. The current version (2.13) does a decent job converting man to html for my example.

Furthermore, while the accepted answer suggests using groff -mandoc -Thtml, that did not do as good a job for me as Pandoc. Specifically, I wanted to convert the old Flex-2.5.5 man page to HTML. groff (version 1.22.4) unfortunately mangled all of the code examples (no indentation, no fixed-width font), making them difficult to read, while Pandoc brought them over as pre sections. Additionally, the groff output is full of explicit inline styles, while the Pandoc output uses no CSS at all, making it a better starting point for editing.

(There is an existing answer that also mentions Pandoc, and I considered editing my information into it, but I wanted to say more about my experience using it.)

Today is your lucky day. Someone has already done this for you. http://linux.die.net/

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow