Question

I am working on a latex document that will require typesetting significant amounts of python source code. I'm using pygments (the python module, not the online demo) to encapsulate this python in latex, which works well except in the case of long individual lines - which simply continue off the page. I could manually wrap these lines except that this just doesn't seem that elegant a solution to me, and I prefer spending time puzzling about crazy automated solutions than on repetitive tasks.

What I would like is some way of processing the python source code to wrap the lines to a certain maximum character length, while preserving functionality. I've had a play around with some python and the closest I've come is inserting \\\n in the last whitespace before the maximum line length - but of course, if this ends up in strings and comments, things go wrong. Quite frankly, I'm not sure how to approach this problem.

So, is anyone aware of a module or tool that can process source code so that no lines exceed a certain length - or at least a good way to start to go about coding something like that?

Was it helpful?

Solution

You might want to extend your current approach a bit, but using the tokenize module from the standard library to determine where to put your line breaks. That way you can see the actual tokens (COMMENT, STRING, etc.) of your source code rather than just the whitespace-separated words.

Here is a short example of what tokenize can do:

>>> from cStringIO import StringIO
>>> from tokenize import tokenize
>>> 
>>> python_code = '''
... def foo(): # This is a comment
...     print 'foo'
... '''
>>> 
>>> fp = StringIO(python_code)
>>> 
>>> tokenize(fp.readline)
1,0-1,1:    NL  '\n'
2,0-2,3:    NAME    'def'
2,4-2,7:    NAME    'foo'
2,7-2,8:    OP  '('
2,8-2,9:    OP  ')'
2,9-2,10:   OP  ':'
2,11-2,30:  COMMENT '# This is a comment'
2,30-2,31:  NEWLINE '\n'
3,0-3,4:    INDENT  '    '
3,4-3,9:    NAME    'print'
3,10-3,15:  STRING  "'foo'"
3,15-3,16:  NEWLINE '\n'
4,0-4,0:    DEDENT  ''
4,0-4,0:    ENDMARKER   ''

OTHER TIPS

I use the listings package in LaTeX to insert source code; it does syntax highlight, linebreaks et al.

Put the following in your preamble:

\usepackage{listings}
%\lstloadlanguages{Python} # Load only these languages
\newcommand{\MyHookSign}{\hbox{\ensuremath\hookleftarrow}}

\lstset{
    % Language
    language=Python,
    % Basic setup
    %basicstyle=\footnotesize,
    basicstyle=\scriptsize,
    keywordstyle=\bfseries,
    commentstyle=,
    % Looks
    frame=single,
    % Linebreaks
    breaklines,
    prebreak={\space\MyHookSign},
    % Line numbering
    tabsize=4,
    stepnumber=5,
    numbers=left,
    firstnumber=1,
    %numberstyle=\scriptsize,
    numberstyle=\tiny,
    % Above and beyond ASCII!
    extendedchars=true
}

The package has hook for inline code, including entire files, showing it as figures, ...

I'd check a reformat tool in an editor like NetBeans.

When you reformat java it properly fixes the lengths of lines both inside and outside of comments, if the same algorithm were applied to Python, it would work.

For Java it allows you to set any wrapping width and a bunch of other parameters. I'd be pretty surprised if that didn't exist either native or as a plugin.

Can't tell for sure just from the description, but it's worth a try:

http://www.netbeans.org/features/python/

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top