Question

Does anyone know of some good elisp macros for cleaning up LaTeX code?

I do a lot of LaTeX editing of other people's sources, and I'd like to extend my set of clean-up tools, since not everyone organizes their code the way I like it ;-)

One in particular would be interesting: running a function X on a buffer so that every LaTeX environment delimiter (each \begin{...} and \end{...}) sits on a line of its own. This helps the readability of the code.

I could try this myself, but I would like to hear suggestions on best practice for programming such a function; for example, it should of course not introduce blank lines of its own.

Suggestions?

Edit: For the archives, here is my current version, based on the answer given (it assumes the use of AUCTeX). It more or less suits my needs at the moment. I added the y-or-n test just to be able to detect corner cases that I had not thought of.

(defun enviro-split ()
  "Find begin and end macros, and put them on their own line."
  (interactive)
  (save-excursion
    ;; `beginning-of-buffer' is meant for interactive use; in Lisp code
    ;; move point directly.
    (goto-char (point-min))

    ;; loop over document looking for begin and end macros
    (while (re-search-forward "\\\\\\(begin\\|end\\)" nil t)
      (catch 'continue

        ;; if the line is a pure comment, then go to the next match
        (if (TeX-in-commented-line)
            (throw 'continue nil))

        ;; when you find one, back up to the beginning of the macro
        (search-backward "\\")

        ;; If anything other than whitespace precedes it on the line,
        ;; add a newline
        (when (not (looking-back "^[ \t]*" (line-beginning-position)))
          (if (y-or-n-p "newline?")
              (insert "\n")))

        ;; move over the arguments, one or two pairs of matching braces
        (search-forward "{")            ; start of the argument
        (forward-char -1)
        (forward-sexp)                  ; move over the argument
        (if (looking-at "[ \t]*{")      ; is there a second argument?
            (forward-sexp))             ; move over it if so
        (if (looking-at "[ \t]*\\[")    ; is there an optional argument?
            (forward-sexp))             ; move over it if so
        ;; keep a \label that directly follows on the same line
        (when (looking-at (concat "[ \t]*" (regexp-quote TeX-esc) "label"))
          (goto-char (match-end 0))
          (forward-sexp))

        ;; leave a trailing comment where it is
        (if (looking-at "[ \t]*%")
            (throw 'continue nil))

        ;; If there is anything other than whitespace following the macro,
        ;; insert a newline
        (if (not (looking-at "\\s *$"))
            (if (y-or-n-p "newline (a)?")
                (insert "\n")))))
    (LaTeX-fill-buffer 'left)))

Solution

You could probably work up a single regexp and do a regexp replace for this. However, I find that the logic of these manipulations becomes pretty hairy, particularly when you want to account for various edge cases. In your example, you need to deal with some environments taking one argument while others take two. I think it is easier to combine a series of simple regexps with basic text-editing commands for this:

(defun enviro-split ()
  "Find begin and end macros, and put them on their own line."
  (interactive)
  (save-excursion
    (goto-char (point-min))

    ;; loop over document looking for begin and end macros
    (while (re-search-forward "\\\\\\(begin\\|end\\)" nil t)

      ;; when you find one, back up to the beginning of the macro
      (search-backward "\\")

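      ;; If anything other than whitespace precedes it on the line,
      ;; add a newline (testing only for beginning of line would leave
      ;; a whitespace-only line in front of an indented macro)
      (unless (looking-back "^[ \t]*" (line-beginning-position))
        (insert "\n"))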

      ;; move over the arguments, one or two pairs of matching braces
      (search-forward "{")              ; start of the argument
      (forward-char -1)
      (forward-sexp)                    ; move over the argument
      (if (looking-at "\\s *{")         ; is there a second argument?
          (forward-sexp))               ; move over it if so

      ;; If there is anything other than whitespace following the macro,
      ;; insert a newline
      (if (not (looking-at "\\s *$"))
          (insert "\n")))))

This approach has the advantage of using Emacs' built-in functions for moving over sexps, which is much easier than coming up with your own regexp that can handle multiple, potentially nested, expressions inside braces.
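
To see that point in isolation, here is a minimal sketch that uses forward-sexp to pick out a brace-delimited argument containing nested braces. The function name my/first-braced-argument is hypothetical and exists only for this illustration. The sketch sets the brace syntax explicitly so that it does not depend on any particular major mode being active; in a real LaTeX buffer the mode's own syntax table would be used instead.

```elisp
(defun my/first-braced-argument (text)
  "Return the first brace-delimited argument in TEXT, nested braces included.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    ;; Assumption: give braces paren syntax explicitly, so the demo does
    ;; not rely on a LaTeX major mode being loaded.
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?{ "(}" table)
      (modify-syntax-entry ?} "){" table)
      (set-syntax-table table))
    (insert text)
    (goto-char (point-min))
    (search-forward "{")                ; find the start of the argument
    (forward-char -1)                   ; step back onto the opening brace
    (let ((start (point)))
      (forward-sexp)                    ; jump to just after the matching brace
      (buffer-substring-no-properties start (point)))))
```

Evaluating (my/first-braced-argument "\\caption{A {nested} group} tail") should return "{A {nested} group}": the inner braces are skipped as part of the balanced expression, with no regexp needed to count them.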

Licensed under: CC-BY-SA with attribution