Question

I have a text file with the following content:

+----------------------------------------------------------------+
|                       This is a section                        |
+----------------------------------------------------------------+

####################   This is a subsection   ####################

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

#################   This is another subsection   #################

I'd like each line not to exceed a certain number of characters (66 in this case), so newlines may be inserted where needed; the text should also be justified on both sides, so extra spaces may be added where needed as well. Finally, short lines should not be merged, and lines that already contain exactly the desired number of characters should not be modified, as shown below.

+----------------------------------------------------------------+
|                       This is a section                        |
+----------------------------------------------------------------+

####################   This is a subsection   ####################

Lorem ipsum dolor sit amet, consectetur adipisicing elit,  sed  do
eiusmod tempor incididunt ut labore et dolore  magna  aliqua.   Ut
enim ad minim veniam, quis nostrud  exercitation  ullamco  laboris
nisi ut aliquip ex ea commodo consequat.  Duis  aute  irure  dolor
in reprehenderit in voluptate velit esse cillum dolore  eu  fugiat
nulla pariatur.  Excepteur sint occaecat cupidatat  non  proident,
sunt in culpa qui officia deserunt mollit  anim  id  est  laborum.

#################   This is another subsection   #################

Unfortunately, fmt wraps the text but cannot justify it:

fmt --width=67 in

+----------------------------------------------------------------+
|                       This is a section                        |
+----------------------------------------------------------------+

####################   This is a subsection   ####################

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident,
sunt in culpa qui officia deserunt mollit anim id est laborum.

#################   This is another subsection   #################

and par gives an error (at least on a recent Ubuntu) when it tries to process that file:

par 66j < in

par error:
Cannot justify.

I also tried fold

fold -w 66 in

but it breaks words in the middle just to fill each line up to the limit, and with the -s option its behaviour is similar to that of fmt (on an older openSUSE it also deletes empty lines).

It seems Vim cannot justify a line that is longer than the specified textwidth (see below). However, if I first break the lines at spaces (the fmt or fold approach above), save the output, open it in Vim and run the following commands:

:runtime macros/justify.vim
:% call Justify(66,3)   " 3 is the maximum number of spaces to add

+----------------------------------------------------------------+
|                       This is a section                        |
+----------------------------------------------------------------+

#################### This  is  a  subsection  ####################

Lorem ipsum dolor sit amet, consectetur adipisicing elit,  sed  do
eiusmod tempor incididunt ut labore et dolore  magna  aliqua.   Ut
enim ad minim veniam, quis nostrud  exercitation  ullamco  laboris
nisi ut aliquip ex ea commodo consequat.  Duis  aute  irure  dolor
in reprehenderit in voluptate velit esse cillum dolore  eu  fugiat
nulla pariatur.  Excepteur sint occaecat cupidatat  non  proident,
sunt in culpa qui officia deserunt mollit  anim  id  est  laborum.

################# This  is  another  subsection  #################

I can obtain "almost" the desired result (the only difference is that spaces are also added inside the "subsections"). The worst downside, however, is that direct interaction is required, whereas I need a batch approach because the whole procedure must be automated.

In short, if there is any solution, I would strongly prefer standard Unix text tools (possibly piped into each other) or Vim macros called in "batch mode" (if possible) over custom scripts. I am aware that a Perl program called paradj (which I have not tried yet) has been suggested in the past, but I would like to know whether standard tools can do the job on their own.

EDIT 1

(thanks to Matthew Strawbridge) If I remove the first line with +- ... -+ then par is able to process the file and returns

|          This          is          a          section          |
+----------------------------------------------------------------+

#################### This is a subsection ####################

Lorem ipsum dolor  sit amet, consectetur adipisicing  elit, sed do
eiusmod tempor  incididunt ut  labore et  dolore magna  aliqua. Ut
enim ad  minim veniam,  quis nostrud exercitation  ullamco laboris
nisi ut aliquip ex ea commodo  consequat. Duis aute irure dolor in
reprehenderit  in voluptate  velit  esse cillum  dolore eu  fugiat
nulla pariatur.  Excepteur sint  occaecat cupidatat  non proident,
sunt in culpa qui officia deserunt mollit anim id est laborum.

################# This is another subsection #################

It seems to me like par could be a very good tool to solve the problem, which now becomes:

  1. instruct par to ignore the +- ... -+ patterns (by the way, why did the first one represent an obstacle and the second one not?);
  2. instruct par not to edit the spaces inside "sections" and "subsections". This might translate into "don't touch the lines with exactly the required number of characters in which the last character is not a space" (let's assume I don't use tabs).

(Please note that in general this file could be longer and the "section" and "subsections" patterns could be repeated several times).
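For illustration only, the "leave headers alone" rule can also be sketched with awk, which is a standard tool. This is a minimal sketch under stated assumptions, not a par replacement. The function name justify_alpha and its maxgap argument are hypothetical. Paragraph lines are assumed to start with an ASCII letter and to be longer than the width; every other line is passed through unchanged. Wrapping is greedy, so line breaks may differ from those produced by fmt or Vim.

```shell
# justify_alpha [WIDTH [MAXGAP]]: hypothetical helper, reads stdin, writes stdout.
justify_alpha() {
    awk -v width="${1:-66}" -v maxgap="${2:-3}" '
    # Print words w[1..n], whose single-spaced length is len, padded to width.
    function emit(w, n, len,    gaps, extra, base, rem, line, i, j, pad) {
        gaps  = n - 1
        extra = width - len
        base  = (gaps > 0) ? int(extra / gaps) : 0
        rem   = (gaps > 0) ? extra % gaps : 0
        # Fall back to single spaces when a gap would grow wider than maxgap.
        if (gaps < 1 || 1 + base + (rem > 0) > maxgap) { base = 0; rem = 0 }
        line = w[1]
        for (i = 2; i <= n; i++) {
            # The rightmost rem gaps each take one additional space.
            pad = 1 + base + ((i - 1 > gaps - rem) ? 1 : 0)
            for (j = 0; j < pad; j++) line = line " "
            line = line w[i]
        }
        print line
    }
    # Only lines that start with a letter and exceed the width are reflowed.
    /^[A-Za-z]/ && length($0) > width {
        n = 0; len = 0
        for (i = 1; i <= NF; i++) {
            if (n > 0 && len + 1 + length($i) > width) {
                emit(w, n, len); n = 0; len = 0
            }
            len += length($i) + (n > 0 ? 1 : 0)
            w[++n] = $i
        }
        emit(w, n, len)
        next
    }
    { print }   # everything else passes through untouched
    '
}
```

It would be called as justify_alpha 66 3 < in > out. Because short lines and lines that do not start with a letter are never touched, the "section" and "subsection" banners survive as they are.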

Many thanks to everybody and sorry for the excessive length.

EDIT 2

(thanks to glts) I have tested your suggestions, and both the interactive and the batch approach work well; the only drawback of the latter is that a minimal interaction with Vim is still required.

After googling a bit, I found some syntax examples to solve this last task as well.

vim -E -s in <<-EOF
:set textwidth=66
:g/^\a/normal! gqq
:runtime macros/justify.vim
:g/^\a/Justify 66 3
:update
:quit
EOF

or

vim -es -c 'set textwidth=66' -c 'g/^\a/normal! gqq' -c 'runtime macros/justify.vim' -c 'g/^\a/Justify 66 3' -c wq in

At this point, I consider my "problem" solved, but anybody willing to continue with the alternative par approach is welcome!

Thanks again to everybody, and thanks to glts also for the Vim "lesson".


Solution

You can do a lot of this in Vim.

For example, here's an interactive approach that will do what you ask for.

  1. Set 'textwidth' to 66 and format your lines into paragraphs with the gq operator.

    :set textwidth=66
    :g/^\a/normal! gqq
    
  2. Source macros/justify.vim and justify your paragraphs.

    :runtime macros/justify.vim
    :g/^\a/Justify 66 3
    

How well this works depends on how consistent your existing format is. I've identified paragraphs as lines starting with \a, i.e. alphabetic characters (see :h /\a).

To make this procedure part of a batch process, you could save these commands in a Vim script file called, for example, myformat.vim. You could then :source that script on each of the text files passed to Vim as command-line arguments.

$ ls
a.txt  b.txt  c.txt  myformat.vim
$ vim *.txt
:argdo source myformat.vim | update

This is one of those situations where the :argdo command shines.
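The same idea can probably be run without opening Vim interactively, by combining :argdo with the -es invocation shown in the question. The following is a minimal sketch, assuming the script contains exactly the four commands from the two steps above; the helper name format_all is hypothetical, and macros/justify.vim must be available in the Vim runtime.

```shell
# Write the four commands from the two steps above into a script file.
cat > myformat.vim <<'EOF'
set textwidth=66
g/^\a/normal! gqq
runtime macros/justify.vim
g/^\a/Justify 66 3
EOF

# Hypothetical helper: format every file given as an argument, in place.
# "update" saves each buffer so that :argdo can move on to the next file.
format_all() {
    vim -es -c 'argdo source myformat.vim | update' -c 'qall' "$@"
}
```

A call such as format_all a.txt b.txt c.txt would then rewrite the files in place, so it is worth trying it on copies first.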

Licensed under: CC-BY-SA with attribution