Deleting multiline text from multiple files

https://stackoverflow.com/questions/174472

05-07-2019

Question

I have a bunch of Java files from which I want to remove the Javadoc lines containing the license (I am changing the license on my code).

The pattern I am looking for is

^\* \* ProjectName .* USA\.$

but it needs to match across multiple lines.

Is there a way sed [or a commonly used editor in Windows/Linux] can do a search/replace for a multiline pattern?

Solution

Here's the appropriate reference point in my favorite sed tutorial.
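
As a minimal sketch of one way sed can handle this case (not necessarily the technique the tutorial describes), a range address deletes every line from a start pattern through an end pattern. The sample file, the scratch directory, and the loosened patterns below are assumptions made for illustration; adjust them to the real license header.

```shell
# Work in a scratch directory so no real source file is touched.
tmp=$(mktemp -d)

# Hypothetical sample file with a license header.
cat > "$tmp/Sample.java" <<'EOF'
/*
 * ProjectName is free software; see the license text.
 * Foundation, Inc., Boston, MA 02111-1307 USA.
 */
package demo;

class Sample {}
EOF

# Delete every line from the one containing "* ProjectName "
# through the next line that ends in " USA." (both inclusive).
sed '/\* ProjectName /,/ USA\.$/d' "$tmp/Sample.java" > "$tmp/Sample.stripped"

cat "$tmp/Sample.stripped"
```

Only the lines inside the range are removed, so the surrounding `/*` and ` */` lines remain in the output. Writing to a separate file and renaming it afterwards avoids relying on in-place editing, which behaves differently between sed implementations.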

OTHER TIPS

Yes. Are you using sed, awk, perl, or something else to solve this problem?

Most regular expression tools allow you to specify multi-line patterns. Just be careful with regular expressions that are too greedy, or they'll match the code between comments if it exists.

Here's an example:

/\*(?:.|[\r\n])*?\*/
perl -0777ne 'print m!/\*(?:.|[\r\n])*?\*/!g;' <file>

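This prints all the comments run together. The (?: ... ) notation is needed to make the parentheses non-capturing; with a capturing group, m//g in list context would return the captured text instead of each whole match. The / does not have to be escaped because ! delimits the expression. -0777 enables slurp mode, so the whole file is read as one string, and -n wraps the code in an implicit read loop without printing each record automatically.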

(From: http://ostermiller.org/findcomment.html )

Someone is probably still looking for a solution like this from time to time, so here is one.

Use awk to find the lines to be removed. Then use diff to remove the lines and let sed clean up.

awk "/^\* \* ProjectName /,/ USA\.$/" input.txt \
  | diff - input.txt \
  | sed -n -e"s/^> //p" \
  >output.txt

A word of warning: if the first pattern exists in the file but the second does not, the awk range runs to the end of the file and you will lose all text below the first pattern, so check for that first.
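
One way to perform that check before running the pipeline is sketched below. The function name check_range and the sample file are hypothetical, and the start pattern is loosened slightly from the one in the question; the idea is simply to track whether a start line is still waiting for its end line when the file ends.

```shell
# Scratch file whose license block is never closed by a " USA." line.
tmp=$(mktemp -d)
printf '%s\n' '/*' ' * ProjectName is free software.' ' */' 'class Broken {}' \
  > "$tmp/input.txt"

# Exit status 0 only if every start line is followed by an end line.
# "inside" is 1 while a range is open and 0 otherwise.
check_range() {
  awk '/\* ProjectName /{inside=1} / USA\.$/{inside=0} END{exit (inside ? 1 : 0)}' "$1"
}

if check_range "$tmp/input.txt"; then
  status="safe"
else
  status="unterminated"
fi
echo "$status"
```

Running the awk/diff/sed pipeline only on files for which check_range succeeds avoids silently truncating a file that has a start line but no end line.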

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow