Question

Following the instructions at "Add header (copyright) information to existing source files", I need to add copyright headers to a bunch of source files we're sending out of the building. (I know, I hate copyright headers too, but it's policy for when we release proprietary source files. Please consider "persuade someone to waive the policy" as unhelpful and as not answering the question.)

I have two copies of all the files (in dir and dir.orig) and, from within dir.orig, I'm using

find . -name \*.cs -exec sh -c "mv '{}' tmp && cp ../header.txt '../dir/{}' &&
  cat tmp >> '../dir/{}' && rm tmp" \;

This works, but the result has the header first and then the BOM from the original source file, whereas I'd prefer the BOM either to move to the start of the file or to be removed.

(Looking at this, I realise that moving the file to tmp is unnecessary, given I'm not overwriting the original, but I didn't bother removing that from the example from the other SO question.)

How can I remove (or move) the BOM so that I end up without it appearing immediately after the newly-added header?

Solution

I think I may have found my solution, thanks to being pointed to uconv by this answer from Steven R. Loomis on a related question.

If I use

find . -name \*.cs -exec sh -c "cp ../header.txt '../dir/{}' &&
  uconv --remove-signature -f UTF-8 -t UTF-8 '{}' >> '../dir/{}'" \;

then uconv reads the input (-f) and writes the output (-t) as UTF-8, and --remove-signature makes it strip the BOM if it finds one at the start of the file.
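If uconv isn't installed, the same effect can probably be had with standard tools. The following is only a rough sketch, not something from the linked answer: strip_bom and add_header are made-up helper names, and it assumes the files are UTF-8, so the BOM is exactly the three bytes EF BB BF at the very start of the file.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative; they are not part of uconv).
# Assumption: a BOM, if present, is the 3-byte UTF-8 sequence EF BB BF.

# strip_bom FILE: write FILE to stdout, dropping a leading UTF-8 BOM if present.
strip_bom() {
  # Hex-dump the first three bytes and squeeze out spaces and newlines.
  first3=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
  if [ "$first3" = "efbbbf" ]; then
    tail -c +4 "$1"   # start output at byte 4, just past the BOM
  else
    cat "$1"          # no BOM: pass the file through unchanged
  fi
}

# add_header HEADER SRC DEST: write HEADER, then the BOM-free SRC, to DEST.
add_header() {
  { cat "$1" && strip_bom "$2"; } > "$3"
}
```

From within dir.orig, and with those functions defined in the current shell, something like

find . -name \*.cs | while IFS= read -r f; do add_header ../header.txt "$f" "../dir/$f"; done

should then do the same job as the uconv command, as long as no file name contains a newline.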

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow