In awk (at least in gawk), the default field separator FS splits on whitespace (spaces and tabs), which is reasonable. The output field separator OFS, however, defaults to a single space. I would expect it to be a tab, since in my experience tab is the more common column separator in UNIX text files. What is the rationale behind making it a space?

Solution

Text containing tabs may look different from one text editor to another, because many editors let you choose how a tab is rendered (4 spaces, 8 spaces, etc.). Text separated by spaces looks the same everywhere.

Also, some indentation-sensitive programming languages recommend spaces over tabs. From your point of view, that recommendation may not seem reasonable either.

If you prefer to have a tab as the default OFS, you can create an alias, say: alias myawk="awk -v OFS='\t'"
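One thing to keep in mind when overriding OFS this way: it only affects output that awk actually assembles from fields. The sketch below uses made-up sample input and variable names purely for illustration.

```shell
# print with commas joins its arguments with OFS
with_commas=$(printf 'a b c\n' | awk -v OFS='\t' '{ print $1, $2 }')

# a bare print emits the record as read, so OFS is not applied
untouched=$(printf 'a b c\n' | awk -v OFS='\t' '{ print }')

# assigning to any field makes awk rebuild $0 using OFS
rebuilt=$(printf 'a b c\n' | awk -v OFS='\t' '{ $1 = $1; print }')

printf '%s\n' "$with_commas" "$untouched" "$rebuilt"
```

So if an aliased awk still seems to print spaces, the `$1 = $1` idiom is usually what is missing.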

Other tips

The awk programming language is probably older than your intuition of any present-day de facto Unix standard.

Having said that, the default makes perfect sense, for roughly the same reasons you often see cited when people argue against using tabs for indentation in source files.

Building on @Kent's answer, here are my aliases for handling CSV and TSV files, setting both the input separator (-F) and the output separator (OFS):

# alias to use awk on tsv-files
alias awkt='awk -F"\t" -v OFS="\t"'
# alias to use awk on csv-files (simple split on commas; quoted fields are not handled)
alias awkc='awk -F"," -v OFS=","'

Actually, the default value of FS is " " (a single space), so it makes sense for OFS to have the same value. awk treats FS = " " as a special case: it skips leading and trailing whitespace and treats any run of spaces, tabs, or newlines as a single separator. Nevertheless, the default values of FS and OFS are identical, " ".
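A quick way to see this for yourself, as a minimal sketch with invented sample input (the variable names are only for illustration):

```shell
# Input has leading/trailing blanks and mixed runs of spaces and tabs.
# With the default FS (" "), awk still sees exactly three fields.
out_default=$(printf '  a \t b   c  \n' | awk '{ print NF, $1, $3 }')
echo "$out_default"   # fields are joined with the default OFS, a single space

# Changing OFS affects only how the output is joined, not how input is split.
out_tab=$(printf '  a \t b   c  \n' | awk -v OFS='\t' '{ print NF, $1, $3 }')
echo "$out_tab"
```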

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow