Replacing a specific pattern of characters with a particular character in a Unix shell script [closed]

StackOverflow https://stackoverflow.com/questions/23564519

  •  18-07-2023

Question

For each line, we first need to check how many characters come before the hyphen. If there are 2 or 3 characters before the hyphen, the line should remain as it is; if the number of characters before the hyphen (if any) is 1 or more than 3, we need to put a space after the hyphen.

input

SB-743921- 11C

SBDF-559448-AAA

SBI-742457-A

S-SANJAY PFF

GH222016/Love

output

SB-743921- 11C

SBDF- 559448-AAA

SBI-742457-A

S- SANJAY PFF

GH222016/Love

I am trying to do it with the tr command, like this:

cat input.txt|tr "...?-" " "

but it replaces every - with a space.


Solution

Try this:

awk -F- -v OFS="-" '{for(i=NF-1;i>=1;i--){l=length($i);if(l<2||l>3)$(i+1)=" "$(i+1)}}7' file

The line above applies your rule to every -, not only the first one. For example:

kent$  cat f
SB-743921- 11C
SBDF-559448-AAA
SBI-742457-A
S-SANJAY PFF
GH222016/Love

kent$  awk -F- -v OFS="-" '{for(i=NF-1;i>=1;i--){l=length($i);if(l<2||l>3)$(i+1)=" "$(i+1)}}7' f
SB-743921-  11C
SBDF- 559448- AAA
SBI-742457- A
S- SANJAY PFF
GH222016/Love
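
For readers who find the one-liner hard to follow, here is the same logic unpacked into a commented script. This is only a sketch: the function name pad_every_hyphen is made up for illustration, and the trailing 7 (an always-true pattern that triggers the default print) is replaced by an explicit print.

```shell
# pad_every_hyphen is a hypothetical helper name, not part of the original answer.
pad_every_hyphen() {
    awk -F- -v OFS="-" '
    {
        # Go right to left so that each field is measured
        # before a space is prepended to it.
        for (i = NF - 1; i >= 1; i--) {
            len = length($i)
            # Fewer than 2 or more than 3 characters before this hyphen:
            # prepend a space to the field that follows it.
            if (len < 2 || len > 3)
                $(i + 1) = " " $(i + 1)
        }
        print
    }' "$@"
}

pad_every_hyphen <<'EOF'
SBDF-559448-AAA
S-SANJAY PFF
GH222016/Love
EOF
# prints:
# SBDF- 559448- AAA
# S- SANJAY PFF
# GH222016/Love
```

The loop direction matters: going left to right would measure a field after a space had already been added to it.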

If you only want to check the column before the first -, it is much easier.

To apply the rule to the first - only:

 awk -F- -v OFS="-" 'NF>1{l=length($1);if(l<2||l>3)$2=" "$2}7' file
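
As a quick check, the first-hyphen-only variant can be run against the sample input from the question. The sketch below wraps it in a function (pad_first_hyphen is a made-up name) and feeds the data through a here-document instead of a file; it also uses an explicit print in place of the trailing 7.

```shell
# pad_first_hyphen is a hypothetical helper name used only for this example.
pad_first_hyphen() {
    awk -F- -v OFS="-" '
    NF > 1 {                      # only lines that contain at least one hyphen
        len = length($1)          # characters before the first hyphen
        if (len < 2 || len > 3)
            $2 = " " $2           # put a space right after the first hyphen
    }
    { print }' "$@"
}

pad_first_hyphen <<'EOF'
SB-743921- 11C
SBDF-559448-AAA
SBI-742457-A
S-SANJAY PFF
GH222016/Love
EOF
# prints:
# SB-743921- 11C
# SBDF- 559448-AAA
# SBI-742457-A
# S- SANJAY PFF
# GH222016/Love
```

This matches the output requested in the question, where only the first hyphen of each line is considered.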

Other tips

tr transliterates one character to another; it works on sets of single characters and does not match patterns. You may need to reach for a tool with a more robust regex engine:

perl -pe 's/-/- /g; s/- (\w\w\w?)\b/-$1/g;' <input.txt

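This puts a space after every hyphen, then removes it again wherever the hyphen is followed by a word of exactly 2 or 3 characters. Note that it tests the characters after each hyphen rather than before it, so it does not reproduce the requested output for every sample line (SBI-742457-A, for example, becomes SBI- 742457- A).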

sed might be easiest in this case:

sed -E 's/^([^-]|[^-]{4,})-/\1- /' input.txt

The overall effect is that a space is inserted after the first - on lines that do not have either exactly 2 or 3 characters before the first -.

  • sed uses regular expressions to match input lines; -E (alias in GNU sed: -r) makes sed support extended regular expressions (instead of the default basic ones), which is generally advisable, since extended regexes behave much more like regexes in other programming languages. Note, however, that the -E option was for a long time not part of the POSIX specification for sed, so some platforms may not support it.
  • s/<to replace>/<with what>/ is the sed text-substitution (text-replacement) command.
  • The initial ^ ensures that matching starts at the beginning of each line.
  • [^-] means: any character except a -.
  • [^-]|[^-]{4,} means: match either a single character other than - or (|) four or more characters other than a -
  • The (...), a so-called capture group, causes the string that matches the enclosed expression to be saved (captured) for later use.
  • The replacement string references the captured string as \1 (a so-called backreference, here referring to the 1st (and only) capture group). Using \1-  as the replacement string effectively puts a space after the first -.
  • Any non-matching lines are passed through unmodified (this is sed's default behavior - it prints all input lines, whether modified or not).
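
To see the points above in action, here is a small sketch that runs the command on sample lines from the question. It also shows one possible equivalent using only basic regular expressions, for platforms where -E is unavailable. The function names pad_ere and pad_bre are made up for this example, and the BRE variant is an assumption about how the pattern could be split, not part of the original answer.

```shell
# pad_ere and pad_bre are hypothetical helper names used only for this example.

# Extended-regex form, as in the answer above.
pad_ere() {
    sed -E 's/^([^-]|[^-]{4,})-/\1- /' "$@"
}

# Basic-regex form: BREs have no portable alternation (|),
# so the two cases become two separate substitutions.
pad_bre() {
    sed -e 's/^\([^-]\{4,\}\)-/\1- /' \
        -e 's/^\([^-]\)-/\1- /' "$@"
}

for f in pad_ere pad_bre; do
    "$f" <<'EOF'
SB-743921- 11C
SBDF-559448-AAA
S-SANJAY PFF
EOF
done
# each function prints:
# SB-743921- 11C
# SBDF- 559448-AAA
# S- SANJAY PFF
```

In the BRE form at most one of the two substitutions can match a given line, since a line cannot have both exactly one and at least four characters before its first -.
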
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow