Question

Suppose my text file contains the following lines:

Apple foo foobar
Banana foo foobar1 abc b c
Orange barfoo
Pear foo

How do I group the strings that come after Apple, Banana, Orange, and Pear?

I could do this for Apple, but it wouldn't work for the rest of the lines in the file.

sed 's/\([^ ]*\) \([^ ]*\) \([^ ]*\)/\2 \3/'

I want the output to look like this:

foo foobar
foo foobar1 abc b c
barfoo
foo

Is there a general way to print everything that comes after the first whitespace on each line?

Solution

sed -r 's/^[^ ]+[ ]+//' in.txt

(GNU sed; on OSX, use -E instead of -r. Recent GNU sed versions accept -E as well.)
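To try the command on the sample data from the question, here is a minimal sketch. The function name `strip_first_word` is made up for illustration, and `-E` is used on the assumption that the installed sed accepts it (recent GNU sed and BSD/macOS sed both do).

```shell
# Hypothetical helper: drop the first space-delimited word and the spaces after it.
strip_first_word() {
  sed -E 's/^[^ ]+[ ]+//'
}

# Feed the sample lines from the question through the helper.
printf '%s\n' 'Apple foo foobar' 'Banana foo foobar1 abc b c' 'Orange barfoo' 'Pear foo' |
  strip_first_word
```

This should print the four lines of desired output shown in the question.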


Update:

As @Jotne points out, the initial ^ is not strictly needed in this case - though it makes the intent clearer; similarly, you can drop the [] around the second space char.

The above only deals with spaces separating the columns (potentially multiple ones, thanks to the final + in the regex), whereas the OP more generally mentions whitespace.

Generalized whitespace version:

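Note: In the forms below, \s and [:space:] match all kinds of whitespace, including newlines. If you want to restrict matching to spaces and tabs, use [ \t] or [:blank:]. The named classes are only valid inside a bracket expression, i.e. written as [[:space:]] or [[:blank:]].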

sed -r 's/^\S+\s+//' in.txt

(GNU sed; this form will not work on OSX, even with -E.)

POSIX-compliant version (e.g., for AIX - thanks, @NeronLeVelu):

sed 's/^[^[:space:]]\{1,\}[[:space:]]\{1,\}//' in.txt
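As a quick check that the POSIX form handles tabs and runs of spaces, here is a small sketch. The function name `strip_first_field` and the sample input are assumptions added for illustration.

```shell
# Hypothetical helper wrapping the POSIX-compliant command above.
strip_first_field() {
  sed 's/^[^[:space:]]\{1,\}[[:space:]]\{1,\}//'
}

# The first line separates the columns with a tab, the second with a run of spaces.
printf 'Apple\tfoo foobar\nPear    foo\n' | strip_first_field
```

Both lines should lose their first column together with all the whitespace that follows it.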

Other tips

Any reason it has to be sed?

$ cat <<EOF | cut -d ' ' -f 2-
Apple foo foobar
Banana foo foobar1 abc b c
Orange barfoo
Pear foo
EOF

foo foobar
foo foobar1 abc b c
barfoo
foo
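One thing to keep in mind with cut is that every single space counts as a delimiter, so a run of spaces produces empty fields. The sketch below illustrates this; the helper names are hypothetical, and squeezing with tr is just one possible workaround, which also collapses runs of spaces later in the line.

```shell
# Hypothetical helpers for comparison; neither name comes from the answer.
cut_rest() {
  cut -d ' ' -f 2-
}

squeeze_then_cut() {
  # tr -s ' ' collapses each run of spaces into a single space first.
  tr -s ' ' | cut -d ' ' -f 2-
}

printf 'Apple  foo\n' | cut_rest          # two spaces in the input: output keeps one leading space
printf 'Apple  foo\n' | squeeze_then_cut  # prints "foo"
```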

GNU grep works too (the -P option enables Perl-compatible regular expressions):

grep -oP '(?<=\s).*'

I'm not sure about sed, but in any regex engine that supports the multiline modifier you can remove the unwanted part of each line by replacing every match of this pattern with an empty string:

/^\w+\s/gm
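Since sed already works line by line, the pattern above can be approximated there without any multiline flag. This is a sketch under assumptions: `[[:alnum:]_]` is used as a portable stand-in for \w and `[[:space:]]` for \s, and the function name is hypothetical.

```shell
# Hypothetical helper: [[:alnum:]_] stands in for \w, [[:space:]] for \s.
strip_leading_word() {
  sed 's/^[[:alnum:]_]\{1,\}[[:space:]]//'
}

printf '%s\n' 'Orange barfoo' 'Pear foo' | strip_leading_word
```

Like the original pattern, this removes only a single whitespace character after the first word, so extra separators would remain at the start of the line.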

This can also be solved with awk (x is an unset variable, so it acts as the empty string):

awk '{$1="";sub(/^ /,x)}1' file
foo foobar
foo foobar1 abc b c
barfoo
foo

or with this:

awk '{sub(/[^ ]+ /,x)}1' file
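
The two awk variants differ when the rest of the line contains runs of spaces: assigning to $1 makes awk rebuild the record with single-space separators, while the plain sub() leaves the remainder untouched. A small sketch of the difference follows; the wrapper names and the sample line are made up for illustration.

```shell
# Hypothetical wrappers around the two awk commands, with "" written out
# in place of the unset variable x.
awk_clear_field() {
  awk '{$1=""; sub(/^ /, "")}1'
}

awk_sub_only() {
  awk '{sub(/[^ ]+ /, "")}1'
}

printf 'Banana foo   bar\n' | awk_clear_field  # prints "foo bar"
printf 'Banana foo   bar\n' | awk_sub_only     # prints "foo   bar"
```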
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow