Question

The tv.txt file is as follows:

mms://live21.gztv.com/gztv_gz 广州台[可于Totem/VLC/MPlayer播放,记得把高宽比设置成4:3]
mms://live21.gztv.com/gztv_news 广州新闻台·直播广州(可于Totem/VLC/MPlayer播放,记得把高宽比设置成4:3)
mms://live21.gztv.com/gztv_kids 广州少儿台(可于Totem/VLC/MPlayer播放,记得把高宽比设置成4:3)
mms://live21.gztv.com/gztv_econ 广州经济台

I want to split each line into three groups.

sed -r 's/([^ ]*)\s([^][()]*)((\(.+\))*|(\[.+\])*)/\3/'  tv.txt 

I got the result:

[可于Totem/VLC/MPlayer播放,记得把高宽比设置成4:3]    
(可于Totem/VLC/MPlayer播放,记得把高宽比设置成4:3)    
(可于Totem/VLC/MPlayer播放,记得把高宽比设置成4:3)   

When I change it to

sed -r 's/([^ ]*)\s([^[]()]*)((\(.+\))*|(\[.+\])*)/\3/'  tv.txt

it doesn't work.

The only difference is [^][()] versus [^[]()]. Escaping the brackets, as in [^\[\]()], doesn't make it work either.

I want to know the reason.


Solution

The POSIX rules for getting ] into a character class are a little arcane, but they make sense when you think about it hard.

For a positive (non-negated) character class, the ] must be the first character:

[]and]

This recognizes any character a, n, d or ] as part of the character class.

For a negated character class, the ] must be the first character after the ^:

[^]and]

This recognizes any character except a, n, d or ] as part of the character class.
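As a quick check of the two classes above, here is a minimal sketch. It assumes a sed that accepts -E for extended regexes (GNU sed treats it like -r); the input string a]b and the variable names pos and neg are made up for illustration.

```shell
# Positive class: replace every character that IS a, n, d or ] with X.
pos=$(printf 'a]b\n' | sed -E 's/[]and]/X/g')

# Negated class: replace every character that is NOT a, n, d or ] with X.
neg=$(printf 'a]b\n' | sed -E 's/[^]and]/X/g')

printf '%s\n' "$pos" "$neg"
# prints:
# XXb
# a]X
```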

In any other position, the first ] after the [ marks the end of the character class. Inside a character class, most of the normal regex special characters lose their special meaning (including the backslash, so \] does not escape the bracket), and others (notably the - minus sign) acquire special meanings. (If you want a - in a character class, it has to be 'first' or last, where 'first' means 'after the optional ^ and only if ] is not present'.)
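The placement rule for - can be tried out the same way. This is only a sketch under the same assumption (a sed that accepts -E); the sample string a-b]c and the variable names are invented for the example.

```shell
# Hyphen first: it is a literal member, so the class is { -, a }.
first=$(printf 'a-b]c\n' | sed -E 's/[-a]/X/g')

# ] first and hyphen last: the class is { ], a, - }.
last=$(printf 'a-b]c\n' | sed -E 's/[]a-]/X/g')

# Hyphen in the middle: it forms the range a through c.
range=$(printf 'a-b]c\n' | sed -E 's/[a-c]/X/g')

printf '%s\n' "$first" "$last" "$range"
# prints:
# XXb]c
# XXbXc
# X-X]X
```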

In your examples:

  • [^][()]: this is a negated character class that recognizes any character except [, ], ( or ).
  • [^[]()]: this is a negated character class that recognizes any character except [, followed by whatever () symbolizes in the regex family you're using, and then a ] which represents itself. It is a different pattern, so the sed command no longer matches as intended.
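
The difference can be made visible with a small experiment. The sketch below deliberately leaves the parentheses out of the classes, because an empty () group is not something every regex implementation accepts; the helper name mark and the input a]b are hypothetical, and a sed that accepts -E is assumed.

```shell
# mark PATTERN TEXT: wrap the first match of the ERE PATTERN in <>.
# (Hypothetical helper; PATTERN must not contain a slash.)
mark() { printf '%s\n' "$2" | sed -E "s/$1/<&>/"; }

# ] right after ^ is a member: one character that is neither ] nor [.
mark '[^][]' 'a]b'      # prints <a>]b

# [ right after ^, so the first ] closes the class; the second ] is literal.
# Matches a non-[ character followed by ].
mark '[^[]]' 'a]b'      # prints <a]>b

# Backslashes are ordinary inside the class: it excludes \ and [,
# closes at the first ], and again needs a literal ] afterwards.
mark '[^\[\]]' 'a]b'    # prints <a]>b
```

Only the first form treats ] as a member of the class; the other two consume two characters per match, which is why the substitution in the question stops behaving as expected.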
Licensed under: CC-BY-SA with attribution