Can I make a regex match all characters up to ; except \;?

https://stackoverflow.com/questions/16555176

29-05-2022

Question

I am going to construct a message-passing system whose messages have the following structure:

message type;message content

(matches message type;)

However, the user can set the message type, and (for the sake of loosely coupled systems) I want to allow them to use a ; as part of the message type. To do this, I'll have the message constructor escape it with a \:

tl\;dr;Too long; didn't read content

(matches tl\;dr;)

How can I have a regex match all content up to the first ; that's not \;? In the example, that's the tl\;dr; part only. Note that there can be an unescaped ; within the message content.

I tried ^.*;, but that matches all content up to a semicolon within the message (e.g. tl\;dr;Too long;)

Solution

/.*?[^\\](?=;)/

You could also just use ; instead of (?=;), but the latter prevents it from being part of the full match.

If you only want to match from the start of the string, use:

/^.*?[^\\](?=;)/
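
As a quick check, here is a minimal Python sketch of the anchored pattern. Python is an assumption, since the question does not name a language; the first two sample strings come from the question. The last call is a hypothetical input showing one caveat: the pattern treats any ; preceded by a backslash as escaped, so it would overshoot if the constructor also escaped backslashes themselves.

```python
import re

# Anchored pattern from above: lazily consume characters until a
# non-backslash character is directly followed by ';'.
PATTERN = re.compile(r'^.*?[^\\](?=;)')

plain = PATTERN.search(r'message type;message content').group(0)
escaped = PATTERN.search(r"tl\;dr;Too long; didn't read content").group(0)

print(plain)    # message type
print(escaped)  # tl\;dr

# Caveat (hypothetical input): if backslashes were escaped as \\, the type
# 'a\' would be sent as 'a\\'. The real separator is then preceded by a
# backslash, so the pattern runs on to the next ';'.
overshoot = PATTERN.search(r'a\\;b;c').group(0)
print(overshoot)  # a\\;b
```

Note that the pattern needs at least one character before the separator, so an empty message type does not match.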

Other tips

Not sure which language you are looking for, but here is a Python version of the regex:

^(\\.|[^;])*(?=;)

In practice:

In [28]: re.search(r'^(\\.|[^;])*(?=;)', r'message type;message content').group(0)
Out[28]: 'message type'

In [37]: re.search(r'^(\\.|[^;])*(?=;)', r"tl\;dr;Too long; didn't read content").group(0)
Out[37]: 'tl\\;dr'
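
Building on the pattern above, the following is a rough sketch of how the match could be turned into a (type, content) pair. The helper name split_message is hypothetical, and the unescaping step is an assumption about how the receiver wants to see the type. The sketch also uses a slight variant of the pattern: the backslash is excluded from the character class so the engine cannot backtrack into the middle of a \; pair when no unescaped ; exists.

```python
import re

# Variant of the pattern above: an escape pair (backslash + any character)
# or any character that is neither ';' nor a backslash.
TYPE_RE = re.compile(r'^(?:\\.|[^;\\])*(?=;)')

def split_message(raw):
    """Return (message_type, content). Hypothetical helper for illustration."""
    match = TYPE_RE.search(raw)
    if match is None:
        raise ValueError('no unescaped ";" separator found')
    # Undo the escaping: a backslash followed by x becomes x, so '\;' -> ';'.
    message_type = re.sub(r'\\(.)', r'\1', match.group(0))
    content = raw[match.end() + 1:]  # skip the separator itself
    return message_type, content

print(split_message(r"tl\;dr;Too long; didn't read content"))
# ('tl;dr', "Too long; didn't read content")
```

Only the first unescaped ; is treated as the separator, so semicolons in the content are left alone.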

/^([^;\\]|\\.)*?;/

Depending on your implementation you might need to escape the \ once or twice. For instance in PHP I'd have to use:

/^([^;\\\]|\\\.)*?;/

That is: match every character that is not a backslash or a ;, or, if you encounter a backslash, also consume the character right after it regardless of what it is, until the next character would be ;.

If you want to match all parts, this would be what I'd use:

/([^;\\\]|\\\.)*?(?=;|$)/
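
For comparison, a rough Python take on the "match all parts" idea, written with regex-level escaping only (a raw string, so no extra layer of backslashes). Using re.findall with a + quantifier is my substitution rather than the answer's lazy quantifier plus lookahead; it avoids empty matches, which also means empty fields are skipped.

```python
import re

# A field is a run of: any character other than ';' or a backslash,
# or a backslash followed by any character (an escape pair).
FIELD_RE = re.compile(r'(?:[^;\\]|\\.)+')

fields = FIELD_RE.findall(r"tl\;dr;Too long; didn't read content")
print(fields)
# ['tl\\;dr', 'Too long', " didn't read content"]
```

Unlike the single-separator patterns, this splits on every unescaped ;, including the one inside the message content.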

License: CC-BY-SA with attribution

Not affiliated with StackOverflow