Question

I have a function that I call many times across various files, with a signature like:

Translate("English Message", "Spanish Message", "French Message")

I want to pull out the English, Spanish and French messages and write them to a CSV file, so that people who actually know these languages can tell me what I SHOULD have put in there.

The problem I am running into is that some French and Spanish messages don't show up because of the accented characters and single quotes.

This is a vb.net program.

Edit

There was no problem with the language; my issue was actually the regular expression and my complete lack of understanding of regular expressions.

Solution

It depends on the regex library you are using. Unicode-aware regex implementations have no such problems with accented characters, but more details would help: which language you are using, which regex library, and so on.

OTHER TIPS

If your language's regex implementation has a DOTALL flag (RegexOptions.Singleline in .NET), you might want to set it, so that . also matches newlines.
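
As a rough illustration of what the flag changes, here is a small sketch in Python rather than VB.NET. The sample text and variable names are made up for the example, and it assumes the missed messages are ones that wrap onto a second line.

```python
import re

# Hypothetical source text: the Spanish message wraps onto a second line.
source = 'Translate("Hello", "Hola,\n¿qué tal?", "Bonjour")'

pattern = r'"(.*?)"'

# Without DOTALL, "." stops at the newline, so the wrapped message is missed.
without_flag = re.findall(pattern, source)

# With DOTALL, "." also matches newlines and the message is captured whole.
with_flag = re.findall(pattern, source, flags=re.DOTALL)
```

Note that the accented characters play no part here; only the newline decides whether the message is found.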

Alternatively, change the regex to capture a negated character class instead, like so:

([^your_delimiter]*?)

where your_delimiter is the character (or characters) immediately following the string that you want to capture.
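
For the Translate(...) calls in the question, the delimiter would be the closing double quote. A minimal sketch of that idea follows, again in Python rather than VB.NET. The sample line, the pattern and the column names are assumptions for illustration. It also does not handle double quotes embedded in a message, which VB.NET writes as "".

```python
import csv
import io
import re

# Hypothetical source line; accents and apostrophes are ordinary characters
# to a negated class, because it only excludes the double-quote delimiter.
source = 'Translate("Don\'t save", "No guardar", "N\'enregistrez pas l\'été")'

# Three quoted arguments, each captured as "anything except a double quote".
call = re.compile(r'Translate\(\s*"([^"]*)"\s*,\s*"([^"]*)"\s*,\s*"([^"]*)"\s*\)')

# One (english, spanish, french) tuple per Translate(...) call found.
rows = call.findall(source)

# Write the captured messages as CSV (to an in-memory buffer here).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["English", "Spanish", "French"])
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Because [^"] also matches newlines, this form does not need the DOTALL flag at all.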

See the Wikipedia section on Unicode in regular expressions for further discussion:

http://en.wikipedia.org/wiki/Regular_expression#Unicode

Licensed under: CC-BY-SA with attribution