Scintilla: How do you find the byte position given a specific character position

https://stackoverflow.com/questions/4492171

11-10-2019

Question

Given a specific character index on a line, e.g. 10th character on line 3, is there an easy way to calculate Scintilla's 'position' of that character?

It's straightforward when using ASCII characters, but I can't see an easy way to do it with multi-byte UTF-8 characters, where a single character may take up several byte positions.

Solution

Convert the line text to UTF-8 and then count the bytes that precede the character you want. Cache the conversion if multiple requests may be made for the same line.
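A minimal sketch of that idea in Python, not tied to any particular Scintilla binding. The names `char_byte_offsets` and `byte_position` are made up for illustration, and `line_start` is assumed to be the document position of the start of the line (for example, as returned by SCI_POSITIONFROMLINE). The per-line offset table is cached so that repeated lookups on the same line do not re-encode the text.

```python
from functools import lru_cache


@lru_cache(maxsize=256)
def char_byte_offsets(line_text: str) -> tuple[int, ...]:
    """Byte offset of every character in the UTF-8 encoding of line_text.

    The result has len(line_text) + 1 entries; the final entry is the
    total byte length of the line.
    """
    offsets = [0]
    for ch in line_text:
        # Each character contributes 1 to 4 bytes in UTF-8.
        offsets.append(offsets[-1] + len(ch.encode("utf-8")))
    return tuple(offsets)


def byte_position(line_start: int, line_text: str, char_index: int) -> int:
    """Document byte position of the char_index-th character (0-based)."""
    return line_start + char_byte_offsets(line_text)[char_index]
```

Because the cache is keyed on the line text, an edited line simply produces a new entry; no explicit invalidation is needed in this sketch.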

OTHER TIPS

Start at the beginning of the string and advance by however many bytes the character at the current position occupies, so that you land on the next character, while keeping a count of how many characters you have seen so far. This linear-time indexing is one of the drawbacks of UTF-8. Scintilla may already have a facility to do this.
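The walk described above can be sketched as follows, working directly on the raw UTF-8 bytes of a line. This is an illustration under assumptions rather than Scintilla code: the function names are hypothetical, and the input is assumed to be well-formed UTF-8. The length of each character is read from its lead byte.

```python
def utf8_char_length(lead_byte: int) -> int:
    """Number of bytes in the UTF-8 sequence that starts with lead_byte."""
    if lead_byte < 0x80:
        return 1                      # 0xxxxxxx: ASCII
    if lead_byte >> 5 == 0b110:
        return 2                      # 110xxxxx
    if lead_byte >> 4 == 0b1110:
        return 3                      # 1110xxxx
    if lead_byte >> 3 == 0b11110:
        return 4                      # 11110xxx
    raise ValueError("not a UTF-8 lead byte")


def byte_offset_of_char(data: bytes, char_index: int) -> int:
    """Byte offset of the char_index-th character (0-based) in data."""
    offset = 0
    seen = 0
    while seen < char_index:
        if offset >= len(data):
            raise IndexError("character index past end of line")
        # Skip over the whole character at the current offset.
        offset += utf8_char_length(data[offset])
        seen += 1
    return offset
```

Adding the result to the position of the start of the line gives the document position. The cost is proportional to the character index, which is the linear-time behaviour mentioned above.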

Have you tried SCI_FINDCOLUMN?
SCI_FINDCOLUMN(int line, int column)
This message returns the position of a column on a line taking the width of tabs into account. It treats a multi-byte character as a single column. Column numbers, like lines, start at 0.

Licensed under: CC-BY-SA with attribution
