Question

I dumped a SQLite3 table (from an Anki deck) to a CSV file. I found that the sfld column is separated by ^_.

What is this character or escape character in Unicode?

Strange separator as seen using Vim


Solution

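The ^_ that Vim shows is a single character: Control-_ (control-underscore), code 0x1F, also written U+001F. It is the Unit Separator (US), one of the control characters defined by ASCII and carried over unchanged into ISO 8859-x and Unicode.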
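If the goal is to break such a column back into its parts, a minimal sketch in Python could look like the following. The sample string and the helper name `split_fields` are made up for illustration; only the separator value 0x1F comes from the answer above.

```python
from typing import List

# The Unit Separator is code point 0x1F; Python can write it as "\x1f".
UNIT_SEPARATOR = "\x1f"


def split_fields(raw: str) -> List[str]:
    """Split one dumped column value into its parts on U+001F."""
    return raw.split(UNIT_SEPARATOR)


# Hypothetical sample value, not taken from a real deck.
sample = "front text\x1fback text\x1fextra"
fields = split_fields(sample)
print(fields)
```

Splitting on the literal character avoids any dependence on how an editor chooses to display it (Vim's ^_ is only a rendering of the one byte 0x1F).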

The upper-case letters in ASCII, ISO 8859-x and Unicode have code points (all numbers in hex):

41 U+0041 LATIN CAPITAL LETTER A
…
5A U+005A LATIN CAPITAL LETTER Z

The subsequent characters are:

5B U+005B LEFT SQUARE BRACKET
5C U+005C REVERSE SOLIDUS
5D U+005D RIGHT SQUARE BRACKET
5E U+005E CIRCUMFLEX ACCENT
5F U+005F LOW LINE

Each control character has a code 0x40 less than the corresponding character in the range above (Control-A is 0x41 - 0x40 = 0x01), so you have:

01 U+0001 START OF HEADING (aka SOH or Control-A)
…
1A U+001A SUBSTITUTE       (aka SUB or Control-Z)

and then you get:

1B U+001B ESCAPE           (aka ESC or Control-[)
1C U+001C FILE SEPARATOR   (aka FS  or Control-\)
1D U+001D GROUP SEPARATOR  (aka GS  or Control-])
1E U+001E RECORD SEPARATOR (aka RS  or Control-^)
1F U+001F UNIT SEPARATOR   (aka US  or Control-_)
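
The 0x40 offset in the tables above can be checked with a short Python sketch. The function names `control_code` and `caret_notation` are hypothetical helpers written for this illustration, not part of any library.

```python
def control_code(ch: str) -> int:
    """Return the code of Control-<ch> for ch in '@'..'_' (0x40..0x5F)."""
    code = ord(ch.upper())
    if not 0x40 <= code <= 0x5F:
        raise ValueError(f"no control character for {ch!r}")
    return code - 0x40


def caret_notation(code: int) -> str:
    """Return the caret form (as shown by Vim) of a control code 0x00..0x1F."""
    if not 0x00 <= code <= 0x1F:
        raise ValueError(f"0x{code:02X} is not in the range 0x00..0x1F")
    return "^" + chr(code + 0x40)


# Walk a few of the characters listed in the tables above.
for ch in "AZ[\\]^_":
    code = control_code(ch)
    print(f"Control-{ch} = 0x{code:02X}, shown as {caret_notation(code)}")
```

Running the loop should reproduce the code column of the tables, ending with Control-_ at 0x1F.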
Licensed under: CC-BY-SA with attribution