How is the 🇩🇪 character represented in Swift strings?

https://stackoverflow.com//questions/24004195

20-12-2019
|

Question

Like some other emoji characters, the 0x0001F1E9 0x0001F1EA combination (German flag) is represented as a single character on screen although it is really two different Unicode character points combined. Is it represented as one or two different characters in Swift?

Solution

let flag = "\u{1f1e9}\u{1f1ea}"

then flag is 🇩🇪 .

For more regional indicator symbols, see:
http://en.wikipedia.org/wiki/Regional_Indicator_Symbol

OTHER TIPS

Support for "extended grapheme clusters" has been added to Swift in the meantime. Iterating over the characters of a string produces a single character for the "flags":

let string = "Hi🇩🇪!"
for char in string.characters {
    print(char)
}

Output:

H
i
🇩🇪
!

Swift 3 implements Unicode in its String struct. In Unicode, all flags are pairs of Regional Indicator Symbols. So, 🇩🇪 is actually 🇩 followed by 🇪 (try copying the two and pasting them next to eachother!).

When two or more Regional Indicator Symbols are placed next to eachother, they form an "Extended Grapheme Cluster", which means they're treated as one character. This is why "🇪🇺 = 🇫🇷🇪🇸🇩🇪...".characters gives you ["🇪🇺", " ", "=", " ", "🇫🇷🇪🇸🇩🇪", ".", ".", "."].

If you want to see every single Unicode code point (AKA "scalar"), you can use .unicodeScalars, so that "Hi🇩🇪!".unicodeScalars gives you ["H", "i", "🇩", "🇪", "!"]

tl;dr

🇩🇪 is one character (in both Swift and Unicode), which is made up of two code points (AKA scalars). Don't forget these are different! 🙂

See Also

Swift doesn't tell you what the internal representation of a String is. You interact with a String as a list of full-size (32-bit) Unicode code points:

for character in "Dog!🐶" {
    println(character)
}
// prints D, o, g, !, 🐶

If you want to work with a string as a sequence of UTF-8 or UTF-16 code points, use its utf8 or utf16 properties. See Strings and Characters in the docs.

With Swift 5, you can iterate over the unicodeScalars property of a flag emoji character in order to print the Unicode scalar values that compose it:

let emoji: Character = "🇮🇹"
for scalar in emoji.unicodeScalars {
    print(scalar)
}
/*
 prints:
 🇮
 🇹
 */

If you combine those scalars (that are Regional Indicator Symbols), you get a flag emoji:

let italianFlag = "🇮" + "🇹"
print(italianFlag) // prints: 🇮🇹
print(italianFlag.count) // prints: 1

Each Unicode.Scalar instance also has a property value that you can use in order to display a numeric representation of it:

let emoji: Character = "🇮🇹"
for scalar in emoji.unicodeScalars {
    print(scalar.value)
}
/*
 prints:
 127470
 127481
 */

You can create Unicode scalars from those numeric representations then associate them into a string:

let scalar1 = Unicode.Scalar(127470)
let scalar2 = Unicode.Scalar(127481)
let italianFlag = String(scalar1!) + String(scalar2!)
print(italianFlag) // prints: 🇮🇹
print(italianFlag.count) // prints: 1

If needed, you can use Unicode.Scalar's escaped(asASCII:) method in order to display a string representation of the Unicode scalars (using ASCII characters):

let emoji: Character = "🇮🇹"
for scalar in emoji.unicodeScalars {
    print(scalar.escaped(asASCII: true))
}
/*
 prints:
 \u{0001F1EE}
 \u{0001F1F9}
 */

let italianFlag = "\u{0001F1EE}\u{0001F1F9}"
print(italianFlag) // prints: 🇮🇹
print(italianFlag.count) // prints: 1

String's init(_:radix:uppercase:) may also be relevant to convert the scalar value to an hexadecimal value:

let emoji: Character = "🇮🇹"
for scalar in emoji.unicodeScalars {
    print(String(scalar.value, radix: 16, uppercase: true))
}
/*
 prints:
 1F1EE
 1F1F9
 */

let italianFlag = "\u{1F1EE}\u{1F1F9}"
print(italianFlag) // prints: 🇮🇹
print(italianFlag.count) // prints: 1

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow