How is the 🇩🇪 character represented in Swift strings?
Question
Like some other emoji characters, the 0x0001F1E9 0x0001F1EA combination (German flag) is represented as a single character on screen although it is really two different Unicode character points combined. Is it represented as one or two different characters in Swift?
Solution
let flag = "\u{1f1e9}\u{1f1ea}"
then flag
is 🇩🇪 .
For more regional indicator symbols, see:
http://en.wikipedia.org/wiki/Regional_Indicator_Symbol
OTHER TIPS
Support for "extended grapheme clusters" has been added to Swift in the meantime. Iterating over the characters of a string produces a single character for the "flags":
let string = "Hi🇩🇪!"
for char in string.characters {
print(char)
}
Output:
H i 🇩🇪 !
Swift 3 implements Unicode in its String
struct. In Unicode, all flags are pairs of Regional Indicator Symbols. So, 🇩🇪
is actually 🇩
followed by 🇪
(try copying the two and pasting them next to eachother!).
When two or more Regional Indicator Symbols are placed next to eachother, they form an "Extended Grapheme Cluster", which means they're treated as one character. This is why "🇪🇺 = 🇫🇷🇪🇸🇩🇪...".characters
gives you ["🇪🇺", " ", "=", " ", "🇫🇷🇪🇸🇩🇪", ".", ".", "."]
.
If you want to see every single Unicode code point (AKA "scalar"), you can use .unicodeScalars
, so that "Hi🇩🇪!".unicodeScalars
gives you ["H", "i", "🇩", "🇪", "!"]
tl;dr
🇩🇪 is one character (in both Swift and Unicode), which is made up of two code points (AKA scalars). Don't forget these are different! 🙂
See Also
Swift doesn't tell you what the internal representation of a String
is. You interact with a String
as a list of full-size (32-bit) Unicode code points:
for character in "Dog!🐶" {
println(character)
}
// prints D, o, g, !, 🐶
If you want to work with a string as a sequence of UTF-8 or UTF-16 code points, use its utf8
or utf16
properties. See Strings and Characters in the docs.
With Swift 5, you can iterate over the unicodeScalars
property of a flag emoji character in order to print the Unicode scalar values that compose it:
let emoji: Character = "🇮🇹"
for scalar in emoji.unicodeScalars {
print(scalar)
}
/*
prints:
🇮
🇹
*/
If you combine those scalars (that are Regional Indicator Symbols), you get a flag emoji:
let italianFlag = "🇮" + "🇹"
print(italianFlag) // prints: 🇮🇹
print(italianFlag.count) // prints: 1
Each Unicode.Scalar
instance also has a property value
that you can use in order to display a numeric representation of it:
let emoji: Character = "🇮🇹"
for scalar in emoji.unicodeScalars {
print(scalar.value)
}
/*
prints:
127470
127481
*/
You can create Unicode scalars from those numeric representations then associate them into a string:
let scalar1 = Unicode.Scalar(127470)
let scalar2 = Unicode.Scalar(127481)
let italianFlag = String(scalar1!) + String(scalar2!)
print(italianFlag) // prints: 🇮🇹
print(italianFlag.count) // prints: 1
If needed, you can use Unicode.Scalar
's escaped(asASCII:)
method in order to display a string representation of the Unicode scalars (using ASCII characters):
let emoji: Character = "🇮🇹"
for scalar in emoji.unicodeScalars {
print(scalar.escaped(asASCII: true))
}
/*
prints:
\u{0001F1EE}
\u{0001F1F9}
*/
let italianFlag = "\u{0001F1EE}\u{0001F1F9}"
print(italianFlag) // prints: 🇮🇹
print(italianFlag.count) // prints: 1
String
's init(_:radix:uppercase:)
may also be relevant to convert the scalar value to an hexadecimal value:
let emoji: Character = "🇮🇹"
for scalar in emoji.unicodeScalars {
print(String(scalar.value, radix: 16, uppercase: true))
}
/*
prints:
1F1EE
1F1F9
*/
let italianFlag = "\u{1F1EE}\u{1F1F9}"
print(italianFlag) // prints: 🇮🇹
print(italianFlag.count) // prints: 1