How do I construct a WideString with a diacritic in a non-Unicode Delphi version?
21-12-2020
Question
I am trying to construct a (test) WideString of:
á
but using its decomposed form:
LATIN SMALL LETTER A (U+0061) COMBINING ACUTE ACCENT (U+0301)
So I have the code fragment:
var
  test: WideString;
begin
  test := #$0061#$0301;
  MessageBoxW(0, PWideChar(test), 'Character with diacritic', MB_ICONINFORMATION or MB_OK);
end;
Except it doesn't appear to work:
This could be a bug in MessageBox, but I'm going to go ahead and say that it's more likely the bug is in my code.
Some other variations I have tried:
test := WideString(#$0061#$0301);
const
SmallLetterLatinAWithAcuteDecomposed: WideString = #$0061#$0301;
test := SmallLetterLatinAWithAcuteDecomposed
test := #$0061+#$0301; (Doesn't compile; incompatible types)
test := WideString(#$0061)+WideString(#$0301); (Doesn't compile; crashes compiler)
test := 'a'+WideString(#$0301); (Doesn't compile; crashes compiler)
//Arnauld's thought:
test := #$0301#$0061;
Bonus chatter
Solution
Best answer:
const
n: WideString = ''; //n=Nothing
s := n+#$0061+#$0301;
This fixes all the cases I have below that otherwise fail.
The only variant that works is to declare it as a constant:
AccentAcute: WideString = #$0301;
AccentAcute: WideString = WideChar($0301);
AccentAcute: WideString = WideChar(#$0301);
AccentAcute: WideString = WideString(#$0301);
Sample Usage:
s := 'Pasta'+AccentAcute;
Constant-based syntaxes that do not work
AccentAcute: WideString = $0301;
incompatible types
AccentAcute: WideString = #0301;
gives
AccentAcute: WideString = WideString($0301);
invalid typecast
AccentAcute: WideString = WideString(#$0301);
invalid typecast
AccentAcute: WideChar = WideChar(#0301);
gives Pastai
AccentAcute: WideChar = WideChar($0301);
gives Pasta´
Other syntaxes that fail
'Pasta'+WideChar($0301)
gives Pasta´
'Pasta'+#$0301
gives Pasta´
WideString('Pasta')+#$0301
gives
Summary of all the constant-based syntaxes I could think up:
AccentAcute: WideString = #$0301; //works
AccentAcute: WideString = WideChar(#$0301); //works
AccentAcute: WideString = WideString(#$0301); //works
AccentAcute: WideString = $0301; //incompatible types
AccentAcute: WideString = WideChar($0301); //works
AccentAcute: WideString = WideString($0301); //invalid typecast
AccentAcute: WideChar = #$0301; //fails, gives Pasta´
AccentAcute: WideChar = WideChar(#$0301); //fails, gives Pasta´
AccentAcute: WideChar = WideString(#$0301); //incompatible types
AccentAcute: WideChar = $0301; //incompatible types
AccentAcute: WideChar = WideChar($0301); //fails, gives Pasta´
AccentAcute: WideChar = WideString($0301); //invalid typecast
Rearranging WideChar can work, as long as you only append to a variable:
//Works
t := '0123401234012340123';
t := t+WideChar(#$D840);
t := t+WideChar(#$DC00);
//fails
t := '0123401234012340123'+WideChar(#$D840);
t := t+WideChar(#$DC00);
//fails
t := '0123401234012340123'+WideChar(#$D840)+WideChar(#$DC00);
//works
t := '0123401234012340123';
t := t+WideChar(#$D840)+WideChar(#$DC00);
//works
t := '';
t := t+WideChar(#$D840)+WideChar(#$DC00);
//fails; gives junk
t := ''+WideChar(#$D840)+WideChar(#$DC00);
//crashes compiler
t := WideString('')+WideChar(#$D840)+WideChar(#$DC00);
//doesn't compile
t := WideChar(#$D840)+WideChar(#$DC00);
Definitely running up against compiler nonsense: cases that weren't tested fully. Yes, I know, David, we should upgrade.
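For context, the two code units used in these tests, $D840 and $DC00, are the UTF-16 surrogate pair for U+20000. Below is a minimal sketch of how such a pair could be computed at run time by appending to a variable, the one pattern that worked above. The function name SurrogatePair is hypothetical, and this has not been verified against the Delphi 5/7 compilers.

```pascal
// Hypothetical helper: builds the UTF-16 surrogate pair for a code point
// in the range $10000..$10FFFF. Each code unit is appended to a variable
// in its own statement, so no WideChar literal is involved.
function SurrogatePair(CodePoint: Cardinal): WideString;
var
  v: Cardinal;
begin
  v := CodePoint - $10000;                            // 20-bit offset above the BMP
  Result := '';
  Result := Result + WideChar($D800 + (v shr 10));    // high surrogate: top 10 bits
  Result := Result + WideChar($DC00 + (v and $3FF));  // low surrogate: bottom 10 bits
end;
```

The caller is assumed to pass a code point above $FFFF; the sketch does no range checking.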
OTHER TIPS
This works in Delphi 5/7:
var
  test: WideString;
begin
  test := WideChar($0061);
  test := test + WideChar($0301);
  MessageBoxW(0, PWideChar(test), 'Character with diacritic', MB_ICONINFORMATION or MB_OK);
end;
In short:
- In Delphi 5 and Delphi 7, concatenating WideChars to a WideString does not appear to work using #$xxxx form literals. # doesn't seem to work as you'd expect for Unicode literals.
- You can't just add two or more WideChars in a single expression, like this:
test := WideChar(a)+WideChar(b); // won't compile in D5/D7.
Did you try #$0301#$0061 (i.e. diacritic first)?
OK.
So #$.... only handles 8-bit ASCII constants in this version.
You can work around this at the memory level:
type
  TWordArray = array[1..MaxInt div SizeOf(word)-2] of word;
  // start at [1], just as WideStrings
  // or: TWordArray = array[0..MaxInt div SizeOf(word)-1] of word;
  PWordArray = ^TWordArray;
var
  test: WideString;
begin
  test := '12'; // or SetLength(test,2);
  PWordArray(test)[1] := $61;
  PWordArray(test)[2] := $301;
  MessageBoxW(0, pointer(test), 'Character with diacratic', MB_ICONINFORMATION or MB_OK);
end;
This will always work since you don't play with chars/widechars and such.
And it will also work as expected with Unicode versions of Delphi.
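The same idea could be wrapped in a small function so that the code units are written at run time rather than through character literals. This is only a sketch: the name WideFromCodeUnits is hypothetical, and it has not been checked against every Delphi 5/7 compiler quirk described above.

```pascal
// Hypothetical helper: builds a WideString from raw UTF-16 code units.
// The values are plain Words, so the compiler never has to interpret
// a #$xxxx or WideChar(...) constant.
function WideFromCodeUnits(const Units: array of Word): WideString;
var
  i: Integer;
begin
  SetLength(Result, Length(Units));
  for i := 0 to High(Units) do
    Result[i + 1] := WideChar(Units[i]);  // open arrays are 0-based, WideStrings 1-based
end;
```

With this, test := WideFromCodeUnits([$0061, $0301]); would be expected to hold the decomposed á (U+0061 followed by U+0301), ready to pass to MessageBoxW.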