UTF-16 encodes Unicode code points above U+FFFF using surrogate pairs: two 16-bit code units that together take up 4 bytes.
You can specify a surrogate pair within a string literal by inserting the character directly into the string (provided that you have a keyboard that can insert the character):
string myString = "𠈓"; // CJK Ideograph
You can also represent the surrogate pair within a string literal using the \Unnnnnnnn (4 byte) syntax to specify the Unicode code point or the \unnnn\unnnn syntax to specify the encoded surrogate pair value.
string s1 = "\U00020213"; // Code point U+20213
string s2 = "\uD840\uDE13"; // Surrogate pair
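To see why \U00020213 and \uD840\uDE13 describe the same character, here is a minimal sketch of the UTF-16 arithmetic. The class SurrogateDemo and the method ToSurrogatePair are hypothetical names introduced for illustration; in real code the built-in char.ConvertFromUtf32 does this conversion for you.

```csharp
using System;

static class SurrogateDemo
{
    // Hypothetical helper: splits a code point in the range
    // U+10000..U+10FFFF into its UTF-16 high and low surrogates.
    public static (char High, char Low) ToSurrogatePair(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException(nameof(codePoint));

        int offset = codePoint - 0x10000;              // leaves a 20-bit value
        char high = (char)(0xD800 + (offset >> 10));   // upper 10 bits
        char low = (char)(0xDC00 + (offset & 0x3FF));  // lower 10 bits
        return (high, low);
    }

    public static void Main()
    {
        var (high, low) = ToSurrogatePair(0x20213);
        Console.WriteLine($"{(int)high:X4} {(int)low:X4}");          // D840 DE13

        // Both literal forms produce the same two-char string.
        Console.WriteLine("\U00020213" == "\uD840\uDE13");           // True

        // The framework method yields the same string as well.
        Console.WriteLine(char.ConvertFromUtf32(0x20213) == "\uD840\uDE13"); // True
    }
}
```

The manual calculation is only there to show where D840 and DE13 come from; prefer char.ConvertFromUtf32 when you just need the string.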
Note that because a surrogate pair requires more than 2 bytes, you cannot represent a surrogate pair within a single character (System.Char) literal.
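One practical consequence is that a string containing a surrogate pair has two System.Char elements for a single code point. The following sketch illustrates this, assuming a hypothetical class SurrogateInspection with a hypothetical helper CountCodePoints; the char and StringInfo members it calls are standard .NET APIs.

```csharp
using System;
using System.Globalization;

static class SurrogateInspection
{
    // Hypothetical helper: counts Unicode code points rather than
    // UTF-16 code units (System.Char values).
    public static int CountCodePoints(string text)
    {
        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            // A high surrogate followed by a low surrogate is one code point,
            // so skip the second half of the pair.
            if (char.IsSurrogatePair(text, i))
                i++;
            count++;
        }
        return count;
    }

    public static void Main()
    {
        string s = "\U00020213";

        Console.WriteLine(s.Length);                                // 2
        Console.WriteLine(char.IsHighSurrogate(s[0]));              // True
        Console.WriteLine(char.IsLowSurrogate(s[1]));               // True
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X")); // 20213
        Console.WriteLine(CountCodePoints(s));                      // 1
        Console.WriteLine(new StringInfo(s).LengthInTextElements);  // 1
    }
}
```

Indexing the string with s[0] or s[1] gives you one half of the pair, which is not a valid character on its own, so use char.ConvertToUtf32 when you need the actual code point.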