Problems with Russian characters within strings (C#)

I am trying to create an array of strings that will contain Russian characters, this way:

			rawKeys = new string[] {
			"!",
			"А", 
			"Б", 
			"В", 
			"Г", 
			"Д", 
			"Е", 
			"Ё", 
			"Ж", 
			"З", 
			"И", 
			"Й", 
			"К", 
			"Л", 
			"М", 
			"Н", 
			"О", 
			"П", 
			"Р", 
			"С", 
			"Т", 
			"У", 
			"Ф", 
			"Х", 
			"Ц", 
			"Ч", 
			"Ш", 
			"Щ", 
			"Ъ", 
			"Ы", 
			"Ь",
			"Э", 
			"Ю", 
			"Я"
		};

But after executing this code, the only string that keeps its value is the first one, “!”. All the Cyrillic characters are replaced by “??” instead.

Any ideas?

OK, I finally managed to figure out the problem and solve it. It is clearly yet another bug in the Unity editor: it not only wants UTF-8 files, they MUST also start with the BOM, even though those bytes are optional according to the UTF-8 specification. To make things worse, the MonoDevelop environment distributed with Unity does NOT save UTF-8 files with the BOM, so I ended up adding it manually just to try, and it worked.

Just three steps on the OS X command line:

cp KeyboardRussian.cs aux
echo -ne '\xEF\xBB\xBF' > KeyboardRussian.cs
cat aux >> KeyboardRussian.cs

And it worked like a charm.

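On Windows the same commands will not work as written: cmd.exe's echo has no -ne option, and AUX is a reserved device name, so it cannot be used as a file name. Something like this in PowerShell should do the same job:

$path = (Resolve-Path KeyboardRussian.cs).Path
$text = [System.IO.File]::ReadAllText($path)
[System.IO.File]::WriteAllText($path, $text, (New-Object System.Text.UTF8Encoding $true))

But I haven’t tried it.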
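
The same fix can also be sketched in C# itself, which avoids shell differences between platforms. This is only a minimal sketch, assuming the file is already valid UTF-8 and is just missing the three BOM bytes; the class and method names (`BomTool`, `HasUtf8Bom`, `WithUtf8Bom`, `EnsureUtf8Bom`) are made up for illustration.

```csharp
using System;
using System.IO;

public static class BomTool
{
    // The three bytes that make up the UTF-8 byte order mark.
    private static readonly byte[] Utf8Bom = { 0xEF, 0xBB, 0xBF };

    // Returns true when the data already starts with the UTF-8 BOM.
    public static bool HasUtf8Bom(byte[] data)
    {
        return data.Length >= 3
            && data[0] == Utf8Bom[0]
            && data[1] == Utf8Bom[1]
            && data[2] == Utf8Bom[2];
    }

    // Returns the data with the BOM prepended, or unchanged if it is already there.
    public static byte[] WithUtf8Bom(byte[] data)
    {
        if (HasUtf8Bom(data)) return data;
        byte[] result = new byte[Utf8Bom.Length + data.Length];
        Buffer.BlockCopy(Utf8Bom, 0, result, 0, Utf8Bom.Length);
        Buffer.BlockCopy(data, 0, result, Utf8Bom.Length, data.Length);
        return result;
    }

    // Rewrites the file in place, but only when the BOM is missing.
    // Assumption: the file content is already encoded as UTF-8.
    public static void EnsureUtf8Bom(string path)
    {
        byte[] data = File.ReadAllBytes(path);
        if (!HasUtf8Bom(data)) File.WriteAllBytes(path, WithUtf8Bom(data));
    }
}
```

Calling `BomTool.EnsureUtf8Bom("KeyboardRussian.cs")` a second time leaves the file untouched, so it should be safe to run over a whole folder of scripts.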

For example, create an XML file and add your symbols to it:

 <RussianSymbols>
  <symbol>А</symbol>
  <symbol>Б</symbol>
  ... and etc
 </RussianSymbols>

Then in code, for example in the Awake() function, read the XML and build your “rawKeys” array from it. I hope this helps.
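
A rough sketch of that idea, assuming the XML has a root element containing `<symbol>` children as in the fragment above. The class name `RussianKeyLoader` and the method `LoadKeys` are hypothetical, and how the XML text reaches your script is left open.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class RussianKeyLoader
{
    // Parses the XML text and returns the content of every <symbol>
    // element directly under the root, in document order.
    public static string[] LoadKeys(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<string> keys = new List<string>();
        foreach (XmlNode node in doc.DocumentElement.SelectNodes("symbol"))
        {
            keys.Add(node.InnerText);
        }
        return keys.ToArray();
    }
}
```

In Unity this could be called from Awake(), for example as `rawKeys = RussianKeyLoader.LoadKeys(symbolsXml.text);`, where `symbolsXml` is a hypothetical TextAsset field assigned in the Inspector.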

Hi,

This is a simple example of encoding Cyrillic characters.

string cyrillicText = "Ж";
System.Text.UTF8Encoding encodingUnicode = new System.Text.UTF8Encoding();
byte[] cyrillicTextByte = encodingUnicode.GetBytes(cyrillicText);
Debug.Log(encodingUnicode.GetString(cyrillicTextByte));

What this does: I specify the text to encode and create a variable for the encoding type (in this case we need UTF-8). Then I store the text as a byte array, decode it back into a string, and print it in the console.
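
To see what the encoding step actually produces, it can help to print the bytes themselves rather than the decoded string. The sketch below is an illustration only: the helper `CyrillicBytes.ToUtf8Hex` is a made-up name, and the `\u0416` escape is used so that the example does not depend on how the source file itself is saved.

```csharp
using System;
using System.Text;

public static class CyrillicBytes
{
    // Returns the UTF-8 bytes of the text as hyphen-separated hex pairs.
    public static string ToUtf8Hex(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        return BitConverter.ToString(bytes);
    }
}
```

`CyrillicBytes.ToUtf8Hex("\u0416")` (the letter Ж) gives `D0-96`, while a literal question mark gives `3F`. If a string that should hold a Cyrillic letter prints as `3F`, the character was most likely lost before this code ran, when the source file was read.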

In your particular case (index 24 of the rawKeys array):

System.Text.UTF8Encoding encodingUnicode = new System.Text.UTF8Encoding();
byte[] cyrillicTextByte = encodingUnicode.GetBytes(rawKeys[24]);
Debug.Log(encodingUnicode.GetString(cyrillicTextByte));

Returns:

[Image: 38660-untitled.jpg]

code_warrior