
What is load game name unicode sort order?

Posted: Fri May 29, 2020 11:26 am
by moon69
I'd like to replicate how Factorio sorts the savegames in the Load screen - any ideas?

This is how Windows 10 command line sorts some unicode test saves I made...
a LATIN SMALL LETTER A 0x0061.zip
z LATIN SMALL LETTER Z 0x007A.zip
ͱ GREEK SMALL LETTER HETA 0x0371.zip
ϻ GREEK SMALL LETTER SAN 0x03FB.zip
ω GREEK SMALL LETTER OMEGA 0x03C9.zip
а CYRILLIC SMALL LETTER A 0x0430.zip
ҁ CYRILLIC SMALL LETTER KOPPA 0x0481.zip
ԟ CYRILLIC SMALL LETTER ALEUT KA 0x051F.zip
ԭ CYRILLIC SMALL LETTER DCHE 0x052D.zip
This is how Factorio - Load game sorts them...
ͱ GREEK SMALL LETTER HETA 0x0371
ϻ GREEK SMALL LETTER SAN 0x03FB
ω GREEK SMALL LETTER OMEGA 0x03C9
а CYRILLIC SMALL LETTER A 0x0430
ҁ CYRILLIC SMALL LETTER KOPPA 0x0481
ԟ CYRILLIC SMALL LETTER ALEUT KA 0x051F
ԭ CYRILLIC SMALL LETTER DCHE 0x052D
a LATIN SMALL LETTER A 0x0061
z LATIN SMALL LETTER Z 0x007A
Side question for curiosity:
Cyrillic ԭ behaves a little differently compared to the other "small" cyrillic letters ҁ, ԟ, etc.
Is it something like uppercase or punctuation?
In Windows 10 Explorer, it sorts to the top...
ԭ CYRILLIC SMALL LETTER DCHE 0x052D.zip
a LATIN SMALL LETTER A 0x0061.zip
z LATIN SMALL LETTER Z 0x007A.zip
ͱ GREEK SMALL LETTER HETA 0x0371.zip
ϻ GREEK SMALL LETTER SAN 0x03FB.zip
ω GREEK SMALL LETTER OMEGA 0x03C9.zip
а CYRILLIC SMALL LETTER A 0x0430.zip
ҁ CYRILLIC SMALL LETTER KOPPA 0x0481.zip
ԟ CYRILLIC SMALL LETTER ALEUT KA 0x051F.zip

Re: What is load game name unicode sort order?

Posted: Fri May 29, 2020 6:22 pm
by Rseding91
It uses https://sourcefrog.net/projects/natsort/strnatcmp.c

Specifically:

Code:

bool PackageFilesystemInfo::operator<(const PackageFilesystemInfo& other) const
{
  return strnatcasecmp(this->name.c_str(), other.name.c_str()) < 0;
}
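
For anyone who wants to experiment with this kind of ordering, here is a minimal sketch of a natural, case-insensitive comparison. It is not the code from strnatcmp.c: `natural_case_compare` and `SaveEntry` are hypothetical names made up for illustration, and the handling of whitespace, fractional parts and non-ASCII bytes is deliberately left out, so it will not necessarily reproduce the Unicode order listed in the first post.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical, simplified stand-in for a natural case-insensitive compare.
// Returns <0, 0 or >0. Runs of digits compare by numeric value; every other
// byte compares after ASCII upper-casing.
int natural_case_compare(const std::string& a, const std::string& b)
{
  std::size_t i = 0, j = 0;
  while (i < a.size() && j < b.size())
  {
    unsigned char ca = static_cast<unsigned char>(a[i]);
    unsigned char cb = static_cast<unsigned char>(b[j]);
    if (std::isdigit(ca) && std::isdigit(cb))
    {
      // Find the end of each digit run.
      std::size_t ei = i, ej = j;
      while (ei < a.size() && std::isdigit(static_cast<unsigned char>(a[ei]))) ++ei;
      while (ej < b.size() && std::isdigit(static_cast<unsigned char>(b[ej]))) ++ej;
      // Drop leading zeros so "007" and "7" compare as the same number.
      while (i + 1 < ei && a[i] == '0') ++i;
      while (j + 1 < ej && b[j] == '0') ++j;
      // A longer run of significant digits is a larger number.
      std::size_t la = ei - i, lb = ej - j;
      if (la != lb) return la < lb ? -1 : 1;
      // Same length: the first differing digit decides.
      int c = a.compare(i, la, b, j, lb);
      if (c != 0) return c < 0 ? -1 : 1;
      i = ei;
      j = ej;
      continue;
    }
    int ua = std::toupper(ca), ub = std::toupper(cb);
    if (ua != ub) return ua < ub ? -1 : 1;
    ++i;
    ++j;
  }
  // The shorter string sorts first when one is a prefix of the other.
  if (i < a.size()) return 1;
  if (j < b.size()) return -1;
  return 0;
}

// Hypothetical record type, mirroring the shape of the operator< above.
struct SaveEntry
{
  std::string name;
  bool operator<(const SaveEntry& other) const
  {
    return natural_case_compare(name, other.name) < 0;
  }
};
```

With an `operator<` like this, `std::sort` over a `std::vector<SaveEntry>` would put "save9" before "save10" and treat "Save" and "save" as equal, which is the main visible difference from a plain byte-wise sort.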

Re: What is load game name unicode sort order?

Posted: Fri May 29, 2020 9:30 pm
by Hiladdar
This is caused by loading a different keyboard / display map. In summary, in a single-byte character set there is a 2 hex digit number which represents each character. Change the keyboard map / display font, and you can change the sort order, which is based on the hexadecimal numbers. I had to deal with this issue when working between ASCII and EBCDIC. For example, in EBCDIC the character "A" is mapped to hex value C1, while in ASCII it is mapped to hex value 41.
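
To make the "sort order follows the numeric values" point concrete, here is a small sketch that compares byte strings by value. The function names `less_unsigned` and `less_signed` are made up for this example. It also shows that the result can depend on whether the bytes are read as signed or unsigned, which matters for UTF-8 text because every non-ASCII byte is 0x80 or higher. Whether that has anything to do with the order Factorio shows is an assumption and is not confirmed anywhere in this thread.

```cpp
#include <cstddef>
#include <string>

// Byte-wise "less than", reading each byte as an unsigned value (0..255).
bool less_unsigned(const std::string& a, const std::string& b)
{
  for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
  {
    unsigned char ca = static_cast<unsigned char>(a[i]);
    unsigned char cb = static_cast<unsigned char>(b[i]);
    if (ca != cb) return ca < cb;
  }
  return a.size() < b.size();
}

// Same loop, but reading each byte as a signed value (-128..127),
// so bytes 0x80..0xFF become negative and sort before ASCII.
bool less_signed(const std::string& a, const std::string& b)
{
  for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
  {
    signed char ca = static_cast<signed char>(a[i]);
    signed char cb = static_cast<signed char>(b[i]);
    if (ca != cb) return ca < cb;
  }
  return a.size() < b.size();
}
```

For example, "a" is the single byte 0x61 and "ω" is the UTF-8 byte pair 0xCF 0x89, so `less_unsigned("a", "\xCF\x89")` is true while `less_signed("a", "\xCF\x89")` is false. The ASCII / EBCDIC case works the same way: with ASCII values "A" (0x41) sorts before "a" (0x61), while with EBCDIC values "A" (0xC1) sorts after "a" (0x81).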

This means that in part the answer to your question is buried in different OSI levels. Factorio clearly resides in the application level, while Win10 has its tentacles predominantly in transport, session, and presentation levels, which can be affected by how the Win10 box is localized, and local user sort order changes, if any.

Hiladdar

Re: What is load game name unicode sort order?

Posted: Sat May 30, 2020 3:36 pm
by moon69
Thanks both.