This encoder is used to handle input from HH conversion, which needs to
end up as UTF-8 in the database. Switch the open-coded routine from
Database.py to this common routine so all encodings now take place in
the same file.
This change is needed to skip a nasty behaviour: if the string triggered
a decoding error, it will trigger one *AGAIN* if the string is printed
to console. By writing directly to sys.stderr we skip the
locale/conversion issues and get the troublesome string directly in a
file where it is stored as a raw sequence of octets.
This encoder is used to handle input from HH conversion, which needs to
end up as UTF-8 in the database. Switch the open-coded routine from
Database.py to this common routine so all encodings now take place in
the same file.
It seems that running encoder.encode() on a latin1/latin9 string results
in, yes a bloody UnicodeDecodeError. Decode error on .encode()...
Really. This way the modification from non-unicode string to real
unicode appears to work better.
The strings (names) as stored in database should always be UTF-8;
whatever the display locale is, we then need to convert from the storage
encoding to session encoding. When making database queries with players
names in them, the names must be reconverted to UTF-8.
It seems that running encoder.encode() on a latin1/latin9 string results
in, yes a bloody UnicodeDecodeError. Decode error on .encode()...
Really. This way the modification from non-unicode string to real
unicode appears to work better.
The strings (names) as stored in database should always be UTF-8;
whatever the display locale is, we then need to convert from the storage
encoding to session encoding. When making database queries with players
names in them, the names must be reconverted to UTF-8.
If the system (display) locale is UTF-8, there is no need to encode to
either direction. In fact, running the .encode() routine appears to
mangle a valid UTF-8 string to a worse condition, effectively breaking
it.