UTF-8 Korean character support

cry1004
Posts: 3
Joined: Sun Jan 14, 2024 8:33 am

Post by cry1004 »

Hello. I downloaded and used Mudlet for the first time today.

Although UTF-8 Korean is supported, many Korean characters are broken or not visible on the screen.

Hangul in UTF-8 uses 3 bytes per character. The Unicode code point ranges (in hexadecimal) are:

U+1100–U+1112 Hangul alphabet (initial consonants)
U+1161–U+1175 Hangul alphabet (vowels)
U+11A8–U+11C2 Hangul alphabet (final consonants, including consonant clusters)
U+AC00–U+D7A3 Hangul syllables

These are the ranges used by Korean text in UTF-8.
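To make the ranges above easy to check, here is a minimal Python sketch (not Mudlet code) that classifies a character against them and reports how many bytes it takes in UTF-8. The names HANGUL_RANGES, classify_hangul and utf8_length are made up for this example.

```python
# Ranges quoted in the post, as hexadecimal Unicode code points.
# The labels are illustrative names chosen for this sketch.
HANGUL_RANGES = [
    (0x1100, 0x1112, "initial consonant"),
    (0x1161, 0x1175, "vowel"),
    (0x11A8, 0x11C2, "final consonant"),
    (0xAC00, 0xD7A3, "syllable"),
]


def classify_hangul(ch):
    """Return the label of the listed range containing ch, or None."""
    cp = ord(ch)
    for first, last, label in HANGUL_RANGES:
        if first <= cp <= last:
            return label
    return None


def utf8_length(ch):
    """Number of bytes ch occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


for ch in "한글":
    print(f"U+{ord(ch):04X}", classify_hangul(ch), utf8_length(ch), "bytes")
```

Every code point from U+0800 to U+FFFF encodes to 3 bytes in UTF-8, which is why all four ranges fall in the 3-byte group.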

Image

SlySven
Posts: 1034
Joined: Mon Mar 04, 2013 3:40 pm
Location: Deepest Wiltshire, UK
Discord: SlySven#2703

Re: UTF-8 Korean character support

Post by SlySven »

The � characters indicate that the font being used (and possibly none of the fonts that Mudlet can access) does not contain the visual symbol that Mudlet is trying to show. However, I am not sure whether this is because the text data has been corrupted or because Mudlet is broken in some way. It would be useful if you could capture a sample of that text by making a "recording" of it: click on the 📼 icon (the "video tape"-like one in the bottom button bar) before the text comes in and again afterwards to end the recording, then upload the xxxx.dat file that is produced. I can then replay it and see what Mudlet is seeing. If you can also copy the text and correct it to what it should be (or just a section that is wrong, if the whole thing is too much) and save that as a text file that you also upload, I will try to work out what is going wrong.
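To illustrate the "corrupted data" possibility only: below is a small Python sketch, not anything taken from Mudlet itself, showing that a generic UTF-8 decoder emits � (U+FFFD) when a 3-byte Hangul sequence arrives incomplete. The helper name decode_lossy is invented for this example, and whether Mudlet's own decoding behaves the same way is an assumption I have not checked.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER, shown as a question-mark diamond


def decode_lossy(data):
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return data.decode("utf-8", errors="replace")


good = "한글".encode("utf-8")  # two syllables, 3 bytes each
truncated = good[:-1]          # the second syllable loses its last byte

print(decode_lossy(good))       # both syllables decode cleanly
print(decode_lossy(truncated))  # first syllable, then a replacement character
```

If the recording shows complete 3-byte sequences for the characters that display as �, then damaged input is less likely and the font or the rendering is the better suspect.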

I've just tried displaying a passage of text (the "Lord's Prayer", on the basis that it is likely to be well translated) that I converted to (South?) Korean with Google Translate. I fed it through the trigger engine with feedTriggers(...) (in white on blue) and wrote it directly to the main Mudlet console with cecho(...) (in white on red), and there are differences in how the text is handled.
Screenshot_20240409_012153.png
One thing I wondered: is this form of Korean written left to right, as Western "Latinate" languages are, or right to left?

If you understand how the text is represented in Unicode code points, you might also find the "Text analyser" useful. Select a line of text in the main Mudlet console, right click on it to show the context menu and then hover over the "Analyse text" menu item. A display will then pop up showing the code points that Mudlet thinks make up that text.
Screenshot_20240409_012953.png
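If you want something to compare the analyser output against, the following Python sketch lists the code point, UTF-8 bytes and Unicode name for each character of a string you know to be correct. It is independent of Mudlet, and describe_code_points is a name made up for this example.

```python
import unicodedata


def describe_code_points(text):
    """Return one (code point, UTF-8 bytes in hex, Unicode name) tuple per character."""
    rows = []
    for ch in text:
        code_point = f"U+{ord(ch):04X}"
        utf8_hex = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((code_point, utf8_hex, name))
    return rows


for row in describe_code_points("한글"):
    print(*row)
```

If the code points from the "Analyse text" popup differ from this list for the same line, the text was altered before display; if they match, the problem is more likely in the font or the drawing.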
