Q/A about Unicode

I've also discovered the 'Unicode project' thread on the dev list, but would
like to get some pointers about the current (5.0 <= Pharo < 6) state of
affairs with regard to writing multi-language-capable and Unicode-aware apps
in Pharo.

For the translation part of l10n, there's also a package providing gettext support if needed.
Locale support is limited to an API similar to *nix locales.

"Unicode-aware" is a wide topic, as reflected by the plethora of functional bits and bobs defined by different parts of the standard.
(With base image)
Can you represent any Unicode string? Yes.
Can you pass them to other systems in Unicode encodings? Yes.
Is the text renderer (in image) capable of displaying Unicode code points? Yes, if the glyph is included in the fonts.
(With Unicode project)
Can you query Unicode properties of any code point? Yes.
Can you normalize strings in the different forms? Yes.
Can you sort strings in Unicode collate order? Yes.
(With both)
Can you sort strings in CLDR-locale collate order? No.
Can you do regexp as per the Unicode spec? No.
Does the text renderer (in image) heed Unicode properties such as RTL and combining marks? No.
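
To make the terms in the list concrete, here is a minimal sketch of what "passing text in a Unicode encoding", "querying properties" and "normalizing" mean in practice. It is not Pharo code and says nothing about the Unicode project's actual selectors; it uses Python's standard unicodedata module purely as a neutral illustration, and the variable names are my own.

```python
import unicodedata

# Two spellings of the same user-perceived character:
# precomposed U+00E9, and "e" followed by combining acute accent U+0301.
composed = "\u00e9"
decomposed = "e\u0301"

# Passing text to another system in a Unicode encoding (here UTF-8).
utf8_bytes = composed.encode("utf-8")

# Querying properties of individual code points.
category = unicodedata.category("\u0301")          # general category of the accent
combining_class = unicodedata.combining("\u0301")  # canonical combining class
direction = unicodedata.bidirectional("\u05d0")    # bidi class of Hebrew alef

# Normalizing: NFC composes where possible, NFD decomposes.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# The two spellings differ as code point sequences,
# but compare equal once both are in the same normalization form.
same_after_nfc = nfc == unicodedata.normalize("NFC", composed)
```

The last comparison is the practical reason normalization matters: without it, two strings that look identical to a user can fail an equality test.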

Depending on what you want to do, the base image capabilities may be considered sufficient for writing multi-language-capable apps.
For more advanced Unicode functionality, the groundwork is there (i.e., querying properties, normalization, the core collation algorithm), but many practical uses are as yet unimplemented, with the complete lack of CLDR support ranking high among them.
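
As a rough illustration of why locale-specific (CLDR) collation matters, the sketch below sorts three words two ways. Again this is Python rather than Pharo, and base_letters_key is a hypothetical helper of my own: a crude, locale-blind approximation, not the Unicode Collation Algorithm and not anything the Unicode project provides.

```python
import unicodedata

# "\u00e4pple" is "äpple"; escapes keep the example independent of file encoding.
words = ["zebra", "\u00e4pple", "apple"]

# Plain code point order: "ä" (U+00E4) sorts after "z" (U+007A).
by_code_point = sorted(words)

def base_letters_key(word):
    """Hypothetical helper: compare base letters first, ignoring combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original word breaks ties so the result is deterministic.
    return (stripped, word)

by_base_letters = sorted(words, key=base_letters_key)
```

Here by_code_point puts "äpple" last, while by_base_letters puts it right after "apple". A German reader would expect the second order; a Swedish reader, for whom ä is a separate letter near the end of the alphabet, would expect it after "zebra". Choosing between the two per locale is exactly the tailoring data that CLDR supplies.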


The big eyesore in
