[Almost OT] Re: Hangul jamo issues - are jamo sequences legitimate?

Soobok Lee lsb at lsb.org
Thu Jan 11 02:31:49 CET 2007


On Wed, Jan 10, 2007 at 04:38:34PM -0800, Mark Davis wrote:
> A UI is certainly free to remap characters beyond what is done by
> StringPrep, if it is faced with odd input like non-syllabic Hangul.

If by "UI" you mean the OS's IME, I support your suggestion; it is
similar to what I proposed at the beginning of this thread.

If you mean the applications' UI, there may be difficulties, since
IDNAbis is used not only in web browsers but also in MUAs and
other applications.

Moreover, Hangul fillers are displayed as blank spaces in some
OS UI + application combinations, including Windows and MacOS
(maybe excluding Linux). Hangul fillers are produced by the
u+31xx -> u+11xx mapping.
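
A minimal sketch of that mapping, assuming it refers to the compatibility
decompositions applied by NFKC (the normalization stringprep uses). It
relies only on Python's standard unicodedata module; the helper name
nfkc_codepoints is made up for illustration.

```python
import unicodedata


def nfkc_codepoints(text):
    """Return the NFKC form of text as a list of 'U+XXXX' strings."""
    normalized = unicodedata.normalize("NFKC", text)
    return ["U+%04X" % ord(ch) for ch in normalized]


# U+3164 HANGUL FILLER (compatibility jamo block) becomes
# U+1160 HANGUL JUNGSEONG FILLER.
print(nfkc_codepoints("\u3164"))        # ['U+1160']

# A lone U+3131 HANGUL LETTER KIYEOK becomes U+1100 HANGUL CHOSEONG KIYEOK.
print(nfkc_codepoints("\u3131"))        # ['U+1100']

# U+3131 followed by U+314F (HANGUL LETTER A) decomposes to U+1100 U+1161,
# which NFKC then composes into the precomposed syllable U+AC00.
print(nfkc_codepoints("\u3131\u314f"))  # ['U+AC00']
```

The first case is the one that matters for display: the output is a filler
code point, which is what the renderers discussed here treat differently.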

Interestingly, the MacOS + Safari combination does not display
u+1160 (hangul jungseong filler) as a blank space, but rather
ignores it. The MacOS + Firefox combination displays u+1160 as
a blank space. (NIDA people found this.)
This illustrates how deep-rooted the problem is.

If Microsoft/Apple/FSF upgrade their Hangul IME and rendering
behavior for fillers in the future, as suggested on this list
and in the UTC, a significant portion of the jamo problems
discussed here will be solved cleanly outside the stringprep
framework.

Until that happens, the only way to input jamo characters
is via u+31xx.

Soobok

> 
> Mark
> 
> On 1/10/07, Soobok Lee <lsb at lsb.org> wrote:
> >
> >On Wed, Jan 10, 2007 at 04:21:28PM -0800, Mark Davis wrote:
> >> I don't think we are anticipating allowing any non-NFKC characters in the
> >> output IDNs. I also tend to agree with Michel and others that input should
> >> also be much more restricted, probably also to only NFKC characters, so that
> >> the main mappings done from input to output are case mapping and deletion.
> >
> >I understand the reason why you think NFKC-normalized strings are safe.
> >But, as I noted below, U+31xx is the only gate for jamo input under KSC5601
> >and later. If stringprep200x neither allows them nor maps them to u+11xx
> >in a context-sensitive way according to Kent Karlsson's suggestion,
> >we lose the only input method for jamo characters ...
> >

