UTF-8

Nicolas Williams Nicolas.Williams at oracle.com
Fri Jun 18 22:17:42 CEST 2010


On Fri, Jun 18, 2010 at 09:00:16PM +0200, Patrik Fältström wrote:
> On 18 jun 2010, at 19.37, Nicolas Williams wrote:
> 
> > U-labels and raw UTF-8 equivalent
> 
> Please explain the differences between these two.

U-labels are the output of ToUnicode() encoded in UTF-8, and consist of
normalized, case-folded Unicode.  "Raw UTF-8" would be un-pre-processed
Unicode (encoded in UTF-8), possibly raw user input, meaning: not
necessarily normalized, nor case-folded.
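
To illustrate the distinction, here is a rough Python sketch.  It is not
an implementation of ToUnicode(); the helper approx_u_label() is a
hypothetical name, and it only approximates the "normalized,
case-folded" property using the standard unicodedata module (NFKC plus
str.casefold() are assumptions chosen for illustration).  It shows that
the same name can have different bytes as raw UTF-8 and as the
normalized, case-folded form.

```python
import unicodedata


def approx_u_label(raw: str) -> bytes:
    """Approximate the normalized, case-folded form of a label.

    Hypothetical helper: a stand-in for the normalize + case-fold
    property only, NOT a full ToUnicode() implementation.
    """
    folded = raw.casefold()                          # case-fold
    composed = unicodedata.normalize("NFKC", folded) # normalize
    return composed.encode("utf-8")


# Possible raw user input: upper-case, with combining diaereses
# (U+0308) instead of precomposed characters.
raw_input = "FA\u0308LTSTRO\u0308M"

raw_utf8 = raw_input.encode("utf-8")       # "raw UTF-8": bytes as typed
u_label_like = approx_u_label(raw_input)   # normalized, case-folded bytes

print(raw_utf8)
print(u_label_like)
print(raw_utf8 == u_label_like)
```

The two byte strings differ even though a user would read them as the
same name, which is why the two forms cannot be treated as
interchangeable.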

