Combining mark vs combining character?

Mark Davis ☕ mark at macchiato.com
Wed Jan 5 19:11:58 CET 2011


Yes, it should say something like:

The Unicode string MUST NOT begin with a character having a General Category
property value of Mark (M).

Mark is defined to be the same as: Spacing_Mark OR Nonspacing_Mark OR
Enclosing_Mark. Note that because RFC 5892
(http://tools.ietf.org/html/rfc5892) already disallows Enclosing_Mark (Me)
characters, the above is equivalent to saying:

The Unicode string MUST NOT begin with a character having a General Category
property value equal to Nonspacing_Mark (Mn) or Spacing_Mark (Mc).
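
For implementers, the two formulations above can be sketched roughly as
follows. This is only an illustration using Python's standard unicodedata
module, not text from any RFC; the function names starts_with_mark and
starts_with_mn_or_mc are made up for this example, and the results depend
on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def starts_with_mark(label: str) -> bool:
    """True if the first code point has General Category M (Mn, Mc, or Me).

    Hypothetical helper illustrating the first formulation of the rule.
    """
    return bool(label) and unicodedata.category(label[0]).startswith("M")


def starts_with_mn_or_mc(label: str) -> bool:
    """True if the first code point is Nonspacing_Mark or Spacing_Mark.

    Hypothetical helper illustrating the second formulation, which assumes
    Enclosing_Mark (Me) characters are rejected elsewhere.
    """
    return bool(label) and unicodedata.category(label[0]) in ("Mn", "Mc")


# U+0301 COMBINING ACUTE ACCENT is Mn, so a label starting with it is rejected.
print(starts_with_mark("\u0301a"))   # leading mark
print(starts_with_mark("a\u0301"))   # mark after a base letter

# General Category M is not the same test as a non-zero combining class:
# U+0903 DEVANAGARI SIGN VISARGA is Mc but has combining class 0.
print(unicodedata.category("\u0903"), unicodedata.combining("\u0903"))
```

The last line shows why the test has to be on General Category rather than
on Combining Class, as Simon notes further down the thread.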


Mark

*— Il meglio è l’inimico del bene —*


On Wed, Jan 5, 2011 at 06:20, Simon Josefsson <simon at josefsson.org> wrote:

> Thank you for the clear answer!
>
> In a revision of the documents, it would help to say this explicitly, so
> there is a normative description.  Right now there is an informative
> reference to a section in Unicode that doesn't give enough detail.
>
> /Simon
>
> Vint Cerf <vint at google.com> writes:
>
> > yes, having general category M seems to encompass both "mark" and
> > "character" - at least for IDNA2008 purposes.
> >
> > v
> >
> >
> > On Wed, Jan 5, 2011 at 9:07 AM, Simon Josefsson <simon at josefsson.org> wrote:
> >
> >> Vint Cerf <vint at google.com> writes:
> >>
> >> > Simon,
> >> >
> >> > I am pretty sure that the terms "combining mark" and
> >> > "combining character" as used in IDNA2008 mean the same thing.
> >> >
> >> > neither are permitted as the initial character of a Unicode
> >> > domain label
> >>
> >> Thanks.  And the practical definition of a combining mark/character
> >> is that it has a General Category of M, as explained in section 3.6 of
> >> Unicode 5.0 quoted below?
> >>
> >> Note that this is different than having a non-0 Combining Class value.
> >>
> >> /Simon
> >>
> >> > vint
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, Jan 5, 2011 at 5:06 AM, Simon Josefsson <simon at josefsson.org> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> I need a clarification regarding this paragraph in section 4.2.3.2 of
> >> >> RFC 5891:
> >> >>
> >> >>   The Unicode string MUST NOT begin with a combining mark or
> >> >>   combining character (see The Unicode Standard, Section 2.11
> >> >>   [Unicode] for an exact definition).
> >> >>
> >> >> And this in section 5.4:
> >> >>
> >> >>   Putative U-labels with any of the following characteristics MUST be
> >> >>   rejected prior to DNS lookup:
> >> >> ...
> >> >>   o  Labels whose first character is a combining mark (see The
> >> >>      Unicode Standard, Section 2.11 [Unicode]).
> >> >>
> >> >> The reference to [Unicode] is not normative, which would be a problem
> >> >> for any implementer.
> >> >>
> >> >> Section 2.11 of Unicode 5.0 discusses "combining character" but
> >> >> not "combining mark".
> >> >>
> >> >> There is a section 7.9 in Unicode 5.0 called "Combining Marks".
> >> >>
> >> >> One section that discusses both Combining Marks and Combining
> >> >> Characters is section 3.11 on "Canonical Ordering Behaviour".
> >> >>
> >> >> Section 3.6 on "Combination" gives the precise definition of a
> >> >> "Combining character":
> >> >>
> >> >>   Combining character: A character with the General Category of
> >> >>   Combining Mark (M).
> >> >>
> >> >> Is this the intended definition of Combining character by RFC 5891?
> >> >>
> >> >> Questions:
> >> >>
> >> >> 1) Does RFC 5891 refer to "combining mark" and "combining character"
> >> >> as the same thing?
> >> >>
> >> >> 2) Is there a significant difference between the requirement in
> >> >> 4.2.3.2 and 5.4?  The latter section only mentions "combining mark"
> >> >> and not "combining character".
> >> >>
> >> >> 3) What is the precise definition of a "combining mark"?
> >> >>
> >> >> /Simon
> >> >> _______________________________________________
> >> >> Idna-update mailing list
> >> >> Idna-update at alvestrand.no
> >> >> http://www.alvestrand.no/mailman/listinfo/idna-update