OT RE: language question re: IDNs Conclusion
debbie at ictmarketing.co.uk
Thu Oct 8 21:05:20 CEST 2009
Below is a snippet from an email that I sent prior to receiving yours...
Although I have now stated on list that we should use "Homograph(ic)
bundling" in relation to domain labels, this is only because this term has
been so widely used in recent years in this context.
In actual fact, a homograph, in my mind and in this context, is where two
web pages are created that are visually identical - one the original, the
other designed for phishing. I still believe that "Homomorph(ic) bundling"
is the true term as per the enquiry (same form labels) but alas I am not
charged with creating the terminology for the web/internet so must go with
the flow! :-)
If you wish to create a group to discuss/create terminology you can count me
in... As a member of ISO TC37 for some 8 years I do have a little bit of
> -----Original Message-----
> From: idna-update-bounces at alvestrand.no
> [mailto:idna-update-bounces at alvestrand.no] On Behalf Of John C Klensin
> Sent: 08 October 2009 16:48
> To: debbie at ictmarketing.co.uk; 'Cary Karp'; idna-update at alvestrand.no
> Subject: Re: OT RE: language question re: IDNs Conclusion
> A few small observations about this, many of them based on a
> few painful years trying to sort out multilingual thesauri
> and dictionaries for use in treaty-enabled regulatory
> environments (a somewhat higher standard than anything the
> IDNAWG or IETF are involved with).
> --On Thursday, October 08, 2009 11:15 +0100 Debbie Garside
> <debbie at ictmarketing.co.uk> wrote:
> > Hi Cary
> > Going slightly off track to the original enquiry, I am quite
> > interested in the terminology currently being used for
> > similarities. I have read quite a bit about this during the course of
> > the past couple of years and I think you are right there is a need for
> > a term to describe
> > "same/similar glyph". Despite the fact that the Wikipedia
> > article has no clear references to the word or its etymology, I think
> > there is considerable merit in using Homoglyph to describe two or more
> > glyphs or combination characters that are visually the same or so
> > similar to the human eye as to cause confusion.
> This is, itself, fuzzy because of differences in perception,
> expectations, and (with some scripts more than others)
> variations in type and calligraphic styles. Subjectively,
> "confusables" fully captures the problem but does so because
> (a few tech reports notwithstanding) it fully captures the
> subjectiveness of it all.
> As this particular discussion evolves, I think it is also
> likely that we will want to distinguish between "same
> character, with same derivation, in two different scripts"
> (e.g., Latin, Greek, and Cyrillic Capital "A") from "in the
> context of the right experiment, someone might confuse these
> two characters" (e.g., Latin Capital U in an
> appropriately-decorative typeface as compared to Thai Kho
> Khai (U+0E02)), and from "these characters are really
> different, but might be mistaken for each other visually"
> (e.g., Latin Capital I and Katakana Small E (U+30A7)).
> I'm not sure that the latter two distinctions are important,
> but distinguishing the first one from the other two may be critical.
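The three cases distinguished above can be made concrete by looking at the code points involved. The following is only a rough sketch: the grouping labels are paraphrases of the quoted text, the helper names `describe` and `script_guess` are made up for illustration, and guessing the script from the first word of the Unicode character name is a crude assumption, not a real script-property lookup.

```python
import unicodedata

# Example characters taken from the message above, grouped by the three
# cases it distinguishes. The labels are paraphrases, not agreed terms.
EXAMPLES = {
    "same character, different scripts": ["\u0041", "\u0391", "\u0410"],
    "confusable in the right experiment": ["\u0055", "\u0E02"],
    "different but visually mistakable": ["\u0049", "\u30A7"],
}

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def script_guess(ch):
    # Assumption: the first word of the Unicode name names the script.
    # This holds for the examples here but is not a general rule.
    return unicodedata.name(ch).split()[0]

for case, chars in EXAMPLES.items():
    print(case)
    for ch in chars:
        print("  ", describe(ch), "->", script_guess(ch))
```

Whatever the rendered glyphs look like, each of these is a distinct code point, which is why the first case (Latin, Greek, and Cyrillic Capital "A") can be identified mechanically while the other two depend on typeface and perception.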
> > Getting back to the original request for guidance, I still don't
> > think "Homoglyph bundling" is the correct terminology (for the reasons
> > stated in my mail regarding whole domain names - labels). Indeed
> > having re-read some of the documents cited above, I believe the term
> > should be "Homograph(ic) bundling" as the term Homograph is used
> > consistently across the web in this context.
> And, as Cary has pointed out, it is also wrong. References
> to "consistently across the web" or to Wikipedia articles are
> useless here because they reflect mob mentality rather than
> an attempt to make very precise --and probably important--
> distinctions clear.
> > So, does anyone know how we can suggest Homoglyph to the editors of
> > OED! :-)
> Wrong question, I think.
> Given that we are looking for a precise term that can be
> precisely mapped into multiple languages, the right solution
> is to borrow a note from John Tukey and several scientific
> fields and make something up -- traditionally based on some
> language that was once widely-used but is not now in normal
> conversational use -- define it precisely, and then move
> toward getting it into the use-vocabularies of all of the
> relevant modern languages either directly or in transliteration.
> "Homograph" might work if it has not already been used too
> much as a synonym for "confusable".
> If it has been used too much, I'd recommend deferring to the
> international character of this work and choosing something
> based on some classical language rather than Greek or Latin,
> perhaps one that, like them, is primarily used only liturgically
> today. Once such a term appears in a few official translation
> dictionaries (such as an EU one), the OED and similar
> references will take care of themselves.