Language attributes- what are they?

Peter Constable petercon at microsoft.com
Sat Jan 1 20:29:58 CET 2005


> From: John Cowan [mailto:jcowan at reutershealth.com]


> > And are you really going to run a parser on the stuff I enter into
> > a document manually?
> 
> It happens all the time.  After all, all data is entered manually at some
> stage.

Ah, but apply this to dates, since that was what was under
consideration. Is anyone going to parse dates I enter in this email,
where there's no limit on how I might go about entering the date?

 
> > I doubt that's a common scenario. On the other
> > hand, a very common scenario would be that you're requesting data from
> > my server that you intend to parse, and you either need date strings
> > to be in a particular format or you want to be told what the format
> > is. That's an API: your process interacting with my process.
> 
> "want to be told what the format is":  that's a case for per-document
> tagging, aka language-tagging.  If I request a document in either Welsh or
> English, to take a more extreme case, I certainly want to be told which one
> you're sending me, by the same token, and language tagging is appropriate.

OK, you've pointed out a flaw in my logic: it's an API if you want me to
tell you the format *and* you're receiving the data because I'm asking you
to do something.

Let's suppose you're not doing something for me; you just want to know
the date format I use because you've got your own purpose in mind. Now,
since I have no way of anticipating that you've got this purpose when I
create my document, the likelihood, given current practices, is that I'm
not going to go out of my way to document my date format. I may tell you
my content is en-US and if I'm consistent in formatting date strings as
m/d/yy (which may be what most en-US authors would do), then you'd be
able to parse correctly using inferences from the language tag. But I
might also format my dates in many other ways, and unless we want to
start recommending that all content get tagged to indicate date formats,
you're not going to know for certain what I've done because, given
current practice, I'm not likely to document it. 
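
To make the ambiguity concrete, here is a minimal Python sketch of
inferring a date format from a language tag. The mapping table and the
function name are hypothetical, made up for illustration; they are not
taken from any standard or library. The same string yields two
different dates depending on which tag the reader assumes:

```python
from datetime import datetime

# Hypothetical table of *guessed* short date formats per language tag.
# These guesses are illustrative assumptions, not defined by any standard.
GUESSED_DATE_FORMATS = {
    "en-US": "%m/%d/%y",  # m/d/yy
    "en-GB": "%d/%m/%y",  # d/m/yy
}


def parse_with_guess(text, language_tag):
    """Parse text using the date format guessed from the language tag."""
    fmt = GUESSED_DATE_FORMATS[language_tag]
    return datetime.strptime(text, fmt).date()


# The same characters, read under two different guesses.
us = parse_with_guess("1/2/05", "en-US")
gb = parse_with_guess("1/2/05", "en-GB")
print(us.isoformat(), gb.isoformat())

# An en-US author who writes the date some other way defeats the guess:
# strptime raises ValueError because the text does not match m/d/yy.
try:
    parse_with_guess("2 January 2005", "en-US")
    guess_worked = True
except ValueError:
    guess_worked = False
```

The guess works only as long as the author happens to follow the
convention the reader assumed, which is the point at issue here.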

And if we were to start recommending that all content get tagged to
indicate date formats, I would not mix that into a language tag; I would
treat it as a separate metadata element.


Peter Constable


More information about the Ietf-languages mailing list