Swiss german, spoken
Karen_Broome at spe.sony.com
Wed Jun 15 01:48:34 CEST 2005
XML conventionally uses the xml:lang attribute, not a <language> tag, to
specify the content language, though a <language> tag would work in
informal use. I was speaking of the formal use for which RFC 3066 is
prescribed.
I hear an ugly rumor that Serbo-Croatian is making a big comeback in
639-3. :) It's my understanding that our dubbing teams do not feel the
differences between Serbian and Croatian are significant enough to merit
two unique dubbed versions, though they may require two unique subtitled
versions to account for the Cyrillic and Latin scripts. Not that our
industry-specific use should necessarily affect your definitions of
languages.
Also, we have legacy films that were previously determined to be
Serbo-Croatian. It seems a little weird that the language of an existing
document should change, though as you say, we could classify it as both
Serbian and Croatian.
We will likely structure our applications like so:
Serbo-Croatian (dubbing)
Serbian (subtitles)
Croatian (subtitles)
and use the filtering mechanism mentioned before in discussion of Chinese
languages. (If the Serbo-Croatian choice is selected, we could
potentially classify it as both "hr" and "sr" behind the scenes.) We may
change this down the line, but that is my understanding of our business
need today. This is why I like the spoken and written distinctions, as
well as the hierarchy, found in the 639-6 standard.
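As a rough illustration of the "behind the scenes" classification, here is a minimal Python sketch. The table name, the function names, and the (choice, kind) keys are all hypothetical; only the "sr" and "hr" tags and the three choices come from the list above.

```python
# Hypothetical table: each user-facing choice maps to the language tags
# stored behind the scenes. Names and structure are illustrative only.
CHOICE_TO_TAGS = {
    ("Serbo-Croatian", "dubbing"): {"sr", "hr"},
    ("Serbian", "subtitles"): {"sr"},
    ("Croatian", "subtitles"): {"hr"},
}


def tags_for(choice, kind):
    """Return the set of tags recorded for a (choice, kind) pair."""
    return set(CHOICE_TO_TAGS[(choice, kind)])


def choices_matching(tag, kind):
    """List the choices of the given kind whose stored tags include `tag`."""
    return sorted(
        choice
        for (choice, k), tags in CHOICE_TO_TAGS.items()
        if k == kind and tag in tags
    )
```

With this table, a filter on "hr" or on "sr" for dubbing would both find the single Serbo-Croatian version, while the same filter for subtitles would find only the matching subtitled version.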
- Karen
Harald Tveit Alvestrand <harald at alvestrand.no>
06/14/2005 11:58 AM
To: Karen_Broome at spe.sony.com
cc: "'IETF Languages Discussion'" <ietf-languages at iana.org>, "'Michael
Everson'" <everson at evertype.com>, Debbie Garside
<debbie at ictmarketing.co.uk>
Subject: RE: Swiss german, spoken
--On 14 June 2005 09:54 -0700 Karen_Broome at spe.sony.com wrote:
> Not sure I understand your point here. Yes, I have written text -- IF I'm
> using subtitles. But I need to describe BOTH written and spoken language
> in the same XML format or database.
That should be relatively easy....
<item>
  <dialogue>
    <language>zh-min</language>
  </dialogue>
  <subtitles track="1">
    <language>zh-hant</language>
  </subtitles>
  <subtitles track="2">
    <language>zh-hans</language>
  </subtitles>
</item>
(since the actual content isn't inside the XML structure, using "xml:lang"
seems a bit weird to me - but I'm no XML guru...)
Your application could consult an appropriate table to get reasonable lists
of values for the language tags inside each of the sub-identifiers.....
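A minimal Python sketch of such a table lookup might look like the following. The table contents are limited to the tags used in the example above, the names REASONABLE_TAGS and unexpected_tags are hypothetical, and the input is assumed to be XML with quoted attribute values so that a standard parser accepts it.

```python
import xml.etree.ElementTree as ET

# Hypothetical table of "reasonable" tags per sub-identifier. A real
# application would load a much fuller list; these come from the example.
REASONABLE_TAGS = {
    "dialogue": {"zh-min"},
    "subtitles": {"zh-hant", "zh-hans"},
}


def unexpected_tags(item_xml):
    """Return (element name, tag) pairs whose tag is not in the table."""
    problems = []
    for child in ET.fromstring(item_xml):
        allowed = REASONABLE_TAGS.get(child.tag, set())
        for lang in child.findall("language"):
            tag = (lang.text or "").strip().lower()
            if tag not in allowed:
                problems.append((child.tag, tag))
    return problems
```

An empty result means every <language> value was found in the table for its enclosing element; anything else is a candidate for review rather than a hard error.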
> For me, the script is often a regional variant of language much like a
> dialect. Traditional Mandarin goes to Hong Kong. Simplified goes to
> Mainland China. Serbia might get Serbian subtitles in Cyrillic, Croatia
> would get Croatian in Latin, but one "Serbo-Croatian" dubbing would
> likely serve both regions.
<distraction>actually there is no longer a tag for "serbocroatian" - it got
officially deleted from the registry after the breakup of Yugoslavia......
so you've got to tag it as either Serbian or Croatian or both....
</distraction>
> Certainly the script type could be a separate metadata property, but it
> seems like describing the script along with the language is fairly
> well-established and serves my needs well.
Yep. We've already made that decision, so let's just embrace that concept
wholeheartedly.......
Harald