Swiss german, spoken

Karen_Broome at spe.sony.com Karen_Broome at spe.sony.com
Wed Jun 15 01:48:34 CEST 2005


Well-formed XML uses the xml:lang attribute to specify the content
language, not a <language> tag, though that would work in informal use.
I was speaking of the formal use for which RFC 3066 is prescribed.
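
As a rough illustration of the xml:lang approach, here is a minimal Python sketch that reads the attribute with the standard library. The element names (item, dialogue, subtitles) and the track attribute are borrowed from the example quoted further down and are only assumptions about what a real schema would look like; the one thing fixed by the XML specification is the xml:lang attribute itself, which ElementTree exposes under the XML namespace URI.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the reserved "xml:" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: element names are illustrative, not from any schema.
DOC = """<item>
  <dialogue xml:lang="zh-min"/>
  <subtitles track="1" xml:lang="zh-hant"/>
  <subtitles track="2" xml:lang="zh-hans"/>
</item>"""


def languages(xml_text):
    """Return (element name, xml:lang value) for each direct child that has one."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.get(XML_LANG)) for el in root if el.get(XML_LANG) is not None]


if __name__ == "__main__":
    for tag, lang in languages(DOC):
        print(tag, lang)
```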

I hear an ugly rumor that Serbo-Croatian is making a big comeback in 
639-3.  :)  It's my understanding that our dubbing teams do not feel the 
differences between Serbian and Croatian are significant enough to merit 
two unique dubbed versions, though they may require two unique subtitled 
versions to account for the Cyrillic and Latin scripts. Not that our 
industry-specific use should necessarily affect your definitions of 
languages. 

Also, we have legacy films that were previously determined to be 
Serbo-Croatian. It seems a little weird that the language of an existing 
document should change, though as you say, we could classify it as both 
Serbian and Croatian.

We will likely structure our applications like so:

Serbo-Croatian (dubbing)
Serbian (subtitles)
Croatian (subtitles)

and use the filtering mechanism mentioned earlier in the discussion of Chinese 
languages.  (If the Serbo-Croatian choice is selected, we could 
potentially classify it as both "hr" and "sr" behind the scenes.) We may 
change this down the line, but that is my understanding of our business 
need today. This is why I like the spoken and written distinctions, as 
well as the hierarchy, found in the 639-6 standard. 
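
To make the "behind the scenes" idea concrete, here is a minimal Python sketch under stated assumptions. The table CHOICE_TAGS and the helpers tags_for and matches are hypothetical names, not part of any existing application, and since the earlier filtering discussion is not reproduced here, the simple prefix matching below is only a guess at how such a filter might behave.

```python
# Hypothetical mapping from the application choices listed above to language tags.
CHOICE_TAGS = {
    ("Serbo-Croatian", "dubbing"): ["hr", "sr"],
    ("Serbian", "subtitles"): ["sr"],
    ("Croatian", "subtitles"): ["hr"],
}


def tags_for(language, kind):
    """Return the tags stored behind the scenes for a user-facing choice."""
    return CHOICE_TAGS[(language, kind)]


def matches(asset_tags, wanted):
    """True if any tag equals the wanted tag or extends it with further subtags."""
    wanted = wanted.lower()
    return any(
        tag.lower() == wanted or tag.lower().startswith(wanted + "-")
        for tag in asset_tags
    )


if __name__ == "__main__":
    dub = tags_for("Serbo-Croatian", "dubbing")
    print(dub, matches(dub, "sr"), matches(dub, "hr"))
```

With this arrangement a single dubbed asset would be found by a search for either Serbian or Croatian, while each subtitled asset would be found by only one of them.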

- Karen





Harald Tveit Alvestrand <harald at alvestrand.no>
06/14/2005 11:58 AM

 
        To:     Karen_Broome at spe.sony.com
        cc:     "'IETF Languages Discussion'" <ietf-languages at iana.org>, "'Michael Everson'" <everson at evertype.com>, Debbie Garside <debbie at ictmarketing.co.uk>
        Subject:        RE: Swiss german, spoken




--On 14. juni 2005 09:54 -0700 Karen_Broome at spe.sony.com wrote:

> Not sure I understand your point here. Yes, I have written text -- IF I'm
> using subtitles. But I need to describe BOTH written and spoken language
> in the same XML format or database.

That should be relatively easy....

<item>
  <dialogue>
    <language>zh-min</language>
  </dialogue>
  <subtitles track="1">
    <language>zh-hant</language>
  </subtitles>
  <subtitles track="2">
    <language>zh-hans</language>
  </subtitles>
</item>

(since the actual content isn't inside the XML structure, using "xml:lang"
seems a bit weird to me - but I'm no XML guru...)

Your application could consult an appropriate table to get reasonable lists
of values for the language tags inside each of the sub-identifiers.....
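
One possible shape for such a table, as a minimal Python sketch: the dictionary ALLOWED_TAGS and the function is_allowed are hypothetical, and the values are simply the three tags from the example above rather than any authoritative list.

```python
# Hypothetical lookup table: which language tags each sub-identifier may carry.
ALLOWED_TAGS = {
    "dialogue": {"zh-min"},
    "subtitles": {"zh-hant", "zh-hans"},
}


def is_allowed(element, tag):
    """Check a tag against the table, ignoring case as language tags do."""
    return tag.lower() in ALLOWED_TAGS.get(element, set())


if __name__ == "__main__":
    print(is_allowed("subtitles", "zh-Hant"))
    print(is_allowed("dialogue", "zh-hans"))
```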

> For me, the script is often a regional variant of language much like a
> dialect. Traditional Mandarin goes to Hong Kong. Simplified goes to
> Mainland China. Serbia might get Serbian subtitles in Cyrillic, Croatia
> would get Croatian in Latin, but one "Serbo-Croatian" dubbing would
> likely  serve both regions.

<distraction>actually there is no longer a tag for "serbocroatian" - it got
officially deleted from the registry after the breakup of Yugoslavia......
so you've got to tag it as either Serbian or Croatian or both....
</distraction>

> Certainly the script type could be a separate metadata property, but it
> seems like describing the script along with the language is fairly
> well-established and serves my needs well.

Yep. We've already made that decision, so let's just embrace that concept 
wholeheartedly.......

                       Harald











