Media Types in 3GPP Timed text draft (was: RE: [AVT] RTP and Media Types)

Magnus Westerlund magnus.westerlund at
Tue Sep 7 14:26:23 CEST 2004

Hi Jose,

See below.

Jose Rey wrote:

>>Is this fundamentally text or a video codec? If it's a video
>>codec, it should be under "video/", otherwise under "text/".
> I think it is a video codec, since without the video capabilities
> (modifiers) it would just provide the same services as, e.g.,
> conversational text = just plain timed text, for which it is not intended
> to be used.

I would argue that it is in fact text. It is rich text that has both 
different types of decoration, like colors and fonts, and time-based 
modifications, like karaoke markings. At bottom, however, it is always 
text. It cannot show anything other than the Unicode charsets defined 
in section 5.3 of 3GPP TS 26.245. It is also readable with minimal 
effort by simply extracting it from the RTP payload. I would think that 
basic extraction of complete text blocks is, when it comes to RTP, 
really equivalent to being able to read something with CAT.

>>>2.- we are not clear on what exactly it means to "relax rules for media
>>>registration under text/".  I.e. is text/t140 an example of these
>>>"relaxed" rules or does it comply with the traditional rules as per
>>>rfc 2046?  Do the relaxed rules just mean that besides text also
>>>payload headers of that media type are understood?
>>My understanding is that the new rules are intended to allow formats
>>such as 3GPP timed text to be registered under the text top-level media
>>type, if appropriate, provided their domain of applicability is clearly
>>specified (e.g. the domain of applicability might be that the type is
>>defined for transfer via RTP only).
> The MIME subtype /3gpp-tt cannot be used for HTML download since for that
> purpose a 3gp file and therefore the video/3gp MIME type is used.  So I
> think this is indeed restricted to RTP.  However,  what is the gain of doing
> that?   Given the answer to the first question I think registering under
> text/ would not be of any use?

I think there is a purpose in separating something that is of a 
different media type. It is clear that text and video might not be as 
well separated as video and audio. However, there is some difference in 
what devices need in order to present text and video. Text is generally 
much less complex, and does not require the same resolution, color 
support, etc., to be appropriately interpreted. Sure, it can benefit 
from having the same capabilities, but in many cases they are not 
necessary to avoid losing information. Thus I would argue that there is 
a purpose in expressing that difference using different media types.

A 3GPP timed text session is also a complementing medium to a video 
stream, or it may be the other way around, with video complementing 
the text. So from an AVT perspective I would expect it to be common 
that you will actually have the video and timed text in different RTP 
sessions. This is due to selectability, adaptation and pure simplicity 
in handling. However, this is not really an argument either way for 
choosing text over video or vice versa.

I do however agree with Dave that we should only select one media type. 
I brought up the potential usage of having two simply to see if there 
is any merit in it. However, I don't think there is any gain, only 
trouble down that road.


Magnus Westerlund

Multimedia Technologies, Ericsson Research EAB/TVA/A
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund at
