Media Type Dilution

Greg Smith ecomputerd at yahoo.com
Thu Apr 13 13:37:42 CEST 2006


I was pointed to this mailing list as the most
appropriate place to post the following email. It was
originally two postings made to the AVT working
group. My first post of this on Sunday, sent before I
had subscribed, ended up in moderation. My apologies
if you receive this multiple times.


It appears that alternative means are being deployed
to differentiate between major media types, in part
because the original uses of the Media Type are not
being strictly followed. Please excuse me if these
issues have been addressed before; I would appreciate
a link or two to the appropriate postings.

I have recently been asked whether the Media Type can
distinguish between audio and video. It was stated to
me that, for example, application/ogg,
application/smil, and application/smil+xml can each
contain audio, video, or text, and that these Media
Type registrations do not distinguish between audio
and video.

My response was that the Media Type *should*
differentiate between audio, video, and other major
types of media. But the top-level Media Types appear
to be suffering from dilution.

Media RSS (excerpt from version 1.1.0)

In addition, the latest draft of the Media RSS
Specification justifies the addition of a "medium"
attribute that differentiates between major "mediums"
(image | audio | video | document | executable)
because "it simplifies decision making on the reader
side, as well as flushes out any ambiguities between
MIME type and object type".

This appears to model EXACTLY the original purpose of
the Media Type (section 3 of RFC 2046,
http://www.isi.edu/in-notes/rfc2046.txt, where we see:
text | image | audio | video | application, as well as
the composite types multipart | message, and the
later-adopted "model").

ATOM

ATOM
(http://www.ietf.org/internet-drafts/draft-ietf-atompub-protocol-08.txt)
proposes to differentiate between an "entry" (in ATOM
format) and all other "media".

It seems like a very useful construct (within ATOM) to
be able to differentiate collections based on both
Top-level Media Type and Sub-type. Further dilution of
the Top-level Media Types seems contradictory to the
entire purpose of Media Type registrations.

RSS and Atom Feed Auto-Discovery for Internet TV

http://maketelevision.com/log/rss_and_atom_feed_auto-discovery_for_internet_tv
makes use of an attribute, media="tv", to
differentiate what (in my opinion) should be
differentiated by Top-level Media Type (using, for
example, "video/*").

Granted, these examples are not standards (yet), but
they substantiate the view that Top-level Media Types
as currently used are not sufficient for
differentiating media for purposes of dispatch.

Ogg

As a concrete example of the dilution, application/ogg
is the Media Type used for all audio and/or video in
the Ogg file format. Ogg often contains Vorbis-encoded
audio or Theora-encoded video. It seems appropriate,
then, that either audio/ogg and video/ogg, or
audio/vorbis and video/theora, be registered for use
with files in the Ogg format. I found that in February
2006 there was a registration application that
appropriated audio/vorbis specifically for RTP
transport, and not for the Ogg file format. This
concerns me only because audio/vorbis seems very
appropriate for Vorbis-encoded audio in the Ogg file
format, and reserving it for use only with the RTP
"container" format seems premature or somehow
misguided.

Over the past year, I've had quite an introduction to
video container formats vs. encoding formats. I
maintain the "Media Chart" for Pocket PCs at
http://www.feederreader.com/mediachart.html, which
attempts to sort out some of the confusion.

Part of the issue is that Major Type, Media Container,
and Media Encoding are variously intertwined and
mixed, but are essentially not addressed by the Media
Type specification. And because these issues sit at a
level above individual Media Types, it seems they
would need to be addressed outside any individual
Media Type application.

These issues appear to me to be ripe for
"de-confusioning" and it occurred to me that the
organization responsible for Media Types would be the
place to start.

I'm open to the suggestion that I am not fully aware
of how Media Types are used in their entirety. I am,
however, suggesting that there is a growing problem
with the use of Media Types in determining dispatch
requirements.

Functional Interpretation

Although I have not seen dispatching based on Media
Types generically described in the RFCs that I
reviewed, I have found it described in a W3C document,
"Client handling of MIME headers" [1], and in RFC3023
Section 13 (Appendix A) [2].

This seems to me to be a major use and benefit of
defining appropriate Media Types, especially with
respect to the assignment of Sub-types within the
appropriate Top-level types.

Common browsers and operating systems both allow
dispatching based on Media Types, in addition to the
"similar-to-dispatching" uses I mentioned previously
in ATOM, Media RSS, and Auto-Discovery.

Even at the operating system or browser level, this
sometimes runs into problems for the user, where a
particular audio or video Media Container contains a
Media Encoding that is not decodable by the
application that has been associated based on the
Media Type.
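
To illustrate the kind of dispatching I mean, here is
a minimal sketch in Python. The handler names and the
table itself are hypothetical; the point is only that
a fixed-string lookup on the registered Media Type
cannot see the contained encoding.

    # Minimal sketch: dispatch keyed only on the
    # registered Media Type. Handler names and table
    # entries are hypothetical illustrations.
    HANDLERS = {
        "application/ogg": "generic_ogg_player",
        "video/mp4": "mp4_player",
        "audio/mpeg": "mp3_player",
    }

    def dispatch(media_type):
        # Fixed string match, as common OS/browser
        # dispatch tables do.
        try:
            return HANDLERS[media_type]
        except KeyError:
            raise ValueError("no handler for " + media_type)

    # dispatch("application/ogg") returns the same
    # handler whether the file holds Vorbis audio or
    # Theora video, and whether or not that handler
    # actually has the needed decoder.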

Terminology Interpretation

RFC4288 Section 4.1 "Functionality Requirement" states
that "Media types MUST function as an actual media
format."

The next sentence states "Registration of things that
are better thought of as a transfer encoding...are not
allowed." and then goes on to give base64 as an
example of a transfer encoding.

It is still unclear which, if any, of the following
categories (using my prior terminology) are construed
as a "transfer encoding": Major Type, Media Container,
and Media Encoding. To me, these "categories" are the
logical way of distinguishing audio and video media
simply because these are the broad categories
determining how and if a particular file can be played
on any particular system.

"Major Type" clearly corresponds to top-level Media
Type (audio, video, etc.). Media Containers (such as
Ogg, MPEG-4 Part 14, "AVI", "MOV") clearly do *not*
correspond to RFC4288 Section 4.1 "Transfer Encoding"
and clearly correspond to actual media formats. So the
remaining question is: do Media Encodings commonly in
use correspond to "transfer encodings", in which case
they would not qualify as distinct Media Types? As
examples of Media Encoding I offer: Vorbis (audio),
Theora (video), MPEG-4 Part 2 (video), MPEG-4 Part 3
(audio), MPEG-4 Part 10 (video). A deeper and more
exhaustive survey of current actual Media Type
registrations would be useful here.
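
For concreteness, the examples above could be laid out
as a simple data structure. This Python sketch is only
an illustration of my terminology, not a registry:

    # Illustrative classification only, using the
    # examples named above.
    MAJOR_TYPES = {"text", "image", "audio", "video",
                   "application"}
    MEDIA_CONTAINERS = {"Ogg", "MPEG-4 Part 14",
                        "AVI", "MOV"}
    MEDIA_ENCODINGS = {
        "Vorbis": "audio",
        "Theora": "video",
        "MPEG-4 Part 2": "video",
        "MPEG-4 Part 3": "audio",
        "MPEG-4 Part 10": "video",
    }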

Part of the issue may be that "transfer encoding" in
the sense of  RFC4288 Section 4.1 seems to refer to
"lossless," "reversible," and "universally decodable".
Such is not the case with many audio and/or video
encodings. Based on this interpretation, I would argue
that Transfer Encodings do not refer to the common
audio and/or video encodings, and therefore that these
audio and video encodings qualify for Media Type
registration.

Possible Solution

The danger is that if every combination of Media
Containers and Media Encodings were registered as
distinct Media Types, the result would be an
astronomical increase in the number of Media Types and
it would increase the number, if not the complexity,
of the dispatch rules within operating systems and
browsers.

It seems one way around this might be to use a
"+suffix" mechanism (obviously specified such that it
is backward compatible with RFC3023), or some other
"extension". It could be specified that only Media
Containers qualify as a Media Sub-type, and that one
or more (optional) suffixes would indicate the
contained media encodings. Containers that can
contain, or commonly do contain, only audio would be
registered under the audio top-level type. Containers
that can contain, or commonly do contain, video (with
synchronized audio) would be registered under the
video top-level Media Type. Media Containers that
serve both functions (including audio only) would be
registered under both the audio and video top-level
Media Types.
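
As a sketch of what such a string might look like and
how it could be interpreted, in Python. The specific
type strings such as "video/ogg+theora+vorbis" are
hypothetical; nothing of the sort is registered today.

    # Hypothetical "container plus encoding suffixes"
    # form, e.g. "video/ogg+theora+vorbis".
    def parse_media_type(media_type):
        top_level, _, rest = media_type.partition("/")
        container, *encodings = rest.split("+")
        return top_level, container, encodings

    # parse_media_type("video/ogg+theora+vorbis")
    #   -> ("video", "ogg", ["theora", "vorbis"])
    # parse_media_type("audio/ogg+vorbis")
    #   -> ("audio", "ogg", ["vorbis"])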

With the added "Encoding extension" mechanism, there
may be complications in that current dispatching
mechanisms are "fixed" meaning that the media
dispatchers expect a fixed string match of Media Type
to dispatchee. This does not easily permit
interpretation of added-in-arbitrary-order suffixes.
To solve this problem, the order of suffixes could be
programmatically determined (for example, either
absolute alphabetical or alphabetical within major
groups of audio, video, other). This way, only one
permutation of each combination of Media Encodings
would need to be added.
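
A sketch of that canonicalization rule, again
hypothetical, here using simple alphabetical order for
the encoding suffixes:

    # Produce one canonical permutation so a
    # fixed-string dispatcher only ever needs to know
    # a single form of each encoding set.
    def canonical_media_type(top_level, container,
                             encodings):
        return top_level + "/" + "+".join(
            [container] + sorted(encodings))

    # canonical_media_type("video", "ogg",
    #                      ["vorbis", "theora"])
    #   -> "video/ogg+theora+vorbis"
    # canonical_media_type("video", "ogg",
    #                      ["theora", "vorbis"])
    #   -> "video/ogg+theora+vorbis"  (same either way)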

Summary

I'm not sure if this is viewed as a problem by others,
but it seems to be a serious issue that is beginning
to be "worked around" outside of, and parallel to, the
Media Type definitions in several different and
incompatible ways. Stricter interpretation and
differentiation of top-level types, and/or formal
recognition of Media Containers and Media Encodings
within the Media Type system, would be significant and
appropriate improvements to Media Type registration.


References

[1] "Client handling of MIME headers"

http://www.w3.org/2001/tag/doc/mime-respect-20030709
"The architecture of the Web depends on applications
making dispatching and security decisions for
resources based on their Internet Media Types and
other MIME headers."

http://www.w3.org/2001/tag/doc/mime-respect.html
The current version states "For example, HTTP and MIME
use the value of the "Content-Type" header field to
indicate the Internet media type of the
representation, which influences the dispatching of
handlers and security-related decisions made by
recipients of the message."

Section 3.1 (excerpt):
A media type is not simply an indication of data
format; it also refers to a preferred interpretation
of that data format. This preferred interpretation may
impact the recipient's functional decisions, such as
whether the data is rendered, stored, or executed. In
practice, media types are often used as the key for
selecting an appropriate handler to interpret the data
received. It is possible for a single data format to
be associated with multiple media types and for a
single media type to describe a superset of many
different data formats.
--- end excerpt

[2] http://www.ietf.org/rfc/rfc3023.txt


Greg Smith

