Media type versioning (was: Re: Scripting Media Types)

Thu Mar 10 09:26:50 CET 2005

>>> My opinion is that content negotiation for versioning capability
>>> using MIME type parameters is unworkable.
>> Negotiation isn't always possible; sometimes it's a one-way
>> street.

I was using "negotiation" in a more general sense. If there
are no choices to be made, then there's no point in sending
any additional information. And my point isn't that doing
content negotiation or, in general, making choices about content
is impossible -- rather that the specific proposal of using
MIME type _parameters_ is a bad design choice for conveying
information as slippery as _version_.

>>> The use of the MIME 
>>> type is to describe the payload sufficiently so that you know
>>> HOW to process it, but the content-type label cannot (and thus
>>> should not) be extended in an attempt to use it to determine
>>> WHETHER a given processor knows enough to be able to process it.
>> We're in agreement that the media type/subtype tag should not
>> be used for versioning (that would result in an unnecessary
>> proliferation of tags).
> Is this really so bad?  There are all of 93 application/foo's 
> registered.

The original point was about whether it is possible
to decide whether a receiver is capable of interpreting
something, in general, just based on the MIME type.

But registering and using a new MIME type for a new version
is often just the right thing:

Unless the old MIME type was defined carefully with sufficient
attention to extensibility so that all old processors will
behave gracefully when presented with new content, registering
a new type may be the only choice.

>>> So whether version information should or should not be in
>>> a media type parameter depends pretty much on whether there
>>> is an embedded, easy-to-find version indicator in the data
>>> itself; if there isn't, and processors need to know the version
>>> to choose between different processing methods, then the
>>> version parameter should be mandatory. There is no strong
>>> use case for an optional version parameter, or in general
>>> for duplicating, in MIME parameters, information that is
>>> readily obtained from the content itself.

>> Maybe; if transfer encoding is applied, that would entail
>> decoding and opening the media content to determine version
>> information.  If the media type is specified via
>> message/external-body or a similar mechanism, additional
>> steps are necessary to obtain the media in order to
>> determine version information.  If a particular instance
>> of content is large, that may result in a substantial waste
>> of resources if it turns out to have an incompatible version.

I don't think those are problems; aren't these forwarding bodies
(that are looking at MIME types and parameters) also unwrapping
transfer encodings and fetching external bodies? 

You're optimizing what I assume is mainly a failure case,
anyway: you got sent something you can't read. In some cases,
you're just deciding _which_ processor to send it to, so fetching
the whole body isn't an extra overhead anyway.

> I don't like the idea of embedded data being required to determine if the
> entity can accept the message.  By that point in time, it is too late.  If
> you need a data element to determine, at the protocol level, you can
> process a message, then you need that data element at the protocol level.
> E.g., a MIME parameter rather than an XML tag in a body part.

MIME parameters aren't the only place to put auxiliary
information. If you need auxiliary information, put it somewhere
else -- such as in a content-features header. Don't overload
everything into the same header.

>>> If you want to do content negotiation, then consider
>>> using media features and media feature negotiation; with
>>> media features you can negotiate not only version information,
>>> but other parameters that might also be necessary to know
>>> in order to determine interpretability, e.g., availability
>>> of compression modes, codecs, fonts, color capabilities,
>>> buffer size limitations, etc.

>> That's fine for protocols that support negotiation, but
>> not all protocols do so (HTTP does, RFC 3297 layered on
>> top of messaging can, but FTP does not).

FTP in most cases is done without MIME anyway -- the receiver
guesses the MIME type based on the file extension and sniffing
of information about the host operating system.

I meant "negotiation" in a more general way: describing content
in a way that subsequent processors can make decisions about
whether or how to process without opening the content itself.
In this case, I recommend putting this auxiliary data somewhere
where it won't just confuse and pollute the processing that
is necessary for the content-type itself.

"Content-features" is useful in cases where there is no back-channel.
You can ignore it if you want, but if you're in a situation where further
processing of some content-types is expensive (expensive
local processor, or forwarding agent to further processor)
and you have some information about the next step's capabilities,
you could try to match the next step's capabilities against
the content-features. 
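
As a rough illustration of that matching step, here is a minimal Python
sketch. It assumes a simplified subset of the feature-expression syntax
(a flat conjunction of "(tag=value)" terms, nothing nested). The
"version" tag, the application/example type, the sample values, and the
helper names are all hypothetical, chosen only to show the idea.

```python
import re
from email.message import Message

# Matches one "(tag=value)" term. This is a deliberate simplification of
# real feature expressions: no nesting, no ranges, no alternatives.
TERM = re.compile(r"\(\s*([A-Za-z0-9.-]+)\s*=\s*([^()\s]+)\s*\)")


def parse_features(expr):
    """Return {tag: value} for a flat '(& (a=b) (c=d))' conjunction."""
    return {tag.lower(): value for tag, value in TERM.findall(expr)}


def next_step_accepts(msg, capabilities):
    """True if every labeled feature is within the next step's capabilities.

    A message without the header is passed along, since the label is
    optional and can be ignored.
    """
    header = msg.get("Content-features")
    if header is None:
        return True
    features = parse_features(header)
    return all(value in capabilities.get(tag, set())
               for tag, value in features.items())


# Hypothetical message: the version travels beside the content type,
# not inside its parameters.
msg = Message()
msg["Content-Type"] = "application/example"
msg["Content-features"] = "(& (version=2) (color=binary))"

# Hypothetical capability sets for two possible next steps.
old_processor = {"version": {"1"}, "color": {"binary"}}
new_processor = {"version": {"1", "2"}, "color": {"binary"}}

print(next_step_accepts(msg, old_processor))
print(next_step_accepts(msg, new_processor))
```

The decision is made from the wrapper alone, and the Content-Type line
stays free of anything that isn't needed to identify the type.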

> > Summary:
> > o media content might contain version information, but reliance
> >   on that can result in wasted resources
> 
> And a layer violation and asking for hacked stacks.

Hardly. Pulling out redundant information and sticking it
on the wrapper to facilitate processing is an optimization,
but it is also a "layer violation". Sometimes we tolerate
layer violations if they're important for optimizing performance.

>> o negotiation via content-features may be a viable mechanism
>>   where negotiation is possible, but it is not always
>>   possible (and some means of indicating version must be
>>   available to the content negotiation mechanism)

Wrong. "Content-Features" can carry version information
just as well as the parameters of "Content-Type" can. There is no
situational difference.

> > o proliferation of type/subtypes tags for the sole purpose
> >   of versioning is undesirable
> 
> I'm not sold on this one.  If a new version of a media type is not
> backwards-compatible, I would offer that it is a different media type.  As
> noted above, in 8 years of MIME types, we have all of 93 
> application/foo registered types.

New subtypes are often necessary for new versions, although
not always "desirable".

> > that leaves parameters associated with the media type as an
> > efficient mechanism for indicating version information.

I disagree -- it's extraneous information, unnecessary and often
lost in processing pipelines.  

We haven't really talked much about the terrible deployment experience
and prospects for *any* MIME parameters. They usually get lost;
most of the operating systems that support MIME type mappings to
files don't support parameters (Windows file associations, MacOS,
mime.types, etc.).  
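
To make the loss concrete, here is a minimal Python sketch of a handler
table keyed on type/subtype alone, which is roughly what a
mime.types-style mapping gives you. The application/example type, its
"version" parameter, and the handler name are hypothetical.

```python
from email.message import Message

# Hypothetical handler table keyed on type/subtype only, with no room
# for parameters.
HANDLERS = {"application/example": "example-viewer"}

msg = Message()
msg["Content-Type"] = 'application/example; version="2.0"'

# The parameter is present on the wire...
version_on_wire = msg.get_param("version")

# ...but the lookup uses only type/subtype, so the handler is chosen
# without ever seeing the version.
media_type = msg.get_content_type()
handler = HANDLERS.get(media_type)

print(version_on_wire, media_type, handler)
```

Any step in a pipeline that stores or dispatches on the bare type/subtype
behaves this way, and the version silently goes with it.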

> After saying that this is not the only method, I also would offer that one
> is ALWAYS free, in the definition of the MIME type, to list a parameter and
> describe normative behavior for processing that parameter.

We could write standards about castles in the air all day, and
I suppose it would be "free" in some sense of the word, but it
is wrong, and detracts from the value of the standard. Don't
specify things that either don't work or are undeployable. Otherwise,
what's the point?

Larry
-- 
http://larry.masinter.net