FWS and FI MIME types <was> Re: MIME types for preliminary community review

Fri Nov 19 12:31:23 CET 2004

Ben Morrow wrote:
> At  9pm on 11/11/04 you (Paul Sandoz) wrote:
> 
> 
>>MIME media type name:
>>application
>>
>>MIME subtype name:
>>soap+fastinfoset
> 
> 
> This is an incorrect generalization of the name 'soap+xml'. Names with +
> in them are (IIRC) currently reserved, except for the case of '+xml' to
> indicate a format which uses XML. The name should be
> 'soap-fastinfoset', or you should propose that the IETF extend the
> '+xml' syntax to other types (such as fastinfoset) and get an IANA
> registry set up for such types... personally, I think that would be a
> very good idea (I can think of at least two other uses for this
> convention: '+zip' for formats which are a zip file with a specific
> layout, such as JAR, Perl's PAR, mozilla.org's XPI and Quake's pk3; and
> '+gzip'/'+bzip2' to denote a gzip compressed version of a format), but
> it would need proper discussion and specification.
>

After some browsing: nowhere in the new MIME specification and 
registration draft [1], RFC 2048 [2] or RFC 3023 [3] does it state that 
names with '+' in them are reserved. But I think what you are getting at 
is that '+' may now be commonly accepted as a mechanism in MIME even 
though this is not specified?

From RFC 3023, A.12:

"It was thought that '+' expressed the semantics that a MIME type can be 
treated (for example) as both scalable vector graphics AND ALSO as XML; 
it is both simultaneously."

It is by that same reasoning that 'soap+fastinfoset' was chosen: it can 
be treated as both SOAP and also as a fast infoset document.
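
To make the "both X and also Y" reading of '+' concrete, here is a minimal 
sketch in Python that splits a media type at the last '+'. The function 
name split_media_type is hypothetical, and the split reflects only the 
informal reading described above, not any registered suffix rule.

```python
def split_media_type(media_type: str) -> tuple:
    """Split 'type/base+suffix' into (type, base, suffix).

    The suffix is '' when the subtype contains no '+'.
    Illustrative only: this is an informal reading of '+', not a rule
    taken from any MIME specification.
    """
    top, _, subtype = media_type.partition("/")
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        # No '+' present: rpartition returns ('', '', subtype).
        return top, suffix, ""
    return top, base, suffix


print(split_media_type("application/soap+fastinfoset"))
print(split_media_type("application/fastinfoset"))
```

Under this reading 'application/soap+fastinfoset' splits into a 'soap' 
part and a 'fastinfoset' part, in the same way 'soap+xml' splits into 
'soap' and 'xml'.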

The intention was not to specify a naming convention as in section 7 of 
RFC 3023. I think it is far too premature for this. However, it would be 
good to be consistent with existing patterns if such a convention were 
deemed appropriate in the future.

> 
>>Magic number(s):
>>A W3C SOAP message infoset serialized as a fast infoset document may begin 
>>with one of the following byte sequences:
>>
>>- a byte sequence that corresponds to a UTF-8 encoded XML declaration of the 
>>string "<?xml version='1.0' encoding='finf'", the first five bytes of which 
>>are hexadecimal 3C 3F 78 6D 6C;
> 
> 
> The first five bytes here are not the relevant ones; the important bytes
> are the 66 69 6e 66 at position 31; perhaps a sentence to the effect of
> 'fast infoset documents can be XML documents with an encoding of 'finf'
> specified in the XML declaration' with a reference to the XML
> registration for how to identify and parse an XML document.

Good point, the first five bytes will only indicate a UTF-8 encoded XML 
document or a fast infoset document.

In the Fast Infoset specification we declare 9 possible character 
strings encoded in UTF-8:

<?xml encoding='finf'?>
<?xml encoding='finf' standalone='yes'?>
<?xml encoding='finf' standalone='no'?>
<?xml version='1.0' encoding='finf'?>
<?xml version='1.0' encoding='finf' standalone='yes'?>
<?xml version='1.0' encoding='finf' standalone='no'?>
<?xml version='1.1' encoding='finf'?>
<?xml version='1.1' encoding='finf' standalone='yes'?>
<?xml version='1.1' encoding='finf' standalone='no'?>

I think it best to be explicit and list these directly.
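
As a rough illustration of how a consumer might test for these, here is a 
minimal sketch in Python that builds the nine declarations listed above 
and checks whether some octets begin with one of them. The names 
FINF_DECLARATIONS and starts_with_finf_declaration are hypothetical. A 
fast infoset document that omits the declaration would not be matched 
by this check.

```python
from itertools import product

# Optional parts of the declaration, in the order they appear in the list above.
VERSIONS = ("", " version='1.0'", " version='1.1'")
STANDALONE = ("", " standalone='yes'", " standalone='no'")

# The nine declarations, encoded in UTF-8.
FINF_DECLARATIONS = tuple(
    f"<?xml{version} encoding='finf'{standalone}?>".encode("utf-8")
    for version, standalone in product(VERSIONS, STANDALONE)
)


def starts_with_finf_declaration(data: bytes) -> bool:
    """Return True if data begins with one of the nine declarations."""
    # bytes.startswith accepts a tuple of candidate prefixes.
    return data.startswith(FINF_DECLARATIONS)


for declaration in FINF_DECLARATIONS:
    print(declaration.decode("utf-8"))
```

Matching the whole declaration, rather than only the leading 
3C 3F 78 6D 6C, is what separates these from an ordinary UTF-8 encoded 
XML document.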

> Also, this
> does not identify specifically a soap fast infoset entity, merely a
> generic fast infoset entity, so this whole section should probably just
> reference the next registration unless there is a way to reliably
> identify those fast infosets used in SOAP.
>

Referencing the generic MIME type is a good idea.

After successfully identifying a fast infoset document, one would have 
to parse it (the same applies to soap+xml), look at the properties of 
the root element information item, and check that it conforms to the 
SOAP Envelope element information item [4]. I should recheck the 
SOAP 1.2 MIME type registration to see what it says.
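
For the soap+xml case, the root element check could look something like 
the following minimal sketch in Python. It covers only the XML 
serialization, since the Python standard library has no fast infoset 
parser; a fast infoset parser would need the equivalent check on the 
infoset it produces. The function name is hypothetical, and the check 
looks only at the root element's namespace name and local name, which 
is far less than full conformance to [4].

```python
import xml.etree.ElementTree as ET

# Namespace name of the SOAP 1.2 envelope.
SOAP12_ENVELOPE_NS = "http://www.w3.org/2003/05/soap-envelope"


def is_soap12_envelope(xml_bytes: bytes) -> bool:
    """Return True if the root element is a SOAP 1.2 Envelope."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError:
        # Not well-formed XML, so it cannot be a SOAP message.
        return False
    # ElementTree writes qualified names as '{namespace}local-name'.
    return root.tag == f"{{{SOAP12_ENVELOPE_NS}}}Envelope"


message = (
    b"<env:Envelope xmlns:env='http://www.w3.org/2003/05/soap-envelope'>"
    b"<env:Body/></env:Envelope>"
)
print(is_soap12_envelope(message))
```

The point is that identification needs a parse of the document; no fixed 
octet sequence at the start can establish that the content is SOAP.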

> I'm a little worried about 'bytes' as opposed to 'octets': what's the
> IETF position on this? 'octets' is definitely less ambiguous.
>

I will change to use 'octets', making it consistent with what is used in 
the normative parts of the Fast Infoset specification.

> 
>>MIME media type name:
>>application
>>
>>MIME subtype name:
>>fastinfoset
> 
> 
> Is it worth registering two separate types, 'application/fastinfoset'
> and 'application/fastinfoset+xml' for fast infosets that are not and are
> encoded as XML documents, respectively?
>

The latter would not make any sense. A fast infoset document and an XML 
document are two distinct serializations of an XML infoset. The infoset 
may have been produced by parsing an XML document, a fast infoset 
document, or by other means (a synthetic infoset). An infoset produced 
by parsing a fast infoset document may be serialized to an XML document. 
So it is possible to convert between fast infoset documents and XML 
documents and round trip an infoset (and also the serializations if 
using canonical forms). Thus 'application/fastinfoset+xml' would be 
equivalent to 'text/xml'.

Paul.

[1] http://www.ietf.org/internet-drafts/draft-freed-media-type-reg-01.txt
[2] http://rfc.net/rfc2048.html
[3] http://rfc.net/rfc3023.html
[4] http://www.w3.org/TR/soap12-part1/#soapenvelope

-- 
| ? + ? = To question
----------------\
    Paul Sandoz
         x38109
+33-4-76188109