MIME Type Review Request: text/CSTA-type
Bruce Lilly
blilly at erols.com
Sat Nov 6 18:22:11 CET 2004
On Tue November 2 2004 08:54, Hollenbeck, Scott wrote:
> Please review the MIME type registration template described below. The IESG
> has received a request to register this MIME type in the standards tree. It
> is a product of Ecma International. A URL for the formal specification is
> included in the template.
>
> -Scott-
> ----------
> MIME media type name: text
>
> MIME subtype name: CSTA-type
[...]
> Security considerations:
> This content type is designed to carry CSTA data types over network
> protocols. Appropriate precautions should be taken to insure that
> applications observing these CSTA objects are authorized to do so.
[...]
> Applications which use this media type:
> The text/CSTA-type MIME type is used to carry CSTA data types specified in
> CSTA XML (ECMA-323) over various types of network protocols.
>
> Additional information:
> CSTA XML (ECMA-323) is an application level protocol that enables an
> application to control and observe communications involving various types of
> media (voice calls, video calls, instant messages, Email, SMS, Page, etc.)
> and devices associated with the media.
Why is this being proposed for registration in the text media type tree?
As far as I can tell, this media type is an application data
format containing control data, not textual information in the sense
in which "text" is used in the MIME standards (RFC 2046) and in Internet
Best Current Practice (RFC 2277):
RFC 2046:
The five discrete top-level media types are:
(1) text -- textual information. The subtype "plain" in
particular indicates plain text containing no
formatting commands or directives of any sort. Plain
text is intended to be displayed "as-is". No special
software is required to get the full meaning of the
text, aside from support for the indicated character
set. Other subtypes are to be used for enriched text in
forms where application software may enhance the
appearance of the text, but such software must not be
required in order to get the general idea of the
content. Possible subtypes of "text" thus include any
word processor format that can be read without
resorting to software that understands the format. In
particular, formats that employ embeddded binary
formatting information are not considered directly
readable. A very simple and portable subtype,
"richtext", was defined in RFC 1341, with a further
revision in RFC 1896 under the name "enriched".
RFC 2277:
All human-readable text has a language.
Many operations, including high quality formatting, text-to-speech
synthesis, searching, hyphenation, spellchecking and so on benefit
greatly from access to information about the language of a piece of
text. [WC 3.1.1.4].
Humans have some tolerance for foreign languages, but are generally
very unhappy with being presented text in a language they do not
understand; this is why negotiation of language is needed.
In most cases, machines will not be able to deduce the language of a
transmitted text by themselves; the protocol must specify how to
transfer the language information if it is to be available at all.
The interaction between language and processing is complex; for
instance, if I compare "name-of-thing(lang=en)" to "name-of-
thing(lang=no)" for equality, I will generally expect a match, while
the word "ask(no)" is a kind of tree, and is hardly useful as a
command verb.
4.2. Requirement for language tagging
Protocols that transfer text MUST provide for carrying information
about the language of that text.
Buried in an Annex in the 530-page referenced ECMA document one finds:
This annex specifies a schema for CSTA data types that can be used to
encapsulate CSTA data objects so that they can be carried over various
types of network protocols such as SIP. A specific MIME media type,
text/CSTA-type is designed to contain these data objects. The formats
of the data types are specified in Clause 9.
I can see nothing in the proposed media type registration or in the
referenced document that indicates that the media type is to be
used to convey human-readable natural language text, as
opposed to application-specific data types, nor do I see any provision
for carrying information about the language of that text, as required by
RFC 2277.
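For comparison, here is a rough sketch of what language tagging looks
like for an ordinary text part. It is illustrative only and is not taken
from the registration template or from ECMA-323: the helper name
build_tagged_text_part and the sample Norwegian sentence are made up for
this example, and it assumes the Content-Language header field of RFC 3282
as the tagging mechanism, built with Python's standard email package.

```python
from email.message import EmailMessage


def build_tagged_text_part(body: str, language: str) -> EmailMessage:
    """Build a text/plain MIME part that declares the language of its text.

    Hypothetical helper for illustration; not part of any specification.
    """
    part = EmailMessage()
    # Human-readable text with an explicit charset, as RFC 2046 expects
    # for the "text" top-level type.
    part.set_content(body, subtype="plain", charset="utf-8")
    # Language information carried alongside the text (RFC 3282 header),
    # which is one way to meet the RFC 2277 tagging requirement.
    part["Content-Language"] = language
    return part


# Sample text is an assumption: a short sentence tagged as Norwegian.
part = build_tagged_text_part("Ask er et tre.", "no")
print(part.as_string())
```

The point of the sketch is only that a genuine text part has both a
charset and a place to state its language; nothing equivalent is
described for text/CSTA-type.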
A concrete example of how the proposed media type would be
used to convey human-readable text in some specified language
would help to provide justification for registration of the
proposed subtype in the text media type tree.