Registration of MIME media type application/vnd.php.serialized

Mario Salzer mario at erphesfurt.de
Tue Aug 24 06:32:27 CEST 2004


This vendor-tree MIME type registration describes the data format
emitted by PHP's "serialize()". The format is no longer used only within
PHP; some independent implementations exist, so it may be becoming an
exchange format.

That alone doesn't require a MIME type, but people have also started
pushing data in this format over the wire ("me too!").  See also
"PHP-RPC" in [http://geri.cc.fer.hr/~ivoras/web2/papers/phprpc.pdf]
-- not the best example, but it at least highlights the necessity of
a MIME type; otherwise people fall back to text/html.

* I guess by RFC2048 the PHP group [//php.net] should have filed this
  instead, but if they had any interest in it they would have done so
  long ago.  The vendor name would more accurately be "zend" or
  "phpnet", but simply "application/vnd.php.serialized" looks much better.

* I'm not sure whether this MIME subtype should have the trailing 'd'
  or simply be called "/vnd.php.serialize".  The "/...d" version,
  however, seems to match better with "/x-www-form-urlencoded".

* I've included an as-short-as-possible description, because Google
  didn't find anything worth referencing, and I'm really not aware of
  an official format spec.

* Btw, "Person & email address to contact for further information" could
  be a mailing list instead, couldn't it?

hope the wording is 'good enough' for a vnd. subtype :)
mario



   ---------------------------------------------------------------------


   MIME media type name: application

   MIME subtype name: vnd.php.serialized

   Required parameters: NONE

   Optional parameters:

      version=x.y.z

         SHOULD be added in responses (not in Accept: headers [RFC2295])
         to indicate which PHP interpreter version serialized the data.
         This is useful for identifying and possibly working around bugs
         (format violations) in earlier implementations.
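
         As a minimal sketch of how a sender might attach the version
         parameter and a receiver might read it back, the following uses
         Python's standard email.message module. The helper names
         build_content_type and parse_version are hypothetical, and
         "4.3.8" is only an example value.

```python
from email.message import Message

MEDIA_TYPE = "application/vnd.php.serialized"


def build_content_type(php_version=None):
    """Return a Content-Type value, adding version=x.y.z when known."""
    if php_version is None:
        return MEDIA_TYPE
    return f"{MEDIA_TYPE}; version={php_version}"


def parse_version(content_type):
    """Return the version parameter, or None if absent or wrong type."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != MEDIA_TYPE:
        return None
    return msg.get_param("version")


header = build_content_type("4.3.8")   # example version, an assumption
version = parse_version(header)
```

         A receiver could use the returned string to select workarounds
         for known format violations, and fall back to default handling
         when the parameter is missing.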

   Encoding considerations: BINARY

      As per current use, content assigned this MIME type should be
      considered to be in the ISO-8859-1 (Latin-1) character set, and
      may also contain arbitrary non-printable characters and binary
      octets.

      Transport channels not capable of handling raw 8-bit data without
      corruption SHOULD therefore apply a Transfer-Encoding of "base64"
      or something similar.
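
      The base64 wrapping suggested above could look roughly like this
      in Python. It is only a sketch: the function names are
      hypothetical, and the sample payload is a serialized Latin-1
      string containing one non-ASCII octet.

```python
import base64


def wrap_for_7bit(payload: bytes) -> bytes:
    """Encode raw serialized octets so they survive a 7-bit channel."""
    return base64.b64encode(payload)


def unwrap_from_7bit(encoded: bytes) -> bytes:
    """Restore the original octets; reject non-base64 input."""
    return base64.b64decode(encoded, validate=True)


# 'caf\xe9' is "cafe" with e-acute in ISO-8859-1: four octets.
payload = b's:4:"caf\xe9";'
encoded = wrap_for_7bit(payload)
restored = unwrap_from_7bit(encoded)
```

      The encoded form contains only ASCII characters, while the
      restored octets are identical to the original payload, so the
      length field of the string entity remains valid.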

   Security considerations:

      Implementations SHOULD NOT attempt to reproduce more complex data
      types like objects, file descriptors, function bytecode and
      pointers, because most of these are typically of little use if the
      received data came from a remote machine or even from a different
      programming language.
      For optimal security, only the basic data types should be
      extracted, and a message should be rejected in full as soon as an
      unknown data type identifier is detected.

      Strongly typed languages with fixed-length variable storage
      should additionally take care not to expand entities of the format
      described herein beyond memory boundaries, even though none of the
      mentioned types is particularly difficult to decode.
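
      A receiver following this advice might be sketched as below. This
      is not the PHP implementation; it is a hypothetical Python decoder
      (the names loads and FormatError are assumptions) that accepts
      only the basic types from the Format description in this
      registration, rejects the whole message on any other type
      identifier, and checks declared lengths against the data actually
      received. Real PHP output may contain float forms outside that
      grammar, such as exponent notation, which this sketch rejects.

```python
class FormatError(ValueError):
    """Raised for malformed input or a rejected type identifier."""


def loads(data: bytes, max_depth: int = 32):
    """Decode one value; reject anything but the basic data types."""
    value, pos = _parse(data, 0, max_depth)
    if pos != len(data):
        raise FormatError("trailing data after value")
    return value


def _read_until(data, pos, terminator):
    end = data.find(terminator, pos)
    if end < 0:
        raise FormatError("missing terminator")
    return data[pos:end], end + 1


def _to_count(raw):
    if not raw.isdigit():
        raise FormatError("bad length/count field")
    return int(raw.decode("ascii"))


def _to_int(raw):
    digits = raw[1:] if raw[:1] == b"-" else raw
    if not digits.isdigit():
        raise FormatError("bad integer")
    return int(raw.decode("ascii"))


def _to_float(raw):
    body = raw[1:] if raw[:1] == b"-" else raw
    whole, dot, frac = body.partition(b".")
    if not (dot and whole.isdigit() and frac.isdigit()):
        raise FormatError("bad float")
    return float(raw.decode("ascii"))


def _parse(data, pos, depth):
    if depth < 0:
        raise FormatError("nesting too deep")
    tag = data[pos:pos + 2]
    if tag == b"N;":
        return None, pos + 2
    if tag == b"b:":
        raw, pos = _read_until(data, pos + 2, b";")
        if raw not in (b"0", b"1"):
            raise FormatError("bad boolean")
        return raw == b"1", pos
    if tag == b"i:":
        raw, pos = _read_until(data, pos + 2, b";")
        return _to_int(raw), pos
    if tag == b"d:":
        raw, pos = _read_until(data, pos + 2, b";")
        return _to_float(raw), pos
    if tag == b"s:":
        raw, pos = _read_until(data, pos + 2, b":")
        end = pos + 1 + _to_count(raw)
        # The declared length must fit inside the received data.
        if data[pos:pos + 1] != b'"' or data[end:end + 2] != b'";':
            raise FormatError("bad string framing or length")
        return data[pos + 1:end].decode("latin-1"), end + 2
    if tag == b"a:":
        raw, pos = _read_until(data, pos + 2, b":")
        count = _to_count(raw)
        if data[pos:pos + 1] != b"{":
            raise FormatError("missing opening brace")
        pos += 1
        result = {}
        for _ in range(count):
            key, pos = _parse(data, pos, depth - 1)
            if isinstance(key, bool) or not isinstance(key, (int, str)):
                raise FormatError("array key must be int or string")
            value, pos = _parse(data, pos, depth - 1)
            result[key] = value
        if data[pos:pos + 1] != b"}":
            raise FormatError("missing closing brace")
        return result, pos + 1
    # Objects ("O"), references ("R", "r") and anything else end up here.
    raise FormatError(f"rejected type identifier {tag[:1]!r}")
```

      For example, loads(b'a:2:{s:4:"name";s:5:"mario";i:0;b:1;}')
      yields a dictionary with the keys "name" and 0, whereas a message
      starting with "O:" raises FormatError before any object is built.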

   Interoperability considerations:

      The specified format is meant to transport data in a
      machine-independent representation, and in particular is free of
      byte-ordering issues, because all integer types are passed in
      string representation. However, there is no guarantee that the
      described data representation can be used by every possible
      system, especially since it was originally made for use by
      interpreted languages only.
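
      To illustrate the byte-ordering point, the short Python sketch
      below contrasts two binary encodings of the same integer with its
      string form in this format. The value 258 is an arbitrary example.

```python
import struct

value = 258  # 0x00000102, chosen so the byte order is visible

big_endian = struct.pack(">i", value)     # b'\x00\x00\x01\x02'
little_endian = struct.pack("<i", value)  # b'\x02\x01\x00\x00'

# The serialized form is decimal text, identical on every machine.
serialized = b"i:%d;" % value             # b'i:258;'
decoded = int(serialized[2:-1])
```

      The two binary forms differ and need out-of-band agreement on
      byte order, while the text form decodes to the same number
      regardless of the sender's architecture.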

   Published specification:

      To date there is no openly published specification of this data
      representation format. For reference, the original implementation
      in [http://cvs.php.net/php-src/ext/standard/var.c#click-display]
      or, even better, the Perl implementation by Scott Hurring available
      at [http://hurring.com/code/perl/serialize/] can be reviewed.

      A terse description of how the basic data types are to be
      encoded is, however, included here.

      Format description:

         Data is compacted into a string representation by emitting
         chunks containing a type identifier (one letter), an optional
         length/count field (for strings and arrays) and an optional
         value (optional because the NULL/undef type does not need
         one), all separated by colons and ended with a semicolon in
         most cases.

            <app/vnd.php.serialized#TEXT> = (<data>) *1

            <data> = ( <undef> | <boolean> | <int> | <float>
                       | <string> | <hash> | <strangethings> )

            <undef> = 'N;'

            <boolean> = ( 'b:0;' | 'b:1;' )

            <int> = 'i:' ['-'] <digit>*1 ';'

            <float> = 'd:' ['-'] <digit>*1 '.' <digit>*1 ';'

            <string> = 's:' ( <LENGTH> ) <:"> <text> <";>

               <text> is the string that this entity represents. It
               has exactly <LENGTH> characters (octets) and is enclosed
               here by two quote marks. No escaping is applied, so the
               receiver does not have to unescape it.

            <hash> = 'a:' ( <COUNT> ) ':{' (<data> <data>)*COUNT '}'

               Where <COUNT> is a non-negative integer (as string)
               matching the number of key<data>+value<data> pairs in
               between the curly braces.
               As an exception, the string representation of an array or
               hash does not carry a final semicolon (unlike all basic
               data types).
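
         The grammar above can be exercised with a small encoder. The
         following Python function is a sketch, not a port of PHP's
         serialize(): the name dumps is hypothetical, strings are
         assumed to be encodable as ISO-8859-1, lists are mapped to
         hashes with keys 0..n-1, and floats that Python would print
         in a form outside the <float> rule (exponents, inf, nan) are
         refused rather than guessed at.

```python
def dumps(value) -> bytes:
    """Encode None, bool, int, float, str, list/tuple and dict."""
    if value is None:
        return b"N;"
    if isinstance(value, bool):          # must precede the int check
        return b"b:1;" if value else b"b:0;"
    if isinstance(value, int):
        return b"i:%d;" % value
    if isinstance(value, float):
        text = repr(value)
        plain = text.lstrip("-").replace(".", "", 1)
        if "." not in text or not plain.isdigit():
            raise ValueError("float not expressible in this grammar")
        return b"d:%s;" % text.encode("ascii")
    if isinstance(value, str):
        raw = value.encode("latin-1")
        # The length field counts octets; no escaping is applied.
        return b's:%d:"%s";' % (len(raw), raw)
    if isinstance(value, (list, tuple)):
        value = dict(enumerate(value))
    if isinstance(value, dict):
        parts = []
        for key, item in value.items():
            if isinstance(key, bool) or not isinstance(key, (int, str)):
                raise TypeError("hash keys must be int or str")
            parts.append(dumps(key) + dumps(item))
        # Hashes close with '}' and carry no final semicolon.
        return b"a:%d:{%s}" % (len(value), b"".join(parts))
    raise TypeError("unsupported type: " + type(value).__name__)
```

         With these assumptions, dumps({"name": "mario", 0: True})
         produces a:2:{s:4:"name";s:5:"mario";i:0;b:1;} and
         dumps([1.5, None]) produces a:2:{i:0;d:1.5;i:1;N;}.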

         Other known data type identifiers are "O" for objects, "R" for
         references and "r" for object references. They are not
         described further here, because they are of little use for
         exchanging data across servers, as outlined in the Security
         considerations section.

   Applications which use this media type:

      [http://www.php.net/]
      PHP interpreter with original implementation

      [http://hurring.com/code/perl/serialize/]
      Perl 'serialize.pm' by Scott Hurring and others

      [http://php.net/serialize#41948]
      serialize methods for Ecma/JavaScript types by Iván Montes

   Additional information:

      Magic numbers: NONE
      File extensions: NONE
      Macintosh File Type Codes: NONE
      Object Identifiers or OIDs: NONE

   Person & email address to contact for further information:

      Mario Salzer
      <mario at erphesfurt.de>

   Intended usage: COMMON

      This data format is in heavy use for temporarily storing data on
      disk to keep it beyond a single application run.
      Recently it has also started to be used for transporting data
      via HTTP, as a lightweight alternative to SOAP and XML-RPC.

      Because it includes the original types of the transcoded data,
      it has advantages over "multipart/form-data" and of course over
      the simpler "application/x-www-form-urlencoded". It is not only
      much simpler and faster, since less escaping is required, but
      also indirectly allows for validation (data types).

   Author/Change controller:

      Mario Salzer
      <mario at erphesfurt.de>

      Since this registration was filed on behalf of the vendor,
      it is only sensible to transfer 'change control' to
      <.*@php.net> on request.





