Media Type "text/csv": new draft (-02) and Last Call

Chris Lilley chris at w3.org
Wed Mar 23 12:08:09 CET 2005


On Wednesday, March 23, 2005, 7:14:24 AM, Yakov wrote:

YS> Clyde,

YS> Thanks for pointing this out. I personally think that instead of making
YS> the header record mandatory which is something that most CSV 
YS> applications do not have, I would rather take the comma out of the end
YS> of the record and have the last field end with a CRLF instead of an 
YS> optional COMMA. Do you think that is a plausible solution?

It's a much better solution, because it allows missing values in the last
field to be unambiguously signalled.

23,45,,67,91

clearly has a missing value in the third field, but if

23,45,63,67,91
23,45,63,67,91,

are equivalent, then a trailing comma can no longer signal a missing
value in the last field, and there is an ambiguity.
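
To make the ambiguity concrete, here is a rough sketch in Python of the
two readings. The function names are made up for illustration, the
records are assumed to have had their CRLF removed already, and quoted
fields are ignored entirely, so this is not a real CSV parser.

```python
def split_separator(record):
    """Reading 1: comma is purely a field separator.

    A trailing comma therefore means an empty (missing) last field.
    """
    return record.split(",")


def split_optional_trailing(record):
    """Reading 2: the last field may optionally be followed by a comma.

    One trailing comma, if present, is dropped before splitting.
    """
    if record.endswith(","):
        record = record[:-1]
    return record.split(",")


if __name__ == "__main__":
    for rec in ["23,45,63,67,91", "23,45,63,67,91,", "aaa,,ccc,ddd,,"]:
        print(repr(rec),
              len(split_separator(rec)),
              len(split_optional_trailing(rec)))
```

Under the first reading the two records above have different field
counts; under the second they have the same count, and a record that
really does have an empty last field cannot be told apart from one that
merely carries the optional comma.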

Is there a survey of how CSV files are commonly used? How many of them
would be conformant?

YS> Yakov

YS> clyde.ingram at edl.uk.eds.com wrote:
>> In section "2.Definition of the CSV format", items 3 & 4 state:
>> 
>> 3. There maybe an optional header line appearing as the first line of
>> the file with the same format as normal record lines. This header will
>> contain names corresponding to the fields in the file and will usually
>> contain the same number of fields as the records in the rest of the 
>> file. For example:
>> 
>> field_name,field_name,field_name CRLF
>> aaa,bbb,ccc CRLF
>> zzz,yyy,xxx CRLF
>> 
>> 4. Within the header and each record there may be one or more fields,
>> delimited by commas. The last field in the record may or may not be 
>> followed by a comma. For example:
>> 
>> aaa,bbb,ccc
>> 
>> Why would you permit the last field in the record to be followed by a
>> comma?
>> If a CSV record comprises:
>> 
>> aaa,,ccc,ddd,,CRLF
>> 
>> does it have 6 fields or 5?
>> If comma is a field separator only, there are 6 fields:
>> 
>> 1. aaa
>> 2. <null>
>> 3. ccc
>> 4. ddd
>> 5. <null>
>> 6. <null>
>> 
>> But if the comma is also a mandatory terminator for the last field 
>> (effectively the record separator becomes comma-CRLF), then there are 5
>> fields.
>> 
>> In my view, permitting the last field to end with comma leads to 
>> ambiguity, and prevents an application from checking that an exact 
>> number of fields is present.  The only way to guarantee the exact number
>> of fields is then to count the fields in the header.  But then your item
>> 3 allows the header record to be omitted.
>> 
>> Would it not be safer to make the header record mandatory?
>> 
>> Thank-you and regards,
>> Clyde Ingram
>> 




-- 
 Chris Lilley                    mailto:chris at w3.org
 Chair, W3C SVG Working Group
 W3C Graphics Activity Lead



