please review 'application/pdf'

Fri Oct 24 16:17:29 CEST 2003

On Friday, October 24, 2003, 1:14:24 PM, Marc wrote:

MM> On Friday 24 October 2003 00:13, Chris Lilley wrote:
MM> <snip>
>> I also see
>>
>>    o Accessing the document in ways not permitted by the document's
>>      access permissions is a violation of the document author's
>>      copyright.
>>
>> This strikes me as a useful statement and I am pleased by its
>> inclusion.
MM> <snip>

MM> I think I need to disagree here.

MM> I don't think it's appropriate for a technical document to make 
MM> assumptions on the intent of the author of a document, be it PDF or
MM> other.

I agree. But in this case, it's not an assumption. The intent is
specifically described in the PDF. However, the field might not
actually contain the © symbol and thus, in some jurisdictions, it
might not count as a valid assertion of copyright unless the media
type registration explicitly calls it out.

For a related example, see the 'Copyright' keyword in the PNG
specification.
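
As a rough illustration of how that PNG keyword is carried, here is a
minimal sketch that reads tEXt chunks and picks out 'Copyright'. The
function names, the make_chunk helper and the sample datastream are
hypothetical and exist only for this example; CRCs are not verified
and the compressed (zTXt) and international (iTXt) text chunks are
ignored.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_text_chunks(data: bytes) -> dict:
    """Collect {keyword: text} from the tEXt chunks of a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG datastream")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt data is: keyword, NUL separator, Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 8 + length + 4  # skip header, data and CRC
    return found


def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Hypothetical helper: serialize one chunk, used only to build sample data."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Hand-built datastream (1x1 greyscale header, no image data), for illustration only.
sample = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"tEXt", b"Copyright\x00\xa9 2003 Example Author")
    + make_chunk(b"IEND", b"")
)
notice = png_text_chunks(sample).get("Copyright")
```

Here the notice is an explicit, named field that software can find
without guessing at the author's intent.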

MM>  Or, for that matter, for a media type registration to mandate
MM> DRM.

It doesn't do that either. There is no access control or other
provision in the quoted sentence.

It just says that if the author has said in the PDF file that the
material is copyright, then the material is copyright. This strikes me
as sensible. And copyright works can be quoted, cited, and so forth.

MM> It might be that the document author just left the default values
MM> of whatever software she used to create the PDF and that software might
MM> default to restricting rights that the author may have freely granted
MM> otherwise.

I agree that this is a worrying scenario. But the fault there would be
with the software, same as if I wrote software that said all the
content was © Chris Lilley, as a default option.

Talking of freely granting - how does one express, for example, a GNU
copyleft or a Creative Commons license in a PDF file?

MM> It may be that the document is the result of the conversion
MM> of a freely available Web page to PDF format (e.g. print to PDF) and
MM> that the creator of the document, as opposed to the creator of the
MM> content.

Yes; but then, the capture software should transfer that copyright
information across (if it's machine readable, which it probably isn't).
If save-to-PDF software is adding 'copyright the person that saved it'
then it is wrong.

MM> OTOH, the Security Consideration section misses a remark that PDF files
MM> contain compressed content and that the result of decompression might
MM> be very much larger than the file appears, which enables DoS attacks on
MM> MUAs and Web Browsers if not taken into account.

Good point.
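
One way a consumer could guard against this is to cap the size of the
inflated output. The sketch below assumes the stream is plain zlib
data, as used by PDF's FlateDecode filter; the function name
inflate_bounded and the limit values are hypothetical, and a real
reader would also need to bound the total across all streams in a
file.

```python
import zlib


def inflate_bounded(compressed: bytes, limit: int) -> bytes:
    """Inflate zlib data, refusing to produce more than `limit` bytes."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    inflater = zlib.decompressobj()
    # Ask for at most limit + 1 bytes: one extra byte is enough to detect overflow
    # without ever materializing the full expanded stream.
    out = inflater.decompress(compressed, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed stream exceeds limit")
    return out


# Ten million zero bytes compress to a tiny payload.
bomb = zlib.compress(b"\x00" * 10_000_000)
small = inflate_bounded(zlib.compress(b"hello"), 100)
try:
    inflate_bounded(bomb, 1_000_000)
    rejected = False
except ValueError:
    rejected = True
```

The point of the cap is that the check happens during decompression,
so the reader never allocates the full expanded size first.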

MM> It also fails to mention if and to what extent metadata about the
MM> author or the author's system is present in the PDF file. Something
MM> along the lines of 

MM>         PDF documents include document metadata such as the name of
MM>         the author, etc. The PDF author may not have full control over
MM>         what metadata is to be included. Therefore, use of this
MM>         mimetype may lead to hidden leaking of possibly sensitive data.

Another good catch, though I would prefer to see that expressed as a
positive (software should give full control over ...) rather than
appearing to state that writing erroneous values is acceptable.
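
To see what such a file discloses, a user or tool could list the
plainly visible document information fields. The following is only a
rough sketch: it scans raw bytes for a few standard Info dictionary
keys followed by a simple literal string. The function name and the
sample bytes are hypothetical, and the scan will miss hex strings,
nested parentheses, compressed or encrypted objects, and XMP
metadata.

```python
import re

# A few of the standard document information dictionary keys.
INFO_KEYS = (b"Title", b"Author", b"Creator", b"Producer")

# Match "/Key (literal string)", allowing backslash escapes inside the string.
_FIELD = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\()])*)\)"
)


def visible_info_fields(pdf_bytes: bytes) -> dict:
    """Return the Info-style fields found as plain literal strings."""
    return {
        m.group(1).decode("ascii"): m.group(2).decode("latin-1")
        for m in _FIELD.finditer(pdf_bytes)
    }


# Hypothetical fragment of a PDF body, for illustration only.
sample_pdf = b"1 0 obj\n<< /Author (Jane Doe) /Producer (ExampleWriter 1.0) >>\nendobj\n"
fields = visible_info_fields(sample_pdf)
```

A listing like this is the kind of view that software giving the
author full control over metadata could present before saving.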

-- 
 Chris                            mailto:chris at w3.org