PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
All times are UTC


Posted: Tue May 26, 2020 8:28 pm
Hi,

There seems to be something of a PDF/A conundrum, but at first glance, shouldn't PDF/A files actually be easy to make? Isn't PDF/A just a subset of PDF?

Let's see:

https://en.wikipedia.org/wiki/PDF/A

"Audio and video content is forbidden."

Can't you just do a for-loop for the entire document and remove these?

"JavaScript and executable file launches are forbidden."

Same?
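
A minimal sketch of that loop, assuming a toy model where the document is a tree of nested Python dicts and lists with PDF names stored as strings. This is not PDFsharp's API: `strip_forbidden`, `is_forbidden`, `name_of`, and the two sets are made-up names, and the sets are an illustrative sample rather than the full PDF/A list. Only the `/S` and `/Subtype` keys and the action and annotation names are taken from the PDF specification.

```python
# Toy model: a PDF object tree as nested dicts/lists, names as "/Name" strings.
# NOT PDFsharp's API; every function and constant here is hypothetical.

FORBIDDEN_ACTIONS = {"/JavaScript", "/Launch", "/Sound", "/Movie", "/Rendition"}
FORBIDDEN_ANNOTS = {"/Sound", "/Movie", "/Screen"}


def name_of(obj, key):
    """Return obj[key] if it is a name string, else None."""
    value = obj.get(key)
    return value if isinstance(value, str) else None


def is_forbidden(obj):
    """True for action dicts (/S) or annotation dicts (/Subtype) in the sets above."""
    if not isinstance(obj, dict):
        return False
    return (name_of(obj, "/S") in FORBIDDEN_ACTIONS
            or name_of(obj, "/Subtype") in FORBIDDEN_ANNOTS)


def strip_forbidden(obj):
    """Return (cleaned copy of obj, number of forbidden dicts dropped)."""
    if isinstance(obj, dict):
        cleaned, removed = {}, 0
        for key, value in obj.items():
            if is_forbidden(value):
                removed += 1          # drop the whole entry
                continue
            new_value, n = strip_forbidden(value)
            cleaned[key] = new_value
            removed += n
        return cleaned, removed
    if isinstance(obj, list):
        cleaned, removed = [], 0
        for item in obj:
            if is_forbidden(item):
                removed += 1
                continue
            new_item, n = strip_forbidden(item)
            cleaned.append(new_item)
            removed += n
        return cleaned, removed
    return obj, 0                     # numbers, strings, etc. pass through


doc = {
    "/Type": "/Catalog",
    "/OpenAction": {"/S": "/JavaScript", "/JS": "app.alert('hi')"},
    "/Pages": {"/Kids": [{"/Type": "/Page", "/Annots": [
        {"/Subtype": "/Link", "/A": {"/S": "/Launch", "/F": "calc.exe"}},
        {"/Subtype": "/Movie"},
        {"/Subtype": "/Text"},
    ]}]},
}
cleaned, removed = strip_forbidden(doc)
print(removed, cleaned)
```

The sketch assumes a pure tree. A real file is a graph with indirect references and back-pointers such as /Parent, so an actual walker would also have to track visited objects and tidy up whatever pointed at the removed entries.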

"All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering."

I know PDFsharp has a function to embed fonts.

"Colorspaces specified in a device-independent manner."

OK, no idea about this.

"Encryption is forbidden."

Should be pretty easy to check whether it's on?
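
For the encryption check: per the PDF specification, an encrypted file carries an /Encrypt entry in its trailer dictionary. Here is a rough heuristic on the raw bytes in Python, assuming a classic `trailer << ... >>` section. `looks_encrypted` and the two sample byte strings are hypothetical, and files that use cross-reference streams (PDF 1.5 and later) keep these entries elsewhere, so this sketch can miss them.

```python
import re

# Matches a classic trailer dictionary up to the "startxref" keyword.
# Assumption: no cross-reference streams; those need a real parser.
TRAILER_RE = re.compile(rb"trailer\s*<<(.*?)>>\s*startxref", re.S)


def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any classic trailer dictionary has an /Encrypt key."""
    for match in TRAILER_RE.finditer(pdf_bytes):
        if re.search(rb"/Encrypt\b", match.group(1)):
            return True
    return False


# Tiny hand-written fragments, just enough structure for the heuristic.
plain_sample = (b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
                b"trailer\n<< /Size 2 /Root 1 0 R >>\nstartxref\n9\n%%EOF\n")
encrypted_sample = (b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
                    b"trailer\n<< /Size 3 /Root 1 0 R /Encrypt 2 0 R >>\n"
                    b"startxref\n9\n%%EOF\n")

print(looks_encrypted(plain_sample), looks_encrypted(encrypted_sample))
```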

"Use of standards-based metadata is required."

Doesn't seem fancy.

"External content references are forbidden."

Isn't that a for-loop again?

"LZW is forbidden due to intellectual property constraints. JPEG 2000 image compression models are not allowed in PDF/A-1 (based on PDF 1.4), as it was first introduced in PDF 1.5. JPEG 2000 compression is allowed in PDF/A-2 and PDF/A-3."

This is stupid, because the LZW patent has expired. Can't someone write to the standards committee about this? I am serious. It expired in 2003!

"Transparent objects and layers (Optional Content Groups) are forbidden in PDF/A-1, but are allowed in PDF/A-2."

Isn't that a for-loop as well? Search for these things and remove them for PDF/A-1.

"Provisions for digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard are supported in PDF/A-2."

Standard PDF has that too, right?

"Embedded files are forbidden in PDF/A-1, but PDF/A-2 allows embedding of PDF/A files, facilitating the archiving of sets of PDF/A documents in a single file. PDF/A-3 allows embedding of any file format such as XML, CAD and others into PDF/A documents."

Seems for-loopy again.

"The use of XML-based XML Forms Architecture (XFA) forms is forbidden in PDF/A. (XFA form data may be preserved in a PDF/A-2 file by moving from XFA key to the Names tree that itself is the value of the XFAResources key of the Names dictionary of the document catalog dictionary.)"

Whatever, remove that stuff with for-loops.

"Interactive PDF form fields must have an appearance dictionary associated with the field's data. The appearance dictionary shall be used when rendering the field."

This sounds so complicated, I doubt anyone heeds this.

Anyway, am I missing something, or aren't most of these just search-and-destroy operations for invalid stuff, done with a few while and for loops?!

You're the PDF experts here. What's your opinion on this? Can PDFsharp easily identify all these things and remove them from a document? It doesn't seem THAT hard on paper to implement this.


Posted: Tue May 26, 2020 9:15 pm
By the way, I am not asking the PDFsharp team to add a simple "CreatePDFA" function (would be neat, though...).

I am asking the team how difficult it would be to write a wrapper to do this. I am not a PDF expert, so:

Is it actually possible with PDFsharp to iterate through a PDF document, reliably identify AV files, JavaScript, and "external content references", and just remove all of that?
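
To show what "iterate and identify" could look like before removing anything, here is a report-only sketch in Python. It assumes a toy model in which PDF objects are plain dicts and lists that may point back at each other, the way /Parent does. `find_issues` and the reason strings are hypothetical, it is not PDFsharp's API, and it checks only three things: JavaScript actions, LZW-filtered streams, and file specifications with an /EF entry.

```python
# Toy model again: dicts/lists with "/Name" strings. NOT PDFsharp's API.

def find_issues(root):
    """Walk the object graph once and return sorted (path, reason) pairs."""
    issues = []
    seen = set()                 # ids of containers already visited (cycle guard)
    stack = [("", root)]
    while stack:
        path, obj = stack.pop()
        if isinstance(obj, dict):
            if id(obj) in seen:
                continue
            seen.add(id(obj))
            if obj.get("/S") == "/JavaScript":
                issues.append((path, "JavaScript action"))
            filters = obj.get("/Filter")
            if isinstance(filters, str):
                filters = [filters]          # /Filter may be one name or an array
            if isinstance(filters, list) and "/LZWDecode" in filters:
                issues.append((path, "LZW-compressed stream"))
            if "/EF" in obj:
                issues.append((path, "embedded file"))
            for key, value in obj.items():
                stack.append((path + key, value))
        elif isinstance(obj, list):
            if id(obj) in seen:
                continue
            seen.add(id(obj))
            for index, item in enumerate(obj):
                stack.append((f"{path}[{index}]", item))
    return sorted(issues)


# Small graph with a /Parent back-pointer to prove the walk terminates.
pages = {"/Type": "/Pages", "/Kids": []}
page = {"/Type": "/Page", "/Parent": pages,
        "/Contents": {"/Filter": ["/ASCII85Decode", "/LZWDecode"]}}
pages["/Kids"].append(page)
catalog = {
    "/Type": "/Catalog",
    "/Pages": pages,
    "/OpenAction": {"/S": "/JavaScript", "/JS": "app.alert(1)"},
    "/Names": {"/EmbeddedFiles": {"/Names": [
        "a.xml", {"/Type": "/Filespec", "/EF": {"/F": {}}}]}},
}
issues = find_issues(catalog)
for path, reason in issues:
    print(path, "->", reason)
```

Reporting first and removing second keeps the two concerns apart: the report shows where each problem sits, and deciding what to do about the things that referenced it is a separate step.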

