PDFsharp & MigraDoc Foundation
http://forum.pdfsharp.com/

Code improvement to PdfSharp.Pdf.PdfReferenceTable.Compact()
http://forum.pdfsharp.com/viewtopic.php?f=3&t=2827

Author:  philfordjour [ Thu May 29, 2014 9:50 pm ]
Post subject:  Code improvement to PdfSharp.Pdf.PdfReferenceTable.Compact()

Some PDF files contain errors in their internal reference table. This causes problems when such a document is modified and re-saved with the PDFsharp library. Before saving a file, PDFsharp removes all objects that cannot be reached from the trailer and builds a new reference table. If the same object ID turns up more than once while the new table is built, an exception is thrown. I suggest the code change below to mitigate this problem.

The error you will typically see is:
Quote:
An item with the same key has already been added to the dictionary.
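
For readers who have not run into this before, here is a minimal sketch of where that message comes from. It is not PDFsharp code: the class and method names are made up for illustration, and plain int keys stand in for PdfObjectID. It only shows that Dictionary.Add throws an ArgumentException as soon as a key is added a second time, which is what an unguarded rebuild loop runs into.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of PDFsharp.
static class DuplicateKeyDemo
{
    // Rebuilds a table with a plain Add for every entry.
    // int keys are a stand-in for PdfObjectID.
    public static Dictionary<int, string> RebuildWithAdd(
        IEnumerable<KeyValuePair<int, string>> entries)
    {
        var table = new Dictionary<int, string>();
        foreach (var entry in entries)
        {
            // Throws ArgumentException if entry.Key is already present.
            table.Add(entry.Key, entry.Value);
        }
        return table;
    }

    // Returns true if the rebuild fails because a key occurs twice.
    public static bool FailsOnDuplicate(
        IEnumerable<KeyValuePair<int, string>> entries)
    {
        try
        {
            RebuildWithAdd(entries);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Calling FailsOnDuplicate with two entries that share the key 1 returns true; with distinct keys it returns false.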


Code:
File: PdfSharp/PdfSharp.Pdf/PdfReferenceTable.cs
Function: PdfSharp.Pdf.PdfReferenceTable.Compact()

//OLD CODE:
internal int Compact()
{
   // TODO: remove PdfBooleanObject, PdfIntegerObject etc.
   int removed = this.objectTable.Count;
   //CheckConsistence();
   // TODO: Is this really so easy?
   PdfReference[] irefs = TransitiveClosure(this.document.trailer);

   #if DEBUG_
   foreach (PdfReference iref in this.objectTable.Values)
   {
      if (iref.Value == null)
        this.GetType();
      Debug.Assert(iref.Value != null);
   }

   foreach (PdfReference iref in irefs)
   {
      if (!this.objectTable.Contains(iref.ObjectID))
        this.GetType();
      Debug.Assert(this.objectTable.Contains(iref.ObjectID));

      if (iref.Value == null)
        this.GetType();
      Debug.Assert(iref.Value != null);
   }
   #endif

   this.maxObjectNumber = 0;
   this.objectTable.Clear();
   foreach (PdfReference iref in irefs)
   {
      this.objectTable.Add(iref.ObjectID, iref);
      this.maxObjectNumber = Math.Max(this.maxObjectNumber, iref.ObjectNumber);
   }
   //CheckConsistence();
   removed -= this.objectTable.Count;
   return removed;
}

//NEW CODE:
internal int Compact()
{
   // TODO: remove PdfBooleanObject, PdfIntegerObject etc.
   int removed = this.objectTable.Count;
   //CheckConsistence();
   // TODO: Is this really so easy?
   PdfReference[] irefs = TransitiveClosure(this.document.trailer);

   #if DEBUG_
   foreach (PdfReference iref in this.objectTable.Values)
   {
      if (iref.Value == null)
        this.GetType();
      Debug.Assert(iref.Value != null);
   }

   foreach (PdfReference iref in irefs)
   {
      if (!this.objectTable.Contains(iref.ObjectID))
        this.GetType();
      Debug.Assert(this.objectTable.Contains(iref.ObjectID));

      if (iref.Value == null)
        this.GetType();
      Debug.Assert(iref.Value != null);
   }
   #endif

   this.maxObjectNumber = 0;
   this.objectTable.Clear();
   foreach (PdfReference iref in irefs)
   {
      // Skip references whose ObjectID is already in the table (the first occurrence wins).
      if (!this.objectTable.ContainsKey(iref.ObjectID))
      {
         this.objectTable.Add(iref.ObjectID, iref);
         this.maxObjectNumber = Math.Max(this.maxObjectNumber, iref.ObjectNumber);
      }
   }
   //CheckConsistence();
   removed -= this.objectTable.Count;
   return removed;
}

Author:  MakroDoccer [ Wed May 31, 2017 10:06 am ]
Post subject:  Re: Code improvement to PdfSharp.Pdf.PdfReferenceTable.Compa

Are you sure that nothing important gets lost with this change?

Now only the first "iref" with a given ObjectID makes it into the table.

I just don't understand why the "first iref with this ObjectID" should be the "right one".
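
To make the question concrete, here is a small sketch of the two obvious policies for duplicates. It is not PDFsharp code: the class and method names are hypothetical, int keys stand in for PdfObjectID and strings stand in for PdfReference. KeepFirst uses the same ContainsKey guard as the proposed Compact(); KeepLast overwrites through the indexer instead. Which of the two (if either) is correct for a damaged PDF is exactly what the thread leaves open.

```csharp
using System.Collections.Generic;

// Hypothetical demo class, not part of PDFsharp.
static class DuplicatePolicyDemo
{
    // First occurrence wins: later entries with a known key are skipped.
    public static Dictionary<int, string> KeepFirst(
        IEnumerable<KeyValuePair<int, string>> entries)
    {
        var table = new Dictionary<int, string>();
        foreach (var entry in entries)
        {
            if (!table.ContainsKey(entry.Key))
                table.Add(entry.Key, entry.Value);
        }
        return table;
    }

    // Last occurrence wins: the indexer replaces an existing value.
    public static Dictionary<int, string> KeepLast(
        IEnumerable<KeyValuePair<int, string>> entries)
    {
        var table = new Dictionary<int, string>();
        foreach (var entry in entries)
        {
            table[entry.Key] = entry.Value;
        }
        return table;
    }
}
```

Both variants avoid the exception and produce a table of the same size; they differ only in which of the conflicting entries survives.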

All times are UTC