PDFsharp & MigraDoc Foundation
http://forum.pdfsharp.com/

How to loop through entire hierarchies and all dictionaries
http://forum.pdfsharp.com/viewtopic.php?f=2&t=4419

Author:  praveenke@gmail.com [ Mon Feb 20, 2023 3:06 am ]
Post subject:  How to loop through entire hierarchies and all dictionaries

Hi All,

I am trying to use the PDFsharp DLL to identify all images in PDF files so that I can compress them for storage and network transfer optimization.

I tried the available sample, but found that it does not return all images, especially those nested deep inside the object hierarchy.

This is the sample I tried. It loops through the pages nicely and gets images, but it does not descend into nested objects to bring out the inner dictionaries:
http://www.pdfsharp.net/wiki/ExportImages-sample.ashx

In Visual Studio, using QuickWatch, I am able to drill through the object hierarchy and find images deep inside.

I hope somebody has already tried this and succeeded.

Thanks in advance

Regards
Praveen

Author:  praveenke@gmail.com [ Tue Nov 07, 2023 6:40 am ]
Post subject:  Re: How to loop through entire hierarchies and all dictionar

Hi,

I was able to work it out myself. I used a recursive function to explore all nested dictionaries and got hold of all the images, which are either DCTDecode or FlateDecode. Here is the code, in case anybody is looking for such a solution.



// Required namespaces:
using System.IO;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Advanced;
using PdfSharp.Pdf.IO;

// Caller: scan each page's resources, then copy the page into a new document.
using (var stream = new MemoryStream(File.ReadAllBytes(PDFPath)))
using (var source = PdfReader.Open(stream, PdfDocumentOpenMode.Import))
using (var document = new PdfDocument())
{
    for (int i = 0; i < source.Pages.Count; i++)
    {
        PdfDictionary resources = source.Pages[i].Elements.GetDictionary("/Resources");
        if (resources != null)
        {
            EnumeratePDFDictionary(resources);
        }

        document.AddPage(source.Pages[i]);
    }

    string fileName = Path.GetFileNameWithoutExtension(PDFPath) + "-1" + Path.GetExtension(PDFPath);
    document.Save(Path.Combine(targetPath, fileName));
}

==================================================================
static int ImageCount = 0;

// Recursively walks the /XObject entries of a resource dictionary.
// Form XObjects carry their own /Resources, so images can be nested at any depth.
static void EnumeratePDFDictionary(PdfDictionary resources)
{
    if (resources == null)
        return;

    PdfDictionary xObjects = resources.Elements.GetDictionary("/XObject");
    if (xObjects == null)
        return;

    foreach (PdfItem item in xObjects.Elements.Values)
    {
        // Skip entries that are not references to dictionaries instead of
        // failing on an invalid cast or a null dereference.
        PdfReference reference = item as PdfReference;
        PdfDictionary xObject = reference?.Value as PdfDictionary;
        if (xObject == null)
            continue;

        string subtype = xObject.Elements.GetString("/Subtype");
        if (subtype == "/Image")
        {
            ImageCount++;
            // Get the image stream here for your operations, either to extract or to compress it.
        }
        else if (subtype == "/Form")
        {
            // Descend into the form's own resources; a null result is handled by the guard above.
            EnumeratePDFDictionary(xObject.Elements.GetDictionary("/Resources"));
        }
    }
}
=============================================================
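
For anyone wondering what could go inside the /Image branch, here is a minimal sketch for the DCTDecode case only. It follows the approach of the ExportImages sample linked above, where the stream of a JPEG image is simply written to a file. TryExportJpeg, ImageExportSketch, outputDirectory and the "Image{n}.jpg" naming are hypothetical names made up for this illustration, not part of PDFsharp. It assumes /Filter is a single name; images with a filter array, and FlateDecode images (which need the bitmap to be rebuilt from /Width, /Height and /BitsPerComponent), are skipped.

```csharp
using System.IO;
using PdfSharp.Pdf;

static class ImageExportSketch
{
    // Hypothetical helper (not a PDFsharp API): writes a DCTDecode image
    // XObject to disk and returns true, or returns false if it was skipped.
    public static bool TryExportJpeg(PdfDictionary image, string outputDirectory, int index)
    {
        if (image == null || image.Stream == null)
            return false;

        // Assumption: /Filter is stored directly as a single name.
        // A filter array or a missing filter yields null here and is skipped.
        PdfName filter = image.Elements["/Filter"] as PdfName;
        if (filter == null || filter.Value != "/DCTDecode")
            return false;

        // For DCTDecode, the stream bytes are the JPEG data itself,
        // so they can be written out unchanged.
        byte[] jpegBytes = image.Stream.Value;
        string path = Path.Combine(outputDirectory, "Image" + index + ".jpg");
        File.WriteAllBytes(path, jpegBytes);
        return true;
    }
}
```

In the /Image branch this could be called as ImageExportSketch.TryExportJpeg(xObject, targetPath, ImageCount). Returning a bool rather than throwing keeps the recursive walk going when it meets an image type the sketch does not handle.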

All times are UTC