PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


Posted: Mon Feb 20, 2023 3:06 am
Hi All,

I am trying to use the PDFsharp DLL to identify all images in PDF files so that I can compress them to optimize storage and network transfer.

I tried the available sample, but found that it does not return all images, especially those that sit deep inside the object hierarchy.

This is the sample I tried. It loops through the pages nicely and gets the images, but it is not able to descend into the nesting and bring out the inner dictionaries.
http://www.pdfsharp.net/wiki/ExportImages-sample.ashx

In Visual Studio, using quick view, I am able to drill through the object hierarchies and find the images deep inside.

I hope somebody has already tried this and succeeded.

Thanks in advance

Regards
Praveen


Posted: Tue Nov 07, 2023 6:40 am
Hi,

I was able to work it out myself. I used a recursive function to explore all the nested dictionaries and got hold of all the images, which are either DCTDecode or FlateDecode. Here is the code, in case anybody is looking for such a solution:



// Requires: using System.IO; using PdfSharp.Pdf; using PdfSharp.Pdf.IO;
using (var stream = new MemoryStream(File.ReadAllBytes(PDFPath)))
using (var source = PdfReader.Open(stream, PdfDocumentOpenMode.Import))
using (var document = new PdfDocument())
{
    for (int i = 0; i < source.Pages.Count; i++)
    {
        // Walk the page's resources, including any nested Form XObjects.
        PdfDictionary resources = source.Pages[i].Elements.GetDictionary("/Resources");
        if (resources != null)
        {
            EnumeratePDFDictionary(resources);
        }

        document.AddPage(source.Pages[i]);
    }

    // Save next to the target path as "<name>-1<ext>".
    document.Save(Path.Combine(targetPath, Path.GetFileNameWithoutExtension(PDFPath) + "-1" + Path.GetExtension(PDFPath)));
}

==================================================================
static int ImageCount = 0;

static void EnumeratePDFDictionary(PdfDictionary resources)
{
    if (resources == null)
        return;

    PdfDictionary xObjects = resources.Elements.GetDictionary("/XObject");
    if (xObjects == null)
        return;

    foreach (PdfItem item in xObjects.Elements.Values)
    {
        // Entries are normally indirect references, but guard against direct objects.
        PdfReference reference = item as PdfReference;
        PdfDictionary xObject = (reference != null ? reference.Value : item) as PdfDictionary;
        if (xObject == null)
            continue;

        string subtype = xObject.Elements.GetString("/Subtype");
        if (subtype == "/Image")
        {
            ImageCount++;
            // Get the image stream here for your operations, either to extract or to compress.
        }
        else if (subtype == "/Form")
        {
            // Form XObjects carry their own /Resources, which may hold more images.
            EnumeratePDFDictionary(xObject.Elements.GetDictionary("/Resources"));
        }
    }
}
=============================================================
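
For the placeholder comment in the /Image branch, here is a minimal sketch of what handling the stream might look like, modeled on the ExportImages sample linked in the first post. The class name ImageStreamSketch, the helpers ExtensionForFilter and HandleImage, and the outputDir parameter are hypothetical names made up for illustration; Elements.GetName, Elements.GetInteger and Stream.Value are the PDFsharp calls that sample uses. The sketch assumes /Filter is a single name (an array of filters would need extra handling). It only writes out DCTDecode streams, which are already complete JPEG data, and just reports FlateDecode ones, because those hold deflated raw pixel data that has to be decoded before it can be saved or recompressed.

```csharp
using System;
using System.IO;
using PdfSharp.Pdf;

static class ImageStreamSketch
{
    // Hypothetical helper: maps a PDF filter name to a file extension,
    // or returns null when the stream cannot be written out as-is.
    public static string ExtensionForFilter(string filter)
    {
        switch (filter)
        {
            case "/DCTDecode":
                return ".jpg";   // the stream is complete JPEG data
            default:
                return null;     // e.g. /FlateDecode: raw pixel data, needs decoding first
        }
    }

    // Hypothetical helper, meant to be called from the /Image branch.
    public static void HandleImage(PdfDictionary image, int index, string outputDir)
    {
        if (image == null || image.Stream == null)
            return;

        // Assumption: /Filter is a single name, not an array of filters.
        string filter = image.Elements.GetName("/Filter");
        int width = image.Elements.GetInteger("/Width");
        int height = image.Elements.GetInteger("/Height");
        byte[] data = image.Stream.Value;   // still encoded with the filter above

        string extension = ExtensionForFilter(filter);
        if (extension == null)
        {
            // Not exported in this sketch; just report what was found.
            Console.WriteLine("Image {0}: {1}x{2}, filter {3}, {4} bytes (not exported)",
                index, width, height, filter, data.Length);
            return;
        }

        File.WriteAllBytes(Path.Combine(outputDir, "Image" + index + extension), data);
    }
}
```

From the /Image branch this could be called as ImageStreamSketch.HandleImage(xObject, ImageCount, outputDir), with outputDir being any writable folder of your choice.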

