PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


Posted: Thu Jun 22, 2023 7:02 am
Hello,

I'm trying to write a small Windows Forms app that reduces the size of PDF files, which may or may not contain pictures.
I think the best way to do that is to use PDFsharp together with ImageFactory to shrink the images. (Setting Options.FlateEncodeMode = PdfFlateEncodeMode.BestCompression alone does not reduce the file size enough.)

So far I can reduce some PDF files (created by a writer program, for example), but not many of the PDF files created with a scanner. For those I get an error.

Code:
    // Open the original PDF file.
    PdfSharp.Pdf.PdfDocument document = PdfSharp.Pdf.IO.PdfReader.Open("testFile.pdf", PdfDocumentOpenMode.Import);

    // Create the new, empty PDF file.
    PdfSharp.Pdf.PdfDocument nouveau_doc = new PdfSharp.Pdf.PdfDocument();
    foreach (PdfPage page in document.Pages)
    {
        PdfDictionary resources = page.Elements.GetDictionary("/Resources");
        if (resources != null)
        {
            PdfDictionary xObjects = resources.Elements.GetDictionary("/XObject");
            if (xObjects != null)
            {
                ICollection<PdfItem> items = xObjects.Elements.Values;
                foreach (PdfItem item in items)
                {
                    if (item is PdfReference reference)
                    {
                        // If an image is found.
                        if (reference.Value is PdfDictionary xObject && xObject.Elements.GetString("/Subtype") == "/Image")
                        {
                            byte[] stream = xObject.Stream.Value;
                            int width = xObject.Elements.GetInteger(PdfImage.Keys.Width);
                            int height = xObject.Elements.GetInteger(PdfImage.Keys.Height);

                            using (MemoryStream inStream = new MemoryStream(stream))
                            {
                                using (MemoryStream outStream = new MemoryStream())
                                {
                                    using (ImageFactory imageFactory = new ImageFactory())
                                    {
                                        imageFactory.Load(inStream).Format(new JpegFormat { Quality = 30 }).Resize(new System.Drawing.Size(width, height)).Resolution(96, 96).Save(outStream);
                                    }

                                    xObject.Stream.Value = outStream.ToArray();
                                }
                            }
                        }
                    }
                }
            }
        }
        // Add the content to a new page.
        nouveau_doc.AddPage(page);
    } // End of the foreach over the pages of the PDF file.

    // Finally, save the new document.
    nouveau_doc.Save("pdf_compressed.PDF");


This line
Code:
 imageFactory.Load(inStream).Format(new JpegFormat { Quality = 30 }).Resize(new System.Drawing.Size(width, height)).Resolution(96, 96).Save(outStream);

throws this error:
Quote:
ImageProcessor.Common.Exceptions.ImageFormatException : 'Input stream is not a supported format.'


It appears the stream has no recognizable format. Any ideas?

Thanks a lot for your help :)

Falcon


Posted: Thu Jun 22, 2023 9:16 am
Falcon974 wrote:
It appears the stream has no recognizable format. Any ideas?
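PDF files may contain embedded JPEG files (filter /DCTDecode). Those files can additionally be "zipped" (filter /FlateDecode), in which case the stream has to be inflated before it is a readable JPEG. It seems your code does not account for that.
Images can also be stored in PDF's own lossless format (pixel data, typically Flate-compressed) instead of JPEG format. Such a stream is not an image file that ImageFactory can load. It seems your code does not account for that either.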
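
As a rough illustration of those two points, here is a minimal sketch in plain .NET that decides whether an image stream can be treated as a JPEG file. The class JpegStreamHelper and its methods are hypothetical names, not part of PDFsharp or ImageProcessor. It assumes the caller has already read the names from the image's /Filter entry into a string array, and that the Flate data uses the usual zlib wrapper without a preset dictionary.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper; not part of PDFsharp or ImageProcessor.
public static class JpegStreamHelper
{
    // filters: names from the image's /Filter entry, in order
    //          (for example "/FlateDecode", "/DCTDecode").
    // data:    the stream bytes as stored in the PDF (still filtered).
    // Returns the JPEG file bytes, or null if the image is not stored as JPEG.
    public static byte[] TryGetJpegBytes(string[] filters, byte[] data)
    {
        if (filters == null || filters.Length == 0 || data == null)
            return null;

        // Plain embedded JPEG: the stream already is the JPEG file.
        if (filters.Length == 1 && filters[0] == "/DCTDecode")
            return data;

        // "Zipped" JPEG: undo the Flate layer first.
        if (filters.Length == 2 && filters[0] == "/FlateDecode" && filters[1] == "/DCTDecode")
            return Inflate(data);

        // Anything else (for example /FlateDecode alone) is not a JPEG file.
        return null;
    }

    // Inflates zlib-wrapped data: 2-byte header, deflate body, Adler-32 trailer.
    public static byte[] Inflate(byte[] zlibData)
    {
        // The low nibble of the first byte must be 8 (deflate).
        if (zlibData == null || zlibData.Length < 2 || (zlibData[0] & 0x0F) != 8)
            throw new InvalidDataException("Not a zlib stream.");

        // DeflateStream expects raw deflate data, so skip the 2-byte zlib header.
        using (var input = new MemoryStream(zlibData, 2, zlibData.Length - 2))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Only when TryGetJpegBytes returns a non-null array would the bytes be handed to an image library. For every other image the stream holds filtered pixel samples, which would have to be decoded using /Width, /Height, /BitsPerComponent and /ColorSpace before they can be re-encoded.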

_________________
Best regards
Thomas
(Freelance Software Developer with several years of MigraDoc/PDFsharp experience)


Posted: Thu Jun 22, 2023 1:25 pm
Thanks for your answer, TH-Soft.

Do you have a piece of code that would help me with your two points?

Sincerely,


Posted: Thu Jun 22, 2023 1:45 pm
viewtopic.php?f=2&t=4003&hilit=%2Fdctdecode
viewtopic.php?f=2&t=1944&hilit=%2Fdctdecode
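
The search term highlighted in both links, /DCTDecode, is the PDF filter name for JPEG data. The linked threads are not reproduced here. As a quick diagnostic for the ImageFormatException, it may help to look at the first bytes of a stream before passing it to an image library. The sketch below is plain .NET. StreamSniffer is a hypothetical name, not a PDFsharp or ImageProcessor type, and the check is only a heuristic.

```csharp
using System;

// Hypothetical diagnostic helper; not part of PDFsharp or ImageProcessor.
public static class StreamSniffer
{
    // Guesses what an image XObject stream contains from its first two bytes.
    public static string Describe(byte[] data)
    {
        if (data == null || data.Length < 2)
            return "empty or too short";

        // JPEG files start with the SOI marker FF D8.
        if (data[0] == 0xFF && data[1] == 0xD8)
            return "JPEG";

        // A zlib stream (as used by /FlateDecode) has compression method 8
        // in the low nibble of byte 0, and the first two bytes, read as a
        // big-endian number, are a multiple of 31.
        if ((data[0] & 0x0F) == 8 && ((data[0] << 8) | data[1]) % 31 == 0)
            return "zlib (FlateDecode)";

        return "unknown (possibly raw samples or another filter)";
    }
}
```

Logging StreamSniffer.Describe(xObject.Stream.Value) for each image would show which streams are JPEG files and which are not. Because other data can happen to start with the same bytes, the image's /Filter entry remains the authoritative source.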

_________________
Best regards
Thomas
(Freelance Software Developer with several years of MigraDoc/PDFsharp experience)

