PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
All times are UTC


Posted: Tue Nov 07, 2023 6:25 am
Hi,

I am trying to compress PDF files using PDFsharp 1.5.

I can find both DCTDecode and FlateDecode streams. As I understand it, these are JPEG and PNG streams respectively. I can compress the DCTDecode images, but I cannot compress the FlateDecode streams.

I tried two approaches to compress the FlateDecode streams: 1) recompress the data with FlateDecode in C# code; 2) replace FlateDecode with DCTDecode.
With approach 1, the compressed PDF is generated, but the PNG image is distorted.
With approach 2, the PDF is corrupted.

Interestingly, I can successfully extract both types of images from PDFs.

Both functions are pasted below. It would be great if an expert could point out the mistake I am making here.

==========================================================
Approach 1 Compression
==========================================================
private static void CompressPngImage(ref PdfDictionary image, ref int count)
{
    int width = image.Elements.GetInteger(PdfImage.Keys.Width);
    int height = image.Elements.GetInteger(PdfImage.Keys.Height);

    var canUnfilter = image.Stream.TryUnfilter();
    byte[] decodedBytes;

    if (canUnfilter)
    {
        decodedBytes = image.Stream.Value;
    }
    else
    {
        PdfSharp.Pdf.Filters.FlateDecode flate = new PdfSharp.Pdf.Filters.FlateDecode();
        decodedBytes = flate.Decode(image.Stream.Value);
    }

    int bitsPerComponent = 0;
    while (decodedBytes.Length - ((width * height) * bitsPerComponent / 8) != 0)
    {
        bitsPerComponent++;
    }

    System.Drawing.Imaging.PixelFormat pixelFormat;
    switch (bitsPerComponent)
    {
        case 1:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format1bppIndexed;
            break;
        case 8:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format8bppIndexed;
            break;
        case 16:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format16bppArgb1555;
            break;
        case 24:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format24bppRgb;
            break;
        case 32:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format32bppArgb;
            break;
        case 64:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format64bppArgb;
            break;
        default:
            throw new Exception("Unknown pixel format " + bitsPerComponent);
    }

    decodedBytes = decodedBytes.Reverse().ToArray();

    Bitmap bmp = new Bitmap(width, height, pixelFormat);
    BitmapData bmpData = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), ImageLockMode.WriteOnly, bmp.PixelFormat);
    int length = (int)Math.Ceiling(width * (bitsPerComponent / 8.0));
    for (int i = 0; i < height; i++)
    {
        int offset = i * length;
        int scanOffset = i * bmpData.Stride;
        Marshal.Copy(decodedBytes, offset, new IntPtr(bmpData.Scan0.ToInt64() + scanOffset), length);
    }
    bmp.UnlockBits(bmpData);
    bmp.RotateFlip(RotateFlipType.Rotate180FlipNone);
    var TempPNGStream1 = new MemoryStream();

    var TempPNGStream = new MemoryStream();

    bmp.Save(TempPNGStream, ImageFormat.Png);

    byte[] ImageBytes = ImageMagic.QualityEncode(TempPNGStream, ImageFormat.Png).ToArray();

    ImageBytes = ImageMagic.QualityEncode(TempPNGStream, ImageFormat.Png).ToArray();
    TempPNGStream.Write(ImageBytes, 0, ImageBytes.Length);

    Bitmap bmp1 = new Bitmap(TempPNGStream);

    BitmapData bmpData1 = bmp1.LockBits(new Rectangle(0, 0, width, height), ImageLockMode.WriteOnly, bmp1.PixelFormat);

    byte[] data = new byte[Math.Abs(bmpData.Stride * bmpData.Height)];//applied bmpData instead of bmpData1
    Marshal.Copy(bmpData1.Scan0, data, 0, data.Length);

    decodedBytes = data;//.Reverse().ToArray();

    PdfSharp.Pdf.Filters.FlateDecode flate1 = new PdfSharp.Pdf.Filters.FlateDecode();

    image.Stream.Value = decodedBytes;// flate1.Decode(decodedBytes);

    TempPNGStream1.Write(decodedBytes, 0, decodedBytes.Length);

    bmp.Save(TempPNGStream1, ImageFormat.Jpeg);
    bmp.Save("D:\\CustomCompression\\Uploaded through API\\RE9fQURfMTU2Nl8yNDRfMjAwMQ==/" + String.Format("Image{0}.png", count++), System.Drawing.Imaging.ImageFormat.Png);

}

===========================================================================
Approach 2 Replace FlateDecode with DCTDecode
===========================================================================
private static void ReplacePngImageDictionaryWithJpg(ref PdfDictionary image, ref int count)
{
    int width = image.Elements.GetInteger(PdfImage.Keys.Width);
    int height = image.Elements.GetInteger(PdfImage.Keys.Height);

    var canUnfilter = image.Stream.TryUnfilter();
    byte[] decodedBytes;

    if (canUnfilter)
    {
        decodedBytes = image.Stream.Value;
    }
    else
    {
        PdfSharp.Pdf.Filters.FlateDecode flate = new PdfSharp.Pdf.Filters.FlateDecode();
        decodedBytes = flate.Decode(image.Stream.Value);
    }

    int bitsPerComponent = 0;
    while (decodedBytes.Length - ((width * height) * bitsPerComponent / 8) != 0)
    {
        bitsPerComponent++;
    }

    System.Drawing.Imaging.PixelFormat pixelFormat;
    switch (bitsPerComponent)
    {
        case 1:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format1bppIndexed;
            break;
        case 8:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format8bppIndexed;
            break;
        case 16:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format16bppArgb1555;
            break;
        case 24:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format24bppRgb;
            break;
        case 32:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format32bppArgb;
            break;
        case 64:
            pixelFormat = System.Drawing.Imaging.PixelFormat.Format64bppArgb;
            break;
        default:
            throw new Exception("Unknown pixel format " + bitsPerComponent);
    }

    decodedBytes = decodedBytes.Reverse().ToArray();

    Bitmap bmp = new Bitmap(width, height, pixelFormat);
    BitmapData bmpData = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), ImageLockMode.WriteOnly, bmp.PixelFormat);
    int length = (int)Math.Ceiling(width * (bitsPerComponent / 8.0));
    for (int i = 0; i < height; i++)
    {
        int offset = i * length;
        int scanOffset = i * bmpData.Stride;
        Marshal.Copy(decodedBytes, offset, new IntPtr(bmpData.Scan0.ToInt64() + scanOffset), length);
    }
    bmp.UnlockBits(bmpData);
    bmp.RotateFlip(RotateFlipType.Rotate180FlipNone);
    var TempPNGStream1 = new MemoryStream();

    var TempPNGStream = new MemoryStream();

    bmp.Save("D:\\CustomCompression\\Uploaded through API\\RE9fQURfMTU2Nl8yNDRfMjAwMQ==/" + String.Format("Image{0}.png", count++), System.Drawing.Imaging.ImageFormat.Png);

    bmp.Save("D:\\CustomCompression\\Uploaded through API\\RE9fQURfMTU2Nl8yNDRfMjAwMQ==/" + String.Format("Image{0}.jpg", count++), System.Drawing.Imaging.ImageFormat.Jpeg);

    bmp.Save(TempPNGStream, ImageFormat.Jpeg);

    string CurrentColorSpace = image.Elements.GetName("/ColorSpace");

    image.Stream.Value = TempPNGStream.ToArray().ToArray();
    image.Elements.SetValue("/Height", new PdfInteger(bmp.Height));
    image.Elements.SetValue("/Width", new PdfInteger(bmp.Width));
    image.Elements.SetValue("/Length", new PdfInteger(image.Stream.Value.Length));
    image.Elements.SetValue("/ColorSpace", new PdfString(CurrentColorSpace));
    image.Elements.SetValue("/Filter", new PdfString("/DCTDecode"));
    image.Elements.SetValue("/Type", new PdfString("/XObject"));
    image.Elements.SetValue("/Name", new PdfString("/Im0")); //new PdfString(String.Format("Image{0}.jpg", count++)));
    image.Elements.Remove("/Interpolate");
    image.Elements.Remove("/DecodeParams");

}

Thanks
Praveen


Posted: Tue Nov 07, 2023 7:20 am
Hi!

You don't waste much time writing comments in your source code.
PDF files do not contain PNG images, they contain PDF images or JPEG images.
I'm afraid you cannot simply use the PNG encoder to get PDF images.
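
To illustrate the difference: once inflated, a FlateDecode image stream is only the raw sample rows, not a PNG file with its own header and chunks. The following is a minimal sketch of how the expected size of that raw data follows from the image dictionary. It assumes that no /Predictor is set in /DecodeParms, and `PdfImageLayout` is a hypothetical helper name, not part of PDFsharp. The number of components is 1 for DeviceGray, 3 for DeviceRGB and 4 for DeviceCMYK.

```csharp
using System;

// Hypothetical helper (not part of PDFsharp): size of the decoded sample data
// of a PDF image XObject, assuming no predictor is applied to the stream.
public static class PdfImageLayout
{
    // Each row of samples starts on a byte boundary, so a row is padded
    // up to a whole number of bytes and no further.
    public static int RowBytes(int width, int components, int bitsPerComponent)
    {
        return (width * components * bitsPerComponent + 7) / 8;
    }

    // Total number of bytes expected after the stream has been inflated.
    public static int DecodedLength(int width, int height, int components, int bitsPerComponent)
    {
        return RowBytes(width, components, bitsPerComponent) * height;
    }
}
```

For a 100 x 50 DeviceRGB image with 8 bits per component, `RowBytes` gives 300 and `DecodedLength` gives 15000. If the inflated stream has a different length, that may indicate a predictor or a color space other than the one assumed.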

With JPEG images, make sure the properties of the image are set correctly.
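
As a rough sketch of what "set correctly" can mean here: /Filter, /ColorSpace, /Type and /Subtype are PDF names, so they should be written as names rather than as strings, and the stream key for filter parameters is spelled /DecodeParms. The helper below is hypothetical (`JpegImageDictionary` is not a PDFsharp class) and assumes the new data is an 8-bit RGB JPEG and that the image does not depend on an /SMask or /Mask tied to the old samples.

```csharp
using PdfSharp.Pdf;

// Hypothetical helper: swaps the stream of an existing image XObject for JPEG data.
public static class JpegImageDictionary
{
    // Assumes jpegBytes holds a complete 8-bit RGB JPEG of the given size.
    public static void ReplaceWithJpeg(PdfDictionary image, byte[] jpegBytes, int width, int height)
    {
        image.Stream.Value = jpegBytes;                      // JPEG bytes go in unchanged
        image.Elements.SetName("/Type", "/XObject");
        image.Elements.SetName("/Subtype", "/Image");
        image.Elements.SetInteger("/Width", width);
        image.Elements.SetInteger("/Height", height);
        image.Elements.SetInteger("/BitsPerComponent", 8);
        image.Elements.SetName("/ColorSpace", "/DeviceRGB"); // a name, not a string
        image.Elements.SetName("/Filter", "/DCTDecode");     // a name, not a string
        image.Elements.SetInteger("/Length", jpegBytes.Length);
        image.Elements.Remove("/DecodeParms");               // Flate parameters no longer apply
    }
}
```

A grayscale or CMYK JPEG would need /DeviceGray or /DeviceCMYK instead, and an image that originally used an Indexed or ICCBased color space needs more care than this sketch takes.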

_________________
Best regards
Thomas
(Freelance Software Developer with several years of MigraDoc/PDFsharp experience)


Posted: Wed Nov 15, 2023 1:37 am
TH-Soft wrote:
Hi!

You don't waste much time writing comments in your source code.
PDF files do not contain PNG images, they contain PDF images or JPEG images.
I'm afraid you cannot simply use the PNG encoder to get PDF images.

With JPEG images, make sure the properties of the image are set correctly.



Thanks for the response.
By PNG, I mean FlateDecode images; maybe your approach is the correct one. JPEG images are compressed perfectly and the PDF size is reduced. But when I try to compress and replace FlateDecode images (which I can extract and save locally as images), the images in the PDF get skewed or the PDF itself gets corrupted.

Thank you
Praveen


Posted: Wed Nov 15, 2023 8:43 am
praveenke@gmail.com wrote:
But when I try to compress and replace FlateDecode images (which I can extract and save locally as images), the images in the PDF get skewed or the PDF itself gets corrupted.
Image rows in Windows BMP files are DWORD-aligned, but image rows in PDF format are BYTE-aligned. This may account for the skewing.

When adding images, you must reverse the extraction process exactly. Looks as if there are issues with your code adding the new images in PDF format.
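
The alignment difference can be sketched as follows. This is a minimal illustration with hypothetical helper names (`RowPacking` is not part of PDFsharp or System.Drawing); it only removes the per-row padding and does nothing else to the pixel data.

```csharp
using System;

// Hypothetical helper: converts DWORD-aligned bitmap rows to BYTE-aligned PDF rows.
public static class RowPacking
{
    // Bytes per row in a Windows bitmap: rounded up to a multiple of 4.
    public static int DwordAlignedStride(int width, int bitsPerPixel)
    {
        return ((width * bitsPerPixel + 31) / 32) * 4;
    }

    // Bytes per row in a PDF image: rounded up to a whole byte only.
    public static int ByteAlignedStride(int width, int bitsPerPixel)
    {
        return (width * bitsPerPixel + 7) / 8;
    }

    // Copies each padded row into a tightly packed buffer, dropping the padding bytes.
    public static byte[] RemovePadding(byte[] padded, int width, int height, int bitsPerPixel)
    {
        int srcStride = DwordAlignedStride(width, bitsPerPixel);
        int dstStride = ByteAlignedStride(width, bitsPerPixel);
        var packed = new byte[dstStride * height];
        for (int row = 0; row < height; row++)
        {
            Buffer.BlockCopy(padded, row * srcStride, packed, row * dstStride, dstStride);
        }
        return packed;
    }
}
```

For a 24-bit image that is 3 pixels wide, the bitmap stride is 12 bytes while the PDF row is 9 bytes, so copying the bitmap buffer in one piece shifts every following row by 3 bytes. Channel order is a separate concern that this sketch does not handle: GDI+ stores 24-bit pixels as blue, green, red, while DeviceRGB expects red, green, blue.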

_________________
Best regards
Thomas
(Freelance Software Developer with several years of MigraDoc/PDFsharp experience)


Posted: Fri Nov 17, 2023 6:25 am
TH-Soft wrote:
praveenke@gmail.com wrote:
But when I try to compress and replace FlateDecode images (which I can extract and save locally as images), the images in the PDF get skewed or the PDF itself gets corrupted.
Image rows in Windows BMP files are DWORD-aligned, but image rows in PDF format are BYTE-aligned. This may account for the skewing.

When adding images, you must reverse the extraction process exactly. Looks as if there are issues with your code adding the new images in PDF format.


Thanks for the response. Let me check it and update.

