PDFsharp & MigraDoc Foundation
http://forum.pdfsharp.com/

Is it possible to extract barcode coordinates with pdfsharp?
http://forum.pdfsharp.com/viewtopic.php?f=2&t=3812

Author:  daxisheart [ Thu Jul 19, 2018 11:22 am ]
Post subject:  Is it possible to extract barcode coordinates with pdfsharp?

Title.

I can get the width/height of the image. However, it seems like image positions rely on many things, including irefs in newer versions and various transformations.

Based on http://forum.pdfsharp.de/viewtopic.php?f=2&t=3768, there may be a way to extract the position from the content stream itself. However, this seems like it would be difficult to implement, not to mention possible usage rights violations.

Author:  Thomas Hoevel [ Thu Jul 19, 2018 11:52 am ]
Post subject:  Re: Is it possible to extract barcode coordinates with pdfsh

Hi!

As you write, you can parse the page descriptions to extract any information you want.

This task can be somewhat simple if all PDF files come from the same tool, but it can be very complex if you want to support PDF files from any tool.
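
To illustrate what "parsing the page description" involves, here is a minimal Python sketch (not PDFsharp code). It assumes the content stream has already been decompressed to text and that it contains only whitespace-separated tokens, with no inline images, string operands, or nested form XObjects. It tracks the current transformation matrix (CTM) through the q, Q and cm operators and records the bounding box of the unit square each time a Do operator paints an XObject. All function names are hypothetical, and Do can also invoke a form XObject, so a real tool would have to check the XObject's subtype in the page resources.

```python
IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)

def mat_mul(m, n):
    """Multiply two PDF matrices [a b c d e f] (row-vector convention)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )

def transform(ctm, x, y):
    """Map a point through the matrix: x' = a*x + c*y + e, y' = b*x + d*y + f."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)

def is_operand(token):
    """Treat names (/Im0) and numbers as operands; everything else as an operator."""
    if token.startswith("/"):
        return True
    try:
        float(token)
        return True
    except ValueError:
        return False

def find_xobject_boxes(content):
    """Return (name, (xmin, ymin, xmax, ymax)) for every Do operator."""
    ctm = IDENTITY
    saved = []      # graphics state stack for q / Q
    operands = []   # operands collected since the last operator
    boxes = []
    for token in content.split():
        if is_operand(token):
            operands.append(token)
            continue
        if token == "q":
            saved.append(ctm)
        elif token == "Q" and saved:
            ctm = saved.pop()
        elif token == "cm" and len(operands) >= 6:
            m = tuple(float(v) for v in operands[-6:])
            ctm = mat_mul(m, ctm)  # the new matrix is applied before the old CTM
        elif token == "Do" and operands:
            # An image XObject is painted into the unit square of user space.
            corners = [transform(ctm, x, y)
                       for x, y in ((0, 0), (1, 0), (0, 1), (1, 1))]
            xs = [p[0] for p in corners]
            ys = [p[1] for p in corners]
            boxes.append((operands[-1], (min(xs), min(ys), max(xs), max(ys))))
        operands = []
    return boxes

sample = "q 100 0 0 50 72 600 cm /Im0 Do Q"
print(find_xobject_boxes(sample))
```

The resulting boxes are in PDF user space (points, origin at the lower left of the page). Files from other tools may nest transformations, wrap images in form XObjects, or use inline images, which is where the complexity comes from.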

Author:  daxisheart [ Thu Jul 19, 2018 12:04 pm ]
Post subject:  Re: Is it possible to extract barcode coordinates with pdfsh

Is it possible to elaborate a bit?

Using pdfxplorer and following the PDF reference, I see that the crop/bleed/media/art boxes define various boundaries, but even within them there may be text or white space. In most of my sample PDFs, each of these four-entry boxes has 0 for the first two entries, and the last two entries are almost always the same across the different boxes. I'm not sure how this would translate into coordinates for a specific barcode image.

Additionally, not all PDFs have an XObject in the page resources to even denote that the image is there.

For now I want to be able to locate barcode/images on *some* type of pdf before trying it on all pdfs.

Author:  Thomas Hoevel [ Thu Jul 19, 2018 1:35 pm ]
Post subject:  Re: Is it possible to extract barcode coordinates with pdfsh

Look at the output of this sample:
http://www.pdfsharp.net/wiki/XForms-sample.ashx

The image "test.gif" is included once in the PDF, but it is drawn multiple times at multiple positions, different sizes and different angles.
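
Each of those placements comes from a different transformation matrix being in effect when the image is painted. As a rough Python sketch (the function name is hypothetical, and it assumes the matrix contains only scaling, rotation and translation, with no skew or mirroring), the position, size and angle of one placement can be read from the six matrix entries [a b c d e f]:

```python
import math

def describe_placement(ctm):
    """Decompose a scale/rotate/translate matrix into origin, size and angle."""
    a, b, c, d, e, f = ctm
    return {
        "origin": (e, f),                 # where the image's lower-left corner lands
        "width": math.hypot(a, b),        # length of the transformed x unit vector
        "height": math.hypot(c, d),       # length of the transformed y unit vector
        "angle_deg": math.degrees(math.atan2(b, a)),  # counterclockwise rotation
    }

# A 100 x 50 image rotated by 90 degrees and anchored at (300, 400).
print(describe_placement((0, 100, -50, 0, 300, 400)))
```

The same image resource can therefore yield many different coordinates on one page, one per Do operator.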

There can be a lot of nesting inside a PDF file. The various boxes describe the visible part etc. of the page.
The Adobe Reference is not easy to read, but it would be my starting point if I had to tackle that task.
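
On the question of how the boxes relate to image coordinates: a box is [llx lly urx ury] in user space, so it only gives the frame against which a position can be expressed. A small Python sketch, assuming an image bounding box has already been obtained in user space and ignoring the page's /Rotate and UserUnit entries (the function name is hypothetical), converts such a box to top-left-origin coordinates relative to a page box:

```python
def to_top_left(bbox, page_box):
    """Convert (xmin, ymin, xmax, ymax) in PDF user space (y grows upward)
    to (left, top, width, height) measured from the page box's top-left corner."""
    x0, y0, x1, y1 = bbox
    llx, lly, urx, ury = page_box
    left = x0 - llx
    top = ury - y1          # flip the y axis: distance down from the top edge
    return (left, top, x1 - x0, y1 - y0)

# US Letter MediaBox and an image occupying (72, 600) to (172, 650).
print(to_top_left((72, 600, 172, 650), (0, 0, 612, 792)))
```

With boxes of the form [0 0 w h], as in the samples described above, the conversion only flips the y axis.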

All times are UTC