PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


Posted: Mon Jul 02, 2007 5:00 pm
The project I'm working on needs to go through a directory of PDFs and report information about every image in them. The PDFs will all be created from scanned images, so the scope is fairly narrow.

I have some proof-of-concept code that works in the limited testing I've done so far (it is far from production code). However, I've only just started working with PDFsharp, and this doesn't seem very elegant or robust: it assumes one image per page, stored under the XObject name /Im1. Is there a better way that I'm overlooking?

Code:
Dim pDoc As PdfSharp.Pdf.PdfDocument
Dim pDict As PdfSharp.Pdf.PdfDictionary
Dim pRef As PdfSharp.Pdf.Advanced.PdfReference
Dim iPage As Integer = 0
Dim iWidth As String = ""
Dim iHeight As String = ""
Dim iColor As String = ""
Dim iBits As String = ""
Dim iFilter As String = ""
Dim tmp As String = ""

' Open the PDF in read-only mode
pDoc = PdfSharp.Pdf.IO.PdfReader.Open("C:\Test Files\Color PDFs\8902-01-0003.pdf", PdfSharp.Pdf.IO.PdfDocumentOpenMode.ReadOnly)

' Loop through each page, find the image, and report on it
For iPage = 0 To pDoc.Pages.Count - 1
   iWidth = ""
   iHeight = ""
   iColor = ""
   iBits = ""
   iFilter = ""

   ' Does this page have a Resources Element?
   If pDoc.Pages(iPage).Elements.Contains("/Resources") Then
      pDict = pDoc.Pages(iPage).Elements("/Resources")

      ' Does the Resources Element contain an XObject?
      If pDict.Elements.Contains("/XObject") Then
         pDict = pDict.Elements("/XObject")

         ' Does the XObject contain an Im1 image element?
         If pDict.Elements.Contains("/Im1") Then
            pRef = pDict.Elements("/Im1")

            ' Get the dictionary by the reference under Im1
            pDict = pDoc.Internals.GetObject(pRef.ObjectID)

            ' Get image details
            If pDict.Elements.Contains("/Width") Then iWidth = pDict.Elements("/Width").ToString
            If pDict.Elements.Contains("/Height") Then iHeight = pDict.Elements("/Height").ToString
            If pDict.Elements.Contains("/ColorSpace") Then
               iColor = pDict.Elements("/ColorSpace").ToString
               If iColor.Substring(0, 1) = "/" Then iColor = iColor.Substring(1)
            End If
            If pDict.Elements.Contains("/BitsPerComponent") Then
               iBits = pDict.Elements("/BitsPerComponent").ToString
            End If
            If pDict.Elements.Contains("/Filter") Then
               iFilter = pDict.Elements("/Filter").ToString
               If iFilter.Substring(0, 1) = "/" Then iFilter = iFilter.Substring(1)
            End If
         End If

         ' {0} Delim
         ' {1} Filename
         ' {2} Page Number
         ' {3} Page Width (inch)
         ' {4} Page Height (inch)
         ' {5} Page Orientation
         ' {6} Image Dimensions (pixels)
         ' {7} Bits Per Component
         ' {8} Colorspace
         ' {9} Decode Filter
         tmp = String.Format("{1}{0}{2}{0}{3}{0}{4}{0}{5}{0}{6}{0}{7}{0}{8}{0}{9}", _
                                       "|", _
                                       "filename.ext goes here", _
                                       iPage + 1, _
                                       String.Format("{0:0.##}", PdfSharp.Drawing.XUnit.FromPoint(pDoc.Pages(iPage).Width).Inch), _
                                       String.Format("{0:0.##}", PdfSharp.Drawing.XUnit.FromPoint(pDoc.Pages(iPage).Height).Inch), _
                                       pDoc.Pages(iPage).Orientation.ToString, _
                                       iWidth & "x" & iHeight, _
                                       iBits, _
                                       iColor, _
                                       iFilter)

         Console.WriteLine(tmp)
      End If
   End If

Next

pDoc.Close()
pDoc = Nothing


The output looks like this:
Code:
filename.ext goes here|1|11.04|8.49|Portrait|3311x2544|1|DeviceGray|CCITTFaxDecode
filename.ext goes here|2|8.49|11.06|Portrait|2544x3315|1|DeviceGray|CCITTFaxDecode
...
filename.ext goes here|39|8.49|11|Portrait|1696x2198|8|DeviceRGB|DCTDecode
filename.ext goes here|40|8.49|11|Portrait|1696x2198|8|DeviceRGB|DCTDecode
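
One direction that might make this less brittle, sketched here untested: instead of hard-coding /Im1, walk every entry under /XObject and keep the ones whose /Subtype is /Image. This assumes the Elements.GetDictionary, GetString and GetInteger helpers and PdfReference.Value behave the way I read them in the PDFsharp API. The module name, the Describe helper and the command-line argument are made up for the illustration.

```vbnet
Imports System
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.Advanced
Imports PdfSharp.Pdf.IO

Module ImageReportSketch

    ' Hypothetical helper: render a PDF item as text without the leading "/"
    Function Describe(ByVal item As PdfItem) As String
        If item Is Nothing Then Return ""
        Return item.ToString().TrimStart("/"c)
    End Function

    Sub Main(ByVal args As String())
        ' Assumption: the PDF path is passed as the first command-line argument
        Dim path As String = args(0)
        Dim doc As PdfDocument = PdfReader.Open(path, PdfDocumentOpenMode.ReadOnly)

        For pageIndex As Integer = 0 To doc.Pages.Count - 1
            ' GetDictionary returns Nothing when the key is missing
            Dim resources As PdfDictionary = doc.Pages(pageIndex).Elements.GetDictionary("/Resources")
            If resources Is Nothing Then Continue For

            Dim xObjects As PdfDictionary = resources.Elements.GetDictionary("/XObject")
            If xObjects Is Nothing Then Continue For

            ' Visit every XObject entry, whatever it is named
            For Each item As PdfItem In xObjects.Elements.Values
                Dim reference As PdfReference = TryCast(item, PdfReference)
                If reference Is Nothing Then Continue For

                Dim image As PdfDictionary = TryCast(reference.Value, PdfDictionary)
                If image Is Nothing Then Continue For

                ' Skip form XObjects and anything else that is not an image
                If image.Elements.GetString("/Subtype") <> "/Image" Then Continue For

                Dim fields As String() = { _
                    path, _
                    (pageIndex + 1).ToString(), _
                    image.Elements.GetInteger("/Width").ToString() & "x" & _
                        image.Elements.GetInteger("/Height").ToString(), _
                    image.Elements.GetInteger("/BitsPerComponent").ToString(), _
                    Describe(image.Elements("/ColorSpace")), _
                    Describe(image.Elements("/Filter"))}

                Console.WriteLine(String.Join("|", fields))
            Next
        Next
    End Sub

End Module
```

This prints one line per image rather than one per page, so a page with several scanned strips would show up several times, and the page size and orientation columns from my original output are left out to keep the sketch short. /ColorSpace and /Filter can also be arrays rather than plain names, in which case the text from Describe is only a rough description.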

