PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


Posted: Wed Aug 20, 2014 6:53 pm
Joined: Wed Aug 13, 2014 3:06 pm
Posts: 3
I've been working on a text extractor for PDFs using the PDFsharp library. First and foremost, I'd like to thank everyone who has worked on this library. It's been a ton of help, and I would have given up on this project a long time ago without it.

Things are coming along quite well, and for the most part I've finished this task. However, any content that uses fonts requiring a CMap doesn't extract correctly (understandably, as the bytes are character codes that have to be mapped to Unicode values). Are there any PDFsharp classes that can help with this? I can always go into the ToUnicode stream and parse it myself, but I don't believe in reinventing the wheel, so I figured I'd ask. I've noticed PdfSharp.Fonts.CMapInfo but am unsure of its usage.
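
For anyone who does end up parsing the ToUnicode stream by hand, here is a minimal sketch of the idea in Python. It is not a PDFsharp API; the function names `parse_to_unicode` and `_hex_to_text` are made up for illustration. It assumes the stream has already been decompressed (for example FlateDecode) and decoded to text, and that the destination values are UTF-16BE as the PDF specification describes. It only handles the `bfchar` and `bfrange` sections and ignores `codespacerange`, so the caller still has to know how many bytes make up one character code.

```python
import re

HEX = r"<([0-9A-Fa-f]+)>"


def _hex_to_text(hex_digits):
    # Destination values are UTF-16BE code units; there may be several
    # per character code (for example a ligature such as "fi").
    return bytes.fromhex(hex_digits).decode("utf-16-be")


def parse_to_unicode(cmap_text):
    """Return {character code: Unicode string} for a decoded ToUnicode CMap."""
    mapping = {}

    # beginbfchar ... endbfchar holds pairs: <src> <dst>
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(HEX + r"\s*" + HEX, block):
            mapping[int(src, 16)] = _hex_to_text(dst)

    # beginbfrange ... endbfrange holds either
    #   <lo> <hi> <dst>             (dst is incremented across the range)
    #   <lo> <hi> [<dst1> <dst2>]   (one explicit dst per code)
    entry = HEX + r"\s*" + HEX + r"\s*(?:" + HEX + r"|\[(.*?)\])"
    for block in re.findall(r"beginbfrange(.*?)endbfrange", cmap_text, re.S):
        for lo, hi, dst, arr in re.findall(entry, block, re.S):
            codes = range(int(lo, 16), int(hi, 16) + 1)
            if dst:
                raw = bytes.fromhex(dst)
                prefix, last = raw[:-2], int.from_bytes(raw[-2:], "big")
                for offset, code in enumerate(codes):
                    unit = (last + offset).to_bytes(2, "big")
                    mapping[code] = (prefix + unit).decode("utf-16-be")
            else:
                for code, item in zip(codes, re.findall(HEX, arr)):
                    mapping[code] = _hex_to_text(item)
    return mapping


# Hand-written sample CMap body, not taken from a real PDF.
SAMPLE = """
1 begincodespacerange <0000> <FFFF> endcodespacerange
2 beginbfchar
<0003> <0020>
<0024> <00660069>
endbfchar
2 beginbfrange
<0041> <0043> <0061>
<0050> <0051> [<0078> <0079>]
endbfrange
"""

table = parse_to_unicode(SAMPLE)
print("".join(table[code] for code in (0x41, 0x42, 0x43)))
```

With the resulting table, extracted text strings can be split into character codes of the font's code width and looked up one by one. Malformed hex values or ranges that overflow 16 bits are not handled in this sketch.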


Posted: Sun Jan 27, 2019 7:46 pm
Joined: Sun Jan 27, 2019 7:41 pm
Posts: 1
Maybe something has changed since the original post. Does PDFsharp have any features to parse the /ToUnicode stream and get a character map from it?

