PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
Posted: Tue May 26, 2009 9:41 am
I use this code:
Code:
  // Open() never returns for the PDF linked below.
  PdfDocument doc = PdfReader.Open(pathFichero, PdfDocumentOpenMode.InformationOnly);
  doc.Dispose();

With this PDF:
http://web.usal.es/~joluin/investigacio ... ichero.pdf

The file never loads :(

PDFsharp Explorer fails too, with an infinite loop.

I hope you can fix it.
Regards !


Posted: Tue Dec 01, 2009 9:45 am
I finally solved this by changing the following code in PdfSharp\Pdf\IO\Lexer.cs, in the function ScanHexadecimalString():
Code:
      ScanNextChar();
      while (true)
      {
        MoveToNonWhiteSpace();
        if (this.currChar == '>')
        {
          ScanNextChar();
          break;
        }
        if (char.IsLetterOrDigit(this.currChar))
        {
          hex[0] = char.ToUpper(this.currChar);
          hex[1] = char.ToUpper(this.nextChar);
          int ch = int.Parse(new string(hex), NumberStyles.AllowHexSpecifier);
          this.token.Append(Convert.ToChar(ch));
          ScanNextChar();
          ScanNextChar();
        }
      }


To this:
Code:
      ScanNextChar();
      MoveToNonWhiteSpace();
      while (true)
      {
        int startPos = this.Position;
        if (this.currChar == '>')
        {
          ScanNextChar();
          break;
        }
        if (char.IsLetterOrDigit(this.currChar))
        {
          hex[0] = char.ToUpper(this.currChar);
          hex[1] = char.ToUpper(this.nextChar);
          int ch = int.Parse(new string(hex), NumberStyles.AllowHexSpecifier);
          this.token.Append(Convert.ToChar(ch));
          ScanNextChar();
          ScanNextChar();
        }
        MoveToNonWhiteSpace();
        // If the position did not advance, stop to avoid an infinite loop
        if (this.Position == startPos)
            break;
      }
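
For anyone who wants to see the idea outside the library, here is a minimal stand-alone sketch of a hex string decoder that cannot loop forever, because the index advances on every iteration. It is not PDFsharp's Lexer: the names HexStringSketch, Decode and HexValue are made up for illustration, and skipping unexpected characters instead of reporting them is an assumption on my part. It also treats an odd final digit as if it were followed by 0, which is what the PDF Reference prescribes for hex strings.

```csharp
using System;
using System.Text;

// Hypothetical stand-alone sketch; this is NOT PDFsharp's Lexer.
public static class HexStringSketch
{
    // Decodes the body of a PDF hex string, starting just after '<'.
    // On return, 'pos' is just past the closing '>' or at the end of input.
    public static string Decode(string input, ref int pos)
    {
        var token = new StringBuilder();
        int high = -1; // pending high nibble, or -1 if none

        // The index advances on every iteration, so the loop always ends.
        for (; pos < input.Length; pos++)
        {
            char c = input[pos];
            if (c == '>')
            {
                pos++; // consume the terminator
                break;
            }
            int digit = HexValue(c);
            if (digit < 0)
                continue; // white space or an unexpected character: skip it (assumption)
            if (high < 0)
            {
                high = digit;
            }
            else
            {
                token.Append((char)(high * 16 + digit));
                high = -1;
            }
        }
        // An odd final digit is treated as if it were followed by 0.
        if (high >= 0)
            token.Append((char)(high * 16));
        return token.ToString();
    }

    // Returns 0..15 for a hex digit, or -1 for anything else.
    private static int HexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

Unlike char.IsLetterOrDigit, the HexValue check rejects letters such as 'G', so int.Parse-style failures on non-hex letters do not come up in this sketch.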


I also found another problem: a PDF that uses character 160 (Chars.NonBreakableSpace) as white space, so I added a case for it.

Before:
Code:
          case Chars.NUL:
          case Chars.HT:
          case Chars.LF:
          case Chars.FF:
          case Chars.CR:
          case Chars.SP:
            ScanNextChar();
            break;


After:
Code:
          case Chars.NUL:
          case Chars.HT:
          case Chars.LF:
          case Chars.FF:
          case Chars.CR:
          case Chars.SP:
          case Chars.NonBreakableSpace:
            ScanNextChar();
            break;

I don't know if this is correct.
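
For comparison, the syntax chapter of the PDF Reference lists only six white-space characters: NUL, HT, LF, FF, CR and SP. Character 160 is not one of them, so accepting it is a tolerance for malformed files rather than something the syntax rules require. A minimal sketch of keeping the two cases apart, assuming a hypothetical helper (WhiteSpaceSketch, IsWhiteSpace and the lenient flag are not part of PDFsharp):

```csharp
// Hypothetical helper; not part of PDFsharp.
public static class WhiteSpaceSketch
{
    // Returns true for the six white-space characters of PDF syntax.
    // Character 160 (non-breaking space) counts only when 'lenient' is set.
    public static bool IsWhiteSpace(char c, bool lenient)
    {
        switch (c)
        {
            case '\0':     // NUL
            case '\t':     // HT
            case '\n':     // LF
            case '\f':     // FF
            case '\r':     // CR
            case ' ':      // SP
                return true;
            case '\u00A0': // character 160, tolerated only in lenient mode
                return lenient;
            default:
                return false;
        }
    }
}
```

With a flag like this, a strict reader can still reject such files while a tolerant one reads them.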

Regards!


Posted: Tue Dec 01, 2009 10:28 am
PDF Reference 1.6:
Quote:
The space character is also encoded as 312 in MacRomanEncoding and as 240 in
WinAnsiEncoding. This duplicate code signifies a nonbreaking space; it is
typographically the same as space.

The numbers are in octal notation:
240 (octal) = 160 (decimal)
312 (octal) = 202 (decimal)
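
A quick way to double-check the conversion is Convert.ToInt32 with base 8. The wrapper name OctalCheck is only for illustration:

```csharp
using System;

// Hypothetical wrapper around the standard Convert.ToInt32(string, int) overload.
public static class OctalCheck
{
    // Parses a string of octal digits into its integer value.
    public static int FromOctal(string octal)
    {
        return Convert.ToInt32(octal, 8);
    }
}
```

OctalCheck.FromOctal("240") gives 160 (2*64 + 4*8 + 0) and OctalCheck.FromOctal("312") gives 202 (3*64 + 1*8 + 2).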

