What does this java.io.IOException: Error: Expected a long type, actual='930[299' tell me?

Question 1

This is caused by PDFBox not following the PDF Reference to the letter :)

Tokens in a PDF token stream may be delimited by white space (as usual for most programming languages), but also implicitly, when the next character is a delimiter of its own because it introduces a special construct. Therefore, it's totally valid -- and certainly not unusual -- to encounter constructions such as

/A[123/B(C)]

which is entirely equivalent to the slightly longer

/A [ 123 /B (C) ]

From ISO "PDF 32000-1:2008", 7.2.2 Character Set:

The PDF character set is divided into three classes, called regular, delimiter, and white-space characters. This classification determines the grouping of characters into tokens. The rules defined in this sub-clause apply to all characters in the file except within strings, streams, and comments.

The White-space characters shown [...]

The delimiter characters (, ), <, >, [, ], {, }, /, and % are special [..]
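To make the token grouping concrete, here is a minimal sketch of a tokenizer that follows the white-space and delimiter classes quoted above. The class and method names (PdfTokenSketch, tokenize) are made up for this illustration and are not part of PDFBox. It is deliberately simplified: literal strings are read without nesting or escapes, << and >> come out as two single-character tokens each, and comments are not handled.

```java
import java.util.ArrayList;
import java.util.List;

final class PdfTokenSketch {
    // White-space characters from Table 1 of ISO 32000-1, 7.2.2.
    static boolean isWhiteSpace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    // Delimiter characters from Table 2 of ISO 32000-1, 7.2.2.
    static boolean isDelimiter(int c) {
        return "()<>[]{}/%".indexOf(c) >= 0;
    }

    static boolean isRegular(int c) {
        return !isWhiteSpace(c) && !isDelimiter(c);
    }

    // Splits a fragment into tokens; hypothetical helper, simplified on purpose.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (isWhiteSpace(c)) {
                i++; // white space only separates tokens
            } else if (c == '/') {
                // a name: the solidus plus the regular characters that follow it
                int start = i++;
                while (i < input.length() && isRegular(input.charAt(i))) {
                    i++;
                }
                tokens.add(input.substring(start, i));
            } else if (c == '(') {
                // a literal string (simplification: no nesting, no escapes)
                int end = input.indexOf(')', i);
                if (end < 0) {
                    end = input.length() - 1;
                }
                tokens.add(input.substring(i, end + 1));
                i = end + 1;
            } else if (isDelimiter(c)) {
                tokens.add(String.valueOf(c)); // any other delimiter stands alone
                i++;
            } else {
                // a run of regular characters, e.g. a number or a keyword
                int start = i;
                while (i < input.length() && isRegular(input.charAt(i))) {
                    i++;
                }
                tokens.add(input.substring(start, i));
            }
        }
        return tokens;
    }
}
```

With this sketch, tokenize("/A[123/B(C)]") and tokenize("/A [ 123 /B (C) ]") return the same six tokens, and the fragment from the error message, "930[299", splits into 930, [ and 299.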

The code below shows the current implementation (taken from http://svn.apache.org/viewvc/pdfbox/branches/1.8/pdfbox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java):

/**
 * This method is used to read a token by the {@linkplain #readInt()} method and the {@linkplain #readLong()} method.
 *
 * @return the token to parse as integer or long by the calling method.
 * @throws IOException throws by the {@link #pdfSource} methods.
 */
protected final StringBuilder readStringNumber() throws IOException
{
    int lastByte = 0;
    StringBuilder buffer = new StringBuilder();
    while( (lastByte = pdfSource.read() ) != 32 &&
            lastByte != 10 &&
            lastByte != 13 &&
            lastByte != 60 && //see sourceforge bug 1714707
            lastByte != 0 && //See sourceforge bug 853328
            lastByte != -1 )
    {
        buffer.append( (char)lastByte );
    }
    if( lastByte != -1 )
    {
        pdfSource.unread( lastByte );
    }
    return buffer;
}

The 'next character' is tested against some of the white-space characters from Table 1 in 7.2.2: "Space" (32), "Line Feed" (10), "Carriage Return" (13), and the Nul character (0). The "Form Feed" code 0x0C and, very oddly, the common "Tab" 0x09 are still missing. The code does test, however, for an end-of-file (the -1) and for < (60), the latter probably because someone ran into a similar bug before. (I could not locate the original bug report #1714707, but I infer it must have been similar to your issue.)

This list must be completed by adding the following characters, copied verbatim from Table 2 in 7.2.2:

Table 2 – Delimiter characters
Glyph   Decimal   Hexadecimal   Octal   Name
  (       40          28          50    LEFT PARENTHESIS
  )       41          29          51    RIGHT PARENTHESIS [1]
  <       60          3C          60    LESS-THAN SIGN
  >       62          3E          62    GREATER-THAN SIGN
  [       91          5B         133    LEFT SQUARE BRACKET
  ]       93          5D         135    RIGHT SQUARE BRACKET
  {      123          7B         173    LEFT CURLY BRACKET
  }      125          7D         175    RIGHT CURLY BRACKET
  /       47          2F          57    SOLIDUS
  %       37          25          45    PERCENT SIGN
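As a rough illustration of what the completed test could look like, here is a standalone sketch of the same loop that stops at every white-space character from Table 1 and every delimiter from Table 2. It is not the actual PDFBox fix: the class NumberTokenSketch and the helper parseLeadingLong are invented for this example, and it reads from a plain java.io.PushbackInputStream rather than PDFBox's pdfSource.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class NumberTokenSketch {
    // White-space characters from Table 1 (Nul, Tab, LF, FF, CR, Space).
    static boolean isWhiteSpace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    // Delimiter characters from Table 2.
    static boolean isDelimiter(int c) {
        return "()<>[]{}/%".indexOf(c) >= 0;
    }

    // Reads bytes up to, but not including, the next white space, delimiter, or end of input.
    static StringBuilder readStringNumber(PushbackInputStream source) throws IOException {
        StringBuilder buffer = new StringBuilder();
        int lastByte;
        while ((lastByte = source.read()) != -1
                && !isWhiteSpace(lastByte)
                && !isDelimiter(lastByte)) {
            buffer.append((char) lastByte);
        }
        if (lastByte != -1) {
            source.unread(lastByte); // leave the terminating byte for the next token
        }
        return buffer;
    }

    // Hypothetical convenience wrapper so the sketch can be tried on a string.
    static long parseLeadingLong(String fragment) {
        byte[] bytes = fragment.getBytes(StandardCharsets.ISO_8859_1);
        try (PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(bytes))) {
            return Long.parseLong(readStringNumber(in).toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the fragment from the error message, parseLeadingLong("930[299") yields 930, because reading now stops at the [ instead of swallowing it into the number token.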

The odd ones out are { and } since, currently, they only appear inside PostScript snippets, and those are not base objects but contained inside a stream. But perhaps they were historically "reserved for future expansion" (which should no longer be an issue, now that the PDF format has been frozen as an ISO specification).

Also, the character % is a delimiter in itself, but it needs some special handling as well, since it introduces a comment:

The comment consists of all characters after the PERCENT SIGN and up to but not including the end of the line [...] (7.2.3 Comments)

(Note there is a little ambiguity there:

A conforming reader shall ignore comments, and treat them as single white-space characters. That is, a comment separates the token preceding it from the one following it.

which should not be necessary, because the previous line already says the comment ends before the end-of-line; and so the end-of-line itself ought to remain in the input stream and thus act as a separator. Perhaps nothing more than a case of a belt-and-suspenders approach.)
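A small sketch of the comment rule, under loud assumptions: the helper name stripComments is invented here, and the code does not check whether a % sits inside a string or a stream, where the rule does not apply. It replaces each comment with a single space and leaves the end-of-line marker in place.

```java
final class CommentSketch {
    // Replaces every comment with one space; the end-of-line itself is kept.
    // Simplification: a '%' inside a string or stream is NOT recognized as such.
    static String stripComments(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '%') {
                // skip up to, but not including, the end of the line
                while (i < input.length()
                        && input.charAt(i) != '\n'
                        && input.charAt(i) != '\r') {
                    i++;
                }
                out.append(' '); // the comment counts as a single white-space character
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

For example, stripComments("123%note\n456") gives "123 \n456", so the two numbers stay separate tokens either way.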

[1] On reviewing: actually, the closing parenthesis is redundant. It can only occur after a matching opening parenthesis, and that introduces a string. Viewed one token at a time, you should never encounter a stray ) -- if you do, that indicates a malformed PDF stream.

Question 2

The readLong method reads a long from the underlying stream. As the PDFBox API documentation states, the method throws an IOException generated by the PushBackInputStream used as input source (pdfSource).

In your case the log is pretty self-explanatory: there is a square bracket '[' in your stream, which makes the conversion to a long impossible.

You have two options:

Check your input and your parser logic (or perform a sanity check before using PDDocument.load).
Narrow the scope of your try/catch block to line 60 of your class to handle the specific IOException and react accordingly (if possible in your method logic).
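The second option could look roughly like the sketch below. To keep it runnable without PDFBox on the classpath, the loading call is passed in through a made-up Loader interface; NarrowCatchSketch and tryLoad are hypothetical names as well. The point is only that the try block wraps the single call that can throw, so one broken file does not abort the rest of your processing.

```java
import java.io.File;
import java.io.IOException;
import java.util.Optional;

final class NarrowCatchSketch {
    // Stand-in for the loading call (hypothetical interface, not a PDFBox type).
    interface Loader<T> {
        T load(File file) throws IOException;
    }

    // Wraps only the load call; a failing file is reported and skipped.
    static <T> Optional<T> tryLoad(File file, Loader<T> loader) {
        try {
            return Optional.of(loader.load(file));
        } catch (IOException e) {
            System.err.println("Skipping " + file + ": " + e.getMessage());
            return Optional.empty();
        }
    }
}
```

Assuming PDFBox 1.8, where PDDocument.load(File) is available, the call site would be something like NarrowCatchSketch.tryLoad(file, PDDocument::load), followed by a check of isPresent() before you go on to extract text.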

About the freeze issues

Are you sure the code is not stuck in one of your:

while(mX.find()) 
{ 
  ... 
}

blocks? I find the design pretty error-prone, especially for X = 1 and 2. I have no time to go into the logic, but you may want to refactor the while condition as follows:

long TIMEOUT = 15000L; // 15 seconds
long now = System.currentTimeMillis(); // init the long just above the while

while(mX.find() && (System.currentTimeMillis() - now) < TIMEOUT)
{
   ...
}