Question

Binary files are frequently attached to SEC filings (see example here) and I am writing a parser to capture this text and re-create the file.

Whether the file is an Excel spreadsheet or a PDF (the sample below, also linked to, is a PDF), the encoding method looks the same. It's not Base64, and I don't recognize it.

Do you? TIA.

<DOCUMENT>
<TYPE>LETTER
<SEQUENCE>1
<FILENAME>filename1.pdf
<TEXT>
<PDF>
begin 644 filename1.pdf
M)5!$1BTQ+C4-)>+CS],-"C,W(#`@;V)J#3P\+TQI;F5A<FEZ960@,2],(#$T
M-C0S,2]/(#,Y+T4@,30Q-S0W+TX@,2]4(#$T-C$R,R]((%L@-#8X(#$V,ET^
M/@UE;F1O8FH-("`@("`@("`@("`@("`@#0HT-R`P(&]B:@T\/"]$96-O9&50
M87)M<SP\+T-O;'5M;G,@-2]0<F5D:6-T;W(@,3(^/B]&:6QT97(O1FQA=&5$
M96-O9&4O241;/#`T-#$S,4$Q.#`Q-D,X-#!!-S$X0S-%,T$X1D5$0S!!/CQ!
M,31&,S%#,T(Y-T(T-#0P.3)"-#<P148U,D8W0C,X13Y=+TEN9&5X6S,W(#,R
....
...<snip>...
....
M``$F1B;-S0Q,#`S,"2"2-PU$,O:!2(:C0-E_QTS!;`;&H4$R/0&1C`P``08`
M_(\&40T*96YD<W1R96%M#65N9&]B:@US=&%R='AR968-"C$Q-@T*)25%3T8-
!"C\_
`
end
</PDF>
</TEXT>
</DOCUMENT>

Solution

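The answer turned out to be old school: uuencoding. The "begin 644 filename1.pdf" header, the data lines that start with "M" (the length character for a full 45-byte line), and the closing "end" line are its signature.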
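For anyone writing the same kind of parser, here is a minimal sketch of how such a block could be decoded in Python. It assumes the filing text has already been read into a string and that only the first "begin ... end" block is wanted. The function name decode_uu_block is made up for illustration; binascii.a2b_uu is the standard-library call that decodes one uuencoded line at a time.

```python
import binascii
from typing import Tuple


def decode_uu_block(text: str) -> Tuple[str, bytes]:
    """Return (filename, payload) for the first uuencoded block in text.

    Hypothetical helper: it expects a 'begin <mode> <filename>' line,
    uuencoded data lines, and a terminating 'end' line.
    """
    lines = iter(text.splitlines())

    # Skip everything (e.g. <DOCUMENT>, <TEXT>, <PDF> tags) up to the header.
    for line in lines:
        if line.startswith("begin "):
            _, _mode, filename = line.split(" ", 2)
            break
    else:
        raise ValueError("no 'begin' line found")

    payload = bytearray()
    for line in lines:
        if line.strip() == "end":
            return filename, bytes(payload)
        if not line.strip():
            continue  # ignore blank lines between data lines
        # Each data line carries its own length in the first character;
        # a line holding only a backtick decodes to zero bytes.
        payload += binascii.a2b_uu(line.encode("ascii"))

    raise ValueError("no 'end' line found")
```

The returned payload can then be written out with something like open(filename, "wb").write(payload). Real filings may contain several documents, so a production parser would probably loop over every block rather than stop at the first one.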

Thanks to all who viewed the question.

Licensed under: CC-BY-SA with attribution