Question

Binary files are frequently attached to SEC filings (see example here), and I am writing a parser to capture this text and re-create the original file.

It doesn't matter whether the file is an Excel spreadsheet or a PDF (the sample below, which is also linked, is a PDF): the encoding method looks the same. It's not Base64, and I don't recognize it.

Do you? TIA.

<DOCUMENT>
<TYPE>LETTER
<SEQUENCE>1
<FILENAME>filename1.pdf
<TEXT>
<PDF>
begin 644 filename1.pdf
M)5!$1BTQ+C4-)>+CS],-"C,W(#`@;V)J#3P\+TQI;F5A<FEZ960@,2],(#$T
M-C0S,2]/(#,Y+T4@,30Q-S0W+TX@,2]4(#$T-C$R,R]((%L@-#8X(#$V,ET^
M/@UE;F1O8FH-("`@("`@("`@("`@("`@#0HT-R`P(&]B:@T\/"]$96-O9&50
M87)M<SP\+T-O;'5M;G,@-2]0<F5D:6-T;W(@,3(^/B]&:6QT97(O1FQA=&5$
M96-O9&4O241;/#`T-#$S,4$Q.#`Q-D,X-#!!-S$X0S-%,T$X1D5$0S!!/CQ!
M,31&,S%#,T(Y-T(T-#0P.3)"-#<P148U,D8W0C,X13Y=+TEN9&5X6S,W(#,R
....
...<snip>...
....
M``$F1B;-S0Q,#`S,"2"2-PU$,O:!2(:C0-E_QTS!;`;&H4$R/0&1C`P``08`
M_(\&40T*96YD<W1R96%M#65N9&]B:@US=&%R='AR968-"C$Q-@T*)25%3T8-
!"C\_
`
end
</PDF>
</TEXT>
</DOCUMENT>

Solution

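The answer turned out to be old school: uuencoding. The "begin 644 filename1.pdf" header line, the body lines that start with "M" (the length character for a full 45-byte line), and the closing "end" line are the giveaway.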

Thanks to all who viewed the question.
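
For anyone writing a similar parser, here is a minimal sketch of decoding such a block in Python. It assumes the text between the begin and end lines is standard uuencoding, as in the sample above. The function name uudecode is hypothetical, and the fallback for malformed lines is an assumption about sloppy encoders, not something observed in the SEC data. It uses binascii.a2b_uu because the standard library's uu module was removed in Python 3.13.

```python
import binascii


def uudecode(text):
    """Decode the first uuencoded block found in `text`.

    Returns (filename, mode, data). Hypothetical helper for illustration;
    it is not part of any SEC/EDGAR tooling.
    """
    lines = iter(text.splitlines())

    # Skip ahead to the header line: "begin <octal mode> <filename>".
    for line in lines:
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "begin":
            mode, filename = parts[1], parts[2].strip()
            break
    else:
        raise ValueError("no 'begin' line found")

    chunks = []
    for line in lines:
        if line.strip() == "end":
            break
        if not line:
            continue
        try:
            chunks.append(binascii.a2b_uu(line))
        except binascii.Error:
            # Assumed workaround for lines with trailing junk: keep only
            # as many characters as the leading length character implies.
            nbytes = (((ord(line[0]) - 32) & 63) * 4 + 5) // 3
            chunks.append(binascii.a2b_uu(line[:nbytes]))
    else:
        raise ValueError("no 'end' line found")

    return filename, mode, b"".join(chunks)
```

Passing the contents of a <TEXT> section to uudecode gives back the filename from the begin line along with the raw bytes, which can then be written out with open(filename, "wb"). As a sanity check, the first body line of the sample above decodes to bytes starting with %PDF-1.5, the usual PDF header.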

License: CC-BY-SA with attribution
Not affiliated with StackOverflow