Binary files are frequently attached to SEC filings (see example here), and I am writing a parser to extract the encoded text and re-create the original file.

Whether the file is an Excel spreadsheet or a PDF (as in the sample below, which is linked to), the encoding method looks the same. It's not Base64, and I don't recognize it.

Do you? TIA.

<DOCUMENT>
<TYPE>LETTER
<SEQUENCE>1
<FILENAME>filename1.pdf
<TEXT>
<PDF>
begin 644 filename1.pdf
M)5!$1BTQ+C4-)>+CS],-"C,W(#`@;V)J#3P\+TQI;F5A<FEZ960@,2],(#$T
M-C0S,2]/(#,Y+T4@,30Q-S0W+TX@,2]4(#$T-C$R,R]((%L@-#8X(#$V,ET^
M/@UE;F1O8FH-("`@("`@("`@("`@("`@#0HT-R`P(&]B:@T\/"]$96-O9&50
M87)M<SP\+T-O;'5M;G,@-2]0<F5D:6-T;W(@,3(^/B]&:6QT97(O1FQA=&5$
M96-O9&4O241;/#`T-#$S,4$Q.#`Q-D,X-#!!-S$X0S-%,T$X1D5$0S!!/CQ!
M,31&,S%#,T(Y-T(T-#0P.3)"-#<P148U,D8W0C,X13Y=+TEN9&5X6S,W(#,R
....
...<snip>...
....
M``$F1B;-S0Q,#`S,"2"2-PU$,O:!2(:C0-E_QTS!;`;&H4$R/0&1C`P``08`
M_(\&40T*96YD<W1R96%M#65N9&]B:@US=&%R='AR968-"C$Q-@T*)25%3T8-
!"C\_
`
end
</PDF>
</TEXT>
</DOCUMENT>

Solution

The answer turned out to be old school: uuencoding. The "begin 644 filename1.pdf" header line and the closing "end" line are its signature.
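For anyone writing a similar parser, here is a minimal sketch in Python of pulling a uuencoded block out of a filing and decoding it. It assumes the block follows the usual layout seen in the sample (a "begin <mode> <name>" line, data lines, then "end"). The function name extract_uuencoded is my own and not part of any library; the decoding itself uses binascii.a2b_uu from the standard library.

```python
import binascii
import re

# Matches the uuencode header, e.g. "begin 644 filename1.pdf".
BEGIN_RE = re.compile(r"^begin\s+[0-7]{3,4}\s+(.+)$")


def extract_uuencoded(text):
    """Return (filename, data) for the first uuencoded block in text.

    Hypothetical helper for illustration: it scans for a "begin" line,
    decodes every following line, and stops at the "end" line.
    """
    filename = None
    chunks = []
    for line in text.splitlines():
        if filename is None:
            # Still looking for the header; skip tags such as <PDF>.
            match = BEGIN_RE.match(line)
            if match:
                filename = match.group(1).strip()
            continue
        if line.strip() == "end":
            return filename, b"".join(chunks)
        if not line.strip():
            # Blank line inside the block: nothing to decode.
            continue
        # Each data line starts with a length character followed by
        # groups of 4 characters that encode 3 bytes each.
        chunks.append(binascii.a2b_uu(line.encode("ascii")))
    raise ValueError("no complete uuencoded block found")
```

To re-create the attachment, pass the filing text to the function and write the returned bytes to a file opened in binary mode ("wb"). As a sanity check on the sample above, the first four characters after the length byte, ")5!$", decode to "%PD", the start of the PDF signature "%PDF-1.5".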

Thanks to all who viewed the question.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow