Question

Binary files are frequently attached to SEC filings as encoded text (see example here), and I am writing a parser to capture that text and re-create the original file.

It doesn't matter whether the file is an Excel spreadsheet or a PDF (the linked sample below is a PDF); the encoding looks the same. It's not Base64, and I don't recognize it.

Do you? TIA.

<DOCUMENT>
<TYPE>LETTER
<SEQUENCE>1
<FILENAME>filename1.pdf
<TEXT>
<PDF>
begin 644 filename1.pdf
M)5!$1BTQ+C4-)>+CS],-"C,W(#`@;V)J#3P\+TQI;F5A<FEZ960@,2],(#$T
M-C0S,2]/(#,Y+T4@,30Q-S0W+TX@,2]4(#$T-C$R,R]((%L@-#8X(#$V,ET^
M/@UE;F1O8FH-("`@("`@("`@("`@("`@#0HT-R`P(&]B:@T\/"]$96-O9&50
M87)M<SP\+T-O;'5M;G,@-2]0<F5D:6-T;W(@,3(^/B]&:6QT97(O1FQA=&5$
M96-O9&4O241;/#`T-#$S,4$Q.#`Q-D,X-#!!-S$X0S-%,T$X1D5$0S!!/CQ!
M,31&,S%#,T(Y-T(T-#0P.3)"-#<P148U,D8W0C,X13Y=+TEN9&5X6S,W(#,R
....
...<snip>...
....
M``$F1B;-S0Q,#`S,"2"2-PU$,O:!2(:C0-E_QTS!;`;&H4$R/0&1C`P``08`
M_(\&40T*96YD<W1R96%M#65N9&]B:@US=&%R='AR968-"C$Q-@T*)25%3T8-
!"C\_
`
end
</PDF>
</TEXT>
</DOCUMENT>

Solution

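The answer turned out to be old school: uuencoding. The "begin 644 filename1.pdf" header line (a Unix file mode followed by the file name) and the closing "end" line are its telltale markers.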
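For anyone writing a similar parser, here is a minimal Python sketch of decoding such a block. It is only an illustration, not the code I ended up using: the function name uudecode is made up, and it assumes the whole filing has already been read into a string. It relies on binascii.a2b_uu from the standard library, and retries a line with only the characters its length byte calls for, since some encoders pad the last group with arbitrary characters.

```python
import binascii


def uudecode(text):
    """Decode the first uuencoded block in `text`.

    Returns (filename, data). `uudecode` is a hypothetical helper name.
    """
    lines = iter(text.splitlines())

    # Skip surrounding markup (<TEXT>, <PDF>, ...) until the header line,
    # which has the form "begin <octal mode> <filename>".
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "begin" and parts[1].isdigit():
            filename = parts[2]
            break
    else:
        raise ValueError("no uuencode 'begin' line found")

    chunks = []
    for line in lines:
        if line.strip() == "end":
            return filename, b"".join(chunks)
        if not line:
            continue
        try:
            chunks.append(binascii.a2b_uu(line))
        except binascii.Error:
            # The first character encodes how many bytes the line holds.
            # Keep only that character plus the characters needed for
            # those bytes, dropping any padding garbage after them.
            nbytes = (ord(line[0]) - 32) & 0x3F
            keep = (nbytes * 4 + 5) // 3
            chunks.append(binascii.a2b_uu(line[:keep]))

    raise ValueError("uuencoded block has no 'end' line")
```

Calling uudecode(filing_text) on the sample above should give back "filename1.pdf" and the PDF bytes, which can then be written out in binary mode. The file name comes from the filing itself, so sanitize it before using it as a path.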

Thanks to all who viewed the question.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow