The SEC web site has a lot of data, so first decide what information you want to get. Most likely you will be interested in the 10-Q and 10-K forms, which contain the financial statements.
Before 2010, filings were submitted in HTML format; afterwards, in HTML as well as XML (XBRL) files. This link http://www.sec.gov/divisions/corpfin/organization/cfia-c.htm lists the CIKs of all companies registered with the SEC. If you want information about a particular company, you can use this URL:
This will show all the filings from that company. You can change a few parameters in this URL:
count : the number of filings returned per request
CIK : either the CIK number or the ticker symbol of the company
type : restricts the results to one type of filing, e.g. type=10-Q will return only the 10-Q documents for that company
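As a minimal sketch of how these parameters combine into a query string, the Python below appends CIK, type and count to a base URL. The function name `build_filings_url` and the base URL `https://example.invalid/browse` are hypothetical placeholders for illustration only; substitute the actual filings URL.

```python
from urllib.parse import urlencode


def build_filings_url(base_url, cik, filing_type=None, count=None):
    """Append the CIK, type and count query parameters to base_url.

    base_url is whatever filings URL you are using; it is an argument
    here because this sketch does not assume a particular endpoint.
    """
    params = {"CIK": cik}  # CIK number or ticker symbol
    if filing_type is not None:
        params["type"] = filing_type  # e.g. "10-Q" or "10-K"
    if count is not None:
        params["count"] = count  # number of filings per request
    # Use "&" if the base URL already carries a query string.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


# Hypothetical placeholder base URL, not a real endpoint.
url = build_filings_url("https://example.invalid/browse", "AAPL", "10-Q", 40)
print(url)
```

Leaving out `filing_type` or `count` simply omits that parameter, so the server's defaults apply.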
You can use any crawler to download the HTML and XML files.
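If you do not want a full crawler, a single file can be fetched with Python's standard library. This is only a sketch: the helper names `make_request` and `download` are hypothetical, and setting an identifying User-Agent is an assumption on my part about what the server expects from automated clients.

```python
import urllib.request


def make_request(url, user_agent):
    """Build a GET request that identifies the client via User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, dest_path, user_agent):
    """Fetch url and write the raw bytes (HTML or XML) to dest_path."""
    request = make_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=30) as response:
        data = response.read()
    with open(dest_path, "wb") as out:
        out.write(data)
```

Call `download(filing_url, "filing.xml", "your-name your-email")` for each document URL you collect, and pause between requests so you do not overload the server.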
You can also find all the files from a company here: