Question

I am trying to index documents to Solr with SolrJ. I am using Solr 4.5 and have huge files to index. What are the ways to index each file while avoiding performance bottlenecks?


Solution

The first thing to check is the server-side log: look for messages about commits. It is possible you are doing a hard commit after parsing each file, which is expensive. You could look into soft commits or the commitWithin parameter to have files show up slightly later.
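As a minimal sketch of the commitWithin approach with SolrJ 4.x (the Solr URL and field names below are placeholders, not from the question):

```java
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class CommitWithinExample {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; point this at your own Solr core.
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1"); // field name is illustrative

        UpdateRequest req = new UpdateRequest();
        req.add(doc);
        // Ask Solr to make the document searchable within 10 seconds,
        // instead of issuing an explicit (expensive) hard commit per file.
        req.setCommitWithin(10000);
        req.process(server);
    }
}
```

This lets Solr batch the actual commit internally, so indexing many files does not pay the hard-commit cost each time.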

Secondly, you seem to be sending a request to Solr to fetch your file and run Tika extraction on it server-side. This probably restarts Tika inside Solr every time, and you will not be able to batch those requests as the other answers suggest.

But you could run Tika locally in your client, initialize it once, and keep it around. That gives you more flexibility in how you construct your SolrInputDocument, which you can then batch.
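A minimal sketch of that idea, assuming Tika and SolrJ are on the classpath; the class and field names are illustrative, not part of any Solr API:

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

public class LocalTikaIndexer {
    // One parser instance, initialized once and reused for every file,
    // instead of letting Solr spin Tika up per extract request.
    private final AutoDetectParser parser = new AutoDetectParser();

    public SolrInputDocument toDocument(Path file) throws Exception {
        BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no size limit
        Metadata metadata = new Metadata();
        try (InputStream in = Files.newInputStream(file)) {
            parser.parse(in, handler, metadata);
        }
        SolrInputDocument doc = new SolrInputDocument();
        // Field names here are illustrative; match them to your schema.
        doc.addField("id", file.toString());
        doc.addField("content", handler.toString());
        doc.addField("content_type", metadata.get(Metadata.CONTENT_TYPE));
        return doc;
    }
}
```

The resulting SolrInputDocuments can be collected into a list and sent to Solr in batches, as shown in the answer below.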

Other tips

Sending an update request for each individual document is slow with Solr.

You are much better off adding all the documents and then committing once in the same update request. Taken from the Solr wiki:

import java.util.ArrayList;
import java.util.Collection;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
docs.add(doc1);
docs.add(doc2);

// Batch all documents into a single request and commit once at the end.
UpdateRequest req = new UpdateRequest();
req.setAction(UpdateRequest.ACTION.COMMIT, false, false);
req.add(docs);
UpdateResponse rsp = req.process(server); // server is your SolrServer instance
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow