I would consider Elasticsearch: http://www.elasticsearch.org/
The benefits for your use case:
- It scales horizontally to very large data sets. You just add nodes to the cluster as the data grows.
- It is built on Lucene, so you know the underlying search engine is time-tested.
- It is schemaless: field mappings are inferred from the documents you index, so you don't have to do any ETL up front. Just store the data as is.
- It is well supported by an active community and is used in production by many large companies (including Stack Overflow).
- It's free!
- It's easy to query and gives you a lot of control over boosting certain results, so you can tune relevance for your domain.
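To give a rough idea of what boosting looks like, here is a minimal sketch that builds a `multi_match` query body in which some fields are weighted more heavily than others. The field names `title` and `body`, the helper `build_boosted_query`, and the search text are made up for illustration; the resulting JSON is what you would send to the `_search` endpoint of an index.

```python
import json


def build_boosted_query(text, field_boosts):
    """Build a multi_match query body.

    field_boosts maps a field name to its boost factor, using the
    "field^boost" syntax of the Elasticsearch query DSL.
    """
    fields = [f"{name}^{boost}" for name, boost in field_boosts.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


# Hypothetical fields: a match in "title" is weighted 3x relative to "body".
body = build_boosted_query("disk failure", {"title": 3, "body": 1})
print(json.dumps(body, indent=2))
```

Tuning for your domain then mostly comes down to adjusting those boost factors and checking how the ranking changes.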
I would also consider putting a queue in front of it to buffer writes, in case you produce data faster than the cluster can index it.
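As a sketch of the queue idea, assuming a single process and Python's in-memory `queue.Queue` standing in for whatever message broker you would really use: producers put documents on a bounded queue, and a consumer drains them in batches and formats each batch as a newline-delimited body for the `_bulk` endpoint. The helper names, the index name `events`, and the batch size are all assumptions, and the bulk format shown assumes a recent Elasticsearch version that does not require a type in the action line.

```python
import json
import queue


def drain_batch(q, max_items):
    """Take up to max_items documents off the queue without blocking."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


def to_bulk_body(index_name, docs):
    """Format documents as newline-delimited JSON for the _bulk endpoint.

    Each document is preceded by an "index" action line, and every line
    ends with a newline, as the bulk API expects.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "".join(line + "\n" for line in lines)


# Bounded queue: put() blocks when it is full, which slows producers down
# instead of overwhelming the cluster.
pending = queue.Queue(maxsize=1000)
for i in range(5):
    pending.put({"id": i, "message": f"event {i}"})

batch = drain_batch(pending, max_items=3)
payload = to_bulk_body("events", batch)  # "events" is a made-up index name
print(payload)
```

Sending one bulk request per batch rather than one request per document is usually what lets the indexing side keep up; the queue just absorbs the bursts in between.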