Question

I want to know the most widely accepted ways to find features (special words) within a large data set. By special words, I mean words that are used most heavily in a specific field.

For example, I have two books:

  1. book1: a book about economics
  2. book2: a book about art

Now, I choose book1 and want to see which words are most related to it. I expect words such as 'financial', 'dollar', 'revenue', etc. to dominate the top of the most-used-words list. Even though these words may also occur in book2, their frequencies will be lower than in book1.

On the other hand, choosing book2 should yield words such as 'abstract', 'renaissance', 'romanticism', 'culture', etc.

Of course, the result depends on the context (in the above example, on book1 and book2).

Obviously, the chosen algorithm must be able to eliminate stop words.

So, I am wondering which methods are used for this problem.

Solution

tf-idf should help, since it combines:

  1. how often a word appears in a given document (i.e. each book)
  2. how many documents in the set (a.k.a. the corpus) contain the word

If a word appears a lot in a document but not so much in the rest of the corpus, it is likely characteristic of that document and will have a high tf-idf score. If, on the other hand, a word appears frequently in a document but also frequently across the whole corpus, it is not very characteristic of that document and thus will not have a high tf-idf score. The words with the highest tf-idf scores per document are the most relevant.
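
For concreteness, here is a minimal tf-idf sketch in plain Python. The toy docs dictionary and the top_tfidf helper are hypothetical names introduced purely for illustration:

import math
from collections import Counter

# Toy corpus: book name -> list of tokens (stand-ins for the two books).
docs = {
    "book1": ["dollar", "revenue", "dollar", "market", "the", "the"],
    "book2": ["renaissance", "culture", "abstract", "the", "the"],
}

def top_tfidf(doc_name, n=5):
    tokens = docs[doc_name]
    tf = Counter(tokens)                   # term frequency within this document
    scores = {}
    for word, count in tf.items():
        # document frequency: in how many documents does the word occur?
        df = sum(1 for toks in docs.values() if word in toks)
        idf = math.log(len(docs) / df)     # words present in every document get idf = 0
        scores[word] = (count / len(tokens)) * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]

print(top_tfidf("book1"))  # 'dollar' and 'revenue' outrank 'the', whose idf is 0

Because 'the' occurs in both books, its idf is log(2/2) = 0 and it drops to the bottom even before any explicit stop-word removal; in practice, tf-idf and stop-word lists complement each other.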

Stop-word removal is a step you may want to perform on your data before computing tf-idf measures for your documents, but you may want to try it both with and without stop words and compare the results.

EDIT:

To support what I mentioned in the comment re. not having to come up with the stop words yourself, here are NLTK's English stop words, which you can add to or remove from to suit whatever you want to implement:

>>> import nltk
>>> nltk.download('stopwords')  # one-time download of the stopword corpus
>>> from nltk.corpus import stopwords
>>> stopwords.words('english')
['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', 
'your', 'yours', 'yourself', 'yourselves', 'he', 'him', 'his', 
'himself', 'she', 'her', 'hers', 'herself', 'it', 'its', 'itself', 
'they', 'them', 'their', 'theirs', 'themselves', 'what', 'which', 
'who', 'whom', 'this', 'that', 'these', 'those', 'am', 'is', 'are', 
'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 
'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 
'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for', 
'with', 'about', 'against', 'between', 'into', 'through', 'during', 
'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down', 
'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 
'once', 'here', 'there', 'when', 'where', 'why', 'how', 'all', 'any', 
'both', 'each', 'few', 'more', 'most', 'other', 'some', 'such', 'no', 
'nor', 'not', 'only', 'own', 'same', 'so', 'than', 'too', 'very', 's', 
't', 'can', 'will', 'just', 'don', 'should', 'now']
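
As a quick illustration of putting that list to use (a sketch only; the token list is made up for this example), you can filter tokens before computing term frequencies or tf-idf:

>>> from nltk.corpus import stopwords
>>> stop = set(stopwords.words('english'))
>>> tokens = ['the', 'revenue', 'of', 'the', 'financial', 'sector']
>>> [t for t in tokens if t.lower() not in stop]
['revenue', 'financial', 'sector']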

OTHER TIPS

Take a look at Latent Dirichlet Allocation (LDA). It is an unsupervised algorithm that treats "topics" as distributions over terms and documents as distributions over topics. Implementations are widely available in multiple languages.
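
As one possible illustration (not part of the original answer), here is a hedged sketch using scikit-learn's LatentDirichletAllocation; the toy documents and parameter choices are assumptions made only to show the shape of the API, and a recent scikit-learn version is assumed:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy stand-ins for book1 and book2.
docs = [
    "financial revenue dollar market economy financial",
    "renaissance romanticism abstract culture painting",
]

vec = CountVectorizer(stop_words='english')      # built-in English stop-word removal
X = vec.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

terms = vec.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_terms = [terms[j] for j in topic.argsort()[::-1][:5]]
    print(f"topic {i}: {top_terms}")

With only two tiny documents the topics are not meaningful, but on book-length texts each topic's top terms play the role of the "special words" you are after.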

To eliminate stop words, you can simply find a stop-word list online or use a package available in your language of choice; this option is often built into text-mining or NLP packages (NLTK's English list shown above is one example).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow