You represent each document as a vector in which every index position holds the weight of one term. For instance, take the document "hello world", associate position 0 with the importance of "hello" and position 1 with the importance of "world", and measure importance as the number of times the term appears in the document. The document is then seen as d = (1, 1).
In the same way, a document saying only "hello" would be (1, 0).
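As a minimal sketch of the counting scheme described above: the function name `count_vector`, the fixed `vocabulary` list, and the lowercase whitespace tokenization are all assumptions made for illustration, not part of any particular library.

```python
from collections import Counter

def count_vector(text, vocabulary):
    """Return raw term counts for `text`, one entry per term in `vocabulary`."""
    # Assumption: tokens are lowercase words separated by whitespace.
    counts = Counter(text.lower().split())
    # Position i of the vector holds the count of vocabulary[i].
    return [counts[term] for term in vocabulary]

# Position 0 is "hello", position 1 is "world".
vocabulary = ["hello", "world"]

d1 = count_vector("hello world", vocabulary)
d2 = count_vector("hello", vocabulary)

print(d1)  # [1, 1]
print(d2)  # [1, 0]
```

The order of `vocabulary` fixes which term each position stands for, so every document in the collection has to be vectorized against the same list.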
This representation can be based on any measure of the importance of terms in documents, with term frequency (as suggested by @Pedrom) being the simplest option. The most common technique, while still simple enough, is TF-IDF, which combines how common a term is in the document with how rare it is in the collection.
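A rough sketch of one TF-IDF variant may make this concrete. It assumes raw counts for the term frequency and log(N / df) for the inverse document frequency, where N is the number of documents and df is the number of documents containing the term. Other weightings and smoothing schemes exist, and the name `tfidf_vectors` is hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(documents, vocabulary):
    """Return one TF-IDF vector per document, ordered by `vocabulary`."""
    # Assumption: tokens are lowercase words separated by whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = {term: sum(1 for tokens in tokenized if term in tokens)
          for term in vocabulary}

    # Assumed IDF variant: log(N / df); terms absent from the collection get 0.
    idf = {term: math.log(n_docs / df[term]) if df[term] else 0.0
           for term in vocabulary}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts within this document
        vectors.append([tf[term] * idf[term] for term in vocabulary])
    return vectors

documents = ["hello world", "hello"]
vocabulary = ["hello", "world"]  # position 0 is "hello", position 1 is "world"

vectors = tfidf_vectors(documents, vocabulary)
for doc, vec in zip(documents, vectors):
    print(doc, vec)
```

Under this variant, "hello" appears in every document, so its IDF is log(2/2) = 0 and it contributes nothing to either vector, while "world" appears in only one document and gets weight log(2) there. That is the intended effect: terms that are common across the collection count for less than terms that are rare.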
I hope this helps,