Question

I need to write an application that compares some very large CSV files, each with 40,000 records. I have written an application that works correctly, but it spends a lot of time on the comparison, because the two files may be in different order or contain different records; because of that I must iterate (40,000^2)*2 times.

Here is my code:

if (nomFich.equals("CAR")) {
    while ((linea = br3.readLine()) != null) {
        array = linea.split(",");
        spliteado = array[0] + array[1] + array[2] + array[8];

        FileReader fh3 = new FileReader(cadena + lista2[0]);
        BufferedReader bh3 = new BufferedReader(fh3);

        find = 0;
        while ((linea2 = bh3.readLine()) != null) {
            array2 = linea2.split(",");
            spliteado2 = array2[0] + array2[1] + array2[2] + array2[8];

            if (spliteado.equals(spliteado2)) {
                find = 1;
            }
        }

        if (find == 0) {
            bw3.write("+++++++++++++++++++++++++++++++++++++++++++");
            bw3.newLine();
            bw3.write("Se han incorporado los siguientes CGI en la nueva lista");
            bw3.newLine();
            bw3.write(linea);
            bw3.newLine();
            aparece = 1;
        }
        bh3.close();
    }

I think that using a Set in Java is a good option, as the following post suggests: Comparing two csv files in Java

But before I try it that way, I would like to know if there are any better options.

Thanks to all.


Solution 2

// Index the key columns of the first file once, then stream the second
// file and look each key up in O(1).
HashMap<String, String> file1Map = new HashMap<String, String>();

String line;
while ((line = file1.readLine()) != null) {
  String[] array = line.split(",");
  String key = array[0] + array[1] + array[2] + array[8];
  file1Map.put(key, key);
}

while ((line = file2.readLine()) != null) {
  String[] array = line.split(",");
  String key = array[0] + array[1] + array[2] + array[8];
  if (file1Map.containsKey(key)) {
    // file2 line also exists in file1
  }
  else {
    // file2 line does not exist in file1
  }
}
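To make the idea concrete, here is a minimal, self-contained sketch of the same lookup approach, operating on in-memory lists of CSV lines instead of readers. The class and method names (`CsvKeyDiff`, `addedLines`) are my own for illustration; a separator is added between the key columns so that adjacent fields cannot accidentally collide (e.g. `"ab"+"c"` vs `"a"+"bc"`).

```java
import java.util.*;

public class CsvKeyDiff {
    // Build the comparison key from columns 0, 1, 2 and 8, with a
    // separator so different field splits cannot produce the same key.
    static String key(String line) {
        String[] f = line.split(",");
        return f[0] + "|" + f[1] + "|" + f[2] + "|" + f[8];
    }

    // Return the lines of 'newer' whose key does not appear in 'older'.
    static List<String> addedLines(List<String> older, List<String> newer) {
        Set<String> seen = new HashSet<>();
        for (String line : older) {
            seen.add(key(line));
        }
        List<String> added = new ArrayList<>();
        for (String line : newer) {
            if (!seen.contains(key(line))) {
                added.add(line);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        List<String> older = Arrays.asList("1,2,3,x,x,x,x,x,9", "4,5,6,x,x,x,x,x,7");
        List<String> newer = Arrays.asList("1,2,3,y,y,y,y,y,9", "7,8,9,x,x,x,x,x,1");
        // Only the second line of 'newer' has a key not seen in 'older'.
        System.out.println(addedLines(older, newer));
    }
}
```

This is two linear passes (build the set, then probe it), so roughly 80,000 iterations instead of 40,000², at the cost of holding one file's keys in memory.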

Other tips

As far as I can interpret your code, you need to find out which lines in the first CSV file do not have an equal line in the second CSV file. Correct?

If so, you only need to put all lines of the second CSV file into a HashSet. Like so (Java 7 code):

Set<String> linesToCompare = new HashSet<>();
try (BufferedReader reader = new BufferedReader(new FileReader(cadena + lista2[0]))) {
    String line;
    while ((line = reader.readLine()) != null) {
        String[] splitted = line.split(",");
        linesToCompare.add(splitted[0] + splitted[1] + splitted[2] + splitted[8]);
    }
}

Afterwards you can simply iterate over the lines in the first CSV file and compare:

try (BufferedReader reader = new BufferedReader(new FileReader(...))) {
    String line;
    while ((line = reader.readLine()) != null) {
        String[] splitted = line.split(",");
        String joined = splitted[0] + splitted[1] + splitted[2] + splitted[8];
        if (!linesToCompare.contains(joined)) {
            // handle missing line here
        }
    }
}

Does that fit your needs?

Assuming this all won't fit in memory, I would first convert the files to stripped-down versions (el0, el1, el2, el8, orig-file-line-nr-for-reference-afterwards) and then sort those files. After that you can stream through both files simultaneously and compare the records as you go. Taking the sorting out of the equation, you only need to compare them about once.

But I'm guessing you could do the same using some List/Array object that allows sorting and storing in memory; 40k records really doesn't sound like much to me, unless the elements are very big, of course. And it's going to be orders of magnitude faster.
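The streaming comparison described above is the classic merge step over two sorted key lists: advance whichever side has the smaller key, and record keys that appear on only one side. A minimal sketch (the class and method names are my own for illustration):

```java
import java.util.*;

public class SortedCsvMerge {
    // Walk two sorted key lists in a single pass, collecting keys
    // present in only one of them (a classic merge step).
    static void diffSorted(List<String> a, List<String> b,
                           List<String> onlyInA, List<String> onlyInB) {
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = a.get(i).compareTo(b.get(j));
            if (cmp == 0) { i++; j++; }               // key in both files
            else if (cmp < 0) onlyInA.add(a.get(i++)); // smaller key only in a
            else onlyInB.add(b.get(j++));              // smaller key only in b
        }
        while (i < a.size()) onlyInA.add(a.get(i++));  // drain remainders
        while (j < b.size()) onlyInB.add(b.get(j++));
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("k3", "k1", "k4"));
        List<String> b = new ArrayList<>(Arrays.asList("k4", "k1", "k2"));
        Collections.sort(a); // the merge step requires sorted input
        Collections.sort(b);
        List<String> onlyA = new ArrayList<>();
        List<String> onlyB = new ArrayList<>();
        diffSorted(a, b, onlyA, onlyB);
        System.out.println(onlyA + " " + onlyB); // [k3] [k2]
    }
}
```

Sorting costs O(n log n) once per file, and the merge itself is a single O(n) pass, which is why the comparison happens "about once" per record rather than n times.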

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow