Question

I have 2 files that I'm parsing line by line, adding the information to 2 separate ArrayList<String> containers. I'm trying to create a final container, "finalPNList", that reflects the 'Resulting File/ArrayList' below.

The issue is that I'm not successfully avoiding duplicates. I've changed the code in various ways without success. Sometimes I make the condition too strict and drop all duplicates, and sometimes I leave it too loose and include all duplicates. I can't seem to find the condition that gets it just right.

Here is the code so far. In this case, the contents of processLine() aren't really relevant; just know that you end up with a map holding 2 ArrayList<String> values.

public static Map<String, List<String>> masterList = new HashMap<String, List<String>>();
public static List<String> finalPNList = new ArrayList<String>();
public static List<String> modifier = new ArrayList<String>();
public static List<String> skipped = new ArrayList<String>();

for (Entry<String, String> e : tab1.entrySet()) {
    String key = e.getKey();
    String val = e.getValue();

    // returns BufferedReader to start line processing
    inputStream = getFileHandle(val);
    // builds masterList containing all data
    masterList.put(key, processLine(inputStream));
}
for (Entry<String, List<String>> e : masterList.entrySet()) {
    String key = e.getKey();
    List<String> val = e.getValue();
    System.out.println(modifier.size());
    for (String s : val) {
        if (modifier.size() == 0)
            finalPNList.add(s);
        if (!modifier.isEmpty() && finalPNList.contains(s)
                && !modifier.contains(key)) {
            // s has been added by parent process so SKIP!
            skipped.add(s);
        } else
            finalPNList.add(s);    
    }    
    modifier.add(key);
}

Here is what the data may look like (extremely simplified; the real data is about 20K lines in total, roughly 10K lines in each file):

File A

123;data
123;data
456,data

File B

123;data
789,data
789,data

Resulting File/ArrayList

123;data
123;data
789,data
789,data

Solution

  • !modifier.contains(key) is always true, because key is only added to modifier after its list has been processed and map keys are unique. It can be removed from your if-statement.
  • modifier.size() == 0 can be replaced with modifier.isEmpty().
  • Since you seem to want to add duplicates from File B, you need to check File A, not finalPNList when checking for existence (I just checked the applicable list in masterList, feel free to change this to something more appropriate / efficient).
  • You need to have an else after your first if-statement, otherwise you're adding items from File A twice.
  • I assumed you just missed 456 in your expected output; otherwise I may have misunderstood what you want.
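
To make the missing-else point concrete, here is a minimal sketch of only the first pass over File A, when modifier is still empty. The class and method names (MissingElseDemo, withoutElse, withElse) are made up for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MissingElseDemo {

    // First pass as written in the question: two independent if-statements.
    public static List<String> withoutElse(List<String> fileA) {
        List<String> finalPNList = new ArrayList<>();
        List<String> modifier = new ArrayList<>(); // still empty on the first pass
        for (String s : fileA) {
            if (modifier.isEmpty())
                finalPNList.add(s);
            if (!modifier.isEmpty() && finalPNList.contains(s)) {
                // never reached while modifier is empty
            } else {
                finalPNList.add(s); // runs as well, so s is added a second time
            }
        }
        return finalPNList;
    }

    // Same pass with the two statements chained by else.
    public static List<String> withElse(List<String> fileA) {
        List<String> finalPNList = new ArrayList<>();
        List<String> modifier = new ArrayList<>(); // still empty on the first pass
        for (String s : fileA) {
            if (modifier.isEmpty())
                finalPNList.add(s);
            else if (finalPNList.contains(s)) {
                // would be the skip branch on later passes
            } else {
                finalPNList.add(s);
            }
        }
        return finalPNList;
    }

    public static void main(String[] args) {
        List<String> fileA = Arrays.asList("123", "123", "456");
        System.out.println(withoutElse(fileA)); // [123, 123, 123, 123, 456, 456]
        System.out.println(withElse(fileA));    // [123, 123, 456]
    }
}
```

Without the else, every line of File A falls through to the final else branch and is added again.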

Modified code with your file-IO replaced with something that's more in the spirit of an SSCCE:

// Assumes "A" is iterated before "B"; a HashMap does not guarantee iteration order,
// so use a LinkedHashMap for masterList if the order matters.
masterList.put("A", Arrays.asList("123", "123", "456"));
masterList.put("B", Arrays.asList("123", "789", "789"));
for (Map.Entry<String, List<String>> e : masterList.entrySet()) {
    String key = e.getKey();
    List<String> val = e.getValue();
    System.out.println(modifier.size());
    for (String s : val) {
        if (modifier.isEmpty()) {
            // first list (File A): keep every line, including its own duplicates
            finalPNList.add(s);
        } else if (masterList.get("A").contains(s)) {
            // s has been added by parent process so SKIP!
            skipped.add(s);
        } else {
            finalPNList.add(s);
        }
    }
    modifier.add(key);
}
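
Since List.contains is a linear scan, checking every line of File B against roughly 10K lines of File A can get slow. One possible variant, sketched below, copies the first list into a HashSet for the lookup and uses a LinkedHashMap so that File A is reliably processed first. The class name MergeWithoutParentDuplicates and the merge method are hypothetical, and the sketch assumes the first map entry is always the "parent" file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MergeWithoutParentDuplicates {

    // Keeps every line of the first list, then appends lines from later
    // lists only if the first list does not already contain them.
    public static List<String> merge(Map<String, List<String>> masterList,
                                     List<String> skipped) {
        List<String> finalPNList = new ArrayList<>();
        Set<String> parentLines = null; // lines of the first ("parent") list

        for (Map.Entry<String, List<String>> e : masterList.entrySet()) {
            if (parentLines == null) {
                // First list: keep everything, including its own duplicates.
                finalPNList.addAll(e.getValue());
                parentLines = new HashSet<>(e.getValue());
                continue;
            }
            for (String s : e.getValue()) {
                if (parentLines.contains(s)) {
                    skipped.add(s);     // already contributed by the parent
                } else {
                    finalPNList.add(s); // new line; duplicates within this list are kept
                }
            }
        }
        return finalPNList;
    }

    public static void main(String[] args) {
        // LinkedHashMap preserves insertion order, so "A" is processed first.
        Map<String, List<String>> masterList = new LinkedHashMap<>();
        masterList.put("A", Arrays.asList("123", "123", "456"));
        masterList.put("B", Arrays.asList("123", "789", "789"));

        List<String> skipped = new ArrayList<>();
        System.out.println(merge(masterList, skipped)); // [123, 123, 456, 789, 789]
        System.out.println(skipped);                    // [123]
    }
}
```

The set lookup is constant time on average, so the whole merge is roughly linear in the total number of lines. As in the code above, 456 stays in the result.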


Licensed under: CC-BY-SA with attribution