Question

I am editing this post after reading the comments below and working on the unit tests as suggested. Here is a brief description of my program:

  1. The input is a string containing only the letters A, G, C, and T. The string is usually 80-100K characters long.
  2. I have to identify regions (minimum length 200) that meet certain criteria. I am using a sliding-window algorithm. (Example: input string: abcdef, input width=3, sliding window strings would be abc, bcd, cde, def, ef. In my case, input width=200.) I wrote a function that does this and saves the start and end indexes of each matching window in an Integer list. So, let's say my list is (30,230, 40,240, 60,260, 300,500, 450,650), where 30, 40, 60, 300, and 450 are the start indexes of windows that meet the criteria and the remaining numbers are the end indexes.
  3. The next step is to identify intervals that are close by (within a distance of 100) and combine them. I have done that. Now my list is (30,260, 300,500, 450,650).
  4. My final step is to rerun the criteria on these intervals to make sure they still comply. This is where I run into problems. Here is my code:

    public static List<Integer> finalCPGIslands(List<Integer> iList,
        String iSeq, int width) {
    // Declare output list that contains final list of start and end
    // intervals
    List<Integer> oList = new ArrayList<Integer>();
    // Add the first two elements anyways
    oList.add(iList.get(0));
    oList.add(iList.get(1));
    if (iList.size() > 2) {
        for (int i = 2; i < iList.size(); i += 2) {
            // The below IF is attempted to ensure that substring is always
            // valid
            if (iSeq.length() > iList.get(i + 1)) {
                // While creating the substring in next line, I get String
                // index out of range: -9
                String testSeq = iSeq.substring(iList.get(i),
                        iList.get(i + 1) + 1);
                boolean check = cpgCriteriaCheck(testSeq);
                if (check) {
                    // If condition is met, add the indexes to the final
                    // list
                    oList.add(iList.get(i));
                    oList.add(iList.get(i + 1));
                }
                // If condition is not met, start removing one character at
                // a time until condition is met
                else {
    
                    int counter = 0;
                    int currentSequenceLength = testSeq.length();
                    String newTestSeq = null;
                    while (counter <= currentSequenceLength) {
                        counter++;
                        if (testSeq.length() > 2) {
                            newTestSeq = testSeq.substring(1,
                                    testSeq.length() - 1);
                            testSeq = newTestSeq;
                            if (newTestSeq.length() < width) {
                                counter = currentSequenceLength + 1;
                            } else {
                                boolean checkAgain = cpgCriteriaCheck(newTestSeq);
                                // If condition met, add the item to list
                                // and exit
                                if (checkAgain) {
                                    oList.add(iList.get(i) + counter);
                                    oList.add(iList.get(i + 1) - counter);
                                    counter = currentSequenceLength + 1;
                                }
    
                            } // End of Else
                        } // End of IF
    
                    } // End of While
                } // End of Else
            }
    
        } // End of For
    } // End of If (iList.size() > 2)
    return oList;
    }

In this function, the input arguments are the Integer list that contains the start and end indexes, my input string, and the minimum allowed distance between a start index and an end index. I get a "String index out of range: -9" exception when I try to create the substring in the following lines:

    String testSeq = iSeq.substring(iList.get(i),
            iList.get(i + 1) + 1);

Also, this exception only appears intermittently. I have an input file of around 95K characters for which the exception does not happen. I thought that by adding an IF statement that checks that the string length is greater than the value from the input list, I had covered this case. Also, what does the -9 indicate? Does it mean that the 9th character in the string is invalid? Could an unwanted character be causing this issue, even though I clean up the string by removing all \r and \n occurrences? Sorry for being so verbose, but I wanted to give the context of the problem. The root cause still seems to be just a StringIndexOutOfBoundsException thrown while creating a substring.
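For reference, the merging in step 3 can be sketched as below. This is only a minimal sketch, not the code from my program: the class name IntervalMergeSketch and the method mergeCloseIntervals are hypothetical, and it assumes that "close by" means the next start index is within 100 of the previous start index, which is the reading that reproduces the example list above. It also assumes the flat list is sorted by start index and holds complete (start, end) pairs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IntervalMergeSketch {

    // Hypothetical helper: merges flat (start, end) pairs whose start lies
    // within maxGap of the previous pair's start.
    // Assumption: the list is sorted by start and has an even number of entries.
    static List<Integer> mergeCloseIntervals(List<Integer> flat, int maxGap) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("list must hold (start, end) pairs");
        }
        List<Integer> merged = new ArrayList<>();
        if (flat.isEmpty()) {
            return merged;
        }
        int groupStart = flat.get(0);
        int groupEnd = flat.get(1);
        int prevStart = groupStart;
        for (int i = 2; i < flat.size(); i += 2) {
            int start = flat.get(i);
            int end = flat.get(i + 1);
            if (start - prevStart <= maxGap) {
                // Close enough to the previous window: extend the current group.
                groupEnd = Math.max(groupEnd, end);
            } else {
                // Too far away: close the current group and start a new one.
                merged.add(groupStart);
                merged.add(groupEnd);
                groupStart = start;
                groupEnd = end;
            }
            prevStart = start;
        }
        merged.add(groupStart);
        merged.add(groupEnd);
        return merged;
    }

    public static void main(String[] args) {
        List<Integer> input =
                Arrays.asList(30, 230, 40, 240, 60, 260, 300, 500, 450, 650);
        // Prints [30, 260, 300, 500, 450, 650]
        System.out.println(mergeCloseIntervals(input, 100));
    }
}
```

Because each group's end is taken with Math.max, a merged pair from this sketch never has its end before its start, provided every input pair is itself well formed.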


Solution

The StringIndexOutOfBoundsException was resolved by adding the following check before creating the substring. It covers every condition under which substring(from, to) can fail:

    if (str != null && from >= 0 && to >= from && to <= str.length()) { }
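As for what the -9 means: as far as I can tell from the JDK 8 source, the number in the message on Java 8 and earlier is the offending value itself, not a character position. It is either a negative begin index, an end index past the end of the string, or, when both of those are fine, the result of endIndex - beginIndex. Since the original code already checks the end index against the string length, -9 most likely means that a start index was negative or that a start index was larger than its end index. Newer Java versions word the message differently. The sketch below wraps the check in a helper. The names SafeSubstringSketch, isValidRange, and safeSubstring are hypothetical, and returning null for a bad range is just one possible design.

```java
import java.util.Arrays;
import java.util.List;

class SafeSubstringSketch {

    // Hypothetical helper: true only when str.substring(from, to) cannot throw.
    static boolean isValidRange(String str, int from, int to) {
        return str != null && from >= 0 && to >= from && to <= str.length();
    }

    // Returns the substring for a valid range, or null so the caller can skip it.
    static String safeSubstring(String str, int from, int to) {
        return isValidRange(str, from, to) ? str.substring(from, to) : null;
    }

    public static void main(String[] args) {
        String seq = "ACGTACGTACGTACGTACGT"; // 20 characters
        // The second pair is deliberately inverted (start 15, end 5) to mimic
        // the kind of interval that makes substring throw.
        List<Integer> intervals = Arrays.asList(2, 9, 15, 5);
        for (int i = 0; i + 1 < intervals.size(); i += 2) {
            int from = intervals.get(i);
            int to = intervals.get(i + 1) + 1; // end index in the list is inclusive
            String piece = safeSubstring(seq, from, to);
            System.out.println(from + ".." + to + " -> "
                    + (piece == null ? "skipped" : piece));
        }
    }
}
```

Skipping a bad interval hides the symptom, so it is probably also worth logging the pair and checking the earlier merge and trimming steps for how a start index ended up past its end index.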

Licensed under: CC-BY-SA with attribution