Question

I'm reading a file using a BufferedReader, so let's say I have

line = br.readLine();

I want to check whether this line contains one of many possible strings (which I have in an array). I would like to be able to write something like:

while (!line.matches(stringArray)) { // not sure how to write this conditional
  do something here;
  line = br.readLine();
}

I'm fairly new to programming and Java. Am I going about this the right way?

Solution

Copy all values into a Set<String> and then use contains():

Set<String> set = new HashSet<String> (Arrays.asList (stringArray));
while (!set.contains(line)) { ... }
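
A minimal sketch of how the set lookup could slot into the read loop from the question. The class name, the helper readUntilMatch and the sample input are made up for illustration, and it assumes the goal is to process lines until one of them equals an entry in the array exactly:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ReadUntilMatch {
    // Hypothetical helper: collects lines until one equals an entry in
    // stopLines (or the input ends). The matching line is not included.
    static List<String> readUntilMatch(BufferedReader br, String[] stopLines) throws IOException {
        Set<String> set = new HashSet<String>(Arrays.asList(stopLines));
        List<String> seen = new ArrayList<String>();
        String line = br.readLine();
        // readLine() returns null at end of input, so test for that first
        while (line != null && !set.contains(line)) {
            seen.add(line);        // stands in for "do something here"
            line = br.readLine();  // assign the result, or the loop never advances
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        String[] stopLines = { "END", "STOP" };
        BufferedReader br = new BufferedReader(new StringReader("alpha\nbeta\nSTOP\ngamma"));
        System.out.println(readUntilMatch(br, stopLines)); // prints [alpha, beta]
    }
}
```

Note the null check: without it the loop throws a NullPointerException if the file ends before any line matches.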

[EDIT] If you want to find out if a part of the line contains a string from the set, you have to loop over the set. Replace set.contains(line) with a call to:

public boolean matches(Set<String> set, String line) {
    for (String check: set) {
        if (line.contains(check)) return true;
    }
    return false;
}

Adjust the check accordingly when you use regexp or a more complex method for matching.

[EDIT2] A third option is to join the elements of the array into one large regexp, separated by |:

Pattern p = Pattern.compile("str1|str2|str3");

while (!p.matcher(line).find()) { // or matches for a whole-string match
    ...
}

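This may be cheaper if you have many elements in the array, since the pattern is compiled once and the regexp code can optimize the matching process, but measure it if performance matters. If the strings can contain regexp metacharacters such as . or (, escape each one with Pattern.quote() before joining them.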
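
As a rough sketch, the pattern could be built from the array instead of being typed by hand. The class and the helper anyOf are hypothetical names, and the code assumes the array is non-empty (an empty array would give an empty pattern, which finds a match in every line):

```java
import java.util.regex.Pattern;

class AnyOfPattern {
    // Hypothetical helper: joins the entries with '|'.
    // Pattern.quote makes each entry match literally, so characters
    // such as '.' or '(' in the data are not treated as regexp syntax.
    static Pattern anyOf(String[] words) {
        StringBuilder regex = new StringBuilder();
        for (String word : words) {
            if (regex.length() > 0) {
                regex.append('|');
            }
            regex.append(Pattern.quote(word));
        }
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        Pattern p = anyOf(new String[] { "a.b", "stop" });
        System.out.println(p.matcher("please stop here").find()); // true
        System.out.println(p.matcher("xxa.bxx").find());          // true
        System.out.println(p.matcher("aXb").find());              // false, '.' is literal
    }
}
```

Build the pattern once, outside the read loop, and reuse it for every line.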

OTHER TIPS

It depends on what stringArray is. If it's a Collection then fine. If it's a true array, you should make it a Collection. The Collection interface has a method called contains() that will determine if a given Object is in the Collection.

Simple way to turn an array into a Collection:

String[] tokens = { ... };
List<String> list = Arrays.asList(tokens);

The problem with a List is that lookup is expensive (technically linear or O(n)). A better bet is to use a Set, which is unordered but has near-constant (O(1)) lookup. You can construct one like this:

From a Collection:

Set<String> set = new HashSet<String>(stringList);

From an array:

Set<String> set = new HashSet<String>(Arrays.asList(stringArray));

and then set.contains(line) will be a cheap operation.

Edit: Ok, I think your question wasn't clear. You want to see if the line contains any of the words in the array. What you want then is something like this:

BufferedReader in = null;
Set<String> words = ... // construct this as per above
try {
  in = ...
  String line; // must be declared outside the loop condition
  while ((line = in.readLine()) != null) {
    for (String word : words) {
      if (line.contains(word)) {
        // do whatever
      }
    }
  }
} catch (Exception e) {
  e.printStackTrace();
} finally {
  if (in != null) { try { in.close(); } catch (Exception e) { } }
}
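
For comparison, here is a self-contained sketch of the same loop using try-with-resources, which assumes Java 7 or later. The class, the helper countMatchingLines and the sample text are invented for the example, and a StringReader stands in for the real file:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CountMatchingLines {
    // Hypothetical helper: counts the lines that contain at least one of the words.
    static int countMatchingLines(Reader source, Set<String> words) throws IOException {
        int count = 0;
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                for (String word : words) {
                    if (line.contains(word)) {
                        count++;
                        break; // one hit per line is enough
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Set<String> words = new HashSet<String>(Arrays.asList("cat", "dog"));
        Reader source = new StringReader("a cat sat\nnothing here\ncat and dog\n");
        System.out.println(countMatchingLines(source, words)); // prints 2
    }
}
```

The finally block with its nested try/catch is no longer needed, because the reader is closed automatically.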

This is quite a crude check, which is used surprisingly often and tends to give annoying false positives on words like "scrap". For a more sophisticated solution you probably have to use a regular expression and look for word boundaries:

Pattern p = Pattern.compile("(?<=\\b)" + word + "(?=\\b)");
Matcher m = p.matcher(line);
if (m.find()) {
  // word found
}

You will probably want to do this more efficiently (like not compiling the pattern with every line) but that's the basic tool to use.
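
One possible way to avoid recompiling is to fold all the words into a single pattern up front. This is only a sketch: the class and the helper wholeWords are hypothetical, it uses plain \b instead of the lookarounds, and it assumes every word starts and ends with a letter, digit or underscore, since \b only marks a boundary next to such characters:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class WholeWordMatcher {
    // Hypothetical helper: one pattern of the form \b(?:w1|w2|...)\b,
    // meant to be compiled once and reused for every line.
    static Pattern wholeWords(List<String> words) {
        StringBuilder alternatives = new StringBuilder();
        for (String word : words) {
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            alternatives.append(Pattern.quote(word)); // match each word literally
        }
        return Pattern.compile("\\b(?:" + alternatives + ")\\b");
    }

    public static void main(String[] args) {
        Pattern p = wholeWords(Arrays.asList("rap", "jazz"));
        System.out.println(p.matcher("a rap song").find());   // true
        System.out.println(p.matcher("scrap metal").find());  // false, "rap" is inside a word
        System.out.println(p.matcher("jazz, mostly").find()); // true
    }
}
```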

Using the String.matches(regex) function, what about creating a regular expression that matches any one of the strings in the string array? Something like

// String.matches() has to match the whole line, hence the .* on both sides
String regex = ".*(";
for (int i = 0; i < array.length - 1; ++i)
  regex += array[i] + "|";
regex += array[array.length - 1] + ").*";
while (line.matches(regex))
{
  // . . .
}
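
To see why the .* wrapper matters, here is a small sketch comparing String.matches() with Matcher.find(). The class, the constant ALTERNATION and the sample lines are made up for illustration:

```java
import java.util.regex.Pattern;

class MatchesVersusFind {
    static final String ALTERNATION = "stop|halt"; // sample alternatives

    // matches() succeeds only if the pattern covers the entire line
    static boolean bareMatches(String line) {
        return line.matches(ALTERNATION);
    }

    // wrapping the alternation in .*( ... ).* lets it sit anywhere in the line
    static boolean wrappedMatches(String line) {
        return line.matches(".*(" + ALTERNATION + ").*");
    }

    // find() looks for the pattern anywhere, so no wrapper is needed
    static boolean viaFind(String line) {
        return Pattern.compile(ALTERNATION).matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "first stop on the route";
        System.out.println(bareMatches(line));    // false
        System.out.println(wrappedMatches(line)); // true
        System.out.println(viaFind(line));        // true
    }
}
```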
Licensed under: CC-BY-SA with attribution