Question

I did an exercise with JFlex that counts the words in an input text file that contain more than 3 vowels. What I ended up doing was defining a token for a word and then writing a Java function that receives the matched text as input and checks each character. If it is a vowel, I increment a vowel counter; then I check whether that counter is greater than 3, and if it is, I increment the word counter.

What I want to know is whether there is a regexp that could match a word with more than 3 vowels. I think it would be a cleaner solution. Thanks in advance.

tokens

   Letra = [a-zA-Z]
   Palabra = {Letra}+

Solution

Very simple. Use this if you want to check that a word contains at least 3 vowels (for more than 3, as in the question, change {3} to {4}):

(?i)(?:[a-z]*[aeiou]){3}[a-z]*

You only care that the word contains at least 3 vowels, so the rest can be any alphabetical characters. The regex above works both with String.matches and in a Matcher loop, since a valid word (at least 3 vowels) cannot be a substring of an invalid word (fewer than 3 vowels).
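
As a rough illustration of both usages, here is a minimal sketch. The class name VowelWordDemo, the helper countWords and the sample sentence are made up for this example; only the pattern itself comes from the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VowelWordDemo {
    // Pattern from the answer: a run of letters containing at least 3 vowels.
    static final Pattern AT_LEAST_3 =
            Pattern.compile("(?i)(?:[a-z]*[aeiou]){3}[a-z]*");

    // Hypothetical helper: counts the words in text that have at least 3 vowels.
    static int countWords(String text) {
        Matcher m = AT_LEAST_3.matcher(text);
        int count = 0;
        while (m.find()) {
            count++; // each match is one whole word with 3 or more vowels
        }
        return count;
    }

    public static void main(String[] args) {
        // Whole-string check, as with String.matches
        System.out.println("Education".matches(AT_LEAST_3.pattern())); // true
        System.out.println("hello".matches(AT_LEAST_3.pattern()));     // false

        // Matcher loop over a sentence: "education" and "beautiful" qualify
        System.out.println(countWords("An education is a beautiful thing")); // 2
    }
}
```

The trailing [a-z]* is greedy, so each match in the loop extends to the end of the word and no word is counted twice.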


Beyond the scope of the question: for consonants, you can use character class intersection, a feature of Java regex: [a-z&&[^aeiou]]. So if you want to check for exactly 3 vowels (with String.matches):

(?i)(?:[a-z&&[^aeiou]]*[aeiou]){3}[a-z&&[^aeiou]]*

If you are using this in a Matcher loop:

(?i)(?<![a-z])(?:[a-z&&[^aeiou]]*[aeiou]){3}[a-z&&[^aeiou]]*(?![a-z])

Note that I have to use look-arounds to make sure that the matched string (exactly 3 vowels) is not part of a longer, invalid word (which can happen when the word has more than 3 vowels).
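
To make the difference concrete, here is a small sketch that runs both exactly-3 patterns in a Matcher loop. The class name ExactVowelDemo, the helper findWords and the sample text are assumptions for illustration; the patterns are the two given above, built from a shared consonant class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExactVowelDemo {
    // Consonants via character class intersection: letters that are not vowels.
    private static final String CONS = "[a-z&&[^aeiou]]";

    // Whole-string form, intended for String.matches.
    static final Pattern EXACT_3 = Pattern.compile(
            "(?i)(?:" + CONS + "*[aeiou]){3}" + CONS + "*");

    // Same pattern with look-arounds, intended for a Matcher loop.
    static final Pattern EXACT_3_WORD = Pattern.compile(
            "(?i)(?<![a-z])(?:" + CONS + "*[aeiou]){3}" + CONS + "*(?![a-z])");

    // Hypothetical helper: collects every match of p in text.
    static List<String> findWords(Pattern p, String text) {
        List<String> words = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String text = "banana education";
        // Without look-arounds, a fragment of "education" is wrongly matched.
        System.out.println(findWords(EXACT_3, text));      // [banana, educat]
        // With look-arounds, only the word with exactly 3 vowels is found.
        System.out.println(findWords(EXACT_3_WORD, text)); // [banana]
    }
}
```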

Other tips

Since you already wrote a Java method yourself, you could do the counting there as follows:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VowelChecker {
    private static final Pattern vowelRegex = Pattern.compile("[aeiouAEIOU]");

    public static void main(String[] args) {
        System.out.println(checkVowelCount("aeiou", 3));
        System.out.println(checkVowelCount("AEIWW", 3));
        System.out.println(checkVowelCount("HeLlO", 3));
    }

    private static boolean checkVowelCount(String str, int threshold) {
        Matcher matcher = vowelRegex.matcher(str);
        int count = 0;
        while (matcher.find()) {
            if (++count > threshold) {
                return true;
            }
        }
        return false;
    }

}

Here threshold is the number of vowels a word must exceed (you are looking for more than 3 vowels, hence the 3 passed in the main method). The output is as follows:

true
false
false

Hope this helps!

Thanks,
EG

I ended up using this regexp that I came up with. If anyone has a better one, feel free to post it.

   Cons = [bcdBCDfghFGHjklmnJKLMNpqrstPQRSTvwxyzVWXYZ]
   Vocal = [aeiouAEIOU]
   Match = {Cons}*{Vocal}{Cons}*{Vocal}{Cons}*{Vocal}{Cons}*{Vocal}({Cons}*{Vocal}*|{Vocal}*{Cons}*) | {Vocal}{Cons}*{Vocal}{Cons}*{Vocal}{Cons}*{Vocal}({Cons}*{Vocal}*|{Vocal}*{Cons}*)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow