Regex to recognize the indefinite article "a" or "an" using Java

https://stackoverflow.com/questions/9401866

29-10-2019

Question

My task is to design a regular expression that recognizes the English indefinite article, that is, the word "a" or the word "an". I have to test the expression by writing a test driver that reads a file containing about ten lines of text. The program should count the occurrences of the words "a" and "an". It must not match the characters "a" and "an" inside other words, such as "than".

This is my code so far:

import java.io.IOException;
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexeFindText {
   public static void main(String[] args) throws IOException {

      // Input for matching the regexe pattern
      String file_name = "Testing.txt";

      ReadFile file = new ReadFile(file_name);
      String[] aryLines = file.OpenFile();
      String asString = Arrays.toString(aryLines);

      // Regexe to be matched
      String regexe = ""; //<<--this is where the problem lies

      for (int i = 0; i < aryLines.length; i++) {
         System.out.println(aryLines[i]);
      }

      // Step 1: Allocate a Pattern object to compile a regexe
      Pattern pattern = Pattern.compile(regexe);
      // For case-insensitive matching, use instead:
      //Pattern pattern = Pattern.compile(regexe, Pattern.CASE_INSENSITIVE);

      // Step 2: Allocate a Matcher object from the compiled regexe pattern,
      //         and provide the input to the Matcher
      Matcher matcher = pattern.matcher(asString);

      // Step 3: Perform the matching and process the matching result

      // Use method find()
      while (matcher.find()) {     // find the next match
         System.out.println("find() found the pattern \"" + matcher.group()
               + "\" starting at index " + matcher.start()
               + " and ending at index " + matcher.end());
      }

      // Use method matches()
      if (matcher.matches()) {
         System.out.println("matches() found the pattern \"" + matcher.group()
               + "\" starting at index " + matcher.start()
               + " and ending at index " + matcher.end());
      } else {
         System.out.println("matches() found nothing");
      }

      // Use method lookingAt()
      if (matcher.lookingAt()) {
         System.out.println("lookingAt() found the pattern \"" + matcher.group()
               + "\" starting at index " + matcher.start()
               + " and ending at index " + matcher.end());
      } else {
         System.out.println("lookingAt() found nothing");
      }
   }
}

My question is simple: what should I use to find these words in my text? Any help would be greatly appreciated, thanks!

Solution

Here is the regex that will match "a" or "an":

String regex = "\\ban?\\b";

Let's break this regex down:

\b means a word boundary (a single backslash is written as "\\" in a Java string literal)
a is simply a literal "a"
n? means zero or one literal "n"
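
To tie this back to the counting requirement in the question, here is a minimal sketch of how the pattern could be used to count matches. The class name ArticleCounter, the method countArticles, and the sample lines are made up for illustration; the lines stand in for the contents of the text file. Pattern.CASE_INSENSITIVE is added on the assumption that "An" at the start of a sentence should count as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ArticleCounter {
    // \b = word boundary, a = literal "a", n? = optional "n".
    // CASE_INSENSITIVE is an assumption: it lets "A" and "An" match too.
    private static final Pattern ARTICLE =
            Pattern.compile("\\ban?\\b", Pattern.CASE_INSENSITIVE);

    // Counts standalone occurrences of "a" or "an" in the given text.
    static int countArticles(String text) {
        Matcher matcher = ARTICLE.matcher(text);
        int count = 0;
        while (matcher.find()) {   // each successful find() is one article
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines standing in for the file contents.
        String[] lines = {
            "An apple a day keeps the doctor away.",
            "Bananas are sweeter than a lemon."
        };
        int total = 0;
        for (String line : lines) {
            total += countArticles(line);
        }
        System.out.println("Articles found: " + total); // prints "Articles found: 3"
    }
}
```

Words such as "than", "apple", and "away" are skipped because the "a" in them is not surrounded by word boundaries on both sides. Note that \b also treats an apostrophe or hyphen as a boundary, so the "a" in "a-list" would be counted.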

Licensed under: CC-BY-SA with attribution
