Question

My method takes a file and tries to extract the text between the header ###Title### and the closing ###---###. I need it to extract multiple lines and put each line into an array. But since readAllLines() returns the file as a list of lines rather than a single string, I don't know how to match the pattern against it.

public static ArrayList<String> getData(File f, String title) throws IOException {
    ArrayList<String> input = (ArrayList<String>) Files.readAllLines(f.toPath(), StandardCharsets.US_ASCII);
    ArrayList<String> output = new ArrayList<String>();

    //String? readLines = somehow make it possible to match
    System.out.println("Checking entry.");

    Pattern p = Pattern.compile("###" + title + "###(.*)###---###", Pattern.DOTALL);
    Matcher m = p.matcher(readLines);
    if (m.matches()) {
        m.matches();
        String matched = m.group(1);
        System.out.println("Contents: " + matched);
        String[] array = matched.split("\n");
        ArrayList<String> array2 = new ArrayList<String>();
        for (String j:array) {
            array2.add(j);
        }
        output = array2;
    } else {
        System.out.println("No matches.");
    }
    return output;
}

Here is my file, and I'm 100% sure that the program is reading the correct one.

###Test File###
Entry 1
Entry 2
Data 1
Data 2
Test 1
Test 2
###---###

The output says "No matches." instead of the entries.

Solution

You don't need a regex for that. It's enough to loop through the list line by line and keep the lines that fall between the start and end tags.

// readAllLines returns a List<String>, so no cast to ArrayList is needed
List<String> input = Files.readAllLines(f.toPath(), StandardCharsets.US_ASCII);
ArrayList<String> output = new ArrayList<String>();

boolean matched = false;
for (String line : input) {
    // closing tag: stop collecting
    if (line.equals("###---###") && matched) matched = false;
    // inside the section: keep the line
    if (matched) output.add(line);
    // header (built from the method's title parameter): start collecting from the next line
    if (line.equals("###" + title + "###") && !matched) matched = true;
}
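
For completeness, here is one way the loop above could be wrapped into a full method. This is only a sketch: the class name SectionReader and the helper extract are made up for the example, and unlike the loop above it stops at the first closing tag instead of also collecting later sections with the same title.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is not from the question.
class SectionReader {

    // Returns the lines strictly between "###<title>###" and the next "###---###".
    // Returns an empty list if the header is never found.
    static List<String> extract(List<String> lines, String title) {
        String header = "###" + title + "###";
        List<String> output = new ArrayList<>();
        boolean inside = false;
        for (String line : lines) {
            if (inside) {
                if (line.equals("###---###")) {
                    break; // closing tag: stop at the end of the first matching section
                }
                output.add(line);
            } else if (line.equals(header)) {
                inside = true; // header found: collect from the next line on
            }
        }
        return output;
    }

    // File-based wrapper, mirroring getData(File, String) from the question.
    static List<String> getData(File f, String title) throws IOException {
        return extract(Files.readAllLines(f.toPath(), StandardCharsets.US_ASCII), title);
    }
}
```

With the file from the question, SectionReader.getData(f, "Test File") should give the six lines from "Entry 1" to "Test 2". Keeping the loop in a method that takes a List<String> also makes it easy to test without touching the disk.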

OTHER TIPS

As per your comment, if the files will always look like the one you posted, then I don't think a regex is needed for this requirement. You can read the file line by line and skip every line that contains '###'.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;

// The class name is arbitrary
public class ReadEntries {
  public static void main(String args[]) {
    ArrayList<String> dataList = new ArrayList<String>();
    // Open the file; try-with-resources closes the reader even if reading fails
    try (BufferedReader br = new BufferedReader(new FileReader("textfile.txt"))) {
      String strLine;
      // Read the file line by line
      while ((strLine = br.readLine()) != null) {
        // Skip the header and footer lines, which contain '###'
        if (!strLine.contains("###")) {
          dataList.add(strLine);
        }
      }
    } catch (IOException e) { // Catch exception if any
      System.err.println("Error: " + e.getMessage());
    }
    // Now dataList has every line of the file except the '###' marker lines
  }
}

You can also change the argument passed to contains() to control which lines are ignored.
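
If you would still rather keep the regex from the question, one possible approach (a sketch, not the only way) is to join the lines back into a single string first, because readAllLines() strips the line terminators. Note also that matches() only succeeds when the whole input matches the pattern, so find() is the safer call when the file holds anything besides that one section. The class name RegexSectionReader and the method extract are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the question.
class RegexSectionReader {

    static List<String> extract(List<String> lines, String title) {
        // Put "\n" back between the lines so the pattern can span several of them.
        String text = String.join("\n", lines);
        // Pattern.quote keeps regex metacharacters in the title literal.
        // (.*?) is reluctant, so the match ends at the first closing tag.
        // MULTILINE lets ^ and $ anchor at line boundaries; DOTALL lets . cross "\n".
        Pattern p = Pattern.compile(
                "^###" + Pattern.quote(title) + "###\\n(.*?)^###---###$",
                Pattern.DOTALL | Pattern.MULTILINE);
        Matcher m = p.matcher(text);
        if (!m.find()) {
            return new ArrayList<>(); // no such section
        }
        String body = m.group(1);
        if (body.isEmpty()) {
            return new ArrayList<>(); // header directly followed by the closing tag
        }
        // A non-empty body always ends with "\n"; drop it before splitting
        // so the result has no extra empty element at the end.
        String trimmed = body.substring(0, body.length() - 1);
        return new ArrayList<>(Arrays.asList(trimmed.split("\n", -1)));
    }
}
```

Here the input would be the list returned by Files.readAllLines(...). For a fixed file layout like yours the plain loop is simpler, so I'd only reach for this when the markers get more complicated.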

Licensed under: CC-BY-SA with attribution