Question

I'm reading a file path from a string inside a CSV file. For some reason, the program that generates the CSV removes the colon from the string, so I end up with a file path that does not work in Java. The typical output is /x/Rest/Of/Path, where x is the drive letter, but occasionally the path starts with x/ instead of /x/. I need to add a colon after the drive letter if there isn't one already, changing either /x/ or x/ to x:/. I'm sure this can be done with a regex, but I'm still learning the basics of regex, so I'm not sure how to write it. Thanks in advance for any help.

Solution

Here, try this, and study it to learn how it works:

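import java.util.regex.Matcher;
import java.util.regex.Pattern;

String path = "/C/Rest/Of/Path";
Pattern p = Pattern.compile("^(/?[CDEFGH])/");
Matcher m = p.matcher(path);
String pathWithColon = m.replaceAll("$1:/");
// pathWithColon is now "/C:/Rest/Of/Path"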

Here's a guide:

  1. The ^ is known as an anchor. It matches the very beginning of the string. Without it, this regex would also match /foo/C/Rest/Of/Path, and we don't want that.
  2. The ? can mean various things, depending on where it appears. If it doesn't immediately follow an open parenthesis (, doesn't immediately follow a quantifier (*, +, another ?, {n}, or {m,n}), doesn't appear inside a character class [], and isn't escaped as \?, then it is a quantifier meaning "0 or 1 of the previous entity," in this case the /. Think of it as the "optional" operator.
  3. The [CDEFGH] is known as a character class. It means, "Any one of these characters." You can negate a character class like so: [^CDEFGH]; this would mean, "Any one character but not these." If you would like to accept any capital letter, then you could use a range: [A-Z]. If you would like to accept any letter, then: [a-zA-Z].
  4. The parentheses surrounding most of the regex are known as a capturing group or capture group. The group "saves" whatever is "caught" in between.
  5. During replacement, you can refer to "saved" (captured) groups by $1, $2, $3, and so on. (So, you can capture more than one group; each capturing group is numbered by the order of its opening parenthesis.) In the above example, note that I captured the /? as well, so if the slash existed, then it would exist in the output too, and if not, then not.
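To see the pieces of the guide working together, here is a minimal sketch that runs the pattern against a few inputs. The class and method names are made up for illustration. The second pattern is an assumed variant, not part of the original answer: it moves the optional slash outside the capturing group and accepts any letter, so the leading slash is dropped and the result has the x:/ form mentioned in the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the original answer.
class DriveColonDemo {
    // The answer's pattern: the optional slash is inside the capturing group, so it is kept.
    private static final Pattern KEEP_SLASH = Pattern.compile("^(/?[CDEFGH])/");
    // Assumed variant: the optional slash is outside the group, so it is dropped.
    private static final Pattern DROP_SLASH = Pattern.compile("^/?([A-Za-z])/");

    static String keepSlash(String path) {
        return KEEP_SLASH.matcher(path).replaceAll("$1:/");
    }

    static String dropSlash(String path) {
        return DROP_SLASH.matcher(path).replaceAll("$1:/");
    }

    public static void main(String[] args) {
        System.out.println(keepSlash("/C/Rest/Of/Path"));     // /C:/Rest/Of/Path
        System.out.println(keepSlash("C/Rest/Of/Path"));      // C:/Rest/Of/Path
        System.out.println(keepSlash("/foo/C/Rest/Of/Path")); // unchanged: ^ prevents a match
        System.out.println(dropSlash("/C/Rest/Of/Path"));     // C:/Rest/Of/Path
        System.out.println(dropSlash("C:/Rest/Of/Path"));     // unchanged: colon already present
    }
}
```

Because the variant accepts any single letter, a relative path such as a/b would also be rewritten to a:/b. If the set of drive letters is known, a narrower character class like the original [CDEFGH] is probably the safer choice.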

Happy learning!

EDIT

I should have started with a simpler approach. My apologies. This works as well:

String path = "/C/Rest/Of/Path";
path = path.replaceAll("^(/?[CDEFGH])/", "$1:/");

The compiled pattern in the first example is purely an efficiency measure. For example, if you were going to fix an array of 10,000 paths, you'd compile the pattern once, then use a matcher to do the replacement for each path in a loop. (String.replaceAll compiles the regex again on every call, so without a precompiled pattern the engine ends up parsing it from scratch for each path encountered.)
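A rough sketch of that loop might look like the following. The class name, method name, and sample paths are hypothetical; only the pattern and replacement string come from the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative, not from the original answer.
class BatchPathFixer {
    // Compiled once and reused for every path. Pattern instances are immutable
    // and safe to share; each call to matcher() creates a fresh Matcher.
    private static final Pattern DRIVE_PREFIX = Pattern.compile("^(/?[CDEFGH])/");

    static List<String> addColons(List<String> paths) {
        List<String> fixed = new ArrayList<>(paths.size());
        for (String path : paths) {
            fixed.add(DRIVE_PREFIX.matcher(path).replaceAll("$1:/"));
        }
        return fixed;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for paths read from the CSV file.
        List<String> input = Arrays.asList("/C/Rest/Of/Path", "D/Other/Path");
        for (String path : addColons(input)) {
            System.out.println(path);
        }
    }
}
```

For a handful of paths, the one-line String.replaceAll version is likely just as good; the precompiled pattern only starts to matter when the same regex is applied many times.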

Licensed under: CC-BY-SA with attribution