In a single regex (plus a small patch for tabs):
    input.replaceAll("^\\s+|\\s+$|\\s*(\n)\\s*|(\\s)\\s*", "$1$2")
         .replace("\t", " ");
The regex looks daunting, but in fact decomposes nicely into these parts that are OR-ed together:
^\s+ – match whitespace at the beginning;
\s+$ – match whitespace at the end;
\s*(\n)\s* – match whitespace containing a newline, and capture that newline;
(\s)\s* – match whitespace, capturing the first whitespace character.
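Every match has two capture groups, but at most one of them is non-empty for any given match (both are empty for leading and trailing whitespace). This allows me to replace the match with "$1$2", which means "concatenate the two capture groups"; in Java, a group that did not take part in the match contributes nothing to the replacement.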
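To make the role of the two groups concrete, here is a minimal sketch that lists what "$1$2" would insert for each match. The class name CaptureGroupDemo and the helper replacements are hypothetical names used only for illustration; the pattern is the one from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the helper are illustrative only.
final class CaptureGroupDemo {
    // Same pattern as in the answer.
    static final Pattern WS = Pattern.compile("^\\s+|\\s+$|\\s*(\n)\\s*|(\\s)\\s*");

    // For each match, returns the text that "$1$2" would put in its place.
    static List<String> replacements(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = WS.matcher(input);
        while (m.find()) {
            // group(n) returns null when that group took no part in the match.
            String g1 = m.group(1) == null ? "" : m.group(1);
            String g2 = m.group(2) == null ? "" : m.group(2);
            out.add(g1 + g2);
        }
        return out;
    }

    public static void main(String[] args) {
        // Make newlines and tabs visible when printing.
        for (String r : replacements("  a \n b\t\tc  ")) {
            System.out.println("[" + r.replace("\n", "\\n").replace("\t", "\\t") + "]");
        }
    }
}
```

For the sample input, the leading and trailing runs yield an empty replacement, the run containing the newline yields just the newline, and the run of two tabs yields a single tab.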
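The only remaining problem is that this approach cannot turn a tab into a space: when a run of whitespace starts with a tab, the tab is the character that gets captured and put back. I fix that up afterwards with a simple non-regex character replacement.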
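Putting both steps together, a small self-contained sketch might look like the following. The class name WhitespaceNormalizer and the method name normalize are assumptions made for this example, not part of the original snippet.

```java
// Hypothetical wrapper around the expression from the answer.
final class WhitespaceNormalizer {
    static String normalize(String input) {
        return input
                // Trim both ends, collapse runs containing a newline to "\n",
                // and collapse every other run to its first whitespace character.
                .replaceAll("^\\s+|\\s+$|\\s*(\n)\\s*|(\\s)\\s*", "$1$2")
                // A run that started with a tab is still a tab; make it a space.
                .replace("\t", " ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("  hello \t  world  "));        // hello world
        System.out.println(normalize("a\t\tb"));                     // a b
        System.out.println(normalize("line one  \n\n   line two"));  // two lines, no blank line between
    }
}
```

Note that a run containing several newlines collapses to a single newline, so blank lines are removed as well.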