How do I remove trailing whitespace using a regular expression?
-
03-12-2019 - |
سؤال
I want to remove trailing white spaces and tabs from my code without removing empty lines.
I tried:
\s+$
and:
([^\n]*)\s+\r\n
But they all removed empty lines too. I guess \s
matches end-of-line characters too.
UPDATE (2016):
Nowadays I automate such code cleaning by using Sublime's TrailingSpaces package, with custom/user setting:
"trailing_spaces_trim_on_save": true
It highlights trailing white spaces and automatically trims them on save.
المحلول
Try just removing trailing spaces and tabs:
[ \t]+$
نصائح أخرى
To remove trailing whitespace while also preserving whitespace-only lines, you want the regex to only remove trailing whitespace after non-whitespace characters. So you need to first check for a non-whitespace character. This means that the non-whitespace character will be included in the match, so you need to include it in the replacement.
Regex: ([^ \t\r\n])[ \t]+$
Replacement: \1
or $1
, depending on the IDE
The platform is not specified, but in C# (.NET) it would be:
Regular expression (presumes the multiline option - the example below uses it):
[ \t]+(\r?$)
Replacement:
$1
For an explanation of "\r?$", see Regular Expression Options, Multiline Mode (MSDN).
Code example
This will remove all trailing spaces and all trailing TABs in all lines:
string inputText = " Hello, World! \r\n" +
" Some other line\r\n" +
" The last line ";
string cleanedUpText = Regex.Replace(inputText,
@"[ \t]+(\r?$)", @"$1",
RegexOptions.Multiline);
Regex to find trailing and leading whitespaces:
^[ \t]+|[ \t]+$
If using Visual Studio 2012 and later (which uses .NET regular expressions), you can remove trailing whitespace without removing blank lines by using the following regex
Replace (?([^\r\n])\s)+(\r?\n)
With $1
Some explanation
The reason you need the rather complicated expression is that the character class \s
matches spaces, tabs and newline characters, so \s+
will match a group of lines containing only whitespace. It doesn't help adding a $
termination to this regex, because this will still match a group of lines containing only whitespace and newline characters.
You may also want to know (as I did) exactly what the (?([^\r\n])\s)
expression means. This is an Alternation Construct, which effectively means match to the whitespace character class if it is not a carriage return or linefeed.
Alternation constructs normally have a true and false part,
(?( expression ) yes | no )
but in this case the false part is not specified.
To remove trailing white space while ignoring empty lines I use positive look-behind:
(?<=\S)\s+$
The look-behind is the way go to exclude the non-whitespace (\S) from the match.
[ |\t]+$
with an empty replace works. \s+($)
with a $1
replace also works.
At least in Visual Studio Code...
In Java:
String str = " hello world ";
// prints "hello world"
System.out.println(str.replaceAll("^(\\s+)|(\\s+)$", ""));