I'm trying to use tr with multiple sets and not sure how
Question
I have used:
tr -dc '[:alpha:]' < "$fileDoc" | wc -c
to count all letters,
tr -dc ' ' < "$fileDoc" | wc -c
to count all spaces,
tr -dc '\n' < "$fileDoc" | wc -c
to count all new lines in a text document.
What I would like to do now is count all the other characters in the document, which I will call "everything else".
Here is the text from the document:
Hello this is a test text document.
123
!@#
Is there a way to delete every [:alpha:], ' ', and \n found and count the remaining characters?
Solution
This should do the trick:
tr -d '[:alpha:] \n' < "$fileDoc" | wc -c
Or, if you also want tabs and other whitespace to count as blanks:
tr -d '[:alpha:][:space:]' < "$fileDoc" | wc -c
Based on the OP's comment, to delete letters, spaces, digits, and newlines and count all remaining characters:
tr -d '[:alnum:][:space:]' < "$fileDoc" | wc -c
[:alnum:] accounts for letters of the alphabet and digits. [:space:] takes care of all whitespace, including newlines.
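As a quick sanity check, the tr commands can be run against the sample text from the question. This is only a sketch: it assumes the document is exactly the three lines shown, each ending in a newline, and the temporary file and variable names are made up for illustration. LC_ALL=C is set so that tr and wc both work byte by byte.

```shell
#!/bin/sh
# Sketch only: the file content and variable names are assumptions
# made for illustration, not part of the original question.
LC_ALL=C
export LC_ALL

# Recreate the question's sample text in a temporary file.
fileDoc=$(mktemp)
printf 'Hello this is a test text document.\n123\n!@#\n' > "$fileDoc"

# tr -cd SET keeps only the characters in SET; tr -d SET removes them.
# $(( ... )) strips the padding that some wc implementations print.
letters=$(( $(tr -cd '[:alpha:]' < "$fileDoc" | wc -c) ))
spaces=$(( $(tr -cd ' ' < "$fileDoc" | wc -c) ))
newlines=$(( $(tr -cd '\n' < "$fileDoc" | wc -c) ))

# "Everything else": delete letters, spaces, and newlines in one set.
other=$(( $(tr -d '[:alpha:] \n' < "$fileDoc" | wc -c) ))

# Stricter variant: also delete digits and all other whitespace.
symbols=$(( $(tr -d '[:alnum:][:space:]' < "$fileDoc" | wc -c) ))

total=$(( $(wc -c < "$fileDoc") ))

echo "letters=$letters spaces=$spaces newlines=$newlines other=$other symbols=$symbols total=$total"
rm -f "$fileDoc"
```

On this input, letters, spaces, newlines, and other should add up to total, since every byte falls into exactly one of those four sets.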
OTHER TIPS
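Just posting this for reference: if you wish to do it all in one shot, this awk script should work. It relies on an empty FS splitting each line into single characters, which gawk does but POSIX leaves unspecified: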
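awk -v FS='' '
{
    # With an empty FS, each character of the line is its own field.
    for (i = 1; i <= NF; i++) {
        if ($i ~ /[a-zA-Z]/) { alpha++ }        # letters
        if ($i == " ") { space++ }              # spaces
        if ($i !~ /[A-Za-z0-9 ]/) { spl++ }     # not a letter, digit, or space
    }
}
END {
    # NR is the number of lines read, used here as the newline count.
    printf "Space=%s, Alphabets=%s, SplChars=%s, NewLines=%s\n", space, alpha, spl, NR
}' file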
$ cat file
This is a text
I want to count
alot of $tuff
in 1 single shot
$ awk -v FS='' '
{
    for (i = 1; i <= NF; i++) {
        if ($i ~ /[a-zA-Z]/) { alpha++ }
        if ($i == " ") { space++ }
        if ($i !~ /[A-Za-z0-9 ]/) { spl++ }
    }
}
END {
    printf "Space=%s, Alphabets=%s, SplChars=%s, NewLines=%s\n", space, alpha, spl, NR
}' file
Space=11, Alphabets=45, SplChars=1, NewLines=4
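For awk implementations where an empty FS does not split into characters, the same four counts can be reproduced with plain tr and wc. This is a minimal sketch, assuming the four-line sample file shown above; the temporary file and variable names are illustrative, and LC_ALL=C makes [:alpha:] and [:alnum:] match the ASCII ranges used in the awk patterns.

```shell
#!/bin/sh
# Sketch only: cross-check of the awk counts using tr and wc.
# File content and variable names are assumptions for illustration.
LC_ALL=C
export LC_ALL

# Recreate the sample file; single quotes keep $tuff literal.
file=$(mktemp)
printf '%s\n' 'This is a text' 'I want to count' 'alot of $tuff' 'in 1 single shot' > "$file"

space=$(( $(tr -cd ' ' < "$file" | wc -c) ))
alpha=$(( $(tr -cd '[:alpha:]' < "$file" | wc -c) ))
# Same definition as the awk test: not a letter, digit, or space.
# Newlines are deleted too, because awk never sees them as fields.
spl=$(( $(tr -d '[:alnum:] \n' < "$file" | wc -c) ))
newlines=$(( $(tr -cd '\n' < "$file" | wc -c) ))

summary="Space=$space, Alphabets=$alpha, SplChars=$spl, NewLines=$newlines"
echo "$summary"
rm -f "$file"
```

This reads the file four times instead of once, so it trades the single pass of the awk version for portability.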