Question

I want to compare two string variables and print the characters that appear in both. I'm not sure how to do this. I was thinking of using comm or diff, but I don't know the right parameters to print only the matching characters. Also, the documentation says they take files as input, and these are strings. Can anyone help?

Input:

a=$(echo "abghrsy")
b=$(echo "cgmnorstuvz")

Output:

"grs"

Solution 2

Use Character Classes with GNU Grep

This isn't a widely applicable solution, but it fits your particular use case quite well. The idea is to use the first variable as a character class (a bracket expression) and match it against the second string. For example:

a='abghrsy'
b='cgmnorstuvz'
echo "$b" | grep --only-matching "[$a]" | xargs | tr --delete ' '

This produces grs as you expect. Note that the use of xargs and tr is simply to remove the newlines and spaces from the output; you can certainly handle this some other way if you prefer.
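
One caveat with the bracket expression: if $a contains characters that are special inside brackets (such as ], ^, or -), the pattern changes meaning. A hedged sketch that sidesteps this is to split both strings into one character per line and match them as fixed strings. It assumes Bash (for process substitution), a grep that supports -F, -x, and -f, and single-byte characters; the function name common_chars_grep is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the characters of $2 that also occur in $1.
common_chars_grep() {
  # fold -w1 writes one character per line (assumes single-byte characters).
  # grep -F -x matches each line of $2 literally against the lines of $1.
  printf '%s\n' "$2" | fold -w1 |
    grep -F -x -f <(printf '%s\n' "$1" | fold -w1) |
    tr -d '\n'
  echo
}

common_chars_grep 'abghrsy' 'cgmnorstuvz'
```

Like the grep -o version above, this keeps the order of the second string and repeats a character if it occurs there more than once.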

Set Intersection

What you're really looking for is a set intersection, though. While you can "wing it" in the shell, you'd be better off using a language like Ruby, Python, or Perl to do this.
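
For comparison, here is roughly what "winging it" in the shell might look like. This is only a sketch, assuming Bash (it uses ${var:offset:length} and [[ ]]); the function name common_chars is hypothetical. It prints each common character once, in the order of the first string.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each character common to both strings once,
# in the order it first appears in the first string.
common_chars() {
  local a=$1 b=$2 out='' c i
  for (( i = 0; i < ${#a}; i++ )); do
    c=${a:i:1}
    # Keep c if it occurs in b and has not been emitted already.
    # Quoting "$c" makes the pattern match it literally.
    if [[ $b == *"$c"* && $out != *"$c"* ]]; then
      out+=$c
    fi
  done
  printf '%s\n' "$out"
}

common_chars 'abghrsy' 'cgmnorstuvz'
```

The membership test rescans the second string for every character, which is fine for short strings but is one reason a language with real set operations tends to be more comfortable.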

A Ruby One-Liner

If you need to integrate with an existing shell script, a simple Ruby one-liner that uses Bash variables could be called like this inside your current script:

a='abghrsy'
b='cgmnorstuvz'
ruby -e "puts ('$a'.split(//) & '$b'.split(//)).join"

A Ruby Script

You could certainly make things more elegant by doing the whole thing in Ruby instead.

string1_chars = 'abghrsy'.split //
string2_chars = 'cgmnorstuvz'.split //
intersection  = string1_chars & string2_chars
puts intersection.join

This certainly seems more readable and robust to me, but your mileage may vary. At least now you have some options to choose from.

OTHER TIPS

You don't need to do that much work to assign the $a and $b shell variables; you can just write:

a=abghrsy
b=cdgmrstuvz

Now, there is a classic computer science problem called the longest common subsequence [1] that is similar to yours.

However, if you just want the common characters, one way would be to let Ruby do the work:

$ ruby -e "puts ('$a'.chars.to_a & '$b'.chars.to_a).join"

[1] Not to be confused with the longest common substring problem, which is a different problem.

Nice question +1.

You can use an awk trick to get this done.

a=abghrsy
b=cdgmrstuvz
comm -12 <(echo $a|awk -F"\0" '{for (i=1; i<=NF; i++) print $i}') <(echo $b|awk -F"\0" '{for (i=1; i<=NF; i++) print $i}')|tr -d '\n'

OUTPUT:

grs

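Note the use of awk -F"\0", which breaks the input string character by character into separate awk fields. Whether "\0" is treated this way depends on the awk implementation, so check it with your awk. The rest is a pretty straightforward use of comm and tr.

PS: comm expects sorted input. If your input string is not sorted, you need to pipe awk's output through sort, or sort the array inside awk.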

UPDATE: awk only solution (without comm):

echo "$a;$b" | awk -F"\0" '{scnd=0; for (i=1; i<=NF; i++) {if ($i!=";") {if (!scnd) arr1[$i]=$i; else if ($i in arr1) arr2[$i]=$i} else scnd=1}} END { for (a in arr2) printf("%s", a)}'

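This assumes a semicolon doesn't appear in your strings (you can use any other separator character if it does). Note that awk's for (a in arr2) loop visits elements in an unspecified order, so the characters may not be printed in their original order.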
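
If the separator or the output order is a concern, one possible variation is to pass both strings with -v and walk them with substr. This is a sketch rather than a drop-in replacement: the function name common_chars_awk and the array names in_b and printed are hypothetical, and because awk processes backslash escapes in -v values, it assumes the strings contain no backslashes.

```shell
# Hypothetical helper: pass both strings to awk with -v instead of joining
# them with a separator; prints common characters in first-string order.
common_chars_awk() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    # Record every character of the second string.
    for (i = 1; i <= length(b); i++) in_b[substr(b, i, 1)] = 1
    # Walk the first string in order, printing each common character once.
    for (i = 1; i <= length(a); i++) {
      c = substr(a, i, 1)
      if ((c in in_b) && !(c in printed)) {
        printf "%s", c
        printed[c] = 1
      }
    }
    print ""
  }'
}

common_chars_awk 'abghrsy' 'cgmnorstuvz'
```

Because everything happens in a BEGIN block, awk reads no input, and a semicolon in either string is treated like any other character.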

UPDATE 2: I think the simplest solution is to use grep -o

(thanks to answer from @CodeGnome)

echo "$b" | grep -o "[$a]" | tr -d '\n'

Using GNU coreutils (inspired by @DigitalRoss):

a="abghrsy"
b="cgmnorstuvz"

echo "$(comm -12 <(echo "$a" | fold -w1 | sort | uniq) <(echo "$b" | fold -w1 | sort | uniq) | tr -d '\n')"

This prints grs. I assumed you only want unique characters.

UPDATE: Modified for dash:

 #!/bin/dash

 # Reduce each argument to its sorted, unique characters.
 # printf '%s' keeps a % or \ in the arguments from being read as a format.
 string1=$(printf '%s' "$1" | fold -w1 | sort | uniq | tr -d '\n')
 unique2=$(printf '%s' "$2" | fold -w1 | sort | uniq | tr -d '\n')

 while [ -n "$string1" ]; do
   c1=$(printf '%s\n' "$string1" | cut -c 1)
   # Restart the scan of the second string for every character of the first.
   string2=$unique2
   while [ -n "$string2" ]; do
     c2=$(printf '%s\n' "$string2" | cut -c 1)
     if [ "$c1" = "$c2" ]; then
       printf '%s' "$c1"
     fi
     string2=$(printf '%s\n' "$string2" | cut -c 2-)
   done
   string1=$(printf '%s\n' "$string1" | cut -c 2-)
 done
 echo

Note: I am just a beginner. There might be a better way of doing this.
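
One possible refinement, sketched here under the assumption that spawning cut for every character is what you would like to avoid: the same loop can be written with POSIX parameter expansion and case only. The function name common_chars_sh is hypothetical, and its variables are global because POSIX sh has no local. Unlike the script above, it keeps the order of the first string instead of sorting.

```shell
#!/bin/sh
# Hypothetical helper using only POSIX shell built-ins (no fold, sort, or cut).
common_chars_sh() {
  rest=$1
  other=$2
  out=''
  while [ -n "$rest" ]; do
    tail=${rest#?}        # everything after the first character
    c=${rest%"$tail"}     # the first character itself
    rest=$tail
    case $other in
      *"$c"*)             # c occurs in the second string
        case $out in
          *"$c"*) ;;      # already emitted, skip it
          *) out=$out$c ;;
        esac
        ;;
    esac
  done
  printf '%s\n' "$out"
}

common_chars_sh 'abghrsy' 'cgmnorstuvz'
```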

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow