shell - Characters contained in both strings - edited

Question 1

Use Character Classes with GNU Grep

This isn't a widely applicable solution, but it fits your particular use case quite well. The idea is to use the first variable as a character class to match against the second string. For example:

a='abghrsy'
b='cgmnorstuvz'
echo "$b" | grep --only-matching "[$a]" | xargs | tr --delete ' '

This produces grs as you expect. Note that the use of xargs and tr is simply to remove the newlines and spaces from the output; you can certainly handle this some other way if you prefer.
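One thing to watch with the character-class approach: grep prints every matching character, so repeats in the second string are repeated in the output. Here is a minimal sketch of one way to drop them. It assumes a grep that supports -o, and that the first string contains nothing special inside brackets (such as ], ^, - or a backslash). The function name common_chars is hypothetical.

```shell
# Hypothetical helper, not part of the original answer.
# $1 supplies the character class, $2 is the string to scan.
common_chars() {
    printf '%s\n' "$2" |
        grep -o "[$1]" |       # one matching character per line
        awk '!seen[$0]++' |    # keep only the first occurrence of each
        tr -d '\n'             # join the characters back together
    echo                       # finish with a single newline
}

common_chars 'abghrsy' 'cgmnorstuvz'   # prints grs
common_chars 'abghrsy' 'ggrrss'        # prints grs, not ggrrss
```

The output follows the order of the second string, since that is the one grep scans.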

Set Intersection

What you're really looking for is a set intersection, though. While you can "wing it" in the shell, you'd be better off using a language like Ruby, Python, or Perl to do this.

A Ruby One-Liner

If you need to integrate with an existing shell script, a simple Ruby one-liner that uses Bash variables could be called like this inside your current script:

a='abghrsy'
b='cgmnorstuvz'
ruby -e "puts ('$a'.split(//) & '$b'.split(//)).join"

A Ruby Script

You could certainly make things more elegant by doing the whole thing in Ruby instead.

string1_chars = 'abghrsy'.split //
string2_chars = 'cgmnorstuvz'.split //
intersection  = string1_chars & string2_chars
puts intersection.join

This certainly seems more readable and robust to me, but your mileage may vary. At least now you have some options to choose from.

Question 2

You don't need to do that much work to assign the $a and $b shell variables; you can just...

a=abghrsy
b=cdgmrstuvz

Now, there is a classic computer science problem called the longest common subsequence¹ that is similar to yours.

However, if you just want the common characters, one way would be to let Ruby do the work...

$ ruby -e "puts ('$a'.chars.to_a & '$b'.chars.to_a).join"

¹ Not to be confused with the different longest common substring problem.

Question 3

Nice question +1.

You can use an awk trick to get this done.

a=abghrsy
b=cdgmrstuvz
comm -12 <(echo $a|awk -F"\0" '{for (i=1; i<=NF; i++) print $i}') <(echo $b|awk -F"\0" '{for (i=1; i<=NF; i++) print $i}')|tr -d '\n'

OUTPUT:

grs

Note the use of awk -F"\0", which breaks the input string character by character into separate awk fields. The rest is a pretty straightforward use of comm and tr.

PS: If your input strings are not sorted, you need to pipe awk's output to sort, or sort an array inside awk, because comm expects sorted input.
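For unsorted input, the sort step could be added along these lines. This is only a sketch: it needs bash for the process substitution, and it swaps the awk splitter for fold -w1, which is a substitution on my part. The helper name split_sorted is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: one character per line, sorted, duplicates removed.
split_sorted() {
    printf '%s\n' "$1" | fold -w1 | sort -u
}

a=ysrhgba       # same letters as abghrsy, deliberately out of order
b=zvutsrmgdc    # same letters as cdgmrstuvz, deliberately out of order

# comm -12 keeps only the lines present in both sorted inputs.
comm -12 <(split_sorted "$a") <(split_sorted "$b") | tr -d '\n'
echo            # prints grs
```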

UPDATE: awk only solution (without comm):

echo "$a;$b" | awk -F"\0" '{scnd=0; for (i=1; i<=NF; i++) {if ($i!=";") {if (!scnd) arr1[$i]=$i; else if ($i in arr1) arr2[$i]=$i} else scnd=1}} END { for (a in arr2) printf("%s", a)}'

This assumes semicolon doesn't appear in your string (you can use any other character if that's not the case).
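If the separator assumption is a problem, one possible alternative is to pass the two strings as separate awk variables and walk them with substr. This is a sketch, not a drop-in replacement: the function name common_chars_awk is hypothetical, and because awk -v interprets backslash escapes, it assumes the strings contain no backslashes.

```shell
# Hypothetical helper, not part of the original answer.
# Prints each character of $2 that also occurs in $1, once,
# in the order it first appears in $2.
common_chars_awk() {
    awk -v s1="$1" -v s2="$2" 'BEGIN {
        # Record every character of the first string.
        for (i = 1; i <= length(s1); i++)
            in_first[substr(s1, i, 1)] = 1

        # Emit characters of the second string that were recorded above.
        for (i = 1; i <= length(s2); i++) {
            c = substr(s2, i, 1)
            if ((c in in_first) && !(c in printed)) {
                printf "%s", c
                printed[c] = 1
            }
        }
        print ""
    }'
}

common_chars_awk abghrsy cdgmrstuvz   # prints grs
```

Unlike the for (a in arr2) loop, whose iteration order awk leaves unspecified, this version has a predictable output order.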

UPDATE 2: I think the simplest solution is using grep -o

(thanks to answer from @CodeGnome)

echo "$b" | grep -o "[$a]" | tr -d '\n'

Question 4

Using GNU coreutils (inspired by @DigitalRoss):

a="abghrsy"
b="cgmnorstuvz"

echo "$(comm -12 <(echo "$a" | fold -w1 | sort | uniq) <(echo "$b" | fold -w1 | sort | uniq) | tr -d '\n')"

This prints grs. I assumed you only want unique characters.

UPDATE: Modified for dash..

 #!/bin/dash

 # Unique, sorted characters of each argument, joined into one string.
 # printf '%s' keeps a % or \ in the input from being read as a format.
 string1=$(printf '%s' "$1" | fold -w1 | sort | uniq | tr -d '\n')
 string2=$(printf '%s' "$2" | fold -w1 | sort | uniq | tr -d '\n')

 while [ "$string1" != "" ]; do
   c1=$(printf '%s\n' "$string1" | cut -c 1-1)
   # Rebuild string2, because the inner loop consumes it.
   string2=$(printf '%s' "$2" | fold -w1 | sort | uniq | tr -d '\n')
   while [ "$string2" != "" ]; do
     c2=$(printf '%s\n' "$string2" | cut -c 1-1)
     if [ "$c1" = "$c2" ]; then
       printf '%s' "$c1"   # print the match without a newline
     fi
     string2=$(printf '%s\n' "$string2" | cut -c 2-)
   done
   string1=$(printf '%s\n' "$string1" | cut -c 2-)
 done
 echo

Note: I am just a beginner. There might be a better way of doing this.
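One possibly simpler variant for dash and other POSIX shells avoids the external commands and the nested loop by using parameter expansion and case pattern matching. Treat it as a sketch: the function name common_chars_sh is hypothetical, it assumes single-byte characters, and its variables are global because POSIX sh has no local.

```shell
#!/bin/sh

# Hypothetical helper, not part of the original answer.
# Prints each character of $1 that also occurs in $2, once,
# in the order it appears in $1.
common_chars_sh() {
    rest=$1
    out=
    while [ -n "$rest" ]; do
        c=${rest%"${rest#?}"}    # first character of $rest
        rest=${rest#?}           # drop that character
        case $2 in
            *"$c"*)              # the character occurs in the second string
                case $out in
                    *"$c"*) ;;           # already emitted, skip it
                    *) out=$out$c ;;
                esac
                ;;
        esac
    done
    printf '%s\n' "$out"
}

common_chars_sh abghrsy cgmnorstuvz   # prints grs
```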