Put the two tables in different projects (we'll call them Table1 and Table2).
In Table2, on the friends column:
- use "split multi-valued cells" to get each value on a separate row
- convert the visitors column to numbers (or conversely user_id in Table1 to string) so that both sides of the lookup have the same type
- use "add a new column based on this column" (name the new column validity) with the expression cross(cell,'Table1','user_id').length()
This will return 0 if there's no match, 1 if there's exactly one match, or N>1 if there are duplicates in Table1.
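To make the 0 / 1 / N behaviour concrete, here is a rough sketch in plain Python (outside OpenRefine) of what that count amounts to. The function name and the sample data are made up for illustration, and this only imitates the lookup; it is not how cross() is implemented. It also shows why the type conversion step matters: a string never equals a number.

```python
from collections import Counter

def match_counts(values, key_column):
    """For each value, count the rows in key_column that hold an equal value."""
    index = Counter(key_column)          # value -> number of rows with that value
    return [index[v] for v in values]    # a missing value counts as 0

# Hypothetical data: Table1's user_id column (102 is duplicated)
# and the friends values from Table2 after the split, still stored as strings.
table1_user_ids = [101, 102, 102, 103]
friends = ["101", "102", "999"]

# Strings compared against numbers: nothing matches.
print(match_counts(friends, table1_user_ids))                     # [0, 0, 0]

# After converting to numbers: one match, a duplicate, and a miss.
print(match_counts([int(f) for f in friends], table1_user_ids))   # [1, 2, 0]
```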
If you want the data back in the original format, set up a facet on the validity column to select the rows with a count of 0, blank out those bad values, and then use "join multi-valued cells" to reverse the split operation you did up front.
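The whole round trip (split, check, blank out, join) boils down to something like the following Python sketch. The helper name, the comma separator and the sample IDs are assumptions for illustration only; use whatever separator your friends column actually has.

```python
def keep_valid_friends(friends_cell, valid_ids, sep=","):
    """Drop every ID in a multi-valued cell that is not in valid_ids."""
    # "split multi-valued cells": one value per entry
    parts = [p.strip() for p in friends_cell.split(sep)]
    # facet on the match count and blank out the values with no match
    kept = [p for p in parts if p in valid_ids]
    # "join multi-valued cells": back to the original single-cell format
    return sep.join(kept)

# Hypothetical user_id values from Table1, kept as strings here
valid_ids = {"101", "102", "103"}

print(keep_valid_friends("101,999,103", valid_ids))   # 101,103
```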
I fixed some caching bugs with cross() for OpenRefine 2.6, so if cross() doesn't work in an older version, try stopping and restarting the Refine server.