linux bash cut first word from each line of a file, assign it to an array and remove duplicates

StackOverflow https://stackoverflow.com/questions/21649581

  •  08-10-2022
Question

I believe my title explains what I am trying to do. Right now I am cutting and echoing the first word of each line, which is working; all I need now is to remove duplicates. The reason I want to assign the result to an array is so I can combine all the elements into a single string of comma-separated values that I can place in a new file. Maybe there is an easier way to achieve what I am trying to do. I am new to bash scripting, so I appreciate any help.

thanks

here is my code so far

#!/bin/bash
cut -d' ' -f1 "$1"

Solution

cut -d' ' -f1 "$1" | sort | uniq | paste -sd,
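Since the question specifically asked for the words in an array, here is a sketch of how the same pipeline can feed a bash array before joining with commas. The output filename output.txt is hypothetical; the input file is assumed to be passed as the first argument, as in the question.

```shell
#!/bin/bash
# Read the unique first words into a bash array.
# sort -u is equivalent to sort | uniq here.
mapfile -t words < <(cut -d' ' -f1 "$1" | sort -u)

# Join the array elements with commas into a single string.
# Setting IFS=, in a subshell makes "${words[*]}" expand
# with commas between elements.
joined=$(IFS=,; printf '%s' "${words[*]}")

# Write the comma-separated list to a new file (hypothetical name).
echo "$joined" > output.txt
```

The array step is only needed if you want to manipulate the individual words; for just producing the comma-separated file, the one-line pipeline above is enough.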

OTHER TIPS

An awk one-liner can do all of it

awk '{a[$1]} END{for (i in a) print i}' file > output

This awk command builds an array a keyed by the first word of each line; referencing a[$1] creates the key if it does not already exist, so each word is stored only once no matter how often it appears. The list of unique words is then printed in the END block (in no particular order, since awk's array iteration order is unspecified).

PS: If the order of the words is important (as per their first appearance in the file):

awk '!($1 in a){a[$1];b[++i]=$1} END{for (k=1; k<=i; k++) print b[k]}' file
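A quick illustration of both variants on a hypothetical sample file (sample.txt is an assumed name, not from the original answer):

```shell
# Create a small sample input where "beta" appears twice.
printf 'beta 1\nalpha 2\nbeta 3\ngamma 4\n' > sample.txt

# Unordered unique first words (iteration order is unspecified):
awk '{a[$1]} END{for (i in a) print i}' sample.txt

# Order-preserving variant: b records each first word the first
# time it is seen, and the END loop replays them in that order.
awk '!($1 in a){a[$1];b[++i]=$1} END{for (k=1; k<=i; k++) print b[k]}' sample.txt
# prints: beta, alpha, gamma (one per line, in order of first appearance)
```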
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow