
How can I display unique words contained in a Bash string?

I have a string that has duplicate words. I would like to display only the unique words. The string is:

variable="alpha bravo charlie alpha delta echo charlie"

I know of several tools that can do this when combined. This is what I came up with:

echo "$variable" | tr " " "\n" | sort -u | tr "\n" " "

What is a more effective way to do this?

Todd Partridge asked Feb 04 '16 22:02


2 Answers

You may use xargs:

echo "$variable" | xargs -n 1 | sort -u | xargs

jyvet answered Oct 06 '22 23:10


Use a Bash Parameter Expansion

The following shell parameter expansion replaces each space with a newline. The result is then piped into the sort utility, whose -u option returns only the unique words.

$ echo -e "${variable// /\\n}" | sort -u
alpha
bravo
charlie
delta
echo

This has the side effect of sorting your words: sort -u removes duplicates as part of sorting, and uniq on its own only detects adjacent duplicates, so it needs sorted input as well. If that's not what you want, I also posted a Ruby solution that preserves the original word order.
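
If the original word order matters and you would rather stay in the shell, one possible sketch is to let awk print each word only the first time it appears. This is not the Ruby solution mentioned above; the function name unique_in_order is made up for illustration, and the sketch assumes the words are separated by spaces with no leading space.

```shell
variable="alpha bravo charlie alpha delta echo charlie"

# Hypothetical helper: print the unique words of "$1" in first-seen order.
unique_in_order() {
    # tr puts one word per line (squeezing repeated spaces),
    # awk prints a line only the first time it is seen,
    # paste joins the remaining lines back together with spaces.
    printf '%s\n' "$1" | tr -s ' ' '\n' | awk '!seen[$0]++' | paste -sd ' ' -
}

unique_in_order "$variable"                      # prints: alpha bravo charlie delta echo
unique_in_order "delta alpha delta bravo alpha"  # prints: delta alpha bravo
```

The second call shows the difference from sort -u, which would have printed the words in alphabetical order instead.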

Rejoining Words

If, as one commenter pointed out, you're trying to reassemble your unique words back into a single line, you can use command substitution to do this. For example:

$ echo $(echo -e "${variable// /\\n}" | sort -u)
alpha bravo charlie delta echo

The lack of quotes around the command substitution is intentional. If you quote it, the newlines will be preserved because Bash won't perform word splitting on the result. Unquoted, the shell splits the output into separate words and echo joins them with single spaces, so the result comes back as a single line, however unintuitive that may seem.
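
To see the difference side by side, here is a small sketch that runs the same pipeline with and without quotes. The helper name unique_lines is an assumption made for this example, and tr is used in place of the parameter expansion only to keep the sketch portable.

```shell
variable="alpha bravo charlie alpha delta echo charlie"

# Hypothetical helper: print the unique words, one per line.
unique_lines() {
    printf '%s\n' "$variable" | tr ' ' '\n' | sort -u
}

# Unquoted: word splitting turns each line into a separate argument to echo,
# and echo joins its arguments with single spaces.
joined=$(echo $(unique_lines))

# Quoted: the output stays a single argument, so its newlines survive.
kept=$(echo "$(unique_lines)")

printf 'unquoted: %s\n' "$joined"
printf 'quoted:\n%s\n' "$kept"
```

The unquoted form yields one line of five words, while the quoted form still spans five lines.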

Todd A. Jacobs answered Oct 07 '22 00:10