Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Removing duplicates from a cell in excel

So I've got a list of strings, for which some of the strings contain duplicates.

For example:

13-Hexadecenoic acid;13-methyl-4-pentadecenoic acid;14-Methyl-6-pentadecenoic acid;15-Hexadecanolide;3-Hexadecenoic acid;4-hexadecenoic acid;13-Hexadecenoic acid;13-methyl-4-pentadecenoic acid;14-Methyl-6-pentadecenoic acid;15-Hexadecanolide;3-Hexadecenoic acid;4-hexadecenoic acid;

So I came across a macro online and tweaked it to work for my problem, which goes like this:

Function stringOfUniques(inputString As String) As String
    Dim inArray() As String
    Dim xVal As Variant
    inArray = Split(inputString, ";")
    For Each xVal In inArray
        If InStr(stringOfUniques, Trim(xVal)) = 0 Then _
        stringOfUniques = stringOfUniques & Trim(xVal) & ","
    Next xVal

End Function

And for some of my strings this works brilliantly, however for the example above it will return the string with the duplicates removed, but weirdly will get rid off both copies of one of the words '3-Hexadecenoic acid'. So basically

What I should get:

13-Hexadecenoic acid;13-methyl-4-pentadecenoic acid;14-Methyl-6-pentadecenoic acid;15-Hexadecanolide;3-Hexadecenoic acid;4-hexadecenoic acid;

What I actually get:

13-Hexadecenoic acid;13-methyl-4-pentadecenoic acid;14-Methyl-6-pentadecenoic acid;15-Hexadecanolide;4-hexadecenoic acid;

Is there something in my code that's causing this to happen?

like image 592
user2062207 Avatar asked Jan 13 '15 15:01

user2062207


1 Answers

Probably there are several ways to do this, but Dictionary objects are great for enforcing uniqueness.

Function stringOfUniques(inputString As String, delimiter as String)
Dim xVal As Variant
Dim dict as Object
Set dict = CreateObject("Scripting.Dictionary")

For Each xVal In Split(inputString, delimiter)
    dict(xVal) = xVal
Next xVal

stringOfUniques = Join(dict.Keys(),",")
End Function

This function has also been modified to accept a variable delimiter argument, so you will pass the input string and the delimiter: ";" to the function, which will return a comma-separated string.

A Note on Dictionaries:

Dictionaries store key/value pairs. The keys must be unique.

In the example, I use a simple assignment to the dictionary object: dict(key) = key. A dictionary will automatically add or overwrite an item when referenced by it's key, so this is a really simple way of enforcing uniqueness.

In other circumstances (i.e., you want to count the number of occurrences of each key), you would want to test using the dict.Exists(key) in order to modify the value without overwriting the key/value pair, for example:

'Assigns a "count" value to the dictionary for each unique Key
For Each xVal In Split(inputString, delimiter)
    If dict.Exists(xVal) Then
        dict(xVal) = dict(xVal) + 1
    Else
        dict(xVal) = 1
    End If
Next xVal

'Read the "count" from each key:
For Each xVal in dict.Keys()
    MsgBox xVal & " appears " & dict(xVal) & " times"
Next
like image 160
David Zemens Avatar answered Oct 22 '22 13:10

David Zemens