Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unique count of words from text string

I have a dataset that is multiple strings and I want a unique count of the occurrences so I can review and refine my datasets. I've been unable to do this using formulas so went over to VBA, but hit a roadblock as I'm an amateur.

My data looks like this...

enter image description here

I want it to return this...

enter image description here

I've tried parsing it with text to columns, but in large datasets I have 60 columns with 100s of hits in my string. Therefore, transposing it then trying to get a count of uniques would be daunting.

Therefore, I was hoping VBA would help, but I can only seem to get a function and not with a Sub and Function to transpose then count. Something like below...

Sub Main()
    Dim filename As String
    Dim WorksheetName As String
    Dim CellRange As String
    
    Sheets.Add.Name = "ParsedOutput"

'==============================================================
' CHANGE THESE VALUES FOR YOUR SHEET   
WorksheetName =   
CellRange =    
'==============================================================
   
    ' Get range
    Dim Range
    Set Range = ThisWorkbook.Worksheets(WorksheetName).Range(CellRange)

    ' Copy range to avoid overwrite
    Range.Copy _
        Destination:=ThisWorkbook.Worksheets("ParsedOutput").Range("A1")
        
    ' Get copied exclusions
    Dim Copy
    Set Copy = ThisWorkbook.Worksheets("ParsedOutput").Range("A:A")
    
    ' Parse and overwrite
    Copy.TextToColumns _
        Destination:=Range("A:A"), _
        DataType:=xlDelimited, _
        ConsecutiveDelimiter:=True, _
        Comma:=True

End Sub

Option Explicit

Public Function Counter(InputRange As Range) As String

Dim CellValue As Variant, UniqueValues As New Collection

Application.Volatile

'For error Handling On Error Resume Next

'Looping through all the cell in the defined range For Each CellValue In InputRange
    UniqueValues.Add CellValue, CStr(CellValue)  ' add the unique item Next

'Returning the count of number of unique values CountUniqueValues = UniqueValues.Count

End Function
like image 494
BigOleNewb Avatar asked Jan 22 '26 06:01

BigOleNewb


1 Answers

For the sake of simplicity, I will take minimal data to demostrate how to achieve what you want. Feel free to change the code to suit your needs.

Excel Sheet

Let's say our worksheet looks like this

enter image description here

Logic:

  1. Find last row and last column as shown HERE and construct your range.
  2. Store the values of that range in an array.
  3. Loop through each item in that array and extract words based of , as a delimiter and store it in the collection. If the delimiter doesnt exist then store the entire word in the collection. To create a unique collection, we use On Error Resume Next as shown in the code below.
  4. Based on the count of words in the collection, we create an 2D array for output. One part of the array will hold the word and the other part will hold the count of occurences.
  5. Use .Find and .FindNext to count the occurence of a word in the range and then store it in array.
  6. Write the array in one go to the relevant cell. For demonstration purpose, I will write to Column D

Code

I have commented the code so you should not have a problem understanding it but if you do then simply ask.

Option Explicit

Sub Sample()
    Dim ws As Worksheet
    
    '~~> Change this to relevant sheet
    Set ws = Sheet1
    
    Dim LastRow As Long, LastColumn As Long
    Dim i As Long, j As Long, k As Long
    Dim col As New Collection
    Dim itm As Variant, myAr As Variant, tmpAr As Variant
    Dim OutputAr() As String
    Dim aCell As Range, bCell As Range, rng As Range
    Dim countOfOccurences As Long
    
    With ws
        '~~> Find last row
        LastRow = .Cells.Find(What:="*", _
                  After:=.Range("A1"), _
                  Lookat:=xlPart, _
                  LookIn:=xlFormulas, _
                  SearchOrder:=xlByRows, _
                  SearchDirection:=xlPrevious, _
                  MatchCase:=False).Row
        
        '~~> Find last column
        LastColumn = .Cells.Find(What:="*", _
                     After:=.Range("A1"), _
                     Lookat:=xlPart, _
                     LookIn:=xlFormulas, _
                     SearchOrder:=xlByColumns, _
                     SearchDirection:=xlPrevious, _
                     MatchCase:=False).Column
                     
        '~~> Construct your range
        Set rng = .Range(.Cells(1, 1), .Cells(LastRow, LastColumn))
        
        '~~> Store the value in an array
        myAr = rng.Value2
        
        '~~> Create a unique collection
        For i = LBound(myAr) To UBound(myAr)
            For j = LBound(myAr) To UBound(myAr)
                If Len(Trim(myAr(i, j))) <> 0 Then
                    '~~> Check data has "," delimiter
                    If InStr(1, myAr(i, j), ",") Then
                        tmpAr = Split(myAr(i, j), ",")
                        
                        For k = LBound(tmpAr) To UBound(tmpAr)
                            On Error Resume Next
                            col.Add tmpAr(k), CStr(tmpAr(k))
                            On Error GoTo 0
                        Next k
                    Else
                        On Error Resume Next
                        col.Add myAr(i, j), CStr(myAr(i, j))
                        On Error GoTo 0
                    End If
                End If
            Next j
        Next i
        
        '~~> Count the number of items in the collection
        i = col.Count
        
        '~~> Create output array for storage
        ReDim OutputAr(1 To i, 1 To 2)
        i = 1
        
        '~~> Loop through unique collection
        For Each itm In col
            OutputAr(i, 1) = Trim(itm)
            countOfOccurences = 0
            
            '~~> Use .Find and .Findnext to count for occurences
            Set aCell = rng.Find(What:=OutputAr(i, 1), LookIn:=xlValues, _
                Lookat:=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, _
                MatchCase:=False, SearchFormat:=False)
        
            If Not aCell Is Nothing Then
                Set bCell = aCell
                countOfOccurences = countOfOccurences + 1
                Do
                    Set aCell = rng.FindNext(After:=aCell)
        
                    If Not aCell Is Nothing Then
                        If aCell.Address = bCell.Address Then Exit Do
                        countOfOccurences = countOfOccurences + 1
                    Else
                        Exit Do
                    End If
                Loop
            End If
            
            '~~> Store count in array
            OutputAr(i, 2) = countOfOccurences
            i = i + 1
        Next itm
        
        '~~> Output it to relevant cell
        .Range("D1").Resize(UBound(OutputAr), 2).Value = OutputAr
    End With
End Sub

Output

enter image description here

like image 176
Siddharth Rout Avatar answered Jan 24 '26 22:01

Siddharth Rout



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!