Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Combining duplicate entries with unique data in Excel

I have an Excel database and I'm trying avoid doing some manual combining of duplicate data. I've got a bunch of listings that are essentially the same aside from the tags column. What I'd like to have it do is combine these 5 listings into 1 listing, making the categories a comma separated list in a single cell.

Turn this

Data Example

into this

Desired result

Is there any way of achieving this? My document has a couple thousand listings, so I'm obviously trying to avoid the manual edit route. I'm an Excel novice, so any hand holding or tutorials you could point me to would be appreciated.

like image 313
jbwharris Avatar asked Apr 02 '13 01:04

jbwharris


2 Answers

This can also be done using formulas. For my example to work, the data would need to be sorted by the first column and there would need to be a header row.

You would need two more columns (C & D). First, add a formula that essentially says to concatenate the data in column B if data in column A is the same as the row above it, otherwise reset the concatenation. The next column would contain a formula to identify the final concatenations so you can sort later.

This is how I would do it with listings and categories in columns A & B (again, the data would need to be sorted by column A and there would need to be a header row): enter image description here

Here's the results. Now I would copy the entire range and paste values into another sheet. The rows with zero for column D is what I'd want to use. Sorting by column D would float them to the top. enter image description here

like image 182
Dave Thornton Avatar answered Oct 27 '22 14:10

Dave Thornton


This will (should) generate a new sheet from your source sheet with the duplicates concatenated.

To use the following code you need to add it to a new module in the VBA Editor

A Shortcut to open the VBA Editor is Alt+F11 (for Windows) and Alt+Fn+F11 (for Mac)

Once the Editor is open add a new module by selecting it from the "insert" menu in the main menu bar. It should automatically open the module ready to accept code, If not you need to select it (will be named "ModuleN" where N is the next available number) from the project explorer.

I'm not sure if the "Scripting.Dictionary" is available in osx, but it cant hurt to try.

Option Explicit

Sub Main()
Dim Source As Worksheet: Set Source = ThisWorkbook.Worksheets("Sheet1")
Dim Destination As Worksheet: Set Destination = ThisWorkbook.Worksheets("Sheet2")

Dim Records As Object: Set Records = CreateObject("Scripting.Dictionary")

Dim Data As Variant
Dim Index As Long
Dim Row As Integer: Row = 1

Data = Source.Range("A1", "B" & Source.Rows(Source.UsedRange.Rows.Count).Row).Value2

For Index = LBound(Data, 1) To UBound(Data, 1)
    If Records.Exists(Data(Index, 1)) Then
        Destination.Cells(Records(Data(Index, 1)), 2).Value2 = Destination.Cells(Records(Data(Index, 1)), 2).Value2 & ", " & Data(Index, 2)
    Else
        Records.Add Data(Index, 1), Row
        Destination.Cells(Row, 1).Value2 = Data(Index, 1)
        Destination.Cells(Row, 2).Value2 = Data(Index, 2)
        Row = Row + 1
    End If
Next Index

Set Records = Nothing

End Sub
like image 34
NickSlash Avatar answered Oct 27 '22 13:10

NickSlash