PowerShell finding duplicates in CSV and outputting different header

I guess the question is in the title.

I have a CSV that looks something like

user,path,original_path

I'm trying to find rows that share the same original_path, then output the user and original_path of every row involved.

This is what I have so far.

$2 = Import-Csv 'Total 20_01_16.csv' | Group-Object -Property Original_path | 
Where-Object { $_.count -ge 2 } | fl Group | out-string -width 500

This gives me the duplicates in Original_Path. I can see all the required information but I'll be danged if I know how to get to it or format it into something useful.

I did a bit of Googling and found this script:

$ROWS = Import-CSV -Path 'Total 20_01_16.csv'
$NAMES = @{}
$OUTPUT = foreach ( $ROW in $ROWS ) {
    IF ( $NAMES.ContainsKey( $ROW.Original_path ) -and $NAMES[$ROW.original_path] -lt 2 ) {
        $ROW
    }
    $NAMES[$ROW.original_path] += 1
}

Write-Output $OUTPUT

I'm reluctant to use this because, well, first, I have no idea what it's doing. So little of it makes any sense to me, and I don't like using scripts I can't get my head around. Also, and this is the more important part, it's only giving me one row per duplicate, not both. I'm after both offending lines, so I can find both users with the same file.

If anyone could be so kind as to lend a hand I'd appreciate it. Thanks

Graham J asked Jan 06 '23 19:01

1 Answer

It depends on the output format you need, but to build on what you already have we can use this to show the records in the console:

Import-Csv 'Total 20_01_16.csv' |
Group-Object -Property Original_path |
Where-Object { $_.count -ge 2 } |
Foreach-Object { $_.Group } |
Format-Table User, Path, Original_path -AutoSize
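
To see why the Foreach-Object { $_.Group } step is there, it may help to look at what Group-Object returns. The sketch below is only an illustration: it uses a small in-memory sample with made-up users and paths (not the real file) and ConvertFrom-Csv in place of Import-Csv.

```powershell
# Hypothetical sample data standing in for 'Total 20_01_16.csv'
$rows = @'
user,path,original_path
alice,C:\a\report.doc,\\srv\share\report.doc
bob,C:\b\report.doc,\\srv\share\report.doc
carol,C:\c\notes.txt,\\srv\share\notes.txt
'@ | ConvertFrom-Csv

# Each group object carries Count, Name (the shared original_path)
# and Group (the original rows that share that value)
$groups = $rows | Group-Object -Property original_path

# Keep only paths that occur at least twice, then unwrap the rows in each group
$duplicates = $groups |
    Where-Object { $_.Count -ge 2 } |
    ForEach-Object { $_.Group }

$duplicates | Format-Table user, path, original_path -AutoSize
```

With this sample, both the alice and bob rows come back, because the whole group is expanded rather than a single representative row.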

Alternatively, use this to save them in a new CSV file:

Import-Csv 'Total 20_01_16.csv' |
Group-Object -Property Original_path |
Where-Object { $_.count -ge 2 } |
Foreach-Object { $_.Group } |
Select User, Path, Original_path |
Export-csv -Path output.csv -NoTypeInformation
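
As for the script quoted in the question: as far as I can tell, it keeps a running count of how often each original_path has been seen and only emits a row when that count is exactly 1 at that moment, in other words the second occurrence. That would explain why only one row per duplicate comes out. Here is a commented sketch of the same logic, again on made-up sample data and with renamed variables ($seen, $output are my names, not from the original script):

```powershell
# Hypothetical sample data standing in for 'Total 20_01_16.csv'
$rows = @'
user,path,original_path
alice,C:\a\report.doc,\\srv\share\report.doc
bob,C:\b\report.doc,\\srv\share\report.doc
carol,C:\c\notes.txt,\\srv\share\notes.txt
'@ | ConvertFrom-Csv

$seen = @{}   # original_path -> number of times seen so far

$output = foreach ($row in $rows) {
    # True only when the path was seen before AND fewer than 2 times,
    # i.e. exactly once so far, so only the 2nd occurrence is emitted
    if ($seen.ContainsKey($row.original_path) -and $seen[$row.original_path] -lt 2) {
        $row
    }
    # A missing key reads as $null, and $null + 1 is 1
    $seen[$row.original_path] += 1
}

$output   # bob's row only; the first occurrence (alice) is never emitted
```

The Group-Object approach avoids this because it collects all rows for a path before deciding whether the path is duplicated.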
Frode F. answered Feb 20 '23 05:02
