Sorting entire CSV by frequency of occurrence in one column

I have a large CSV file, which is a log of caller data.

A short snippet of my file:

CompanyName    High Priority     QualityIssue
Customer1         Yes             User
Customer1         Yes             User
Customer2         No              User
Customer3         No              Equipment
Customer1         No              Neither
Customer3         No              User
Customer3         Yes             User
Customer3         Yes             Equipment
Customer4         No              User

I want to sort the entire list by how often each customer occurs, most frequent first, so that it looks like this:

CompanyName    High Priority     QualityIssue
Customer3         No               Equipment
Customer3         No               User
Customer3         Yes              User
Customer3         Yes              Equipment
Customer1         Yes              User
Customer1         Yes              User
Customer1         No               Neither
Customer2         No               User
Customer4         No               User

I've tried groupby, but that only prints out the company name and its frequency, not the other columns. I also tried

df['Totals']= [sum(df['CompanyName'] == df['CompanyName'][i]) for i in xrange(len(df))]

and

df = [sum(df['CompanyName'] == df['CompanyName'][i]) for i in xrange(len(df))]

But these give me errors:

ValueError: The wrong number of items passed 1, indices imply 24
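For reference, the per-row count that the two list comprehensions above are aiming for can be sketched without an explicit loop. This is only a minimal illustration, assuming pandas is available; the sample frame below is a hypothetical reconstruction of the CompanyName column from the snippet, and it does not reproduce or diagnose the ValueError.

```python
import pandas as pd

# Hypothetical sample: only the CompanyName column from the snippet above.
df = pd.DataFrame({
    "CompanyName": ["Customer1", "Customer1", "Customer2", "Customer3",
                    "Customer1", "Customer3", "Customer3", "Customer3",
                    "Customer4"],
})

# value_counts() yields one count per distinct company name.
counts = df["CompanyName"].value_counts()

# map() looks up each row's company in that Series, so every row
# receives the total number of rows sharing its company name.
df["Totals"] = df["CompanyName"].map(counts)

print(df)
```

The lookup replaces the row-by-row comparison, so the full column is scanned once rather than once per row.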

I've looked at something like this:

for key, value in sorted(mydict.iteritems(), key=lambda (k,v): (v,k)):
    print "%s: %s" % (key, value)

but this only prints out two columns, and I want to sort my entire CSV. My output should be my entire CSV, with all columns kept, sorted by how often each value in the first column occurs.

Thanks for the help in advance!
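One possible approach to the sort described above, sketched with pandas: attach each row's group size as a temporary column, sort on it, then drop it. The function name `sort_by_frequency` and the helper column `_freq` are made up for this illustration, and the sample data is a hypothetical reconstruction of the snippet in the question. Ties between equally frequent companies are broken by name here, which is an assumption; the question does not say how ties should be ordered.

```python
import pandas as pd

def sort_by_frequency(df, column):
    """Return a copy of df ordered by how often each value of `column` occurs.

    `_freq` is a hypothetical temporary column name; pick another one
    if the data already has a column called `_freq`.
    """
    # transform("size") broadcasts each group's row count back to every row,
    # so no rows or columns are lost the way they are with a plain aggregate.
    freq = df.groupby(column)[column].transform("size")
    return (
        df.assign(_freq=freq)
          # First order by name so ties in frequency come out alphabetically...
          .sort_values(column, kind="mergesort")
          # ...then by frequency, most frequent first, using a stable sort.
          .sort_values("_freq", ascending=False, kind="mergesort")
          .drop(columns="_freq")
          .reset_index(drop=True)
    )

# Hypothetical sample mirroring the snippet in the question.
df = pd.DataFrame({
    "CompanyName": ["Customer1", "Customer1", "Customer2", "Customer3",
                    "Customer1", "Customer3", "Customer3", "Customer3",
                    "Customer4"],
    "High Priority": ["Yes", "Yes", "No", "No", "No", "No", "Yes", "Yes", "No"],
    "QualityIssue": ["User", "User", "User", "Equipment", "Neither",
                     "User", "User", "Equipment", "User"],
})

sorted_df = sort_by_frequency(df, "CompanyName")
print(sorted_df)
```

For a real file, the frame would come from `pd.read_csv(...)` and the result could be written back with `sorted_df.to_csv(..., index=False)`; the file paths are left out here because they are not given in the question.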
