How to change order in the result of pd.crosstab:
pd.crosstab(df['col1'], df['col2'])
I would like to be able to sort by:
Use DataFrame.reindex() to reorder columns in a DataFrame. Call df.reindex(columns=column_names), where column_names is a list of the column names in the desired order.
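Since pd.crosstab returns an ordinary DataFrame, the same reindex() call should work on its result. Here is a minimal sketch, assuming a small made-up df with the col1/col2 names from the question; the sample values and the chosen orders are only illustrative:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'col1': ['foo', 'foo', 'bar', 'bar', 'foo'],
    'col2': ['dull', 'shiny', 'dull', 'weird', 'shiny'],
})

# crosstab sorts row and column labels alphabetically by default.
ct = pd.crosstab(df['col1'], df['col2'])

# Desired orders, spelled out explicitly (illustrative choice).
row_order = ['foo', 'bar']
column_order = ['weird', 'shiny', 'dull']

# reindex accepts both index= and columns= in a single call.
reordered = ct.reindex(index=row_order, columns=column_order)
print(reordered)
```

Note that a label left out of the list is dropped from the result, and a label that is not in the table shows up as NaN unless you pass fill_value=0.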
It would be easier to suggest a solution if you provided an example of your data, since the right approach depends on it. I will build an example scenario and a possible solution below.
If we take the example data and crosstab:
import numpy as np
import pandas as pd

a = np.array(['foo', 'foo', 'foo', 'foo', 'bar', 'bar',
              'bar', 'bar', 'foo', 'foo', 'foo'], dtype=object)
c = np.array(['dull', 'dull', 'shiny', 'dull', 'dull', 'weird',
              'shiny', 'dull', 'shiny', 'shiny', 'shiny'], dtype=object)
# Count how often each (a, c) pair occurs
CT = pd.crosstab(a, c, rownames=['a'], colnames=['c'])
CT
We have the following output:
That's a regular DataFrame object; it has just been cross-tabulated (or, put another way, pivoted).
You would like to show:
So let's start with "1":
There are different ways to do that. A simple solution is to show the same DataFrame with boolean values marking the singular cases:
CT == 1
However, that format might not be what you want for a large DataFrame.
You could instead print only the positive cases, or append them to a list. A simple example would be:
for col in CT.columns:
    for index in CT.index:
        # Report every (row, column) cell whose count is exactly 1
        if CT.loc[index, col] == 1:
            print(index, col, 'singular')
Output:
bar shiny singular
bar weird singular
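For a large table, the nested loop can be replaced by a vectorized lookup. The following is only a sketch of one possible alternative, not the only way to do it; it rebuilds the same CT so that it runs on its own, and the name pairs is made up for illustration:

```python
import numpy as np
import pandas as pd

a = np.array(['foo', 'foo', 'foo', 'foo', 'bar', 'bar',
              'bar', 'bar', 'foo', 'foo', 'foo'], dtype=object)
c = np.array(['dull', 'dull', 'shiny', 'dull', 'dull', 'weird',
              'shiny', 'dull', 'shiny', 'shiny', 'shiny'], dtype=object)
CT = pd.crosstab(a, c, rownames=['a'], colnames=['c'])

# Positions (row number, column number) of every cell equal to 1.
rows, cols = np.nonzero(CT.eq(1).to_numpy())

# Translate the positions back into (row label, column label) pairs.
pairs = list(zip(CT.index[rows], CT.columns[cols]))
print(pairs)
```

Unlike the loop above, which goes column by column, np.nonzero walks the table row by row, so the order of the pairs can differ on other data.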
The second item is more complicated. You want to order by the highest value, but the orderings can conflict: sorting the rows by one column will most likely produce a different row order than sorting by another column, because all columns share the same index.
Hence, you can choose to order by one specific column:
CT.sort_values('column_name', ascending=False)
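As a concrete sketch with the example table, assuming 'shiny' is the column of interest (an arbitrary pick), sort_values can order the rows by a column, and with axis=1 it can order the columns by a row:

```python
import numpy as np
import pandas as pd

a = np.array(['foo', 'foo', 'foo', 'foo', 'bar', 'bar',
              'bar', 'bar', 'foo', 'foo', 'foo'], dtype=object)
c = np.array(['dull', 'dull', 'shiny', 'dull', 'dull', 'weird',
              'shiny', 'dull', 'shiny', 'shiny', 'shiny'], dtype=object)
CT = pd.crosstab(a, c, rownames=['a'], colnames=['c'])

# Order the rows by their count in the 'shiny' column, largest first.
by_column = CT.sort_values('shiny', ascending=False)
print(by_column)

# Order the columns by their count in the 'foo' row, largest first.
by_row = CT.sort_values('foo', axis=1, ascending=False)
print(by_row)
```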
Or you can define a metric to order by (for example, the mean value of each row) and sort accordingly.
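A minimal sketch of the metric approach, assuming the row mean is the metric you want; the names row_mean and by_mean are only illustrative:

```python
import numpy as np
import pandas as pd

a = np.array(['foo', 'foo', 'foo', 'foo', 'bar', 'bar',
              'bar', 'bar', 'foo', 'foo', 'foo'], dtype=object)
c = np.array(['dull', 'dull', 'shiny', 'dull', 'dull', 'weird',
              'shiny', 'dull', 'shiny', 'shiny', 'shiny'], dtype=object)
CT = pd.crosstab(a, c, rownames=['a'], colnames=['c'])

# Metric: mean count per row (one value per label in the index).
row_mean = CT.mean(axis=1)

# Reorder the rows by that metric, largest first, without adding a column.
by_mean = CT.loc[row_mean.sort_values(ascending=False).index]
print(by_mean)
```

Any other per-row Series (sum, max, a weighted score) can be swapped in for row_mean in the same way.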
Hope that helps!