I want to sort a pandas dataframe by a column whose values are stored as strings but should be treated as integers.
df.sort(col1)
where col1 = ['0','1','12','13','3'].
How can I sort so that these values are compared as integers and not as strings?
If you want to keep your dataframe untouched and just sort it, you can reorder its rows by position.
This assumes col1 is a column in your dataframe df.
option 1
df.iloc[df['col1'].astype(int).argsort()]
option 2
You can also use pd.to_numeric
df.iloc[pd.to_numeric(df['col1']).argsort()]
option 3
For more efficiency, you can reconstruct the dataframe by manipulating the underlying NumPy arrays directly
v = df.values
a = df['col1'].values.astype(int).argsort()
pd.DataFrame(v[a], df.index[a], df.columns)
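As a rough end-to-end check, the three options can be run side by side on a small frame. The col1 values are the ones from the question; the second column col2 and the names opt1, opt2 and opt3 are made up here purely for illustration. This is only a sketch under those assumptions, not a benchmark.

```python
import pandas as pd

# col1 holds the values from the question; col2 is a hypothetical extra
# column added only so the row reordering is visible.
df = pd.DataFrame({
    "col1": ["0", "1", "12", "13", "3"],
    "col2": ["a", "b", "c", "d", "e"],
})

# option 1: cast to int, take the sorting positions, reorder rows by position
opt1 = df.iloc[df["col1"].astype(int).argsort()]

# option 2: same idea, converting with pd.to_numeric
opt2 = df.iloc[pd.to_numeric(df["col1"]).argsort()]

# option 3: reorder the underlying NumPy array and rebuild the dataframe
v = df.values
a = df["col1"].values.astype(int).argsort()
opt3 = pd.DataFrame(v[a], df.index[a], df.columns)

print(opt1["col1"].tolist())  # ['0', '1', '3', '12', '13']
print(opt1.index.tolist())    # [0, 1, 4, 2, 3]
```

In all three cases col1 still contains strings and df itself is not modified; only the row order of the result changes. Note that df.values on a frame with mixed column types produces an object array, so option 3 may not preserve the original column dtypes.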