Is there a way to test whether a dataframe is sorted by a given column that's not an index (i.e. is there an equivalent to is_monotonic() for non-index columns) without calling a sort all over again, and without converting a column into an index?
To check if the index of a DataFrame is sorted in ascending order use the is_monotonic_increasing property. Similarly, to check for descending order use the is_monotonic_decreasing property.
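As a minimal sketch of that index check (the DataFrame and its values are made up for illustration), it might look like this:

```python
import pandas as pd

# Hypothetical data; the index is deliberately out of order.
df = pd.DataFrame({"A": [10, 20, 30]}, index=[2, 0, 1])

# Both properties are evaluated on the Index object, not on a column.
unsorted_is_ascending = df.index.is_monotonic_increasing

sorted_df = df.sort_index()
sorted_is_ascending = sorted_df.index.is_monotonic_increasing
sorted_is_descending = sorted_df.index.is_monotonic_decreasing

print(unsorted_is_ascending, sorted_is_ascending, sorted_is_descending)
```

Neither property sorts anything; each only inspects the existing order.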
Sort DataFrame in Pandas based on Multiple Columns
The DataFrame is first sorted by the column weight and then by height. Order matters: sorting by weight and then height gives a different result from sorting by height and then weight. You can also give each column its own direction, for example ascending for one column and descending for another.
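To illustrate why column order matters, here is a small sketch. The `people` frame and its values are invented; only the column names weight and height come from the text above.

```python
import pandas as pd

# Hypothetical data chosen so that the two column orders disagree.
people = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "weight": [70, 60, 70, 60],
    "height": [160, 175, 150, 180],
})

# Primary key weight, ties broken by height.
by_weight_then_height = people.sort_values(["weight", "height"])

# Primary key height, ties broken by weight.
by_height_then_weight = people.sort_values(["height", "weight"])

# One direction per column: weight ascending, height descending.
mixed = people.sort_values(["weight", "height"], ascending=[True, False])

print(by_weight_then_height["name"].tolist())
print(by_height_then_weight["name"].tolist())
print(mixed["name"].tolist())
```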
To sort the DataFrame based on the values in a single column, use .sort_values(). By default, this returns a new DataFrame sorted in ascending order; it does not modify the original DataFrame.
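A short sketch of that behaviour, using a made-up single-column frame, shows that the original is left alone:

```python
import pandas as pd

# Hypothetical unsorted data.
df = pd.DataFrame({"A": [3, 1, 2]})

# sort_values returns a new, sorted DataFrame by default.
sorted_df = df.sort_values("A")

print(df["A"].tolist())         # original order is unchanged
print(sorted_df["A"].tolist())  # ascending order
print(sorted_df.index.tolist()) # the original row labels travel with the rows
```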
To sort a pandas DataFrame by index, use the DataFrame.sort_index() method. Set the boolean keyword argument ascending to True or False to sort in ascending or descending order of the index. The rows are rearranged along with their index labels.
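For example, with an invented frame whose index is out of order, sorting in both directions could be sketched as:

```python
import pandas as pd

# Hypothetical data; the index labels are deliberately shuffled.
df = pd.DataFrame({"A": ["x", "y", "z"]}, index=[2, 0, 1])

ascending_df = df.sort_index()                  # ascending=True is the default
descending_df = df.sort_index(ascending=False)

print(ascending_df["A"].tolist())
print(descending_df["A"].tolist())
```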
Meanwhile, since 0.19.0, there is pandas.Series.is_monotonic_increasing, pandas.Series.is_monotonic_decreasing, and pandas.Series.is_monotonic.
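Applied to the original question, these Series properties let you test a non-index column directly, with no sort and no set_index. The sketch below uses a small example frame and sticks to is_monotonic_increasing and is_monotonic_decreasing because, as far as I know, plain is_monotonic was deprecated in pandas 1.5 and removed in 2.0. The strict check that combines it with is_unique is my own addition, not something from the pandas documentation.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 2], "B": [2, 3, 1]})

# "Increasing" here means non-decreasing, so repeated values are allowed.
a_sorted = df["A"].is_monotonic_increasing
b_sorted = df["B"].is_monotonic_increasing
b_sorted_desc = df["B"].is_monotonic_decreasing

# Assumption: for strictly increasing order, also require no duplicates.
a_strictly_sorted = df["A"].is_monotonic_increasing and df["A"].is_unique

print(a_sorted, b_sorted, b_sorted_desc, a_strictly_sorted)
```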
There are a handful of functions in pd.algos which might be of use. They're all undocumented implementation details, so they might change from release to release:
>>> pd.algos.is[TAB]
pd.algos.is_lexsorted          pd.algos.is_monotonic_float64  pd.algos.is_monotonic_object
pd.algos.is_monotonic_bool     pd.algos.is_monotonic_int32
pd.algos.is_monotonic_float32  pd.algos.is_monotonic_int64
The is_monotonic_* functions take an array of the specified dtype and a "timelike" boolean that should be False for most use cases. (Pandas sets it to True for a case involving times represented as integers.) The return value is a tuple whose first element represents whether the array is monotonically non-decreasing, and whose second element represents whether the array is monotonically non-increasing. Other tuple elements are version-dependent:
>>> df = pd.DataFrame({"A": [1,2,2], "B": [2,3,1]})
>>> pd.algos.is_monotonic_int64(df.A.values, False)[0]
True
>>> pd.algos.is_monotonic_int64(df.B.values, False)[0]
False
All these functions assume a specific input dtype, even is_lexsorted, which assumes the input is a list of int64 arrays. Pass it the wrong dtype, and it gets really confused:
In [32]: pandas.algos.is_lexsorted([np.array([-2, -1], dtype=np.int64)])
Out[32]: True

In [33]: pandas.algos.is_lexsorted([np.array([-2, -1], dtype=float)])
Out[33]: False

In [34]: pandas.algos.is_lexsorted([np.array([-1, -2, 0], dtype=float)])
Out[34]: True
I'm not entirely sure why Series don't already have some kind of short-circuiting is_sorted. There might be something which makes it trickier than it seems.
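For what it's worth, a short-circuiting check can be approximated with plain NumPy by comparing neighbours one chunk at a time and stopping at the first violation. This is only a sketch: is_sorted_chunked and its chunk_size parameter are hypothetical names of my own, not part of pandas, and the NaN behaviour noted in the comments is one of the things that makes a general version trickier.

```python
import numpy as np
import pandas as pd


def is_sorted_chunked(values, chunk_size=4096):
    """Return True if values are non-decreasing, stopping at the first violation.

    Hypothetical helper, not a pandas API. Comparisons involving NaN are
    False, so NaN entries are never reported as a violation here.
    """
    arr = np.asarray(values)
    for start in range(0, len(arr) - 1, chunk_size):
        # Take one extra element so the pair spanning two chunks is checked.
        block = arr[start:start + chunk_size + 1]
        if np.any(block[:-1] > block[1:]):
            return False  # early exit: later chunks are never examined
    return True


df = pd.DataFrame({"A": [1, 2, 2], "B": [2, 3, 1]})
print(is_sorted_chunked(df["A"].to_numpy()))
print(is_sorted_chunked(df["B"].to_numpy()))
```

The chunk size trades vectorised speed against how early the loop can bail out on unsorted data.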