
Is there a way to traverse through a dask dataframe backwards?

I want to use read_parquet but iterate over the data backwards, starting from the end (assuming a sorted index). I don't want to read the entire Parquet dataset into memory, because that defeats the whole point of using it. Is there a nice way to do this?

Anina Hitt asked Sep 12 '25 08:09


1 Answer

Assuming the dataframe has a sorted index, it can be reversed in two steps: reverse the order of the partitions, then sort the index in descending order within each partition:

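from dask.datasets import timeseries

# Example dataframe with a sorted datetime index, split into partitions
ddf = timeseries()

ddf_inverted = (
    ddf
    # Step 1: reverse the order of the partitions
    .partitions[::-1]
    # Step 2: reverse the rows within each partition
    .map_partitions(lambda df: df.sort_index(ascending=False))
)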
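
To see why the two steps together give a full reversal while only one partition is in memory at a time, here is a minimal pure-Python sketch of the same idea. It is not dask code: the partitions are modelled as plain lists behind hypothetical loader callables, and the names `iter_backwards`, `loaders` and `Row` are made up for illustration.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Row = Tuple[int, str]  # (index, value); hypothetical row shape for illustration


def iter_backwards(loaders: Sequence[Callable[[], List[Row]]]) -> Iterator[Row]:
    """Yield rows in descending index order, loading one partition at a time."""
    # Step 1: visit the partitions from last to first.
    for load in reversed(loaders):
        rows = load()  # only this partition is materialised
        # Step 2: sort the rows of this partition by index, descending.
        yield from sorted(rows, key=lambda row: row[0], reverse=True)


# Three stand-in partitions, each sorted ascending, with non-overlapping index ranges.
partitions = [
    [(0, "a"), (1, "b")],
    [(2, "c"), (3, "d")],
    [(4, "e")],
]
# Each loader stands in for reading a single partition from disk.
loaders = [lambda p=p: list(p) for p in partitions]

result = list(iter_backwards(loaders))
```

This relies on the same assumption as above: the index is sorted across partitions, so reversing the partition order and then reversing each partition yields a globally descending order.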
SultanOrazbayev answered Sep 13 '25 21:09