Splitting dataframe column into equal windows in Pandas

Tags: python, split, pandas, dataframe, chunks

I have a DataFrame like the following. I want to extract windows of size 30, then loop over each block of data and call other functions on it.

import numpy as np
import pandas as pd

index = pd.date_range(start='2016-01-01', end='2016-04-01', freq='D')
data = pd.DataFrame(np.random.rand(len(index)), index=index, columns=['random'])

I found the following function, but I wonder if there is a more efficient way to do this.

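def split(df, chunkSize=30):
    # Ceiling division avoids an empty trailing chunk when len(df) is a multiple of chunkSize
    numberChunks = (len(df) + chunkSize - 1) // chunkSize
    listOfDf = []
    for i in range(numberChunks):
        # Positional slice of up to chunkSize rows
        listOfDf.append(df.iloc[i*chunkSize:(i+1)*chunkSize])
    return listOfDf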

asked Jul 25 '17 12:07

mk_sch

2 Answers

You can use a list comprehension. See this SO post about how to access the resulting DataFrames and for another way to break up a DataFrame.

n = 200000  # chunk row size (use n = 30 for the windows in the question)
list_df = [df[i:i+n] for i in range(0, df.shape[0], n)]
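
For the 30-row windows in the question, the same slicing idea can be combined with the per-block loop. This is a minimal sketch, assuming the sample data from the question. The names `iter_chunks` and `process_block` are hypothetical, and `process_block` only stands in for whatever function you call on each window.

```python
import numpy as np
import pandas as pd

def iter_chunks(df, n=30):
    """Yield consecutive positional slices of at most n rows (hypothetical helper)."""
    for start in range(0, len(df), n):
        yield df.iloc[start:start + n]

def process_block(block):
    """Placeholder for the per-window work; here it just returns the column mean."""
    return block['random'].mean()

# Sample data from the question: 92 daily rows (2016-01-01 through 2016-04-01)
index = pd.date_range(start='2016-01-01', end='2016-04-01', freq='D')
data = pd.DataFrame(np.random.rand(len(index)), index=index, columns=['random'])

chunks = list(iter_chunks(data, n=30))
results = [process_block(block) for block in chunks]
print([len(block) for block in chunks])  # [30, 30, 30, 2]
```

Using a generator means each window is only sliced when the loop reaches it, which helps if you do not need all the blocks in memory at once.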

answered Nov 04 '22 00:11

Scott Boston

You can do this efficiently with NumPy's array_split:

import numpy as np

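def split(df, chunkSize=30):
    # Ceiling division: number of sections needed to cover every row
    numberChunks = (len(df) + chunkSize - 1) // chunkSize
    # array_split takes a section count and returns near-equal sections
    return np.array_split(df, numberChunks, axis=0)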

Even though it is a NumPy function, it will return the split data frames with the correct indices and columns.
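
Note that np.array_split takes a number of sections rather than a chunk size, and it makes the sections as equal as possible, so the windows are not necessarily 30 rows long. Here is a small sketch of the difference, assuming the 92-row sample frame from the question. It splits integer row positions and indexes with iloc, which is a substitution made here rather than the exact call above, so the result does not depend on how NumPy handles DataFrame objects.

```python
import numpy as np
import pandas as pd

index = pd.date_range(start='2016-01-01', end='2016-04-01', freq='D')
data = pd.DataFrame(np.random.rand(len(index)), index=index, columns=['random'])

chunk_size = 30
n_sections = -(-len(data) // chunk_size)  # ceiling division: 92 rows -> 4 sections

# Near-equal sections: split the row positions, then select those rows
positions = np.array_split(np.arange(len(data)), n_sections)
equal_parts = [data.iloc[pos] for pos in positions]

# Fixed-size windows: every chunk has chunk_size rows except possibly the last
fixed_parts = [data.iloc[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

print([len(p) for p in equal_parts])  # [23, 23, 23, 23]
print([len(p) for p in fixed_parts])  # [30, 30, 30, 2]
```

If the downstream functions need windows of exactly 30 rows, the fixed-size slicing is the safer choice. The near-equal split is useful when balanced block sizes matter more.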

answered Nov 03 '22 22:11

jdehesa
