Apply function to pandas dataframe row using values in other rows

I have a dataframe row to perform calculations with, and I need to use values in the following (and potentially preceding) rows to do these calculations (essentially a perfect forecast based on the real data set). I get each row from an earlier df.apply call, so I could pass the whole df along to the downstream objects, but that seems less than ideal given the complexity of the objects in my analysis.

I found one closely related question and answer [1], but my problem is fundamentally different: I do not need the whole df for my calcs, only the following x rows (which might matter for large dfs).

So, for example:

import pandas as pd

df = pd.DataFrame([100, 200, 300, 400, 500, 600, 700, 800, 900, 1000],
                  columns=['PRICE'])
horizon = 3

I need to access the values in the following 3 (horizon) rows in my row-wise df.apply call. How can I get a naive forecast of the next 3 data points dynamically in my row-wise apply calcs? For example, for the first row, where the PRICE is 100, I need to use [200, 300, 400] as a forecast in my calcs.

[1] apply a function to a pandas Dataframe whose returned value is based on other rows

By getting the row's index inside the df.apply() call using row.name, you can generate the 'forecast' data relative to the row you are currently on. This is effectively a preprocessing step that puts the 'forecast' onto the relevant row; alternatively, it could be done as part of the initial df.apply() call if the df is available downstream. Note that this relies on the default integer index (0, 1, 2, ...), so that row.name matches the row's position.

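import pandas as pd

df = pd.DataFrame(
    [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000],
    columns=["PRICE"]
)
horizon = 3

# x.name is the current row's index label; slice the next `horizon` prices after it
df["FORECAST"] = df.apply(
    lambda x: [df["PRICE"][x.name + 1 : x.name + horizon + 1]],
    axis=1
)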

Results in this:

   PRICE          FORECAST
0    100   [200, 300, 400]
1    200   [300, 400, 500]
2    300   [400, 500, 600]
3    400   [500, 600, 700]
4    500   [600, 700, 800]
5    600   [700, 800, 900]
6    700  [800, 900, 1000]
7    800       [900, 1000]
8    900            [1000]
9   1000                []

The FORECAST column can then be used in your row-wise df.apply() calcs.

EDIT: If you want to strip the index from the resulting 'FORECAST':
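The snippet above stores a pandas Series (wrapped in a list) in each cell. If plain Python lists are easier to work with downstream, a variant along the following lines might help. This is only a sketch under stated assumptions: the helper names next_prices and forecast_mean and the FORECAST_MEAN column are made up for illustration, and the df is assumed to have the default integer index so that row.name equals the row's position.

```python
import pandas as pd

df = pd.DataFrame(
    [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000],
    columns=["PRICE"],
)
horizon = 3


def next_prices(row):
    # Hypothetical helper: row.name is the index label, which equals the
    # row position only because df uses the default RangeIndex.
    start = row.name + 1
    # .iloc slices by position; .tolist() turns the slice into a plain list.
    return df["PRICE"].iloc[start : start + horizon].tolist()


df["FORECAST"] = df.apply(next_prices, axis=1)


def forecast_mean(row):
    # Hypothetical downstream calc: average of the forecast values,
    # NaN when no following rows are left.
    forecast = row["FORECAST"]
    return sum(forecast) / len(forecast) if forecast else float("nan")


df["FORECAST_MEAN"] = df.apply(forecast_mean, axis=1)
```

Near the end of the df the lists simply get shorter, so any downstream calc has to cope with fewer than horizon values, as forecast_mean does here.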

df["FORECAST"] = df.apply(
    lambda x: [df["PRICE"][x.name + 1 : x.name + horizon + 1].reset_index(drop=True)],
    axis=1
)
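A different route, not part of the answer above, is to avoid looking up other rows inside apply altogether and instead shift the PRICE column so that each row carries its neighbours as ordinary columns. A minimal sketch, assuming one column per step is acceptable; the PRICE_PLUS_k and PRICE_MINUS_1 column names are hypothetical:

```python
import pandas as pd

df = pd.DataFrame(
    [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000],
    columns=["PRICE"],
)
horizon = 3

# shift(-k) moves values up by k rows, so each row sees the price k rows ahead.
# Rows that run off the end of the df get NaN.
for k in range(1, horizon + 1):
    df[f"PRICE_PLUS_{k}"] = df["PRICE"].shift(-k)

# A positive shift looks backwards instead (the preceding row here).
df["PRICE_MINUS_1"] = df["PRICE"].shift(1)
```

Because shift works on row order rather than index labels, this does not depend on the df having a default integer index. The shifted columns are floats, since NaN marks the missing values.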

Tags: python, pandas, dataframe, lambda

lukewitmer
