I am using Python 3.5 and working with pandas. I have loaded stock data from Yahoo Finance and saved it to CSV files. My DataFrames load this data from the CSV files. This is a copy of ten rows of the CSV file that is loaded into my DataFrame:
Date        Open       High       Low     Close   Volume  Adj Close
1990-04-12  26.875000  26.875000  26.625  26.625  6100    250.576036
1990-04-16  26.500000  26.750000  26.375  26.750  500     251.752449
1990-04-17  26.750000  26.875000  26.750  26.875  2300    252.928863
1990-04-18  26.875000  26.875000  26.500  26.625  3500    250.576036
1990-04-19  26.500000  26.750000  26.500  26.750  700     251.752449
1990-04-20  26.750000  26.875000  26.750  26.875  2100    252.928863
1990-04-23  26.875000  26.875000  26.750  26.875  700     252.928863
1990-04-24  27.000000  27.000000  26.000  26.000  2400    244.693970
1990-04-25  25.250000  25.250000  24.875  25.125  9300    236.459076
1990-04-26  25.000000  25.250000  24.750  25.000  1200    235.282663
I know that I can use iloc, loc, or ix, but these only return the specific rows and columns I index; they do not perform an operation on every row. For example: row one of the Open column has a value of 26.875 and the row below it has 26.50. The price dropped by 0.375. I want to capture the percentage increase or decrease from the previous day, so to finish this example: 0.375 divided by 26.875 is a 1.4% decrease from one day to the next. I want to run this calculation on every row so I know how much the price has increased or decreased from the previous day. I have tried the index functions, but they are absolute, and I don't want to use a loop. Is there a way I can do this with ix, iloc, loc, or another function?
The subtract() function computes the element-wise difference between a DataFrame and another object. It is essentially the same as dataframe - other, but with support for substituting a fill value for missing data in one of the inputs.
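A minimal sketch of how subtract() could be applied to the day-over-day problem, using a few Open and Close values from the question. The variable names (`prev`, `change`, `filled`) are made up for illustration, and pairing subtract() with shift() is one possible approach, not the only one.

```python
import pandas as pd

# A few rows of the question's data (Open and Close only).
df = pd.DataFrame({
    "Open":  [26.875, 26.500, 26.750, 26.875],
    "Close": [26.625, 26.750, 26.875, 26.625],
})

# Previous day's values, aligned to the current row.
prev = df.shift(1)

# Element-wise difference; row 0 is NaN because it has no previous day.
change = df.subtract(prev)

# fill_value replaces a value that is missing in ONE of the two inputs
# before subtracting, so row 0 becomes (value - 0) instead of NaN.
filled = df.subtract(prev, fill_value=0)

print(change)
print(filled)
```

Note that with fill_value=0 the first row simply repeats the original prices, which is rarely a meaningful "change", so leaving the NaN in place is often the safer choice.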
The diff() function calculates the difference between each element and the element a given number of rows before it (the previous row by default). Parameters: periods: integer, the number of periods to shift when computing the difference.
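To illustrate the periods parameter, here is a small sketch using the first five Close values from the question. The names `one_day` and `two_day` are only illustrative.

```python
import pandas as pd

close = pd.Series([26.625, 26.750, 26.875, 26.625, 26.750], name="Close")

# Default periods=1: each row minus the row directly before it.
one_day = close.diff()

# periods=2: each row minus the row two positions earlier,
# so the first two results are NaN.
two_day = close.diff(periods=2)

print(pd.DataFrame({"Close": close, "diff_1": one_day, "diff_2": two_day}))
```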
We can remove the last n rows using the drop() method. drop() accepts an inplace argument that takes a boolean value. If inplace is set to True, the DataFrame itself is updated (the last n rows are removed) instead of a new DataFrame being returned.
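A short sketch of both forms, assuming the rows to remove are selected with tail(n); the value of `n` and the variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Close": [26.625, 26.750, 26.875, 26.625, 26.750]})
n = 2  # number of trailing rows to remove (illustrative value)

# Without inplace: a new DataFrame is returned and df is left unchanged.
trimmed = df.drop(df.tail(n).index)
rows_before = len(df)

# With inplace=True: df itself is modified and the call returns None.
result = df.drop(df.tail(n).index, inplace=True)

print(trimmed)
print(df)
```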
You can use the pct_change() and/or diff() methods.
Demo:
In [138]: df.Close.pct_change() * 100
Out[138]:
0         NaN
1    0.469484
2    0.467290
3   -0.930233
4    0.469484
5    0.467290
6    0.000000
7   -3.255814
8   -3.365385
9   -0.497512
Name: Close, dtype: float64

In [139]: df.Close.diff()
Out[139]:
0      NaN
1    0.125
2    0.125
3   -0.250
4    0.125
5    0.125
6    0.000
7   -0.875
8   -0.875
9   -0.125
Name: Close, dtype: float64
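The same two calls can be applied to the Open column to reproduce the example from the question (26.875 to 26.50, roughly a 1.4% drop). This is a sketch under assumptions: the CSV is taken to be comma-separated, only three rows are inlined so the snippet is self-contained, and the new column names `Open_diff` and `Open_pct` are made up.

```python
import io
import pandas as pd

# Three rows from the question, inlined instead of read from disk.
# The comma-separated layout is an assumption about the saved file.
csv_text = """Date,Open,High,Low,Close,Volume,Adj Close
1990-04-12,26.875,26.875,26.625,26.625,6100,250.576036
1990-04-16,26.500,26.750,26.375,26.750,500,251.752449
1990-04-17,26.750,26.875,26.750,26.875,2300,252.928863
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Absolute change from the previous row (NaN for the first row).
df["Open_diff"] = df["Open"].diff()

# Percentage change from the previous row, computed for every row
# at once, with no explicit loop.
df["Open_pct"] = df["Open"].pct_change() * 100

print(df[["Date", "Open", "Open_diff", "Open_pct"]])
```

Storing the results as new columns keeps each day's change next to the prices it was derived from.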
MaxU's solution suits your case. If you want to perform more complex computations based on previous rows, you should use shift().
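As a sketch of what shift() makes possible: the first computation rebuilds the percentage change by hand, and the second compares each day's Open with the previous day's Close, which pct_change() on a single column cannot express. The "overnight gap" calculation and the names `manual_pct` and `gap_pct` are illustrative assumptions, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Open":  [26.875, 26.500, 26.750, 26.875],
    "Close": [26.625, 26.750, 26.875, 26.625],
})

# shift(1) moves every value down one row, so each row sees the
# previous row's value (the first row gets NaN).
prev_close = df["Close"].shift(1)

# Hand-written percentage change; matches pct_change() * 100
# up to floating-point rounding.
manual_pct = (df["Close"] - prev_close) / prev_close * 100
builtin_pct = df["Close"].pct_change() * 100

# A cross-column computation: today's Open relative to yesterday's Close.
gap_pct = (df["Open"] - prev_close) / prev_close * 100

print(pd.DataFrame({"manual_pct": manual_pct, "gap_pct": gap_pct}))
```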