                    Val         ts  year  doy    interpolat  region_id
2000-02-18          NaN  950832000  2000   49           NaN      19987
2000-03-05          NaN  952214400  2000   65           NaN      19987
2000-03-21          NaN  953596800  2000   81           NaN      19987
2000-04-06  0.402539365  954979200  2000   97           NaN      19987
2000-04-22   0.54021746  956361600  2000  113           NaN      19987
The above dataframe has a datetime index. I resample it like so:
df = df.resample('D')
However, this resampling results in this dataframe:
                    ts  year  doy    interpolat  region_id
2000-01-01  1199180160  2008    1             1      19990
2000-01-02         NaN   NaN  NaN           NaN        NaN
2000-01-03         NaN   NaN  NaN           NaN        NaN
2000-01-04         NaN   NaN  NaN           NaN        NaN
2000-01-05         NaN   NaN  NaN           NaN        NaN
Why did the 'Val' column disappear? All the other columns seem messed up too. See "Linearly interpolate missing rows in pandas dataframe" for an explanation of where the dataframe is coming from.
-- EDIT: Based on @unutbu's questions:
df.reset_index().to_dict('list')
{'index': [Timestamp('2000-02-18 00:00:00'), Timestamp('2000-03-05 00:00:00'), Timestamp('2000-03-21 00:00:00'), ... '0.670709965', '0.631584375', '0.562112815', '0.50740686', '0.4447712', '0.47880806', nan, nan]}
-- EDIT: The csv file for the above data frame in its entirety is here:
https://www.dropbox.com/s/dp76hk6yfs6c1og/test.csv?dl=0
Resampling is used with time series data: resample() is a convenience method for frequency conversion and resampling of time series. It only works on objects with a datetime-like index, for example a DatetimeIndex, PeriodIndex, or TimedeltaIndex.
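A minimal sketch of the index requirement, assuming a small made-up frame (the `date` column and its two rows are hypothetical, not the asker's data). With a plain integer index the call fails with a TypeError; once the dates are moved into a DatetimeIndex it works:

```python
import pandas as pd

# Hypothetical frame: dates are stored in an ordinary column, so the index
# is a default integer RangeIndex.
df = pd.DataFrame({"date": ["2000-02-18", "2000-03-05"],
                   "ts": [950832000, 952214400]})

try:
    df.resample("D").mean()
    raised = False
except TypeError:
    # resample() rejects an index that is not datetime-like
    raised = True

# Move the parsed dates into the index, then resample to daily frequency.
indexed = df.set_index(pd.to_datetime(df["date"])).drop(columns="date")
daily = indexed.resample("D").mean()
print(raised, type(daily.index).__name__, len(daily))
```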
To aggregate (temporally resample) the data over a time period, you take all of the values that fall in each period and summarize them. For example, to get total daily rainfall, you would use the resample() method together with .sum().
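As a rough illustration of resample() followed by .sum(), here is a sketch with invented six-hourly rainfall readings (the values and the `rain` name are assumptions made up for this example):

```python
import pandas as pd

# Hypothetical readings every 6 hours over two days (4 per day).
idx = pd.date_range("2000-01-01", periods=8, freq=pd.Timedelta(hours=6))
rain = pd.Series([0.0, 1.5, 0.5, 0.0, 2.0, 0.0, 0.0, 1.0], index=idx)

# Group the readings by calendar day and add them up.
daily_total = rain.resample("D").sum()
print(daily_total)
```

Swapping .sum() for .mean(), .max(), or another aggregation changes how each day's readings are summarized.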
The Val column probably does not have a numeric dtype (the to_dict output above shows its values as strings such as '0.670709965'), and all non-numeric (e.g. object dtype) columns are dropped when resample computes its default mean.
To check, just look at df.info().
To convert it to a numeric column, you can use astype(float) or convert_objects (replaced by pd.to_numeric starting from v0.17).
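A sketch of that check and conversion, assuming a cut-down copy of the question's frame in which Val holds strings (the exact contents are an assumption based on the to_dict output). It uses pd.to_numeric and calls .mean() explicitly, since recent pandas versions no longer aggregate implicitly:

```python
import numpy as np
import pandas as pd

# Cut-down stand-in for the question's frame: 'Val' holds strings and NaN.
idx = pd.to_datetime(["2000-02-18", "2000-03-05", "2000-03-21",
                      "2000-04-06", "2000-04-22"])
df = pd.DataFrame(
    {
        "Val": [np.nan, np.nan, np.nan, "0.402539365", "0.54021746"],
        "ts": [950832000, 952214400, 953596800, 954979200, 956361600],
        "region_id": [19987] * 5,
    },
    index=idx,
)

# Before conversion the column is not numeric (df.info() shows the same).
val_was_numeric = pd.api.types.is_numeric_dtype(df["Val"])

# errors="coerce" turns anything that cannot be parsed into NaN.
df["Val"] = pd.to_numeric(df["Val"], errors="coerce")

# Upsample to daily frequency; days without observations become NaN.
daily = df.resample("D").mean()
print(val_was_numeric, daily["Val"].dtype, len(daily))
```

After the conversion, Val survives the resample alongside the other numeric columns.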