pandas.Series() Creation using DataFrame Columns returns NaN Data entries

I'm attempting to convert a DataFrame into a Series using code which, simplified, looks like this:

import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = [i for i in range(20)]
data = {'Date': dates, 'Value': values}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
ts = pd.Series(df['Value'], index=df['Date'])
print(ts)

However, the printed output looks like this:

Date
2016-01-01   NaN
2016-01-02   NaN
2016-01-03   NaN
2016-01-04   NaN
2016-01-05   NaN
2016-01-06   NaN
2016-01-07   NaN
2016-01-08   NaN
2016-01-09   NaN
2016-01-10   NaN
2016-01-11   NaN
2016-01-12   NaN
2016-01-13   NaN
2016-01-14   NaN
2016-01-15   NaN
2016-01-16   NaN
2016-01-17   NaN
2016-01-18   NaN
2016-01-19   NaN
2016-01-20   NaN
Name: Value, dtype: float64

Where does the NaN come from? Is a view on a DataFrame object not a valid input for the Series class?

I have found the to_series function for pd.Index objects. Is there something similar for DataFrames?

asked Mar 05 '16 19:03

deepbrook

1 Answer

I think you can use .values, which converts the Value column to a NumPy array and so drops its original index:

ts = pd.Series(df['Value'].values, index=df['Date'])

import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = list(range(20))
data = {'Date': dates, 'Value': values}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
# .values returns the column as a plain NumPy array, without an index
print(df['Value'].values)
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

ts = pd.Series(df['Value'].values, index=df['Date'])

print(ts)
Date
2016-01-01     0
2016-01-02     1
2016-01-03     2
2016-01-04     3
2016-01-05     4
2016-01-06     5
2016-01-07     6
2016-01-08     7
2016-01-09     8
2016-01-10     9
2016-01-11    10
2016-01-12    11
2016-01-13    12
2016-01-14    13
2016-01-15    14
2016-01-16    15
2016-01-17    16
2016-01-18    17
2016-01-19    18
2016-01-20    19
dtype: int64

Or you can build the Series directly from the original lists, without going through the DataFrame:

ts1 = pd.Series(data=values, index=pd.to_datetime(dates))
print(ts1)
2016-01-01     0
2016-01-02     1
2016-01-03     2
2016-01-04     3
2016-01-05     4
2016-01-06     5
2016-01-07     6
2016-01-08     7
2016-01-09     8
2016-01-10     9
2016-01-11    10
2016-01-12    11
2016-01-13    12
2016-01-14    13
2016-01-15    14
2016-01-16    15
2016-01-17    16
2016-01-18    17
2016-01-19    18
2016-01-20    19
dtype: int64
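
As for the question about a to_series-like shortcut for DataFrames: one option, sketched here under the assumption that the goal is simply the Value column indexed by Date, is to move the Date column into the index and then select the column. The name ts2 is only illustrative.

```python
import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = list(range(20))
df = pd.DataFrame({'Date': dates, 'Value': values})
df['Date'] = pd.to_datetime(df['Date'])

# set_index returns a new DataFrame indexed by Date;
# selecting 'Value' then gives a Series that keeps that index.
ts2 = df.set_index('Date')['Value']
print(ts2.head())
```

Because the values never leave the DataFrame row they belong to, no realignment takes place and the integer dtype is preserved.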

Thank you @ajcr for a better explanation of why you get NaN:

When you give a Series or DataFrame column to pd.Series, it will reindex it using the index you specify. Since your DataFrame column has an integer index (not a date index) you get lots of missing values.
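
A minimal sketch of that alignment behaviour, using the same data as the question. The names matching, partial and aligned are only illustrative.

```python
import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
df = pd.DataFrame({'Date': pd.to_datetime(dates), 'Value': list(range(20))})

# df['Value'] carries the default integer index 0..19.
# Labels that exist in that index keep their values ...
matching = pd.Series(df['Value'], index=[0, 1, 2])

# ... while labels that do not exist (20 here) become NaN.
partial = pd.Series(df['Value'], index=[18, 19, 20])

# None of the dates is a label in 0..19, so every entry is NaN.
aligned = pd.Series(df['Value'], index=df['Date'])

print(matching)
print(partial)
print(aligned.isna().all())
```

This is also why the dtype in the question's output is float64: NaN cannot be stored in an int64 Series, so pandas upcasts the values.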

answered Oct 14 '22 07:10

jezrael

Tags: python, python-3.x, pandas, dataframe, time-series
