Pandas - series creation results in NaN when index is passed in?

Tags: python, pandas

I am creating a series from a dataframe, with one column of the df being an index, and the other column being the data for the series.

This is my code:

miniframe = attendframe[:20]
s = pd.Series(miniframe.yes, index = miniframe.event)
s[:10]

However, if I include the index = miniframe.event part, I get a series containing only NaN values, as below:

1159822043    NaN
686467261     NaN
1186208412    NaN
2621578336    NaN
855842686     NaN
2018671985    NaN
488116622     NaN
1273761447    NaN
2688888297    NaN
3870329460    NaN

The original dataframe looks something like this:

         event                                                yes  \
0   1159822043  1975964455 252302513 4226086795 3805886383 142...   
1    686467261  2394228942 2686116898 1056558062 3792942231 41...   
2   1186208412                                                NaN   
3   2621578336                                                NaN   
4    855842686  2406118796 3550897984 294255260 1125817077 109...   
5   2018671985                                                NaN   
6    488116622  4145960786 2550625355 2577667841 1575121941 28...   
7   1273761447  2680366192 2151335654 3447231284 3021641283 17...   
8   2688888297  298428624 2292079981 1819927116 1843127538 410...   
9   3870329460                                                NaN   
10  3041357942  4238605842 769099880 4281206081 1756250815 187...   

Any chance someone might be able to assist me with this one? I've been working on it for a week and I'm out of ideas!

analystic asked Jul 03 '13 15:07



1 Answer

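# Assumes: import numpy as np; from numpy.random import randint; from pandas import DataFrame, Series
In [84]: df = DataFrame(dict(event = randint(10,100,(100)), yes = ['foo','bar']*50))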

In [85]: df.loc[[2,3,5,10,15],'yes'] = np.nan

In [86]: df.head(10)
Out[86]: 
   event  yes
0     47  foo
1     94  bar
2     71  NaN
3     62  NaN
4     43  foo
5     60  NaN
6     90  foo
7     43  bar
8     15  foo
9     16  bar

In [87]: mini = df[:20]

In [88]: mini
Out[88]: 
    event  yes
0      47  foo
1      94  bar
2      71  NaN
3      62  NaN
4      43  foo
5      60  NaN
6      90  foo
7      43  bar
8      15  foo
9      16  bar
10     26  NaN
11     64  bar
12     82  foo
13     63  bar
14     16  foo
15     78  NaN
16     49  foo
17     32  bar
18     34  foo
19     46  bar

In [89]: Series(mini.yes.values,mini.event).iloc[:10]
Out[89]: 
event
47       foo
94       bar
71       NaN
62       NaN
43       foo
60       NaN
90       foo
43       bar
15       foo
16       bar
dtype: object
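
Why passing .values helps: when the data given to the Series constructor is itself a Series, pandas aligns it to the requested index by label rather than by position. The yes column is labelled 0..19, none of the event ids exist among those labels, and so every lookup comes back NaN. A minimal sketch of the difference, using a small made-up frame (the values are hypothetical; only the column names follow the question):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame; only the column names match.
frame = pd.DataFrame({
    "event": [1159822043, 686467261, 1186208412],
    "yes": ["a b", "c d", np.nan],
})

# Data is a Series, so it is reindexed by label. frame.yes is labelled
# 0, 1, 2 and none of the event ids match, so every entry becomes NaN.
aligned = pd.Series(frame.yes, index=frame.event)

# Data is a plain array, so values and index are paired by position.
positional = pd.Series(frame.yes.values, index=frame.event)

print(aligned)
print(positional)
```

Here aligned is all NaN, while positional keeps the original values, including the one genuine NaN from the source column.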

Note that this is one of those times where .ix does the wrong thing; use .iloc and be explicit (so ignore my comment from above!)
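
For readers on current pandas: .ix was deprecated and then removed in pandas 1.0, so the explicit accessors are the only option now. A short sketch of the distinction on an integer-labelled series like the one above (the data here is made up for illustration):

```python
import pandas as pd

# Hypothetical series whose integer labels are not positions.
s = pd.Series(["foo", "bar", "baz", "qux"], index=[47, 94, 71, 5])

first_two = s.iloc[:2]  # positional: the first two rows, whatever their labels
by_label = s.loc[94]    # label-based: the row whose label is 94

print(first_two)
print(by_label)
```

With integer labels the two interpretations can disagree, which is why being explicit with .iloc or .loc avoids surprises.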

In [92]: df.set_index('event').iloc[:10].loc[:,'yes']
Out[92]: 
event
47       foo
94       bar
71       NaN
62       NaN
43       foo
60       NaN
90       foo
43       bar
15       foo
16       bar
Name: yes, dtype: object
Jeff answered Oct 01 '22 16:10