
Initialize dataframe with a constant

Initializing a DataFrame with a constant value does not work:

pd.DataFrame(0, index=[1,2,3])      # doesn't work!
# OR
pd.DataFrame(0)                     # doesn't work!

whereas I observe that

(1) Initializing a Series with a constant value works

pd.Series(0, index=[1,2,3])         # Works fine!

(2) Initializing a DataFrame with None works

pd.DataFrame(None, index=[1,2,3])   # Works fine!

(3) Initializing a DataFrame from a list works when index and columns are not provided

pd.DataFrame([1, 2, 3])             # Works fine!
pd.DataFrame([0])                   # Works fine!

Does anyone know why?

I am more interested in the design rationale than in an answer like "if you check the pandas code, you'd see that it fails one of the checks where the data dimension is expected to be >1... blah blah".

I think it should just work intuitively (considering that pandas is intelligent in assigning a default col and index when not provided, and also guesses the dimensions from the data given).

Maybe there is some reason for this behaviour, but I can't figure it out.

a-a asked Dec 18 '22 05:12


1 Answer

A pd.DataFrame is 2-dimensional. When you specify

pd.DataFrame(0, index=[1, 2, 3])

you are telling the constructor to assign 0 to every row with index 1, 2, and 3. But what are the columns? You didn't define any, and a scalar, unlike a list, has no shape from which pandas could infer them.
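To make the difference concrete, here is a minimal sketch of the three cases from the question. It assumes a reasonably recent pandas, where the scalar case raises a ValueError; older releases may raise a different exception type.

```python
import pandas as pd

# A scalar has no shape of its own, so the constructor needs both axes
# before it can broadcast the value. With columns missing, it refuses.
try:
    pd.DataFrame(0, index=[1, 2, 3])
    raised = False
except ValueError:
    raised = True

# A Series is 1-dimensional, so the index alone fixes its shape.
s = pd.Series(0, index=[1, 2, 3])

# With data=None there is nothing to broadcast: the result has the
# requested rows but no columns at all.
empty = pd.DataFrame(None, index=[1, 2, 3])

print(raised)
print(s.shape)
print(empty.shape)
```

Note that the None case "works" only in the sense that it returns an empty frame with three row labels and zero columns, so it never had to decide how many columns to fill.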

You can do one of two things:

Option 1
Specify your columns

pd.DataFrame(0, index=[1, 2, 3], columns=['x', 'y'])

   x  y
1  0  0
2  0  0
3  0  0

Option 2
Pass a list of lists with one row per index label

pd.DataFrame([[0]] * 3, index=[1, 2, 3])

   0
1  0
2  0
3  0
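As a rough illustration of both options side by side, the sketch below builds data that already has a 2-D shape matching the index. The variable names and the NumPy variant are my own additions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

index = [1, 2, 3]

# Option 1: scalar plus both axes, so pandas knows the full shape.
broadcast = pd.DataFrame(0, index=index, columns=["x", "y"])

# Option 2: one inner list per index label gives the data shape (3, 1);
# pandas then assigns the default column label 0.
from_lists = pd.DataFrame([[0]] * len(index), index=index)

# The same idea with a NumPy array, which carries its 2-D shape explicitly.
from_array = pd.DataFrame(np.zeros((len(index), 1), dtype=int), index=index)

print(broadcast.shape)
print(from_lists.shape)
print(from_array.shape)
```

In every case the constructor receives enough information to know both dimensions, either from the axes you pass or from the data itself.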

piRSquared answered Jan 07 '23 05:01