Pandas to_sql index with MultiIndex columns

I'm trying to write a DataFrame that has MultiIndex columns to an MS SQL database. The index is written as NULL. With single-level columns, it works fine.

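import numpy as np
import pandas as pd

l1 = ['foo', 'bar']
l2 = ['a', 'b', 'c']
cols = pd.MultiIndex.from_product([l1, l2])  # two-level column labels
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)
# conn is an existing connection to the MS SQL database
df.to_sql('test', conn, if_exists='replace')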

(Screenshot: how it looks in SQL)

Is this a bug or do I need to do something else to properly write the index?

Chebyshev asked Nov 07 '22 22:11


1 Answer

You can select the sub-frame under each first-level column label and concatenate them, which stacks them row-wise under the shared second-level column names:

import numpy as np
import pandas as pd

l1 = ['foo', 'bar']
l2 = ['a', 'b', 'c']
cols = pd.MultiIndex.from_product([l1, l2])
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)
# df['foo'] and df['bar'] each have plain columns a, b, c
pd.concat([df['foo'], df['bar']]).to_sql('test', conn, if_exists='replace')

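This results in the table below. The three rows from foo are followed by the three rows from bar, so each index value appears twice: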

index                a                      b                      c
-------------------- ---------------------- ---------------------- ----------------------
1                    0.803555407060559      0.0185295254735488     0.702949767792433
2                    0.257823384796912      0.985716269729717      0.749719964181681
3                    0.909115063376081      0.236242172285058      0.932813789580215
1                    0.898527697819921      0.874431627680823      0.805393798630385
2                    0.97537971906356       0.319221893730643      0.584449093938984
3                    0.678625747581189      0.606321574437647      0.437746301372623
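
The stacked table above no longer records whether a row came from foo or bar. A minimal sketch of one way to keep that information is to pass keys to pd.concat, which adds an outer row-index level that to_sql then writes as its own column. The in-memory SQLite connection is only a stand-in for the MS SQL conn so the example can run on its own, and the level names group_name and row_id are hypothetical, chosen for illustration:

```python
import sqlite3

import numpy as np
import pandas as pd

l1 = ['foo', 'bar']
l2 = ['a', 'b', 'c']
cols = pd.MultiIndex.from_product([l1, l2])
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)

# keys= adds an outer row-index level naming the group each row came from;
# names= labels the two row-index levels (hypothetical names)
stacked = pd.concat(
    [df[name] for name in l1],
    keys=l1,
    names=['group_name', 'row_id'],
)

# Assumption: SQLite in memory stands in for the MS SQL connection
conn = sqlite3.connect(':memory:')
stacked.to_sql('test', conn, if_exists='replace')

# Both row-index levels come back as ordinary columns
result = pd.read_sql('SELECT * FROM test', conn)
print(result)
```

With a MultiIndex on the rows rather than the columns, each index level is written as a separate column, so the original index values and the group label both survive the round trip.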

If you want something closer to the SQL table example you link to, you could use merge and suffix each column:

import numpy as np
import pandas as pd

l1 = ['foo', 'bar']
l2 = ['a', 'b', 'c']
cols = pd.MultiIndex.from_product([l1, l2])
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)
# Suffixes are applied in (left, right) order. df.columns.levels[0] is sorted
# ('bar', 'foo'), so building them from it would mislabel the columns.
pd.merge(df['foo'], df['bar'],
         right_index=True, left_index=True,
         suffixes=('_foo', '_bar')
         ).to_sql('test', conn, if_exists='replace')

That will get you:

index                a_foo                  b_foo                  c_foo                  a_bar                  b_bar                  c_bar
-------------------- ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- ----------------------
1                    0.989229457189419      0.0759829132299624     0.172846406489083      0.154227020200058      0.386003904079867      0.733402063652856
2                    0.839971061213949      0.975761261358953      0.252917398323633      0.0881692963378311     0.560403977291031      0.806066332511174
3                    0.914544313717528      0.921965094934119      0.821869705625485      0.337292501691803      0.125899685577926      0.527830968883373
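
A shorter route to a similar wide table, not part of the answer above, is to flatten the MultiIndex columns into plain strings before calling to_sql. This is a minimal sketch under stated assumptions: the in-memory SQLite connection is a stand-in for the MS SQL conn, and the underscore-joined names (foo_a, bar_c, and so on) are just one possible naming convention:

```python
import sqlite3

import numpy as np
import pandas as pd

l1 = ['foo', 'bar']
l2 = ['a', 'b', 'c']
cols = pd.MultiIndex.from_product([l1, l2])
df = pd.DataFrame(np.random.random((3, 6)), index=[1, 2, 3], columns=cols)

# Iterating MultiIndex columns yields tuples such as ('foo', 'a');
# join each tuple into a single flat label such as 'foo_a'
flat = df.copy()
flat.columns = ['_'.join(pair) for pair in df.columns]

# Assumption: SQLite in memory stands in for the MS SQL connection
conn = sqlite3.connect(':memory:')
flat.to_sql('test', conn, if_exists='replace')

# The row index is written to a column named "index"
result = pd.read_sql('SELECT * FROM test', conn)
print(result)
```

Because the columns are single-level by the time to_sql runs, this falls into the case the question reports as working, and the level-0 label ends up as a prefix rather than a suffix.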

Rick answered Nov 14 '22 22:11