
Jupyter Notebook does not remember variables I created in cell above

When I run my code I get the error:

NameError: name 'df_test' is not defined

I don't get this error on my other computer, only on my new one. I suspected it has to do with global and local variables, but that seems strange: variables created in the second cell are used in the third without any problem, and the error only occurs in the fourth cell.

I have tried declaring the variables as global in the first cell, which does not work. Declaring them as global in the third cell does work, but I don't want to keep doing this, because I know from my other computer that it should not be necessary.

### cell 1
    import pandas as pd
    import numpy as np
    from sklearn.model_selection import train_test_split,cross_val_score,ShuffleSplit  
    import os
    import scipy

### cell 2
    df=pd.read_csv("pandas2.txt",sep=';').drop(['listened','Usercount'],axis=1)
    temp_u=df['User'].unique()
    temp_s=df['Song'].unique()
    avg=df['rating'].mean()
### cell 3
    lamda=0.05
    gamma=0.04
    m=128
    splits=20
    df_train,df_test=train_test_split(df,test_size=0.1, random_state=1)
    beta_u=pd.DataFrame(temp_u,columns=['User'])
    beta_s=pd.DataFrame(temp_s,columns=['Song'])
    beta_u['beta_u']=0
    beta_s['beta_s']=0
    for chunk in np.array_split(df_train, splits):
        x=chunk.merge(beta_u, on='User',how='left').merge(beta_s,on='Song',how='left')

        x['pred']=avg+x['beta_u']+x['beta_s']+(x[pnames]*x[qnames]).sum(axis=1)
        x['gradu']=gamma*(x['rating']-x['pred']-lamda*x['beta_u'])
        beta_u=beta_u.merge(x[['User','gradu']].groupby('User').mean(),on='User',how="left").groupby('User').mean().fillna(0)
        beta_u['beta_u']+=beta_u['gradu']
        beta_u=beta_u.drop(['gradu'],axis=1)
        x['grads']=gamma*(x['rating']-x['pred']-lamda*x['beta_s'])
        beta_s=beta_s.merge(x[['Song','grads']].groupby('Song').mean(),on='Song',how="left").fillna(0)
        beta_s['beta_s']+=beta_s['grads']
        beta_s=beta_s.drop(['grads'],axis=1)


        x[pgrad]=(x[qnames].multiply(x['rating']-x['pred'], axis="index")+np.array(x[qnames]**2)*np.array(x[pnames]))#.divide((x[qnames]*x[qnames]).sum(axis=1),axis=0)
        beta_u=beta_u.merge(x[['User']+pgrad].groupby('User').mean(),on='User',how="left").fillna(0)
        beta_u[pnames]=beta_u[pgrad]#np.array(beta_u[pnames])+np.array(beta_u[pgrad])
        beta_u[pnames]=np.where(beta_u[pnames]>0,beta_u[pnames],10**(-6))
        beta_u=beta_u.drop(pgrad,axis=1)


        x[qgrad]=(x[pnames].multiply(x['rating']-x['pred'], axis="index")+np.array(x[pnames]**2)*np.array(x[qnames]))#.divide((x[pnames]*x[pnames]).sum(axis=1),axis=0)
        beta_s=beta_s.merge(x[['Song']+qgrad].groupby('Song').mean(),on='Song',how="left").fillna(0)
        beta_s[qnames]=beta_s[qgrad]#np.array(beta_s[qnames])+np.array(beta_s[qgrad])
        beta_s[qnames]=np.where(beta_s[qnames]>0,beta_s[qnames],10**(-6))
        beta_s=beta_s.drop(qgrad,axis=1)
    x=df_test.merge(beta_u, on='User',how='left').merge(beta_s,on='Song',how='left').fillna(0)
    x['pred']=x['beta_u']+x['beta_s']+avg+(np.array(x[pnames])*np.array(x[qnames])).sum(axis=1)
    x['pred2']=np.where(x['pred']>0.5,1,0)
    RMSE=np.mean((x['rating']-x['pred'])**2)
    RMSE2=np.mean((x['rating']-x['pred2'])**2)
    print(RMSE)
    print(RMSE2)
### cell 4
    t=len(df_test)
    sim_Song=pd.DataFrame(scipy.sparse.load_npz('simUser.npz').todense())
    sim_Song.index=pd.read_csv('Itemnames.csv',sep=';')['Song']
    sim_Song.columns=pd.read_csv('Itemnames.csv',sep=';')['Song']
    beta_s=beta_s.set_index('Song')

NameError: name 'df_test' is not defined

When I put global df_train, df_test, df, x, beta_s, beta_u at the top of cell 3, it works fine.

Joshua Jones asked Nov 06 '22 16:11



1 Answer

The problem turned out to be the %%time cell magic. After I deleted it from the top of the cell, everything worked fine, and the variables were available in the following cells again.
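
If you still want timing without %%time, one possible workaround is to time the block yourself. The sketch below is only an illustration: the helper name timed and the stand-in computation are made up for this example and are not part of the original notebook. Because the with-block is ordinary Python, every name assigned inside it ends up in the notebook's normal namespace.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label="cell"):
    """Print the wall-clock time spent inside the with-block (hypothetical helper)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.3f} s")


# Names assigned inside the with-block are plain module-level names,
# so they remain visible to code that runs afterwards.
with timed("cell 3"):
    values = [i * i for i in range(1000)]  # stand-in for the real work in cell 3
    total = sum(values)

print(total)
```

In the notebook, the body of cell 3 would go inside the with-block in place of the stand-in lines.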

Joshua Jones answered Nov 14 '22 23:11
