UnboundLocalError: local variable 'x' referenced before assignment. Proper use of tsplot in seaborn package for a dataframe?

I cannot get this to work for my data so first I am trying a concrete example that is very similar. Here is the dataframe:

In [56]:

import pandas as pd
import seaborn as sns

idx = pd.date_range(start='1990-01-01', freq='D', periods=5)
data = pd.DataFrame({('A','a'):[1,2,3,4,5],
                     ('A','b'):[6,7,8,9,1],
                     ('B','a'):[2,3,4,5,6],
                     ('B','b'):[7,8,9,1,2]}, idx)
Out[56]:
            A     B
            a  b  a  b
1990-01-01  1  6  2  7
1990-01-02  2  7  3  8
1990-01-03  3  8  4  9
1990-01-04  4  9  5  1
1990-01-05  5  1  6  2

What I am hoping to do is plot a time series with a line for the central tendency across the variables (the columns) at each observation (each day in the index), and a shaded area showing the chosen error estimate (probably just a 95% CI) of the observations for each day.
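
To make the goal concrete, here is a rough sketch of the statistic I have in mind, written with plain NumPy. It assumes a percentile bootstrap that resamples the columns (units); the function name bootstrap_ci, the seed and n_boot=2000 are illustrative choices of mine, not necessarily what tsplot does internally:

```python
import numpy as np

# Rows are units (the four columns of the frame), columns are the five days.
values = np.array([[1, 2, 3, 4, 5],
                   [6, 7, 8, 9, 1],
                   [2, 3, 4, 5, 6],
                   [7, 8, 9, 1, 2]], dtype=float)

def bootstrap_ci(units_by_time, ci=95, n_boot=2000, seed=0):
    """Percentile bootstrap interval of the per-day mean (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_units = units_by_time.shape[0]
    # Draw unit indices with replacement: shape (n_boot, n_units).
    picks = rng.integers(0, n_units, size=(n_boot, n_units))
    # Mean over the resampled units for every day: shape (n_boot, n_days).
    boot_means = units_by_time[picks].mean(axis=1)
    half = (100 - ci) / 2
    low, high = np.percentile(boot_means, [half, 100 - half], axis=0)
    return low, high

center = values.mean(axis=0)      # the line I want to draw
low, high = bootstrap_ci(values)  # the edges of the shaded band
```

The line would follow center and the shaded area would span low to high for each day.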

I've tried this:

sns.tsplot(data, time=idx)

But I get the following error:

UnboundLocalError                         Traceback (most recent call last)
<ipython-input-57-fa07e08ead95> in <module>()
      5                     ('B','b'):[7,8,9,1,2]}, idx)
      6 
----> 7 sns.tsplot(data, time=idx)

C:\Users\Patrick\Anaconda\lib\site-packages\seaborn\timeseries.pyc in tsplot(data, time, unit, condition, value, err_style, ci, interpolate, color, estimator, n_boot, err_palette, err_kws, legend, ax, **kwargs)
    253 
    254     # Pad the sides of the plot only when not interpolating
--> 255     ax.set_xlim(x.min(), x.max())
    256     x_diff = x[1] - x[0]
    257     if not interpolate:

UnboundLocalError: local variable 'x' referenced before assignment

The syntax for tsplot is:

sns.tsplot(data, time=None, unit=None, condition=None, value=None, err_style='ci_band', ci=68, interpolate=True, color=None, estimator=<function mean at 0x00000000044F2C18>, n_boot=5000, err_palette=None, err_kws=None, legend=True, ax=None, **kwargs)

So I am providing my data with the index as the time argument but I'm not sure what I am doing wrong. I don't think I need any other keyword arguments but maybe that is the issue.

If I do this with an array with dimensions (unit,time) instead:

sns.tsplot(data.values.T, time=idx)

I get the expected output (except that the timestamps do not appear as the x-axis labels).

But what is the right way to do this with a dataframe? I know it has to be in 'long form' but I'm not quite sure what this would mean for this specific frame.

asked Dec 05 '14 06:12 by pbreach


1 Answer

I ended up figuring it out. The first place I should have looked was the section titled "Specifying input data with long-form DataFrames". What I had to do was this:

data.reset_index(inplace=True)
data.columns = np.arange(len(data.columns))
melted = pd.melt(data, id_vars=0)

The first line moves the DatetimeIndex into its own column and installs a default integer index, in place. The second line replaces the two-level column headers with plain integers, discarding the original labels (I needed to do this because grouping on MultiIndex columns doesn't seem to be possible). Finally, melting the data creates a DataFrame that looks like this:

In [120]:

melted
Out[120]:
             0  variable  value
0   1990-01-01         1      1
1   1990-01-02         1      2
2   1990-01-03         1      3
3   1990-01-04         1      4
4   1990-01-05         1      5
5   1990-01-01         2      6
6   1990-01-02         2      7
7   1990-01-03         2      8
8   1990-01-04         2      9
9   1990-01-05         2      1
10  1990-01-01         3      2
11  1990-01-02         3      3
12  1990-01-03         3      4
13  1990-01-04         3      5
14  1990-01-05         3      6
15  1990-01-01         4      7
16  1990-01-02         4      8
17  1990-01-03         4      9
18  1990-01-04         4      1
19  1990-01-05         4      2
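
If the original column labels matter, a variation of the same reshape can keep them. This is only a sketch, assuming the two header levels are joined into single strings such as "A_a"; the names wide, long_form, date and unit are my own and not part of the steps above:

```python
import pandas as pd

idx = pd.date_range(start="1990-01-01", freq="D", periods=5)
data = pd.DataFrame({("A", "a"): [1, 2, 3, 4, 5],
                     ("A", "b"): [6, 7, 8, 9, 1],
                     ("B", "a"): [2, 3, 4, 5, 6],
                     ("B", "b"): [7, 8, 9, 1, 2]}, index=idx)

# Work on a copy so the original wide frame keeps its MultiIndex columns.
wide = data.copy()
# Flatten the two header levels into single labels such as "A_a".
wide.columns = ["_".join(col) for col in wide.columns]

# Name the index, move it into a column, then melt to one row per (date, unit).
long_form = (wide.rename_axis("date")
                 .reset_index()
                 .melt(id_vars="date", var_name="unit", value_name="value"))
```

The result has the same 20 rows as above, but the columns are named date, unit and value, and data itself is left unmodified.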

Now that the DataFrame is ready, I can use tsplot like so:

sns.tsplot(melted, time=0, unit='variable', value='value')

Which in my case is pretty much the same as if I did:

sns.tsplot(data.T.values, idx)
plt.xlabel('0')
plt.ylabel('value')

except that, if I added a condition column, tsplot would plot the other series and make a legend for me.

It would be nice if tsplot could at least plot dates as timestamps, given the nature of the function. I think using the transposed array is going to be a much easier option for my application than using a DataFrame directly.
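
For what it's worth, one way to get real date labels is to skip tsplot and draw the band by hand. The following is a minimal sketch, assuming plain matplotlib and a normal-approximation band (mean plus or minus 1.96 standard errors across the columns) in place of tsplot's bootstrap; it is a stand-in, not a reproduction of what tsplot computes:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

idx = pd.date_range(start="1990-01-01", freq="D", periods=5)
data = pd.DataFrame({("A", "a"): [1, 2, 3, 4, 5],
                     ("A", "b"): [6, 7, 8, 9, 1],
                     ("B", "a"): [2, 3, 4, 5, 6],
                     ("B", "b"): [7, 8, 9, 1, 2]}, index=idx)

# Per-day mean across the four columns and a rough 95% half-width.
center = data.mean(axis=1)
half_width = 1.96 * data.sem(axis=1)

# Plain datetime objects keep the x axis as dates.
dates = data.index.to_pydatetime()

fig, ax = plt.subplots()
ax.plot(dates, center.values)
ax.fill_between(dates,
                (center - half_width).values,
                (center + half_width).values,
                alpha=0.3)
ax.set_ylabel("value")
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

With only four columns per day the normal approximation is crude, so treat the band as indicative only.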

answered Sep 19 '22 02:09 by pbreach