
Linear regression - reduce degrees of freedom

I have a pandas DataFrame with columns like:

Order     Balance     Profit cum (%)

I'm doing a linear regression:

model_profit_tr = pd.ols(y=df_closed['Profit cum (%)'], x=df_closed['Order'])

The problem is that the standard model is the equation of a line that does not have to pass through the origin:

y = a * x + b

There are 2 degrees of freedom (a and b):

slope (a):

a=model_profit_tr.beta['x']

and intercept (b):

b=model_profit_tr.beta['intercept']

I'd like to reduce the degrees of freedom of my model from 2 to 1, so that the fitted line passes through the origin:

y = a * x
asked Sep 30 '12 19:09 by Femto Trader


1 Answer

Use the intercept keyword argument:

model_profit_tr = pd.ols(y=df_closed['Profit cum (%)'], 
                         x=df_closed['Order'], 
                         intercept=False)
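
Note that `pd.ols` was removed from later pandas releases, so the call above only works on old versions. As a rough sketch of the same no-intercept fit without it, NumPy's `np.linalg.lstsq` can be given a design matrix with a single column and no column of ones. The helper name `fit_through_origin` and the sample values are made up for illustration; they stand in for `df_closed['Order']` and `df_closed['Profit cum (%)']`.

```python
import numpy as np

def fit_through_origin(x, y):
    """Least-squares slope a for the model y = a * x (no intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # A design matrix with one column and no column of ones forces b = 0.
    coeffs, *_ = np.linalg.lstsq(x[:, np.newaxis], y, rcond=None)
    return float(coeffs[0])

# Hypothetical stand-in data, not taken from the question
order = [1, 2, 3, 4, 5]
profit_cum = [0.5, 1.0, 1.5, 2.0, 2.5]
a = fit_through_origin(order, profit_cum)
print(a)
```

With a DataFrame, the two columns can be passed directly, e.g. `fit_through_origin(df_closed['Order'], df_closed['Profit cum (%)'])`.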

From the docs:

In [65]: help(pandas.ols) 
Help on function ols in module pandas.stats.interface:

ols(**kwargs)

    [snip]

    Parameters
    ----------
    y: Series or DataFrame
        See above for types
    x: Series, DataFrame, dict of Series, dict of DataFrame, Panel
    weights : Series or ndarray
        The weights are presumed to be (proportional to) the inverse of the
        variance of the observations.  That is, if the variables are to be
        transformed by 1/sqrt(W) you must supply weights = 1/W
    intercept: bool
        True if you want an intercept.  Defaults to True.
    nw_lags: None or int
        Number of Newey-West lags.  Defaults to None.

    [snip]
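
With the intercept dropped and a single regressor, the least-squares slope has a simple closed form, a = sum(x*y) / sum(x*x), and the residual degrees of freedom become n - 1 instead of n - 2. The following is only an illustrative sketch of that formula in plain Python; the function name and the data are invented for the example and are not part of pandas.

```python
def slope_through_origin(xs, ys):
    """Closed-form least-squares slope for the model y = a * x."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sxy / sxx

# Hypothetical data, not taken from the question
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

a = slope_through_origin(xs, ys)
# Only one parameter (a) is estimated, so n - 1 residual degrees of freedom remain.
residual_dof = len(xs) - 1
print(a, residual_dof)
```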
answered Oct 18 '22 11:10 by Avaris