Compare two linear regression models in MATLAB

Question

I want to compare the performance of two models using the F statistic. Here is a reproducible example and the expected results:

load carbig
tbl = table(Acceleration,Cylinders,Horsepower,MPG);

% Testing separetly both models
mdl1 = fitlm(tbl,'MPG~1+Acceleration+Cylinders+Horsepower');
mdl2 = fitlm(tbl,'MPG~1+Acceleration');

% Comparing both models using the F-test and p-value
numerator = (mdl2.SSE-mdl1.SSE)/(mdl1.NumCoefficients-mdl2.NumCoefficients);
denominator = mdl1.SSE/mdl1.DFE;
F = numerator/denominator;
p = 1-fcdf(F,mdl1.NumCoefficients-mdl2.NumCoefficients,mdl1.DFE);

We end up with F = 298.75 and p = 0, indicating mdl1 is significantly better than mdl2, as assessed by the F statistic.

Is there anyway to obtain the F and p values without performing twice fitlm and doing all the computation?

I tried to run a coefTest, as suggested by @Glen_b, however the function is poorly documented and the results are not the ones I'm expecting.

[p,F] = coefTest(mdl1); % p = 0, F = 262.508  (this F test mdl1 vs constant mdl)
[p,F] = coefTest(mdl1,[0,0,1,1]); % p = 0, F = 57.662 (not sure what this is testing)
[p,F] = coefTest(mdl1,[1,1,0,0]); % p = 0, F = 486.810 (idem)

I believe I should carry the test with a different null hypothesis (C) using the function [p,F] = coeffTest(mdl1,H,C). But I don't really know how to do it and there's no example.

Matthew Gunn · Accepted Answer

This answer is in regards to comparing two linear regression models where one model is a restricted version of the other.

Short answer:

To do an F-test on the restriction that the 3rd and 4th elements of your estimated, coefficient vector b are zero:

[p, F] = coefTest(mdl1, [0, 0, 1, 0; 0, 0, 0, 1]);

Further explanation:

Let b be our estimated vector. Linear restrictions on b are typically written in a matrix form: R*b = r. The restriction that 3rd and 4th element of b are zero would be written:

[0, 0, 1, 0    *    b    = [0
 0, 0, 0, 1]                0];

The matrix [0, 0, 1, 0; 0, 0, 0, 1] is what coefTest calls the H matrix in the docs.

P = coefTest(M,H), with H a numeric matrix having one column for each
    coefficient, performs an F test that H*B=0, where B represents the
    coefficient vector.

Long version

Sometimes with this econometric routines, it's nice just to write it out yourself so you know what's really going on.

Remove rows with NaN because they just add unrelated complexity:

tbl_dirty = table(Acceleration,Cylinders,Horsepower,MPG);
tbl = tbl_dirty(~any(ismissing(tbl_dirty),2),:);

Do the estimation etc...

n = height(tbl);  % number of observations
y = tbl.MPG;
X = [ones(n, 1), tbl.Acceleration, tbl.Cylinders, tbl.Horsepower];
k = size(X,2);     % number of variables (including constant)

b = X \ y;                 % estimate b with least squares
u = y - X * b;             % calculates residuals 
s2 = u' * u / (n - k);     % estimate variance of error term (assuming homoskedasticity, independent observations)
BCOV = inv(X'*X) * s2;     % get covariance matrix of b assuming homoskedasticity of error term etc...
bse = diag(BCOV).^.5;      % standard errors

R = [0, 0, 1, 0;
     0, 0, 0, 1];

r = [0; 0];          % Testing restriction: R * b = r 

num_restrictions = size(R, 1);
F = (R*b - r)'*inv(R * BCOV * R')*(R*b - r) / num_restrictions;   % F-stat (see Hiyashi for reference)

Fp = 1 - fcdf(F, num_restrictions, n - k);  % F p-val

For reference, can look at p. 65 of Hiyashi's book Econometrics.

Compare two linear regression models in MATLAB

Tags:

matlab

regression

mat

1 Answers

Short answer:

Further explanation:

Long version

Matthew Gunn

Recent Activity

Donate For Us

Compare two linear regression models in MATLAB

Tags:

matlab

regression

mat

1 Answers

Short answer:

Further explanation:

Long version

Matthew Gunn

Related questions

Recent Activity

Donate For Us