Multiple regression in R: Variable not found in data.frame

Tags: r, regression

Here's my data.frame, beef:

> head(beef)
   YEAR....PBE  CBE  PPO  CPO  PFO DINC  CFO RDINC RFP
1 1925    59.7 58.6 60.5 65.8 65.8 51.4 90.9  68.5 877
2 1926    59.7 59.4 63.3 63.3 68.0 52.6 92.1  69.6 899
3   1927    63 53.7 59.9 66.8 65.5 52.1 90.9  70.2 883
4   1928    71 48.1 56.3 69.9 64.8 52.7 90.9  71.9 884
5   1929    71 49.0 55.0 68.7 65.6 55.1 91.1  75.2 895
6 1930    74.2 48.2 59.6 66.1 62.4 48.8 90.7  68.3 874

And

dput(head(beef))
structure(list(YEAR....PBE = structure(1:6, .Label = c("1925    59.7", 
"1926    59.7", "1927    63", "1928    71", "1929    71", "1930    74.2", 
"1931    72.1", "1932    79", "1933    73.1", "1934    70.2", 
"1935    82.2", "1936    68.4", "1937    73", "1938    70.2", 
"1939    67.8", "1940    63.4", "1941    56"), class = "factor"), 
    CBE = c(58.6, 59.4, 53.7, 48.1, 49, 48.2), PPO = c(60.5, 
    63.3, 59.9, 56.3, 55, 59.6), CPO = c(65.8, 63.3, 66.8, 69.9, 
    68.7, 66.1), PFO = c(65.8, 68, 65.5, 64.8, 65.6, 62.4), DINC = c(51.4, 
    52.6, 52.1, 52.7, 55.1, 48.8), CFO = c(90.9, 92.1, 90.9, 
    90.9, 91.1, 90.7), RDINC = c(68.5, 69.6, 70.2, 71.9, 75.2, 
    68.3), RFP = c(877L, 899L, 883L, 884L, 895L, 874L)), .Names = c("YEAR....PBE", 
"CBE", "PPO", "CPO", "PFO", "DINC", "CFO", "RDINC", "RFP"), row.names = c(NA, 
6L), class = "data.frame")

I want to create a multiple linear regression model for PBE depending on the other variables. Following the tutorial in this link, I think I should use something like the following code:

> lm(formula = PBE ~ CBE + PBO + CPO + PFO + 
+        DINC + CFO+RDINC+RFP+YEAR, data = beef)

Error in eval(expr, envir, enclos) : object 'PBE' not found

So I decided to try the following, but both attempts also give errors:

> lm(formula=PBE~YEAR,data=beef)
Error in eval(expr, envir, enclos) : object 'PBE' not found
> lm(formula=beef$PBE~beef$YEAR)
Error in model.frame.default(formula = beef$PBE ~ beef$YEAR, drop.unused.levels = TRUE) : 
  invalid type (NULL) for variable 'beef$PBE'

Can you please give me some insight into where the typo/error lies?

P.S.: I read the file using beef=read.table("beef.txt", header = TRUE, sep = "\t", comment.char="%") and the file looks like the following:

% http://lib.stat.cmu.edu/DASL/Datafiles/agecondat.html
% 
% Datafile Name: Agricultural Economics Studies
% Datafile Subjects: Agriculture , Economics , Consumer
% Story Names: Agricultural Economics Studies
% Reference: F.B. Waugh, Graphic Analysis in Agricultural Economics,
%   Agricultural Handbook No. 128, U.S. Department of Agriculture, 1957.
% Authorization: free use
% Description: Price and consumption per capita of beef and pork
%   annually from 1925 to 1941 together with other variables relevant to
%   an economic analysis of price and/or consumption of beef and pork
%   over the period.
% Number of cases: 17
% Variable Names:
% 
%   PBE = Price of beef (cents/lb)
%   CBE = Consumption of beef per capita (lbs)
%   PPO = Price of pork (cents/lb)
%   CPO = Consumption of pork per capita (lbs)
%   PFO = Retail food price index (1947-1949 = 100)
%   DINC = Disposable income per capita index (1947-1949 = 100)
%   CFO = Food consumption per capita index (1947-1949 = 100)
%   RDINC = Index of real disposable income per capita (1947-1949 = 100)
%   RFP = Retail food price index adjusted by the CPI (1947-1949 = 100)
% 
% The Data:
YEAR    PBE CBE PPO CPO PFO DINC    CFO RDINC   RFP
1925    59.7    58.6    60.5    65.8    65.8    51.4    90.9    68.5    877
1926    59.7    59.4    63.3    63.3    68  52.6    92.1    69.6    899
1927    63  53.7    59.9    66.8    65.5    52.1    90.9    70.2    883
1928    71  48.1    56.3    69.9    64.8    52.7    90.9    71.9    884
1929    71  49  55  68.7    65.6    55.1    91.1    75.2    895
1930    74.2    48.2    59.6    66.1    62.4    48.8    90.7    68.3    874
1931    72.1    47.9    57  67.4    51.4    41.5    90  64  791

Here's the result of View(beef), as suggested by Patrick: [image: screenshot of View(beef)]

Mona Jalal asked May 24 '26 19:05



1 Answer

You need to go back and look at the file that you loaded these data into R from. The output from head() suggests that the first variable is YEAR....PBE and that the PBE data has gotten merged with the YEAR variable, probably because of some issue with the delimiters in use in the file you read in. Go back and check the file carefully.

One way to do this from within R is to use count.fields(), which you pass the filename to check. Do read ?count.fields as you will potentially need to set the sep and quote arguments in order to match the file you read the data from. The function will tell you how many fields (variables) it finds; compare this with the known number of variables.
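
As a minimal sketch of that check, the snippet below writes a small hypothetical file (the temporary file and its three lines are made up for illustration, mimicking the mixed spaces and tabs seen in the question) and counts the fields under two separator settings. Only count.fields(), read.table(), tempfile() and writeLines() from base R are used.

```r
# Hypothetical three-line file: spaces between the first two columns, tabs elsewhere
tmp <- tempfile(fileext = ".txt")
writeLines(c("YEAR    PBE\tCBE\tPPO",
             "1925    59.7\t58.6\t60.5",
             "1926    59.7\t59.4\t63.3"), tmp)

expected <- 4  # known number of variables in this toy file

# Fields per line when only a tab counts as a separator
n_tab <- count.fields(tmp, sep = "\t")
# Fields per line when any run of whitespace counts as a separator
n_ws <- count.fields(tmp, sep = "")

n_tab  # 3 3 3: one fewer than expected, so two columns were merged
n_ws   # 4 4 4: matches the expected count

# Reading with sep = "\t" turns the merged header into a single column name
first_name <- names(read.table(tmp, header = TRUE, sep = "\t"))[1]
first_name  # "YEAR....PBE"
```

If the counts under the separator you used differ from the number of variables you expect, the separator does not match the file.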

From your edit, it is clear that something like what I describe above has happened:

> names(beef)
[1] "YEAR....PBE" "CBE"         "PPO"         "CPO"         "PFO"        
[6] "DINC"        "CFO"         "RDINC"       "RFP"

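It seems the file is not fully tab-delimited: at least some columns are separated by spaces rather than tabs. I was able to read the bit of data you included by setting sep = "", which treats any run of whitespace (spaces or tabs) as a separator: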

beef <- read.table("file.name", header = TRUE, sep = "", comment.char = "%")

> head(beef)
  YEAR  PBE  CBE  PPO  CPO  PFO DINC  CFO RDINC RFP
1 1925 59.7 58.6 60.5 65.8 65.8 51.4 90.9  68.5 877
2 1926 59.7 59.4 63.3 63.3 68.0 52.6 92.1  69.6 899
3 1927 63.0 53.7 59.9 66.8 65.5 52.1 90.9  70.2 883
4 1928 71.0 48.1 56.3 69.9 64.8 52.7 90.9  71.9 884
5 1929 71.0 49.0 55.0 68.7 65.6 55.1 91.1  75.2 895
6 1930 74.2 48.2 59.6 66.1 62.4 48.8 90.7  68.3 874
> str(beef)
'data.frame':   7 obs. of  10 variables:
 $ YEAR : int  1925 1926 1927 1928 1929 1930 1931
 $ PBE  : num  59.7 59.7 63 71 71 74.2 72.1
 $ CBE  : num  58.6 59.4 53.7 48.1 49 48.2 47.9
 $ PPO  : num  60.5 63.3 59.9 56.3 55 59.6 57
 $ CPO  : num  65.8 63.3 66.8 69.9 68.7 66.1 67.4
 $ PFO  : num  65.8 68 65.5 64.8 65.6 62.4 51.4
 $ DINC : num  51.4 52.6 52.1 52.7 55.1 48.8 41.5
 $ CFO  : num  90.9 92.1 90.9 90.9 91.1 90.7 90
 $ RDINC: num  68.5 69.6 70.2 71.9 75.2 68.3 64
 $ RFP  : int  877 899 883 884 895 874 791
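
Once PBE exists as its own column, the formula interface should find it. The following is only a sketch, assuming the seven rows shown in the question (pasted inline here via text = rather than read from the file) and a reduced two-predictor model chosen for illustration; seven rows are too few to estimate all nine predictors, whereas the full file has 17 cases. Note also that the formula in the question writes PBO where the data has PPO, which would raise its own "not found" error.

```r
# The seven data rows shown in the question, with whitespace normalised
txt <- "YEAR PBE CBE PPO CPO PFO DINC CFO RDINC RFP
1925 59.7 58.6 60.5 65.8 65.8 51.4 90.9 68.5 877
1926 59.7 59.4 63.3 63.3 68 52.6 92.1 69.6 899
1927 63 53.7 59.9 66.8 65.5 52.1 90.9 70.2 883
1928 71 48.1 56.3 69.9 64.8 52.7 90.9 71.9 884
1929 71 49 55 68.7 65.6 55.1 91.1 75.2 895
1930 74.2 48.2 59.6 66.1 62.4 48.8 90.7 68.3 874
1931 72.1 47.9 57 67.4 51.4 41.5 90 64 791"

# sep = "" splits on any run of whitespace
beef <- read.table(text = txt, header = TRUE, sep = "")

# Confirm the response variable is now a separate column
stopifnot("PBE" %in% names(beef))

# Reduced model for illustration only (PPO, not PBO)
fit <- lm(PBE ~ CBE + PPO, data = beef)
coef(fit)
```

With the complete 17-row file read the same way, the remaining predictors can be added to the right-hand side of the formula.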
Gavin Simpson answered May 26 '26 08:05



