Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fitting a normal distribution in R

I'm using the following code to fit the normal distribution. The link for the dataset for "b" (too large to post directly) is :

link for b

setwd("xxxxxx")
library(fitdistrplus)

require(MASS)
tazur <-read.csv("b", header= TRUE, sep=",")
claims<-tazur$b
a<-log(claims)
plot(hist(a))

enter image description here

After plotting the histogram, it seems a normal distribution should fit well.

f1n <- fitdistr(claims,"normal")
summary(f1n)

#Length Class  Mode   
#estimate 2      -none- numeric
#sd       2      -none- numeric
#vcov     4      -none- numeric
#n        1      -none- numeric
#loglik   1      -none- numeric

plot(f1n)

Error in xy.coords(x, y, xlabel, ylabel, log) :

'x' is a list, but does not have components 'x' and 'y'

I get the above error when I try to plot the fitted distribution, and even the summary statistics are off for f1n.

Would appreciate any help.

like image 911
zkhan Avatar asked Oct 10 '16 15:10

zkhan


People also ask

How do you fit a distribution in R?

FITTING DISTRIBUTIONS IN R We can use the function plotdist(data) to obtain the histogram and the cummulative distribution graph of teh data. Exercise: Try to simulate 10^5 observations from the most known probability distributions you know and plot their Empirical density and Cummulative distribution.

How do I sample a normal distribution in R?

To generate a sample of size 100 from a standard normal distribution (with mean 0 and standard deviation 1) we use the rnorm function. We only have to supply the n (sample size) argument since mean 0 and standard deviation 1 are the default values for the mean and stdev arguments.

How do you fit data in R?

To fit a curve to some data frame in the R Language we first visualize the data with the help of a basic scatter plot. In the R language, we can create a basic scatter plot by using the plot() function. where, df: determines the data frame to be used.

What does fitting to a normal distribution mean?

To fit a normal distribution we need to know the mean and the standard deviation. Remember that the mean of a binomial distribution is μ = np, and that the standard deviation for that distribution is σ = np(1− p). The normal distribution is continuous, whereas the binomial distribution is discrete.


1 Answers

Looks like you are making confusion between MASS::fitdistr and fitdistrplus::fitdist.

  • MASS::fitdistr returns object of class "fitdistr", and there is no plot method for this. So you need to extract estimated parameters and plot the estimated density curve yourself.
  • I don't know why you load package fitdistrplus, because your function call clearly shows you are using MASS. Anyway, fitdistrplus has function fitdist which returns object of class "fitdist". There is plot method for this class, but it won't work for "fitdistr" returned by MASS.

I will show you how to work with both packages.

## reproducible example
set.seed(0); x <- rnorm(500)

Using MASS::fitdistr

No plot method is available, so do it ourselves.

library(MASS)
fit <- fitdistr(x, "normal")
class(fit)
# [1] "fitdistr"

para <- fit$estimate
#         mean            sd 
#-0.0002000485  0.9886248515 

hist(x, prob = TRUE)
curve(dnorm(x, para[1], para[2]), col = 2, add = TRUE)

enter image description here


Using fitdistrplus::fitdist

library(fitdistrplus)
FIT <- fitdist(x, "norm")    ## note: it is "norm" not "normal"
class(FIT)
# [1] "fitdist"

plot(FIT)    ## use method `plot.fitdist`

enter image description here

like image 175
Zheyuan Li Avatar answered Sep 22 '22 23:09

Zheyuan Li