
Is this what rnorm(x) does if x is a vector, and how could I have found out faster?

Q: Does rnorm return a vector?

Yes. rnorm is a function, and the value it returns is a numeric vector of normally distributed pseudo-random numbers.

Q: What does rnorm mean?

rnorm is the R function that simulates random variates having a specified normal distribution. As with pnorm, qnorm, and dnorm, optional arguments specify the mean and standard deviation of the distribution.
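
To make the optional arguments concrete, here is a minimal sketch. The values 10 and 2 and the variable names are illustrative assumptions, not taken from the text above:

```r
set.seed(1)  # arbitrary seed so the draws are reproducible

# 1000 pseudo-random draws from a normal distribution with mean 10, sd 2
draws <- rnorm(1000, mean = 10, sd = 2)

# The related functions take the same optional mean and sd arguments
density_at_mean <- dnorm(10, mean = 10, sd = 2)   # density at the mean
prob_below_mean <- pnorm(10, mean = 10, sd = 2)   # cumulative probability
median_value    <- qnorm(0.5, mean = 10, sd = 2)  # quantile function

print(c(mean = mean(draws), sd = sd(draws)))      # should be near 10 and 2
```

With the defaults (mean = 0, sd = 1), rnorm(1000) would draw from the standard normal distribution instead.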

Q: How does rnorm work in R?

The rnorm() function in R generates pseudo-random numbers from a normal (bell curve) distribution; that is, it simulates random variates having a specified normal distribution.

Q: Can a vector have one value in R?

In R, a single value is always a vector of length 1; there is no special object for single values.
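
A short sketch of what this means in practice; the variable names are hypothetical and chosen only for illustration:

```r
single <- 42                                # a "scalar" literal

is_vec     <- is.vector(single)             # it is already a vector
len        <- length(single)                # of length 1
same_as_c  <- identical(single, c(42))      # c() around one value changes nothing
first_elem <- single[1]                     # so indexing works as on any vector

one_draw_length <- length(rnorm(1))         # rnorm(1) is also a length-1 vector
```

This is why an argument such as n in rnorm(n) can be handed either one number or a longer vector: both are vectors, differing only in length.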

Tags:

r

I’m looking for R resources, and I started with “An Introduction to R” at r-project.org. I got stumped immediately.

I think I've figured out what’s going on, and my question is basically

Are there resources to help me figure out something like this more easily?

The preface of the Introduction to R suggests starting with the introductory session in Appendix A, and right at the start is this code and remark.

x <- rnorm(50)
y <- rnorm(x)

Generate two pseudo-random normal vectors of x- and y-coordinates.

The documentation says the (first and only non-optional) parameter to rnorm is the length of the result vector. So x <- rnorm(50) produces a vector of 50 random values from a normal distribution with mean 0 and standard deviation 1.

So far so good. But why does rnorm(x) seem to do what y <- rnorm(50) or y <- rnorm(length(x)) would have done? Either of these alternatives seems clearer to me.

My guess as to what happens is this:

1. The wrapper for rnorm didn’t care what kind of thing x is and just passed to the underlying C function a pointer to the C struct for x as an R object.
2. R objects represented in C are structs followed by “data”; the data of the C representation of an R vector of reals starts with two integers, the first of which is the vector's length. (The vector elements follow those integers.) I found this out by reading up on R internals.
3. If a C function were written to find the value of an R integer from a passed pointer-to-R-object, and it were called with a pointer to an R vector of reals, it would find the vector’s length in the place it would look for the single integer.

In addition to my main question of “How can I figure out something like this more easily?”, I wouldn’t mind knowing whether what I think is going on is correct, and whether rnorm(x) is idiomatic R in this context or more of a sloppy choice. Given that it does something useful, can it be relied upon, or is it just lucky behavior for an expression that isn’t well-defined in R?

I’m used to strongly-typed languages like C or SQL, which have easier-to-follow (for me) semantics and which also have more comprehensive references available, so any references for R that have a programming-language-theory focus or are aimed at people used to strong typing would be good, too.


asked May 18 '18 21:05

Steve Kass

1 Answer

It is documented behavior. From ?rnorm:

Usage: [...]

 rnorm(n, mean = 0, sd = 1)

Arguments: [...]

   n: number of observations. If ‘length(n) > 1’, the length is
      taken to be the number required.
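
The documented rule can be checked directly. A minimal sketch, assuming a default R session; the seed and the variable names other than x and y are illustrative:

```r
set.seed(42)               # arbitrary seed; only the lengths matter here

x <- rnorm(50)             # length(n) == 1, so n itself is the count
y <- rnorm(x)              # length(n) > 1, so length(x) is the count
z <- rnorm(length(x))      # the more explicit spelling of the same request

lengths <- c(x = length(x), y = length(y), z = length(z))
print(lengths)

# Inspecting the signature without leaving the console; ?rnorm opens the
# help page quoted above when run interactively.
arg_names <- names(formals(rnorm))
print(arg_names)
```

The values stored in x play no role in the call rnorm(x); only its length is used, which is why rnorm(length(x)) requests the same number of draws while stating the intent more explicitly.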

answered Sep 22 '22 06:09

Ralf Stubner
