How do I assign a random seed to the dplyr sample_n function?

This is the "sample_n" from dplyr in R.

For reproducibility, I should set a seed so that someone else can get my exact results.

Is there a built-in way to set the seed for "sample_n"? Is this something that I do in the environment and "sample_n" responds to it?

Seeding is not built into the "sample_n" function itself. Two alternatives:

  • The base R "set.seed" function sets the seed for the whole session [1]
  • The 'withr' package provides "with_seed", which runs a piece of code under a given seed [2]
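
As a rough sketch of the second option, withr::with_seed can wrap a single sample_n call so that the seed applies only to that call. This assumes both dplyr and withr are installed; the seed value 123 and the variable names are arbitrary choices for illustration.

```r
library(dplyr)
library(withr)

# with_seed() evaluates the expression under the given seed and then
# restores the session's previous random number generator state.
# The seed 123 is an arbitrary value picked for this illustration.
first_draw  <- with_seed(123, sample_n(mtcars, 10))
second_draw <- with_seed(123, sample_n(mtcars, 10))

# Both draws used the same seed, so they should contain the same rows.
identical(first_draw, second_draw)
```

Because the seed is scoped to the wrapped expression, other random draws in the same script are not affected by it.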


2 Answers

The dplyr::sample_n documentation states:

This is a wrapper around sample.int() to make it easy to select random rows from a table. It currently only works for local tbls.

So sample_n calls sample.int behind the scenes. That means it uses R's standard random number generator, and you can call set.seed beforehand for reproducibility.
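
The same mechanism can be seen directly with sample.int in base R. This is only a sketch: the seed 42 and the variable names are arbitrary, and 32 is simply the number of rows in mtcars.

```r
# Draw 10 row indices out of 32 (the number of rows in mtcars).
# The seed value 42 is arbitrary.
set.seed(42)
idx_a <- sample.int(32, 10)

# Resetting the same seed replays the same sequence of random numbers.
set.seed(42)
idx_b <- sample.int(32, 10)

# Compare the two index vectors.
identical(idx_a, idx_b)
```

Since sample_n picks its rows through this kind of call, seeding before sample_n has the same effect.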

Does this example help? In it, I am using set.seed and the mtcars dataset.

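library(dplyr)

set.seed(2)  # seed value chosen arbitrarily for illustration
x <- mtcars
sample_n(x, 10)

sample_n(x, 10)  # without set.seed(): a different sample

set.seed(2)  # same seed again
x <- mtcars
sample_n(x, 10)  # same rows as the first call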
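
Rather than comparing the printed tables by eye, the draws can also be compared programmatically. A minimal sketch, assuming dplyr is installed; the seed 2 and the variable names are arbitrary:

```r
library(dplyr)

set.seed(2)                       # arbitrary seed for illustration
seeded_1 <- sample_n(mtcars, 10)
unseeded <- sample_n(mtcars, 10)  # continues the random stream, no reseeding

set.seed(2)                       # reset to the same seed
seeded_2 <- sample_n(mtcars, 10)

identical(seeded_1, seeded_2)     # the two seeded draws match
identical(seeded_1, unseeded)     # almost certainly a different sample
```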
