Select Random Samples in R using Dplyr – (sample_n() and sample_frac())

sample_n() and sample_frac() are the functions used to select random samples in R using the dplyr package. sample_n() selects n random rows from a data frame, and sample_frac() returns a random fraction (percentage) of its rows. Note that R is case-sensitive, so the function names must be written in lowercase.

  • select n random rows from a data frame in R using the sample_n() function
  • select a random percentage of rows from a data frame in R using the sample_frac() function
  • select n random rows from a data frame in R using the slice_sample() function
  • select random rows within each group using the slice_sample() and group_by() functions in R


We will be using the built-in mtcars data set to demonstrate the above functions.


 

sample_n() Function in Dplyr  : select random samples in R using Dplyr

The sample_n() function selects random rows from a data frame (or table). The first argument is the data frame, and the second argument is the number of rows to select.

library(dplyr)
mydata <- mtcars

# select 4 random rows of the data frame
sample_n(mydata, 4)

In the above code, the sample_n() function selects 4 random rows of the mtcars dataset. Because the rows are drawn at random, the output differs from run to run.
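Since sample_n() draws rows at random, calling set.seed() first makes the selection repeatable. The following is a minimal sketch of that idea; the seed value 1 and the names first_draw and second_draw are arbitrary choices made for illustration.

```r
library(dplyr)

# Fix the random seed, then draw 4 random rows
set.seed(1)
first_draw <- sample_n(mtcars, 4)

# Reset the same seed and draw again
set.seed(1)
second_draw <- sample_n(mtcars, 4)

# Check whether both draws picked exactly the same rows
identical(first_draw, second_draw)
```

Resetting the same seed before each call should give the same 4 rows both times, which is useful when a result has to be reproduced later.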

 

sample_frac() Function in Dplyr : select random samples in R using Dplyr

The sample_frac() function selects a random percentage of rows from a data frame (or table). The first argument is the data frame, and the second argument is the share of rows to select, given as a proportion (for example, 0.2 for 20 percent).

library(dplyr)

mydata <- mtcars

# select a random 20 percent of the rows of the data frame
sample_frac(mydata, 0.2)

In the above code, the sample_frac() function selects a random 20 percent of the rows of the mtcars dataset. mtcars has 32 rows, so 6 rows are returned.
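To see how the fraction translates into a row count, the sketch below checks the size of the result with nrow(). It also shows the replace argument of sample_frac(), which allows a fraction above 1 by sampling rows with repetition. The names one_fifth and oversampled and the seed value are illustrative only.

```r
library(dplyr)

set.seed(42)

# 20 percent of the 32 rows in mtcars
one_fifth <- sample_frac(mtcars, 0.2)
nrow(one_fifth)

# With replace = TRUE the fraction may exceed 1; rows can then repeat
oversampled <- sample_frac(mtcars, 1.5, replace = TRUE)
nrow(oversampled)
```

Without replace = TRUE, a fraction above 1 is an error, because there are not enough distinct rows to draw from.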

 


Sampling with the slice_sample() function in R

slice_sample() function in R 

The slice_sample() function returns n randomly sampled rows of the data frame, as shown below. In dplyr 1.0.0 and later, slice_sample() supersedes sample_n() and sample_frac().

 
# slice_sample() function in R

library(dplyr) 
mtcars %>% slice_sample(n = 5)

In the above example, 5 rows are sampled at random from mtcars and returned.
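slice_sample() can also take a proportion instead of a row count, through its prop argument, and it can sample with replacement through replace. The sketch below assumes dplyr 1.0.0 or later; the names quarter and with_repeats and the seed value are illustrative.

```r
library(dplyr)

set.seed(7)

# Sample a proportion of the rows instead of a fixed count
quarter <- mtcars %>% slice_sample(prop = 0.25)
nrow(quarter)

# Sample more rows than the data frame has by allowing repeats
with_repeats <- mtcars %>% slice_sample(n = 40, replace = TRUE)
nrow(with_repeats)
```

Only one of n and prop should be supplied in a single call.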

 

slice_sample() by group in R

Combined with group_by(), slice_sample() returns n randomly sampled rows from each group.

 

# slice_sample() by group in R 
mtcars %>% group_by(vs) %>% slice_sample(n = 2)

In the above example, we select 2 random rows for vs = 0 and 2 random rows for vs = 1 using the slice_sample() and group_by() functions, for 4 rows in total.
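One way to confirm that the sampling happened within each group is to count the rows per group afterwards. The sketch below does this with count(); the name by_vs and the seed value are illustrative.

```r
library(dplyr)

set.seed(2024)

# Draw 2 random rows from each vs group, then drop the grouping
by_vs <- mtcars %>%
  group_by(vs) %>%
  slice_sample(n = 2) %>%
  ungroup()

# Count how many sampled rows fall in each group
group_sizes <- count(by_vs, vs)
group_sizes
```

Calling ungroup() at the end is a design choice: it keeps later operations on by_vs from being applied per group by accident.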

 


Sample Function in R with dataset with replacement:


Let’s extract a random sample of rows from the data set with replacement, using the base R sample() function. We will use the built-in mtcars table.

## applying the sample() function in R with replacement

set.seed(123)  # fix the seed so the sample is reproducible
index <- sample(1:nrow(mtcars), 10, replace = TRUE)  # 10 row indices, repeats allowed
index
mtcars[index, ]

As a result, we get a sample of 10 rows from the mtcars data frame using the sample() function with replacement, so the resulting sample may contain repeated rows.
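For comparison, the same base R sample() call with replace = FALSE draws each row at most once. The sketch below checks this with anyDuplicated(), which returns 0 when no value is repeated. The names idx_no_repl and rows_no_repl are illustrative.

```r
set.seed(123)

# Draw 10 distinct row indices (no repeats allowed)
idx_no_repl <- sample(nrow(mtcars), 10, replace = FALSE)

# 0 means that no index appears more than once
anyDuplicated(idx_no_repl)

# Use the indices to pull the sampled rows
rows_no_repl <- mtcars[idx_no_repl, ]
nrow(rows_no_repl)
```

Sampling without replacement is the default behaviour of sample(), so replace = FALSE is written out here only to make the contrast explicit.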

 

 

 

 

For more on sampling with the dplyr package, refer to the dplyr documentation.



Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, he has worked extensively with programming languages like R, Python (Pandas), SAS, and PySpark.