Generate Sample with sample() Function in R

The sample() function in R generates a sample of the specified size from a data set or a vector of elements, either with or without replacement. It can be used to sample numeric and character vectors as well as the rows of a dataframe. Let's see examples of:

  • sample of a numeric and character vector using the sample() function in R
  • sample of a dataframe using the sample() function in R
  • sample of rows using the slice_sample() function in R
  • sample from each group using the slice_sample() and group_by() functions in R


Syntax for sample() Function in R:

sample(x, size, replace = FALSE, prob = NULL)

  • x: a vector of one or more elements (for a data set, typically its row indexes) from which the sample is to be chosen
  • size: the number of items to draw
  • replace: should sampling be with replacement? The default is FALSE
  • prob: a vector of probability weights for the elements of the vector being sampled
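
The prob argument is not used elsewhere in this tutorial, so here is a minimal sketch of weighted sampling. The unfair coin and its weights (0.8 and 0.2) are made up purely for illustration.

```r
# Hypothetical example: 1000 flips of an unfair coin
set.seed(42)
flips <- sample(c("H", "T"), size = 1000, replace = TRUE, prob = c(0.8, 0.2))

# Share of each outcome; it should be close to the supplied weights
prop.table(table(flips))
```

Note that replace = TRUE is needed here because we draw more values than the vector contains.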

Sample function in R with replacement:

Let's see an example that draws a random sample of 10 values from the vector 1 to 20 with replace = TRUE, which means a value can occur more than once in the sample.

## basic Sample function in R

sample(1:20, 10, replace=TRUE)

Because no seed is set, the output changes on every run. One possible output is

[1]  6  8  12  19  5  18  19  14  13  2
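
sample() works the same way on a character vector. The sketch below uses a made-up vector of fruit names (an assumption for illustration, not data from this tutorial) and draws 3 of them without replacement.

```r
# Hypothetical character vector
fruits <- c("apple", "banana", "cherry", "mango", "orange")

# Draw 3 distinct elements at random
picked <- sample(fruits, 3)
picked
```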

 

Sample Function in R with set.seed(): 

If we run the sample() function again and again, it gives a different set of samples each time. What if we require the same sample each time?
For that we use the set.seed() function. Calling set.seed() with the same seed before sampling makes R generate the same pseudo-random sequence every time.

## Sample function in R with set.seed()

set.seed(10)
sample(1:20, 10, replace=TRUE)

Whenever we call set.seed() with the same seed right before sample(), we get the same set of samples. The output below is for set.seed(10); the exact values depend on the R version, because R 3.6.0 changed the default sampling algorithm.

[1] 11  7  9  14  2  5  6  6  13  9
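
A quick way to convince yourself of this is to reset the seed and compare two draws. This is a small sketch; the variable names are only illustrative.

```r
set.seed(10)
first_draw <- sample(1:20, 10, replace = TRUE)

# Reset the seed to the same value and draw again
set.seed(10)
second_draw <- sample(1:20, 10, replace = TRUE)

# TRUE: both draws contain the same values in the same order
identical(first_draw, second_draw)
```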

 

 

Sample function with no Replacement:

By passing the argument replace = FALSE (which is also the default), we can draw a sample without replacement, so no value appears more than once.

## Sample function in R with No Replacement

sample(1:20, 10, replace=FALSE)

The output contains no repeated values, for example:

[1]   14  11  3  19  6  7  1  4  5 10
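
The sketch below checks that a draw without replacement has no duplicates, and shows what happens when we ask for more values than the vector holds. The tryCatch() wrapper and its message are only there so the example keeps running after the error.

```r
draw <- sample(1:20, 10, replace = FALSE)

# anyDuplicated() returns 0 when no value is repeated
anyDuplicated(draw)

# Asking for 25 values out of 20 fails without replacement
result <- tryCatch(
  sample(1:20, 25, replace = FALSE),
  error = function(e) "error: sample larger than population"
)
result
```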

Sample Function in R with dataset:

Let’s extract a set of sample rows from a data set with the help of the sample() function in R. We will use the built-in mtcars table.

## applying sample() to the mtcars table to extract 5 sample rows

set.seed(123)
index <- sample(1:nrow(mtcars), 5)  # draw 5 row indexes without replacement
index
mtcars[index, ]                     # fetch the sampled rows

When we execute the above code:

  • the sample() function returns 5 row indexes, which are stored in the index object
  • those 5 indexes are then passed to mtcars to fetch the corresponding 5 rows

So the index output will be similar to the following (the exact values depend on the R version):

[1] 10  25  13  26  27

The last line, mtcars[index, ], prints the 5 sampled rows of mtcars. As we have set the seed, it gives the same set of 5 rows every time the code is run.
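
If this pattern is needed often, the two steps can be wrapped in a small function. sample_rows() below is a hypothetical helper written for this illustration, not a base R function.

```r
# Hypothetical helper: return n randomly chosen rows of a dataframe
sample_rows <- function(df, n, replace = FALSE) {
  idx <- sample(seq_len(nrow(df)), size = n, replace = replace)
  df[idx, , drop = FALSE]
}

set.seed(123)
cars_sample <- sample_rows(mtcars, 5)
nrow(cars_sample)
```

seq_len(nrow(df)) is used instead of 1:nrow(df) so that the index vector is empty, rather than c(1, 0), if the dataframe has no rows.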

 

 

Sample Function in R with dataset with replacement:


Let’s extract a set of sample rows from a data set with replacement, with the help of the sample() function. We will use the built-in iris table.

## applying sample() with replacement to the iris table

set.seed(123)
index <- sample(1:nrow(iris), 10, replace = TRUE)  # indexes may repeat
index
iris[index, ]  # index was drawn from iris, so it must subset iris, not mtcars

As a result we get a sample of 10 rows from the iris dataframe using the sample() function with replacement, so the resultant sample may contain repeated rows.
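
With only 10 draws out of 150 rows, repeats are possible but not guaranteed. The sketch below instead draws 200 rows, which is more than iris contains, so at least one row must repeat. The sample size of 200 is chosen only to make that point.

```r
set.seed(123)
index <- sample(1:nrow(iris), 200, replace = TRUE)
iris_sample <- iris[index, ]

# TRUE: with 200 draws from 150 rows, some index must occur twice
any(duplicated(index))
nrow(iris_sample)
```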

 


Sampling with slice_sample() function in R:

Let's use the mtcars dataframe to demonstrate sampling with the slice_sample() function from the dplyr package.

slice_sample() function in R: 

The slice_sample() function returns n randomly selected rows of the dataframe, as shown below.

 
# slice_sample() function in R

library(dplyr) 
mtcars %>% slice_sample(n = 5)

In the above example we select 5 samples, so 5 randomly chosen rows are returned.
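
slice_sample() also accepts a prop argument to sample a fraction of the rows, and a replace argument for sampling with replacement. The sketch below assumes dplyr 1.0.0 or later is installed; the fraction 0.25 and the size 40 are arbitrary illustration values.

```r
library(dplyr)

set.seed(123)

# 25% of the 32 rows of mtcars, i.e. 8 rows
quarter <- mtcars %>% slice_sample(prop = 0.25)
nrow(quarter)

# 40 rows drawn with replacement, more than mtcars contains
boot <- mtcars %>% slice_sample(n = 40, replace = TRUE)
nrow(boot)
```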


 

slice_sample() by group in R:

Combining slice_sample() with group_by() returns n sample rows from each group.

 

# slice_sample() by group in R 
mtcars %>% group_by(vs) %>% slice_sample(n = 2)

In the above example we select 2 samples for vs = 0 and 2 samples for vs = 1 using the slice_sample() and group_by() functions.
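
To confirm that each group contributes the same number of rows, we can count the rows per group after sampling. This is a small sketch that assumes dplyr is installed; grouped_sample is just an illustrative name.

```r
library(dplyr)

set.seed(123)
grouped_sample <- mtcars %>%
  group_by(vs) %>%
  slice_sample(n = 2) %>%
  ungroup()

# One row per value of vs, each with n = 2
grouped_sample %>% count(vs)
```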




Author

  • Sridhar Venkatachalam

    Has close to 10 years of experience in data science and machine learning and has worked extensively with programming languages such as R, Python (Pandas), SAS, and PySpark.