Stratified Random Sampling in R – Dataframe

Stratified Random Sampling in R: In stratified sampling, every member of the population is grouped into homogeneous subgroups before sampling. Each subgroup is called a stratum (plural: strata). One or more representatives are then chosen at random from each stratum; this is stratified random sampling. Let's see how to do it in R.
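The idea can be sketched in base R before turning to any package. This is only an illustration under stated assumptions: it uses the built-in iris data with Species as the stratum, draws a single representative per stratum, and the seed value and the names strata, picked and one_per_stratum are arbitrary choices made for this example.

```r
# Base R sketch: one random row per stratum of the built-in iris data.
set.seed(42)  # arbitrary seed, only so the draw can be repeated

# Step 1: group the population into strata (one data frame per Species).
strata <- split(iris, iris$Species)

# Step 2: choose one row at random within each stratum.
picked <- lapply(strata, function(s) s[sample(nrow(s), 1), , drop = FALSE])

# Step 3: stack the chosen rows back into a single data frame.
one_per_stratum <- do.call(rbind, picked)
one_per_stratum
```

Every Species appears exactly once in one_per_stratum, which a simple random sample of three rows from the whole data frame would not guarantee.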


Stratified random sampling of a dataframe in R:

The sample_n() function, together with group_by(), is used to draw a stratified random sample from a dataframe in R, as shown below. We use the built-in iris dataset, with Species as the stratum.

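# Stratified random sampling in R

library(dplyr)  # provides %>%, group_by() and sample_n()

# Group by the stratum column, then draw 3 random rows from each group
sample_iris <- iris %>%
  group_by(Species) %>%
  sample_n(3)
sample_iris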

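Three rows are selected at random from each Species (each stratum), so the result has 9 rows in total. Because the sample is random, the rows selected will differ from run to run: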

[Figure: Stratified random sampling in R 2]
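As a variation on the example above, the sketch below fixes the random seed so the same rows come back on every run, and also draws a proportional sample from each stratum. It assumes dplyr 1.0.0 or later, where slice_sample() is available and sample_n() is marked as superseded. The seed value and the names fixed_sample and prop_sample are arbitrary choices for this illustration.

```r
library(dplyr)

set.seed(123)  # arbitrary seed so the same rows are drawn on every run

# Fixed size per stratum: 3 rows from each Species, as with sample_n(3).
fixed_sample <- iris %>%
  group_by(Species) %>%
  slice_sample(n = 3) %>%
  ungroup()

# Proportional size per stratum: 10% of each Species (5 of 50 rows each).
prop_sample <- iris %>%
  group_by(Species) %>%
  slice_sample(prop = 0.1) %>%
  ungroup()

count(fixed_sample, Species)  # 3 rows per Species
count(prop_sample, Species)   # 5 rows per Species
```

A fixed n gives every stratum equal weight in the sample, while prop keeps each stratum's share the same as in the full data. For iris the two differ only in size, because every Species has 50 rows.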

                                                                                                           

Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, he has worked extensively with programming languages such as R, Python (Pandas), SAS, and PySpark.