Replace the missing value of column in R

Missing values (NA) in a column of an R dataframe can be replaced in several ways, for example with zero, with the column mean, or with the column median. In this tutorial we will look at how to:

  • Replace the missing values of a column in R with 0 (zero)
  • Replace the missing values of a column with the mean
  • Replace the missing values of a column with the median

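Let’s first create the dataframe. The Mathematics_score column contains two missing values (NA) so that there is something to replace; the scores themselves are only illustrative.

df1 = data.frame(State = c('Arizona AZ','Georgia GA', 'New York NY','Indiana IN','Florida FL'), 
                 Mathematics_score = c(62,47,NA,74,NA))
df1

So the dataframe has five rows, with NA in Mathematics_score for 'New York NY' and 'Florida FL'.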
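Before replacing anything, it can help to check where the missing values are. The following is a minimal sketch, assuming a hypothetical dataframe named scores with made-up values; is.na(), sum() and colSums() are base R functions.

```r
# Hypothetical example data (names and values are made up for illustration)
scores <- data.frame(
  Name = c("A", "B", "C", "D"),
  Mathematics_score = c(62, NA, 55, NA),
  Science_score = c(NA, 47, 74, 31)
)

# Logical vector: TRUE where Mathematics_score is missing
missing_flags <- is.na(scores$Mathematics_score)

# Number of missing values in one column
na_in_maths <- sum(missing_flags)

# Number of missing values in every column
# (is.na() on a dataframe returns a logical matrix)
na_per_column <- colSums(is.na(scores))

print(na_in_maths)
print(na_per_column)
```

Counting the NAs first also gives a quick way to confirm afterwards that a replacement filled everything it was meant to.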

 

Replace missing value of the column with zero (0):

Replace the missing values of the Mathematics_score column with zero. is.na() marks the NA entries, and the assignment overwrites only those entries.

### replace missing value with zero

df1$Mathematics_score[is.na(df1$Mathematics_score)] <- 0

The resultant dataframe has 0 in place of every NA in the Mathematics_score column.
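The same zero replacement can also be written with base R's replace(), or applied to a whole dataframe at once. This is a sketch under assumptions: the dataframe scores and its values are hypothetical, and filling every column with 0 only makes sense when 0 is a valid value for all of them.

```r
# Hypothetical example data (names and values are made up for illustration)
scores <- data.frame(
  Name = c("A", "B", "C"),
  Mathematics_score = c(62, NA, 55),
  Science_score = c(NA, 47, NA)
)

# Single column: replace() returns a copy with the selected entries changed
scores$Mathematics_score <- replace(scores$Mathematics_score,
                                    is.na(scores$Mathematics_score), 0)

# Whole dataframe: is.na(scores) is a logical matrix marking every NA cell
scores[is.na(scores)] <- 0

print(scores)
```

The whole-dataframe form is convenient but blunt; for columns where zero would distort later statistics, a per-column replacement is usually the safer choice.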

 

Replace missing value of the column with mean:

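Replace the missing values of the Mathematics_score column with the column mean. The argument na.rm = TRUE makes mean() ignore the NA entries; without it the mean itself would be NA. Run this on a freshly created df1, because once the NAs have been replaced with zero there is nothing left to fill.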

### replace missing value of column with mean

df1$Mathematics_score[is.na(df1$Mathematics_score)] <- mean(df1$Mathematics_score, na.rm = TRUE)

The resultant dataframe has every NA in the Mathematics_score column replaced by the mean of the non-missing scores.
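When several numeric columns need the same treatment, the mean replacement can be wrapped in a small function. The sketch below assumes a hypothetical helper fill_na_mean() and a made-up dataframe scores; neither comes from a package.

```r
# Hypothetical helper: fill the NAs of a numeric vector with its own mean
fill_na_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Hypothetical example data (names and values are made up for illustration)
scores <- data.frame(
  Name = c("A", "B", "C", "D"),
  Mathematics_score = c(62, NA, 55, 75),
  Science_score = c(40, 50, NA, NA)
)

# Apply the helper to the numeric columns only, leaving Name untouched
numeric_cols <- sapply(scores, is.numeric)
scores[numeric_cols] <- lapply(scores[numeric_cols], fill_na_mean)

print(scores)
```

Each column is filled with its own mean, so the means of the columns are unchanged by the replacement, although their spread becomes smaller.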

 

Replace missing value of the column with median:

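Replace the missing values of the Mathematics_score column with the column median. As with the mean, na.rm = TRUE is needed so that median() ignores the NA entries, and the code should be run on a freshly created df1 that still contains NAs.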

### replace missing value of column with median

df1$Mathematics_score[is.na(df1$Mathematics_score)] <- median(df1$Mathematics_score, na.rm = TRUE)

The resultant dataframe has every NA in the Mathematics_score column replaced by the median of the non-missing scores.
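One reason to prefer the median over the mean is that it is not pulled around by extreme values. The following sketch uses a made-up vector x with one outlier to illustrate the difference; the numbers are assumptions chosen for the example.

```r
# Hypothetical scores with one extreme value and two missing entries
x <- c(40, 45, 50, 1000, NA, NA)

mean_value   <- mean(x, na.rm = TRUE)    # 283.75, pulled upward by 1000
median_value <- median(x, na.rm = TRUE)  # 47.5, midpoint of 45 and 50

# Fill the missing entries with the median
filled_median <- x
filled_median[is.na(filled_median)] <- median_value

print(filled_median)
```

With the mean, both gaps would have been filled with 283.75, a value far from any typical score in this vector.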

                                                                                                   

Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, he has worked extensively with programming languages such as R, Python (Pandas), SAS, and PySpark.