Extract a substring of a column in an R dataframe

To extract a substring of a column in R, we can use the base R function substr(), or str_sub() and str_extract() from the stringr package. Let’s see how to:

  • Extract first n characters
  • Extract last n characters
  • Extract first word of the column in R
  • Extract last word of the column in R

Each case is shown with an example. Let’s first create the dataframe:

df1 <- data.frame(
  State = c('Arizona AZ', 'Georgia GG', 'Newyork NY', 'Indiana IN', 'Florida FL'),
  Score = c(62, 47, 55, 74, 31),
  stringsAsFactors = FALSE  # keep State as character (the default since R 4.0)
)
df1

The resulting dataframe has five rows and two columns, State and Score.

 

Extract first n characters of the column in R

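Method 1: use the base R function substr(x, start, stop), where start and stop are 1-based, inclusive positions.

## Method 1 - extract first n characters (here n = 4)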

df1$substring_State = substr(df1$State,1,4)
df1

The dataframe now has a substring_State column with the first four characters of each State value, for example "Ariz" for "Arizona AZ".

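Method 2: use str_sub() from the stringr package, which takes the same start and end positions.

## Method 2 - extract first n characters (here n = 4)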

library(stringr)
df1$substring_State = str_sub(df1$State,1,4) 
df1

The substring_State column is the same as with Method 1: "Ariz", "Geor", "Newy", "Indi", "Flor".
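Both methods above hard-code n = 4. As a minimal sketch, the same idea can be wrapped in a small function so that n becomes a parameter. The helper name first_n() and the new column names are illustrative assumptions, not part of base R or stringr.

```r
# Sample data, same values as df1 in the article
df1 <- data.frame(
  State = c("Arizona AZ", "Georgia GG", "Newyork NY", "Indiana IN", "Florida FL"),
  Score = c(62, 47, 55, 74, 31),
  stringsAsFactors = FALSE
)

# first_n() is a hypothetical helper: returns the first n characters of each string
first_n <- function(x, n) {
  substr(x, 1, n)
}

df1$first_3  <- first_n(df1$State, 3)   # "Ari", "Geo", ...
df1$first_7  <- first_n(df1$State, 7)   # "Arizona", "Georgia", ...
df1$first_20 <- first_n(df1$State, 20)  # n longer than the string: whole string is returned
df1
```

Because substr() is vectorised, the helper works on the whole column at once and needs no loop.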

 

Extract last n characters of the column in R

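A negative start position in str_sub() counts from the end of the string, so -2 selects the last two characters.

# extract last 2 characters of the column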

df1$last_2_string = str_sub(df1$State,-2) 
df1

The new last_2_string column holds the two-letter codes: "AZ", "GG", "NY", "IN", "FL".
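If stringr is not available, the last n characters can also be taken with base R by computing the start position from nchar(). The following is a sketch under that assumption; last_n() is a hypothetical helper name chosen for illustration.

```r
# A few sample values in the same "Name CODE" shape as the State column
states <- c("Arizona AZ", "Georgia GG", "Newyork NY")

# last_n() is a hypothetical helper: returns the last n characters of each string
last_n <- function(x, n) {
  len <- nchar(x)               # length of each string
  substr(x, len - n + 1, len)   # start n characters before the end
}

last_2 <- last_n(states, 2)
last_2
```

For these values the result matches str_sub(states, -2).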

 

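Extract first word of the column in R

To extract the first word, use str_extract() with a regular expression. The pattern \\w+ matches the first run of word characters (letters, digits or underscore), which here is everything before the space.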

# extract first word of the column in R

df1$substring_first <- str_extract(df1$State, "\\w+")
df1

The substring_first column holds the first word of each value: "Arizona", "Georgia", "Newyork", "Indiana", "Florida".
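A base R alternative, shown here only as a sketch, is to delete everything from the first whitespace onward with sub(). Note that this treats a "word" as anything up to the first space, whereas \\w+ stops at the first non-word character, so the two can differ on values that contain punctuation. The variable names are illustrative.

```r
# Sample values; the last one contains a hyphen to show where the approaches differ
states <- c("Arizona AZ", "Georgia GG", "New-York NY")

# Remove the first whitespace character and everything after it
first_word <- sub("\\s.*$", "", states)
first_word
# For "New-York NY" this keeps "New-York", while the pattern \\w+ would stop at the hyphen
```

Which behaviour is preferable depends on whether hyphens and other punctuation should count as part of a word in your data.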

 

Extract last word of the column in R

To extract the last word, anchor the pattern to the end of the string: \\w+$ matches the run of word characters immediately before the end.

# extract last word of the column in R

library(stringr)
df1$substring_last <- str_extract(df1$State,"\\w+$") 
df1

The substring_last column holds the last word of each value: "AZ", "GG", "NY", "IN", "FL".
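The last word can likewise be sketched in base R by removing everything up to and including the last whitespace. The example below first strips surrounding whitespace with trimws(), on the assumption that stray trailing spaces should not count as a word boundary. The variable names are illustrative.

```r
# Sample values; the last one has a trailing space
states <- c("Arizona AZ", "Georgia GG", "Florida FL ")

# Greedy ^.*\\s consumes everything up to the last whitespace, leaving the final word
last_word <- sub("^.*\\s", "", trimws(states))
last_word
```

Without trimws(), the trailing space in the third value would be the last whitespace matched, and the result for that value would be an empty string.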