Extract First N and Last N Characters in PySpark

To extract the first N and last N characters of a column in PySpark, we will use the substr() function. This tutorial shows how to extract the first N characters from the left and the last N characters from the right. Let’s see how to

  • Extract First N characters in PySpark – First N characters from left
  • Extract Last N characters in PySpark – Last N characters from right

with an example for each.

We will be using a DataFrame named df_states, which has a state_name column.


Extract First N characters in PySpark – First N characters from left

The first N characters of a column in PySpark are obtained with the substr() function, which takes a start position and a length.

########## Extract first N characters from left in PySpark

df = df_states.withColumn("first_n_char", df_states.state_name.substr(1,6))
df.show()

Here substr(1,6) starts at position 1 and takes 6 characters, so the new first_n_char column holds the first 6 characters of each state_name.
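Positions in substr() are 1-based, which differs from Python's 0-based slicing. The plain-Python sketch below mirrors that arithmetic for positive start positions only. The helper name substr_like is hypothetical and is not part of PySpark; it is meant as a mental model rather than a description of Spark's implementation.

```python
def substr_like(s: str, start_pos: int, length: int) -> str:
    """Plain-Python model of a 1-based substr for start_pos >= 1 (illustrative only)."""
    if start_pos < 1:
        raise ValueError("this sketch only models positive, 1-based positions")
    begin = start_pos - 1        # convert the 1-based position to a 0-based index
    length = max(length, 0)      # treat a negative length as empty
    return s[begin:begin + length]


print(substr_like("Alabama", 1, 6))  # same arguments as substr(1,6) above
print(substr_like("Alabama", 3, 2))  # start at the third character, take two
```

In slicing terms, the call substr(1,6) corresponds to s[0:6] on each value of the column.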


Extract Last N characters in PySpark – Last N characters from right

The last N characters of a column in PySpark are also obtained with the substr() function, by passing a negative value as the first argument so that the start position is counted from the end of the string, as shown below

########## Extract last N characters from right in PySpark

df = df_states.withColumn("last_n_char", df_states.state_name.substr(-2,2))
df.show()

Here substr(-2,2) starts 2 characters before the end and takes 2 characters, so the new last_n_char column holds the last 2 characters of each state_name.
