Left and Right pad of column in pyspark – lpad() & rpad()

To add padding to the left side of a column in pyspark we use the lpad() function, and to add padding to the right side we use the rpad() function. Let’s see how to

  • Left pad a column in pyspark – lpad()
  • Right pad a column in pyspark – rpad()

We will be using the dataframe df_states, which has a state_name column.

 

 

Add left pad of the column in pyspark

Left padding is done with the lpad() function, which takes a column, a target length and a padding string as arguments. Here we pad the state_name column with “#” on the left until each value is 14 characters long. Values that are already longer than 14 characters are truncated to 14 characters.

### Add Left pad of the column in pyspark
from pyspark.sql.functions import lpad

# Pad state_name on the left with '#' up to 14 characters
df_states = df_states.withColumn('states_Name_new', lpad(df_states.state_name, 14, '#'))
df_states.show(truncate=False)

The resulting dataframe has a new column, states_Name_new, in which every state name shorter than 14 characters is left-padded with “#”.
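To make the padding rule concrete without a Spark session, here is a minimal plain-Python sketch of what lpad() does to a single non-null value. The name lpad_sketch and the sample state names are hypothetical and used only for illustration, and the sketch assumes a non-negative length.

```python
def lpad_sketch(value: str, length: int, pad: str) -> str:
    """Plain-Python approximation of lpad() for one non-null string."""
    if len(value) >= length or not pad:
        # Values that are already long enough are cut to `length` characters.
        return value[:length]
    missing = length - len(value)
    # Repeat the pad string and trim it to exactly the missing width.
    filler = (pad * missing)[:missing]
    return filler + value


# Hypothetical sample values, padded to 14 characters with '#'.
for name in ["Alabama", "North Carolina", "District of Columbia"]:
    print(repr(lpad_sketch(name, 14, "#")))
# '#######Alabama'
# 'North Carolina'
# 'District of Co'
```

The last value is cut to 14 characters, so the length argument should be at least as large as the longest value you want to keep intact.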

 

 

Add right pad of the column in pyspark

Right padding is done with the rpad() function, which takes a column, a target length and a padding string as arguments. Here we pad the state_name column with “#” on the right until each value is 14 characters long. Values that are already longer than 14 characters are truncated to 14 characters.

### Add Right pad of the column in pyspark
from pyspark.sql.functions import rpad

# Pad state_name on the right with '#' up to 14 characters
df_states = df_states.withColumn('states_Name_new', rpad(df_states.state_name, 14, '#'))
df_states.show(truncate=False)

The resulting dataframe has a new column, states_Name_new, in which every state name shorter than 14 characters is right-padded with “#”.
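For a single-character padding string, right padding can be approximated in plain Python with str.ljust followed by a slice. This is only a sketch of the behavior for one non-null value: rpad_sketch and the sample names are hypothetical, and str.ljust itself does not accept multi-character pad strings.

```python
def rpad_sketch(value: str, length: int, pad: str = "#") -> str:
    """Approximate rpad() for one non-null string and a one-character pad."""
    # ljust appends `pad` until the string is `length` characters long;
    # the slice then trims values that were already longer than `length`.
    return value.ljust(length, pad)[:length]


# Hypothetical sample values, padded to 14 characters with '#'.
for name in ["Alabama", "North Carolina", "District of Columbia"]:
    print(repr(rpad_sketch(name, 14)))
# 'Alabama#######'
# 'North Carolina'
# 'District of Co'
```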

 
