Add Leading and Trailing space of column in pyspark – add space

In order to add leading and trailing space to a column in pyspark, we will be using the padding functions lpad() and rpad(), or the concat() function. To add leading space to the column we use left padding with space. To add trailing space to the column we use right padding with space. To add both leading and trailing space we use concat() with a space on either side of the column. Let’s see how to

  • Add leading space of the column in pyspark
  • Add trailing space of the column in pyspark
  • Add both leading and trailing space of the column in pyspark

We will be using the df_states dataframe, which has a state_name column.

 

 

Add leading space of the column in pyspark : Method 1

To add leading space to the column in pyspark we use the lpad() function. lpad() takes the column, the target length and the padding string as arguments. In our case we use the state_name column and “ ” (space) as the padding string, so leading spaces are added until the value is 14 characters long. Note that a value already longer than 14 characters is shortened to 14 characters.

### Add leading space of the column in pyspark
from pyspark.sql.functions import lpad

# Left-pad state_name with spaces to a total width of 14 characters
df_states = df_states.withColumn('states_Name_new', lpad(df_states.state_name, 14, ' '))
df_states.show(truncate=False)

After adding leading space, the new column states_Name_new holds each state name right-aligned in a 14-character field.
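
To make the padding rule concrete, here is a minimal plain-Python sketch of what lpad() does to a single value. The helper lpad_model and the sample state names are hypothetical and not part of PySpark; the sketch assumes Spark's documented behaviour that a value longer than the target length is cut down to that length.

```python
def lpad_model(value, length, pad=" "):
    """Plain-Python model (hypothetical helper) of lpad() applied to one value."""
    if value is None:
        return None  # assumption: a null input stays null
    if len(value) >= length:
        return value[:length]  # longer values are shortened to `length`
    # Repeat the pad string and trim it to exactly the missing width.
    missing = length - len(value)
    filler = (pad * missing)[:missing]
    return filler + value


# Hypothetical sample values: shorter than, equal to and longer than 14 characters.
for name in ["Texas", "North Carolina", "District of Columbia"]:
    print(repr(lpad_model(name, 14)))
```

Printing with repr() keeps the quotes around each value, which makes the leading spaces easy to count.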

 

Add leading space of the column in pyspark : Method 2

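To add leading space to the column in pyspark we can also use the concat() function. concat() takes a space literal and the column as arguments, in that order, so the space is placed before the column value as leading space, as shown below. Unlike lpad(), this adds a fixed number of spaces (five in this example) regardless of the length of the value.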

### Add Leading space of the column in pyspark

from pyspark.sql import functions as sf

df_states = df_states.withColumn('states_Name_new', sf.concat(sf.lit('     '), sf.col('state_name')))
df_states.show(truncate=False)

After adding leading space, the new column states_Name_new holds each state name prefixed with five spaces.
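
One detail worth keeping in mind with the concat() approach is how missing values behave. The sketch below models it in plain Python; concat_model and add_leading are hypothetical helpers, and the sketch assumes that, as in Spark's concat(), a null in any input makes the whole result null.

```python
def concat_model(*parts):
    """Plain-Python model (hypothetical helper) of concat() for one row."""
    # Assumption: like Spark's concat(), any null part makes the result null.
    if any(part is None for part in parts):
        return None
    return "".join(parts)


def add_leading(value, n=5):
    """Prefix a value with n spaces, mirroring concat(lit(' ' * n), col)."""
    return concat_model(" " * n, value)


# Hypothetical sample values, including a missing one.
for name in ["Texas", "North Carolina", None]:
    print(repr(add_leading(name)))
```

The result width varies with the input here, because the number of added spaces is fixed rather than the total length.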

 

 

Add Trailing space of the column in pyspark: Method 1

To add trailing space to the column in pyspark we use the rpad() function. rpad() takes the column, the target length and the padding string as arguments. In our case we use the state_name column and “ ” (space) as the padding string, so trailing spaces are added until the value is 14 characters long. As with lpad(), a value already longer than 14 characters is shortened to 14 characters.

### Add Trailing space of the column in pyspark
from pyspark.sql.functions import rpad

# Right-pad state_name with spaces to a total width of 14 characters
df_states = df_states.withColumn('states_Name_new', rpad(df_states.state_name, 14, ' '))
df_states.show(truncate=False)

After adding trailing space, the new column states_Name_new holds each state name left-aligned in a 14-character field.
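
Trailing spaces are hard to see in printed output, so one way to confirm they were added is to compare lengths before and after. The sketch below models rpad() on single values in plain Python; rpad_model is a hypothetical helper, not a PySpark function. On a real dataframe, the length() function from pyspark.sql.functions could play the role that len() plays here.

```python
def rpad_model(value, length, pad=" "):
    """Plain-Python model (hypothetical helper) of rpad() applied to one value."""
    if value is None:
        return None  # assumption: a null input stays null
    if len(value) >= length:
        return value[:length]  # longer values are shortened to `length`
    # Repeat the pad string and trim it to exactly the missing width.
    missing = length - len(value)
    return value + (pad * missing)[:missing]


# Hypothetical sample values; compare the length before and after padding.
for name in ["Texas", "Ohio"]:
    padded = rpad_model(name, 14)
    print(repr(padded), len(name), "->", len(padded))
```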

 

 

Add Trailing space of the column in pyspark : Method 2

To add trailing space to the column in pyspark we can also use the concat() function. concat() takes the column and a space literal as arguments, in that order, so the space is placed after the column value as trailing space, as shown below.

### Add Trailing space of the column in pyspark

from pyspark.sql import functions as sf

df_states = df_states.withColumn('states_Name_new', sf.concat(sf.col('state_name'), sf.lit('     ')))
df_states.show(truncate=False)

After adding trailing space, the new column states_Name_new holds each state name followed by five spaces.

 

 

Add both Leading and Trailing space of the column in pyspark

To add both leading and trailing space to the column in pyspark we use the concat() function. concat() takes a space literal, the column and another space literal, so the spaces are placed on either side of the column value.

### Add both Leading and Trailing space of the column in pyspark
from pyspark.sql import functions as sf

df_states = df_states.withColumn('State_Name_New', sf.concat(sf.lit('  '), sf.col('state_name'), sf.lit('  ')))
df_states.show(truncate=False)

After adding both leading and trailing space, the new column State_Name_New holds each state name with two spaces on either side.
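
The concat() variants in this article differ only in which side receives the space literal. As a rough plain-Python illustration (the helper surround is hypothetical and works on single values, not on columns):

```python
def surround(value, leading=2, trailing=2):
    """Hypothetical helper: add a fixed number of spaces on each side of a value."""
    if value is None:
        return None  # assumption: a missing value stays missing
    return " " * leading + value + " " * trailing


print(repr(surround("Texas")))                         # both sides
print(repr(surround("Texas", leading=5, trailing=0)))  # leading only
print(repr(surround("Texas", leading=0, trailing=5)))  # trailing only
```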

For more details, you can refer to this article.

 

 



Author

  • Sridhar Venkatachalam

    Has close to 10 years of experience in data science and machine learning, and has worked extensively with programming languages like R, Python (Pandas), SAS and Pyspark.