Remove leading zeros of column in pyspark

To remove leading zeros from a column in PySpark, we use the regexp_replace() function to strip the consecutive zeros at the start of each value. Let's see an example of how to remove leading zeros from a column in PySpark.

  • Remove Leading Zeros of column in pyspark

We will be using the DataFrame df, which has a string column grad_Score whose values may have leading zeros.


Remove leading zero of column in pyspark

We pass the column name, a regular expression, and a replacement string to regexp_replace(). The regular expression ^[0]* matches the run of consecutive zeros at the start of each value, and the empty string '' replaces it, so the leading zeros are removed. withColumn() then stores the result in a new column, grad_Score_new.

### Remove leading zeros of column in pyspark
import pyspark.sql.functions as F

# '^[0]*' matches the run of zeros at the start of each value; replace it with ''
df = df.withColumn('grad_Score_new', F.regexp_replace('grad_Score', r'^[0]*', ''))

The resulting DataFrame has an additional column, grad_Score_new, which holds the grad_Score values with their leading zeros removed.
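Since the regular expression does all the work, it can help to try it on a few strings outside Spark first. The sketch below is plain Python using the standard re module, not PySpark. The sample values, the helper name strip_leading_zeros, and the alternative pattern KEEP_LAST_ZERO are assumptions made up for illustration and are not part of the original example. For patterns this simple, Python's re and the Java regex engine that Spark relies on should behave the same way, but it is worth confirming on your own data.

```python
import re

# Same pattern that the example passes to F.regexp_replace
LEADING_ZEROS = r'^[0]*'

# Assumed variation (not from the original example): the negative
# lookahead stops the match before the final character, so a value
# made up only of zeros keeps one '0'
KEEP_LAST_ZERO = r'^0+(?!$)'


def strip_leading_zeros(value: str, pattern: str = LEADING_ZEROS) -> str:
    """Hypothetical helper: apply the pattern to one string, as
    regexp_replace(column, pattern, '') would for each row."""
    return re.sub(pattern, '', value)


# Made-up sample values, not taken from the DataFrame in the article
samples = ['0095', '100', '0', '007.50', '1050']
results = {s: strip_leading_zeros(s) for s in samples}

for original, stripped in results.items():
    print(repr(original), '->', repr(stripped))
```

One thing this makes visible is that a value consisting only of zeros, such as '0', becomes an empty string under the original pattern. If that is not what you want, the KEEP_LAST_ZERO pattern is one possible alternative to pass to regexp_replace(); treat it as a suggestion to test rather than part of the original recipe.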

 



Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, has worked extensively with programming languages such as R, Python (Pandas), SAS, and PySpark.