To remove leading, trailing, or all spaces from a column in pyspark, we use the ltrim(), rtrim(), trim() and regexp_replace() functions. ltrim() strips leading spaces and rtrim() strips trailing spaces. trim() strips both the leading and the trailing spaces. Removing every space, including the ones inside the string, is done with regexp_replace(). Let’s see how to
- Remove Leading space of column in pyspark with ltrim() function – strip or trim leading space
- Remove Trailing space of column in pyspark with rtrim() function – strip or trim trailing space
- Remove both leading and trailing space of column in pyspark with trim() function – strip or trim both leading and trailing space
- Remove all the space of column in pyspark with regexp_replace() function
We will be using the df_states DataFrame, which has a state_name column.
Remove Leading space of column in pyspark with ltrim() function – strip or trim leading space
To remove leading spaces from a column in pyspark we use the ltrim() function. ltrim() takes a column and trims the white space on the left of each value in that column.
    # Remove leading spaces from the column in pyspark
    from pyspark.sql.functions import ltrim

    # Add a states_Name column holding state_name with the left padding stripped
    df_states = df_states.withColumn('states_Name', ltrim(df_states.state_name))
    df_states.show(truncate=False)
The resulting DataFrame has a new states_Name column in which the leading spaces of state_name are removed.
Remove Trailing space of column in pyspark with rtrim() function – strip or trim trailing space
To remove trailing spaces from a column in pyspark we use the rtrim() function. rtrim() takes a column and trims the white space on the right of each value in that column.
    # Remove trailing spaces from the column in pyspark
    from pyspark.sql.functions import rtrim

    # Add a states_Name column holding state_name with the right padding stripped
    df_states = df_states.withColumn('states_Name', rtrim(df_states.state_name))
    df_states.show(truncate=False)
The resulting DataFrame has a new states_Name column in which the trailing spaces of state_name are removed.
Remove both leading and trailing space of column in pyspark with trim() function – strip or trim space
To remove both leading and trailing spaces from a column in pyspark we use the trim() function. trim() takes a column and trims the white space on both the left and the right of each value in that column.
    # Remove leading and trailing spaces from the column in pyspark
    from pyspark.sql.functions import trim

    # Add a states_Name column holding state_name with padding stripped on both sides
    df_states = df_states.withColumn('states_Name', trim(df_states.state_name))
    df_states.show(truncate=False)
The resulting DataFrame has a new states_Name column in which both the leading and the trailing spaces of state_name are removed.
Remove all the space of column in pyspark with regexp_replace() function – strip all spaces
To remove all the spaces from a column in pyspark we use the regexp_replace() function. It takes the column, a regular expression pattern and a replacement string. Matching a single space and replacing it with an empty string removes every space in the column, including the spaces inside the string.
    # Remove all the spaces from the column in pyspark
    from pyspark.sql.functions import regexp_replace, col

    df_states = df_states.withColumn('states_Name', regexp_replace(col("state_name"), " ", ""))
    df_states.show(truncate=False)
The resulting DataFrame has a new states_Name column in which all the spaces of state_name are removed.