Typecast string to date and date to string in Pyspark

To type cast a string to a date in pyspark, we use the to_date() function with the column name and the date format as arguments. To type cast a date to a string, we use the cast() function with StringType() as the argument. Let’s see an example of converting a string column to a date column and a date column to a string column in pyspark.

  • Type cast string column to date column in pyspark
  • Type cast date column to string column in pyspark

We will be using the dataframe named df_student

Type cast string column to date column in pyspark:

First let’s get the datatype of “birthday” column as shown below

### Get datatype of birthday column

df_student.select("birthday").dtypes

So the resultant data type of the birthday column is string.

Now let’s convert the birthday column to a date using the to_date() function, with the column name and the date format passed as arguments. This converts the string column to a date column, and the result is stored as a dataframe named output_df.

########## Type cast string column to date column in pyspark

from pyspark.sql.functions import to_date
output_df = df_student.withColumn('birthday', to_date(df_student.birthday, 'dd-MM-yyyy'))

Now let’s get the datatype of the birthday column as shown below

### Get datatype of birthday

output_df.select("birthday").dtypes

so the resultant data type of birthday column is date

Type cast date column to string column in pyspark:

First let’s get the datatype of birthday column from output_df as shown below

### Get datatype of birthday column

output_df.select("birthday").dtypes

So the resultant data type of the birthday column is date.

Now let’s convert the birthday column to a string using the cast() function with StringType() passed as the argument. This converts the date column to a string column, and the result is stored as a dataframe named output_df.

########## Type cast date column to string column in pyspark

from pyspark.sql.types import StringType
# Cast the date column of output_df; df_student's birthday is already a string
output_df = output_df.withColumn("birthday", output_df["birthday"].cast(StringType()))

Now let’s get the datatype of birthday column as shown below

### Get datatype of birthday column

output_df.select("birthday").dtypes

So the resultant data type of the birthday column is string.
