Typecast Integer to string and String to integer in Pyspark

To typecast an integer column to string in pyspark, we use the cast() function with StringType() as its argument. To typecast a string column to integer, we use cast() with IntegerType() as its argument. Let’s see an example of each conversion: an integer column to a string (character) column, and a string column to an integer (numeric) column.

  • Type cast an integer column to string column in pyspark
  • Type cast a string column to integer column in pyspark

We will be using the dataframe named df_cust

Typecast an integer column to string column in pyspark:

First, let’s get the data type of the zip column as shown below

### Get datatype of zip column

df_cust.select("zip").dtypes

dtypes returns a list of (column name, type) pairs, and here the zip column is reported as int, so the data type of the zip column is integer.


Now let’s convert the zip column to string using the cast() function with StringType() passed as an argument. This converts the integer column to a string (character) column, and the result is stored in a dataframe named output_df.

########## Type cast an integer column to string column in pyspark

from pyspark.sql.types import StringType
output_df = df_cust.withColumn("zip",df_cust["zip"].cast(StringType()))

Now let’s get the datatype of zip column as shown below

### Get datatype of zip column

output_df.select("zip").dtypes

So the resultant data type of the zip column is string.

Typecast String column to integer column in pyspark:

First let’s get the datatype of zip column as shown below

### Get datatype of zip column

output_df.select("zip").dtypes

So the data type of the zip column is string.


Now let’s convert the zip column to integer using the cast() function with IntegerType() passed as an argument. This converts the string (character) column to an integer column, and the result is again stored in the dataframe named output_df.

########## Type cast string column to integer column in pyspark

from pyspark.sql.types import IntegerType
output_df = output_df.withColumn("zip",output_df["zip"].cast(IntegerType()))

Now let’s get the datatype of zip column as shown below

### Get datatype of zip column

output_df.select("zip").dtypes

So the resultant data type of the zip column is integer.

Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning, with extensive work in programming languages such as R, Python (Pandas), SAS and Pyspark.