To typecast an integer column to decimal in pyspark, we will be using the cast() function with DecimalType() as the argument. To typecast an integer column to float in pyspark, we will be using the cast() function with FloatType() as the argument. Let’s see an example of each conversion: integer column to decimal column, and integer column to float column.
- Type cast an integer column to decimal column in pyspark
- Type cast an integer column to float column in pyspark
We will be using the dataframe named df_cust, which has an integer column named zip.
Typecast an integer column to decimal column in pyspark:
First let’s get the datatype of zip column as shown below
### Get datatype of zip column
df_cust.select("zip").dtypes
So the resultant data type of the zip column is integer (dtypes reports it as int).
Now let’s convert the zip column to decimal using the cast() function with DecimalType() passed as an argument. This converts the integer column to a decimal column, and the result is stored in a dataframe named output_df.
########## Type cast an integer column to Decimal column in pyspark
from pyspark.sql.types import DecimalType

output_df = df_cust.withColumn("zip", df_cust["zip"].cast(DecimalType()))
Now let’s get the datatype of zip column as shown below
### Get datatype of zip column
output_df.select("zip").dtypes
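So the resultant data type of the zip column is decimal. Because DecimalType() was called without arguments, it uses the default precision 10 and scale 0, so dtypes reports the column as decimal(10,0).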
Typecast an integer column to float column in pyspark:
First let’s get the datatype of zip column as shown below
### Get datatype of zip column
df_cust.select("zip").dtypes
So the resultant data type of the zip column is integer.
Now let’s convert the zip column to float using the cast() function with FloatType() passed as an argument. This converts the integer column to a float column, and the result is stored in a dataframe named output_df.
########## Type cast integer column to float column in pyspark
from pyspark.sql.types import FloatType

output_df = df_cust.withColumn("zip", df_cust["zip"].cast(FloatType()))
Now let’s get the datatype of zip column as shown below
### Get datatype of zip column
output_df.select("zip").dtypes
So the resultant data type of the zip column is float.