Get data type of a column in PySpark (single & multiple columns)

To get the data type of a column in PySpark, we use the dtypes attribute and the printSchema() method. This article shows how to get the data type of a single column, of multiple columns, and of all columns, with examples.

  • Get data type of a single column in pyspark
  • Get data type of multiple columns in pyspark
  • Get data type of all the columns in pyspark

We will use the dataframe named df_basket1.

Get data type of a single column in pyspark

dataframe.select('columnname').printSchema() is used to get the data type of a single column

df_basket1.select('Price').printSchema()

We use the select() function to pick the column and the printSchema() function to print its data type. In our case this prints the data type of the 'Price' column.

Get data type of multiple columns in pyspark

dataframe.select('columnname1','columnname2').printSchema() is used to get the data type of multiple columns

df_basket1.select('Price','Item_name').printSchema()

We use the select() function to pick multiple columns and the printSchema() function to print their data types. In our case this prints the data types of the 'Price' and 'Item_name' columns.

Get data type of all the columns in pyspark
Method 1: using printSchema()

dataframe.printSchema() is used to get the data type of each column in pyspark.

df_basket1.printSchema()

The printSchema() function prints the name, data type and nullability of each column as a tree.

Method 2: using dtypes
dataframe.dtypes is used to get the data type of each column in pyspark

df_basket1.dtypes

dtypes is an attribute, not a function, so it is used without parentheses. It returns a list of (column name, data type) tuples, one for each column.
