Get List of Columns and Their Data Types in PySpark

To get the list of columns and their data types in PySpark, we use the dtypes attribute and the printSchema() function. We will explain, with an example, how to get the column names of a dataframe along with their data types.

  • Get the list of column names of a PySpark dataframe.
  • Get the list of columns and their data types in PySpark using the dtypes attribute.
  • Extract the list of column names and their data types in PySpark using the printSchema() function.
  • Get the data type of a single specific column in PySpark.

We use two methods to get the list of column names and their data types in PySpark.

We will use the dataframe named df_basket1.


Get List of columns in pyspark:

To get the list of columns in PySpark, we use the dataframe.columns attribute:

df_basket1.columns

This returns the column names as a Python list of strings, in the order in which they appear in the dataframe.

 


Get list of columns and their data types in PySpark

Method 1:  using printSchema() function.

df_basket1.printSchema()

The printSchema() function prints the schema as a tree, listing each column's name, its data type, and whether it is nullable.

 

Method 2:  using the dtypes attribute.

df_basket1.dtypes

dtypes returns a list of (column name, data type) tuples, one per column, with each data type given as a string.
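
Since the result is a plain list of tuples, ordinary Python tools work on it. The sketch below groups column names by data type. To keep it runnable without a Spark session, it uses a hand-written list in the shape that dtypes returns; the column names and types in it are assumptions for illustration, and in practice the list would come from df_basket1.dtypes.

```python
from collections import defaultdict

# Hand-written sample in the shape returned by DataFrame.dtypes:
# a list of (column name, type string) tuples.
# In practice: dtypes = df_basket1.dtypes
dtypes = [("Item_group", "string"), ("Item_name", "string"), ("Price", "double")]

# Group column names under their type string.
columns_by_type = defaultdict(list)
for name, type_string in dtypes:
    columns_by_type[type_string].append(name)

string_columns = columns_by_type["string"]
print(dict(columns_by_type))
```

This kind of grouping is handy when an operation should only be applied to, say, the string columns or the numeric columns.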

 

 


Get data type of single column in pyspark using printSchema() – Method 1

dataframe.select('columnname').printSchema() is used to get the data type of a single column:

df_basket1.select('Price').printSchema()

We use the select function to pick a column and then call printSchema() to print the data type of that particular column. In our case, this prints the data type of the 'Price' column.

 

Get data type of single column in pyspark using dtypes – Method 2

dataframe.select('columnname').dtypes is the syntax used to get the data type of a single column:

df_basket1.select('Price').dtypes

We use the select function to pick a column and then use dtypes to get the data type of that particular column. In our case, this returns a one-element list holding the name and data type of the 'Price' column.

 



Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, the author has worked extensively with R, Python (Pandas), SAS, and PySpark.