Maximum or Minimum value of column in Pyspark

The maximum or minimum value of a column in PySpark can be computed with the agg() function, passing a dictionary that maps the column name to 'max' or 'min' as needed. The maximum or minimum value of each group can be computed by combining groupby() with agg(). Each case is shown with an example:

  • Maximum value of the column in pyspark with example
  • Minimum value of the column in pyspark with example
  • Maximum value of each group of dataframe in pyspark with example
  • Minimum value of each group of dataframe in pyspark with example

We will be using the dataframe named df_basket1.


Maximum value of the column in pyspark with example:

The maximum value of a column in PySpark is calculated with the aggregate function agg(). agg() takes a dictionary that maps the column name to the string 'max' and returns a one-row dataframe holding the maximum value of that column.

## Maximum value of the column in pyspark
df_basket1.agg({'Price': 'max'}).show()

The maximum value of the Price column is calculated.


Minimum value of the column in pyspark with example:

The minimum value of a column in PySpark is calculated with the aggregate function agg(). agg() takes a dictionary that maps the column name to the string 'min' and returns a one-row dataframe holding the minimum value of that column.

## Minimum value of the column in pyspark

df_basket1.agg({'Price': 'min'}).show()

The minimum value of the Price column is calculated.


Maximum value of each group in pyspark with example:


The maximum value of each group in PySpark is calculated with agg() together with groupby(). groupby() takes the name of the grouping column, and agg() takes a dictionary that maps the column name to 'max'; the result holds the maximum value of that column for each group.

#Maximum value of each group

df_basket1.groupby('Item_group').agg({'Price': 'max'}).show()

The maximum price of each "Item_group" is calculated.


Minimum value of each group in pyspark with example:


The minimum value of each group in PySpark is calculated with agg() together with groupby(). groupby() takes the name of the grouping column, and agg() takes a dictionary that maps the column name to 'min'; the result holds the minimum value of that column for each group.

#Minimum value of each group

df_basket1.groupby('Item_group').agg({'Price': 'min'}).show()

The minimum price of each "Item_group" is calculated.




Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages like R, Python (Pandas), SAS, and PySpark.