Extract Top N rows in pyspark – First N rows

To extract the first N rows in pyspark we can use the show(), head() and take() functions. head(n) and take(n) return the first N rows as a list of Row objects, while show(n) prints the first N rows as a table and returns nothing. The number of rows is passed as an argument to each of these functions. The first() function returns only the first row of the dataframe. Let’s see how to

  • Extract First row of dataframe in pyspark – using first() function.
  • Extract First N rows in pyspark – Top N rows in pyspark using head() function
  • Extract First N rows in pyspark – Top N rows in pyspark using take() and show() function

With an example for each

We will be using the dataframe named df_cars

Extract First row of dataframe in pyspark – using first() function

The dataframe.first() function returns the first row of the dataframe as a Row object

########## Extract first row of the dataframe in pyspark

df_cars.first()

So the first row of the “df_cars” dataframe is extracted

Extract First N rows in pyspark – Top N rows in pyspark using show() function

The dataframe.show(n) function takes an argument “n” and prints the first n rows of the dataframe as a table. It only displays the rows and does not return them

########## Display first N rows of the dataframe in pyspark – show()

df_cars.show(5)

So the first 5 rows of the “df_cars” dataframe are displayed

Extract First N rows in pyspark – Top N rows in pyspark using head() function

The dataframe.head(n) function takes an argument “n” and returns the first n rows of the dataframe as a list of Row objects

########## Extract first N rows of the dataframe in pyspark – head()

df_cars.head(3)

So the first 3 rows of the “df_cars” dataframe are extracted

Extract First N rows in pyspark – Top N rows in pyspark using take() function

The dataframe.take(n) function takes an argument “n” and returns the first n rows of the dataframe as a list of Row objects

########## Extract first N rows of the dataframe in pyspark – take()

df_cars.take(2)

So the first 2 rows of the “df_cars” dataframe are extracted
