Delete Duplicate Rows in Postgresql

In this Tutorial we will see how to delete duplicate rows in postgresql. deleting duplicate rows in postgresql table can  be accomplished by using multiple methods each is explained with an example.

The table which we use for depiction is

ExamScore:

delete duplicate rows in postgresql 1

 

Method 1: Select distinct /unique rows:

It is Most efficient and fastest way to select data without unnecessary duplicates

select distinct * from ExamScore

Resultant table:

delete duplicate rows in postgresql 2

 

Method 2: delete unwanted duplicates and return only unique rows

SELECT * FROM
(SELECT *, ROW_NUMBER() OVER 
(PARTITION BY (studentid) ORDER BY studentid DESC) rn
 FROM ExamScore) as foo WHERE rn = 1;
  • PARTITION BY divides into groups and ORDER BY sorts them by descending order.
  • ROW_NUMBER() assigns an integer number to every row in each category.

Duplicate rows are deleted as the result we have following table

Resultant table:

delete duplicate rows in postgresql 3

 

                                                                                               

Author

  • Sridhar Venkatachalam

    With close to 10 years on Experience in data science and machine learning Have extensively worked on programming languages like R, Python (Pandas), SAS, Pyspark.