Find Duplicate Rows in Postgresql

In this Tutorial we will see how to find duplicate rows in postgresql. Getting duplicate rows in postgresql table can  be accomplished by using multiple methods each is explained with an example.

The table which we use for depiction is

ExamScore:

find Duplicates in Postgresql 1

Method 1: Find Duplicate Rows in Postgresql:

select distinct * from ExamScore where studentid in ( select studentid from (
select studentid, count(*)
from ExamScore
group by studentid
HAVING count(*) > 1) as foo);

We have chosen duplicate row by counting the number of rows for each studentid and chosen the rows having count > 1.

Resultant table:

find Duplicates in Postgresql 2

 

Method 2: Find Duplicate Rows in Postgresql with partition by

We have chosen duplicate row by partition by and order by as shown below


select distinct * from ExamScore where studentid in  (
select studentid from (
  select studentid,
  ROW_NUMBER() OVER(PARTITION BY studentid ORDER BY studentid asc) AS Row
  FROM ExamScore
) as foo
where 
foo.Row > 1);

Resultant table:

find Duplicates in Postgresql 2

 

                                                                                                   

Author

  • Sridhar Venkatachalam

    With close to 10 years on Experience in data science and machine learning Have extensively worked on programming languages like R, Python (Pandas), SAS, Pyspark.