How to deduplicate rows in PostgreSQL
Sep 5, 2013
Use the following query to keep only one row and delete all duplicates of that row.
Note: this is non-standard SQL and works only with PostgreSQL.
The following snippet finds all rows in the users table that have identical email and organization_id. It keeps the oldest of the rows (via users.id > u2.id) and deletes the newer duplicates.
DELETE FROM users USING users u2 WHERE users.email = u2.email AND users.organization_id = u2.organization_id AND users.id > u2.id;
Read the documentation for PostgreSQL’s USING clause in combination with DELETE.