Sunday, September 6, 2009

Oracle: How to remove duplicate keys found in table

If you have been trying to add a unique constraint or a primary key to an Oracle table and you receive the annoying error:


ORA-01452:
cannot CREATE UNIQUE INDEX; duplicate keys found
Cause:
A CREATE UNIQUE INDEX statement specified one or more columns that currently contain duplicate values. All values in the indexed columns must be unique by row to create a UNIQUE INDEX.
Action:
If the entries need not be unique, remove the keyword UNIQUE from the CREATE INDEX statement, then re-execute the statement. If the entries must be unique, as in a primary key, then remove duplicate values before creating the UNIQUE index.

Now you have the following to consider:

1. Maybe you have to redesign the datamodel and tables to allow duplicate keys in the table. Maybe they are there for a reason...

2. Create the primary/unique key without enforcing validation of the old existing data in the table with:

     CREATE UNIQUE INDEX indexName
      ON tableName (columnName)
      NOVALIDATE;

   This makes the unique constraint created but the existing data in the table is not validate according to the rule of being unique.

3. Find a way to remove the duplicates

If you select to remove the duplicate rows you have several ways to do this:

3.1. single sql-statement

    delete from tableName t1
      where t1.rowid != ( select min(t2.rowid) 
                            from tableName t2
                           where t2.columnName = t1.columnName );


3.2. plsql-program


3.3. create index statement exceptions into


The exceptions table can be created with the following script:
$ORACLE_HOME/rdbms/admin/utlexcpt.sql


Now try to add the unique constraint:

   alter table tableName
      
add constraint constraintName unique (colummName1, columnName2)

      exceptions into exceptions;


The rowid's for the duplicate rows will be stored in the exceptions table and they can be used with an join to delete the dupplicate rows in the source table.


Take a "backup" of the rows which are to be deleted:


   create table tableNameCopy as select * from tableName
        where rowid in (select row_id from exceptions);


Delete the duplicate rows:


   delete tableName where rowid in (select row_id from exceptions);






No comments:

Post a Comment