| Oracle中用Rowid查找和删除重复记录 |
| [ 作者:佚名 转贴自:网络转载 阅读次数:32 更新时间:2007-8-2 14:15:00 录入:刘光勇 ] |
|
|
|
平时工作中可能会遇到当试图对库表中的某一列或几列创建唯一索引时,系统提示 ORA-01452 :不能创建唯一索引,发现重复记录。 *C.Wt1:g? 39^{JaQ 下面总结一下几种查找和删除重复记录的方法(以表CZ为例): 7"^OBSW 表CZ的结构如下: ![@bb~SrkC SQL> desc cz :z;?Qf3R Name Null? Type `GwX(op ----------------------------------------- -------- ------------------ A & f mWb`z ?0 C1 NUMBER(10) lyhjN9 C10 NUMBER(5) 2!P^h C20 VARCHAR2(3) c=P@\ FG vYq[=zb ( 删除重复记录的方法原理: HDp*hXA4 (1).在Oracle中,每一条记录都有一个rowid,rowid在整个数据库中是唯一的,rowid确定了每条记录是在Oracle中的哪一个数据文件、块、行上。 "oX~]v -8H( %m % (2).在重复的记录中,可能所有列的内容都相同,但rowid不会相同,所以只要确定出重复记录中那些具有最大rowid的就可以了,其余全部删除。 n9CF7r9zd fX<`&GX 重复记录判断的标准是: ~d 1y9xYUs C1,C10和C20这三列的值都相同才算是重复记录。 fR:X/uJ01 0 =N <J 经查看表CZ总共有16条记录: lV YE()h SQL>set pagesize 100 z^mA$f]@+G SQL>select * from cz; Dw8#U% wgX+hd1 C1 C10 C20 7(yj&,yj ---------- ---------- --- -PrUR1*t 1 2 dsf ;co"\v 1 2 dsf g^hF'~$ 1 2 dsf }`&Y{a| 1 2 dsf O}xk%|KD` 2 3 che +ZNXbVyhw 1 2 dsf _a}q. 1 2 dsf O_ \.hq 1 2 dsf ]W{,XRxq 1 2 dsf A|$_R!S+T 2 3 che AiTzin 2 3 che oy_]'z\03 2 3 che :LjPGk 2 3 che 0{Utk#*N 3 4 dff pq.(!FW- 3 4 dff 4}@ o1{W 3 4 dff 3u}/ k 4 5 err gIgDBy$T 5 3 dar R6~dB2, 6 1 wee E?ijUts;X 7 2 zxc s<qx zv ^2 IUg@[r 20 rows selected. bdc8szu ?t|5r1<F= 1.查找重复记录的几种方法: _@>~u 4R (1).SQL>select * from cz group by c1,c10,c20 having count(*) >1; fgpXEL:^T C1 C10 C20 O*JqjJ9k ---------- ---------- --- OVZ@2N5f| 1 2 dsf !_PBJ$&<Z 2 3 che RUv6#L~Y 3 4 dff ^4z/R+? wsL))?<UyA (2).SQL>select distinct * from cz; N}1Zu4m Kx u mQI) C1 C10 C20 OGy(5B~ ---------- ---------- --- $?:f^ 1 2 dsf :wFC;+]t 2 3 che X:3]7] 3 4 dff #6H9*]3 sF=Xr!ATp (3).SQL>select * from cz a where rowid=(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20); ('Q0i C1 C10 C20 .BG;S|6hF ---------- ---------- --- ^]R~YAZv 1 2 dsf \#[>SZFc& 2 3 che qo<e !P 3 4 dff Ljwm"4 n/{+>t 2.删除重复记录的几种方法: +U{;4X~ (1).适用于有大量重复记录的情况(在C1,C10和C20列上建有索引的时候,用以下语句效率会很高): *zf3 rW`A SQL>delete cz where (c1,c10,c20) in (select c1,c10,c20 from cz group by c1,c10,c20 having count(*)>1) and rowid not in s-s1` (select min(rowid) from cz group by c1,c10,c20 having count(*)>1); m=Y%/> 3JP%:f0m SQL>delete cz where rowid not in(select min(rowid) from cz group by c1,c10,c20); L8#p0j i, ,t &/L. (2).适用于有少量重复记录的情况(注意,对于有大量重复记录的情况,用以下语句效率会很低): %0x1/52f, SQL>delete from cz a where a.rowid!=(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20); ,Q^4E$D \c[']0KCv SQL>delete from cz a where a.rowid<(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20); fPTGlh_:P LZ'^+ *y SQL>delete from cz a where rowid <(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20); |5drDrC4 @yOD%D >H (3).适用于有少量重复记录的情况(临时表法): }B7:c-~Uy SQL>create table test as select distinct * from cz; (建一个临时表test用来存放重复的记录) *$OpZ@Z 5v&R~ %b SQL>truncate table cz; (清空cz表的数据,但保留cz表的结构) Q3XD|m1h<n ^n=XcG!71S SQL>insert into cz select * from test; (再将临时表test里的内容反插回来) 6JeI7 Cd'[u3| (4).适用于有大量重复记录的情况(Exception into 子句法): u46q.O+ 采用alter table 命令中的 Exception into 子句也可以确定出库表中重复的记录。这种方法稍微麻烦一些,为了使用“excepeion into ”子句,必须首先创建 EXCEPTIONS 表。创建该表的 SQL 脚本文件为 utlexcpt.sql 。对于win2000系统和 UNIX 系统, Oracle 存放该文件的位置稍有不同,在win2000系统下,该脚本文件存放在$ORACLE_HOMEOra90rdbmsadmin 目录下;而对于 UNIX 系统,该脚本文件存放在$ORACLE_HOME/rdbms/admin 目录下。 z_75N 'Y-6 h#=<"_j?=N 具体步骤如下: sz'm@U5 SQL>@?/rdbms/admin/utlexcpt.sql aRmd%S ]> ;t^2W{vv Table created. |3F%8?/G; ?$pg#oS ? SQL>desc exceptions #Jr (4:O Name Null? Type n ,b:g y\ ----------------------------------------- -------- -------------- 8 cD wU HenitZ2| ROW_ID ROWID ER0'Pn} OWNER VARCHAR2(30) $4z_6MZn TABLE_NAME VARCHAR2(30) g9(?;Lr CONSTRAINT VARCHAR2(30) j8[o,q mX-0|A SQL>alter table cz add constraint cz_unique unique(c1,c10,c20) exceptions into exceptions; <lT?$#YCb * Z|JgZr4 ERROR at line 1: Z%q5wmg ORA-02299: cannot validate (TEST.CZ_UNIQUE) - duplicate keys found VeAe+r3\J ?q[j{i| SQL>create table dups as select * from cz where rowid in (select row_id from exceptions); >N,~Bu84^ 2k3B2 Table created. x< v"a0] A|Jq<]#2 SQL>select * from dups; }o_ q(Gql GNBEW@-?& C1 C10 C20 5Lbm XOJ ---------- ---------- --- E_MRxFH 1 2 dsf g &S@ ?K& 1 2 dsf }:^`lr 1 2 dsf rPrb]*+f 1 2 dsf Y`LU2^0# 2 3 che 2rOQ('SB 1 2 dsf 8 3x-L 1 2 dsf 9 qh~'i 1 2 dsf Bj2Z6 1 2 dsf XnH&m4'^ 2 3 che y_05PV0~n 2 3 che SFu(G 2 3 che k+5yxJqj| 2 3 che , W%=ebq 3 4 dff 7[/5imp= 3 4 dff WH+HdKLmd 3 4 dff |c|@.G> & ^/)yj)@E 16 rows selected. KS 9,>7 .uI U=-c SQL>select row_id from exceptions; <[TB/l$ .+[ 52&I3! ROW_ID }uqC$4BX! ------------------ Yn5-5NSV AAAHD/AAIAAAADSAAA |4O| r q AAAHD/AAIAAAADSAAB <Z^c_}\P AAAHD/AAIAAAADSAAC c`uiNK@' AAAHD/AAIAAAADSAAF ex@ x+ AAAHD/AAIAAAADSAAH "n} z AAAHD/AAIAAAADSAAI $vpt_T/ AAAHD/AAIAAAADSAAG RuKy'$ AAAHD/AAIAAAADSAAD (!D__(YZ AAAHD/AAIAAAADSAAE 4(nUn AAAHD/AAIAAAADSAAJ g}GZ ^ AAAHD/AAIAAAADSAAK (P8a<i=gN AAAHD/AAIAAAADSAAL Yl-IvmD AAAHD/AAIAAAADSAAM 9Q\U@(X AAAHD/AAIAAAADSAAN dn(tM#+0 AAAHD/AAIAAAADSAAO (lG8~ AAAHD/AAIAAAADSAAP 2c%4 9_ 9 M<ky[oS 16 rows selected. P~jxLhF SYSAF]5" SQL>delete from cz where rowid in ( select row_id from exceptions); .AxUUiNH bI_fca|<l 16 rows deleted. ~3\Dk'Wg |eC&g1FH SQL>insert into cz select distinct * from dups; #q!NPTrFz `n$3D(+WN 3 rows created. ~r~Z'Df {.&*F?.y7 SQL>select *from cz; PQ,s2c_Nov JUHJve C1 C10 C20 HwR4n%Km} ---------- ---------- --- e}\sX=? 1 2 dsf gH3O)h 2 3 che )Ac&!k`kJ. 3 4 dff IC\(QyJV 4 5 err fsA!TNmQ@ 5 3 dar 7:a/"m*} 6 1 wee <2?VGax: 7 2 zxc K`heZK^5 !P17 < 7 rows selected. 2ycTv5= aO7BX0 "] 从结果里可以看到重复记录已经删除。
|
|
|
|