How to Delete Duplicate Rows in SQL
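In this section, we look at different ways to delete duplicate rows in MySQL and Oracle. When a table contains duplicate rows, we usually want to remove the extra copies and keep one row per group.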
Preparing sample data
The following script creates a table named contacts.
DROP TABLE IF EXISTS contacts;
CREATE TABLE contacts (
id INT PRIMARY KEY AUTO_INCREMENT,
first_name VARCHAR(30) NOT NULL,
last_name VARCHAR(25) NOT NULL,
email VARCHAR(210) NOT NULL,
age VARCHAR(22) NOT NULL
);
We insert the following data into the table.
INSERT INTO contacts (first_name,last_name,email,age)
VALUES ('Kavin','Peterson','[email protected]','21'),
('Nick','Jonas','[email protected]','18'),
('Peter','Heaven','[email protected]','23'),
('Michal','Jackson','[email protected]','22'),
('Sean','Bean','[email protected]','23'),
('Tom ','Baker','[email protected]','20'),
('Ben','Barnes','[email protected]','17'),
('Mischa ','Barton','[email protected]','18'),
('Sean','Bean','[email protected]','16'),
('Eliza','Bennett','[email protected]','25'),
('Michal','Krane','[email protected]','25'),
('Peter','Heaven','[email protected]','20'),
('Brian','Blessed','[email protected]','20'),
('Kavin','Peterson','[email protected]','30');
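After running each DELETE statement, we re-run this script to recreate the test data.
The following query returns the data from the contacts table: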
SELECT * FROM contacts ORDER BY email;
| id | first_name | last_name | email | age |
| 7 | Ben | Barnes | [email protected] | 17 |
| 13 | Brian | Blessed | [email protected] | 20 |
| 10 | Eliza | Bennett | [email protected] | 25 |
| 1 | Kavin | Peterson | [email protected] | 21 |
| 14 | Kavin | Peterson | [email protected] | 30 |
| 8 | Mischa | Barton | [email protected] | 18 |
| 11 | Michal | Krane | [email protected] | 25 |
| 4 | Michal | Jackson | [email protected] | 22 |
| 2 | Nick | Jonas | [email protected] | 18 |
| 3 | Peter | Heaven | [email protected] | 23 |
| 12 | Peter | Heaven | [email protected] | 20 |
| 5 | Sean | Bean | [email protected] | 23 |
| 9 | Sean | Bean | [email protected] | 16 |
| 6 | Tom | Baker | [email protected] | 20 |
The following SQL query returns the duplicate emails from the contacts table:
SELECT
email, COUNT(email)
FROM
contacts
GROUP BY
email
HAVING
COUNT(email) > 1;
| email | COUNT(email) |
| [email protected] | 2 |
| [email protected] | 2 |
| [email protected] | 2 |
Three email addresses appear more than once, each in two rows.
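The grouped query shows which emails repeat but not which rows carry them. A minimal sketch for listing the full duplicate rows before deleting anything is to join the table to that grouped result. The alias dup is only illustrative.

```sql
-- List every row whose email occurs more than once,
-- keeping the copies next to each other for review.
SELECT c.id, c.first_name, c.last_name, c.email, c.age
FROM contacts c
INNER JOIN (
    SELECT email
    FROM contacts
    GROUP BY email
    HAVING COUNT(*) > 1
) dup ON c.email = dup.email
ORDER BY c.email, c.id;
```

With the sample data this should list six rows, two for each duplicated email.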
(A) Delete duplicate rows using the DELETE JOIN statement
DELETE t1 FROM contacts t1
INNER JOIN contacts t2
WHERE
t1.id < t2.id AND
t1.email = t2.email;
Output:
Query OK, 3 rows affected (0.10 sec)
Three rows have been deleted. We run the query below to look for duplicate emails in the table again.
SELECT
email,
COUNT(email)
FROM
contacts
GROUP BY
email
HAVING
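COUNT(email) > 1;
The query returns an empty set. To verify the data in the contacts table, run the following SQL query:
SELECT * FROM contacts;
| id | first_name | last_name | email | age |
| 7 | Ben | Barnes | [email protected] | 17 |
| 13 | Brian | Blessed | [email protected] | 20 |
| 10 | Eliza | Bennett | [email protected] | 25 |
| 14 | Kavin | Peterson | [email protected] | 30 |
| 8 | Mischa | Barton | [email protected] | 18 |
| 11 | Michal | Krane | [email protected] | 25 |
| 4 | Michal | Jackson | [email protected] | 22 |
| 2 | Nick | Jonas | [email protected] | 18 |
| 12 | Peter | Heaven | [email protected] | 20 |
| 9 | Sean | Bean | [email protected] | 16 |
| 6 | Tom | Baker | [email protected] | 20 |
The rows with ids 1, 3, and 5 have been deleted. Because the condition is t1.id < t2.id, the statement removes the lower id in each pair and keeps the row with the highest id for each email.
To keep the row with the lowest id instead, re-run the script that creates the contacts table, then use the following statement: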
DELETE c1 FROM contacts c1
INNER JOIN contacts c2
WHERE
c1.id > c2.id AND
c1.email = c2.email;
After this statement runs, the rows with ids 9, 12, and 14 are gone, and the contacts table contains:
| id | first_name | last_name | email | age |
| 1 | Kavin | Peterson | [email protected] | 21 |
| 2 | Nick | Jonas | [email protected] | 18 |
| 3 | Peter | Heaven | [email protected] | 23 |
| 4 | Michal | Jackson | [email protected] | 22 |
| 5 | Sean | Bean | [email protected] | 23 |
| 6 | Tom | Baker | [email protected] | 20 |
| 7 | Ben | Barnes | [email protected] | 17 |
| 8 | Mischa | Barton | [email protected] | 18 |
| 10 | Eliza | Bennett | [email protected] | 25 |
| 11 | Michal | Krane | [email protected] | 25 |
| 13 | Brian | Blessed | [email protected] | 20 |
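Because a DELETE JOIN cannot be undone once committed, it can help to preview the affected rows first. One way, sketched here, is to run the same self-join as a SELECT. DISTINCT is added on the assumption that an email may occur three or more times, in which case a row would otherwise match more than once.

```sql
-- Preview: rows that the keep-lowest-id DELETE would remove.
-- Same join condition as the DELETE, expressed as a SELECT.
SELECT DISTINCT c1.*
FROM contacts c1
INNER JOIN contacts c2
    ON c1.email = c2.email
   AND c1.id > c2.id
ORDER BY c1.id;
```

Swapping the comparison to c1.id < c2.id previews the variant that keeps the highest id instead.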
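(B) Delete duplicate rows using an intermediate table
To delete duplicate rows using an intermediate table, follow these steps:
Step 1. Create a new table with the same structure as the original table:
CREATE TABLE source_copy LIKE source;
Step 2. Insert the distinct rows from the original table into the new table:
INSERT INTO source_copy SELECT * FROM source GROUP BY col;
Step 3. Drop the original table and rename the intermediate table to the original name:
DROP TABLE source;
ALTER TABLE source_copy RENAME TO source;
For example, the following statements remove the rows with duplicate emails from the contacts table:
-- step 1
CREATE TABLE contacts_temp LIKE contacts;
-- step 2
INSERT INTO contacts_temp SELECT * FROM contacts GROUP BY email;
-- step 3
DROP TABLE contacts;
ALTER TABLE contacts_temp RENAME TO contacts;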
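One caveat about step 2: since MySQL 5.7.5 the default sql_mode includes ONLY_FULL_GROUP_BY, and under that mode SELECT * combined with GROUP BY email is rejected because the other columns are neither grouped nor aggregated. A possible alternative, sketched below, picks one id per email explicitly and copies only those rows. Keeping the lowest id is an assumption made for illustration, and keep_id and k are illustrative names.

```sql
-- Step 2 alternative that does not rely on non-aggregated columns:
-- choose one id per email, then copy exactly those rows.
INSERT INTO contacts_temp
SELECT c.*
FROM contacts c
INNER JOIN (
    SELECT MIN(id) AS keep_id
    FROM contacts
    GROUP BY email
) k ON c.id = k.keep_id;
```

Using MAX(id) in the subquery would keep the most recently inserted row for each email instead.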
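(C) Delete duplicate rows using the ROW_NUMBER() function
Note: ROW_NUMBER() has been supported since MySQL 8.0.2, so check your MySQL version before using it.
The following statement uses ROW_NUMBER() to assign a sequential integer to each row within its email partition. If an email is duplicated, the row number of the extra copies is greater than one.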
SELECT id, email, ROW_NUMBER()
OVER (PARTITION BY email
ORDER BY email
) AS row_num
FROM contacts;
The following SQL query returns the ids of the duplicate rows:
SELECT id
FROM (
SELECT id,
ROW_NUMBER() OVER (PARTITION BY email ORDER BY email) AS row_num
FROM contacts
) t
WHERE row_num > 1;
Output:
| id |
| 9 |
| 12 |
| 14 |
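The query above only identifies the duplicates. A minimal sketch of the matching delete feeds that id list into DELETE ... WHERE id IN (...). The window is ordered by id here rather than by email, an assumption made so that the surviving row is deterministically the one with the lowest id.

```sql
-- Delete every row that is not the first one in its email partition.
-- Requires MySQL 8.0.2 or later for ROW_NUMBER().
DELETE FROM contacts
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS row_num
        FROM contacts
    ) t
    WHERE row_num > 1
);
```

The extra derived table t matters: MySQL does not allow a DELETE to select directly from its own target table in a subquery, and wrapping the window query in a derived table avoids that restriction.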
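Deleting duplicate records in Oracle
When we find duplicate records in a table, we need to delete the unwanted copies to keep the data clean and unique. If a table has duplicate rows, we can remove them with a DELETE statement.
In the first case, the table has one column (here, an ID) that is not part of the group of columns used to identify duplicates.
Consider the table below: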
| VEGETABLE_ID | VEGETABLE_NAME | COLOR |
| 01 | Potato | Brown |
| 02 | Potato | Brown |
| 03 | Onion | Red |
| 04 | Onion | Red |
| 05 | Onion | Red |
| 06 | Pumpkin | Green |
| 07 | Pumpkin | Yellow |
-- create the vegetable table
CREATE TABLE vegetables (
VEGETABLE_ID NUMBER GENERATED BY DEFAULT AS IDENTITY,
VEGETABLE_NAME VARCHAR2(100),
color VARCHAR2(20),
PRIMARY KEY (VEGETABLE_ID)
);
-- insert sample rows
INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES('Potato','Brown');
INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES('Potato','Brown');
INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES('Onion','Red');
INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES('Onion','Red');
INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES('Onion','Red');
INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES('Pumpkin','Green');
INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES('Pumpkin','Yellow');
-- query data from the vegetable table
SELECT * FROM vegetables;
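Before deleting anything, it can be useful to confirm which groups are actually duplicated. The sketch below groups by the two columns that define a duplicate in this example. The alias copies is illustrative.

```sql
-- Count the copies in each (name, color) group and show only
-- the groups that occur more than once.
SELECT vegetable_name, color, COUNT(*) AS copies
FROM vegetables
GROUP BY vegetable_name, color
HAVING COUNT(*) > 1
ORDER BY vegetable_name, color;
```

For the sample rows this should report Onion/Red with three copies and Potato/Brown with two. The two Pumpkin rows differ in color, so they are not duplicates.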
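Suppose we want to keep the row with the highest VEGETABLE_ID in each group of duplicates and delete all other copies. The following query returns the IDs to keep:
SELECT MAX(VEGETABLE_ID) FROM vegetables GROUP BY VEGETABLE_NAME, color ORDER BY MAX(VEGETABLE_ID);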
| MAX(VEGETABLE_ID) |
| 2 |
| 5 |
| 6 |
| 7 |
We use a DELETE statement to remove the rows whose VEGETABLE_ID is not the highest in its group.
DELETE FROM vegetables
WHERE VEGETABLE_ID NOT IN (
SELECT MAX(VEGETABLE_ID)
FROM vegetables
GROUP BY VEGETABLE_NAME, color
);
Three rows are deleted.
SELECT * FROM vegetables;
| VEGETABLE_ID | VEGETABLE_NAME | COLOR |
| 02 | Potato | Brown |
| 05 | Onion | Red |
| 06 | Pumpkin | Green |
| 07 | Pumpkin | Yellow |
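To keep the row with the lowest ID instead, use the MIN() function in place of MAX().
DELETE FROM vegetables
WHERE VEGETABLE_ID NOT IN (
SELECT MIN(VEGETABLE_ID)
FROM vegetables
GROUP BY VEGETABLE_NAME, color
);
The method above works when the table has a column that is not part of the group used to identify duplicates. If the values in every column are duplicated, we cannot use the VEGETABLE_ID column to tell the copies apart.
Let's drop the vegetables table and recreate it with a new structure.
DROP TABLE vegetables;
CREATE TABLE vegetables (
VEGETABLE_ID NUMBER,
VEGETABLE_NAME VARCHAR2(100),
color VARCHAR2(20)
);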
INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(1,'Potato','Brown');
INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(1, 'Potato','Brown');
INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color)VALUES(2,'Onion','Red');
INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color)VALUES(2,'Onion','Red');
INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(2,'Onion','Red');
INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(3,'Pumpkin','Green');
INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(4,'Pumpkin','Yellow');
SELECT * FROM vegetables;
| VEGETABLE_ID | VEGETABLE_NAME | COLOR |
| 01 | Potato | Brown |
| 01 | Potato | Brown |
| 02 | Onion | Red |
| 02 | Onion | Red |
| 02 | Onion | Red |
| 03 | Pumpkin | Green |
| 04 | Pumpkin | Yellow |
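In the vegetables table, the values in all columns (VEGETABLE_ID, VEGETABLE_NAME, and color) are duplicated.
We can use rowid, a locator that specifies where Oracle stores the row. Because each row has a unique rowid, we can use it to delete the duplicate rows.
DELETE FROM vegetables
WHERE rowid NOT IN (
SELECT MIN(rowid)
FROM vegetables
GROUP BY VEGETABLE_ID, VEGETABLE_NAME, color
);
The following query verifies the delete operation: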
SELECT * FROM vegetables;
| VEGETABLE_ID | VEGETABLE_NAME | COLOR |
| 01 | Potato | Brown |
| 02 | Onion | Red |
| 03 | Pumpkin | Green |
| 04 | Pumpkin | Yellow |
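The rowid approach can also be combined with ROW_NUMBER(), which may read more naturally when the rule for choosing the surviving row is more than "lowest rowid". The following is a sketch under that assumption, not part of the original walkthrough. The aliases rid and rn are illustrative.

```sql
-- Number the rows inside each group of identical values and
-- delete every row after the first, identified by its rowid.
DELETE FROM vegetables
WHERE rowid IN (
    SELECT rid
    FROM (
        SELECT rowid AS rid,
               ROW_NUMBER() OVER (
                   PARTITION BY vegetable_id, vegetable_name, color
                   ORDER BY rowid
               ) AS rn
        FROM vegetables
    )
    WHERE rn > 1
);
```

Changing the ORDER BY inside the window (for example to a timestamp column, if the table had one) changes which copy is kept without altering the rest of the statement.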