Select statement to find duplicates on certain fields
Published: 2019-05-10


This article is translated from:

Can you help me with SQL statements to find duplicates on multiple fields?

For example, in pseudocode:

    select count(field1,field2,field3) from table where the combination of field1, field2, field3 occurs multiple times

From the statement above, if there are multiple occurrences, I would like to select every record except the first one.


Reply #1

Reference:


Reply #2

To see duplicate values:

    -- Number the rows within each name; every row after the first is a duplicate.
    WITH MYCTE AS (
        SELECT ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) AS rown, *
        FROM tmptest
    )
    SELECT * FROM MYCTE WHERE rown > 1

Reply #3

    CREATE TABLE #tmp (sizeId VARCHAR(MAX))

    INSERT #tmp
    VALUES ('44'),
           ('44,45,46'),
           ('44,45,46'),
           ('44,45,46'),
           ('44,45,46'),
           ('44,45,46'),
           ('44,45,46')

    SELECT * FROM #tmp

    -- Concatenate every sizeId into one comma-separated string.
    DECLARE @SqlStr VARCHAR(MAX)
    SELECT @SqlStr = STUFF((SELECT ',' + sizeId
                            FROM #tmp
                            ORDER BY sizeId
                            FOR XML PATH('')), 1, 1, '')

    -- dbo.Split is not a built-in; it is assumed to be a user-defined
    -- table-valued function that returns an items column.
    SELECT TOP 1 *
    FROM (SELECT items, COUNT(*) AS Occurrence
          FROM dbo.Split(@SqlStr, ',')
          GROUP BY items
          HAVING COUNT(*) > 1) K
    ORDER BY K.Occurrence DESC
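
The query above depends on dbo.Split, which the answer does not define, so it cannot be run as posted. As a rough illustration of what its last step appears to compute (the most frequent value among the comma-separated items), here is a minimal Python sketch. Python's str.split stands in for the undefined function, and the variable names are hypothetical.

```python
from collections import Counter

# Same sample values as the #tmp table: one '44' row and six '44,45,46' rows.
size_ids = ["44"] + ["44,45,46"] * 6

# Stand-in for dbo.Split: break every row into its individual items.
items = [item for row in size_ids for item in row.split(",")]

# Equivalent of GROUP BY items HAVING COUNT(*) > 1.
counts = Counter(items)
repeated = {item: n for item, n in counts.items() if n > 1}

# Equivalent of TOP 1 ... ORDER BY Occurrence DESC.
most_common_item, occurrence = max(repeated.items(), key=lambda kv: kv[1])

print(most_common_item, occurrence)  # 44 7
```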

Reply #4

To get the list of field combinations for which there are multiple records, you can use:

    SELECT field1, field2, field3, COUNT(*)
    FROM table_name
    GROUP BY field1, field2, field3
    HAVING COUNT(*) > 1

Check this link for more information on how to delete the rows.

Edit: As the other users mentioned, you need a criterion for deciding what counts as the "first row" before you use the approach in the link above. Based on that, you'll need an ORDER BY clause and, if needed, a subquery. If you can post some sample data, it would really help.
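
To try the GROUP BY / HAVING query without a SQL Server instance, the sketch below runs it against an in-memory SQLite database from Python. SQLite is only a stand-in here, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (field1 INT, field2 INT, field3 INT)")

# Hypothetical sample data: (1,1,1) occurs three times, (3,3,3) twice.
conn.executemany(
    "INSERT INTO table_name (field1, field2, field3) VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 1), (1, 1, 1), (2, 2, 2), (3, 3, 3), (3, 3, 3)],
)

# One row per duplicated combination, together with how often it occurs.
duplicate_groups = conn.execute(
    """
    SELECT field1, field2, field3, COUNT(*) AS occurrences
    FROM table_name
    GROUP BY field1, field2, field3
    HAVING COUNT(*) > 1
    ORDER BY field1, field2, field3
    """
).fetchall()

print(duplicate_groups)  # [(1, 1, 1, 3), (3, 3, 3, 2)]
```

This form reports which combinations repeat but collapses each group to a single row, so it cannot by itself return "every record except the first one".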


Reply #5

You mention "the first one", so I assume that you have some kind of ordering on your data. Let's assume that your data is ordered by some field ID.

This SQL should get you the duplicate entries except for the first one. It selects all rows for which another row with (a) the same fields and (b) a lower ID exists. Performance won't be great, but it might solve your problem.

    SELECT A.ID, A.field1, A.field2, A.field3
    FROM myTable A
    WHERE EXISTS (SELECT B.ID
                  FROM myTable B
                  WHERE B.field1 = A.field1
                    AND B.field2 = A.field2
                    AND B.field3 = A.field3
                    AND B.ID < A.ID)
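
As a quick check of the EXISTS approach, the following sketch runs the same query from Python against an in-memory SQLite database. SQLite stands in for SQL Server, and the sample rows and ID values are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (ID INTEGER PRIMARY KEY, field1 INT, field2 INT, field3 INT)"
)

# Hypothetical sample data: IDs 1-3 share one combination, IDs 5-6 another.
conn.executemany(
    "INSERT INTO myTable (ID, field1, field2, field3) VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 1, 1, 1), (3, 1, 1, 1),
     (4, 2, 2, 2), (5, 3, 3, 3), (6, 3, 3, 3)],
)

# Keep a row only if an earlier row (lower ID) has the same three fields.
later_duplicates = conn.execute(
    """
    SELECT A.ID, A.field1, A.field2, A.field3
    FROM myTable A
    WHERE EXISTS (SELECT 1
                  FROM myTable B
                  WHERE B.field1 = A.field1
                    AND B.field2 = A.field2
                    AND B.field3 = A.field3
                    AND B.ID < A.ID)
    ORDER BY A.ID
    """
).fetchall()

print(later_duplicates)  # [(2, 1, 1, 1), (3, 1, 1, 1), (6, 3, 3, 3)]
```

The rows with ID 1 and ID 5 are the "first" of their groups and are left out, as is the unique row with ID 4.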

Reply #6

If you're using SQL Server 2005 or later (and the tags for your question indicate SQL Server 2008), you can use ranking functions to return the duplicate records after the first one, in case joins are less desirable or impractical for some reason. The following example shows this in action, and it also works with null values in the columns examined.

    create table Table1 (Field1 int, Field2 int, Field3 int, Field4 int)

    insert Table1
    values (1,1,1,1)
         , (1,1,1,2)
         , (1,1,1,3)
         , (2,2,2,1)
         , (3,3,3,1)
         , (3,3,3,2)
         , (null, null, 2, 1)
         , (null, null, 2, 3)

    select *
    from (select Field1
               , Field2
               , Field3
               , Field4
               , row_number() over (partition by Field1
                                               , Field2
                                               , Field3
                                    order by Field4) as occurrence
          from Table1) x
    where occurrence > 1

Notice after running this example that the first record of every "group" is excluded, and that records with null values are handled properly.

If you don't have a column available to order the records within a group, you can use the partition-by columns as the order-by columns. In that case, which record of each group counts as the first is arbitrary.
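
The ranking-function example can also be reproduced outside SQL Server. The sketch below uses Python's sqlite3 module with the same sample rows. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions), and SQLite is only a stand-in for SQL Server.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (an assumption).
# Window functions need SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Field1 INT, Field2 INT, Field3 INT, Field4 INT)")

# Same rows as the answer's example, including two rows with NULLs.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 1, 3), (2, 2, 2, 1),
     (3, 3, 3, 1), (3, 3, 3, 2), (None, None, 2, 1), (None, None, 2, 3)],
)

# Number the rows inside each (Field1, Field2, Field3) group by Field4,
# then keep everything after the first row of each group.
duplicates = conn.execute(
    """
    SELECT Field1, Field2, Field3, Field4, occurrence
    FROM (SELECT Field1, Field2, Field3, Field4,
                 ROW_NUMBER() OVER (PARTITION BY Field1, Field2, Field3
                                    ORDER BY Field4) AS occurrence
          FROM Table1) x
    WHERE occurrence > 1
    ORDER BY Field3, Field4
    """
).fetchall()

for row in duplicates:
    print(row)
# (1, 1, 1, 2, 2)
# (1, 1, 1, 3, 3)
# (None, None, 2, 3, 2)
# (3, 3, 3, 2, 2)
```

The two rows with NULL in Field1 and Field2 land in the same partition, so the second of them is reported as a duplicate. An equality join would not match those rows, because NULL = NULL does not evaluate to true.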

Reposted from: http://ibexb.baihongyu.com/
