Remove duplicates with caveats
02-07-2019
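Question
I have a table with rowID, longitude, latitude, businessName, url and caption columns. It might look like this:
rowID | long | lat | businessName | url     | caption
1     | 20   | -20 | Pizza Hut    | yum.com | null
How do I delete all of the duplicates, keeping only the row that has a URL (first priority), or the row that has a caption if none of the duplicates has a URL (second priority), and deleting the rest?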
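The keep rule in the question can be stated as a sort key, which may help when checking any of the answers against sample data. The following is only a minimal Python sketch: the sample rows, the function names, and the use of the smallest rowID as a final tie-breaker are assumptions for illustration and are not part of the question.

```python
from itertools import groupby

# Hypothetical sample rows: (rowID, long, lat, businessName, url, caption)
rows = [
    (1, 20, -20, "Pizza Hut", "yum.com", None),
    (2, 20, -20, "Pizza Hut", None, "Pizza place"),
    (3, 20, -20, "Pizza Hut", None, None),
    (4, 35, 10, "Taco Bell", None, "Tacos"),
    (5, 35, 10, "Taco Bell", None, None),
]


def group_key(row):
    # Rows count as duplicates when long, lat and businessName all match.
    return (row[1], row[2], row[3])


def priority(row):
    # Lower tuples win: a URL beats no URL, then a caption beats no caption,
    # then the smallest rowID breaks any remaining tie (an assumption).
    return (row[4] is None, row[5] is None, row[0])


def rows_to_keep(rows):
    # Return the rowID of the single surviving row in each duplicate group.
    keep = []
    for _, group in groupby(sorted(rows, key=group_key), key=group_key):
        keep.append(min(group, key=priority)[0])
    return sorted(keep)


print(rows_to_keep(rows))  # [1, 4]
```

Every rowID not returned by `rows_to_keep` is a row that a correct delete statement should remove.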
Solution
Here's my looping technique. It will probably get voted down for not being mainstream, and I'm cool with that.
DECLARE @LoopVar int

DECLARE
    @long int,
    @lat int,
    @businessname varchar(30),
    @winner int

SET @LoopVar = (SELECT MIN(rowID) FROM Locations)

WHILE @LoopVar IS NOT NULL
BEGIN
    --initialize the variables.
    SELECT
        @long = null,
        @lat = null,
        @businessname = null,
        @winner = null

    -- load data from the known good row.
    SELECT
        @long = long,
        @lat = lat,
        @businessname = businessname
    FROM Locations
    WHERE rowID = @LoopVar

    --find the winning row with that data
    SELECT TOP 1 @winner = rowID
    FROM Locations
    WHERE @long = long
      AND @lat = lat
      AND @businessname = businessname
    ORDER BY
        CASE WHEN URL IS NOT NULL THEN 1 ELSE 2 END,
        CASE WHEN Caption IS NOT NULL THEN 1 ELSE 2 END,
        RowId

    --delete any losers.
    DELETE FROM Locations
    WHERE @long = long
      AND @lat = lat
      AND @businessname = businessname
      AND @winner != rowID

    -- prep the next loop value.
    SET @LoopVar = (SELECT MIN(rowID) FROM Locations WHERE @LoopVar < rowID)
END
Other tips
This solution is brought to you by "stuff I've learned on Stack Overflow" in the last week:
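DELETE restaurant
WHERE rowID IN
(
    SELECT rowID
    FROM restaurant
    EXCEPT
    SELECT rowID
    FROM (
        -- rowID is the last sort key so that exactly one row per group gets rank 1;
        -- without it, rows with identical url and caption would tie and all survive
        SELECT rowID,
               RANK() OVER (PARTITION BY BusinessName, lat, long
                            ORDER BY url DESC, caption DESC, rowID) AS rnk
        FROM restaurant
    ) rs
    WHERE rnk = 1
)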
Warning: I haven't tested this on a real database.
Set-based solution:
/* delete a row if there is a "better" row
   with the same long, lat and businessName */
DELETE t1
FROM T AS t1
WHERE EXISTS (
    SELECT *
    FROM T AS t2
    WHERE t1.rowID <> t2.rowID
      AND t1.long = t2.long
      AND t1.lat = t2.lat
      AND t1.businessName = t2.businessName
      AND
          CASE WHEN t1.url IS NULL THEN 0 ELSE 4 END
          /* 4 points for non-null url */
        + CASE WHEN t1.caption IS NULL THEN 0 ELSE 2 END
          /* 2 points for non-null caption */
        + CASE WHEN t1.rowID > t2.rowID THEN 0 ELSE 1 END
          /* 1 point for having smaller rowID */
        <=
          /* <= rather than <, so that when url and caption points are equal
             the row with the larger rowID is still deleted */
          CASE WHEN t2.url IS NULL THEN 0 ELSE 4 END
        + CASE WHEN t2.caption IS NULL THEN 0 ELSE 2 END
)
-- Pick one row to keep per (long, lat, businessName) group and delete the rest.
DELETE MyTable
FROM MyTable
-- first choice: lowest rowID among rows that have a url
LEFT OUTER JOIN (
    SELECT MIN(rowID) AS rowID, long, lat, businessName
    FROM MyTable
    WHERE url IS NOT NULL
    GROUP BY long, lat, businessName
) AS HasUrl
    ON MyTable.long = HasUrl.long
   AND MyTable.lat = HasUrl.lat
   AND MyTable.businessName = HasUrl.businessName
-- second choice: lowest rowID among rows that have a caption
LEFT OUTER JOIN (
    SELECT MIN(rowID) AS rowID, long, lat, businessName
    FROM MyTable
    WHERE caption IS NOT NULL
    GROUP BY long, lat, businessName
) HasCaption
    ON MyTable.long = HasCaption.long
   AND MyTable.lat = HasCaption.lat
   AND MyTable.businessName = HasCaption.businessName
-- last resort: lowest rowID among rows that have neither
LEFT OUTER JOIN (
    SELECT MIN(rowID) AS rowID, long, lat, businessName
    FROM MyTable
    WHERE url IS NULL
      AND caption IS NULL
    GROUP BY long, lat, businessName
) HasNone
    ON MyTable.long = HasNone.long
   AND MyTable.lat = HasNone.lat
   AND MyTable.businessName = HasNone.businessName
WHERE MyTable.rowID <>
    COALESCE(HasUrl.rowID, HasCaption.rowID, HasNone.rowID)
Similar to another answer, but this one deletes based on row number rather than rank. Combine it with a common table expression:
;WITH GroupedRows AS
(
    SELECT rowID,
           ROW_NUMBER() OVER (PARTITION BY BusinessName, lat, long
                              ORDER BY url DESC, caption DESC) AS rowNum
    FROM restaurant
)
DELETE r
FROM restaurant r
JOIN GroupedRows gr ON r.rowID = gr.rowID
WHERE gr.rowNum > 1
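Why row number rather than rank matters shows up when two duplicates are identical in url and caption: RANK() gives both of them 1, while ROW_NUMBER() gives exactly one of them 1. The following is only a rough sketch that swaps in SQLite (through Python's sqlite3 module, which needs SQLite 3.25 or later for window functions) in place of SQL Server. The sample rows and the helper name `first_ranked` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE restaurant (rowID INTEGER PRIMARY KEY, long INT, lat INT,"
    " businessName TEXT, url TEXT, caption TEXT)"
)
# Hypothetical sample data: rows 1 and 2 tie on url and caption.
conn.executemany(
    "INSERT INTO restaurant VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 20, -20, "Pizza Hut", "yum.com", None),
        (2, 20, -20, "Pizza Hut", "yum.com", None),
        (3, 20, -20, "Pizza Hut", None, "Pizza place"),
    ],
)


def first_ranked(window_function):
    # Return the rowIDs that the given window function numbers as 1.
    sql = f"""
        SELECT rowID FROM (
            SELECT rowID,
                   {window_function}() OVER (
                       PARTITION BY businessName, lat, long
                       ORDER BY url DESC, caption DESC
                   ) AS n
            FROM restaurant
        ) AS ranked
        WHERE n = 1
        ORDER BY rowID
    """
    return [row[0] for row in conn.execute(sql)]


rank_winners = first_ranked("RANK")
row_number_winners = first_ranked("ROW_NUMBER")
print(rank_winners)             # [1, 2]: both tied rows would be kept
print(len(row_number_winners))  # 1: exactly one row would be kept
```

Which of the tied rows ROW_NUMBER() picks is not fixed unless rowID is added as a last ORDER BY key.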
If it's possible, can you homogenize the rows first and then remove the duplicates?
Step 1:
UPDATE BusinessLocations
SET BusinessLocations.url = LocationsWithUrl.url
FROM BusinessLocations
INNER JOIN (
    SELECT long, lat, businessName, url, caption
    FROM BusinessLocations
    WHERE url IS NOT NULL
) LocationsWithUrl
    ON BusinessLocations.long = LocationsWithUrl.long
   AND BusinessLocations.lat = LocationsWithUrl.lat
   AND BusinessLocations.businessName = LocationsWithUrl.businessName

UPDATE BusinessLocations
SET BusinessLocations.caption = LocationsWithCaption.caption
FROM BusinessLocations
INNER JOIN (
    SELECT long, lat, businessName, url, caption
    FROM BusinessLocations
    WHERE caption IS NOT NULL
) LocationsWithCaption
    ON BusinessLocations.long = LocationsWithCaption.long
   AND BusinessLocations.lat = LocationsWithCaption.lat
   AND BusinessLocations.businessName = LocationsWithCaption.businessName
Step 2: Remove the duplicates.
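The answer gives no code for step 2. One possible way to do it, once every row in a group carries the same url and caption, is to keep the lowest rowID per group and delete the rest. The following is only a sketch under stated assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, the sample rows are hypothetical post-step-1 data, and keeping the lowest rowID is an arbitrary choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BusinessLocations (rowID INTEGER PRIMARY KEY, long INT,"
    " lat INT, businessName TEXT, url TEXT, caption TEXT)"
)
# Hypothetical rows as they might look after step 1 has homogenized them.
conn.executemany(
    "INSERT INTO BusinessLocations VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 20, -20, "Pizza Hut", "yum.com", "Pizza place"),
        (2, 20, -20, "Pizza Hut", "yum.com", "Pizza place"),
        (3, 20, -20, "Pizza Hut", "yum.com", "Pizza place"),
        (4, 35, 10, "Taco Bell", None, None),
    ],
)

# Keep the lowest rowID in each (long, lat, businessName) group; delete the rest.
conn.execute(
    """
    DELETE FROM BusinessLocations
    WHERE rowID NOT IN (
        SELECT MIN(rowID)
        FROM BusinessLocations
        GROUP BY long, lat, businessName
    )
    """
)

remaining = [
    row[0]
    for row in conn.execute("SELECT rowID FROM BusinessLocations ORDER BY rowID")
]
print(remaining)  # [1, 4]
```

Because step 1 copies url and caption across the group, it no longer matters which row of the group survives.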