In SQL Server, I have a table MyTable, with the following 3 columns, P(bigint), F(bigint), D(bigint).
Now what I need to do is to group all records by same P value, then in each group, calculate the following value:
V = ABS((P & 0xFFFFFFFF) * 256 - F)
for each record, and delete all records in the group, except for the one with the minimum value of V.
So I write the SQL statement as follows:
DELETE FROM MyTable WHERE (P, ABS((P & 0xFFFFFFFF) * 256 - F)) NOT IN (SELECT P, MIN(ABS((P & 0xFFFFFFFF) * 256 - F)) FROM MyTable GROUP BY P HAVING COUNT(*) > 1)
But in SQL Server, I will get the following error:
Msg 4145 SQL Server An expression of non-boolean type specified in a context where a condition is expected,near ','
Why? Also whether bitwise and & works in SQL Server 2005? I only have SQL Server 2008 R2 and the online document at https://docs.microsoft.com/en-us/sql/t-sql/language-elements/bitwise-and-transact-sql?view=sql-server-ver15 does not indicate whether it support 2005 & 2008.
Update1
P, F, D are all bigint type.
Update2
With the helps of others, I finally write a query with NOT EXISTS, as follows:
Method 1: DELETE t FROM MyTable t WHERE NOT EXISTS (SELECT 1 FROM (SELECT P, MIN(ABS((P & 0xFFFFFFFF) * 256 - F)) as minpf FROM MyTable GROUP BY P) tt WHERE tt.P = t.P AND tt.minpf = ABS((t.P & 0xFFFFFFFF) * 256 - t.F))
Note: I remove the HAVE COUNT(*) > 1, otherwise, the statement will also delete those records who are in a group with only one record.
Though it works, I doubt its performance. Since in most of the groups(group by P), there will only be one record, while calculating the minpf in such a group and then delete the record with the calculated value <> minpf is wasting time(no records are deleted in such a group). So just wonder if there are better way to do the same thing?
Update 3
I test the performance of my method(method 1) with the two methods Gordon provided(with minor revision)
Method 2:
DELETE t FROM MyTable t LEFT JOIN (SELECT P, MIN(ABS((P & 0xFFFFFFFF) * 256 - F)) as minpf FROM MyTable GROUP BY P ) tt ON tt.p = t.p AND tt.minpf = (tt.P & 0xFFFFFFFF) * 256 - tt.F) WHERE tt.P IS NULL;
Method 3:
DELETE t FROM (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY P ORDER BY ABS((P & 0xFFFFFFFF) * 256 - F)) as seqnum FROM MyTable t ) t WHERE seqnum > 1;
The performance data are below:
First Test:
(22721 rows affected) Method 1:16094
(22721 rows affected) Method 2:17156
(22721 rows affected) Method 3:12188
Second Test:
(22721 rows affected) Method 1:26297
(22721 rows affected) Method 2:27562
(22721 rows affected) Method 3:11625
3rd Test:
(22721 rows affected) Method 1:26297
(22721 rows affected) Method 2:27359
(22721 rows affected) Method 3:11578
So based on the test, method 3 has the best performance.
In SQL Server, simply use a JOIN
:
DELETE t
FROM MyTable t LEFT JOIN
(SELECT P, MIN(ABS((P & 0xFFFFFFFF) * 256 - F)) as minpf
FROM MyTable
GROUP BY P
HAVING COUNT(*) > 1
) tt
ON tt.p = t.p AND tt.minpf = (tt.P & 0xFFFFFFFF) * 256 - tt.F)
WHERE tt.P IS NULL;
Or, more simply use window functions:
DELETE t
FROM (SELECT t.*,
ROW_NUMBER(*) OVER (PARTITION BY P ORDER BY ABS((P & 0xFFFFFFFF) * 256 - F) as seqnum
FROM MyTable t
) t
WHERE seqnum > 1;
IN statements do work, but they do not work in the way you use them - simple like that. The content of IN is a list of compare values, not another sql statement.
In your case you do not IN - you do delete form select with join and use that to define the rows to delete.
the select
query after IN
must return only one column , test it first
Try this :
DELETE FROM MyTable
WHERE concat(P, ABS((P & 0xFFFFFFFF) * 256 - F)) NOT IN (SELECT concat( P, MIN(ABS((P &
0xFFFFFFFF) * 256 - F))) FROM MyTable GROUP BY P HAVING COUNT(*) > 1)
User contributions licensed under CC BY-SA 3.0