Mats codemix

Mats codemix

A little c#, a litte .NET and throw in some c++ and you get a nice Spaghetti

Mats codemix RSS Feed
 
 
 
 

Making groups unique - fixing a duplicate key problem in SQL

I ran into a problem where updating a key column caused duplicate key errors.  The table in question had several columns that formed the Primary Key.  One of those columns was userd for a  serial number making groups of keys unique.

Example

Group

GroupSerial

Apple

1

Apple

1

Pear

1

Pear

1

Here we see that each row is not unique.

So how can we fix this?

First we lay up the data like this, having a unique row number:

Row#

Group

GrpSerial

GrpCnt

GrpMax

1

Apple

1

2

Apple

1

3

Pear

1

4

Pear

1

5

Pear

1

Then we count the items in each group and set that value to each member of the group.

Row#

Group

GrpSerial

GrpCnt

GrpMax

1

Apple

1

2

2

Apple

1

2

3

Pear

1

3

4

Pear

1

3

5

Pear

1

3

Then we the the max row# for each group

Row#

Group

GrpSerial

GrpCnt

GrpMax

1

Apple

1

2

2

2

Apple

1

2

2

3

Pear

1

3

5

4

Pear

1

3

5

5

Pear

1

3

5

Now we can calculate the group serial for each row.

grp_serial = row_serial  - grp_max + grp_cnt

Row#

Group

GrpSerial

GrpCnt

GrpMax

1

Apple

1

2

2

2

Apple

2

2

2

3

Pear

1

3

5

4

Pear

2

3

5

5

Pear

3

3

5

Simple as (apple) pie.

And here is an example in SQL:

 1 /*
 2 How to update a group_serial fixing uniqueness
 3 */
 4 
 5 
 6 -- Create a table with a series of groups 
 7 drop TABLE matstest
 8 go
 9 
10 CREATE TABLE matstest
11 (
12     id numeric identity not null
13     ,row_serial int null
14     -- Value columns
15     ,grp int Null
16     ,grp_serial int null
17     -- helper columns
18     ,grp_cnt int null
19     ,grp_max int null
20 )
21 go
22 
23 -- having a unique index on grp,grp_serial would cause an error 
24 truncate table matstest
25 insert into matstest (grp,grp_serial) values (1,1)
26 insert into matstest (grp,grp_serial) values (22,1)
27 insert into matstest (grp,grp_serial) values (22,1)
28 insert into matstest (grp,grp_serial) values (33,1)
29 insert into matstest (grp,grp_serial) values (33,1)
30 insert into matstest (grp,grp_serial) values (33,1)
31 
32 -- Fake a high row_serial just to prove that low numbers 
33 update matstest set row_serial = id +4311
34 
35 -- Show the starting point
36 select * from matstest
37 
38 --
39 -- It is important to include all the keys that make up the group! In this example there is just one column (grp)
40 -- Find count per group (grp_count)
41 --
42 select grp, count(*) as grp_cnt
43 into #tmp_grp_cnt
44 from matstest
45 group by grp
46 having count(*) > 1 -- Skip this to have count also on the groups with one row
47 
48 update matstest
49 set grp_cnt = ga.grp_cnt
50 from matstest m
51 join #tmp_grp_cnt ga on ga.grp = m.grp
52 
53 --
54 -- Find max row row_serial number per group  (grp_max)
55 --
56 select grp, max(row_serial) as grp_max
57 into #tmp_grp_max
58 from matstest
59 group by grp
60 having count(*) > 1
61 
62 update matstest
63 set grp_max = ga.grp_max
64 from matstest m
65 join #tmp_grp_max ga on ga.grp = m.grp
66 
67 --
68 -- Now we can calculate the serial number per group
69 -- Serial number per group = row_serial - (grp_max) + (grp_cnt)
70 --
71 update matstest set grp_serial = row_serial - (grp_max) + (grp_cnt) where grp_cnt is not null
72 
73 
74 -- show the final result
75 select * from matstest
76 
77 /*
78 
79 -- Starting point
80        id    row        grp grps cnt max
81 1    1    4312    1    1    [NULL]    [NULL]
82 2    2    4313    22    1    [NULL]    [NULL]
83 3    3    4314    22    1    [NULL]    [NULL]
84 4    4    4315    33    1    [NULL]    [NULL]
85 5    5    4316    33    1    [NULL]    [NULL]
86 6    6    4317    33    1    [NULL]    [NULL]
87 
88 
89 -- Final result
90        id    row        grp grps cnt max
91 1    1    4312    1    1    [NULL]    [NULL]
92 2    2    4313    22    1    2    4314
93 3    3    4314    22    2    2    4314
94 4    4    4315    33    1    3    4317
95 5    5    4316    33    2    3    4317
96 6    6    4317    33    3    3    4317
97 */

Leave a Reply