Question

I wrote a VBA macro in Excel that appends new data to a worksheet containing previously imported records. The new data is first copied into a staging worksheet. To prevent duplicates, I add a helper column and run a VLOOKUP on the IDs: if an ID from the newly imported data already exists in the worksheet with the old data, that row is marked as a duplicate and deleted. The remaining non-duplicated rows are then copied into the final worksheet, where all the data is stored.

Right now I use a whole-column reference (A:A) in the VLOOKUP, and I don't know whether this is the reason the macro needs more time and resources to run every day. When I first wrote it, I tested with no more than 4,000 rows in the original worksheet and 4,000 rows of imported data, and the macro finished in about 90 seconds. Now it takes more than 5 minutes, even though the data worksheet holds only about 40,000 rows and the new data is still around 4,000 rows.

Should I reference the lookup range dynamically instead of using A:A, or does it not matter in terms of speed?
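
For illustration, this is roughly what I mean by a dynamic reference; the sheet names ("OldData" and "Staging") and the helper column B are placeholders, not my real layout:

    Sub FlagDuplicatesWithBoundedLookup()
        Dim lastOldRow As Long, lastNewRow As Long

        ' Find the last used row on each sheet instead of referencing A:A.
        lastOldRow = Worksheets("OldData").Cells(Worksheets("OldData").Rows.Count, "A").End(xlUp).Row
        lastNewRow = Worksheets("Staging").Cells(Worksheets("Staging").Rows.Count, "A").End(xlUp).Row

        ' Write a VLOOKUP bounded to the used rows; IFERROR marks the non-matches.
        With Worksheets("Staging").Range("B2:B" & lastNewRow)
            .Formula = "=IFERROR(VLOOKUP(A2,OldData!$A$2:$A$" & lastOldRow & ",1,FALSE),""unique"")"
            .Value = .Value   ' replace the formulas with their results
        End With
    End Sub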


Solution

As mentioned in my comment, there certainly is a way to accomplish this task in VBA, but sometimes the simplest solution is best. I would recommend appending all ~40K records each time and then using the "Remove Duplicates" command on the "Data" ribbon, keyed on the column that holds your unique ID.
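
A minimal sketch of that approach in VBA, assuming the combined data lives on a sheet named "AllData" spanning columns A:F with the unique ID in column A (adjust the names and the range to your layout):

    Sub AppendAndDedupe()
        Dim ws As Worksheet
        Dim lastRow As Long

        Set ws = Worksheets("AllData")   ' hypothetical sheet holding old + new rows

        ' New rows are assumed to have already been appended below the old data.
        lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

        ' Remove duplicates keyed on the ID column (column 1 of the range);
        ' Header:=xlYes leaves the header row untouched.
        ws.Range("A1:F" & lastRow).RemoveDuplicates Columns:=1, Header:=xlYes
    End Sub

This is also what the macro recorder captures if you run the ribbon command manually, so you can record it once to get the exact range and column arguments for your own sheet.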

Licensed under: CC-BY-SA with attribution