My company works with builders to sell security and AV products in new housing developments. We have lots of different customers – hundreds of different building companies and developments with hundreds of houses – all with different configurations, requirements, and sales and discount structures.
We are building a new sales system to manage all of this complexity. Salesmen have smart clients that work offline. Management of products and product lines is done centrally with a standalone Windows application that uses LINQ to access a SQL Server back-end.
Painful Validation Time
One of the biggest challenges is the product validation logic in the admin app – a powerful validation scheme ensuring that every product change is validated against all of the business rules describing what different builders and end-customers require. It is critical to prevent product mistakes being propagated out to the sales guys.
Initially the validation code was running fine, but as the volume of test data grew, the validation time for a change was spiking up to 10 minutes. It always completed correctly, so I knew there wasn’t any logic problem.
Finding the Hotspot
We had VS Team System, so I fired up the VSTS Profiler, but it was essentially useless. It showed how long the validation code took to run overall and how much memory it used, but it gave no breakdown that led to the specific problem areas.
I had used ANTS Profiler previously at another company, so I downloaded the latest edition and profiled my code. There is a lot of loading in the application before you get to the validation code, but the timeline allowed me to narrow down to the particular problem area. Also, the ability to pick individual threads came in useful as the validation process was spun off to a worker thread to prevent GUI freezes.
One method showed up as the hotspot, with 14,000 hits while everything else had a couple of hundred. This method was using LINQ to run a complex query with a lot of JOINs.
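To make the idea of a hit count concrete, here is a minimal sketch in Python using the standard cProfile module. It is a stand-in, not the ANTS Profiler and not our actual validation code. The functions `check_rule` and `validate_all` and the loop sizes are hypothetical, chosen only so that one method is called 14,000 times while its caller is called once.

```python
import cProfile
import pstats


def check_rule(product_id, rule_id):
    # Hypothetical stand-in for one expensive rule check (e.g. a query).
    return (product_id + rule_id) % 7 != 0


def validate_all(product_ids, rule_ids):
    # Checks every product against every rule, so check_rule is hit
    # len(product_ids) * len(rule_ids) times.
    failures = []
    for p in product_ids:
        for r in rule_ids:
            if not check_rule(p, r):
                failures.append((p, r))
    return failures


def call_counts(func, *args):
    # Run func under the profiler and return {function name: number of calls}.
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    stats = pstats.Stats(profiler)
    return {
        name: ncalls
        for (_file, _line, name), (_cc, ncalls, _tt, _ct, _callers)
        in stats.stats.items()
    }


counts = call_counts(validate_all, range(140), range(100))
print(counts["check_rule"], counts["validate_all"])  # 14000 1
```

A method whose count is far above everything else in the run is the first place to look, which is how the hotspot stood out in my case.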
I narrowed down my field of search using the timeline. Here we see a significant sustained spike in CPU whilst the Product Validation code was running.
The fix was to tweak the DB structure to eliminate the unnecessary JOINs, shortening the query time right down to around ten seconds.
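The general shape of that kind of change can be sketched as follows, in Python with an in-memory SQLite database rather than LINQ and SQL Server. Every table and column name here is hypothetical and the rule is invented for illustration; the point is only that a value reached through two JOINs can instead be stored on the row being validated, so the query no longer needs the JOINs.

```python
import sqlite3

# Hypothetical schema: none of these names come from the real system.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE builder (id INTEGER PRIMARY KEY, max_price REAL);
    CREATE TABLE product_line (id INTEGER PRIMARY KEY, builder_id INTEGER);
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        line_id INTEGER,
        price REAL,
        max_price REAL  -- denormalized copy of builder.max_price
    );
    INSERT INTO builder VALUES (1, 500.0), (2, 900.0);
    INSERT INTO product_line VALUES (10, 1), (20, 2);
    INSERT INTO product VALUES (100, 10, 450.0, 500.0),
                               (101, 10, 650.0, 500.0),
                               (102, 20, 800.0, 900.0);
""")


def invalid_products_joined(conn):
    # Before: two JOINs are needed to reach the rule for each product.
    rows = conn.execute("""
        SELECT p.id FROM product p
        JOIN product_line pl ON pl.id = p.line_id
        JOIN builder b ON b.id = pl.builder_id
        WHERE p.price > b.max_price
        ORDER BY p.id
    """)
    return [row[0] for row in rows]


def invalid_products_flat(conn):
    # After: the rule value lives on the product row, so no JOIN is needed.
    rows = conn.execute(
        "SELECT id FROM product WHERE price > max_price ORDER BY id"
    )
    return [row[0] for row in rows]


print(invalid_products_joined(conn))  # [101]
print(invalid_products_flat(conn))    # [101]
```

The trade-off with this approach is that the copied value has to be kept in sync whenever the original changes.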
This screenshot illustrates the massive drop in CPU and Time with Children after the fixes were implemented.
Understanding, not Just Debugging
I had used profilers before, but not for such a complex application. It was easy to use the ANTS Profiler timeline to find the exact problem on the specific thread. I like the fact that it throws up a big red flag on the problem area. At the same time, I discovered more about how the validation code was really working – which is very difficult to do with normal debugging.