Saturday 13 December 2008

Performing to expectations

There are good and bad points about running a small consultancy. I would like to focus on one of the good things though. If I can steal a quote from an old American Theatre manager, “Every day, the same thing. Variety!”

So, last week was largely involved in coding in good old VB6. This past week has been partially spent writing a guide on securing home PCs to protect children and bank details. However, I also did some work on how to troubleshoot performance issues for some people that didn’t want to hire outside talent for the work but needed the skills. That is OK with me. I always enjoy mentoring and teaching. I thought that it would be good to share the basics with a wider audience so I will blog about it here.

There are a couple of odd things about performance tuning. The first is that the law of diminishing returns tends to cut in long before you reach the theoretical limit. There comes a time when the cost vs benefit equation comes out against further change. The second is that it frustrates managers for reasons that will quickly become apparent.

So, the first step is to find the bottleneck. Are we memory bound or CPU bound or I/O bound – and with virtual memory, memory bound can add to I/O bound.

Memory bound applications are not quite what they used to be. When I was a kid, I had an Acorn Atom. In fact, I had the world’s fastest Acorn Atom since I had replaced the 1Mhz 6502 with a 2Mhz 6502A which I ran at 4Mhz using a bolt on heat sink (rare for processors in those days) and a 5V line running at 7.2 volts. That puppy used 2114L RAM chips each of which stored 1K bits. Put 8 of them on a bus and you have 8K bytes of memory. Each of those cost £24 at the time. I see that they are now available from specialist dealers for £1.40 but we are talking about 1980 money so we are talking £83 for 1K bit or £664 (about $992) for 8K of memory.

These days, you can get 1GB for less than £17 so the problem is normally not that there is not enough memory to back up the address space but that there is considerable contention for the memory. A prime candidate for this sort of problem is a server used for multiple purposes. Small Business Server has to be a domain controller and an IIS box and an Exchange Server and a SQL Server host. That is a lot for one box. Adding a memory hungry application is not going to help matters at all and most people don’t try. However, you often see IIS and SQL Server on the same box and both are big users of memory. While Server 2008 has made some improvements in this area and 64 bit servers are more common, there are still a lot of applications that hit problems. The key is looking at the page faults per second. The number will vary depending on the application but if they look too high then you probably need to tune the memory and give yourself some head room if such a thing is possible within the address space restrictions. The ASKPERF blog discusses this in much more detail. Oh, and overworked .NET apps tend to use a LOT of memory because the garbage collection get starved. Always looks at workload first with them.

CPU bound processes are perhaps more interesting. As always, Perfmon is your friend and you can get a lot of information from looking at thread activity and percentage of time in kernel mode. However, please be aware of something very important. These figures will be best estimates. They can’t be taken as gospel. Apps that thrash the CPU fall into two camps. Those that really are that CPU intensive and those that are doing unnecessary work. Calculating Pi to a million places is CPU intensive. Cracking codes is CPU intensive. If you are serving web pages or doing database updates or something which isn’t number crunching, then it shouldn’t be that CPU intensive. You need to discover where the CPU is being wasted. Heap management is a classic. If you fragment the heap badly by using sloppy memory allocation and deallocation, well, the heap manager will spend a lot of time cleaning up. Consider object brokers as they are often the answers. Do you have too many threads? For CPU intensive tasks, you should have fewer threads than for I/O bound tasks. If we are talking about a database server that waits for the DB to return records which are then processed then 50 threads per CPU might well be perfectly healthy. If you are crunching through large arrays then 5 threads per CPU might be too many. Please remember that thread switching is not free. Oh, and if your process is spending too much time in Kernel mode then you might want to consider what drivers you have and what you are asking the system to do. Finally, you might have to hand tune code to make it more efficient. I discussed this back in 2005.

I/O bound processes spend most of their lives waiting. Typically CPU utilisation will be low. There are really 2 approaches here. The first is to speed up the I/O operation. Disk transfer times vary between 45MB/s to 3GB/s and seek times vary from 2ms per seek to up to 15ms per seek. Faster hardware can make a big difference, especially if the hard drive has a decent cache buffer or if you can cache in software. Faster network links can help too. The other approach is to minimise I/O by careful caching of data. A small read only table may as well be held in memory. There is no need to pull back more fields from a database than you will use. You could even look at offloading reading and writing to another process in some cases. Typically, you need to consider more than one of these options.

So, why does this frustrate managers? Well, because there is no clearly defined end to this process, there is no specific end date by which you will have results. Try putting that on a Gantt chart! The other reason is that progress is very non-linear. You find a bottleneck and fix it. You immediately hit a second bottleneck. You fix it. If you have chosen well, initial progress is rapid. Because of the law of diminishing returns, you will make less dramatic improvements over time. The manager gets to see less and less success over each iteration. To many people, that seems like you are getting worse at what you do so that is one to message carefully.

I hope that this helps someone

Signing off,

Mark Long, Digital Looking Glass Ltd

No comments: