Tuesday, 9 September 2008

Shooting trouble and the breeze

For the last few days I have been thinking about problem solving and how to teach it. It seems to be a difficult subject, not least because there is no single right way with all other ways being wrong. We each seem to find our own right way according to how our minds work, but there are some common traits and I would like to discuss those for a while.

You might be wondering why I am pontificating on these in a technical blog. The reason is that so many of the things that I do are essentially problem solving. When you sit down to write a program, you are trying to solve a problem or there wouldn’t be much point in writing it. When you debug a program, you are trying to solve a problem, namely that the program doesn’t work the way that you want. Even reverse engineering is a series of problems although this is subtler. Engineers tend to be problem solvers though I could name quite a few managers who would argue that engineers were actually problems in themselves :-)

Anyway, problem solving is very much an engineer thing and this is a mixed blessing. We tend to try to solve all problems even when no solution is requested. If you go to an engineer to complain that your relationship with your spouse is not going well, they will try to offer advice even though most are far from expert at relationships themselves… and all you wanted was an audience and not advice. Nevertheless, a lot of time and effort goes into turning ordinary, well-balanced people into engineers because someone needs to make sure that the lights work and all that other technical stuff.

A technique that is often taught is logical decomposition: breaking a problem down into smaller and smaller parts until each part is trivial and therefore solvable. There is a lot to be said for this and it has been a mainstay of programming for many years. It is perhaps the perfect technique for ISTJs if you are familiar with Myers-Briggs personality types. It is very good at the sort of problems that it is good for and useless for the rest – but programming is generally an area well suited to this approach. It fails miserably with problems such as “How do I travel faster than light?” or “How do I travel in time?” because those problems don’t break down neatly into smaller problems. A lot of the skill in using decomposition for programming problems comes in knowing where to put the boundaries when looking at the problem. One major weakness of the approach is that it offers no “big picture” analysis, so a flaw in the overall design can go unnoticed while every small piece looks fine. There also tend to be a great many pieces at the end of the day and managing those can be a problem in itself. However, most programmers/software engineers/code monkeys et al. tend to be most skilled with this approach as they have been taught to see it as *the* problem solving technique.

How might we solve a performance problem? Well, there are a few things that we would do initially. The first is to find out what sort of performance problem it is. Are we CPU bound? If so, we probably have a poor choice of algorithm and need to improve it or (more often in my experience) do less unnecessary work. Are we disk bound? Better hardware or better caching can help there. Maybe there is a lot of contention for resources and the system is blocked on that – common when you have a multiprocessor beast of a box and a highly contended mutex or something like that. Record locking on a database is another common scenario that looks like this. Whatever the cause, the solution seems to decompose into two steps:

1. Find out what it is doing.
2. Find a way of making it not be a problem.
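
As a rough illustration of step 1, here is a minimal Python sketch of one way to get a first hint about whether a piece of work is CPU bound or mostly waiting on something else (disk, a lock, the network). It compares the CPU time used with the wall-clock time elapsed. The names `classify_workload`, `busy_loop` and `sleeper`, and the 0.8 threshold, are my own assumptions for the example and not a standard tool; a real investigation would lean on a proper profiler or performance counters.

```python
import time


def classify_workload(func, *args, cpu_threshold=0.8, **kwargs):
    """Run func once and guess whether it was CPU bound or mostly waiting.

    Returns (label, ratio) where ratio is CPU time divided by wall-clock time.
    The threshold is an arbitrary assumption chosen for illustration.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args, **kwargs)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    ratio = cpu_used / wall_used if wall_used > 0 else 0.0
    label = "cpu-bound" if ratio >= cpu_threshold else "waiting"
    return label, ratio


def busy_loop(n=2_000_000):
    """Hypothetical CPU-heavy task: pure arithmetic, no I/O."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sleeper(seconds=0.2):
    """Hypothetical blocked task: stands in for disk, lock or network waits."""
    time.sleep(seconds)


if __name__ == "__main__":
    for task in (busy_loop, sleeper):
        label, ratio = classify_workload(task)
        print(f"{task.__name__}: {label} (cpu/wall = {ratio:.2f})")
```

A low ratio only says that the process spent most of its time not running; it does not say what it was waiting for. Telling disk from lock contention still needs the closer inspection described next.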

Ok, there are quite a few ways of trying to work out what it is doing. You can step through it (in big or small lumps depending on how well you understand it), repeatedly dump the process state and examine it that way, or add tracing logic or instrument the code in some way – Perfmon often being a good start. None of these decomposes terribly well into a set of repeatable steps and maybe that is why it is hard to get a programmer to be a good debugger. Fixing it often involves coming up with a better solution to the problem than the original implementation. This generally relies on knowledge of how things work under the covers. Again, decomposing the problem is of limited value here.
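
To make the "add tracing logic" option a little more concrete, here is a minimal Python sketch of timing instrumentation. The decorator `traced` and the functions `parse_records` and `summarise` are hypothetical names invented for the example; the idea is simply to record how long each call takes so that you can see where the time goes before deciding what to fix.

```python
import functools
import time


def traced(log):
    """Decorator factory: append (function name, elapsed seconds) to log."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the timing even if the call raised an exception.
                log.append((func.__name__, time.perf_counter() - start))
        return wrapper
    return decorator


timings = []


@traced(timings)
def parse_records(lines):
    """Hypothetical step 1: split comma-separated lines into fields."""
    return [line.split(",") for line in lines]


@traced(timings)
def summarise(records):
    """Hypothetical step 2: count the fields across all records."""
    return sum(len(record) for record in records)


result = summarise(parse_records(["a,b,c", "d,e"]))

if __name__ == "__main__":
    for name, elapsed in timings:
        print(f"{name}: {elapsed * 1000:.3f} ms")
```

This sort of tracing tells you where the time is spent, not why, and the instrumentation adds a small overhead of its own, so treat the numbers as a guide rather than a measurement.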

Ah, but wait, I hear you say… this is only one sort of problem solving and there are many others. This is of course true. The approach taken for different problems is different again but it seems that flexibility is the key to so many of them. Even in as limited a field as IT, there seem to be too many approaches for most organisations to be able to cover all the required bases well. That means a lot to teach and learn, even for a subset of the skill of troubleshooting which is itself a subset of problem solving.

As well as trying to understand what is required to troubleshoot so that we can work out what skills need to be passed on, there have been a lot of very clever people trying to work out how to get troubleshooting done by people without the skill. So many internal help desks or technical support lines are staffed by nice, intelligent and reasonable people in low-cost labour markets who have been given minimal training and a script. In fairness, if the script is well done then it solves most problems. However, when it fails to solve the problem, its failure is absolute. There is always a residual need for flexibility and deep technical skills and those are expensive to maintain. It is hard to justify the cost until you need those skills and then any price seems much more reasonable.

Maybe it makes sense to hire in those skills as needed. I certainly hope so as that is a fundamental part of the business model for Digital Looking Glass. So far, so good but your views, as always, are welcomed.

Signing off,

Mark Long, Digital Looking Glass Ltd

P.S. The site has been redesigned if you would like a look.
