
Saturday, August 22, 2015

Social science and systems development

A couple of weeks ago, I wrote a blog post on Agile and pseudo-science, which led to an interesting discussion on my Facebook wall about the lack of science and data in systems development.

As I wrote in my post, there is an unfortunate tendency among systems developers to rely on pseudo-science or popular science, rather than proper science. A lot of decisions are made on assumptions for which there is no real evidence, but which have somehow become embedded in the world of systems development - be it project management, programming, testing, or some other aspect.

We tend to think of systems development as an offshoot of computer science with some cognitive psychology thrown in. While computer science and cognitive psychology are certainly important for creating systems, they relate very much to the "systems" part of systems development, while they hardly add to the processes, methods etc. that make up the "development" part.

A comment in the aforementioned Facebook discussion suggested that we should look at social science when talking about science and systems development.

This seems like a good idea to me. A lot of what goes on in systems development is about humans interacting with each other, rather than about math, algorithms, cryptography or other aspects of computer science.

I once worked on a project where we had a horribly high turnover, which is a well-known sign of a doomed team. Yet this team was one of the best teams you could imagine - continuously over-performing in the sense of delivering over scope, under budget and without flaws. From everything we know about team dynamics, this team should not have performed like this, but it did, and it would have been good to have had someone qualified there to analyse how that could be the case.

Equally, it would be good to have someone look at the opposite type of team - the ones which in theory should perform well, but which continuously under-perform. The ones where good developers suddenly seem unable to actually finish anything, where simple requirements turn into hideous bloated things, and where the only cure seems to be to kill off the team, never to let them work together again.

In both cases we know the anomalies relate to people, the interaction between them, and the culture of the team and the organization surrounding it. We just don't seem to be able to figure out how to create the first type of team, while avoiding the second type.

Given how many systems development projects fail, maybe it is worth looking at these things? And if we do, maybe we should try to find the proper tools?

Social science would seem like a good fit, and it would seem like a good idea for large organizations with many systems development projects to take a look at how the research methods of social science could be applied to systems development, in order to become better at it.

Sunday, August 2, 2015

Agile and pseudo-science

Back before I started blogging about IT-related stuff, I used to write about pseudo-science on my other (very quiet) blog Pro-science, and before I had that blog, I used to comment a lot on a number of science and skeptic blogs.

First commenting and later blogging about science and pseudo-science has helped me become better at detecting woo and pseudo-science in my daily life. Woo is a term used in the skeptic community for something which is based on anti-science and has no scientific validity or data to back it up.

Woo and pseudo-science exist everywhere, and I have found that the field of IT is absolutely no exception. Rather, there are certain areas which are rampant with pseudo-science, often bordering on woo.

One such area is the field of Agile.

The field of Agile is not a well-defined field, but a loose collection of methodologies, techniques, and tools used for systems development. It is commonly understood as having formed as a result of the Agile Manifesto, but in reality it started developing before then, and would probably have come into existence as a field even if the manifesto hadn't been written.

As a software developer, I use Agile methodologies, techniques, and tools on a daily basis. I also go to conferences and meetups about Agile frequently.

Due to that, I can't help noticing that there is not a whole lot of science behind Agile. Not only that, many of the arguments people use to promote Agile are not exactly based on science or data, but rather on pseudo-science, often in the form of popular science.

This summer, I went to a conference which had a talk on Agile and NLP. I didn't go to the talk, so I can't say anything about the actual content, but let's be perfectly clear: NLP is pseudoscience (.pdf) - see also Thirty-Five Years of Research on Neuro-Linguistic Programming. NLP Research Data Base. State of the Art or Pseudoscientific Decoration? by Tomasz Witkowski (.pdf).

If you do a search on the words "Agile" and "NLP", you'll find articles like NLP Patterns and Practices for High-Performance Teams and Achievers by J.D. Meier. If you look at the blog post, and are in any way familiar with the writing of people pushing pseudo-science, you'll notice the patterns there. There are appeals to effectiveness and popularity, but no actual data to back those claims up.

It is highly likely that Meier actually believes that NLP helps, but there is no science to back it up - neither in the form of an explanation of how the NLP mechanisms work, nor in the form of data backing up the claims of improvement.

This is, unfortunately, widespread in the field, and not just when it relates to NLP.

Agile (as with all things related to systems development) is very much dependent on people, and since most people realize this, there is a lot of focus on this aspect.

As a result, many speakers on the subject of Agile pepper their talks with references to neuroscience, but rather than looking at the actual science papers, they get their knowledge from popular science books by people like Malcolm Gladwell.

In 2012, Steven Poole wrote an article for The New Statesman on the subject of popular science books on neuroscience, called Your brain on pseudoscience: the rise of popular neurobollocks. While he, in many people's opinion, painted with too broad a brush, there was an important message in the article which we would do well to remember - popular science books often have an agenda, they are simplistic, and, in many cases, they get the science completely wrong.

E.g., as a rule, one should expect anything written by Malcolm Gladwell to be wrong.

I understand why speakers and authors tend to use popular science books as sources, rather than the real science articles. With popular science books, there is a good chance that the audience has already read them, and the science is presented in a form which is easy to digest and regurgitate.

It is just a pity that the science is often presented in a way that is simplistic at best and wrong at worst.

Some speakers, such as Linda Rising and Kevlin Henney, read the original science papers instead, and refer to those. Given that those two are among my favorite speakers, I hardly think it takes anything away from their talks. It just makes them more accurate.

Having addressed the use of pseudo-science and popular science, I think it is also important to address the lack of data on which to evaluate things.

This is obviously connected to the other problems. Lack of data leads to bad conclusions.

A simple example of the lack of data is this: How do we know Agile works?

Or, put another way: On what basis do we conclude that Agile works better than what came before?

As far as I've been able to find out, there is no real data to support such claims.

Yes, there have been studies that indicate that Agile works better, but those are somewhat doubtful, since there was no baseline to compare to, no clear definition of what went before Agile, or even of what Agile is.

People tend to compare Agile to Waterfall, but as Laurent Bossavit wrote in his excellent book The Leprechauns of Software Engineering, there is no clear evidence that Waterfall was ever used - at least not in the form that people commonly think of when talking about the Waterfall method. 

Personally, I believe there is great value in using Agile, but I am not willing to make any claims without data to back me up, and in order to have that, we need to stop basing the field on pseudo-science and misrepresentations in popular science, and instead use real science, and collect and evaluate data.

Not everything fits into the scientific method, and in a field that is so heavily dependent on people, there will be times where personal preferences will lead the way, but this doesn't mean that we can't do better.

Think of it this way: We would not accept a new UX design without an A/B test or some other empirical data to back it up. So why would we accept it when it comes to our methodologies, techniques, and tools?

Wednesday, October 5, 2011

Hotfix Hell

I am a firm believer in many agile processes and tools, including iterative development, where you work with many deliveries. This allows for early, frequent feedback, and lets you find errors early, before the project turns into an error-driven development project.

Unfortunately, this is not always possible.

A type of project I often work on is the large project which spans several years, where customer involvement is minimal, except at the start and at the end. This sort of project is pretty much doomed to go over time, go over cost, have many errors etc. Such projects are exactly the reason why agile development has become so popular. Even if the project is developed in an agile manner, the lack of customer involvement means that the project could go in the wrong direction without anyone finding out before the end.

Typical projects where this happens are public projects subject to tenders. Here the scope of the functionality etc. is determined before the contractors bid on the contract (though often with a clarification phase at the start), and after the bid has been accepted (and the initial clarification phase has passed), the scope, deadline, and price are fixed. The customer will then often not be involved until the final acceptance test phase, where the solution is often found to be lacking (to put it mildly).

As this approach has obvious flaws, there have been attempts to fix it by introducing sub-deliveries, each of which has to pass the acceptance tests. In my experience, there are typically two sub-deliveries before the final delivery at the end.

This approach might seem somewhat agile, and since it gives earlier feedback, you’d think that it would help. Unfortunately, in my experience, it actually worsens the problem.

The problem is in the acceptance test part.

Picture the typical two-year project with three deliveries (two sub deliveries and one final). Given the fact that there is a lot of scaffolding etc. at the start, the first sub delivery will fall after one year, with the second sub delivery half a year later, and the final delivery half a year after that (i.e. one year after the first sub delivery).

This is, in itself, not an unreasonable schedule.

Unfortunately, the programmers involved will often have to acquire domain knowledge while doing the scaffolding and early development of functionality, increasing the likelihood of wrong decisions and/or errors in the implementation. Some of this will become apparent as the first deadline approaches, and might be changed in time - unfortunately it won't be possible to change it all, and some misunderstandings will only become clear during the acceptance testing.

Since the project is on a tight schedule, the work on sub delivery two starts - often beginning with the flaws found before the first deadline, which there wasn't time to fix, but also covering new functionality etc.

Unfortunately, the errors will still be in the code submitted to the acceptance test.

When the errors are found, the acceptance test fails, and the customer rejects the sub delivery as it stands.

What happens then? Well, this is where the hotfix hell starts.

Given the fact that the code submitted as sub delivery one has failed the acceptance test, the developers have to fix the errors in the submitted code, which is now out of date compared to the code base. This is done by making a patch or hotfix to the code.

The patched/hotfixed code is then re-submitted to acceptance testing.

If this passes, then all is well. Unfortunately, that's rarely the case. Instead, new errors will be found (perhaps introduced by the fix), which will need to be fixed, re-submitted, tested etc. This takes up considerable amounts of time, calendar-wise, but also resource-wise - meaning that programmers, testers, customer testers etc. will spend a lot of time fixing problems in the code, time which they could have spent on other things. Other things, such as sub delivery two.

Just because sub delivery one has failed the acceptance test, doesn’t mean that work on sub delivery two has stopped - it is, after all, to be delivered six months down the line.

Unfortunately, the plan didn't take into account the hours needed to work on sub delivery one after its delivery and/or the date of the expected acceptance test. This means that sub delivery two is in trouble, since the developers won't have time to do all the work required for it to pass its acceptance test.

Meanwhile, sub delivery one and sub delivery two drift further and further apart, resulting in developers having to fix problems in obsolete code - this is frustrating for the programmers, and introduces the risk of errors only being fixed in the old delivery instead of in both, since porting the fixes is difficult.

At some stage, the sub delivery will pass the acceptance test, or (more commonly in my experience) it will be dropped, as the next delivery either is about to be delivered or has been delivered.

Due to the work on sub delivery one, the second sub delivery is unfortunately either late or in such a mess that it cannot pass the acceptance test (or, more likely, both).

This means that when sub delivery two is handed in, hotfix hell starts all over again.

So, how can this be fixed?

Well, one way is to do iterations the agile way. Unfortunately, that’s not particularly likely to happen.

Another way is to base deliveries on the date when the acceptance test of the earlier sub delivery has passed. So in the above example, the second delivery will be handed in six months after the first delivery has passed the acceptance test.
Given the nature of the projects using sub deliveries, this is also unlikely to happen. Often the final deadline is defined by a new law or regulation, and is firm (until it becomes completely apparent that it cannot be met).

A more likely solution would be to take the overhead of hotfixes into account when planning. This would mean that the time spent on hotfixes for sub delivery one wouldn't take time set aside for sub delivery two. The problem with this approach is that it would make the price higher when bidding, since more people would be needed to finish the work on time than if one assumes that no hotfixes are necessary. On top of that, it is also hard to estimate just how much time is needed for this (in my experience, everybody vastly underestimates it).

My suggestion would be something simpler. Timebox the acceptance test - and call it something else.

Before the project starts, the customer and the contractor decide how much time will be used for testing the sub delivery in order to make sure that the fundamentals are fine, but without resulting in the developers having to fix obsolete problems.

When the timebox is done, the customer will either accept or reject - a rejection would mean that the customer thinks the code is so fundamentally flawed that it cannot be used. If that's the case, the customer and the contractor will need to sit down together and figure out how to move on from there. Perhaps the final deadline will have to be moved, the contractor will have to add more people to the project in order to get it back on track, or the customer will have to become more involved in the development process (e.g. by providing people who can help with testing during the development of sub delivery two).

I am well aware that my suggestion breaks with the concept of sub deliveries, but I would claim that the concept of sub deliveries is fundamentally flawed, and instead of helping with the problem, it actually makes it worse. Since this is the case, I think we have to re-think how they are used, if at all.

Friday, July 3, 2009

Ending error-driven development, part 4 - setting the team

My last post focused on problems which might exist in a team, and what measures can be taken to avoid, or at least compensate for, these problems. This post focuses on the roles which should be filled in a team which has problems and has entered the error-driven development phase.

As always, these are my opinions and my opinions only. I’d love to get feedback, and hear what roles other people find necessary or unnecessary.

I am not claiming that the roles that I mention are always necessary for a project to succeed, or even that they are necessary when in an EDD project. They are, however, roles that I find will help the project along the way towards becoming a better project.

The roles are not necessarily full-time jobs, and a person could fulfill several roles at the same time, as long as that person has the right skill-set for each of those roles.

So, apart from a project manager, software architects and developers, what roles do I think should be part of a team’s setup?

  • QA manager

  • Configuration manager

  • Deployment manager

  • Defect manager

  • Test manager

  • Release manager

  • End user representative



QA Manager

Let's be honest: when a project is under a great deal of pressure, and has been so for a while, the developers start cutting corners. This is the only way they can retain any hope of making it on time.

This means that it's necessary to have someone who is responsible for the code quality, and who has the authority to enforce coding standards.
Depending on how bad a shape the code is in, this can either be a small task among other tasks, or it can be a full-time job. If there is a lot of legacy code, the QA manager will have a hard time getting the code up to par, even if the legacy code is ignored. It's a sad fact that code which is being written for a crappy code base will most likely also be crappy – it's the "broken window syndrome". Why bother writing clear, beautiful code, when it's going to be part of a messy, unclear code base?

The QA manager needs to be respected by the other developers, and must be able to help mentor less experienced developers on how to solve problems in the correct way.

Configuration manager

We probably all know this situation. The tests have gone well, everything seems to work, but when the system is deployed to the production environment, it suddenly fails, and no one knows why. After some frantic work, it's found that one of the settings in the configuration was wrong – it might have been overwritten by a test setting, it might not have been deployed, or there might be something else wrong. Whatever the reason, it's a situation which should be avoided at all costs.

This is where the configuration manager comes in. That's the person responsible for ensuring that the configurations are correct and of the right version on all systems. Every time a change is made to the configurations, the configuration manager should be involved.

Deployment manager

Deploying systems is a hard, thankless task, and most developers don't want to do it. The deployment manager is responsible for the deployment of the system, ensuring that all the correct information is filled out, that the people who need to know that the deployment is happening get to hear it, and that the deployment goes well. In case the deployment doesn't go well, the deployment manager is responsible for finding out what went wrong, and for fixing the problems.

If you get the right deployment manager, he or she will try to ensure that the deployment procedure gets automated, so he or she will have fewer things to worry about in the deployment phase.

A deployment manager is not the same as a release manager (see below), as the deployment manager is focused on the entire system, while the release manager should only focus on the parts of the system affected by that specific release. This means that while the deployment manager should ensure that each and every subsystem is running after a deployment, the deployment manager is not responsible for smoke testing the system – that’s up to the release and test managers.

Defect manager

When you are in an EDD project you will get a large number of defect reports (called defects for short), and it's a good idea to have a single point of entry for those defects. The purpose of this is to ensure that the developers don't get disturbed by defect reports all the time, and to make it easier to weed out duplicates and hidden change requests (a future post will go into these differences).

The defect manager should also be the single point of contact, regarding defects, for people outside the team. This means that if a tester has a question regarding some behavior which might be a defect, the tester should talk with the defect manager.

The defect manager should have clear guidelines from the project leader about how defects should be prioritized, both in relation to each other and in relation to other tasks, such as new development.
My rule of thumb would be that critical defects (e.g. parts of the system don't work) should have the highest priority, but otherwise defects should be solved when someone is working on the general area they are connected to. Of course, defects found in new functionality for the release should always be fixed straight away.

Test manager

Given my heavy focus on testing (see my earlier post on the subject), I guess it should not come as a surprise that I feel that there should be a person dedicated to coordinating the tests.

The test manager is responsible for ensuring that all relevant tests are run every time a new release is on the way, that tests are modified as required, and that new tests are added as new requirements appear.

A good test manager should focus on automating as much of the test process as possible (but not so much that the test quality suffers), and on ensuring that everything is tested sufficiently. The test manager should also always be on the lookout for new tools and techniques for testing.

Release manager

A release manager is the person responsible for a particular release, keeping track of everything related to that specific release.

During deployment it might be hard to tell the difference between a deployment manager and a release manager, but there are some key differences. A deployment manager is responsible for ensuring that the deployment process goes as it should, while the release manager is responsible for ensuring that the correct things are deployed for that particular release, and that that particular release works as it should after deployment. In other words, the release manager is responsible for making the package which should be deployed, while the deployment manager ensures that the package is actually deployed.

End user representative

If at all possible, there should be at least one representative from the end users in the team. This person should be available to answer any questions the other team members might have regarding the domain the system should work in.

Sometimes a requirement seems ambiguous to a developer, while being quite clear to someone who knows the domain better. If there isn't an end user representative available, the developer will often end up guessing how the requirement should be understood, and design and implement based on that guess – often resulting in a wrong design/implementation – or the developer will have to waste time finding someone who correctly understands the requirement and can pass the information on.

Having an end user representative on-site allows the developer to quickly get feedback on how to understand the requirement, reducing the development time and/or the number of wrong design decisions and implementations.

How should the work be divided between the roles?

There is a certain amount of overlap between what can be considered the area of responsibility of the different roles, so I thought I should try to explain how I see the divide.

  • Defects should be handled by the defect manager, but the release managers should be aware of what defects are solved in their releases. The test manager gets to say whether a defect has been solved or not (does it pass the tests).

  • Configuration managers should handle configuration of servers etc., but should work closely with the release managers to ensure that all configuration changes in the releases are taken care of. The configuration manager also needs to work with the deployment manager to ensure that configuration changes are deployed correctly.

  • Release managers should work closely with the deployment manager to ensure that the releases are deployed correctly. By "correctly" I mean that they should not only make sure that the system is up and running afterwards (this is the deployment manager's job), but also ensure that it's the correct releases which are up and running. The test manager should be the one responsible for testing that all the additions, modifications, and corrections are deployed, but the release manager is in charge of keeping track of which additions, modifications, and corrections are part of the deployment.



For a different way of explaining this, let’s go through the process:

1) A new release is defined, and a release manager is appointed. The release manager works together with the rest of the team to ensure that the release is finished on time.

2) The test manager starts ensuring that there are tests covering new functionality, and modifies any existing tests as warranted.

3) While developing the new release, the code is continuously committed, built and tested. The test manager ensures that all errors found during the testing are reported to the defect manager.

4) The defect manager prioritizes the errors, and makes sure that they are added to the tasks for the release if relevant. When doing this, the defect manager ensures that the release manager and the QA manager are aware of the open defects.

5) Any fixed defect goes to the test manager, who either confirms the fix or reopens the defect as not solved.

6) The release manager informs the configuration manager of any changes to the configuration.

7) During the development of the new release, the QA manager will do general QA work, enforcing coding standards, doing code reviews etc. If any area gets a large number of defects, that area will be in the QA manager's focus.

8) After the release is done, passing the tests sufficiently, the release manager makes sure that all the relevant code is made into a deployment package, and hands this over to the deployment manager.

9) The deployment manager goes through the deployment steps, ensuring that the deployment goes well.

10) After the deployment has gone well, the configuration manager ensures that the configuration is correct.

11) The test manager tests that the deployed version corresponds with the release version [go through steps 8 through 11 as many times as necessary until it’s right]

12) Finally, the release manager signs off the release, filling out whatever papers are necessary, informing the necessary people about the release etc. [this might be handled by the project leader together with the release manager]

Note that there frequently will be several test environments which the deployment should go through, but the release manager is only responsible for the first deployment, while the rest presumably can be handled by the deployment manager alone.

Monday, June 29, 2009

Ending error-driven development, part 2 - adapting agile practices

This post is not about whether agile methods are better than the waterfall model; rather it's about adapting to the situation, and using the most useful tools for solving the problem. Error-driven development is often plagued by a number of problems which might be solved, or at least reduced, by using some of the practices from agile methods, such as scrum or eXtreme programming.

If you are already using an agile method, it might be worth evaluating whether it’s the correct method for the situation.

Anyway, here are some agile practices that might be worth adapting to your project.

Daily stand-up meetings

A lot of the problems in EDD projects are related to breakdowns in communications - not only between the team and the customer, but also between the team members and the team leader or between different team members.

The purpose of daily stand-up meetings is to spread knowledge of what each person has done since last time, what they are doing now, and what problems they have encountered. This allows team members to either help solve the problems or plan their work accordingly – there is no need to start working on something which will run into the same problem before that problem gets fixed.

In the book Manage It, Johanna Rothman distinguishes between status meetings, where everyone tells everyone else what their status is (something she considers a waste of everybody's time), and daily stand-up meetings, where people tell the others a) what they have just finished, b) what they are going to do now, and c) what problems they have encountered (something she considers very valuable). In other words, daily stand-ups can be said to give a status of the current progress for each team member.

While I certainly agree with Rothman that daily status meetings are not as good as daily stand-ups, I would hesitate to make the claim that they are a waste of time. In a project with a real breakdown in communications, status meetings can help get everyone up to speed on the project as a whole. Of course, such status meetings probably shouldn’t be daily, but rather weekly, and they should be discontinued (or at least held less frequently) when the project is back on track.

Work in short iterations

Agile methods focus on working in short iterations (usually 2-4 weeks long) where there is a finished product at the end, which can be tested. By finished, I mean a product with fully integrated functionality which can be tested through the entire system (e.g. from the GUI all the way down to the database).

This allows for continuous testing, and will give early warnings about problems in the requirements, architecture, or technology. On top of that, it has the benefit of demonstrating progress to people inside and outside the team – the psychological value of this cannot be overestimated, in a project where everybody feels that they have been working hard without showing any progress.

This approach also works for systems which are fully implemented, but which are full of bugs. Here the existing functionality should be fixed so that it is bug-free.

No matter whether it is new or existing functionality, it should be prioritized according to how important it is for the customer, and where it falls in the workflow of the user. If you have a workflow where the functionalities are used in the order A->B->C->D, then you should implement them in that order, even if the customer feels that B is more important than A. The exception, of course, is if there are alternative workflows which will take the user either directly to B or to B through some other functionality – then it might make sense to implement B before A.

If the project is just one of several interdependent projects (e.g. if the service provider and the service consumer are implemented at the same time), it's important to coordinate the iterations, so any dependencies between the projects are taken into consideration when planning iterations.

Implement functionality rather than architecture

This pretty much follows from the last point, but it's important to keep in mind anyway. When developing a system, there are a lot of frameworks that need to be put in place (caching, data access layers etc.), but once that's done, the team should stop thinking in terms of architecture. Instead the team should focus on functionality.
An example of developers thinking of architecture instead of functionality is the case where new requirements are implemented layer-wise. E.g. first all the changes are made to the database, then to the ORM etc. This means that none of the new requirements (or changes to the old ones) is done before they are all done. Not a good way to make visible progress, and not something which can easily be tested before the very end.

Consider pair programming for complex problems

I must admit that I am not particularly hooked on the concept of pair programming, as I am not sure that the costs are in proportion with the benefits. If you have two programmers on equal level, then pair programming can make sense, since it can create a synergy effect, but if the programmers are on different levels, then it will quickly turn into a mentoring process. While I find mentoring processes valuable, they have their time and place, and it’s not necessarily in the everyday programming in a project that has hit problems.

Still, if there are complex problems which need to be solved in the system, then pair programming might very well be a very good idea. The benefits of pair programming should in most cases easily be worth the reduced productivity for a time (given that most people don't pair program most of the time, there will be a cost in productivity when doing this). The benefit of having two people write the code is that they work together to solve the problems, making it more likely that it's done correctly, and there will be two people who understand both the problems and the solutions to them (as they were implemented).

Use continuous integration

To my mind, there is nothing that beats instant feedback when there are problems. Continuous integration is a powerful tool for giving developers instant feedback about problems with their code, allowing them to fix those problems as soon as they occur.

Continuous integration is, simply put, the practice of continuously committing your code to the code base, where it's built and has tests run on it (unit tests and smoke tests) to see whether it works as it should.

This doesn't absolve the developers of running unit tests etc. before they check in. Instead it's a safeguard against any issues that might have occurred during check-in (a forgotten file etc.).

For more on continuous integration, go read Martin Fowler’s article on the subject.

Monday, June 15, 2009

Ending error-driven development, part 1 - testing

Some time ago, I wrote about something I call ”error-driven development”, which is a type of software development I come across all too often. You can find the original post here.

I’ve found out that many software developers and consultants can relate to the post, and I’ve discussed with several what one can do about error-driven development (EDD).

Well, there is no perfect answer to this question, since the root cause of EDD is different in every EDD project. I have, however, been on a number of EDD projects through the years, so I have some suggestions for general measures one can take to either turn EDD into something else, or to limit the damage.

I’ll try to go through some of them from time to time. In this post I’ll focus on testing.

Testing

I will make the claim that testing is one of the most underrated activities in software development projects, and this has to change in order to avoid EDD. What’s more, testing is also a widely misunderstood concept. Testing is a much bigger activity than most people believe, and covers more aspects than generally thought.

Testing should of course ensure that the system works as intended, but it should also ensure that the system doesn’t work when it’s not supposed to, and that the system can handle unexpected events in a meaningful way.

In his book Release It, Michael Nygard makes a very good point: systems are built to pass acceptance tests, not to run in the real world. This is one of the things that lead to EDD projects, where the developers are working on a later version of a system which is already in production.

Testing should allow for the particularities of the real world, and not only for the test environments (see Release It for some very good examples of the differences, and some good ways of making up for these differences).

There are several types of testing, some of which I will cover here, and in my experience, focusing on just one of them will lead to problems in the long run.

Unit testing

With the spread of concepts like test-driven development, unit tests are very much in vogue. Unfortunately, books on TDD and its ilk generally don't explain how unit tests should be written – just that they are important, and should be written before the code.

Writing unit tests to ensure that code works as expected is of course very important, but if that's all the unit tests do, it's not enough. Unit tests should also ensure that code doesn't work when that's what is expected – e.g. if a method gets an invalid parameter, you expect it to fail in some way or another. Test for this – don't just assume that it's the case, even if the code works with correct input parameters. Besides ensuring that the code works as it should, even when this means throwing an exception, it also makes it easier for others to see what behavior is expected of the code.

There is, unfortunately, a tendency to focus on the code coverage of unit tests, where code coverage is taken to mean the percentage of code lines executed during the tests. This is the wrong coverage measure. Instead one should focus on covering all the breaking and non-breaking states that the code can be in.

E.g. if you have some code which receives a text string containing a number, which it converts to a number, make sure to test the following:
a) A string containing a positive integer
b) A string containing a positive floating point number using the normal separator (in the US an example could be “10.10”)
c) A string containing a positive floating point number using a separator from a different culture (e.g. the Danish “10,10”).
d) The same as b) and c) just with thousand-separators (“1,000.00” and “1.000,00” respectively).
e) The same as a) through d), but with negative numbers instead.
f) A string containing a number too large to be handled by the data type it’s going to be converted to.
g) A string containing a negative number too large to be handled by the data type it’s going to be converted to.
h) A string containing letters
i) A string containing zeros in front of the number

I could continue, but you get the point. As you can see, that's a large number of tests for a fairly simple piece of functionality, which is often implemented using built-in functionality. Even so, it's worth spending the time on these, as this is the sort of thing which can cause real problems in production.
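
To make this concrete, here is a minimal sketch of what a handful of these tests could look like in C# with NUnit. The parsing helper and the exact values are my own illustrative assumptions, not code from any particular project, and the negative-number and leading-zero cases are left out for brevity:

```csharp
using System;
using System.Globalization;
using NUnit.Framework;

[TestFixture]
public class NumberParsingTests
{
    // Hypothetical helper under test: parses a string using an explicit culture.
    private static decimal Parse(string input, CultureInfo culture)
    {
        return decimal.Parse(input, NumberStyles.Number, culture);
    }

    [Test]
    public void Parses_a_positive_integer()
    {
        Assert.AreEqual(42m, Parse("42", CultureInfo.InvariantCulture));
    }

    [Test]
    public void Parses_decimal_separators_from_both_cultures()
    {
        Assert.AreEqual(10.10m, Parse("10.10", new CultureInfo("en-US")));
        Assert.AreEqual(10.10m, Parse("10,10", new CultureInfo("da-DK")));
    }

    [Test]
    public void Parses_thousand_separators_from_both_cultures()
    {
        Assert.AreEqual(1000.00m, Parse("1,000.00", new CultureInfo("en-US")));
        Assert.AreEqual(1000.00m, Parse("1.000,00", new CultureInfo("da-DK")));
    }

    [Test]
    public void Fails_on_letters()
    {
        // The code is expected NOT to work here - test that it fails loudly.
        Assert.Throws<FormatException>(
            () => Parse("abc", CultureInfo.InvariantCulture));
    }

    [Test]
    public void Fails_on_a_number_too_large_for_the_type()
    {
        Assert.Throws<OverflowException>(
            () => Parse("79228162514264337593543950336", CultureInfo.InvariantCulture));
    }
}
```

Note that the last two tests are exactly the "ensure it doesn't work when it shouldn't" kind of test mentioned above.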

Smoke testing

Unit tests are of course not the only sort of testing; there are others which are just as important. Smoke tests are automated tests which can be run to test different flows through the system. E.g. in an internet portal, a smoke test might log in and navigate to a specific page, while entering data in the intermediate pages.

These tests generally require some kind of tool to create. Depending on your development framework and the nature of the system, you need to find one that suits you. In portal projects I've seen pretty good results with smoke tests made in Ruby, but in my current project we are using Art of Test's WebAii, where the tests are written in C# or VB.NET (but can test web GUIs written in other languages).

Smoke tests require a lot of time to make and maintain, especially in a system where the user interface changes often. In such cases, it might make sense to have resources focused on running and maintaining the smoke tests. These people shouldn't focus only on this, but they should have the responsibility of ensuring that all smoke tests can run at all times.

Even if there are people responsible for maintaining the smoke tests, it should be the responsibility of the developers to run the relevant smoke tests before checking in any changes to the user interface, and in case a test fails, to correct the tests or the code as need be.

Smoke tests help ensure that changes in one part of the user interface don't have a negative impact on the functionality of another part, which is otherwise often the case.
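
I can't show a WebAii example from a real project here, but purely as an illustration of the shape of such a test, a login-and-navigate smoke test written against Selenium WebDriver in C# could look roughly like this (the URL, element ids, credentials, and page title are all made up):

```csharp
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

[TestFixture]
public class LoginSmokeTest
{
    [Test]
    public void User_can_log_in_and_reach_the_frontpage()
    {
        using (IWebDriver driver = new ChromeDriver())
        {
            // Hypothetical portal URL and element ids - replace with your own.
            driver.Navigate().GoToUrl("https://portal.example.com/login");
            driver.FindElement(By.Id("username")).SendKeys("smoketest-user");
            driver.FindElement(By.Id("password")).SendKeys("smoketest-password");
            driver.FindElement(By.Id("login-button")).Click();

            // A shallow check is enough for a smoke test:
            // did the flow end up where we expected?
            Assert.IsTrue(driver.Title.Contains("Frontpage"),
                "The login flow did not end on the expected page.");
        }
    }
}
```

The point is not the specific tool, but that the whole flow is exercised end to end and can be re-run automatically before every check-in that touches the user interface.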

Integration testing

In these days of SOA, ROA and what have you, it's very rare that a system stands alone. Rather, systems tend to work together with other systems through integration points. Even if the system doesn't work with other systems over the network, it will generally use a database management system, such as DB2, Oracle, or MS SQL, run on an operating system (*NIX, Windows etc.), or have other interaction with other systems. All this should be tested.

If possible, integration testing should be automated, but even if that’s not practical for some reason or other, manual integration testing should be done.

As with smoke testing, it's possible to get a number of tools which allow you to make the tests. The selection of tools again depends on the system and the development framework.

Integration testing can be very difficult, as the testing is dependent upon external systems, some of which might not have been coded yet. In such cases, remember that it's not the other systems that the test should test, but rather the integration points with them. So there is no real need for a fully functional system at the other end. Rather, it's sufficient to have a mock system which sends data as it could appear from the external system. This can be done through tools like soapUI, which can both send data through web services your system exposes, and serve as a receiver for your web service requests. Of course, this isn't always enough, and I have experienced a project where the behavior of the developed system was so dependent on the retrieved data that it was necessary to build a simulator, simulating all the back end systems.
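
At the code level, the same idea can be applied by putting the integration point behind an interface, so tests can run against a stub that returns data shaped like the real system's responses. Here is a minimal C# sketch of that approach; every type and value in it is an illustrative assumption, not part of any real system:

```csharp
using System.Collections.Generic;
using NUnit.Framework;

// The integration point is expressed as an interface, so the external
// system can be swapped for a stub in tests.
public interface ICustomerService
{
    string GetCustomerName(int customerId);
}

// Stub that sends back data as it could appear from the external system.
public class StubCustomerService : ICustomerService
{
    private readonly Dictionary<int, string> _customers =
        new Dictionary<int, string> { { 42, "Jane Doe" } };

    public string GetCustomerName(int customerId)
    {
        if (!_customers.ContainsKey(customerId))
            throw new KeyNotFoundException("Unknown customer " + customerId);
        return _customers[customerId];
    }
}

// The class under test only knows about the interface, not the real system.
public class GreetingBuilder
{
    private readonly ICustomerService _customers;

    public GreetingBuilder(ICustomerService customers)
    {
        _customers = customers;
    }

    public string BuildGreeting(int customerId)
    {
        return "Dear " + _customers.GetCustomerName(customerId);
    }
}

[TestFixture]
public class GreetingBuilderIntegrationTests
{
    [Test]
    public void Builds_a_greeting_from_external_customer_data()
    {
        var builder = new GreetingBuilder(new StubCustomerService());
        Assert.AreEqual("Dear Jane Doe", builder.BuildGreeting(42));
    }

    [Test]
    public void Fails_clearly_when_the_external_system_has_no_data()
    {
        var builder = new GreetingBuilder(new StubCustomerService());
        Assert.Throws<KeyNotFoundException>(() => builder.BuildGreeting(7));
    }
}
```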

Remember to test for differences in cultures between systems. Can your system survive that the date format or the numbers it receives conform to a different cultural standard than yours? This is something that's easily overlooked, but which can have a great impact – either by crashing the system, or by the system misunderstanding the values. It makes a great difference whether the date "1/5/2009" is January 5th or May 1st.

Even less ambiguous formats might cause problems, and they can be even harder to figure out. E.g. a date format of "dd-MMM-yyyy" would be fine for the first four months when exchanging data between a Danish and a US system, but on May 1st it would be "01-May-2009" in the English-speaking world and "01-Maj-2009" in the Danish-speaking world. This could mean that the system suddenly, and unexpectedly, stops working as expected, even though everything has been running just fine until then (this is not a made-up example – I once started in a new job on May 1st, where my first accomplishment was to figure out this exact problem).
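
As a small C# illustration of that failure mode (the program is my own sketch, not code from the project mentioned above), parsing the same "dd-MMM-yyyy" string can succeed or fail depending on which culture does the parsing:

```csharp
using System;
using System.Globalization;

class DateCultureExample
{
    static void Main()
    {
        const string format = "dd-MMM-yyyy";

        // A US system writes the date...
        string usDate = new DateTime(2009, 5, 1)
            .ToString(format, new CultureInfo("en-US"));   // "01-May-2009"

        // ...and a Danish system tries to read it back with Danish month names.
        DateTime parsed;
        bool ok = DateTime.TryParseExact(
            usDate, format, new CultureInfo("da-DK"),
            DateTimeStyles.None, out parsed);

        // For the first months of the year the abbreviations happen to line up
        // (depending on the exact culture data), so the failure may only show
        // up on May 1st - "May" vs. "Maj".
        Console.WriteLine(ok
            ? "Parsed: " + parsed.ToString("yyyy-MM-dd")
            : "Could not parse: " + usDate);
    }
}
```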

The more integration tests you make during development, the fewer fixes need to be done when the system is in production (I refer to Michael Nygard's Release It for good advice on making testing environments for integration testing).

Manual testing

There is unfortunately a tendency for developers to believe that as long as you have enough automated tests, there is no need for manual testing. This is of course nonsense.
No matter how many automated tests you have, and how sophisticated the tools you've used to make them, there is no substitute for human eyes on the system.

Manual tests can be divided into two groups: systematic testing (based on test cases) and monkey testing.

Systematic testing, normally done based on test cases, tests the functionality of the system, ensuring that it works as specified, including implicit specifications. The testers should have enough understanding of the business to test the system meaningfully, not just follow the test script step by step.

With regards to the test cases, my general suggestion is that they are not written by technical people, but rather by people with an understanding of the business domain. Optimally they should be written at the same time as the requirements, or at least before the coding really starts, and only in general terms. Before the developer starts developing, he or she should read the relevant test cases, making sure that he or she understands the requirements as stated in general terms. If there are business concepts that appear unclear, the developer can then acquire the necessary domain knowledge before starting on the development. When the system is developed, the test cases can be made specific to the system (I recommend keeping the unspecific test cases in reserve though, as the system can change a lot over time, and it's good to have some general test cases to refer back to).

As with all the earlier tests, there should also be testing of wrong usage of the system, ensuring that this wrong usage will result in neither major problems nor a wrong result.

Note that while test cases and use cases might sound similar, at least at first, they are not, at least not as I describe test cases here. Use cases describe things on an abstract level, while test cases are more specific. In an insurance system, a use case would describe how the user creates an insurance policy. Test cases would not only describe that the user creates an insurance policy, but also what sort of insurance to choose, what values should be used, and what extras should be selected.

Monkey testing is unsophisticated testing of the system, where the tester tries to do whatever suits him or her, trying to provoke a failure in the system. It might be entering a wrong value in a field, clicking on a button several times in a row, or doing something else unexpected by the developers. The purpose of the testing is to emulate the sort of things which might happen in the real world, outside the safe testing zone.

While monkey testing, it's very important to document the exact steps which result in the error. Sometimes the symptom of the error (the system failing) occurs a rather long time after the action which caused it.

In conclusion

There are of course many other sorts of testing (performance testing for one), but I feel that by doing the sort of testing I mention, one can do a lot to prevent a project turning into an EDD project.

The reason good testing can help avoid EDD is simple. A lot of the time, EDD projects only address the symptoms, fixing bugs as they are reported, but they don't address the fundamental problems, so these fixes are only temporary at best, and in general introduce other errors, which are only discovered at a later stage.

Testing will ensure that the system being developed is stable, or at least that non-working functionality is discovered at an earlier stage. Testing also ensures that changes can be introduced more easily, as side-effects show up straight away.

Of course, introducing testing into an EDD project is not easy. The project will be running behind schedule, and people will be overburdened with work, so adding new tasks will not seem doable. This doesn't mean that testing shouldn't be done, though - just that it should be introduced in steps, rather than all at once. Find the core functionality, or alternatively the most problematic code, and introduce testing there – unit testing should come first, but don't forget the other types of testing.

I know this is easier said than done, but I've been on projects solidly in the EDD category which we managed to turn around, in part because of testing. In one project, we made the case for unit tests by making 10 unit tests of basic functionality in the system, and showing eight of them failing. This resulted in me getting resources allocated, just to ensure proper unit testing of all basic functionality (we later expanded to other functionality and introduced other types of testing).

If such a drastic demonstration isn't possible, start by writing unit tests whenever you change some code – this will ensure that the code works properly after you've changed it. Sadly, code in EDD projects is often not in a state where unit tests can easily be introduced. This is why they should be introduced when the code is changed anyway, since that gives an opportunity to refactor the code at hand to allow unit tests.

I hope this rather long post made sense to people. These are not revolutionary concepts I'm trying to introduce, and for many people, the things I mention are blatantly obvious. Even so, there are many people, and organizations, out there for which testing doesn't come naturally. These people, and organizations, need to be reminded every so often that there is a very good reason why we do these things.

Testing can't stand alone, of course; many other measures are needed to keep development from turning into EDD, or to turn a project away from being an EDD project. Still, testing is fundamental to healthy development, so leaving it out will more or less guarantee that the project turns into EDD.

Book Review: The Pragmatic Programmer

The Pragmatic Programmer - from journeyman to master by Andrew Hunt and David Thomas (Addison-Wesley, 2000)

After having this book recommended several times, I got my work to buy it for the office. And I'm quite happy that I did.

The goal of this book is to give programmers (or rather systems developers) a set of tips on how to become better, by becoming more pragmatic. In this, the book is quite successful.

When you've worked in the IT field for some years, as I have, you'll probably have heard most, or all, of the ideas before. Indeed, many of them are industry standards by now (e.g. using source control). Even so, it's good to have them all explained in one place, and it might remind people to actually do things the right way, instead of cutting corners, which will come back and haunt the project later.

If you're new to the field, I think this book is a must-read, especially if you're going to work in project-oriented environments (e.g. as a consultant). I'm certainly going to recommend that we get inexperienced new employees to read this book when they start.

Now, to the actual content of the book. It covers a lot of ground, not in depth, but well enough to give people a feel for the subject. The first two chapters ("A Pragmatic Philosophy" and "A Pragmatic Approach") explain the ideas and reasons behind being pragmatic, and how it applies to systems development. The next chapter ("The Basic Tools") describes what tools are available and should be used. This is probably the most dated chapter, especially when it comes to the examples, but it's still possible to get the general idea.

Chapter 4 ("Pragmatic Paranoia") and chapter 5 ("Bend, Or Break") deal with two areas where, in my opinion, many people are too relaxed: testing and coding defensively (ensuring valid input data etc.). I cannot recommend these two chapters highly enough.

"While You Are Coding" explains how to code better, and (more importantly in my opinion) when and how to refactor. The last two chapters ("Before the Project" and "Pragmatic Projects") gives tips on how to set up and run projects in a pragmatic way.

There are of course tips that I disagree with, or which I would have put less emphasis on, and the book was obviously written before agile methods, like scrum, became widespread (though eXtreme Programming is mentioned). Even so, I can really recommend the book to everyone, novices and experienced developers alike.

Book Review: Release It!

Release It! - Design and Deploy Production Ready Software by Michael T. Nygard

If you are in the business of making software systems, odds are that you might have heard about Nygard's book. People have raved about it since it was published in 2007.

That being the case, it had been on my to-read list for a while, but without any urgency. Then I went to the JAOO conference last month, and heard two sessions with Michael Nygard presenting his ideas. After that, I knew I had to get hold of the book straight away.

Release It! is something as rare as a book which is groundbreaking while stating the obvious.

First of all, Nygard makes the simple point that we (meaning the people in the business) are all too focused on making our systems ready to pass QA's tests, and not on making them ready to go into production. This is hardly news, but it's the dirty little secret of the business. It's not something you're supposed to say out loud. Yet Nygard does that. And not only that, he dares to demand that we do better.

Having committed this heresy, he goes on to explain how we can go about doing that.

He does that in two ways. First he presents us with the anti-patterns which will stop us from having a running system in production, and then he presents us with the patterns which make it possible to avoid them. Or, if it's not possible to avoid them, to minimize the damage they cause.

That's another theme of Nygard's book. The insistence that the system will break, and the focus on implementing ways to do damage control and recovery.

The book is not only aimed at programmers, though they should certainly read it; it's also aimed at anyone else involved in the development, testing, configuration and deployment of the system at a technical level, including people involved in the planning of those tasks.

As people might have figured by now, I think the hype around the book has been highly warranted, and I think that any person involved in the field would do well to read the book.

Error-driven software development

When developing software systems, there are a number of systems development types out there, e.g. test-driven development (which focuses on making tests before implementing) and what might be called requirements-driven development (which focuses on finding all the requirements before implementing). Unfortunately, there is a type of development that I all too frequently come across, which I've come to call error-driven development.

Error-driven development is systems development, where everything is done in reaction to errors. In other words, the development is reactive, rather than proactive, and everybody is working hard, just to keep the project afloat, without any real progress being made.

I should probably clarify that I am not speaking about the bug-fixing phases which occur in every project, but rather the cases where the project seems to be nothing but bug-fixing (or change requests, which to my eyes are a different sort of bug report), without any real progress being made.

Unsurprisingly, this is not very satisfactory for any of the people involved. What's more, it's often caused by deep, underlying problems, where the errors are just symptoms. Until these underlying problems are found, the project will never get on the right track, and will end up becoming a death march.

The type of underlying problems, which can cause error-driven development, could be things like:

  • Different understanding of the requirements for the software among the people involved. Sometimes the people who write the requirements have an entirely different understanding of what the end system should be like than the end users.

  • Internal politics. Some departments or employees might have different agendas, which might lead to less than optimal working conditions.

  • Lack of domain knowledge among the people involved. If you're building e.g. a financial system, it helps if at least some of the people involved in the development have a basic idea of the domain you're working within.

  • Bad design. Sometimes early design decisions will haunt you for the rest of the project.

  • Unrealistic time constraints. If people don't have time to finish their things properly, they will need to spend more time on error fixing later.



There are of course many other candidates, and several of them can be in play at the same time, causing problems.

No matter what the underlying problems are, the fact is that just focusing on fixing bugs and implementing change requests won't help. Instead it's important to take a long hard look at the project, and see if the underlying problems can be found and addressed.

This seems trivial, but when you're in the middle of an error-driven development project, it's hard to step out and take an objective look at it. What's more, you might not be able to look objectively at the process. Often, it requires someone who hasn't been involved from the start, to come and look at things with fresh eyes.

As a consultant who often works on a time-and-materials basis, I often get hired to work on error-driven development projects. The reason for this is simple: it often appears to the people involved that the project just needs a little more resources, so they can get over the hurdle of errors, and then it will be on the right track. When hired for such projects, I always try to see if there are underlying problems which need to be addressed, instead of just going ahead and fixing errors/implementing changes. Unsurprisingly, there often are such problems.

Frequently these problems can be fixed fairly simply (reversing some old design decisions, expanding peoples' domain knowledge, get people to communicate better, implement a test strategy, use agile methods etc.), while at other times, they can't be fixed, only taken into consideration, allowing you to avoid the worst pitfalls.

So, my suggestion is: if you find yourself in a project which over time has turned into an error-driven development type project, try to take a long hard look at what has caused this, instead of just going ahead and trying to fix all the errors/implement the changes. Error reports and change requests are just noisy symptoms in most cases, and will continue to appear as long as the real problems aren't addressed in one way or another.