Tuesday, May 29, 2012

Software Engineering is Engineering

Ultimately, real advances in software development
depend upon advances in programming techniques,
which in turn mean advances in programming languages.
-- Jack W. Reeves, 1992.

I hope you liked the previous article "Why We're Writing the Same Code Over and Over". I think the idea of abstraction cost is very useful, in particular for understanding the progress of the software development industry. Another important thing in that regard is seeing the right parallels to other industries.

Many people argue about what software development is: engineering or craftsmanship? But that is a false dichotomy.

The usual argument is that the exact outcome of some software development projects is hard to predict, so software development must not be engineering. But engineering projects in more traditional industries also differ greatly in how predictable their exact outcome is.

Years ago I read an article that completely changed how I thought about software design.
The name of the article is What Is Software Design? It was written by Jack Reeves and published in The C++ Journal, Vol. 2, No. 2, 1992.
Essentially the point is this. When the original phases of software development were laid down, they were just plain wrong. Requirements, Design, Implementation, and Test are not what we think they are. Design is not something that you do only before you code. Implementation is not the act of coding. We can see this if we look realistically at what they are in other engineering disciplines.

(from the article):
The final goal of any engineering activity is to create some kind of documentation. When a design effort is complete, the design documentation is given to the manufacturing team. This is a different set of people with a different set of skills from those of the design team. If the design documents truly represent a complete design, the manufacturing team can proceed to build the product. In fact, they can proceed to build much of the product without further assistance from the designers. After reviewing the software development life cycle today, it appears that the only software documentation that actually seems to satisfy the criteria of an engineering design are the source code listings.
This is how people build bridges and buildings and toasters...

Think about it. When you are programming, you are doing detailed design. The manufacturing team for software is your compiler or interpreter. The source is the only complete specification of what the software will do. The cute boxes in class diagrams are not the design, they are a high level view of the design.
Consider engineers making a bridge. They make plans (which he calls design documents) and hand them to the construction team building the bridge. EEs make schematics (which he calls design documents) and hand them to the techs with soldering irons. We make source code (which he calls design documents) and hand it to the computer which manufactures software. The parallel is striking. We are just the first branch of engineering that had a way to go directly from design to product at the press of a button. Building, or implementation, is so easy in software that we overlooked it and called programming implementation. We forget that every coding decision is a design decision which impacts the capabilities of the product. -- Michael Feathers

What Is Software Design?
This essay first appeared in the Fall, 1992 issue of C++ Journal.
Object oriented techniques, and C++ in particular, seem to be taking the software world by storm.
Again, there are probably a number of reasons why, but I want to suggest an answer from a slightly different perspective: C++ has become popular because it makes it easier to design software and program at the same time.
For almost 10 years I have felt that the software industry collectively misses a subtle point about the difference between developing a software design and what a software design really is.
The software industry has created some false parallels with hardware engineering while missing some perfectly valid parallels.
To summarize:
  • Real software runs on computers. It is a sequence of ones and zeros that is stored on some magnetic media. It is not a program listing in C++ (or any other programming language).
  • A program listing is a document that represents a software design. Compilers and linkers actually build software designs.
  • Real software is incredibly cheap to build, and getting cheaper all the time as computers get faster.
  • Real software is incredibly expensive to design. This is true because software is incredibly complex and because practically all the steps of a software project are part of the design process.
  • Programming is a design activity—a good software design process recognizes this and does not hesitate to code when coding makes sense.
  • Coding actually makes sense more often than believed. Often the process of rendering the design in code will reveal oversights and the need for additional design effort. The earlier this occurs, the better the design will be.
  • Since software is so cheap to build, formal engineering validation methods are not of much use in real world software development. It is easier and cheaper to just build the design and test it than to try to prove it.
  • Testing and debugging are design activities—they are the software equivalent of the design validation and refinement processes of other engineering disciplines. A good software design process recognizes this and does not try to short change the steps.
  • There are other design activities—call them top level design, module design, structural design, architectural design, or whatever. A good software design process recognizes this and deliberately includes the steps.
  • All design activities interact. A good software design process recognizes this and allows the design to change, sometimes radically, as various design steps reveal the need.
  • Many different software design notations are potentially useful—as auxiliary documentation and as tools to help facilitate the design process. They are not a software design.
It is tempting to ask if there is any other engineering discipline that can produce designs of such complexity as software in such a short time, but first we have to figure out how to measure and compare complexity. Nevertheless, it is obvious that software designs get very large rather quickly.
Complex hardware designs have correspondingly complex and expensive build phases. As a result, the ability to manufacture such systems limits the number of companies that produce truly complex hardware designs. No such limitations exist for software....
If anything, as CAD and CAM systems have helped hardware designers to create more and more complex designs, hardware engineering is becoming more and more like software development.
It seems obvious to most people that software designs do not go through the same rigorous engineering as hardware designs. However, if we consider source code as design, we see that software designers actually do a considerable amount of validating and refining their designs. Software designers do not call it engineering, however; we call it testing and debugging. Most people do not consider testing and debugging as real "engineering"; certainly not in the software business. The reason has more to do with the refusal of the software industry to accept code as design than with any real engineering difference. Mock-ups, prototypes, and bread-boards are actually an accepted part of other engineering disciplines. Software designers do not have or use more formal methods of validating their designs because of the simple economics of the software build cycle.

Revelation number two: it is cheaper and simpler to just build the design and test it than to do anything else.
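As an illustration that is not part of the essay, here is a minimal C++ sketch of a design that looks right on paper and is refuted by a quick build and test. The function names are hypothetical, and the wrap-around described in the comments assumes a common platform where int is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// First attempt: the obvious formula for the midpoint of [lo, hi].
std::uint32_t midpoint_naive(std::uint32_t lo, std::uint32_t hi) {
    return (lo + hi) / 2;  // lo + hi can wrap around for large inputs
}

// Refinement made after testing; assumes lo <= hi.
std::uint32_t midpoint_refined(std::uint32_t lo, std::uint32_t hi) {
    return lo + (hi - lo) / 2;  // hi - lo cannot wrap when lo <= hi
}

// The property the design is supposed to guarantee.
bool in_range(std::uint32_t m, std::uint32_t lo, std::uint32_t hi) {
    return lo <= m && m <= hi;
}
```

Checking in_range(midpoint_naive(4000000000u, 4000000010u), 4000000000u, 4000000010u) exposes the flaw within seconds, while midpoint_refined passes the same check. Running the test is cheaper than trying to reason through every input range in advance.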
Most current software development processes try to segregate the different phases of software design into separate pigeon-holes. The top level design must be completed and frozen before any code is written. Testing and debugging are necessary just to weed out the construction mistakes. In between are the programmers, the construction workers of the software industry.
For example, no other modern industry would tolerate a rework rate of over 100% in its manufacturing process. A construction worker who can not build it right the first time, most of the time, is soon out of a job. In software, even the smallest piece of code is likely to be revised or completely rewritten during testing and debugging. We accept this sort of refinement during a creative process like design, not as part of a manufacturing process. No one expects an engineer to create a perfect design the first time. Even if she does, it must still be put through the refinement process just to prove that it was perfect.
The overwhelming problem with software development is that everything is part of the design process. Coding is design, testing and debugging are part of design, and what we typically call software design is still part of design.
It would be nice if top level designers could ignore the details of module algorithm design. Likewise, it would be nice if programmers did not have to worry about top level design issues when designing the internal algorithms of a module. Unfortunately, the aspects of one design layer intrude into the others. The choice of algorithms for a given module can be as important to the overall success of the software system as any of the higher level design aspects.
It is probably better to let the original designers write the original code, rather than have someone else translate a language independent design later. What we need is a unified design notation suitable for all levels of design. In other words, we need a programming language that is also suitable for capturing high level design concepts. This is where C++ comes in. C++ is a programming language suitable for real world projects that is also a more expressive software design language. C++ allows us to directly express high level information about design components. This makes it easier to produce the design, and easier to refine it later. With its stronger type checking, it also helps the process of detecting design errors. This results in a more robust design, in essence a better engineered design.
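To make the point about expressing design information in code concrete, here is a small sketch that is not from the essay. The type and function names (Meters, Seconds, MetersPerSecond, average_speed) are hypothetical. It assumes only standard C++11 and shows how distinct types let the compiler reject one kind of design error.

```cpp
#include <cassert>

// Hypothetical domain types: each wraps a double but is a distinct type,
// so the compiler will not silently accept one where another is expected.
struct Meters { double value; };
struct Seconds { double value; };
struct MetersPerSecond { double value; };

// The signature records a design decision: speed is distance over time.
constexpr MetersPerSecond average_speed(Meters distance, Seconds elapsed) {
    return MetersPerSecond{distance.value / elapsed.value};
}

// A runtime check of the intended behavior.
void check_average_speed() {
    assert(average_speed(Meters{100.0}, Seconds{20.0}).value == 5.0);
}
```

Swapping the arguments, as in average_speed(Seconds{20.0}, Meters{100.0}), does not compile, whereas a version that took two plain double parameters would accept the swapped call and return a wrong answer.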

Ultimately, a software design must be represented in some programming language, and then validated and refined via a build/test cycle. Any pretense otherwise is just silliness. Consider what software development tools and techniques have gained popularity. Structured programming was considered a breakthrough in its time. Pascal popularized it and in turn became popular. Object oriented design is the new rage and C++ is at the heart of it. Now think about what has not worked. CASE tools? Popular, yes; universal, no. Structure charts? Same thing. Likewise, Warner-Orr diagrams, Booch diagrams, object diagrams, you name it.
This says that the collective subconscious of the software industry instinctively knows that improvements in programming techniques and real world programming languages in particular are overwhelmingly more important than anything else in the software business. It also says that programmers are interested in design. When more expressive programming languages become available, software developers will adopt them.

Also consider how the process of software development is changing. Once upon a time we had the waterfall process. Now we talk of spiral development and rapid prototyping. While such techniques are often justified with terms like "risk abatement" and "shortened product delivery times", they are really just excuses to start coding earlier in the life cycle. This is good. This allows the build/test cycle to start validating and refining the design earlier. It also means that it is more likely that the software designers that developed the top level design are still around to do the detailed design.

As noted above—engineering is more about how you do the process than it is about what the final product looks like. We in the software business are close to being engineers, but we need a couple of perceptual changes. Programming and the build/test cycle are central to the process of engineering software. We need to manage them as such.
Undoubtedly, keeping such documentation up to date manually is difficult. This is another argument for the need for more expressive programming languages. It is also an argument for keeping such auxiliary documentation to a minimum and keeping it as informal as possible until as late in the project as possible.

What Is Software Design: 13 Years Later
When the article appeared, I hoped–actually expected–that I would get some type of rebuttal from some sort of industry "expert." I was looking forward to this since part of my reason for writing the article had been hopes of stimulating discussion within the software industry about the overall software development process. Nothing happened.
Everybody that has been in this business any length of time has seen plenty of examples where someone obviously sat down and coded the first thing that popped into their mind. Later, when it became obvious that there were shortcomings with the approach, there was too much blood, sweat, and "skin" invested in the code to scrap it and do something better. Fine, we all know a little thought can go a long way.

On the other hand, any of us who has spent time on a traditional development project with its strict rules forbidding the writing of a single line of code until the "design" is completed and reviewed and approved, etc. knows you can waste a hell of a lot of time producing documents that are out of date literally days after the actual coding starts. Why bother?

You would think we could find some happy medium of "enough" design effort, but not too much. There is no such thing. The only way we validate a software design is by building it and testing it. There is no silver bullet, and no "right way" to do design. Sometimes an hour, a day, or even a week spent thinking about a problem can make a big difference when the coding actually starts. Other times, 5 minutes of testing will reveal something you never would have thought about no matter how long you tried. We do the best we can under the circumstances, and then refine it.

Letter to the Editor (Precursor to What Is Software Design?)
I get somewhat testy when people start making gratuitous comparisons between software design and other engineering disciplines. Major microprocessors have been shipped with bugs in their logic, bridges have collapsed, dams broken, airliners fallen out of the sky, and thousands of automobiles and other consumer products recalled - all within recent memory and all the result of design errors.
If what we are really doing is software design, then everything we do will somehow be reflected in code. We might as well write the code (or that portion of it that makes sense) when we make the decisions that affect that code.

I know all the arguments for "language independent" software design notations. They all ignore a fundamental problem. Software design involves translating concepts from some problem space into a programming language. This translation has to be done by human beings, and since our programming languages are usually totally inadequate to express the concepts of the problem space directly, it is usually a difficult and error prone process. When we translate concepts from one form to another, especially complex ones, we often lose important information. If several translations are involved, we are likely to end up with a final product that lost too much vital information, that does not accurately reflect our original concept, and/or that simply contains errors. This is compounded several times when the people actually doing the translation are different for each step. Remember, there is nothing sacred about C++ (or Ada, or C, or Smalltalk, or LISP, or any programming language). It is not the native language of our computers. Fundamentally, programming languages are just a design notation themselves. I do not see any point in introducing extra translation steps if they can be avoided.
Ultimately, all software design processes end up validating and refining the final design via a build/test cycle. Any pretense otherwise is just silliness. Yet, traditional MIL-STD and other waterfall model development processes will not even allow writing one line of code until a certain tonnage of auxiliary documentation has been produced and reviewed. Often, the people who produce this documentation then go on to other things leaving a group of new, and usually much younger and less experienced people to actually generate the real software design. It is hardly surprising (to me anyway) that this process has fallen into such disrepute that no real developers advocate it.
Maybe if we started treating software development as a homogeneous design process, and concentrated on improving the most important phases (programming, debug and test), we might find our industry to be more of a disciplined science than we think it is.
The next article is "Are Variables Evil?".
