Programming as a means of reducing complexity
In the past year I’ve spent a fair amount of time considering programming from a different level of abstraction than I usually do. Part of the reason for this has been frustrations I’ve had implementing ideas. Part of it has been a desire for more power in programming. Part of it has been a need to unify the many directions my programming takes into one whole. And largely, it is the unquenchable desire to reach faster, farther, higher.
My search has demanded a solid basis for starting, a working definition of what it is that programming is and what it does. The definition I have been working with is the following:
Programming aims to reduce complexity
The complexity we speak of is the infinite complexity of the “real world”. To model the temperature in a house, to borrow an example from Booch, is inherently incomprehensible in its complexity. The variables are literally infinite, and therefore no a priori complete solution is tenable.
So we must settle for a simplified model of the problem domain. This invokes the necessary process of ‘naming’. Naming is not the assigning of a label to a real-world object. The real-world “object” we speak of takes shape because of our label, and not the other way around. The “object” doesn’t have an objective existence that we hunt to find and label. We make it come into existence. How is this so?
A good illustration of this could be the ocean. When you look at the ocean, what colors do you see? You may be able to name one color or a few that to you describe the colors of the ocean. If you had a picture of this scene, how would you draw the boundaries of those regions with a pen? Ah, then you would begin to see the inherent difficulty of the task. The pen requires that the delineation between regions be thin and precise. But the “real world” does not fit well into such a mold.
The same difficulty is felt when we define the boundaries of “entities” in a system we are building. The system of onomatology required to write a program does not allow for fuzzy boundaries. Object-oriented programming and similar hierarchical paradigms make our indecision more acute by attaching still greater significance to cold, hard rules about identity and relationships.
But, going back to the illustration of the ocean, what would be even more surprising is when we talk to an Eskimo from up north, and he comes up with dozens more names for different types of blue depending on the nature of the waves themselves. For him these are not arbitrary distinctions, as we have been trained to view them. They have real meaning in his eyes. He divides the world on different lines than we do. If we were to all take a picture of the world and cut it up by objects, the number of possible cuttings would equal the number of subjects.
Another brief example. You define in a program a Teapot object. Does that include a samovar, or not? Would that be a different object? Would it be hierarchically related, if your language forces that paradigm on you? Our conceptions of naming depend greatly on how we have seen objects termed. The applicability of semantics here is patent, and Hayakawa is as pertinent as ever.
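To make the teapot question concrete, here is a minimal sketch of two ways the same samovar could be named. Everything in it (Teapot, Samovar, Urn, PoursHotWater, serve) is a hypothetical name of my own choosing, not taken from any real library. The first cut files the samovar under Teapot in a family tree; the second refuses the tree and names only the shared capability.

```python
from dataclasses import dataclass
from typing import Protocol


class Teapot:
    """Hypothetical entity: a vessel holding some amount of hot water."""

    def __init__(self, contents_ml: int) -> None:
        self.contents_ml = contents_ml

    def pour(self, amount_ml: int) -> int:
        """Pour up to amount_ml and return what was actually poured."""
        poured = min(amount_ml, self.contents_ml)
        self.contents_ml -= poured
        return poured


class Samovar(Teapot):
    """Cut 1: the samovar is declared to be a kind of teapot."""


class PoursHotWater(Protocol):
    """Cut 2: no family tree, only a named capability."""

    def pour(self, amount_ml: int) -> int: ...


@dataclass
class Urn:
    """Unrelated to Teapot, yet it satisfies PoursHotWater structurally."""

    contents_ml: int

    def pour(self, amount_ml: int) -> int:
        poured = min(amount_ml, self.contents_ml)
        self.contents_ml -= poured
        return poured


def serve(vessel: PoursHotWater, cups: int, cup_ml: int = 200) -> int:
    """Fill the given number of cups; return the total amount poured."""
    return sum(vessel.pour(cup_ml) for _ in range(cups))
```

Neither cut is the correct one. Each makes a different set of later questions easy or awkward to ask, which is the point of the example.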
And so we come to our above-mentioned realizations. Firstly, programming, by its nature, requires we be Adam in Eden, endlessly naming each and every entity that crosses our path. In the process, we are defining what entities we choose to exist in our necessarily idealized, paradisaic model of the real world. The model does not exist until we begin naming. Once we name an object, that nomenclature will affect how we view and interact with the object, and hence, the nature of the resulting program. Indeed, in object-oriented programming the names and family trees you set up initially will make or break the design, and the most effective object-oriented programmers are those with a knack for this Aristotelian naming game. It is this process of creation that can make programming such an enjoyable task, and at times leaves you with the feeling that you have been painting masterpieces of verdant landscapes for hours.
Having reached this understanding, the next step is to recognize our task as programmers: to reduce complexity. The real world is far too complicated to be even approached. Our modeled world is much simpler, by necessity. But still, to be of any value it needs to have subsumed some of the problem domain.
The argument for reducing complexity becomes still more convincing when we consider the realities of programming: maintenance and collaboration. Collaboration requires that another person can look at your code and, via osmosis, soak in your view of the model. He must see clearly the abstractions you are using to interact with this mental model, so that he can work in harmony with them.
From our definition of collaboration, it follows that maintenance is really the same thing: we collaborate with ourselves, though separated not merely by three dimensions, but by four. When we maintain code we have to soak into our minds anew the mental model that we used when the code was written. What was a natural model for the “me” I was six months ago may not be as intuitive for the “me” I am today. If we only had a DeLorean we would see our productivity increase by leaps and bounds.
And so the reality of programming is that we must learn to manage complexity in such a way that others can work easily with what we have written. ‘Others’ include team members now, and future maintainers, including ourselves.
Thus, good programming can rightly be defined as reducing complexity by means of abstractions. The need for this is obvious from our preceding arguments. The methods are many and varied. In fact, the point of this essay is to show that everything about programming is an attempt to reduce complexity by introducing abstractions. Even naming, as discussed above, is a means of attaching handles so that we can manipulate what we desire: an abstraction for the sake of reducing complexity.
With no pretense of exhaustiveness, let’s briefly consider a few examples.
One of the most basic abstractions of all in modern programming is the use of the function. It has many epithets, including method, procedure, and others which do not concern us now. A function allows us to put a measure of complexity behind a wall of abstraction so that we can manipulate larger concepts in our minds. We are freed from dealing with the nitty gritty of implementation, and are allowed to deal with piecing together these larger chunks. Thus we reduce complexity. When functions are properly written, as black boxes, without immodestly exposing their internals to consumers, the load on the programmer’s mind is commensurately reduced.
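As a small sketch of the black-box idea, consider the house-temperature example from earlier. The function names, the use of None for a sensor dropout, and the 20-degree default are all assumptions I am making for illustration. The caller of needs_heating never sees how dropouts are handled; that detail sits behind the wall.

```python
from statistics import fmean
from typing import Optional, Sequence


def average_temperature(readings_c: Sequence[Optional[float]]) -> float:
    """Return the mean reading in Celsius, skipping dropouts (None)."""
    valid = [r for r in readings_c if r is not None]
    if not valid:
        raise ValueError("no valid readings")
    return fmean(valid)


def needs_heating(readings_c: Sequence[Optional[float]],
                  target_c: float = 20.0) -> bool:
    """Work with the larger concept; the averaging details stay hidden."""
    return average_temperature(readings_c) < target_c
```

If the averaging rule changes later, say to a median, needs_heating and everything above it can stay as written.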
The most common abstraction in today’s world is the object. This takes the benefits of programming with functions and goes a step further. Now a group of functions is put behind a larger wall of abstraction. Now we have an interface we can deal with, and manipulate with seeming ease an object of deeper and greater complexity. When an object-oriented system adheres to the basic tenets of OOP, complexity is reduced.
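A sketch of that larger wall, under my own assumptions: Thermostat is a hypothetical class, and the rule of averaging the last three readings is invented purely for the example. The stored readings and the averaging are internal; the interface is just record and heating_on.

```python
class Thermostat:
    """Hypothetical object: state and functions behind one interface."""

    def __init__(self, target_c: float) -> None:
        self._target_c = target_c
        self._readings = []  # internal state, not part of the interface

    def record(self, reading_c: float) -> None:
        """Store one temperature reading in Celsius."""
        self._readings.append(reading_c)

    def heating_on(self) -> bool:
        """Decide from the last three readings (an assumed rule)."""
        if not self._readings:
            return False
        recent = self._readings[-3:]
        return sum(recent) / len(recent) < self._target_c
```

A consumer writes thermostat.record(18.5) and asks thermostat.heating_on(), and has no need to know how many readings are kept or how they are combined.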
There are, of course, many other abstractions. Some are more popular than others, and much depends on the language of implementation. In some languages templates add yet another method of abstraction. A whole other class of languages is called functional, and they have completely different methods of abstraction than procedural or object-oriented languages. Common techniques across many languages include: information hiding, modularization, encapsulation, simplicity. All of these help us deal with the problem, as pointed out by Dijkstra in 1972: No one’s brain can hold a whole program at one time.
Two things compel us to seek abstractions and hold on to them for dear life: 1) our limited mental capacity for complexity and 2) what Kant would call the discursivity of our knowledge.
That is a brief tour of my thoughts on this subject. In the last year I have worked with dozens of new languages, reading, experimenting, writing working programs, not just in the language, but into the language, as McConnell terms it. These experiences have provided much fodder for thought and contemplation. Delving into areas of more abstract mathematics, such as group theory, set theory, and lambda calculus, has made me consider more closely the foundations of programming and what can be done to ameliorate the difficulties of the current state of affairs.
My thoughts turned next to the practical. Problems that I faced were now seen in a new light: “What is an abstraction that would reduce the complexity here?” For instance, in building intranet and extranet software, a fair amount of repetition occurs. Similar paradigms for accessing a data store and for viewing, listing, searching for, adding, modifying, and reporting on data are going to lead to similar interfaces and backing code. However, the problem domain is not simple enough to allow for a simple solution. Complexity comes when upgrades are requested, when data sources of a different type must be supported, or when an interface is needed via a different medium (GUI, mobile, text, binary).
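To illustrate the kind of repetition I mean, here is a minimal sketch, not a solution. All of the names (DataStore, InMemoryStore, search, render_text) are hypothetical, and an in-memory dictionary stands in for a real database. The searching and rendering code is written once against a small interface, so a different data source or a different output medium means a new class or function rather than a rewrite.

```python
from typing import Dict, Iterable, List, Protocol

Record = Dict[str, str]


class DataStore(Protocol):
    """Assumed minimal interface that every backend would satisfy."""

    def add(self, record: Record) -> int: ...
    def all(self) -> Iterable[Record]: ...


class InMemoryStore:
    """Stand-in backend; a SQL or file backend would expose the same methods."""

    def __init__(self) -> None:
        self._rows: Dict[int, Record] = {}
        self._next_id = 1

    def add(self, record: Record) -> int:
        record_id = self._next_id
        self._rows[record_id] = dict(record)  # copy to avoid aliasing
        self._next_id += 1
        return record_id

    def all(self) -> List[Record]:
        return list(self._rows.values())


def search(store: DataStore, field: str, needle: str) -> List[Record]:
    """Case-insensitive substring search, independent of the backend."""
    needle = needle.lower()
    return [r for r in store.all() if needle in r.get(field, "").lower()]


def render_text(records: Iterable[Record]) -> str:
    """One output medium (plain text); a GUI renderer would be a sibling."""
    return "\n".join(
        ", ".join(f"{key}={value}" for key, value in r.items())
        for r in records
    )
```

This handles the easy cases only. Upgrades that cut across every backend and every medium at once are exactly where a sketch like this starts to strain.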
I did an incredible amount of research to see what solutions are currently offered. More and more is being done in terms of services these days, via SOAP or XML, or other protocols to allow for detached communication between systems written in different languages, running in different environments. I definitely view these steps as being in the right direction. But the current “solutions” are not solutions to the more abstract and gnawing questions that I am asking. They contribute more to the problem than to the solution. They muddy the waters. Often they tie you down more with special syntax, special development environments, and defined ways of thinking that do not solve the range of my domains.
New ground must be broken. What is needed is not more of the same but some lateral shifting, to paraphrase Pirsig.
In that spirit, here are some of the questions that drove me.
How can these complexities be managed? How can I avoid the error-prone repetition of code and the consequent maintenance issues when system-wide upgrades must be handled? How can I express my thoughts clearly, at a level of abstraction that my chosen programming language can understand? How can I unify the disparate areas of my programming that share much the same principles (the models), although everything else visible (the code) is completely different?
In short, how can I be more powerful at abstracting the difficult, reducing complexity in a way that has not hitherto been achieved?
I know this all seems long, but I would hate to just give the answer to my wanderings without the background. That would be a disservice to everyone who reads this. More important is understanding the path I took and why, and where it left me and why.
In my next installment I plan on presenting possible solutions I discovered and why I was not pleased with them, and the solution I ended up finding and my reasons for believing in its superiority.
Tags: abstraction • complexity • programming
Posted in programming on November 17th, 2006
