Getting The Problem Right
by Dave Berton September, 2014

There are many difficulties involved in building large systems, where “large” can be measured in terms of database size, number of transactions, or number of integrations. There are general principles which can help guide you, such as loose coupling between parts of the system or encapsulating the parts that change, but the most difficult part is often defining the problem to begin with. Getting the problem right is necessary for architecting a successful solution, and it also provides the initial facade behind which we can iterate over progressively better implementations. Starting with a decent definition of the problem almost always leads the way toward an initial design. The initial design will introduce a general API, or a general organization for services and endpoints. Once we’ve started defining these points of interaction, we are free to write the guts of the code knowing full well we could rip out and rewrite specific implementations without upsetting the whole applecart of a solution. Often the solutions to large-scale problems rely on getting the problem right, and not necessarily on a particular technical detail, optimization, or whiz-bang widget.

The C++ programming language first came onto the scene in the early 1980s, and it set about solving a number of programming problems, starting with object-oriented programming features. The very early versions of C++ had lists and arrays, but they were crippled because there was no mechanism to treat them in a uniform manner. Treating sequences of “elements” in a generic way is important, and not being able to do so was a problem to be solved. Generic types and operations were tackled later in the 80s, with C++ templates eventually providing parameterized types and generic algorithms. As a programming tool, this became indispensable and paved the way for a true standard library of types and algorithms (the STL) which programmers could rely on. This was the correct problem to solve, even though it was solved initially with clumsy macros. That specific implementation was lacking, but conceptualizing and defining the problem (a compile-time mechanism for generic programming, without runtime support) paved the way for compilers to catch up and support generic programming directly.

Early versions of the Python language contained many useful solutions for programmers. It was a high-level language, making it suitable for more than just scripting. It provided flexible types, including arrays and dictionaries, as well as a large collection of useful and extensible modules. The concept of iterating over the items of any sequence differs from the approach taken in languages such as C or Pascal, but it was a great problem to define and solve. Yet the concept of iterating was not complete: dictionaries could not be easily iterated over without creating intermediate lists of keys. Since the concept was correct, Python eventually evolved to the point where it could implement dictionary iterators, and idiomatic Python became more uniform across different containers.
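As a rough illustration of that uniformity, the sketch below contrasts the older habit of building an intermediate list of keys with iterating over the dictionary directly. It is written for Python 3.7 or later (where dictionaries preserve insertion order), and the `take` helper and the `stock` data are hypothetical names invented for this example.

```python
from itertools import islice


def take(iterable, n):
    """Return the first n items of any iterable as a list (hypothetical helper)."""
    return list(islice(iterable, n))


stock = {"apples": 3, "pears": 5, "plums": 2}  # hypothetical sample data

# Older idiom: materialize an intermediate list of keys, then loop over it.
keys_via_list = []
for key in list(stock.keys()):
    keys_via_list.append(key)

# Iterator idiom: the dictionary hands out its keys directly, no extra list.
keys_direct = [key for key in stock]

# Because every container speaks the same iteration protocol,
# one helper works unchanged on lists, tuples, dictionaries, and strings.
from_list = take([10, 20, 30], 2)
from_tuple = take((10, 20, 30), 2)
from_dict = take(stock, 2)
from_str = take("abc", 2)
```

The point is less the saved allocation than the shape of the code: once dictionaries joined the iteration protocol, helpers like `take` no longer needed to know what kind of container they were given.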

The Qt C++ library also solves a pile of different programming problems, and it does so across a large number of systems and architectures. The C++ language itself originally lacked support for a generic observer pattern, and since this is a key problem in UI programs, Qt proceeded to implement a solution for it. One could argue that Qt’s meta-object compiler (moc) implements the wrong solution in terms of speed and true type safety, since type problems are only detected at runtime, while being a potentially correct one in terms of the additional features it enables, such as object introspection. Wrong solution (perhaps), but it was the correct problem to solve. In the meantime, there are now proposals to support object reflection (compile time and runtime) within the C++ standards effort itself, further validating Qt’s attempt to get the problem right.

Back to Python. The backwards-incompatible version 3 purported to solve the problem of dealing with text by treating everything internally as Unicode. Defining the problem in this way leads to all sorts of implementation decisions in the solution, many of which can be seen to hamper the efforts of those dealing with byte strings (such as those used by almost all Unix tools, and by internet-based applications such as web browsers and crawlers).
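A small sketch of where that friction shows up in Python 3, assuming UTF-8 as the encoding in play. The sample strings and the `legacy` file name are hypothetical, chosen only to show that text and bytes are separate types and that byte data which is not valid text needs explicit handling.

```python
# Bytes as they might arrive from a file, a pipe, or a socket.
raw = "na\u00efve caf\u00e9".encode("utf-8")

# Text and bytes are distinct types: decoding is an explicit step,
# and the two lengths differ because some characters need two bytes.
text = raw.decode("utf-8")
char_count = len(text)
byte_count = len(raw)


def try_concat(a, b):
    """Return a + b, or None if the two types cannot be combined."""
    try:
        return a + b
    except TypeError:
        return None


# Mixing the two types is an error rather than an implicit conversion.
mixed = try_concat(raw, text)

# Unix tools routinely produce bytes that are not valid UTF-8,
# for example an old file name (hypothetical).
legacy = b"report-\xff.txt"
try:
    legacy.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# One escape hatch: the "surrogateescape" error handler lets such bytes
# pass through a str and come back out unchanged.
round_trip = legacy.decode("utf-8", "surrogateescape").encode(
    "utf-8", "surrogateescape"
)
```

Whether this strictness is a help or a hindrance depends on the workload: code that is mostly text benefits from the clear boundary, while code that mostly shuffles bytes pays for it at every boundary crossing.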

We can iterate on implementations; however, if our problem definition is off in the weeds somewhere, the slickest implementation won’t actually solve anything.