We've got big balls ... of mud ...

The most popular and enduring architecture for large projects is the one called "Big Ball Of Mud."

It's what happens when requirements grow in unexpected directions that cut directly across what were originally conceived as architectural boundaries, and are met with ad-hoc code connecting parts of the program that the original architecture would never have allowed to touch each other. Then the whole thing is maintained for ten years, in the face of steadily expanding and frequently changing use cases and problems, by a host of people, many of whom don't really know what's going on in the rest of the program. Then it gets bought by a competitor who starts reimplementing the whole thing in a different language and with a "proper architecture," but runs out of budget and only manages to reimplement about a third of it. So they link the stuff in the new language together with the stuff in the original language (and libraries in a third, not that anyone's looking at libraries at this point) and hack together the logical mismatches between "old" and "new" code by running Perl scripts on the database. ...

And it just goes on like that - year after year after year.

Now imagine it's 2037 or whatever, and you have a system where this process has taken the architecture to its logical conclusion. It's not just a big ball of mud any more; it's a planet of mud. There is no solid ground anywhere in any direction. Nothing resembling a rational architecture (except Big Ball Of Mud architecture) has emerged.

What can a programmer who is responsible for some small part of this giant mudball - say, one file of code - do? I mean this in the 'think globally, act locally' sense: obviously one programmer on a whole planet of mud is in no position to impose a new architecture on the whole, and doesn't even have the information needed to design one.

What modest steps can one programmer take to smooth out, and perhaps dry a little, their particular little bit of mud? Ideally, they should be low-effort steps that don't require much reaching into all the other bits of mud. And ideally, if individual programmers all over the mud planet took those steps in the course of maintaining their own little bits of mud, the eventual result should be the simplification of the whole, insofar as that's possible, and the emergence of some kind of rational architecture.

It might be an architecture that none of the programmers would have designed individually. It might be an architectural paradigm that humans haven't even discovered yet. It might not be particularly elegant. But after a lot of iterations by people working all over the program, what small, feasible, local actions in a big ball of mud would produce a program with more tractable organization?

Are there any metrics that a monitoring system could auto-generate to give the programmer a "score" for whether their changes are improving or degrading the emergence of some kind of code organization?
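
To make that concrete, here is a minimal sketch of the kind of metric I have in mind, assuming a Python codebase; the "mud score," the helper names, and the scoring rule are all invented for illustration rather than taken from any existing tool. It scores one file by its coupling: fan-out (distinct modules it imports) plus fan-in (other files in the tree that import it).

    # Hypothetical "mud score" for one module: fan-out plus fan-in.
    # Falling over time suggests the local patch of mud is drying out;
    # rising suggests new ad-hoc couplings are being added.
    import ast
    import sys
    from pathlib import Path

    def imports_of(path: Path) -> set[str]:
        """Top-level module names imported by a Python source file."""
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            return set()  # in a real mudball, some files won't even parse
        found = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                found.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                found.add(node.module.split(".")[0])
        return found

    def mud_score(target: Path, tree_root: Path) -> int:
        fan_out = len(imports_of(target))
        # Crude: match other files' imports against this file's stem.
        fan_in = sum(
            1
            for other in tree_root.rglob("*.py")
            if other != target and target.stem in imports_of(other)
        )
        return fan_in + fan_out

    if __name__ == "__main__":
        target, root = Path(sys.argv[1]), Path(sys.argv[2])
        print(f"{target}: mud score {mud_score(target, root)}")

Recording that number for each file on every commit would give its maintainer a trend line: edits that cut cross-module couplings push the score down, new ad-hoc connections push it up.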

I see this as kind of like 'peephole optimization' in compilers.

There's a stage of optimization in a compiler known as 'peephole optimization', where the compiler crawls over the code looking at instruction sequences maybe a dozen long, trying to match small-scale patterns and reorganize or improve the efficiency of whatever operation is done by that short sequence. It does things like delete a push immediately followed by a pop, replace a multiplication by a power of two with a shift, drop redundant loads and stores, and so on. And then the 'peephole' shifts forward or backward by one instruction, and continues.
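
To illustrate the mechanism (a toy, not any real compiler's implementation), here is a peephole pass over a list of pseudo-instructions; the instruction set and both patterns are invented for the example.

    # Toy peephole pass: slide a small window over pseudo-instructions,
    # apply the first matching rewrite, and back up one slot so a new
    # match enabled by the rewrite can be found.
    PEEPHOLE = 3  # window size; real compilers look at a dozen or so

    def rewrite(window):
        """Return a replacement for the window, or None if nothing matches."""
        # push x immediately followed by pop cancels out.
        if len(window) >= 2 and window[0][0] == "push" and window[1][0] == "pop":
            return list(window[2:])
        # add r, 0 is a no-op.
        if window[0][0] == "add" and len(window[0]) == 3 and window[0][2] == 0:
            return list(window[1:])
        return None

    def peephole_pass(code):
        i = 0
        while i < len(code):
            window = code[i : i + PEEPHOLE]
            replacement = rewrite(window)
            if replacement is not None:
                code[i : i + PEEPHOLE] = replacement
                i = max(i - 1, 0)  # the peephole shifts backward and retries
            else:
                i += 1             # the peephole shifts forward
        return code

    code = [("push", "r1"), ("pop",), ("add", "r2", 0), ("mul", "r2", "r3")]
    print(peephole_pass(code))  # -> [("mul", "r2", "r3")]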

But you could use a peephole optimizer to do global optimization, slowly. Many of the patterns matched could result in rewrites that don't optimize anything in an immediate sense but instead improve the chances that some later pattern match at some other peephole offset will fire. So an addition that might be hoistable out of a loop - though you can't tell, because the start of the loop is outside the peephole - can be migrated to an earlier instruction, and the offset then shifted one instruction back to see whether the addition and the start of the loop now fit in the same peephole. And slowly, one small local pattern match at a time, optimizations that completely reorganize large chunks of machine code get done.
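
In the same toy notation, here is a sketch of one such "enabling" rewrite; it improves nothing by itself, but bubbles an add one slot earlier so that a later window can see it next to the loop header. The independence test is deliberately crude and invented for the example.

    def enabling_rewrite(window):
        """Swap two adjacent, independent instructions so a hoistable
        add drifts toward the start of the loop. Not an optimization
        in itself; it only sets up a later pattern match."""
        if len(window) >= 2:
            a, b = window[0], window[1]
            # Crude independence test: b is an add, a is not, and the
            # two instructions share no registers at all.
            if b[0] == "add" and a[0] != "add" and not set(b[1:]) & set(a[1:]):
                return [b, a] + list(window[2:])
        return None

Run inside the same peephole_pass loop as above, the add bubbles upward one swap at a time until it crosses a loop-start marker - at which point, ignoring for the toy's sake the question of whether the hoist is actually safe, a loop-invariant add has been lifted out of the loop without any pass ever seeing the whole loop at once.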

I'm looking for that kind of peephole optimization for source code: small rewrites programmers can just routinely do which, taken together after enough iterations, would eventually result in large-scale reorganization of the source code.

Who owns Planet Mud?

Arguably, the internet is already our planet-sized ball of mud: broken links, deep dependencies, an ad-hoc mess of configurations, bugs, security vulnerabilities, services that break down due to problems in protocols you've never even heard of (cough BGP cough), etc. The problem with a planet-sized ball of mud is that nobody can take responsibility for improving it. It would be unwise to let someone you do not completely trust trawl your servers with their optimizers.

I hope the future moves more in the direction of unhosted web applications: we share code around as needed, but own our data. Mobile agents would cover the other side, putting some code onto a remote server to serve as an ambassador of sorts. These days, we arguably don't own most of our data - it's controlled by Facebook, Google, Twitter, Amazon, ad servers, etc., and they have plenty of motivation to maintain the status quo. But with an unhosted architecture, it is much more feasible for companies, communities, and individuals to truly take responsibility for not just their data, but also their software systems and user experiences.

In any case, since we already do have a planet-sized ball of mud, we have also developed many ad-hoc solutions for working with it: indexing and mapping, archival and backups, publishing and subscribing, tutorials, advertising, etc. There is no fundamental reason most of these solutions wouldn't work for software distributions at sufficient scale. Haskell has Hoogle to find functions of a given type, for example. We could make it easy to find all failing tests within a distribution. A new library could come with an interactive tutorial.
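
"Find all failing tests within a distribution" is nearly within reach of today's tooling already; here is a minimal sketch using Python's standard unittest discovery, with a hypothetical distribution path.

    # Sketch: discover and run every unittest-style test under a
    # distribution root, then report just the failing test ids.
    import unittest

    def failing_tests(dist_root: str) -> list[str]:
        suite = unittest.TestLoader().discover(dist_root)
        result = unittest.TestResult()
        suite.run(result)
        return [test.id() for test, _trace in result.failures + result.errors]

    if __name__ == "__main__":
        for test_id in failing_tests("./some_distribution"):  # hypothetical path
            print(test_id)

Scaling that across an entire package ecosystem is then an indexing problem of the same shape as the ones the web has already solved.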