Abstractions considered harmful?

Posit that OO was a reaction to GUI development where control flow and state in procedural code had gotten out of hand, so it was segregated for manageable design. Then, doesn't abstraction fail us when it comes time to debug? I'll want to see the control flow in a linear fashion, not watch the program counter go flying through umpteen different files while I mentally try to piece everything back together again. If abstraction has thusly harmed us, are there other abstractions that throw the debugging baby out with the design bath water?

Note: Ah, sorry to see this is perhaps belabouring the point (although I hope it is a slightly different facet of abstractions).

News: The wiki has started. Please contribute.

Linearity?

Are you sure you really want linearity? That implies to me some serious inefficiencies, like massive inlining and the elimination of functions. I don't think a linear program counter during debugging is worth giving up the benefits of encapsulation and modularization. That's what stack traces and watches and breakpoints are for.

I think the old days of spaghetti code made debugging far worse, because interactions and coupling between different parts of code tended to be far greater. With modern languages, it's much easier to isolate the source of bugs to specific types and modules and functions. Debugging will never be *pleasant*, but I definitely think it has gotten better rather than worse.

Re: inefficiencies

The underlying code isn't required to change. What if there were a means of having the debugger show an unravelled view of the code? See, here we'd be introducing an abstraction (you aren't following the real PC) so that debugging gets easier - for certain types of debugging; each mode of debugging would want to have abstractions that suited it.

So this would be a model of software development and maintenance where we'd want to figure out how to have our abstractions, and eat them too. Perhaps there would be things the development side would have to do to make the debugging side more tenable?

(I'd like everybody to put on their thinking caps, free their minds of how "debugging will never be pleasant", and try to imagine totally crazy, dope-smoking things that would make it slightly less horribly evil and - dare we dream - some day pleasant. Otherwise, how are we ever going to improve things? We'll only be improving the development side, and the debugging side will mostly benefit by dumb luck.)

You don't want to see the PC

You don't want to see the PC tracking linearly. For example, you don't want to see other threads on the machine. You want to abstract those away. You don't want to see the kernel implementation of system calls. You don't want to see what's inside the GUI library. Those, too, should be abstracted away.

In fact, if you're debugging, you want to be able to abstract away all of the trustworthy code. You know there aren't bugs there. You want to be able to tell your debugging tools that you trust that code to obey certain contracts or types, and be done with it. Now you can focus on the untrustworthy, buggy code.

Re: You can't handle the truth!

Believe me, I grok. I am not trying to claim that all abstractions are bad; I am only claiming that not all abstractions help in all situations, and that it appears to me that the software development culture doesn't think enough about how abstractions could help or hinder maintenance.

Re: Linearity & the PC

Even if things are better than the Pasta Days, how much further might we go in improving debugging: letting us view the code in a style that aids us in finding the bugs? I think of the complaints people have that to understand a method in a class, you have to go hunting through all the related (at the very least parent/child) classes to really see what the implementation is finally doing. (An old co-worker suggested that ideally people would write code from the point of view of the debugger. You'd be less likely to get into huge architectures because you'd realize how byzantine they'd look later.)

By 'linearity' I meant (probably misusing terms) that when I have no idea where the bug is, I tend to want to step through my code as one long listing rather than jumping to-and-fro. That helps me understand what I actually wrote (vs. what I intended to write). Abstraction hides things, and when the bug is in the hidden thing, that's trouble.

I agree that in general I don't dip into libraries. It would be nice to be able to select what modules get 'woven' into the 'linear' view of things.

What other abstractions do we use that can make maintenance and debugging hard? We spend all this effort on architecture etc., but we should realize that at least 50% of what we'll do with software is debug and change it. Some of our abstractions don't help us there.

(Another take on it all: what are the ways in which abstractions are mis-applied, because developers don't think ahead or have enough experience with debugging? Shouldn't we codify & teach what not to do? Are there good anti-patterns that are all about code maintenance? Depth vs. breadth in object/type/whatever hierarchies, etc.)

E.g.: Concurrency - how do we abstract & what can we do to make debugging more sensible to a single-thread-minded human? Could there be more explicit synchronization points in shipping code so we can narrow the debugging search window? If people shudder to think of hobbling the system so, what would an economist say about the costs incurred for developing vs. debugging (assuming they weren't the kind of economist that loved to leave costs hidden)?

I think functional approaches convey benefits more clearly than OO for both the design and debugging sides of the coin. That is / could be a good selling point?

Why would you...

...posit such a thought about OO, given how very inaccurate it is?

Meanwhile, writing as a working OO programmer, on the rare occasions that I start a debugger I find abstraction working in my favour. For instance, the stack of abstractions that means I'm looking at my source code, and not opcodes for the assembler generated when the C source of the VM for the language I'm working in was compiled, along with a hex dump of the runtime data structure that represents my program and its data.

Or there again the stack of abstractions that mean that I can "skip" a method call when stepping through some code, rather than having to chug through some linear trace of everything that happens all the way down to the native code (if not further?) when my program executes

puts "Hello World"

But then, stepping through long traces of method invocations is not the best use of a debugger in OO programming. Skillful use of a debugger is to pop in a breakpoint at an interesting place, then look at the data structures to see if they are what you expect. If the flow of control seems to be wrong, then interpolating finer-grained unit tests to localise the problem is what the (smart) working programmer does these days.

It's true that it's very hard to read OO source code and figure out what the trace of a big span of it will be at runtime. So hard, in fact, that no one with an ounce of sense does that. In the exemplar OO languages (Smalltalk, Self) reading large spans of source isn't something that the environment offers any affordances to do, and that's for a reason. Don't try to figure out what the late binding will end up calling: that's what we have computers for. Run the code and get it to tell you what it does!

On the other hand, writing as an FP neophyte, it seems to me that (depending on the purity of the language used) there's very little about what's going to happen when you evaluate a form that you can't figure out statically from the program text, yes? In which case FP would be superior to OO in industry as regards debugging iff figuring out exact program behaviour from source listings were a big part of many programmers' jobs these days (I can remember when it often was). It isn't. Nice try, though.

Keith

Re: Bad posit, bad!

Feel free to elide the claims about reasons for invention. However, keep the bit about what OOP addresses vs. the old classic procedural approaches as seen in some practice.

Re: the (smart) working programmer

Please to be keeping in mind that one's personal experience is not sufficient to accurately claim to know about software development in general! Within the entire enchilada of software development, many (smart) people work in systems where one cannot have a unit test that will find a particularly evil bug without extraordinary expense or time invested, e.g. certain embedded systems.

Additionally, please to be keeping in mind that the term 'bug' has a wide range of implications! E.g.: I would be interested in knowing what you think about code reviews, where people do, in fact, end up "reading large spans of source."

Abstraction is good for debugging

One major feature of abstraction is that it lets you debug things separately. Once you know that a piece of code is correct, you don't want to have to think about how it works in the process of debugging new code that uses it. You want to hide the details and think of it as something conceptually atomic. This applies about equally in functional and OO languages. You don't want to debug all the classes in your OO program at the same time -- you want to debug the one that you just wrote. (and switch to debugging another one only if you discover a serious problem in its behaviour)

Functional languages do tend to have a lot less mutable state floating around though, and tend to preserve referential transparency, so it tends to be easier to verify that each thing actually does what it's supposed to do in isolation, since you don't have to reconstruct some complicated program state. As such, the typical functions of debuggers like breakpoints and stack traces become almost unnecessary. I get along just fine with nothing more than a REPL in Haskell. (Which is good because if I needed a stack trace, meaningful results would be hard to obtain, and breakpoints don't really mean all that much, at least in the context of pure functions)
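
As a tiny, made-up sketch of what that REPL-only workflow looks like (the function and its planted bug are invented for illustration): because the function is pure, you can probe it directly at the GHCi prompt, with no program state to reconstruct first.

  -- A pure function with a suspected bug: it should keep only elements
  -- strictly greater than the pivot, but accidentally uses (>=).
  keepGreater :: Ord a => a -> [a] -> [a]
  keepGreater pivot = filter (>= pivot)   -- bug: should be (> pivot)

  -- At the REPL:
  --   ghci> keepGreater 3 [1..5]
  --   [3,4,5]   -- 3 shouldn't be here; the bug is visible immediately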

This can certainly be true, at least in C++

I've been in situations (writing large games for small consoles) where transparency is sometimes a huge win vs. layers of abstraction. It's for the reason you mention: something that looks simple in the source can involve tracing through deep call chains.

I don't see abstraction as the culprit, but there's a point where building your own abstractions, layer upon layer, can eventually result in a large gap between how you're thinking about your code and the level the tools work at.

Re: reality

(Interesting to hear from somebody who appears to have software development experience in more-than-simple projects!)

So if you do end up with abstractions, layer upon layer, that made sense while they were being built but nevertheless are somehow hindering finding that evil show-stopper bug, what do you do? What could your development tools do for you to help? What could your up-front design have done to help?

A classic to me is concurrency - folks design with shared state, and then they end up in the hell of deadlocks etc. So they have to write deadlock detection tools which may or may not find the problem (or simply force the machine to keep going, even in an 'invalid' state, for the sake of 'progress'). I'd hazard to guess that more-than-simple game code falls into this category. What if a different design 'abstraction' had been used (e.g.: actors)?

Example tool: "Every IDE should be able to automatically make control flow / sequence diagrams." That isn't inventing something new; it's just getting tools people already know about into everybody's hands. I'm sure there are new-fangled things one could also invent that might help debug under certain circumstances.

Choosing abstractions

IMHO, part of the problem stems from building systems using abstractions that are non-compositional in nature. A good example is the shared-state concurrency model you mentioned - diagnosing the problem requires understanding what all of the involved threads are doing, and how they act on the state variable in question. The state (and thus behavior) of each thread is irrevocably bound to the behavior of other threads.

On the other hand, a CSP-based (or actors-based) model is compositional in nature - the state (and thus behavior) of each process (actor) can be understood in isolation. The interactions between any two pairs of processes are readily understood and analyzed. That pair of processes can then be considered as a single "process", and the interaction of that process with other processes can be considered. Global system state is a pure composition of the state of each component. There are well-known communications patterns (Welch's IO-PAR and IO-SEQ) for composing systems of processes in a deadlock-free fashion, and established theoretical tools for diagnosing deadlock via local analysis of subsets of a process network. In this case, the correct choice of abstractions results in (a) the ability to design in a way that prevents bugs in the first place, and (b) the generation of designs that are far more amenable to debugging should a bug occur.
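
To make the contrast concrete, here is a toy sketch (using plain Haskell channels as a stand-in for a real CSP or actors library; the process and channel names are invented): each process owns its behaviour outright and touches the rest of the system only through its channels, so a pair of them composes into something you can again treat as a single process.

  import Control.Concurrent (forkIO)
  import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
  import Control.Monad (forever)

  -- A "doubler" process: its behaviour can be understood in isolation,
  -- because its only interaction with the world is via its two channels.
  doubler :: Chan Int -> Chan Int -> IO ()
  doubler inCh outCh = forever $ do
    x <- readChan inCh
    writeChan outCh (2 * x)

  main :: IO ()
  main = do
    a <- newChan
    b <- newChan
    c <- newChan
    _ <- forkIO (doubler a b)   -- compose two doublers into a pipeline;
    _ <- forkIO (doubler b c)   -- the pair is itself a "quadrupler" process
    writeChan a 5
    readChan c >>= print        -- prints 20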

I believe that a good abstraction should allow you to ignore internal details and focus on interfaces - otherwise "divide-and-conquer" debugging won't work, because you need to look at the internals of every single component, as well as its externally observable behavior. Threads clearly do not meet this criterion, which is why I think they're such a poor and bug-prone abstraction. By the same token, a well designed object-based abstraction should be an aid to debugging, by isolating different elements of the system in a way that permits a divide-and-conquer approach to assuring system correctness.

Re: delicious abstractions

Yes, that sounds eminently plausible, I buy it!

Can we think of other abstractions that get in the way, to differing degrees - e.g. OOD? Can we boil out the reasons why they get in the way? Is it just the (non-)composability aspect? Can something be composable but still give us headaches when it comes time to debug? What about the sheer size of things? How about being able to rename things in the debugger so you can start to understand Other People's Code that you have to deal with? How about...?

I realize this might sound like something on /., to wit:

  1. Realize debugging can be hindered by abstractions of various ilk.
  2. [???]
  3. Profit!

But if we don't try to invent that middle part then surely we suck.

Debugging relies on reasoning

Well, I think that debugging in general boils down to reasoning about your code. Look at even simple "print statement debugging": you insert some code somewhere to understand an intermediate stage of your computation, which permits you to reason about how that intermediate stage resulted in an incorrect value, or how the correct intermediate value was transformed into an incorrect final result - repeat recursively until the bug is pinpointed. I would imagine that any abstraction which aids in reasoning will therefore help with debugging, and any abstraction that makes it harder to reason will make debugging more difficult. Coming from that perspective, it seems likely that the formal methods and FP folks probably have some useful insights into what abstractions are an aid to debugging, because they're much more focused on reasoning to begin with.
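
A throwaway illustration of that loop (the pipeline and the planted bug are invented; Debug.Trace stands in for the print statement):

  import Debug.Trace (trace)

  step1 :: Int -> Int
  step1 x = x * 2 + 1   -- suppose this "+ 1" is the bug

  step2 :: Int -> Int
  step2 x = x - 3

  -- Expose the intermediate stage, then reason: if 'mid' is already wrong,
  -- recurse into step1; if 'mid' is right, the bug must be in step2.
  result :: Int -> Int
  result x =
    let mid = step1 x
    in step2 (trace ("intermediate value: " ++ show mid) mid)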

Based on that observation, some obvious (but by no means comprehensive) guidelines include:

  • Minimize state
  • Where state exists, encapsulate it in composable chunks
  • Use generalized, standard combinators with well-understood properties to perform composition

Getting back to your question, I'd imagine that any abstraction that doesn't follow these guidelines is probably going to be harder to reason about, and therefore harder to debug. Threads are a good example, as we've already discussed. The use of goto for control flow instead of standard constructs like for- and while-loops is another. I suspect that one of the reasons that OO can lead to problematic designs is the lack of standard combinators. As of right now, you pretty much have to look at the guts of an object to understand how it's going to interact with other objects, e.g. which methods does it use or ignore? what is the direction of information flow? where does control flow reside? If I compose two objects into a system, what does the resulting composite do? I think the advent of "patterns" is probably helping some, but there is still a lack of real formality (and thus reasoning power).
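
By way of contrast, here's a small, contrived sketch of what standard combinators with well-understood properties buy you: the behaviour of the composite follows from known laws, without opening up either stage.

  import Data.Char (toUpper)

  -- Two made-up processing stages, composed with the standard (.) combinator.
  sanitize :: String -> String
  sanitize = filter (/= '\r')

  shout :: String -> String
  shout = map toUpper

  pipeline :: [String] -> [String]
  pipeline = map shout . map sanitize
  -- By the map fusion law, pipeline = map (shout . sanitize); no need to
  -- inspect either stage's internals to know what the composite does.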

Re: "Once the people begin to reason, all is lost."

Excellent points, thank you!

Your comments reminded me of my old boss's work: check out Natural Programming, which explicitly worries about debugging! I had read about HANDS before, but marmalade (Lord, give me strength: would people please just capitalize things normally?) is new to me. The agreed-upon point is the statement that "debugging is determining the difference between how you think the system should act and how it really does." Sort of like the "You keep using that word. I do not think it means what you think it means" idea, where the 'word' is the thought we have in our head of what we wanted the program to do, vs. what we actually wrote in our source code.

Regarding the fact that objects are composed in horrible, horrible ways: yes, that makes sense to me. God knows, my own code is like that all too often. There was a DDJ article where the author said something like "make all connections like the relational DB model: there are no connections in the objects themselves; rather, they live in extra objects that join them all up. That way it is explicit and more easily changed." It seemed weird and I never took him up on the challenge, but perhaps I should?

My program didn't do what I expected? Inconceivable!

Interesting. Whyline (part of marmalade) sounds like a neat project. Of course, being able to express a question like "why didn't x happen?" means that you need to be able to precisely describe what it is you expected to happen. There might even be some value in doing that before you write any code. At which point we come full circle to formal specification methods :-)

I'll have to see if I can dig up that DDJ article. It sounds intriguing. I have no idea whether or not the approach the author recommends would actually help with debugging, but it seems like it might be worthwhile trying.

While I haven't actually read it, it seems plausible that Categories for Software Engineering (recently referred to on LtU here) might provide some useful insights on how one might reason about large-scale abstractions (or construct abstractions that are more amenable to reasoning).

++(Formal Methods)

That strikes me as all too valid a point. I have used Z ("zed") before; it was actually kind of nice. However, I don't have enough personal experience to say whether formal methods get in the way of "just getting stuff done" under the pressure folks can feel at work. At the moment, my gut feeling is that I would seriously enjoy working on a commercial-quality project that took a more formal approach.

On the other flipper, when I think of working on a personal project, I want to work on a hacked up implementation first to get the idea fleshed out. I'd then like to not ship that, and write a new version that used more design by contract and other formalizations. The reality seems to be that people only rarely throw away the prototype. (Maybe it should always be written in a language that makes one feel dirty, so one will have to re-write it?) OK, actually, I want the prototype to be done in a language system that is so amazing that the prototype is robust and yet not a big hassle. I think of type inference in Haskell/*ML like that. Could there be progressive refinement of formal methods in a system? Could there be automated inference of contracts in a system based on the code you write - as a means of highlighting that it is perhaps doing something you didn't intend? Could there be some sub-class of auto-generated unit tests? I dunno, I'm just throwing out ideas.

Even if using formal methods is a good answer, are there things we could do in the debugging phase to help all situations, including if the code we wrote kinda sucked?

Maybe what I'd like to see is a book or class or site (I'll go fire one up on wikispaces) where software development is explored from a perspective heavily weighted towards debugging. Can we, for example, throw out all languages that have no debugger? Can we then order things according to how good their debugger is? That seems like the most basic thing, and yet I find so many systems (Java, Clean, Haskell?, etc.) that are apparently lacking in powerful debugging tools.

[edit: added wikispaces link.]

Praxis

On the other flipper, when I think of working on a personal project, I want to work on a hacked up implementation first to get the idea fleshed out. I'd then like to not ship that, and write a new version that used more design by contract and other formalizations. The reality seems to be that people only rarely throw away the prototype.

Praxis High-Integrity Systems, probably one of the world's foremost users of formal methods (principally Z and CSP) in software development, actually advocates prototyping prior to formalization. Creating a formal spec only makes sense if you know what to specify. "Hacked up implementations" can help with the process of determining what to specify.

As far as 'automated inference of contracts' goes, isn't that pretty much what type inference in a language with an extremely expressive type system does? It's not apparent to me what you could do beyond that. Auto-generated tests certainly seems feasible though.

Re: Praxis

Yeah, it was both nice and sad to see them in the last IEEE Spectrum. Nice because people should know, sad because this is news? Our industry sucks.

For all that some people like

For all that some people like Z, it's not necessarily where formal specification is at these days. You may not have been immediately drawn to Z, but you might have more enjoyment from using a different formalism.

As to stepwise refinement, that tends to be a big part of the deal for a formal specification methodology.

QuickCheck!

The second QuickCheck paper shows how to write specifications with QuickCheck.

In my opinion, QuickCheck is one of the best debugging tools available because you write executable properties that your code should adhere to, and then you generate test cases to see if you can find something that breaks your specification.

Even simple properties and simple generators can quickly find boundary problems, off-by-one errors and more. I've even written a Test Driven Development version of QuickCheck.

QuickCheck effectively generates an infinite number of unit tests to see if they all match your spec. How can it get any better? Try QuickCheck today!
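
A flavour of what such a property looks like (this snippet is mine, not from the paper; the buggy function is contrived):

  import Test.QuickCheck

  -- A deliberately buggy "take at most n elements", with an off-by-one error.
  takeUpTo :: Int -> [a] -> [a]
  takeUpTo n = take (n + 1)   -- bug: should be 'take n'

  -- Executable specification: for n >= 0 the result is never longer than n.
  prop_lengthBound :: NonNegative Int -> [Int] -> Bool
  prop_lengthBound (NonNegative n) xs = length (takeUpTo n xs) <= n

  -- ghci> quickCheck prop_lengthBound
  -- QuickCheck finds and prints a small counterexample almost immediately.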


--Shae Erisson - ScannedInAvian.com

Re: off by one

It is terrific to have tools which get us stronger testing and understanding of when our code is/n't behaving.

Having said that, I feel compelled to point out that it is only half the battle - once we know there is a problem, we have to figure out what the root cause is. That's where I wonder if our abstractions can sorta run interference...

Presumably, the more we apply formal methods and testing, the fewer bugs sneak through, and the better our understanding of the system as it is being developed. Thus, we can hope it will mean the bugs that don't get squashed will be relatively easy?

Intentional Programming

Another set of tools and abstractions that I think show a lot of promise come from the world of intentional programming.

Whatever tools are out there are probably going to take real effort to implement well. That probably means one of: a damned smart person (please, working in some language I personally want to use!), or a company doing the same, or an open source project doing the same.

Unfortunately, I think it will be a long time before anybody really gets these out. That is very sad to me.

So I said to myself, "Self..."

Would that there were projects afoot to make underlying systems for debugging. Or maybe I should stress the underlying part, instead. To what degree could there be abstractions that let you debug Haskell and Forth and Java, etc.? Some kind of plug-in AST kombobluator?

"Mine is bigger/smaller" than

"Mine is bigger/smaller" than yours disucssions get nowhere, but I might mention that I have personally used test-driven development on both huge enterprise-style server based systems and embedded firmware.

I can imagine embedded systems which are too small for a developer to be able to run a test framework on the device (not that this is required for TDD). I also imagine that in such a constrained environment there wouldn't be much use of OO techniques and the codebase might well be small enough for code reading techniques to pay dividends. But these would be very, very small devices and a very small segment of the programming world, too.

What I think about code reviews is that they are excellent things. That's why I much prefer to write code in a pair, and to practice collective code ownership, to get as many pairs of eyeballs on the code as possible.

So far as code reviews as an off-line technique go, as I was taught the Fagan process, the reviewers only examine small stretches of code at a time, because at the level of attention required that's all that can be done effectively before fatigue sets in and defects are missed. Fagan inspection works (it is pretty much the only development technique that has hard numbers in support). But it is also horrifically expensive and comes from a development tradition that's obsolete.

There are very much cheaper ways to get almost the same benefit, I'd suggest.

Enlarge your choice

Software development spans a gamut, and I like to think about that gamut, having experienced a range which includes fun interactive Lisp debugging through to really really crappy raw memory dump debugging in horrible environments (maybe that's kind of a redundant description). It is not possible to fix all the factors which contribute to making a situation bad, especially when those factors are social (e.g.: company habit, methods, structure, methodologies). Software tools are often easier to apply, so let us think about those as well (and probably foremost).

Regarding missing defects in a code review, I believe finding any and missing some is better than finding none. Code reviews can also help to find bugs that are on the order of large misunderstandings between developers; we should try to think about the full range of 'bugs' that are out there.

What "cheaper ways" have you employed? Which ones worked well? Which ones sucketh verily? Can you think of circumstances where they should not be applied? Can you think of situations where people don't usually apply them, and it could be a terrific help to do so?

huh?

Who is it that you think is claiming that all abstractions help in all situations?

Replies

Keith, it would be helpful if you would use the "reply to this comment" link when replying to a comment, rather than the overall forum topic. The reply form at the end of the thread will produce a top-level reply comment, which destroys the threading (and can cause some confusion when the reply is not near the post which is being replied).

Yeah, I thought I had. And so

Yeah, I know.

I thought I had. And so am puzzled that the comment appeared where it did.

In fact, I'm not sure I understand the LtU interface at all well. I find it quite difficult to see what's going on in a discussion.

Bulls Eye

I was trying to make the same point in my question, but the pithiness of the heading took the discussion in other directions. However, I am glad that you asked the same questions.

But can you help summarise this discussion? Can we say:

Abstractions add fuzziness when you have to know what is happening, but aid in better communication.

Re: Bulls Eye

I don't mean to be flippant, but I'm not sure there's a summary everybody would agree upon :-).

Personally, I wanted to heighten the awareness that any time you abstract along a given dimension of the problem, you run the risk of making other perspectives on it hard to deal with. And, pretty much, every problem has multiple facets - humans aren't very good at dealing with the entire gestalt, hence the need for abstractions at all. If everybody stops and thinks "gee, is having a deep object hierarchy going to make debugging this code hell for somebody else, or even my future self?" then that's good :-) Likewise, "does Haskell really need yet another operator that I invent that uses a strange series of symbols, and that looks sort of like another symbol already in use but differs only by a single '>' symbol?"

I also wanted to hear from other folks how they saw abstractions working; given abstraction pattern Y, what are the ramifications for all the things we have to worry about in life: debugging, porting, design, scalability, verifiability, implementation, how much time is left in the schedule before we just have to ship the bloody thing? (There's the wiki for when people do want to record such thoughts.)

Would classifying this kind of stuff get me an honorary degree of something-or-other? Ha ha.