Lambda the Ultimate

Why Learning Assembly Language is Still a Good Idea
started 5/9/2004; 9:57:17 PM - last post 5/11/2004; 11:06:06 AM

Chris Rathman - Why Learning Assembly Language is Still a Good Idea

5/9/2004; 9:57:17 PM (reads: 8813, responses: 15)

Why Learning Assembly Language is Still a Good Idea

Randall Hyde, author of Art of Assembly and HLA (High Level Assembly), discusses why Assembly language is still relevant. An interesting (and, I think, polar) opposite of James's recent posting on where programmers are frittering away Moore's law.

Often, you'll hear old-time programmers make the comment that truly efficient software is written in assembly language. However, the reason such software is efficient isn't because the implementation language imparts some magical efficiency properties to that software -- it's perfectly possible to write inefficient software in assembly language. No, the real reason assembly language programs tend to be more efficient than programs written in other languages is because assembly language forces the programmer to consider how the underlying hardware operates with each machine instruction they write. And this is the key to learning how to write efficient code -- keeping one's eye on the low-level capabilities of the machine.

On a related front, Cameron Laird and Kathryn Soraiz also discuss HLA in Programming Down to the Silicon and Rapid Development of An Assembler Using Python. (And while I'm dumping my links, I'll also mention the ASM compiler for .Net).

I've tinkered with HLA's OO, and I must say it's strange dealing with classes/methods at the same time you push and pop registers. The main thing, though, that I'd mention is that assembly language is not the only intermediate target machine. For example, Java programmers should understand the intermediate code produced by the compiler (IL for .Net compilers or whatever your target machine happens to be). It's not important to understand how the VM translates into x86, but I consider it important to be able to comprehend how the higher level code translates down into the target machine (virtual or actual).
Posted to teaching/learning by Chris Rathman on 5/9/04; 10:02:11 PM

andrew cooke - Re: Why Learning Assembly Language is Still a Good Idea

5/10/2004; 6:22:18 AM (reads: 576, responses: 1)

For example, Java programmers should understand the intermediate code produced by the compiler

why? i think i'm a pretty good java programmer, but i haven't a clue about this - i always assumed they'd have designed it so that it works well with the language. do i have to program java in a certain way to compensate for the bytecode?

what about asts? should we all learn the intermediate representation in the compilers we use so that we can finely tune our code so that that representation is efficient?

andrew cooke - Re: Why Learning Assembly Language is Still a Good Idea

5/10/2004; 6:26:06 AM (reads: 575, responses: 0)

my point, apart from needing to vent anger at missing my coffee this morning, was that assembly reflects the hardware architecture, about which you have little choice, but everything else up should be designed to provide an interface that reflects the language being supported.

(so i can see why someone using the prolog compiler just mentioned would need to learn about java bytecodes)

Chris Rathman - Re: Why Learning Assembly Language is Still a Good Idea

5/10/2004; 8:31:08 AM (reads: 551, responses: 1)

The more generalized problem is how one understands (and can prove) how the underlying target machine operates. For languages like Java, C# and Perl6, it's not nearly as important to understand how the hardware machine works as it is to understand how the virtual machine works (JVM, .Net, Parrot). Perhaps I'm biased, having worked on embedded systems all those years ago and having written my own Java Decompiler. But I think it important to be able to not just understand the Programming Language specification, but also understand the implementation, compilation, and interpretation issues involved.

As one quip says, Lisp programmers know the value of everything, but the cost of nothing. Ok, so this is an unfair characterization of Lisp, but I would note that the Lisp community spends an inordinate amount of effort to educate its users on how the Lisp machine works. The question is why? Why is it important to understand the Lisp engine, given that it's a purely symbolic architecture?

Efficiency: For much the same reason that one should understand x86 assembly for compilers that target that platform, it's important to know how high level constructs will work their way down into actual executable instructions of the target machine. If one is working with Java, the target machine is not ASM, but rather the intermediate byte-code language. I think the same arguments that Hyde gives for learning ASM are applicable to learning the language of target virtual machines.

Debugging: Sooner or later, I'm always faced with a problem of either (a) not understanding how a higher level construct actually translates; or (b) a problem in the compiler/interpreter that I am using. As one who dislikes having blind spots, or relying solely on trial and error to prove something, I like to have something a bit more concrete than guessing. It's not so important that I meticulously understand how each and every high level instruction translates down to the machine. But there are times when it becomes important to be able to trace the process from top to bottom. Black boxes work fine in terms of allowing us to think in higher and higher levels of abstraction. But sometimes there are problems inside the black boxes that we have to address.
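To make the point about tracing a high-level construct down to the target machine concrete, here is a minimal sketch that uses Python and the CPython virtual machine as a stand-in for Java and the JVM (an assumption made purely for illustration; the thread itself discusses Java byte-code and .Net IL). The function names `total` and `opcode_names` are hypothetical. The exact instructions printed depend on the interpreter version, so no output is shown.

```python
import dis


def total(values):
    """Sum a sequence with an explicit loop."""
    acc = 0
    for v in values:
        acc += v
    return acc


def opcode_names(func):
    """List the names of the VM instructions compiled for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    # Human-readable listing of the byte-code the VM will execute.
    dis.dis(total)
    # The same information as a plain list, handy for comparing two versions
    # of a function.
    print(opcode_names(total))
```

Reading such a listing does not require knowing how the VM maps onto x86; it only shows how the loop and the addition become instructions of the virtual target machine.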

Chris Rathman - Re: Why Learning Assembly Language is Still a Good Idea

5/10/2004; 8:50:45 AM (reads: 547, responses: 0)

Since it's something I'm working on today, let me give another example. I have a stored procedure that is very sensitive to the parameters that are present when the query plan is laid out - order of magnitude 10,000. Now I can guess why this procedure is hypersensitive, probably being the result of the selected indexes and the order of the join. But until I look under the hood and examine the differences in the query plans that are laid out (a sort of pseudo assembly type language), I will not really understand the nature of the problem nor will a solution present itself. I could examine the SQL over and over again, but getting down to the thinking and examining the compilation process is the only real answer. Everything else is just speculation. (Now if they just had a decent query plan language, I'd be set).
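As a rough illustration of examining query plans rather than staring at the SQL, the sketch below uses SQLite through Python's `sqlite3` module. This is an assumption for the sake of a runnable example: the post does not say which database is involved, and the table `orders`, the index `idx_orders_customer`, and the helper names are all hypothetical. It asks the engine for its plan before and after an index exists.

```python
import sqlite3

QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def plan_for(conn, sql, params=()):
    """Return the human-readable detail column of each query-plan row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of every plan row.
    return [row[-1] for row in rows]


def compare_plans():
    """Return the plan for QUERY before and after adding an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    before = plan_for(conn, QUERY, (42,))
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    after = plan_for(conn, QUERY, (42,))
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = compare_plans()
    print("without index:", before)
    print("with index:   ", after)
```

The SQL text is identical in both cases; only the plan changes, which is the kind of difference that cannot be seen by rereading the query.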

Dominic Fox - Re: Why Learning Assembly Language is Still a Good Idea

5/10/2004; 9:08:13 AM (reads: 535, responses: 0)

*ahem*

Corrupted bug report received in error: could not resolve address. This traces back unendingly, is lost at last amid a blitz of registers - beyond my skill

if not all skill. Go high-level to bare metal in winding descent; resist light, heat and shards; get zapped in nether regions. Every black box has another black box inside it somewhere.

The shortest path is a relaxed approach to the infinite: so route around bad sectors, solicit patches, number each release in cautious increments. Surely that kingdom

looms aloft where art is catalogued, all things considered harmless. There the blinding magi parley, indolent in their wreaths: their elan earthed, exposed as commonplace.

Martin Sandin - Re: Why Learning Assembly Language is Still a Good Idea

5/10/2004; 10:37:00 AM (reads: 504, responses: 0)

Isn't this just The Law of Leaky Abstractions manifested?

The same reason people seem to learn all kinds of things about how lazy certain Haskell constructs really are. I've seen (newbie) programmers who have too little knowledge about what is really happening "down below" having a very hard time even starting to guess where to begin optimizing, or looking for optimization opportunities, or the relative costs of iterating over a hundred elements in a collection and hitting the internet. I tend to think assembler (and hardware) experience makes one a better programmer, and not just in the optimization sense, but very much in the abstraction sense as well, as it gives you an idea of what your language is an abstraction of, and a basis from which you can at least start to comprehend O-notation and other things which matter.

Dominic Fox - Re: Why Learning Assembly Language is Still a Good Idea

5/10/2004; 2:06:32 PM (reads: 462, responses: 0)

I guess O-notation is at the crux of it - in TAOCP, you're never far away from an analysis of the time complexity of an algorithm, based on a step-by-step implementation in which every step is exactly costed. The point is to teach the discipline of thinking that way, rather than drumming the details of a specific architecture into the reader's head.
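The discipline of costing every step can be sketched without any particular architecture. The example below is only an illustration in Python, not taken from TAOCP: it counts one unit of cost per element inspected, which is an assumed cost model, and the function names are hypothetical.

```python
def linear_search(items, target):
    """Return (index, comparisons), scanning left to right."""
    comparisons = 0
    for i, value in enumerate(items):
        comparisons += 1  # one unit of cost per element inspected
        if value == target:
            return i, comparisons
    return -1, comparisons


def binary_search(items, target):
    """Return (index, probes); items must be sorted in ascending order."""
    probes = 0
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1  # one unit of cost per element inspected
        if items[mid] == target:
            return mid, probes
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


if __name__ == "__main__":
    data = list(range(1024))
    print(linear_search(data, 1023))
    print(binary_search(data, 1023))
```

Searching for the last of 1024 sorted elements, the linear scan inspects every element, while the binary search needs at most 11 probes; the counts follow from the algorithms themselves, not from the machine they run on.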

Frank Atanassow - Re: Why Learning Assembly Language is Still a Good Idea

5/11/2004; 8:13:00 AM (reads: 351, responses: 3)

Martin: Isn't this just The Law of Leaky Abstractions manifested?

Please do not propagate this nonsense.

Not only is this not a law, but only an informal conjecture, it is also an incorrect conjecture which perpetuates fuzzy thinking and discounting of responsibility.

In that article, Spolsky gives some examples of abstractions which he claims `leak' (whatever that means).

He says that TCP claims to guarantee that every message will arrive, but that in fact 1) it won't arrive if your computer gives out, or 2) it will arrive, but very slowly. Well, 1) obviously the proposition that the receiver of a message is available is a precondition of the abstraction and 2) arriving slowly does not contradict the claim, which only says that it will arrive. Now, I don't know if TCP actually respects that guarantee, but I do know that those two counterexamples are not counterexamples at all.

His examples about 2D arrays and page faults, and SQL performance, follow the pattern of 2): he pretends that performance guarantees are part of the specification. His remaining examples, about NFS and C++ strings and cars, follow the pattern of 1): he pretends that the specification doesn't mention some precondition/assumption.

These are all straw man arguments which stem from Spolsky's desire to oversimplify; when the world doesn't match his expectations, rather than blame his own model of the world, he blames the notion of model (abstraction) itself. `Every abstraction is leaky' means `no model is sound.' Well, while it is undoubtedly true that no model is going to account for all phenomena of interest, one can say, with absolute certainty, of some models that they are sound, provided one is clear on exactly what the model is trying to account for.

Of course, Spolsky is used to working in languages which don't possess any kind of formal semantics, and consequently it's impossible to reason about programs, modules, procedures, classes, etc. without, at the very least, a finite degree of uncertainty. In languages like Scheme or SML, which do possess a semantics, one can formalize a specification, stating precisely what it assumes and guarantees, and then say of a module, for example, that it either implements that specification or not. There is no in-between, or leakiness, or wishy-washiness of any sort.

I really find this sort of thing extremely irritating, because it's so circular and such a self-fulfilling prophecy. Why are abstractions leaky? Well, because people design them to leak. Why? It's not only that crystallizing a specification takes more work. It's also that they would rather keep their goals as fuzzy as they can get away with, because that way if an implementation is wrong or unsatisfactory, one can shift the blame.

Don't tell me it's not true; I used this tactic all the time when I was a teenager and wanted to get out of something. `You want me to clean my room? Sure.' An hour later. `Oh, under the bed too? And the closet? My desk? You didn't say that!'

Andris Birkmanis - Re: Why Learning Assembly Language is Still a Good Idea

5/11/2004; 8:44:42 AM (reads: 323, responses: 0)

In languages like Scheme or SML, which do possess a semantics

If one really wants to be difficult to deal with, one (the same one) can say: "It isn't Scheme all the way down, you give the semantics using some formalism, which itself needs grounding", and then bring up the problem of bootstrapping, and Goedel, and dragons.

But I am not that one :-) I tend to agree, except that probably Spolsky in fact suggested just an observation that people tend to create incomplete (if not unsound) abstractions?

Dominic Fox - Re: Why Learning Assembly Language is Still a Good Idea

5/11/2004; 8:54:24 AM (reads: 323, responses: 0)

The trouble with specifications is that they can't be formal all the way up (even if they can be formal all the way down, from specification language to "higher-level" language to machine language). Given informal requirements, a formal specification will always be a translation; and some part of what the requirer means (or thinks they mean) will inevitably be lost in the process.

Abstractions "leak" at this juncture, which is the point at which two quite distinct language games with quite distinct rules are brought together. Unless we can all learn to "require" only what we can specify exhaustively and unambiguously, this will continue to be the case.

As far as responsibility goes, there exists a strain of ethical opinion which holds that it is formalism that discounts responsibility by making it unnecessary: no-one need answer personally for the correctness of what can be formally proven. The situation that calls for responsibility is the situation in which I have to make a choice and do not possess any infallible means of determining what is the correct choice. But maybe we'd better not get into meta-ethics right now...

james litsios - Re: Why Learning Assembly Language is Still a Good Idea

5/11/2004; 8:58:05 AM (reads: 312, responses: 0)

When I interview a developer I ask:

- Tell me what can be found in a "core" dump?

- What is meta-programming with templates (in C++) all about?

In the first case I want to hear about his assembly language skills. In the second I want to hear him mention functional programming.

Being a good software engineer is all about economics. If you don't know assembly, you don't know when it would be a good time to use that knowledge.

Frank Atanassow - Re: Why Learning Assembly Language is Still a Good Idea

5/11/2004; 10:26:12 AM (reads: 285, responses: 1)

Given informal requirements, a formal specification will always be a translation; and some part of what the requirer means (or thinks they mean) will inevitably be lost in the process.

Even if that is inevitable (and, though I think it can happen, I don't think it's inevitable), the person who does the specifying (`requirer'), and any reasonable person who uses the specification, can certainly distinguish between `X does Y' and, say, `X does Y using time O(n)'. This is a pretty clear distinction, and it's disingenuous of Spolsky to try to impute performance guarantees into the abstractions he names.

Even if you say that in every specification designed for software there is an implicit performance requirement (which I think is true), it's disingenuous not to make that explicit when you criticize abstractions in general. As in, `OK, I know that nothing in the SQL spec says that this query will be fast, but...'

Similarly, it's staggeringly obvious that windshield wipers, headlights and a car roof are not going to prevent hydroplaning, or make it seem as if it isn't raining. You don't need a formal spec of windshield wipers to see that.

In fact, I don't think Spolsky gives a single example where one could argue, in an intellectually honest way, that the gap between informality and formality is responsible for the `leakiness'.
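The distinction between `X does Y' and `X does Y using time O(n)' can be sketched as follows. This is a runtime check in Python, not a formal semantics of the kind available for Scheme or SML, and every name in it (`meets_sort_spec`, `insertion_sort`, `builtin_sort`) is hypothetical. The specification states only what the result must be; two implementations with very different costs both satisfy it.

```python
from collections import Counter


def meets_sort_spec(inp, out):
    """Postcondition: out is ascending and is a permutation of inp.

    Nothing here mentions running time.
    """
    ascending = all(a <= b for a, b in zip(out, out[1:]))
    return ascending and Counter(inp) == Counter(out)


def insertion_sort(items):
    """Satisfies the spec, using a quadratic number of steps in the worst case."""
    result = []
    for item in items:
        pos = len(result)
        # Walk left past every element larger than the new item.
        while pos > 0 and result[pos - 1] > item:
            pos -= 1
        result.insert(pos, item)
    return result


def builtin_sort(items):
    """Satisfies the same spec via Python's built-in sort."""
    return sorted(items)


if __name__ == "__main__":
    data = [5, 3, 8, 1, 3]
    for sort in (insertion_sort, builtin_sort):
        print(sort.__name__, meets_sort_spec(data, sort(data)))
```

If a performance bound is wanted, it has to be added to the specification as a separate, explicit requirement; the check above neither implies nor contradicts one.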

Andris Birkmanis - Re: Why Learning Assembly Language is Still a Good Idea

5/11/2004; 11:06:06 AM (reads: 272, responses: 0)

You don't need a formal spec of windshield wipers to see that.

A spec as a statement of how it should behave or of how it's implemented? If the latter, then I think in light of chaos, eh, theory, it is not possible to prove even given a formal spec that in all cases, under all circumstances wipers will not influence hydroplaning. But as James mentioned already, SE is about economy, so for all practical purposes it's good enough to run QA and see that in most (realistic) cases the statement is valid.

Information loss in the process of requirements gathering is probably not inevitable, but trying to prevent it could be economically inefficient.

I don't even suggest another sidepath - where is the boundary of responsibility between customer and provider? Is it the atmosphere bearing sound waves, is it the paper of the contract, or does it depend on legislation? And morality? This leads nowhere, or rather, it can lead to any question of philosophy.

Martin Sandin - Re: Why Learning Assembly Language is Still a Good Idea

5/11/2004; 11:59:58 AM (reads: 261, responses: 0)

Interesting points, and I shall have to think about them, but you do seem to be reading something into it which I did not intend to convey. I should probably not quote him if his case is as bleak as you imply, but then I'll have to say what I actually read in a way which doesn't rely on his words instead. Then you can actually correct my muddled ways of thinking instead of his, and for me that is probably going to be even more rewarding :) I never read that text as saying that abstraction leak is 'inevitable', merely that many common abstractions we do have do, in fact, 'leak'. I'm actually curious as to where I should mend my ways if I read his text and am reminded of:

- How people ask why their Haskell code leaks memory, and the answer seems to be to learn the details of how the Haskell evaluation strategy works. This is of course a part of the abstraction, but it's a murky and uncomfortable part. Maybe that's the crux.

- How Python string building is simple as that, but when things perform dog slow you need to learn about how the interpreter does string concatenation, and the fact that strings are immutable, and all kinds of things the abstraction does not immediately reveal.

And other things I can't think of quickly. These things are obviously parts of the total abstraction, which in this sense doesn't leak. But that is in a formal sense; in an informal sense (this is where we shouldn't go, informal?) these things are leaks in a higher-level abstraction. Is there a pattern to this kind of phenomenon? If it's there, is it in any way a useful pattern to notice or should we just ignore it? Is it in any way related to the abstraction boundary 'leakage' mentioned above? Is this just a case of 'this abstraction is harder to use than it looks, you'll just have to learn', as I suspect you're saying?
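The Python string-building case mentioned above can be sketched like this. Both functions are hypothetical illustrations and return the same string; what differs is how much copying the interpreter may do along the way. Because strings are immutable, each `+` can create a new string containing everything built so far, whereas `str.join` assembles the result in one pass. Some CPython versions can optimise the repeated concatenation in simple cases, so actual timings vary and none are claimed here.

```python
def build_with_concat(parts):
    """Build a string by repeated concatenation."""
    result = ""
    for part in parts:
        # Strings are immutable: this may copy everything built so far.
        result = result + part
    return result


def build_with_join(parts):
    """Build the same string in a single pass with str.join."""
    return "".join(parts)


if __name__ == "__main__":
    pieces = ["a", "b", "c"] * 100
    print(build_with_concat(pieces) == build_with_join(pieces))
```

Nothing in the meaning of `+` on strings reveals the difference; it only shows up once one knows how the interpreter represents strings.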

Ehud Lamm - Re: Why Learning Assembly Language is Still a Good Idea

5/11/2004; 1:24:30 PM (reads: 255, responses: 0)

Why are abstractions leaky? Well, because people design them to leak.

This is certainly an important part of the problem, since we can obviously design abstractions that don't leak. But can we do it all the time? Can we do it productively while building real life applications? This is less clear, and naturally depends on murky subjects like SE and methodology.

Which is why I formulated a different approach. I am not talking about abstractions leaking, but rather about programmers actively breaking abstraction boundaries. Why do they do it? In a nutshell: sometimes it is easier, in the short term, to break an abstraction rather than create a more powerful (and usually more abstract) component.

The problem is that some languages provide better (i.e., safer) constructs when it comes to abstraction breaking than others (e.g., inheritance vs. reflection vs. class loaders).

A more detailed discussion of this issue can be found in my paper Component Libraries and Language Features, Ada-Europe'2001, LNCS 2043.