Lambda the Ultimate

Programming as if Performance Mattered
started 5/5/2004; 6:41:20 AM - last post 5/10/2004; 6:59:32 PM
James Hague - Programming as if Performance Mattered
5/5/2004; 6:41:20 AM (reads: 811, responses: 10)
Tooting my own horn a bit, but I've gotten some good feedback on Programming as if Performance Mattered, kind of a different (and sneaky) take on optimization and speed and language choice. The sneaky part comes from...well, you should just read it :)

andrew cooke - Re: Programming as if Performance Mattered
5/5/2004; 6:55:46 AM (reads: 597, responses: 1)
you see a factor of 10 improvement with a factor of 10 increase in clock speed, as far as i can see. yet alan kay claims (in that video) that this isn't happening - that his smalltalk code is running only 50 times faster than some ancient chunk of iron (pdp-11?). what's the explanation? is it that we forget just how good those old machines were compared to early x86?

Frank Atanassow - Re: Programming as if Performance Mattered
5/5/2004; 7:51:59 AM (reads: 585, responses: 2)
andrew: alan kay claims (in that video) that this isn't happening - that his smalltalk code is running only 50 times faster than some ancient chunk of iron (pdp-11?).

I saw this video once also, and found this a bizarre style of argument. He derides computer engineers because their designs don't significantly increase the performance of programs written in his favorite language. To me it seems clear rather that his Smalltalk implementation doesn't exploit the hardware.

For the last 20 years now, hardware has been designed to shift more and more of the burden of optimization onto compilers (so: at compile time) rather than onto the execution architecture (so: at run time). Maybe Kay can't accept this because he's so obsessed with late binding.

Luke Gorrie - Re: Programming as if Performance Mattered
5/5/2004; 8:20:03 AM (reads: 577, responses: 0)
Brilliant, James. And I don't just say that as an Erlang fan :-)

Neel Krishnaswami - Re: Programming as if Performance Mattered
5/5/2004; 12:20:03 PM (reads: 596, responses: 0)
Hi Frank, I'm not sure I agree with the claim "hardware has been designed to shift more of the burden of optimization to compilers". I mean, the advances in architectures have been things like branch predictors, speculative execution, and caching, all of which are dynamic things that don't rely so much on the compiler to do things "properly" in order to get good performance. The one architecture that has been designed on the assumption of a good compiler -- Itanium -- has not been a great success.

Keith Moore - Re: Programming as if Performance Mattered
5/6/2004; 2:57:22 AM (reads: 413, responses: 0)
You only see a factor of 10 improvement if there is a factor of 10 increase in the speed of *everything* the code touches (cache, main memory, disk, etc.)

That said, I still don't understand Alan Kay's claims.

It may be like someone claiming "My 2GHz P4 is at least 20 times faster than my old 100MHz Pentium. Why doesn't MS Word feel 20 times faster?" One reason is, like I mentioned above, all parts of the system are not 20 times faster. Also, the "MS Word" running today is not the same "MS Word" from 10 years ago.
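
To put rough numbers on this, here is a minimal sketch of the usual Amdahl-style calculation. The function name `overall_speedup` and the runtime split in the example are made up for illustration; they are not measurements of MS Word or of any real machine.

```python
def overall_speedup(parts):
    """Overall speedup when each part of the runtime speeds up differently.

    parts: list of (fraction_of_original_runtime, speedup_factor) pairs.
    The fractions are assumed to cover the whole runtime, so they must sum to 1.
    """
    total = sum(fraction for fraction, _ in parts)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("runtime fractions must sum to 1")
    # New runtime, expressed as a fraction of the old runtime.
    new_time = sum(fraction / factor for fraction, factor in parts)
    return 1.0 / new_time


# Hypothetical split: half the time in the CPU (20x faster),
# 30% waiting on memory (4x faster), 20% waiting on disk (2x faster).
example = [(0.5, 20.0), (0.3, 4.0), (0.2, 2.0)]
print(round(overall_speedup(example), 2))
```

With that made-up split the whole program gets about 5 times faster, not 20 times, because the slow parts come to dominate the remaining runtime.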

James Hague - Re: Programming as if Performance Mattered
5/6/2004; 6:25:29 AM (reads: 382, responses: 0)
Glad you liked it Luke! I felt like it needed to be said.

On the factor of 10x issue: That the improvement is fairly close to the increase in clockspeed, well, it still seems low to me considering that the Pentium has gone through several big rounds of architectural rethinks since the PII and has generally been beefed up in a number of ways. Either they're all for naught and clockspeed is what really matters, or they're necessary to make up for memory still being much slower than the CPU.

Uh oh, I see this article is now on Slashdot, with a link to Lambda. Here's hoping both servers come out okay :)

andrew cooke - Re: Programming as if Performance Mattered
5/6/2004; 3:49:46 PM (reads: 274, responses: 1)
on the topic of optimizers, i just wrote some code that runs a factor of 4 faster on x86 than on a sun sparc, relative to a different reference algorithm.

it's 4 times slower than the reference on the sun, but equal in speed on the x86 (and in both cases varying "n" gives the big-O behaviour expected, showing that they're both compute-bound).

the only explanation i've got so far is that the code is very "not like a fortran program", so the optimizer on the sparc (direct f77) is doing a much worse job than on the x86 (f2c then gcc). the reference algorithm takes a standard approach, so the f77 compiler wouldn't be surprised.

but a factor of 4...?!

(on edit - for the curious, it does a lot of binary tree manipulation, with dynamic memory handling via a hack that uses the c library malloc and converts pointers to indices in a common array.)

Oleg - Re: Programming as if Performance Mattered
5/6/2004; 7:58:13 PM (reads: 281, responses: 0)
Perhaps the following two threads

could also be relevant to the discussion. The person who posed the problem later noted that the problem is real and practical. Also note the comment by Brad Lucier in the second thread.

andrew cooke - Re: Programming as if Performance Mattered
5/7/2004; 9:40:07 AM (reads: 216, responses: 0)
so it looks like it was memory caching. if i do all the allocation up-front (and presumably from a contiguous block of memory) the sun version matches the reference. presumably the x86 malloc routines are a bit smarter about grouping/recycling memory blocks.
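
for anyone wondering what "all the allocation up-front" with indices instead of pointers can look like, here is a rough sketch. it is not the fortran code discussed above: `NodePool`, `NIL` and the fixed capacity are hypothetical names and choices, and python will not show the cache effect itself, only the layout (every node lives in a few arrays allocated once, and a "pointer" is just an index).

```python
NIL = -1  # index used in place of a null pointer


class NodePool:
    """Binary search tree nodes stored in arrays that are allocated once."""

    def __init__(self, capacity):
        # All storage is reserved up front; nothing is allocated per node later.
        self.key = [0] * capacity
        self.left = [NIL] * capacity
        self.right = [NIL] * capacity
        self.used = 0

    def alloc(self, key):
        """Hand out the next free slot and return its index."""
        if self.used == len(self.key):
            raise MemoryError("node pool exhausted")
        i = self.used
        self.used += 1
        self.key[i] = key
        self.left[i] = NIL
        self.right[i] = NIL
        return i

    def insert(self, root, key):
        """Insert key below root (an index or NIL); return the root index."""
        new = self.alloc(key)
        if root == NIL:
            return new
        cur = root
        while True:
            if key < self.key[cur]:
                if self.left[cur] == NIL:
                    self.left[cur] = new
                    return root
                cur = self.left[cur]
            else:
                if self.right[cur] == NIL:
                    self.right[cur] = new
                    return root
                cur = self.right[cur]

    def inorder(self, root):
        """Return the keys in sorted order using an explicit stack."""
        out, stack, cur = [], [], root
        while stack or cur != NIL:
            while cur != NIL:
                stack.append(cur)
                cur = self.left[cur]
            cur = stack.pop()
            out.append(self.key[cur])
            cur = self.right[cur]
        return out


pool = NodePool(capacity=8)
root = NIL
for k in [5, 2, 8, 1, 9, 3]:
    root = pool.insert(root, k)
print(pool.inorder(root))
```

nodes are handed out in allocation order from one block, so nodes created together sit next to each other, which is the property the up-front version relies on.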

in a similar vein, this article on the p-m (centrino chip) architecture (referenced on /. today) is pretty interesting.

Mark Evans - Re: Programming as if Performance Mattered
5/10/2004; 6:59:32 PM (reads: 146, responses: 0)
You all know my thoughts.