Lambda the Ultimate

C++ vs. Lisp
started 11/28/2002; 6:36:56 AM - last post 12/2/2002; 7:25:44 AM
Dejan Jelovic - C++ vs. Lisp
11/28/2002; 6:36:56 AM (reads: 2740, responses: 10)
C++ vs. Lisp

One programmer's attempt to translate Lisp code into C++.


Posted to general by Dejan Jelovic on 11/28/02; 6:40:55 AM

Dan Shappir - Re: C++ vs. Lisp
11/28/2002; 7:56:42 AM (reads: 2345, responses: 1)
I may be missing something, but some of the statements in this article don't add up:

Also, notice that I had to print out the list explicitly -- cout doesn't know how to print an STL list by default, a rather egregious fault in my book. (However, we can solve this with C++ operator overloading, a fairly advanced skill.)

This is simply not true. You can easily use the ostream_iterator to print an STL list:

copy(list.begin(), list.end(), ostream_iterator<string>(cout));

Note that I'm assuming that the namespace std is being used.

The thing to note here is the use of the copy algorithm. STL algorithms, when used in conjunction with C++ function objects, allow for a functional style of programming, which mostly answers Brandon's complaint about the use of iterators.
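A minimal sketch of that functional style (modern C++ syntax for brevity, and the container contents and helper names are my own, not from Dan's post):

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Join a list of strings without writing a loop by hand:
// std::copy drives an ostream_iterator attached to a string stream.
std::string join(const std::list<std::string>& words, const char* sep) {
    std::ostringstream out;
    std::copy(words.begin(), words.end(),
              std::ostream_iterator<std::string>(out, sep));
    return out.str();
}

// The same style with a function object: transform each word into
// its length, accumulating into a fresh container.
std::vector<std::size_t> lengths(const std::list<std::string>& words) {
    std::vector<std::size_t> out;
    std::transform(words.begin(), words.end(), std::back_inserter(out),
                   [](const std::string& s) { return s.size(); });
    return out;
}
```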

Finally, with regards to Brandon's own version of STL:

I decided to stick with C++ and try to make it better. But how? My first thought was to get rid of iterators. I decided to rewrite all of the STL's algorithms to accept containers instead of iterators. But more importantly, I decided to also make the algorithms (where possible) return copies of containers instead of iterators as well. This would enable the functional programming that makes Lisp code so succinct.

There is a reason why STL was designed the way it was - efficiency. Both Stroustrup in his original design of C++ and Stepanov in his design of STL realized that unless the generated instruction sequence was (more or less) as efficient as hand-written code, the developers they were targeting wouldn't use their tools. Generating copies of the containers would result in such inefficiency that no real C++ programmer would ever use it.

Here is an example: traversing a sequence in reverse. Using Brandon's STL you would create a copy of the original container with the elements in reverse order, traverse it, and then discard it. Using the real STL you would simply use a reverse_iterator.
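To make the comparison concrete, here is a sketch of the reverse_iterator approach (my own example, not from Dan's post) -- no reversed copy is ever created or discarded:

```cpp
#include <list>
#include <string>

// Walk the container back to front via rbegin()/rend().
// The original sequence is traversed in place, in reverse.
std::string backwards(const std::list<std::string>& xs) {
    std::string out;
    for (std::list<std::string>::const_reverse_iterator it = xs.rbegin();
         it != xs.rend(); ++it)
        out += *it;
    return out;
}
```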

Luke Gorrie - Re: C++ vs. Lisp
11/28/2002; 10:45:51 AM (reads: 2334, responses: 1)
I've been wanting to finish a similar experiment that I did between Lisp and Haskell, though I haven't had as much perseverance as Brandon. Maybe a reader who knows Haskell can help me out?

The idea was to take a solution to the Nine Nines problem that was written in Lisp and do a Haskell version, to learn a bit of Haskell. The program works, but I'm unsatisfied on a couple of points. One is that I wonder whether some neat Haskell tricks could make the code nicer; in particular, the set/list conversion doesn't look so nice. Another is performance: the Lisp program runs in about 2 seconds on CMUCL, while the Haskell one takes about 20 minutes in GHC, even though as far as I can see they use essentially the same algorithm. At first I assumed that the memoization was broken, but it "seemed to work" in some testing.

Anyone care to suggest improvements?

(The real trouble for me with learning Haskell is that I end up writing Erlang programs in Haskell! The same problem with learning ML.)

Dejan Jelovic - Re: C++ vs. Lisp
11/28/2002; 2:31:25 PM (reads: 2264, responses: 0)
Dan Shappir wrote:

This is simply not true. You can easily use the ostream_iterator to print an STL list:

Yeah, I also noticed that. However, the output of using std::copy with an ostream_iterator is rather ugly. Usually one wants a comma-separated list.
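For the record, ostream_iterator does take a delimiter argument, but the delimiter is written after every element, so the output carries a trailing separator. A sketch of the problem (helper name is my own):

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <sstream>
#include <string>

// The delimiter is emitted after each element, so a trailing
// ", " is unavoidable with this idiom alone.
std::string comma_separated(const std::list<std::string>& xs) {
    std::ostringstream out;
    std::copy(xs.begin(), xs.end(),
              std::ostream_iterator<std::string>(out, ", "));
    return out.str();
}
```

Getting "a, b, c" instead of "a, b, c, " takes extra work in C++, whereas Lisp's format handles it in a single directive.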

There is a reason why STL was designed the way it was - efficiency.

Actually, it's efficiency _and_ memory management. Some of the code I write using STL can be converted into much simpler and equally efficient code in languages that have parametric polymorphism _and_ GC.

Dejan

Karl Zilles - Re: C++ vs. Lisp
11/29/2002; 9:02:51 PM (reads: 2189, responses: 0)
I can't offer any help with the Haskell version.

I did port your program to OCaml, just to get a feel for how it compared speed-wise and to learn a little more of the language. The code is here:

http://www.zilles.com/karl/ninenines.ml

If anyone can offer advice on how to improve it, I would welcome it.

Compiled under Cygwin OCaml on Windows, it took a little over 5 seconds on my 1 GHz machine.

I'm hoping the answer is 195. Someone please tell me if it's not. :)

Luke Gorrie - Re: C++ vs. Lisp
11/30/2002; 4:06:49 PM (reads: 2074, responses: 0)
195 is correct.

On my 800 MHz Athlon, the OCaml version runs in about 6.8 seconds, as measured by the Unix 'time' command, using OCaml 3.02 and compiling as the comment in the source file says. The Lisp version in CMUCL 18d takes only 1.5 seconds, as measured by (time (run 9)). I'm pretty impressed with CMUCL.

I suspect that the ~20 minutes of the Haskell version is due to some serious flaw in the program, but darned if I've been able to find it. I stopped looking after spending a whole weekend trying to rebuild GHC with profiling support, interrupted by many "no space left on device" errors.

Speaking of correctness, I had a pretty alarming bug in my Haskell version: I was missing the parentheses in the `combinations' function, so I had something like:

if x then a else b ++ if y then c else d

and expected if-then-else to bind tighter than ++, which it doesn't. (What I meant was (if x then a else b) ++ (if y then c else d).) The result was a plausible but incorrect answer, which I'd never have found without already knowing the correct answer from the Lisp program. Not particularly deep, but I found it interesting. Maybe a simpler syntax would complement the static type system in helping to write correct programs - but perhaps only for the newcomer.

Alex Sauer-Budge - Re: C++ vs. Lisp
11/30/2002; 5:30:27 PM (reads: 2055, responses: 0)
I'm fascinated by the variation in performance of the Lisp, OCaml and Haskell implementations, and by the implication that a few tweaks/tricks and more knowledge about how the high-level code gets translated into machine-level code would improve an implementation. I would hope that the underlying algorithm plays a bigger role in performance than the particular programming language constructs used to specify it.

I know it is more the constant in front of the asymptotic complexity than the exponent, but my desires just reflect that I am more interested in focusing my attention on the mathematical nature of algorithms than in haggling with the programming language, compiler/interpreter and computer over how best to represent them.

If we accept that a given algorithm can have multiple solutions in a particular language, then has there been any work done on designing programming languages that are not sensitive to which particular solution the programmer chooses? My guess is that this is impossible to ensure in any "general purpose" language, but it seems a fundamental enough question that there might be some research on the topic.

Isaac Gouy - Re: C++ vs. Lisp
12/1/2002; 10:34:55 AM (reads: 2028, responses: 1)
programming languages that are not sensitive to which particular solution the programmer chooses
We write two different implementations of some algorithm, and the programming language tool figures out that they are the same algorithm?

Hmmm.

You might find Intentional Programming interesting. IP has been discussed previously on LtU.

Dan Shappir - Re: C++ vs. Lisp
12/2/2002; 2:28:59 AM (reads: 2129, responses: 0)
We write two different implementations of some algorithm, and the programming language tool figures out that they are the same algorithm?

Actually, some old compilers had something along these lines. The Sieve of Eratosthenes was such a popular test of compiler code generation that some compiler writers put in code to identify it; when it was identified, the compiler would emit a hand-tuned optimal implementation of that algorithm. At least so the legend goes.
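For readers who don't know the benchmark, a minimal sieve can be sketched in C++ (the thread's language, not whatever those old compilers actually consumed):

```cpp
#include <vector>

// Sieve of Eratosthenes: cross off the multiples of each prime,
// collecting the survivors up to n.
std::vector<int> primes_up_to(int n) {
    std::vector<bool> composite(n + 1, false);
    std::vector<int> primes;
    for (int p = 2; p <= n; ++p) {
        if (composite[p]) continue;
        primes.push_back(p);
        for (int q = 2 * p; q <= n; q += p)
            composite[q] = true;
    }
    return primes;
}
```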

Brandon Corfman - Re: C++ vs. Lisp
12/2/2002; 7:17:26 AM (reads: 2044, responses: 0)
Hi --

Yes, your comment about std::copy is correct, and I could have used it instead of the for-loop. Keep in mind, though, that the original program was written as quickly as possible, without too much regard to coding style. When I translated to the second, smaller version of the C++ code, I kept the original for-loop code in; std::copy could certainly be used instead.

However, when I look at Norvig's code, it uses one of the most common keywords in Lisp: format. This gave me three options when I was illustrating my point:

1) Use std::copy. In comparison to Lisp's format, this is obscure, hard-to-remember syntax that I would've had to look up while I was coding. In addition, this is a piece of C++ arcana that I learned from Scott Meyers' "Effective STL" -- it's not a commonly used idiom.

2) Use cout and operator overloading. This has the big advantage of using the standard C++ output keyword, but operator overloading is also fairly obscure.

3) Use a standard for-loop and cout for each element of the list. While not as "modern", it's in the same spirit of simplicity as Norvig's approach. I don't have to explain what the code means, it's no less efficient than std::copy, and it's what I would write in production code because junior programmers would understand it too.
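The trade-off between options 1 and 3 side by side (my own sketch, not Brandon's code; both produce identical output, so the choice is purely one of readability):

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <ostream>
#include <sstream>
#include <string>

// Option 3: the explicit for-loop -- more lines, nothing to look up.
void print_loop(const std::list<std::string>& xs, std::ostream& out) {
    for (std::list<std::string>::const_iterator it = xs.begin();
         it != xs.end(); ++it)
        out << *it << " ";
}

// Option 1: std::copy -- one statement, but "Effective STL" arcana.
void print_copy(const std::list<std::string>& xs, std::ostream& out) {
    std::copy(xs.begin(), xs.end(),
              std::ostream_iterator<std::string>(out, " "));
}
```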

My own personal coding style is that I will use a few more lines of code if it means I don't need to write a comment on what I've done. It's true that this is in conflict with the "rewrite the program in the smallest number of lines" goal, but coding style is all about balancing goals.

In my EasySTL code, I ended up using the std::copy algorithm, because I think it's fine to stick arcana like that in a library.

Also, I agree with you: passing copies of lists around is not the most efficient method. But you already have the modifying functions (i.e. the standard STL algorithms) if efficiency is an issue! To me, worrying about speed is important only where you're spending most of the run time (i.e. the 80/20 rule). For example, in the case of the Encoding problem, there was no speed hit from using the library, because most of the run time was spent doing multimap stores/lookups.

Alex Sauer-Budge - Re: C++ vs. Lisp
12/2/2002; 7:25:44 AM (reads: 1988, responses: 0)
I do indeed find the ideas of Intentional Programming very interesting (from my very naive understanding of these things). An approach like this, where you must manually specify the optimizations for particular constructs, is probably the only way to do it in practice, but I have a proclivity to dream about the computer doing more work for me (after all, isn't optimization just the application of another algorithm?). At the most basic level, I am thinking about standard compiler optimization strategies like hoisting and inlining, but in broader terms I am thinking about multi-stage languages.

When I asked my question, I was imagining a simple language (like one of the lambda calculi?) in which you could always statically reduce a given specification of an algorithm to a unique normal form, selected from the alternatives because it is the most efficient. I wouldn't expect such a reduction to improve the inherent asymptotic complexity of an algorithm, "just" to minimize the pre-multiplying constant.