expressivity of "idiomatic C++"

The April issue of C/C++ Users Journal has an article called A New Solution To an Old Problem by Andrew Koenig and Barbara E. Moo. In it, they revisit the Hamming numbers problem from Dijkstra's A Discipline of Programming.

They examine four different solutions:

  1. A naive O(n^2) solution.
  2. Dijkstra's efficient but somewhat convoluted O(n) solution (re)implemented in C++.
  3. An elegant O(n) solution in Haskell.
  4. Their own new O(n*log(n)) solution in "idiomatic C++".

The Haskell solution is the following

scale n (x:xs) = (n * x) : (scale n xs)
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys) =
   if x == y then
      x : (merge xs ys)
   else if x < y then
      x : (merge xs (y:ys))
   else
      y : (merge (x:xs) ys)

seq = 1 : (merge (scale 2 seq)
                 (merge (scale 3 seq) (scale 5 seq)))

Their "idiomatic C++" solution uses ordered sets:

set<int> seq;
seq.insert(1);
set<int>::const_iterator it = seq.begin();

int val = *it;
seq.insert(val * 2);
seq.insert(val * 3);
seq.insert(val * 5);
it++;

In conclusion, they have this to say (emphasis mine),

We have looked at four solutions to a simple problem. The first was straightforward but slow. The second was much faster but fairly tricky. The third, in a functional language, was again straightforward -- but requires a totally different way of thinking about programming. Indeed, advocates of this way of thinking use the program's straightforwardness to argue that this way of thinking is superior.

With the fourth solution, we believe that the argument is far from obvious.

I may be reading too much into this quote, but it sounds to me like Koenig and Moo consider it a bad thing to require a "totally different way of thinking about programming".

P.S. While googling for Hamming numbers, I came across this related paper: Expressivity of Functional-Logic Languages and Their Implementation by Juan José Moreno Navarro.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

FP as a second language

but it sounds to me like Koenig and Moo consider it a bad thing to require a "totally different way of thinking about programming".

Of course they do. Remember that one of the core design values of C++ is building upon pre-existing knowledge and expectations from C. I think there is an inherent conservativeness in the C/C++ community, partly because of this value and partly because they are the "powers that be" in many programming domains.

I should point out that, though I don't share that PL conservative outlook, I don't think it is unreasonable either. If you have spent many years mastering a particular "way of thinking about programming", you probably aren't going to want to give it up on short notice.

No one wants to go from being a doctor or engineer in his "native" PL to being a taxi driver in someone else's.

I think a lot of technological "religion" flows from exactly this kind of learning opportunity cost.

Agreed

First, it's a fair observation that when you're faced with a problem and someone proposes a paradigm shift as a solution, you should explore all other avenues first, because paradigm shifts are high-cost events by definition. Given this, it's not clear to me that Koenig and Moo are necessarily hostile to FP, and in fact anyone who writes forcefully about taking full advantage of the STL, as Koenig and Moo have done, can't be all that hostile to it!

However, even if I'm mistaken on that last point, it doesn't mean that the entirety of the C++ community agrees, or even that people who are on the standards committee agree: witness Bjarne Stroustrup's insistence that C++ is multiparadigm including functional programming, boost::lambda, FC++, and Phoenix. Also, boost::lambda and Phoenix are being integrated, and hopefully at some future date we'll also see monads make it from FC++ into whatever the integrated boost functional programming library is called.

It pains me to say it... but C++ is my fourth-favorite functional language. That may not sound like much, but the fact that it's on my list at all just flabbergasts me. For the curious, my first three are Oz, O'Caml, and the Lisp family.

on paradigmatic shifts

it's a fair observation that when you're faced with a problem and someone proposes a paradigm shift as a solution, you should explore all other avenues first, because paradigm shifts are high-cost events by definition.

STL was a paradigm shift at the time it was introduced, was it not?

So was OO

This is the big advantage C++ has- even over other "Object Oriented C" languages (like Objective-C). It provides the new "paradigm" (or pseudo-paradigm- I don't think templates in C++ or modules in Ocaml quite qualify as full on paradigms by themselves) without really requiring it. You can ignore it if you don't understand it or don't understand it well enough to use it right now. I still know a number of programmers supposedly programming in C++ whose code looks awfully similiar to C.

Note that this is also a disadvantage of C++, in that code written by one programmer who "knows" the language is not necessarily understandable by another programmer who "knows" the language.

Surely you mean that code wri

Surely you mean that code written by a programmer who knows the language may not be comprehensible to one who "knows" (i.e. doesn't know) the language?

Known Knowns :-)

I think his point was that it's quite possible (actually, common) for one C++ programmer to know the language from one angle and another programmer to know it from another angle, and for each of these programmers to find the other's code incomprehensible, even though they're both writing perfectly valid ANSI C++.

I encounter this phenomenon a lot: I make no apologies for the fact that my C++ style is very much driven by the modern capabilities of the language and libraries coupled with my background as a Lisp and, nowadays, O'Caml programmer. Folks who come to C++ with what I think of as the MFC (or, to be fair, MacApp) perspective frequently find my code incomprehensible (I pasted some compiling, but not working, source code into a chat with other C++ programmers with over a decade's experience and had one of them seriously challenge the idea that the code would even compile). Conversely, I find code written in the rely-on-members-in-a-superclass-to-side-effect-a-data-member-that-I-rely-on, coupled with plain-ol'-public-data-members, style utterly incomprehensible as well.

C++ now supports multiple diametrically opposed styles and their associated communities. This is something that it has in common with Perl. I leave you to draw your own conclusions about that.

Yes- that was precisely what I meant

In my experience, there are three main "styles" of C++ programmers I've seen:

First, there is C-with-classes programmer. These people are comming out of the C world, and they're not really comfortable with all these new-fangled stuff. Often times, there code will compile and run perfectly fine on a straight C compiler. Most of the rest use the occasional class or STL library, but not often. You'd have to change maybe 1% of the code to get it to compile as C. The red flag that you're dealing with a C-with-classes programmers is when you can do:
#define class struct
and everything still pretty much works.

Next up is the Java++ programmer. These are people comming out of the Java, Python, Ruby, etc. world. And they are still programming in said languages. They use objects a lot, but avoid using features not in these languages, like pointer arithmetic, templates, operator overloading, etc. The red flag that you're dealing with a Java++ programmer is when you start seeing classes with only virtual member functions all over the place (that's how you implement an interface, see...). Also, a tendancy to allocate everything on the heap and unbox almost nothing is also a common warning sign.

Lastly, there is the he-man C++ programmer. Generally, these people don't really know another language- several of them may have learned Pascal or Scheme back in college, but they're "grown-ups" using "grown-up languages" now (just ask them), but an increasing number of them learned C++ as their first language and never moved on. These programmers have memorized everything Bjarne Stroustrup ever wrote, and are out to prove it. The red flag that you're dealing with a he-man C++ programmer is that you find yourself having to look up what a private const static virtual constructor means, because that's critical to understanding why the code even works. He-man C++ programmers love template metaprogramming, which is possibly the most damning thing I can say about them. He-man C++ programmers never, ever eat quiche.

In my experience, there is a

In my experience, there is a fourth category: Those who wish they were in the third, but don't know enough C++ to pull it off. Warning sign: writing libraries that replicate existing libraries to augment C++, or writing libraries to duplicate functionality alreay present in the standard language, because they don't know about it.

In fairness, though, plenty of C++ code uses a lot of classes, as it is written by people who have graduated beyond the first category, but have yet to discover "advanced" features like STL, templates, or operator overloading in any meaningful way, or don't really need it that much, to grind out code.

Be carefull about who you call advanced

There's a natural tendancy to assume I listed the classes from least advanced to most advanced. Well, OK- the c-with-classes folks generally really don't know C++. But I tend to fall into the Java++ class, despite knowing all about templates, operator overloading, etc. Simply because a language provides a feature doesn't mean you should use it at every opportunity- as the IOCCC shows. And while the Java++ programmers may be ignorant of the more obscure corners of the languages, He-man C++ programmers are just as commonly (in my experience) ignorant of Design Patterns. I've seen He-man C++ programmers be completely baffled by the code written by Java++ programmers. Not that they didn't understand the mechanics of how the code worked, but because they didn't understand the metaphors and patterns the Java++ programmer was using.

You do know what quotes signi

You do know what quotes signify, right?

Yes

And I meant them.

Simply because you're up on the syntax and semantics of C++ doesn't necessarily make you a superior programmer over the guy who just got introduced to the language a week ago.

No, But...

...like money buying you happiness, it certainly helps a lot!

I meant in English. Simply be

I meant in English. Simply because you can think up snide asides doesn't make you more articulate.

stereotyping is not very constructive, and...

The red flag that you're dealing with a C-with-classes programmers is when you can do:
#define class struct
and everything still pretty much works.

by language definition every c++ program works just as well with the above macro applied. have a look at the c++ standard to figure out why...

you should at least support your overgeneralizing critique by providing correct accusations.

in my personal experience i have met just as many close-minded and/or incompetent c++ programmers as i have met such persons in most other language community of your choice. attitude and skill is usually not directly related to programming language preference.

Look at the wookie!

Disclaimer: following honored traditions pioneered at Slashdot, I haven't read the article being referred to. But I'm going to rip into it anyway. :)

If you have spent many years mastering a particular "way of thinking about programming", you probably aren't going to want to give it up on short notice.

No one wants to go from being a doctor or engineer in his "native" PL to being a taxi driver in someone else's.

I think a lot of technological "religion" flows from exactly this kind of learning opportunity cost.

True, and it results in a pretty irrational argument on the part of the authors of this article.

On the one hand, the article implicitly demonstrates the benefits of considering multiple approaches to solving a problem, in that by doing so, we can find solutions that meet a desired balance between e.g. expressivity and performance. That's how they achieved their new solution, after all.

OTOH, the implied conclusion is (apparently) that a major part of the possible solution space - i.e. functional solutions - should be excluded from consideration in general.

The reason for excluding such solutions boils down to the fact that they've previously been excluded, which is why something as simple as a basic recursive solution to a problem is seen as a "totally different way of thinking about programming". The reasoning is circular (dare I say recursive?) There's no underlying logic here, other than the need to defend a knowledge gap that exists for historical reasons. The same argument could have been made against using STL-based code in the first place.

To hide this weak argument, they've resorted to major obfuscation — conflating issues arising from different paradigms, different languages, different algorithms, differing amounts of library support, and implied issues like side-effect free programming, all without making any attempt to tease the issues apart. To top it all off, we are to believe that this single, highly confused example has some bearing on the general case.

All in all, we're dealing with the Chewbacca defense.

another quote from the article's Conlusion

the implied conclusion is (apparently) that a major part of the possible solution space - i.e. functional solutions - should be excluded from consideration in general.

I don't believe this is a fair characterization of K & M's conclusion. Let me give you another out-of-context quote from the same piece:

It is not essential to master every detail of every tool that you might ever need to use. What is essential is knowing that the tools exist, being open to the possibility of using them, and being willing to learn to how to use them when they are likely to be useful.

Thanks for Chewbacca Defense link by the way. I've been using the term ever since I saw the South Park episode in which this notion was introduced. I didn't realize, however, that the term had gained such wide recognition.

Not buying it

Koenig & Moo wrote:
It is not essential to master every detail of every tool that you might ever need to use. What is essential is knowing that the tools exist, being open to the possibility of using them, and being willing to learn to how to use them when they are likely to be useful.

I agree with this point in general, but it doesn't save the authors in this case. If someone is at a point where FP seems to require learning "a totally different way of thinking about programming", they're not going to meaningfully be able to keep that as a readily available option in their toolbox. Suddenly learning all about FP right at the point when it is "likely to be useful" isn't a great strategy. In any case, the comments you quoted do seem to imply something about the authors' opinion on the merits of investing in learning that different way of programming.

A big problem I have with all of this is what, specifically, the bogeyman is here — what are the elements of the "different way of thinking" they see as being required for the FP solution to this particular problem.

Presumably the problem isn't Haskell syntax in general, since that's a specific language issue, rather than anything intrinsic to the notion of FP. You could implement the same sort of code in C++.

If the problem is recursion, then what we're really talking about is the fact that mainstream languages have done a poor job of implementing recursion, primarily because they didn't know any better at the time. So while recursive solutions may be a totally different way of thinking about programming for some people, that's only because they've been Sapir-Whorfed into partial ignorance by the languages they use. The "different way of thinking" that is required here actually involves unlearning old limitations on thinking, and the only possibly valid argument against doing so is if the language you're using still doesn't support recursion well.

I can't imagine that the pattern matching is the issue — non-FP languages have that too, Python comes to mind. The lack of mutation in Haskell might be more of a concern, but that's somewhat orthogonal to real-world FP, and it's not news to anyone that mutation can have convenience and performance advantages.

In all these cases, the real issues involve learning principles which are more general, more widely applicable, than their specific expression in any single language, particularly mainstream languages. I would hope that someone who's actually teaching students would recognize this.

Myths and ideology in programming

Here is how my girlfriend, a literary critic, would analyze this.

According to Roland Barthes, a myth is an artificial cultural artifact which is presented as timeless, universal and natural. You can find examples in nationalism, advertising and media. Judging from the quoted text of the article, it sounds like the authors are using myths to reinforce a power structure where pedants and celebrities exert control over programmers and users.

The whole "different way of thinking" thing is part of the imperative ideology. It's a bogeyman playing on xenophobia that is used to make people uncomfortable about competing paradigms, and keep them in the fold of a community (culture) where what they read and hear can be controlled and censored.

They write that the FP approach is "straightforward -- but requires a totally different way of thinking about programming." Hm? It is both straightforward and "totally" alien? This borders on contradiction. Indeed, if a person can grasp the FP solution without having done any actual functional programming, just by reading a short article, and moreover regard it as "straightforward", one can hardly claim it is a "totally different way of thinking". Even if it is totally different from imperative thinking, whatever that is, it may not be incompatible with other sorts of thinking habits which their audience might broadly share. After all, we all get taught about equations, functions and, to some extent, recursion in high school. (At least factorial and the chain rule.) How alien can it be?

To be fair, though, there is also an FP (and LP) myth that because FP is "based on" mathematics it is natural and hence superior to procedural and OO paradigms. But mathematics is a cultural and historical artifact. I do not mean that mathematical truths are relative; I mean that the body of mathematical literature has taken a form which is a historical accident. For example, mathematics is often divided into geometry, analysis and algebra, but this could as easily have been geometry, analysis and coalgebra, in which case OOP advocates could have exploited this myth, OO being rather coalgebraic in nature.

Don't get me wrong. I think that there is a genuinely closer relationship between mathematical truth and FP, because FP has been analyzed in a thousand different ways in scientific literature, and FP languages tend to come with a semantics which is moreover effective for reasoning about programs. And I regard this as an advantage, because of the opportunity cost argument. But the "FP is more pure" (= natural) argument is bogus.

Myths and ideology in mathematics

But mathematics is a cultural and historical artifact. I do not mean that mathematical truths are relative

This reminds me of Proofs and Refutations by Imre Lakatos, a highly influential work in philosophy of mathematics. Since I haven't reread it in more than five years, my recollection may be faulty. (You can double-check it by following the above Amazon link which gives you access to the entire scanned book.)

I believe Lakatos would strongly agree that "mathematics is a cultural and historical artifact". I think he goes a little further and argues that mathematical truths are, in fact, relative. The above Wikipedia link gives a fairly good summary of his views.

Here's a quote from the book found by searching "inside the book" for absolute truth. It's from the footnote that spans pages 54 and 55:

'Nature confutes the sceptics, reason confutes the dogmatists' (Pascal, [1659], pp. 1206-7). Few mathematicians would confess [...] that reason is too weak to justify itself. Most of them adopt some brand of dogmatism, historicism or confused pragmatism and remain curiously blind to its untenability; for example. 'Mathematical truths are in fact the prototype of the completely incontestable... But the rigor of maths is not absolute; it is in a process of continual development; the principles of maths have not congealed once and for all but have a life of their own and may even be the subject of scientific quarrels.' (A.D.Aleksandrov [1956], p.7) (This quotation may remind us that dialectic tries to account for change without using criticism: truths are 'in continual development' but always 'completely incontestable'.)

from Lakatos back to Hamming

On Apr 7, 2005, I wrote:

This reminds me of Proofs and Refutations by Imre Lakatos

Interestingly, this brings us back to Hamming. I've been reading some of the papers mentioned in a recent new forum topic posted by Charles Stewart on Apr 17, 2005. Specifically, R.W. Hamming's essay The Unreasonable Effectiveness of Mathematics makes it clear that he was deeply influenced by the aforementioned work of Imre Lakatos.

A couple of quotes from the essay:

It is claimed that an ex-editor of Mathematical Reviews once said that over half of the new theorems published these days are essentially true though the published proofs are false. How can this be if mathematics is the rigorous deduction of theorems from assumed postulates and earlier results? Well, it is obvious to anyone who is not blinded by authority that mathematics is not what the elementary teachers said it was. It is clearly something else.

...

Indeed it seems to me: The Postulates of Mathematics Were Not on the Stone Tablets that Moses Brought Down from Mt. Sinai.

...

For over thirty years I have been making the remark that if you came into my office and showed me a proof that Cauchy's theorem was false I would be very interested, but I believe that in the final analysis we would alter the assumptions until the theorem was true. Thus there are many results in mathematics that are independent of the assumptions and the proof.

That reminds me

I had meant to respond to your earlier post. I simply want to say, that anyone who hasn't read "Proofs and Refutations" should (even if you aren't particularly mathematically inclined). It's enjoyable and quite good.

Whose myth?

What's missing in Your analysis is that claiming that FP employs a completely "different way of thinking" is not a myth generated by the imperative crowd originally but by the FP crowd itself. What happend is turning an optimistic revolutionary attitude towards programming ( "forget everything you know about C .." or "you will have a hard time to unlearn but it will be worthwhile in the end .." [1] ) into a conservative and defensive reaction ( "you don't need a completely different way of thinking to get the job done .. "). The authors simply play back the ideological distinction to their originators. The message always arrives it's receiver.

Kay

[1] "Think different" is/was an optimistic slogan of Apple Corp.

And if it's recursion...

How can you be a good programmer if you think that recursion is a "new way of thinking"? Good programmers understand recursion, even if they write mostly for loops.

Nth languages

I should point out that, though I don't share that PL conservative outlook, I don't think it is unreasonable either. If you have spent many years mastering a particular "way of thinking about programming", you probably aren't going to want to give it up on short notice.

I do find this somewhat unreasonable. It should not take years for an experienced programmer to master or at the very least become proficient in a new programming language even if there is a paradigm shift. And as one learns more languages it takes even less time (and it takes even less time without a paradigm shift). And it's not like you have to drop everything and hide away in a cave while you're learning.

Finally, this example alone is a pretty good example of why knowing multiple paradigms/languages is useful, and there are heaps of FP techniques being applied in OO languages nowadays providing another good incentive (at least for FP). Learning other paradigms/languages is not a waste of time.

The Haskell code irks me; it is more verbose than it needs to be. I assume this was done to make it more recognizable to non-Haskell (read C++) programmers.

scale n = map (n*)
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys)
    | x == y    = x : merge xs ys
    | x 

Unilingual Perquisites

I do find this somewhat unreasonable. It should not take years for an experienced programmer to master or at the very least become proficient in a new programming language even if there is a paradigm shift.

How many years (decades) did it take OO to be understood by mainstream programmers? Many still don't.

Just because some of us are good at this kind of thing shouldn't lead us to assume that most otherwise skilled programmers should be too.

From a sociological point of view, I think you skipped over the pithier part of my post:

No one wants to go from being a doctor or engineer in his "native" PL to being a taxi driver in someone else's.

The psychological benefits of sticking to what you know well are a pretty strong motivator.

Like an English speaker in any part of the world where English is widely spoken as a second language, the path of least resistance is always to revert to English when the going gets rough.

Professionalism

How many years (decades) did it take OO to be understood by mainstream programmers? Many still don't.

I'm not talking about the mainstream, I'm talking about individuals, particularly professionals (more the connotations of the word rather than the denotation).

Just because some of us are good at this kind of thing shouldn't lead us to assume that most otherwise skilled programmers should be too.

I have a feeling most people who are "good at this kind of thing" weren't born that way, have many languages under their belts, and found there first or second language took more time than the latter ones. In a nutshell, they are "good at this" because they have practiced it!

No one wants to go from being a doctor or engineer in his "native" PL to being a taxi driver in someone else's.

Part of my intent is that one isn't going to be a "taxi driver" for long. Further, if such people are really so insecure, they can keep it to themselves until they feel comfortable.

The psychological benefits of sticking to what you know well are a pretty strong motivator.

Like an English speaker in any part of the world where English is widely spoken as a second language, the path of least resistance is always to revert to English when the going gets rough.

Yep, this and the above are good reasons not to ever learn anything new at all!

I guess I should've stopped working at guitar, when I was feeling so clumsy trying to change chords smoothly or stopped drawing after circular filing some less-than-successful attempts, and there certainly were those CT papers that gave me a niced glazed look. Sure, I wouldn't be able to play guitar, draw, or know what a colimit is now, but at least I wouldn't have felt uncomfortable!

Pragmatism

Yep, this and the above are good reasons not to ever learn anything new at all!

There is a difference between advocating a position and observing its power "on the ground".

If we want to advocate new ways of doing programming, we have to confront the reasons why they are resisted.

If we don't care if the benighted wallow in ignorance until their dying day, well, we could ignore them (and we wouldn't be discussing this article at all.)

It is easier to change people's behaviour than to change their values. If we want them to adopt our favorite paradigm, we have to address the reasons they resist it.

If we don't care about that, we shouldn't get so mad that they resist it. ;-)

Just a thought about target audiences.

Wonders whether teaching any subject, requires the decision of whether to teach to the bottom, middle, or top of the class. Those that teach to the top are accused of elitism. Those that teach to the bottom are accused of dumbing down. And those that teach to the middle are accused of promoting mediocrity.

[edit note: And I'd prefer LtU to be accessible to all three at the same time]

[edit edit note: Ok, so we might not want the absolute dregs] :-)

Different Class

Not only that, but some of us are sitting in the Advanced Category Theory seminar while others are attending Object Oriented Programming With Java 101.

I personally want to see some Ruby-on-Rails-style tutorials for web programming with Haskell...

Old school

Being one who's mid-tier but hangs around smart people hoping some of that knowledge will rub off, I think it's important for the best and brightest to reach down (as well as up). It helps to touch bases with the basics periodically to help solidify the more abstract ideas (a dose of reality) and helps bring in new aspirants to mix up the field of ideas.

Of course, the alternative is to put the Oleg's and Noel's in a room by themselves, but they'd probably get bored with each other sooner or later. :-)

My approach

Or you create a community site like LtU, where demi-gods can share their wisdom... I really suggest people use this opportunity (e.g., Oleg's paper about delimited continuations currently being discussed).

Conciseness

The Haskell code irks me; it is more verbose than it needs to be.

Perhaps then you should also drop the matches on nil in merge.

scale n = map (n*)
merge (x:xs) (y:ys)
    | x == y    = x : merge xs ys
    | x <  y    = x : merge xs (y:ys)
    | otherwise = y : merge (x:xs) ys

seq = 1 : merge (scale 2 seq)
                (merge (scale 3 seq) (scale 5 seq))

Ironic

The irony is that the original definition of scale will only work on infinite lists whereas my version will work on finite or infinite lists, however the original version of merge would work on finite lists, but this version won't. It's a bit of an odd inconsistency in the original code.

IMHO the version of each func

IMHO the version of each function that works on finite lists is to be preferred, simply because that is least surprising. That said, a comment that as all lists inside the programme are infinite, termination conditions are not needed would probably remove the element of surprise.

Apples to oranges

The Haskell implementation is complete self-contained solution. It requires no reference to libraries. The C++ is simply leveraging off behavior provided by the template library. In terms of simplicity, the haskell solution just requires you learn that languages syntax, of which this example is a pretty plain jane example. For the C++, you have to learn it's immense syntax, figure out how it's used here, and then dive into the template library and figure out what to do there. And we are talking about a math problem here.

Default Assumptions

Chris Rathman: For the C++, you have to learn it's immense syntax, figure out how it's used here, and then dive into the template library and figure out what to do there.

I think Koenig and Moo's assumption is that a modern C++ programmer has done that, and therefore there won't be any mystery to their C++ solution, whereas even once you explain the Haskell syntax, the approach of pattern matching on algebraic data structures is still alien.

Of course, you still have a point; it's not the least bit alien to anyone conversant in any of the ML family, Haskell, Oz, Concurrent Clean...

The topic is a pertinent one to me. As I've written before, my day job involves a large-ish body of C++ written aggressively in the Microsoft MFC style, and I'm slowly attempting to untangle it and make it more idiomatic in the context of what it does. Since a lot of what it does is deal with containers, moving more to functionalism and the use of exceptions feels very natural to me, but folks who don't know the STL and boost::bind reasonably well find some of my code weird. So even within the C++ community, there's a knowledge/stylistic-preference gap that has to be addressed.

Wrong assumptions

One of the first languages I looked at after C and Java was OCaml (I read part of SICP before). Pattern matching algebraic data structures was the most natural thing. I immediately started hating C dialects for the (needless) lack thereof.

Often ML code is clearer than the infinitely tangled data and control flow that usually ends up in any non-trivial Java program, with all kinds of objects calling each other. No wonder that often a bug ends up somewhere inbetween!

That said, I like imperative features.

"Language" vs "Library" is often an artificial distinction.

Does it matter much that lists in ML are an "intrinsic" datatype, whereas std::set is part of the library (and is capable itself of being written in "plain" C++)? In my mind, no.

There are many good arguments to use in language flamewars; the language-vs-library argument isn't one of them. After all, if C++ had sets as an intrinsic datatype, that would just be further grist for the "it's a bloated langauge" mill. :)

Any competent C++ programmer should be familiar with the STL. That said, there are many who aren't--who still use it as little more than an OO extension to C.

It's a question of the amount of information conveyed

The Haskell example captures the abstraction in very large part. The C++ example shows a recipe for using the library to create a solution. There is an impedance mismatch between the depth of information the Haskell code conveys and what the C++ code conveys - being at once both a higher and a lower level of abstraction for the Haskell code.

Sure, libraries level the playing field. For 99.99% of the population that uses such things, the example given will just be a recipe to be copied and pasted. Either solution could be put in a library and added to the next generation of HQ9++ as a single character N (H is already taken).

The traditional mathematician recognizes and appreciates mathematical elegance when he sees it. I propose to go one step further, and to consider elegance an essential ingredient of mathematics: if it is clumsy, it is not mathematics.
- Edsger Dijkstra

Lists in ML; C++ templates

aren't part of the language. They are pattern-matched like everything else.

Lists in ML, just like strings in Java, have an extra built-in syntax, though, because their use is so common.

To C++: the best features of that language are that is makes some common C idioms easier, adds some typechecking and gives inheritance (which is overused, though).

OTOH, I still use C in preference, since C++'s assumptions to memory layout (variable sized structs aren't really possible as classes) are restrictive. Templates are the worst thing since unsliced bread, requiring everything the code to be compiled for EVERY freaking instantiation! In C I just put a void pointer in a structure, no need to compiled it 50 times. I don't use the STL&C++ for that reason.

Builtin syntax vs. standard library? Doesn't matter

Regarding the simplicity of a program, it doesn't matter which part of the language it uses is built into the language and which is a part of the standard library.

Actually if more is done in a library then it's a sign of a more flexible language. Since libraries can be added but compilers cannot be easily extended, it means that a random other problem has a higher chance of admitting a clean solution: even if it wasn't considered by language designers, it can be solved by library writers later.

If you say that it's unfair because it makes languages with lots of libraries appear more powerful than those without, then I reply that they appear more powerful because they are. Libraries matter.

A ready library might be less universal than the ability to implement a solution from scratch, but it definitely allows the task to be solved much easier when it applies.

Sign of a flexible language?

Actually if more is done in a library then it's a sign of a more flexible language.

The argument would hold more weight had the template library not required the C++ compiler to be majorly modified to get them to work.

More generally though, there are lots of smart programmers out there who can make libraries out of even the worst languages. The availability of libraries has much less to do with the quality of the
language and more to do with the shear numbers of people using the language to various ends.

Anyhow, my argument isn't about how simple it is to do this in C++ vs. Haskell. It's about the mismatch between the code samples. The C++ code has much of the detail hiding behind the library whereas the Haskell code is just fairly straight forward from that languages perspective. And just because the sample Haskell code does not use a library doesn't mean that it couldn't be implemented as a library for sequences.

There are a lot of things which are simpler in C++ but I fancy that infinite algebraic series is not among the top.

Of course... a language should support some basic set

of features.

You mention C++ templates (more generically, generic polymorphism). You are correct; implementing the STL--or anything like it--required this feature be present in the language; Java is now undergoing a similar evolution.

No argument there.

And as others have pointed out, the advantage that Haskell has over C++ in this instance is lazy evaluation and lazy data structures. One could port the Haskell algorithm (and it's O(n) performance) to C++, but the code would be an ugly hodgepodge of roll-your-own thunks and other assorted hackery, needed to emulate lazy evaluation in an otherwise eager language. (Think of a Scheme implementation of the same algorithm, then convert the uses of delay/force one would find into equivalent C++ machinery).

However, comparisons of the C++ and Haskell code above are comparing apples to oranges; they are different algorithms. One uses lazy evaluation of lists to select elements from an ever-expanding "frontier" of possible choices; the other uses an ordered associative container (a STL "set") to essentially sort the elements. The behavior of std::set is well-known to any C++ programmer (including performance characteristics, which are documented), and the meaning of the above code fragment is clear as day. Just as the meaning of the Haskell snippet is obvious to a competent Haskell programmer. Both samples are likely to be difficult to read for someone not conversant in the language.

Regarding "hiding detail"--where I come from, that's considered a good thing. I don't want to know (or care) that std::set is implemented using a red/black tree, for instance... such details are not germaine to the problem at hand. What salient detail did you think the C++ code obscured?

Guess I could be defensive to the bitter end

Of such intransigence are language wars made of. But I'll bow out under the premise that I glean more information from the Haskell code than the C++ code - and yes that could very well be because I've been working more with Hindley-Milner type systems of late and haven't worked much with C++ Templates.

But as you state, they are different algorithms, so I win my argument anyhow, even if I had to totally change my argument in the process. :-)

totally different?

Where the article presents "The Functional Solution" it is quite reasonable, introducing lazy lists gently, and ending with "[the definition of seq] is no more magical than our earlier example". The comment about a "totally different way of thinking" is not supported in the article, but it's surely intended to set up the C++ solution as preferable. Of course, it's the FP advocates who continually assert the difference.

The primary difference in the functional program is that it has a name for the infinite list that is the solution. The alternatives provide a data structure where the solution is accumulated, which to my eyes appears unfinished. The "idiomatic" solution generates the numbers OK, but does not provide an idiomatic way to access them, i.e. via an iterator. The article's conclusion is flawed, since its solution amounts to a fair demonstration of a hole in C++'s expressivity.

Full C++ listing please

What is the full C++ listing? The C++ listing given assembles an ordered set containing the numbers 1,2,3,5.

for the reader to complete

The OP contains all of the code from the article, and http://www.cuj.com/code/ doesn't have it yet. (And if you guess the name of the zip file for the April code, the code from the Koenig/Moo article isn't in that file, either.)

OP?

OP?

OP?

original post/poster [in the thread]

An attempt at a full C++ version

Below is an attempt at a full C++ version. It differs from the haskell version at the 38th number.

(edit) Initially I was confused as to why this is; I've realised that this is because of the order in which the numbers are produced. Increasing the 20 to 60, for example, causes the lists to agree in the 38th number. One might suggest that the haskell version has comprehensibility benefits.

#include <set>
#include <iostream>
using namespace std;
int main()
{
   set<int> seq;
   seq.insert(1);
   set<int>::const_iterator it = seq.begin();

   for(int i = 0; i < 20; ++i)
   {
      int val = *it;
      seq.insert(val * 2);
      seq.insert(val * 3);
      seq.insert(val * 5);
      it++;
   }

   it = seq.begin();
   std::cout << *(it++);
   for(; it != seq.end(); ++it)
      std::cout << "," <<  *it;
   std::cout << std::endl;
   return 0;
}

a deficiency in the C++ implementation

It differs from the haskell version at the 38th number. Anyone have an idea why?

Because you haven't computed the 38th number yet. You execute the body of the loop 20 times, generating the following 41 numbers:

1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36, 40, 45,
48, 50, 54, 60, 64, 72, 75, 80, 81, 90, 96, 100, 108, 120, 125, 135, 150, 160,
180

135 is (incorrectly) followed by 150.

If you iterate 30 times — instead of 20 — your program generates the following 56 numbers:

1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36, 40, 45,
48, 50, 54, 60, 64, 72, 75, 80, 81, 90, 96, 100, 108, 120, 125, 128, 135, 144,
150, 160, 162, 180, 192, 200, 216, 225, 240, 250, 270, 300, 320, 360, 375, 400

135 is now (correctly) followed by 144.

So, I think your example highlights a deficiency in the "idiomatic C++" implementation. There is no easy way to tell how many insertions one needs to perform in order to compute the Nth Hamming number correctly. Thanks for pointing this out.

P.S. I found this on the python list: Classical FP problem in python : Hamming problem (alternatively, see the entire thread)

I can't believe I wrote this...

There is no easy way to tell how many insertions one needs to perform in order to compute the Nth Hamming number correctly.

On the other hand, by simply iterating N times, you do at most O(1) too much (amortised) work.

This suggests the following implementation, which is a little closer to the Haskell version. Plus, it's about as short as the original C++ version if you ignore includes and typedefs.

#include <iostream>
#include <set>
#include <algorithm>
#include <boost/iterator/transform_iterator.hpp>

using namespace std;
using namespace boost;

typedef binder1st<multiplies<int> > scale_type;
typedef transform_iterator<scale_type, set<int>::iterator> scale_iterator;

main()
{
    set<int> s;
    s.insert(1);

    scale_iterator it2(s.begin(), bind1st(multiplies<int>(), 2));
    scale_iterator it3(s.begin(), bind1st(multiplies<int>(), 3));
    scale_iterator it5(s.begin(), bind1st(multiplies<int>(), 5));
    set<int>::iterator it = s.begin();

    for (int i = 0; i < 40; ++i)
    {
	s.insert(*it2++);
	s.insert(*it3++);
	s.insert(*it5++);
	if (i > 0) { std::cout << ", "; }
	std::cout << *it++;
    }
    std::cout << '\n';
}

This is almost the Haskell version. To do the Haskell version properly, and have constant time access, you need to implement merge. Here are the gory details, courtesy of Boost. Don't read if you're squeamish...

#include <iostream>
#include <list>
#include <algorithm>
#include <boost/iterator/transform_iterator.hpp>
#include <boost/iterator/iterator_facade.hpp>

using namespace std;
using namespace boost;

typedef binder1st<multiplies<int> > scale_type;
typedef transform_iterator<scale_type, list<int>::iterator> scale_iterator;

template<typename It1, typename It2>
class merge_iterator
    : public boost::iterator_facade<
	      merge_iterator<It1, It2>
	    , const int
	    , incrementable_traversal_tag
	>
{
public:
    merge_iterator(It1 p_it1, It2 p_it2)
	: m_it1(p_it1), m_it2(p_it2)
    {
    }

private:
    friend class iterator_core_access;

    void increment()
    {
	int x1 = *m_it1;
	int x2 = *m_it2;
	if (x1 <= x2) { ++m_it1; }
	if (x1 >= x2) { ++m_it2; }
    }

    const int& dereference() const
    {
	return std::min(*m_it1, *m_it2);
    }

    It1 m_it1;
    It2 m_it2;
};

main()
{
    list<int> s;
    s.push_back(1);

    std::cout << 1;

    scale_iterator it2(s.begin(), bind1st(multiplies<int>(), 2));
    scale_iterator it3(s.begin(), bind1st(multiplies<int>(), 3));
    scale_iterator it5(s.begin(), bind1st(multiplies<int>(), 5));
    merge_iterator<scale_iterator,
	           merge_iterator<scale_iterator, scale_iterator> >
	    it(it2, merge_iterator<scale_iterator, scale_iterator>(it3, it5));

    for (int i = 1; i < 40; ++i, ++it)
    {
	int x = *it;
	std::cout << ", " << x;
	s.push_back(x);
    }
    std::cout << '\n';
}

It's a lot less ugly than I thought it would be.

Interestingly, the hardest part that I found was getting the typedefs and other type declarations right. The actual code was easy. Especially since I ignored termination conditions on the input iterators.

One thing which isn't covered here, of course, is that the Hamming number problem is a very artificial problem in the first place.

I note that the "full haskell

I note that the "full haskell C++" version does require a lot more boilerplate and declaration than the haskell version, which does rather obscure its meaning, at least at first glance. The same is not true of the "semi-haskell" version I feel, which is probably clearer than either of my versions.

To be fair...

It's true that the "full Haskell C++" version has more boilerplate, but in fairness, I used a pile driver to crack an egg: hamming_iterator is a fully conforming proposed-standard iterator, which is arguably overkill.

In writing it, I found that the declarations were the most tricky parts. The boilerplate (and it's not clear that it's all boilerplate; it's there because C++ iterators can do things that Haskell lazy lists can't, such as allow clients to modify what they point to if the container allows it) was fairly mechanical, didn't take long, and worked first time more or less.

I agree with Scott Turner that I'd rather do a recreational programming problem like this in Haskell. On the other hand, I learned stuff doing it this way in C++.

I think the most disturbing remark in the original article is that the Haskell solution "requires a totally different way of thinking about programming". If you're a professional software engineer on a deadline, and you can see a solution that's easy for you to understand and good enough, then you should do it. Your time is valuable.

In this case, though, we're talking about Hamming numbers. Why you would bother with such a simple, artificial, recreational programming problem if you weren't trying to learn something?

C++ still not comparable to Haskell

That matches the efficiency of the Haskell, but not the expressiveness. The Haskell version has a name 'seq' which embodies the infinite sequence of Hamming numbers. Very straightforward. In C++ that should be represented by an iterator. In the above programs, neither s.begin() nor 'it' is the complete answer.

Below is the implementation an iterator class hamming which represents the infinite sequence of numbers.

[includes and typedefs just like "almost Haskell" version above]

set<int> set_of_one () {
    set<int> s;
    s.insert(1);
    return s;
}

class hamming
    : public boost::iterator_facade<
	      hamming
	    , const int
	    , incrementable_traversal_tag
	>
{
public:
    hamming ()
      : s(set_of_one()),
	it2(s.begin(), bind1st(multiplies<int>(), 2)),
        it3(s.begin(), bind1st(multiplies<int>(), 3)),
        it5(s.begin(), bind1st(multiplies<int>(), 5))
    {
    	it = s.begin();
    }

private:
    friend class iterator_core_access;

    void increment()
    {
        s.insert(*it2); it2++; // Fix bug affecting the first increment.
	s.insert(*it3++);
	s.insert(*it5++);
	it++;
    }

    const int& dereference() const
    {
        return *it;
    }

    set<int> s;
    set<int>::iterator it;
    scale_iterator it2;
    scale_iterator it3;
    scale_iterator it5;
};

Fixing the bug in the "almost Haskell" version, and figuring out a way to initialize everything was a firm reminder of why I do my recreational programming in Haskell. It's not just that the C++ version is more verbose; it's more complex and error-prone. A class like hamming can be useful, and "a whole new way of thinking" is not necessary to understand that. Agreed, the problem is not routine, but it remains a good example of the merits of FP.

One small thing...

I carefully used list rather than set in my "full Haskell" version. The difference is O(n log n) vs O(n) time.

Another modification

I should mention that the following version behaves slightly more intuitively:

#include <set>
#include <iostream>
using namespace std;
int main()
{
   set<int> seq;
   seq.insert(1);
   set<int>::const_iterator it = seq.begin();

   for(int i = 0; i ::const_iterator it2 = seq.begin();
   std::cout 

python version

I should mention that the following version behaves slightly more intuitively:

Right, we end up generating an ordered set of 72 numbers (1, 2, ..., 675, 720) of which only the first 40 can be trivially proven to be consecutive Hamming numbers. The rest of the sequence contains gaps.

For archival purposes, here's the Python version:

# http://mail.python.org/pipermail/python-list/2005-January/262118.html
# http://mail.python.org/pipermail/python-list/2005-January/262277.html
# http://mail.python.org/pipermail/python-list/2005-January/262292.html

# Requires Python 2.4

from itertools import tee, imap

def imerge(xs, ys):
    x = xs.next()
    y = ys.next()
    while True:
        if x == y:
            yield x
            x = xs.next()
            y = ys.next()
        elif x < y:
            yield x
            x = xs.next()
        else:
            yield y
            y = ys.next()


def hamming():
    def _hamming():
        yield 1
        for n in imerge(imap(lambda h: 2*h, hamming2),
                        imerge(imap(lambda h: 3*h, hamming3),
                               imap(lambda h: 5*h, hamming5))):
            yield n
    hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
    return result

for i in hamming():
    print i,

The version you replied to pr

The version you replied to printed out the first 40 hamming numbers correctly. Replacing 40 in the programme with n would result in the first n hamming numbers being printed. For the record, I guessed that the modification would rectify the problem, rather than reasoning about the programme.

zerg

I may be reading too much into this quote, but it sounds to me like Koenig and Moo consider it a bad thing to require a "totally different way of thinking about programming".

In my Programming Paradigms & Foundations class w/ Dr. Koenig, so far we've covered Algol, Smalltalk, Icon, APL and Simula. We've tried as much as possible to compare language features to Java, C++, Scheme and Python. We've talked about Djikstra's structured programming and next week we will talk about John Backus' "Can Programming Be Liberated from the von Neumann Style?"

In no way at any time have I gotten the impression that he would think that "a totally different way of thinking about programming" would be a bad thing. This includes the first lecture, which was more or less the article he & his wife published in the 2005 April CUJ.

Dodo-egg Omelette

Algol, Smalltalk, Icon, APL and Simula... Java, C++, Scheme and Python

Pardon me for being underwhelmed; only one unambiguously FP language in the bunch, and heavily weighted to PLs that predate C++ (and which thereby could have had their features co-opted or rejected during the development of C++.)

Everybody thinks they're open-minded; they just have a few things they think are too weird to consider seriously. ;-)

Well, I believe he has used A

Well, I believe he has used Algol & APL during his work @ AT&T, but I see what you're saying. Still, I'm not convinced that he's outright rejected functional programming and I don't see his article as a warning to stay away from FP... After all, it is called C/C++ Users Journal, and not Haskell Users Journal...

w/ respect to your post's title, I believe you will only pry Squeak from its users' cold, dead hands. :P Also, I'm pretty sure Icon is where Python got its generators from (I've been wrong about this sort of thing before).

(Is there such a thing as HUJ?)

Generators

Also, I'm pretty sure Icon is where Python got its generators from
> (I've been wrong about this sort of thing before)

You're certainly OK on this point. It should be noted though that while the concept of generators from Icon made it into Python (largely I believe due to Tim Peters), the way in which generators integrate into the language is entirely different. Icon's underlying expression language is massively different to Python's, and generators are a natural fit into Icon, whereas in Python they're rather restricted. Some well known Icon people feel that Python's generators don't compare to Icon's and I think they have a point.

Koenig and FP

For what it is worth:

"In his 35+ years as a programmer, Andrew Koenig has written an online student registration system in APL, a software distribution system in a mixture of C and C++, pattern matching libraries in C++, ML..."

Bio Quote taken from http://www.artima.com/weblogs/viewpost.jsp?thread=54469

Perhaps of more interest:
"An anecdote about ML type inference" By Andrew Koenig
http://www.usenix.org/publications/library/proceedings/vhll/full_papers/koenig.a

Logic paradigm solution for Hamming numbers



stratify hamming(N) by [N,hamming].
stratify print(N) by [N,print].
stratify hamming << print.
hamming(1).
hamming(New) <-- hamming(Old), New is Old*2, New < 100000.
hamming(New) <-- hamming(Old), New is Old*3, New < 100000.
hamming(New) <-- hamming(Old), New is Old*5, New < 100000.
print(Num) <-- hamming(Num).


via Starlog.

a Prolog solution

The Starlog solution looks much neater than, say,

/*
  http://lml.ls.fi.upm.es/~jjmoreno/jjpapers/expr_imp.ps
*/

generate(dif(Twos, [V2|L2]), dif(Threes, [V3|L3]), dif(Fives, [V5|L5])) :-
        select(Twos, Threes, Fives, V),
        dequeue(Twos, Twos1, V),
        dequeue(Threes, Threes1, V),
        dequeue(Fives, Fives1, V),
        V2 is 2*V,
        V3 is 3*V,
        V5 is 5*V,
        write(V),  nl,
        generate(dif(Twos1, L2), dif(Threes1, L3), dif(Fives1, L5)).

select([X1|_L1], [X2|_L2], [X3|_L3], X1) :- X1 =< X2, X1 =< X3.
select([X1|_L1], [X2|_L2], [X3|_L3], X2) :- X2 < X1, X2 =< X3.
select([X1|_L1], [X2|_L2], [X3|_L3], X3) :- X3 < X1, X3 < X2.

dequeue([X|L], L, X).
dequeue([Y|L], [Y|L], X) :- X \= Y.

hamming :- generate(dif([1|L2], L2), dif([1|L3], L3), dif([1|L5], L5)).

Remarkably concise Haskell function

hamming = 1 : map (*2) hamming # map (*3) hamming # map (*5) hamming
    where xxs@(x:xs) # yys@(y:ys)
              | x==y = x : xs#ys
              | x<y  = x : xs#yys
              | x>y  = y : xxs#ys

via Answers.com.

Number signs?

xs#ys

Are those number signs supposed to be "++"?

Local infix operator # (merge)

No, # is the infix operator that's defined as a local function in hamming's where clause.

Back to drawing board

No, # is the infix operator that's defined as a local function in hamming's where clause.

Ah, obvious now that you point it out.

I guess I either have to go back to the drawing board with Haskell, or that code was a bit TOO concise for my own good. ;-)

Space Complexity

Has anyone done a space complexity analysis of the Haskell program?

Re: Space Complexity

I think it's fair to ask the same question of the C++ implementation.

In the Haskell case, you need to "remember" (some of) the previously computed numbers. To be more precise, to compute the Nth Hamming number HN, you need to remember the previously computed numbers in the interval of [HN / 5, HN].

In the C++ case, you end up carrying around all of the numbers in the interval of [1, HN] plus "most" of the numbers in the interval of [HN, 5*HN].

Since I don't have the required number-theoretical skills to estimate the sizes of the aforementioned sequence intervals, I ran a quick numeric experiment instead. The following listing shows the first 12 Hamming numbers. The blue lines show the sequence that the Haskell implementation must "remember" at each step. The red lines show the set of numbers computed by the K & M's C++ implementation at each step. To keep the sequences visually aligned, I use underscores to show the current gaps in the C++ ordered set. These show the positions of currently missing Hamming numbers that will be filled at some later point.

In the Haskell case, I use the following conventions to indicate the current end of the "scaled" lazy lists:

  • the underlined number is the end of the list scaled by 2;
  • the bold number is the end of the list scaled by 3;
  • the crossed-out number is the end of the list scaled by 5.
1
1 2 3 _ 5

1 2
1 2 3 4 5 6 _ _ 10

1 2 3
1 2 3 4 5 6 _ 9 10 __ 15

1 2 3 4
1 2 3 4 5 6 8 9 10 12 15 __ __ 20

  2 3 4 5
1 2 3 4 5 6 8 9 10 12 15 __ __ 20 __ 25

  2 3 4 5 6
1 2 3 4 5 6 8 9 10 12 15 __ 18 20 __ 25 __ 30

  2 3 4 5 6 8
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 __ 30 __ __ 40

  2 3 4 5 6 8 9
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 __ __ 40 45

    3 4 5 6 8 9 10
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 __ __ 40 45 __ 50

    3 4 5 6 8 9 10 12
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 __ 36 40 45 __ 50 __ 60

      4 5 6 8 9 10 12 15
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 __ 36 40 45 __ 50 __ 60 __ __ 75

      4 5 6 8 9 10 12 15 16
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 40 45 48 50 __ 60 __ __ 75 80

Unfortunately, this doesn't really tell you anything about the asymptotic behavior of each respective implementation.

Re: Space Complexity

Since I don't have the required number-theoretical skills to estimate the sizes of the aforementioned sequence intervals

(Other than observing trivially that the space complexity is no worse than O(n) in both cases.)

Not convinced it is trivial

Thanks for taking the time to write out the sets!

While observing the "previously computed" numbers does make it appear as if the Haskell version doesn't require storage greater than N, this is not neccessarily the case. There is also a need for temporary storage as well at each iteration, which could possibly increase asymptotically greater than the rate of N. Is it easy to do the analysis of the cost of the lazy evaluation temporary storage?

expected Haskell behavior in pseudo-code

... does make it appear as if the Haskell version doesn't require storage greater than N, this is not neccessarily the case.

The O(n) estimate is based on the assumption that given a moderately smart compiler, the Haskell implementation can work internally along the following lines (excuse my bad Scheme):

(define (make-hamming)
  (let* ((tail '(1))
         (s2 tail)
         (s3 tail)
         (s5 tail))
    (lambda ()
      (let* ((result (car tail))
             (h2 (* 2 (car s2)))
             (h3 (* 3 (car s3)))
             (h5 (* 5 (car s5)))
             (next (min h2 h3 h5)))
        (set-cdr! tail `(,next))
        (set! tail (cdr tail))
        (if (= h2 next)
            (set! s2 (cdr s2)))
        (if (= h3 next)
            (set! s3 (cdr s3)))
        (if (= h5 next)
            (set! s5 (cdr s5)))
        result))))

Storage-wise, this behaves like the blue sequence in my previous comment, provided that the above snippet really works the way I think it does.

Re: Space Complexity -- Haskell wins

To follow up on my earlier comment

the space complexity is no worse than O(n) in both cases.

In the C++ case, the O(n) estimate cannot be improved, given an implementation like the one fleshed out in marcin's post.

The Haskell implementation seems to be doing a little better than O(n) based on the following empirical data:

Step # Tail size tail size / step #
10 7 0.70
100 44 0.44
1000 228 0.23
10,000 1,108 0.11
100,000 5,253 0.05
1,000,000 24,623 0.02

The table basically says that assuming the Haskell implementation behaves as described in this post, it needs to remember 44 previous numbers in order to compute the 100th. It needs to remember 24,623 previous numbers in order to compute the 1,000,000th. On the face of it, that seems to be more efficient than O(n).

(As a sanity check for my Python script, I should probably ask Anton to verify whether or not the 1,000,000th Hamming number is 519381797917090766274082018159448243742493816603938969600000000000000000000000000000, assuming you count 1 as the 0th number.)

Sanity confirmed (in some narrow sense)

(As a sanity check for my Python script, I should probably ask Anton to verify whether or not the 1,000,000th Hamming number is 519381797917090766274082018159448243742493816603938969600000000000000000000000000000, assuming you count 1 as the 0th number.)

Yes, it is.

> seq !! 0
1
> seq !! 1000000
519381797917090766274082018159448243742493816603938969600000000000000000000000000000
> elemIndex 519381797917090766274082018159448243742493816603938969600000000000000000000000000000 seq
Just 1000000

That last expression used a cut & paste from the original quoted number, to make sure they matched.

Approximation

Sloane's on-line encyclopedia of integer sequences is always a useful reference. In this case it provides the approximation
f n = (ln (n * sqrt 30)) ** 3 / (6 * ln 2 * ln 3 * ln 5) + O(lnln n)
for the number of hamming numbers smaller than n. Let's forget about the last term, and invert the formula to obtain the nth hamming number
h n = exp ((n * (6 * ln 2 * ln 3 * ln 5))**(1/3)) / sqrt 30
Test: h 1000000 = 5.183690169e83: close enough for me. Now the tail size, as you call it, is
tailsize n = n - f (h n / 5)
Test:

Step # Tail size Approx
10 7 7.66
100 44 44.52
1000 228 228.31
10,000 1,108 1,108.77
100,000 5,253 5,254.48
1,000,000 24,623 24,624.33

The approximation is not far off. The right-hand side of tailsize can be rewritten as

tailsize
    =   (ln 5 * 3*c**(-1/3)*n**(2/3) - 3*(ln 5)**2 *c**(-2/3)*n**(1/3) + ln 5**3/c)
    where
        c =   6*(ln 2)*(ln 3)*(ln 5)

and this approximately

tailsizeapprox n
    =   2.48*n**(2/3) - 2.05*n**(1/3) + 0.57)

So yes, this is sublinear.

Now, all the complexity measures discussed so far seem to assume that integer operations are constant time and integers are constant size. If we take into account that the size of the hamming numbers grows than it changes. The size in bits of the nth hamming number is

sizeh n = log (h n) / log 2
The log nicely cancels out with the exp in (h n),
sizeh n = (2*(c*n)**(1/3)-ln 2-ln 3-ln 5)/(2 * ln 2)
    where
        c =   6*(ln 2)*(ln 3)*(ln 5)
This function has the form
sizeh n = c1 + c2 * n**(1/3)
We can approximate the size of the first n hamming numbers sum [sizeh n | i <- [0..n]] with the integral
sizehrange n = c1 * n + 3/4 * c2 * n**(4/3)
The size of the hamming numbers in the tail is
tailsize = sizehrange n - sizehrange (f (h n / 5))
Now this gets messy, but the function is still dominated by the n**(4/3) factor, which makes it superlinear. For example tailsize 1,000,000 = 6819368.46 bit = 832KB. Does this seem about right?

Perl implementation with lazy streams

Thanks for your analysis. Beautifully done. Thanks a million.

For example tailsize 1,000,000 = 6819368.46 bit = 832KB. Does this seem about right?

Since I don't have a Haskell compiler installed, I'm going to have to pass this question on to Anton "the sanity checker" van Straaten. (My Python script doesn't currently discard numbers that are no longer needed and I'm too lazy to fix it.)

As an aside, the AT&T Encyclopedia of Integer Sequences links to a nice article by Mark Dominus (the author of Higher-Order Perl) that shows how to solve the Hamming problem in Perl using Mark's own lazy streams module.

Aside: time complexity

This reminds me of something I have wondered: what is the time complexity of append (as in ++) in a lazy language? I've heard that it's O(n) just like in a strict language but in my fuzzy mental picture I expect it to be O(1). Know what I mean?

What, because strictly speaki

What, because strictly speaking, a lazy language ought to print out [1,2,3]++[4,5,6] by printing the elements of the first list, then the elements of a second, rather than running over the first list, adding it into a new list, then adding the elements of the second to that list, then run over that new list printing out its elements?

IO lists

Right.

In Erlang we have a very useful related datatype called an IO list to represent binary data that will ultimately be written to a file or socket. While we represent plain binary data as either a string (list of 8-bit integers) or a binary (array of 8-bit integers) an IO list is a "deep list" (tree of conses) of these elements. What this means is that if you have two chunks of data to join together you can cons them into an IO list in O(1) time instead of having to concatenate them in O(n) time. In practice this makes it very easy to avoid the pitfall of spending O(n^2) time to accumulate large binary messages from the repeated concatenation of smaller parts.

Take HTML generation in the Yaws webserver for an example:

ehtml_expand({Tag, Attrs, Body}) when atom(Tag) ->
    Ts = atom_to_list(Tag),
    NL = ehtml_nl(Tag),
    [NL, "<", Ts, ehtml_attrs(Attrs), ">", ehtml_expand(Body), "</", Ts, ">"];
If the last line was instead NL ++ "<" ++ Ts ++ ... then in a strict language like Erlang we would take O(n^2) time to generate the HTML. How about in e.g. Haskell?

Creating these IO lists is cheap but of course they will ultimately need to be written to a file descriptor. At this point either the runtime system concatenates the IO list into a byte array in memory (O(n) using relatively fast C code - the list_to_binary primitive) or a writev system call could be used and might avoid ever concatenating the data in memory at all.

This has a lot in common with Boehm's most excellent cord data structure.

Not sure,

but if I remember correctly, there are both amortized and worst-case O(1) lazy append's given in CTM and I think I saw them in Okasaki too (Purely Functional Datastructures).

Functional nearly-one-liner

ham1:: [Int] 
ham1 = y  
where y = [1:merge (map ((*) 2) y) (merge (map ((*) 3) y) (map ((*) 5) y))]

via Clean.

Why can this not be written a

Why can this not be written as one or two lines? Why are the last two lines separate?

I just paste them...

...as the author posts them. Call it a two-liner then.

Top-level sharing

In Clean by default top level functions are reevaluated.
So if you write this as

ham1:: [Int] 
ham1 = [1:merge (map ((*) 2) ham1) (merge (map ((*) 3) ham1) (map ((*) 5) ham1))]

then ham1 is reevaluated over and over again.

There's a special definition symbol (=:) to indicate that the "http://foldoc.hld.c64.org/foldoc.cgi?CAF">CAF should be shared.

ham1:: [Int] 
ham1 =: [1:merge (map ((*) 2) ham1) (merge (map ((*) 3) ham1) (map ((*) 5) ham1))]

Haskellish nearly-one-liner

hamming :: Num -> [Num]
hamming n = [i | i <- [1..n], i `mod` 2 == 0 || i `mod` 3 == 0 || i `mod` 5 == 0 ]

via Professor Jeff Rohl using the old Haskell relative, Gofer.

Correctness...

Pasted verbatim, not checked. Possible errors: logical operator, assumption that all numbers divisible by 2,3, and 5 are strictly Hamming numbers.

210?

Isn't a Hamming number. The missing bit of spec is "only", as in "only divisible by 2, 3, and 5".

Functional one-liner in Mathematica

(Edit. See my later comments for code corrections and remarks. There's nothing wrong with the idea here - though the code is incorrect, it's easily fixed.)

Sort[Flatten[Table[2^i*3^j*5^k,{i,0,5},{j,0,5},{k,0,5}]]]

This one-liner is trivial, lucid, and fast. I'm a C++ guy too, but not provincial. C++ is wrong for Hamming numbers.

Using the same command format on a 1.0 GHz PC, Mathematica 5.1.1 computed and sorted the first 27000 Hamming numbers in 0.3 second. The 27000th number is 6863037736488300000000000000000000000000000.

In all cases memory allocation is optimal because the array is allocated once according to the Table size. The C++ code is another question.

The C++ code fails outright at the 10791st Hamming number. This is the first Hamming number greater than 2^64. If the hardware is only 32-bit, C++ fails at the 1846th Hamming number. Big-ints are the worry, not efficiency. Mathematica handles them seamlessly, as do certain other languages. C++ does not. This example makes being "close to the hardware" a curse.

And I seriously doubt that C++ is faster with "small" Hamming numbers. Mathematica's computation of the first 216 Hamming numbers was essentially instantaneous. They are shown below as output from the one-liner.

{1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36, 40, 45, 48, 50, 54, 60, 72, 75, 80, 81, 90, 96, 100, 108, 120, 125, 135, 144, 150, 160, 162, 180, 200, 216, 225, 240, 243, 250, 270, 288, 300, 324, 360, 375, 400, 405, 432, 450, 480, 486, 500, 540, 600, 625, 648, 675, 720, 750, 800, 810, 864, 900, 972, 1000, 1080, 1125, 1200, 1215, 1250, 1296, 1350, 1440, 1500, 1620, 1800, 1875, 1944, 2000, 2025, 2160, 2250, 2400, 2430, 2500, 2592, 2700, 3000, 3125, 3240, 3375, 3600, 3750, 3888, 4000, 4050, 4320, 4500, 4860, 5000, 5400, 5625, 6000, 6075, 6250, 6480, 6750, 7200, 7500, 7776, 8100, 9000, 9375, 9720, 10000, 10125, 10800, 11250, 12000, 12150, 12500, 12960, 13500, 15000, 16200, 16875, 18000, 18750, 19440, 20000, 20250, 21600, 22500, 24300, 25000, 27000, 28125, 30000, 30375, 32400, 33750, 36000, 37500, 38880, 40500, 45000, 48600, 50000, 50625, 54000, 56250, 60000, 60750, 64800, 67500, 75000, 81000, 84375, 90000, 97200, 100000, 101250, 108000, 112500, 121500, 135000, 150000, 151875, 162000, 168750, 180000, 194400, 202500, 225000, 243000, 253125, 270000, 300000, 303750, 324000, 337500, 405000, 450000, 486000, 506250, 540000, 607500, 675000, 759375, 810000, 900000, 972000, 1012500, 1215000, 1350000, 1518750, 1620000, 2025000, 2430000, 2700000, 3037500, 4050000, 4860000, 6075000, 8100000, 12150000, 24300000}

Mathematica not comparable to Haskell

Using the same command format on a 1.0 GHz PC, Mathematica 5.1.1 computed and sorted the first 27000 Hamming numbers in 0.3 second. The 27000th number is 6863037736488300000000000000000000000000000.

Compare this to ghc on a 1.3 GHz box. Using the concise source code the program calculates the 27000th number correctly as 3983823817623774429511680, in 0.12 second.

The Mathematica code is very nice as long as you realize it demands care, because it generates many but not all of the Hamming numbers.

some missing

Your output implies that all Hamming numbers between 7776 and 24300000 are multiples of 5.

coincidently, 7776 = 2^5 * 3^5.

Haskell sez: bring it!

I'm not even a Haskell programmer, but Haskell's spirit is channelling through me and forcing me to point out that it can do this very concisely too:

hamming = sort [2^i*3^j*5^k | i <- [0..5], j <- [0..5], k <- [0..5]]

Of course, it suffers from the same drawbacks pointed out by Scott.

Templates to the rescue!

The C++ code fails outright at the 10791st Hamming number. This is the first Hamming number greater than 2^64. If the hardware is only 32-bit, C++ fails at the 1846th Hamming number. Big-ints are the worry, not efficiency. Mathematica handles them seamlessly, as do certain other languages. C++ does not. This example makes being "close to the hardware" a curse.

If you look at the sample C++ (iterators) versions above, you can see that by replacing the type held in the list<> or set<> you can get different storages. If you used something like GMP's big int's library, I'd imagine everything would carry on as if nothing had changed. That's where "close to the hardware" gives you choice.

Templates to the Rescue, Indeed

genneth: If you look at the sample C++ (iterators) versions above, you can see that by replacing the type held in the list<> or set<> you can get different storages. If you used something like GMP's big int's library, I'd imagine everything would carry on as if nothing had changed. That's where "close to the hardware" gives you choice.

On the contrary, that's where parametric polymorphism gives you choice, and C++ is a relative latecomer to the parametric polymorpism game. OK, not as late as Java... :-) In fact, C++'s parametric polymorphism still isn't as expressive as, e.g. O'Caml's. This isn't legal C++:

template <typename T>
typedef foo <T> bar <T>;

whereas this is legal O'Caml:

type 'a bar = 'a foo

C++ and parametric polymorphism

C++'s parametric polymorphism still isn't as expressive as, e.g. O'Caml's. This isn't legal C++:
template <typename T>
typedef foo <T> bar <T>;
whereas this is legal O'Caml:
type 'a bar = 'a foo 
That's true, but not the greatest example, since in C++ you can do
template<typename T>
struct bar {
  typedef foo<T> type;
};
and
typename bar<T>::type
is now a type alias.

My first complaint about C++ parametric functions would be that C++ parametric functions can't be passed as arguments.

My first complaint about

My first complaint about C++ parametric functions would be that C++ parametric functions can't be passed as arguments.

the lack of first class parametric functions is indeed one of the more serious issues with c++. a solution to the problem is possible through wrapping the parametric function inside a non-parametric object. ugly, but it works:

struct foo {
    template<typename T>
    static void bar(T t) { }
};

template<typename T>
void test(T f) { T::bar("test"); }

int main() {
    test( foo() );
}

while many defiancies in c++ (like the aforementioned generic typedefs) will be addressed in near future, this and similar issues are unfortunately bound to stay because of the kludgy parametricity design.