Lambda the Ultimate

Making the Case for PHP at Yahoo!
started 10/29/2002; 11:31:58 PM - last post 11/2/2002; 1:51:10 PM
Ehud Lamm - Making the Case for PHP at Yahoo!
10/29/2002; 11:31:58 PM (reads: 2629, responses: 17)
Making the Case for PHP at Yahoo!
This is a very interesting presentation. Of specific interest are the detailed discussion of how Yahoo went about choosing a server side programming language, and the critique of in-house DSLs.

I am not sure I agree with the conclusion about in-house languages. I think the Yahoo experience does not really match the typical case. However, the size of the system and its history make this an interesting case study.

I'd like to know which of these languages and systems is the one from Revenge of the Nerds.

(PDF version)


Posted to general by Ehud Lamm on 10/29/02; 11:33:37 PM

Noel Welsh - Re: Making the Case for PHP at Yahoo!
10/30/2002; 2:25:54 AM (reads: 2529, responses: 3)
I was struck by how poor yScript2 was. I can't believe two Stanford PhDs came up with a system like that (no functions!). They should have hired an LtU reader ;-)

Ehud Lamm - Re: Making the Case for PHP at Yahoo!
10/30/2002; 2:28:18 AM (reads: 2616, responses: 2)
Yeah, that's sort of what I thought.

The language was crippled, but to conclude that all in-house languages are so mediocre would be wrong.

Ehud Lamm - Re: Making the Case for PHP at Yahoo!
10/30/2002; 2:38:13 AM (reads: 2514, responses: 0)
Also see this and this.

Noel Welsh - Re: Making the Case for PHP at Yahoo!
10/30/2002; 3:18:58 AM (reads: 2669, responses: 1)
I agree with their decision that it's a waste of resources to maintain their own language when there are so many fine languages already out there. This argument wouldn't hold if they were doing anything special with their language, such as the stuff that bigwig does, but judging from the talk they haven't raised their abstraction level very far, so they made the right decision.

Michael Vanier - Re: Making the Case for PHP at Yahoo!
10/30/2002; 3:58:19 AM (reads: 2513, responses: 0)
The system that Paul Graham and his partner wrote in Common Lisp was Yahoo Store. I think I read somewhere that it was re-written in C++, which either hurts Paul's arguments or shows Yahoo to be very conservative. I was very surprised not to see a mention of this in the presentation. I emailed Paul and asked him about it; I'll let you guys know what he says (or else he can just post here ;-)).

Ehud Lamm - Re: Making the Case for PHP at Yahoo!
10/30/2002; 4:01:45 AM (reads: 2768, responses: 0)
Sure.

You can take this argument a step further. Inventing yet another general-purpose language is often a waste of time. Building DSLs, however, can be a useful software engineering technique. Hence, most language designers are likely to find themselves hired to design in-house DSLs.

This, of course, means we need better tools for building DSLs.

Isaac Gouy - Re: Making the Case for PHP at Yahoo!
10/30/2002; 12:53:33 PM (reads: 2457, responses: 1)
How pragmatic they are!

Paraphrasing: Our infrastructure is FreeBSD, we don't want to mess with that. Java solutions won't work well on FreeBSD, so they aren't an option.

Are most real-world choices of programming language this constrained?

Bryn Keller - Re: Making the Case for PHP at Yahoo!
10/30/2002; 1:37:13 PM (reads: 2531, responses: 0)
At least that constrained, in my experience.

Must run on this platform, must have a compiler on that platform, must have an interface for this database, must work with that tool, must be marketable (!), etc.

Michael Vanier - Re: Making the Case for PHP at Yahoo!
10/30/2002; 2:25:09 PM (reads: 2421, responses: 1)
... must be able to find lots of programmers fluent in the language, must be currently fashionable, must be 100% buzzword-compatible ;-)

Actually, in my experience (and that of most other people I've talked to) language quality, or even language appropriateness for the job, is the absolute last thing that gets considered. The reason for this is that the people making the technical decisions are often (usually?) not the engineers, and even when engineers do make the decisions they usually don't know much about computer languages.

Of all the 20-30 languages I know, I think PHP is one of the worst. I used it for a database-backed web site project a couple of years ago and was absolutely appalled by the lack of design sense of the authors. I ended up using Python to generate most of the PHP code, which worked nicely ;-) I assume the language is better now; it could hardly have been any worse. The PHP array concept was particularly hilarious in its awfulness. Actually, though, IMO the very idea of bundling scripts with web pages is flawed.

Isaac Gouy - Re: Making the Case for PHP at Yahoo!
10/30/2002; 5:13:13 PM (reads: 2397, responses: 0)
... must be able to find lots of programmers fluent in the language, must be currently fashionable, must be 100% buzzword-compatible ;-)
If that was the case at Y! they'd have chosen JSP and trashed their infrastructure.

How can we explain that they would choose PHP over Python? Python didn't seem to even make their shortlist. Maybe they don't like indentation? Maybe they view Python as yet another general purpose language?

Could it be that they view PHP as a specialized html/db scripting language which matched their needs?

Chris - Re: Making the Case for PHP at Yahoo!
10/30/2002; 7:42:09 PM (reads: 2438, responses: 0)
In their benchmark graphs, mod_perl seems to clearly beat PHP, so I was (barely) surprised when they picked PHP as the winner. I guess the presentation did say they felt Perl code would be a maintenance nightmare.

re Java threading problems on FreeBSD: Yahoo has actively contributed to FreeBSD kernel development for a long time, so I am surprised that they quickly dismiss FreeBSD's threading problems instead of fixing them. I guess they don't want to wait until FreeBSD 5.0.

re Python: Yahoo Maps uses Python. You can see that their URLs have .py filename extensions. Yahoo Maps is "powered by" Mapquest, so maybe Mapquest made the decision to use Python.

Ehud Lamm - Re: Making the Case for PHP at Yahoo!
10/31/2002; 5:19:33 AM (reads: 2425, responses: 0)
in my experience (and that of most other people I've talked to) language quality, or even language appropriateness for the job, is the absolute last thing that gets considered

This matches my experience too.

Isaac Gouy - Re: Making the Case for PHP at Yahoo!
10/31/2002; 10:19:48 AM (reads: 2325, responses: 0)
So let's take the next step.

1) If language quality is the absolute last thing that gets considered, then could it be that in practice "language quality" is less important to the success of projects and companies than the things that are given more priority?

2) Are the expectations of projects/companies already so low that they are prepared to deal with whatever problems/timescales result from using "low quality" languages? What they do is "good enough" because they don't expect things to be any better and their competition isn't doing anything better.

3) Is the "high quality" language immediately dismissed because of some real implementation constraint? Even when the "high quality" language has a high quality implementation there are still things like:

native Win32 port: The debugger is not supported in this port.
Gulp!

Michael Vanier - Re: Making the Case for PHP at Yahoo!
10/31/2002; 8:46:42 PM (reads: 2381, responses: 0)
I think one issue is that people nearly always think short-term, and language quality doesn't usually bite you in the short term. It does bite you in the long term, though, and when it does the damage is often so severe that the entire project can die.

BTW here is what Paul Graham had to say about Yahoo Store:

"Store is still written in Lisp. The last I heard they were planning to rewrite it in C++, so that the people now in charge of it could read the source. However, since part of the functionality of the Store Editor is to take s-expressions created by users at runtime and compile them into code that generates pages, they will literally have to write a Lisp interpreter to do it. I expect they are finding this an obstacle. If they managed to do this (without realizing that they're implementing their own Lisp) it would be a new world's record case of Greenspun's tenth rule."

Greenspun's tenth rule, BTW, is "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp."

Isaac Gouy - Re: Making the Case for PHP at Yahoo!
11/1/2002; 5:05:03 PM (reads: 2262, responses: 0)
language quality doesn't usually bite you in the short term
Doesn't "language quality" have a minute-by-minute impact? I'm no longer sure I know what you mean by "language quality".

the damage is often so severe that the entire project can die
Do you have examples of the language choice killing the project?

Yahoo Store. I think I read somewhere that it was re-written in C++, which either hurts Paul's arguments or shows Yahoo to be very conservative
As it seems that Y! are planning a rewrite in C++, which of Paul Graham's arguments does this hurt?

Michael Vanier - Re: Making the Case for PHP at Yahoo!
11/1/2002; 6:20:48 PM (reads: 2265, responses: 0)
What I mean (and I suspect you already know this) is that it's possible to get by with an inferior language for quite a while, as long as the project is small enough. With a small enough program, it's almost always possible to debug the program exhaustively and get it working. When the program gets big enough, this is much harder and may become intractable. When was the last time you wrote a 100,000 line program in assembly language? Good design alone won't help you if the language has flaws.

As for examples, the one that's closest to my heart is a neural simulation system I worked with (and hacked on) extensively for my Ph.D. The program was (badly) written in C, with a home-brew scripting language superimposed on it. The resulting program was (and still is) full of memory leaks and other common C-related problems. We wanted speed, so we went with C, and the resulting code is such a mess that it hasn't been seriously worked on for years, although many have tried. This is why I got interested in smalltalk; I figured if I could prototype the program in a decent language I might eventually get the right design, and I could then port it (with difficulty, no doubt) to something like C++ for speed (or do something cleverer). I'm sure other people have their own favorite examples.

As for Yahoo, according to Paul, they want to rewrite Yahoo Store "so they can understand the code". Consider this statement. Since when has C++ been considered a particularly readable language? What I read into it is that they can't find enough qualified lisp programmers to maintain the code. This presumably hurts Paul's arguments that server-side programming isn't bound by the same language constraints as other kinds of programming (e.g. shrink-wrapped consumer software). I'm not sure I ever bought that argument anyway, since someone always has to maintain software. I don't think lisp is per se unmaintainable, but it's just a fact that there aren't that many lisp programmers out there. Do a web search for "C++ jobs" and "lisp jobs" if you don't believe me. I get more than a 10:1 page ratio of C++:lisp on google.

Isaac Gouy - Re: Making the Case for PHP at Yahoo!
11/2/2002; 1:51:10 PM (reads: 2287, responses: 0)
What I mean (and I suspect you already know this)
There seem to be many different aspects of a programming language that we can say are better or worse for a particular purpose. Saying language x is better ("high quality") than language y simply raises the question: which aspect of language x is better, and for what purpose?

As for examples
Your example graciously states several failings; I don't intend to dwell on those. I'd like to suggest there seems to have been a more general failure in understanding what was being attempted: are we implementing a well-understood, designed solution? Are we trying to extend a well-designed system? Are we exploring something we don't really understand?

We wanted speed, so we went with C
How could you know what fast meant for your application until you had it working correctly?

Michael Jackson's Rules on Optimization
Rule 1. Don't do it.
Rule 2. (for experts only) Don't do it yet.
"Principles of Program Design" by M.A. Jackson, Academic Press 1975 ISBN 0 12 379050 6

smalltalk; I figured if I could prototype the program in a decent language
It isn't that Smalltalk is "a decent language" - it's that Smalltalk has many qualities that make it suited for exploratory programming and prototyping.
Using C for prototyping is a mistake, just as using Smalltalk for Unix systems programming would be a mistake. That doesn't mean either of them is an inferior language, just that each is a bad choice for that particular purpose.

then port it... for speed
If you succeed in building a prototype that does what you want in Smalltalk, then I think you'll find that you can identify performance hotspots with the profiling tools. You can probably identify other algorithms that would improve performance; this often gives the biggest performance improvement. You can micro-optimize: re-implement hot-spot methods in a less general, more direct way, until they are just calls to C primitives. You can re-implement little bits in C and pass raw blocks of data to them from Smalltalk. You can use a different Smalltalk implementation (Smalltalk/X, Smalltalk MT). Once you have something that does all you need, there will be plenty of opportunities for optimization.
And of course you'll find many examples of Smalltalk neural simulation implementations from Google.
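
As a rough illustration of that workflow (get it working, profile, then swap in a better algorithm for the hotspot), here is a minimal sketch. It is written in Python rather than Smalltalk purely for illustration, and every function name in it is hypothetical; it is not taken from any real simulator.

```python
import cProfile
import pstats


def slow_sum_of_squares(n):
    # Straightforward first version: loop over 0..n-1 and add up the squares.
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum_of_squares(n):
    # Better algorithm for the same result (valid for n >= 0), using the
    # closed form: sum of i^2 for i in 0..n-1 equals (n-1) * n * (2n-1) / 6.
    return (n - 1) * n * (2 * n - 1) // 6


def simulate(steps, n):
    # Stand-in for the "working prototype": calls the slow routine repeatedly.
    return [slow_sum_of_squares(n) for _ in range(steps)]


def hottest_function(func, *args):
    # Profile one call and return the name of the function with the largest
    # own time (time spent in the function itself, excluding its callees).
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    key, _ = max(stats.stats.items(), key=lambda item: item[1][2])
    return key[2]


if __name__ == "__main__":
    print("hotspot:", hottest_function(simulate, 20, 20000))
    print("same answer:", fast_sum_of_squares(1000) == slow_sum_of_squares(1000))
```

The point of the sketch is only the order of operations: the profiler names the hotspot first, and the replacement is checked against the slow version before it is trusted.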

I don't think lisp is per se unmaintainable, but it's just a fact that there aren't that many lisp programmers out there.
Maybe the big payoff from using Lisp was in prototyping and building Store fast? Presumably the functionality is now stable and better understood - it's legacy software.