Pycket: A Tracing JIT For a Functional Language

Pycket: A Tracing JIT For a Functional Language
Spenser Bauman, Carl Friedrich Bolz, Robert Hirschfeld, Vasily Krilichev, Tobias Pape, Jeremy Siek, and Sam Tobin-Hochstadt
2015

We present Pycket, a high-performance tracing JIT compiler for Racket. Pycket supports a wide variety of the sophisticated features in Racket such as contracts, continuations, classes, structures, dynamic binding, and more. On average, over a standard suite of benchmarks, Pycket outperforms existing compilers, both Racket’s JIT and other highly-optimizing Scheme compilers. Further, Pycket provides much better performance for proxies than existing systems, dramatically reducing the overhead of contracts and gradual typing. We validate this claim with performance evaluation on multiple existing benchmark suites.

The Pycket implementation is of independent interest as an application of the RPython meta-tracing framework (originally created for PyPy), which automatically generates tracing JIT compilers from interpreters. Prior work on meta-tracing focuses on bytecode interpreters, whereas Pycket is a high-level interpreter based on the CEK abstract machine and operates directly on abstract syntax trees. Pycket supports proper tail calls and first-class continuations. In the setting of a functional language, where recursion and higher-order functions are more prevalent than explicit loops, the most significant performance challenge for a tracing JIT is identifying which control flows constitute a loop -- we discuss two strategies for identifying loops and measure their impact.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Racket

This is interesting, and I'd be fascinated to learn more about the RPython meta-tracing framework.

Racket itself I find wonderful and thoroughly infuriating.

One problem that may make me an odd man out on this website is that I want programming languages to be finished, documented to a thoroughly useful and complete level and maintained, libraries and all, and thus suitable for publishing products, not just proof-of-concept experiments, suitable for writing papers on.

I could write a very long rant about everything that is more powerful about racket than anything else, and everything that makes Racket utterly useless for real products. None of them are unfixable, they're things like "needs a better garbage collector more than any system on earth" and "has a culture that is sure that documentation doesn't have to make sense to anyone but the implementer" and "a macro system that's a crime against engineering" and "no gui library that could be used for real projects by anyone outside the racket devs."

So I'm sure you've just taken a very impressive and already fast system and made it even more impressive and faster. But it's real problem is that was already just short of being usable in products, and another proof of concept just gives it new ways to fall even shorter.

:(

We have plenty of production

We have plenty of production languages, but precious few languages that try to push the envelope of what is possible in PL. If we just put all our smart people on languages we use today, we will have nothing very interesting for tomorrow.

But we don't have ANY production languages

with anywhere near the sort of power of racket.

And they keep adding more wonderful things too.

If only they also had a production ready fork.

Have cake + eat cake

One of the reasons Racket can be so powerful is that they don't nerf themselves for industry.

If someone wanted to cherry pick/polish an industry-friendly version of Racket, I think it would be welcomed.

how?

Given the "documentation doesn't have to make sense to anyone but the implementer and his students" and other code not being documented, who is going to be able to understand this mess well enough to stabilize it?

If you want

If you want code/documentation to make sense, you need a feedback loop. If there is no feedback loop...then it just won't. You either help and try to create a feedback loop between the implementers (you know, by taking an interest in the code and asking questions), or you give up and go somewhere else.

Code is a conversation, often it is a one sided one where you are basically talking to yourself.

I would talk about the

I would talk about the people who use Racket "for real products", but the tone of your comment doesn't really make engagement seem worthwhile here.

for great good

There are many categories of people here: Those who apparently happily already use Racket for real products; Those who tried to use it and failed; Those who never bothered to try; Those who never even heard of Racket; etc.

For Racket, I do not believe the cardinality of the first category of people is going to prove that it is ready for real products. So it might be worth talking with those in the second category to see why they aren't in the first.

I read the tone as actually being the kind of thing I desperately search for and want to find in the world regarding my own projects and efforts. A person who knows enough and is interested enough to have apparently actually tried to use it! And has results to share! Hooray! But that's just me.

I take that as frustration

I take that as frustration from somebody who likes Racket. Plus, he's not the entire LtU.

Since your comment has no title, one has to look at the

source of the web page to find it. You can't click on an empty link

There is the HN discussion.

There is the HN discussion.

Cool

But I'm busy tacking a nice metalanguage on lua that will run on luajit.

Turning it into something similar to scheme will be little more than a one liner when I'm done, though it will be a language that builds out of arrays and tables instead of lists, but that may turn out to be an improvement.

An a little functional subset with s-expressions would also be easy.

The advantage of luajit is that it's small and already runs on everything. You can't get Racket running your android or ipad game afaik

[edit] I added to the ycombinator thread

The Racket people are very

The Racket people are very pragmatic, and if Carmack - the meticulous optimizer - takes an interest in it, some wonderful things could eventually happen.

He'd have to give them a brand new gc

.

If there are particular

If there are particular programs you have that exhibit poor GC behavior, we'd love to know about that. Benchmarks I've run suggest that Racket's GC is quite competitive with other functional languages, although it isn't parallel.

Obviously, you've never used Racket

The IDE is utterly unusable, stopping with a garbage collection symbol in the lower right hand corner every minute or so.

This is especially bad if one is editing programs that are larger than toys. I use a small netbook but I've been told it isn't usable on faster machines either.

Actually, I'm using DrRacket

Actually, I'm using DrRacket as we speak. I've been a heavy user of DrRacket for 15+ years. Furthermore, I use it to edit large programs, including itself. Again, if you have specific feedback, that would be very helpful.

Well, I used it multiple times for different projects

And gave up because the editor stops for up to 15 seconds on a regular basis.

Scrolling through large files without pauses, impossible, searching for completion terms on large files without pauses, impossible.

That is a specific feedback and if you don't like it, then you can read any insult into this comment you'd like.

That's helpful, thanks.

That's helpful, thanks. There are some known performance issues with scrolling, which are probably exacerbated on a netbook. What do you mean by "searching for completion terms"? Just searching for a string in a file with "Find"?

However, these issues are almost certainly feature-specific, DrRacket-specific, or GUI library-specific, rather than being an issue of GC performance. If a program allocates too much, for example, it will spend a lot of time in GC, which may be what's happening here.

They went through the trouble

of giving Racket a GC plugin for students to play with gc techniques, but didn't go through the trouble of giving it an acceptable gc in the first place.

Luajit has a nicely tuned incremental gc though not a generational one, that seems good enough.

Granted that the racket gc needs to deal with multiple threads which is a nightmare.

I suspect that Racket being "stop the world" means that they never bothered to make the code generator put in write barriers. That would be a huge change, and who knows, maybe not even feasible... but since they don't even support a lot of platforms they could cludge it by using OS tracked dirty page scanning and rescanning instead of a write barrier.

Sigh. There are flies in the ointment http://zachsaw.blogspot.com/2010/11/wow64-bug-getthreadcontext-may-return.html

Like I said, the things you

Like I said, the things you see are unlikely to be GC problems. Racket already has a generational GC and it certainly uses a write barrier. The write barrier is implemented using page protection from the OS, which works better with C code in the runtime system.

Also, the GC in Racket and the teaching materials for GC in PLAI are written by different people.

It has a little recycling symbol in the lower corner

of the IDE when it stops.

I'm sorry, but this is the only system I've ever used with frequent, long pauses for gc. The only one.

And I'm hardly the only person who rejects Racket because of this. That the IDE is unusable because of bad GC is well known. And yet the IDE IS the system. It's not worth using without the IDE.

If you're saying that the IDE some part of its design is so bad that NO gc could keep up with it, then you're saying something amazing - in which case you should replace it by a design by a competent programmer. This is a problem that literally no one else has. And no one has seen frequent long gc pauses in other systems for decades.

Also, if they used the write barrier to make the gc generational, but not to make it incremental or parallel, then I guess they used it to solve the wrong problem - either that or it's badly tuned.

If your attitude toward serious problems is to deny them, then by the time you realize that you don't have a choice, there will be no one around who knows HOW to solve them.

And no one has seen frequent

And no one has seen frequent long gc pauses in other systems for decades.

This is a pretty remarkable statement. In reality, GC pauses are a major issue on every garbage-collected VM, ranging from large HotSpot server systems to Android.

I think this discussion has exhausted its usefulness.

It was never useful, you never made it useful

But I've never run used a program that has a long pause frequently.
Sure, some systems have to do major collections occasionally. But this is ridiculous.
[edit] I will also add that the other systems have this problem as you say, and they've all found much better solutions to it than you have. Because no one else has ever written an editor that can't scroll because of gc.

Let me add that no one else has an editor where it can stop for gc in the middle of typing and be gone for 15 seconds. Really, you think your system is acceptably tuned? Really?

Let me add that one of my favorite programmer's editors (JEdit) is written in pure java.

I've run it on windows, I've run it on unix installed on an android device without much memory. Never had a problem with garbage collection. Never had a problem with large files. Even on a slow arm processor.

Open office is also in Java, also no problems.

Eclipse a full IDE, also java. Acceptable no matter what version of java you run it in, GNU or Oracle - and let me add that I've run that both ways on an arm processor without much memory.

Some versions of Microsoft office are probably written in .net. No problems with gc and scrolling or editing or searching..

Ok, visual studio 2010, written in winforms is a bit on the unacceptably slow side, but pauses for GC while editing aren't the problem... Editors built on IDEA are in java and also a bit heavy and slow, but still no long pauses for gc while doing basic editing.

XMaxima displays equations in full mathematical format, it's written in TCL/TK - it runs a lot better than the PLT ide

I think you have to go to an editor written in Javascript running in a browser to find something as slow and pausing as Racket. And you might have to have multiple pages open in the browser to get that bad.

Probably a GUI problem

The problem probably is the GUI library, not the GC. In the past, Java AWT and Swing used to have strange pauses while Eclipse SWT would run very nicely in the same Java VM.

Macros?

What's wrong with the macro system? Just learned it, seems nice.