Performance Problems You Can Fix: A Dynamic Analysis of Memoization Opportunities
Performance bugs are a prevalent problem and recent research proposes various techniques to identify such bugs. This paper addresses a kind of performance problem that often is easy to address but difficult to identify: redundant computations that may be avoided by reusing already computed results for particular inputs, a technique called memoization. To help developers find and use memoization opportunities, we present MemoizeIt, a dynamic analysis that identifies methods that repeatedly perform the same computation. The key idea is to compare inputs and outputs of method calls in a scalable yet precise way. To avoid the overhead of comparing objects at all method invocations in detail, MemoizeIt first compares objects without following any references and iteratively increases the depth of exploration while shrinking the set of considered methods. After each iteration, the approach ignores methods that cannot benefit from memoization, allowing it to analyze calls to the remaining methods in more detail. For every memoization opportunity that MemoizeIt detects, it provides hints on how to implement memoization, making it easy for the developer to fix the performance issue. Applying MemoizeIt to eleven real-world Java programs reveals nine profitable memoization opportunities, most of which are missed by traditional CPU time profilers, conservative compiler optimizations, and other existing approaches for finding performance bugs. Adding memoization as proposed by MemoizeIt leads to statistically significant speedups by factors between 1.04x and 12.93x.
This paper was recommended by Asumu Takikawa. It is a nice idea, which seems surprisingly effective. The examples they analysed (sadly they don't really explain how they picked the program to study) have a mix of memoization opportunities in fairly different parts of the codebase. There are several examples of what we could call "peripheral communication logic", eg. date formatting stuff, which we could assume is not studied really closely by the programmers focusing more on the problem-domain logic. But there is also an interesting of subtle domain-specific memoization opportunity: an existing cache was over-pessimistic and would reset itself at times where it was in fact not necessary, and this observation corresponds to a non-trivial program invariant.
The authors apparently had some difficulties finding program inputs to exercise profiling. Programs should more often be distributed with "performance-representative inputs", not just functionality-testing inputs. In one example of a linting tool, the provided "standard test" was to feed the code of the linting tool to itself. But this was under a default configuration for which the tools' developers had already fixed all alarms, so the alarm-creating code (which actually had an optimization opportunity) was never exercised by this proposed input.
Note that the caching performed is very lightweight, usually not a full tabulation of the function. Most examples just add a static variable to cache the last (input, output) pair, which is only useful when the same input is typically called several times in a row, but costs very little space.
The Gradualizer: a methodology and algorithm for generating gradual type systems
The Gradualizer: a methodology and algorithm for generating gradual type systems
One can think of the Gradualizer as a kind of meta-programming algorithm that takes a type system in input, and returns a gradual version of this type system as output. I find it interesting that these type systems are encoded as lambda-prolog programs (a notable use-case for functional logic programming). This is a very nice way to bridge the gap between describing a transformation that is "in principle" mechanizable and a running implementation.
An interesting phenomenon happening once you want to implement these ideas in practice is that it forced the authors to define precisely many intuitions everyone has when reading the description of a type system as a system of inference rules. These intuitions are, broadly, about the relation between the static and the dynamic semantics of a system, the flow of typing information, and the flow of values; two occurrences of the same type in a typing rule may play very different roles, some of which are discussed in this article.
Is Sound Gradual Typing Dead?
Is Sound Gradual Typing Dead?
In a fully dynamic system, typing checks are often superficial (only the existence of a particular field is tested) and done lazily (the check is made when the field is accessed). Gradual typing changes this, as typing assumptions can be made earlier than the value is used, and range over parts of the program that are not exercised in all execution branches. This has the potentially counter-intuitive consequence that the overhead of runtime checks may be sensibly larger than for fully-dynamic systems. This paper presents a methodology to evaluate the "annotation space" of a Typed Racket program, studying how the possible choices of which parts to annotate affect overall performance.
Many would find this article surprisingly grounded in reality for a POPL paper. It puts the spotlight on a question that is too rarely discussed, and could be presented as a strong illustration of why it matters to be serious about implementing our research.
Abstracting Gradual Typing
Abstracting Gradual Typing
At first sight this description seems to overlap with the Gradualizer work cited above, but in fact the two approaches are highly complementary. The Abstract Gradual Typing effort seems mechanizable, but it is far from being implementable in practice as done in the Gradualizer work. It remains a translation to be done on paper by skilled expert, although, as standard in abstract interpretation works, many aspects are deeply computational -- computing the best abstractions. On the other hand, it is extremely powerful to guide system design, as it provides not only a static semantics for a gradual system, but also a model dynamic semantics.
The central idea of the paper is to think of a missing type annotation not as "a special Dyn type that can contain anything" but "a specific static type, but I don't know which one it is". A problem is then to be understood as a family of potential programs, one for each possible static choice that could have been put there. Not all choices are consistent (type soundness imposes constraints on different missing annotations), so we can study the space of possible interpretations -- using only the original, non-gradually-typed system to make those deductions.
An obvious consequence is that a static type error occurs exactly when we can prove that there is no possible consistent typing. A much less obvious contribution is that, when there is a consistent set of types, we can consider this set as "evidence" that the program may be correct, and transport evidence along values while running the program. This gives a runtime semantics for the gradual system that automatically does what it should -- but it, of course, would fare terribly in the performance harness described above.
The Abstract Gradual Typing work feels like a real breakthrough, and it is interesting to idly wonder about which previous works in particular enabled this advance. I would make two guesses.
First, there was a very nice conceptualization work in 2015, drawing general principles from existing gradual typing system, and highlighting in particular a specific difficulty in designing dynamic semantics for gradual systems (removing annotations must not make program fail more).
Refined Criteria for Gradual Typing
Siek and Taha  coined the term gradual typing to describe a theory for integrating static and dynamic typing within a single language that 1) puts the programmer in control of which regions of code are statically or dynamically typed and 2) enables the gradual evolution of code between the two typing disciplines. Since 2006, the term gradual typing has become quite popular but its meaning has become diluted to encompass anything related to the integration of static and dynamic typing. This dilution is partly the fault of the original paper, which provided an incomplete formal characterization of what it means to be gradually typed. In this paper we draw a crisp line in the sand that includes a new formal property, named the gradual guarantee, that relates the behavior of programs that differ only with respect to their type annotations. We argue that the gradual guarantee provides important guidance for designers of gradually typed languages. We survey the gradual typing literature, critiquing designs in light of the gradual guarantee. We also report on a mechanized proof that the gradual guarantee holds for the Gradually Typed Lambda Calculus.
Second, the marriage of gradual typing and abstract interpretation was already consumed in previous work (2014), studying the gradual classification of effects rather than types.
A Theory of Gradual Effect Systems
Effect systems have the potential to help software developers, but their practical adoption has been very limited. We conjecture that this limited adoption is due in part to the difficulty of transitioning from a system where effects are implicit and unrestricted to a system with a static effect discipline, which must settle for conservative checking in order to be decidable. To address this hindrance, we develop a theory of gradual effect checking, which makes it possible to incrementally annotate and statically check effects, while still rejecting statically inconsistent programs. We extend the generic type-and-effect framework of Marino and Millstein with a notion of unknown effects, which turns out to be significantly more subtle than unknown types in traditional gradual typing. We appeal to abstract interpretation to develop and validate the concepts of gradual effect checking. We also demonstrate how an effect system formulated in Marino and Millstein’s framework can be automatically extended to support gradual checking.
Difficulty rewards: gradual effects are more difficult than gradual simply-typed systems, so you get strong and powerful ideas when you study them. The choice of working on effect systems is also useful in practice, as nicely said by Philip Wadler in the conclusion of his 2015 article A Complement to Blame:
I [Philip Wadler] always assumed gradual types were to help those poor schmucks using untyped languages to migrate to typed languages. I now realize that I am one of the poor schmucks. My recent research involves session types, a linear type system that declares protocols for sending messages along channels. Sending messages along channels is an example of an effect. Haskell uses monads to track effects (Wadler, 1992), and a few experimental languages such as Links (Cooper et al., 2007), Eff (Bauer and Pretnar, 2014), and Koka (Leijen, 2014) support effect typing. But, by and large, every programming language is untyped when it comes to effects. To encourage migration from legacy code to code with effect types, such as session types, some form of gradual typing may be essential.
We've mentioned some empirical studies of programming languages a few times, but I haven't seen a comprehensive list we can use as a reference.
Fortunately, I just came across this pretty decent overview of existing literature on how types impact development. Agree or disagree with Dan Luu's position, the comprehensive list warrants a front-page post in my opinion.
One point worth noting is that all the studies used relatively inexpressive languages with bland type systems, like C and Java, and compared those against typed equivalents. A future study ought to compare a more expressive language, like OCaml, Haskell or F#, which should I think would yield more pertinent data to this age-old debate.
Part of the benefits of types allegedly surround documentation to help refactoring without violating invariants. So another future study I'd like to see is one where participants develop a program meeting certain requirements in their language of choice. They will have as much time as needed to satisfy a correctness test suite. They should then be asked many months later to add a new feature to the program they developed. I expect that the maintenance effort required of a language is more important than the effort required of initial development, because programs change more often than they are written from scratch.
This could be a good thread on how to test the various beliefs surrounding statically typed and dynamically languages. If you have any studies that aren't mentioned above, or some ideas on what would make a good study, let's hear it!
A talk by Brian Kernighan at The University of Nottingham.
Describes all the usual suspects: AWK, EQN, PIC. Even AMPL. I was wondering which languages he had in mind when he mentioned that some of his creations were total flops. I'd love to learn from those!
The talk is a fun way to spend an hour, and the audio would be good for commuters. For real aficionados I would recommend reading Jon Bentley's articles on the design of these languages (as well as CHEM and others) instead.
Designers of Elm want their compiler to produce friendly error messages. They show some examples of helpful error messages from their newer compiler on the blog post.
Breaking Through the Normalization Barrier: A Self-Interpreter for F-omega, by Matt Brown and Jens Palsberg:
I haven't gone through the whole paper, but their claims are compelling. They have created self-interpreters in System F, System Fω and System Fω+, which are all strongly normalizing typed languages. Previously, the only instance of this for a typed language was Girard's System U, which is not strongly normalizing. The key lynchpin appears in this paragraph on page 2:
Pretty cool if this isn't too complicated in any given language. Could let one move some previously non-typesafe runtime features, into type safe libraries.
On type safety for core Scala: "From F to DOT: Type Soundness Proofs with Definitional Interpreters"
From F to DOT: Type Soundness Proofs with Definitional Interpreters by Tiark Rompf and Nada Amin:
Not only they solve a problem that has been open for 12 years, but they also deploy interesting techniques to make the proof possible and simple. As they write themselves, that includes the first type-soundness proof for F<: using definitional interpreters — that is, at least according to some, denotational semantics.
GADTs Meet Their Match: Pattern-Matching Warnings That Account for GADTs, Guards, and Laziness by Georgios Karachalias, Tom Schrijvers, Dimitrios Vytiniotis, Simon Peyton Jones:
Another great paper on a very useful incremental improvement on GADTs as found in Haskell, OCaml and Idris. Exhaustiveness checking is critical for a type system's effectiveness, and the redundant matching warnings are a nice bonus.
Optimizing Closures in O(0) time, by Andrew W. Keep, Alex Hearn, R. Kent Dybvig:
Looks like a nice and simple set of optimizations for probably the most widely deployed closure representation.
Dependent Types for Low-Level Programming by Jeremy Condit, Matthew Harren, Zachary Anderson, David Gay, and George C. Necula:
A conceptually simple approach to verifying the safety of C programs, which proceeeds in 3 phases: 1. infer locals that hold pointer bounds, 2. flow-insensitive checking introduces runtime assertions using these locals, 3. flow-sensitive optimization that removes the assertions that it can prove always hold.
You're left with a program that ensures runtime safety with as few runtime checks as possible, and the resulting C program is compiled with gcc which can perform its own optimizations.
This work is from 2007, and the project grew into the Ivy language, which is a C dialect that is fully backwards compatible with C if you #include a small header file that includes the extensions.
It's application to C probably won't get much uptake at this point, but I can see this as a useful compiler plugin to verify unsafe Rust code.
Active forum topics
New forum topics