Automating Ad hoc Data Representation Transformations by Vlad Ureche, Aggelos Biboudis, Yannis Smaragdakis, and Martin Odersky:
To maximize run-time performance, programmers often specialize their code by hand, replacing library collections and containers by custom objects in which data is restructured for efficient access. However, changing the data representation is a tedious and error-prone process that makes it hard to test, maintain and evolve the source code.
We present an automated and composable mechanism that allows programmers to safely change the data representation in delimited scopes containing anything from expressions to entire class definitions. To achieve this, programmers define a transformation and our mechanism automatically and transparently applies it during compilation, eliminating the need to manually change the source code.
Our technique leverages the type system in order to offer correctness guarantees on the transformation and its interaction with object-oriented language features, such as dynamic dispatch, inheritance and generics.
We have embedded this technique in a Scala compiler plugin and used it in four very different transformations, ranging from improving the data layout and encoding, to
retrofitting specialization and value class status, and all the way to collection deforestation. On our benchmarks, the technique obtained speedups between 1.8x and 24.5x.
This is a realization of an idea that has been briefly discussed here on LtU a few times, whereby a program is written using high-level representations, and the user has the option to provide a lowering to a more efficient representation after the fact.
This contrasts with the typical approach of providing efficient primitives, like primitive unboxed values, and leaving it to the programmer to compose them efficiently up front.