LtU Forum

PLDI 2016 Proceedings now available on-line, free for 3 weeks

The proceedings for PLDI 2016 - taking place the week of June 13 in Santa Barbara - are now available on-line, free to all until 2 weeks after the conference. Download now! Lots of good stuff there.

http://dl.acm.org/citation.cfm?id=2908080

WebAssembly

Finally, my prayers have been heard. The question was "get us a web-enabled bytecode," and the answer came from above in the form of a W3C group effort: WebAssembly.

Though it encapsulates low-level jump commands into structures controlled by labels (I'd have preferred it to stay at the lower level, for maximum freedom), it is more or less what I'd expected from a bytecode that could be run inside a browser. It is enough of an assembler-like language to leave a lot of room for implementing different programming language architectures, while retaining the speed of native code. Maybe it is overkill for a bytecode language to deal with functions automatically (see the "call" command), but it can't hurt; we don't have to use it, and we can maintain our own call stack if we want to replace the built-in one.

It is supposed to have a contiguous linear memory, indexed from 0 upwards. This memory can even grow on the fly if we want it to. This is just what I need: encapsulation of memory management.
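
A minimal sketch, from the JavaScript/TypeScript side, of how that linear memory behaves (the page counts here are just illustrative):

    // Linear memory is allocated in 64 KiB pages and addressed from 0.
    const memory = new WebAssembly.Memory({ initial: 1, maximum: 16 });

    // View the raw bytes; index 0 is the start of the address space.
    let bytes = new Uint8Array(memory.buffer);
    bytes[0] = 42;

    // Grow the memory by one page at runtime; grow() returns the old size.
    memory.grow(1);

    // Growing detaches the old buffer, so views must be re-created.
    bytes = new Uint8Array(memory.buffer);
    console.log(bytes.length); // 131072 bytes = 2 pages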

If this is what I'm thinking it is (nearly 1:1 mapping from bytecode to machine code upon compilation), I can say only one thing: "thank you, world :)"

Stack of regions for managing effects?

Tofte and Talpin used a stack of regions for managing memory. Can you recommend papers where per-frame regions/heaps are used for managing effects?
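
For concreteness, here is a toy TypeScript rendering of the memory-side idea (the Region class and withRegion helper are my own names, not anyone's API): a letregion-style scope allocates into the innermost region and frees everything when the frame exits. The question is whether anything analogous has been worked out for effects.

    // A region owns a batch of allocations and frees them all at once.
    class Region {
      private objects: object[] = [];
      alloc<T extends object>(value: T): T {
        this.objects.push(value);
        return value;
      }
      free(): void {
        this.objects.length = 0; // a real system would return memory to the OS
      }
    }

    // letregion r in body end: the region's lifetime follows the stack frame.
    function withRegion<T>(body: (r: Region) => T): T {
      const r = new Region();
      try {
        return body(r);
      } finally {
        r.free();
      }
    }

    // Everything allocated in r dies when withRegion returns.
    const total = withRegion(r => {
      const cell = r.alloc({ head: 1, tail: r.alloc({ head: 2, tail: null }) });
      return cell.head + (cell.tail ? cell.tail.head : 0);
    });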

Metaprogramming and Free Availability of Sources

Metaprogramming and Free Availability of Sources by François-René Rideau

This is an older paper about metaprogramming; it ties into the open-source development process and argues why metaprogramming requires free/open-source software.

The beginning of the article outlines a list of kinds of metaprograms, such as compilers/translators, phase splitters, code walkers, metacompilers, explainers, etc. Later on it argues that metaprogramming becomes necessary as problem complexity increases and more actors/stakeholders get involved.

In the traditional development process, that excludes metaprogramming, the near totality of the code constituting software is hand-typed by humans; the details of execution, the overall behavior, everything must follow directly from human thought; every stage of the development, every modification is the work of man. Now, the catch is that sooner or later, some new functionalities to add to a software project will require global architectural changes of some depth: all the global modifications of the program will have to be performed by hand, with all the inconsistency problems unwillingly introduced by these changes, that induce numerous bugs and cost lots of iterations of development (for a publicly documented such phenomenon, see the occasional API changes in the Linux kernel; for a spectacular example, see the explosive bug of the first Ariane V rocket). The alternative to these modifications is a complete reimplementation, whose cost is that of a rewrite from scratch. All in all, the incremental cost of traditional development is proportional to the size of code that differs between two iterations of the process.

Let us now make metaprogramming part of the development cycle. Since metaprogramming allows arbitrary processing of code by the computers themselves, it enables the programmer who has to face a structural change in his program to perform semi-automatically any task of instrumentation or transformation of his program, so as to make it conform to the new architecture, to check his new code with respect to declared invariants, to enforce consistency of various parts of his program.

In statically typed languages, or at least in better IDEs, it is possible to perform various refactorings across a codebase, which constitutes metaprogramming in a crude sense:

tiny doses of metaprogramming used in traditional software projects are a determining factor for the continuation of these projects: choice, often static, of a development environment; filtering of programs with static analyzers, checkers of more or less sophisticated invariants, or other kinds of lint; use of editors that manage the indentation and structure of programs, that they can colorize and fontify; occasional use of search-and-replace tools on the source code; in the case of more daring hackers, “macros” and “scripts” written in elisp, perl or other languages, to edit and manipulate sources.

The question I pose to LtU is: in what ways can metaprogramming be introduced into the development of proprietary programs in mainstream languages such as JavaScript and Java?
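
For what it's worth, the JavaScript ecosystem already ships one such tool: the TypeScript compiler exposes a public API for exactly the kind of "code walker" the paper lists. A small sketch (the source text is made up):

    import * as ts from "typescript";

    // Parse a source file into an AST without type-checking it.
    const source = ts.createSourceFile(
      "example.ts", // hypothetical file name
      "function add(a: number, b: number) { return a + b; }\n" +
        "function main() { return add(1, 2); }",
      ts.ScriptTarget.ES2020,
      /* setParentNodes */ true
    );

    // A classic code walker: visit every node, report function declarations.
    function walk(node: ts.Node): void {
      if (ts.isFunctionDeclaration(node) && node.name) {
        console.log(`${node.name.text}: ${node.parameters.length} parameter(s)`);
      }
      ts.forEachChild(node, walk);
    }

    walk(source); // add: 2 parameter(s), main: 0 parameter(s)

The same API supports AST transformation, which is where the paper's "semi-automatic global modifications" would come in.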

Microscript

I'm designing a Turing-complete utility microlanguage that has a minimal set of commands. The chosen commands should, on one side, compile to fast assembler code, and on the other side be descriptive enough that one can directly write sane code with them.

The microlanguage would be part of my bigger project, a metalanguage that compiles other languages to the microlanguage, which at the end (JIT or cached) compiles either to a browser executable (HTML + JavaScript) or to native code. Because it wouldn't really be used for high-level programming, but only as an underlying intermediate language (analogous to a bytecode language), it shouldn't have heavyweight add-ons like objects, structs, or reactive constructs; those should compile down from a guest language to the host microlanguage. It should also be easy to port the microlanguage to platforms like OS X, Windows, Linux, or Android.

So I'm thinking about a minimal set of commands. I think it should be an imperative language, because that is how microprocessors work at a low level. I was thinking of supporting only these:

  • variable assignment and arithmetic
  • pointers and arrays like in C
  • if branching
  • for and while loops with labels
  • defining functions with stack support
  • an inline-assembler passthrough, like in C, for systems programming

Do these commands sound right, or is there something that could be changed for better speed or usability?
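
To make this concrete, here is a hypothetical sketch of that command set as a TypeScript AST (all of the type names are mine, not part of any spec). If every node maps to a handful of machine instructions, the near-1:1 compilation goal stays plausible:

    // One node type per proposed command; nothing higher-level than C.
    type Expr =
      | { kind: "const"; value: number }
      | { kind: "var"; name: string }
      | { kind: "binop"; op: "+" | "-" | "*" | "/"; left: Expr; right: Expr }
      | { kind: "load"; address: Expr } // pointer dereference
      | { kind: "call"; fn: string; args: Expr[] };

    type Stmt =
      | { kind: "assign"; name: string; value: Expr }
      | { kind: "store"; address: Expr; value: Expr } // arrays as base + offset
      | { kind: "if"; cond: Expr; then: Stmt[]; else?: Stmt[] }
      | { kind: "while"; label?: string; cond: Expr; body: Stmt[] }
      | { kind: "break"; label?: string }
      | { kind: "return"; value?: Expr }
      | { kind: "asm"; text: string }; // raw assembler passthrough

    // Functions carry their own parameter and local-variable stack layout.
    interface Fn {
      name: string;
      params: string[];
      locals: string[];
      body: Stmt[];
    }

A code generator over these tags is a single switch per node kind, which is roughly what "minimal but Turing-complete" buys you.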

A2: Analog Malicious Hardware

Nothing much to do with language theory but too cool not to share.

While the move to smaller transistors has been a boon for performance it has dramatically increased the cost to fabricate chips using those smaller transistors. This forces the vast majority of chip design companies to trust a third party—often overseas—to fabricate their design. To guard against shipping chips with errors (intentional or otherwise) chip design companies rely on post-fabrication testing. Unfortunately, this type of testing leaves the door open to malicious modifications since attackers can craft attack triggers requiring a sequence of unlikely events, which will never be encountered by even the most diligent tester.

In this paper, we show how a fabrication-time attacker can leverage analog circuits to create a hardware attack that is small (i.e., requires as little as one gate) and stealthy (i.e., requires an unlikely trigger sequence before effecting a chip's functionality). In the open spaces of an already placed and routed design, we construct a circuit that uses capacitors to siphon charge from nearby wires as they transition between digital values. When the capacitors fully charge, they deploy an attack that forces a victim flip-flop to a desired value. We weaponize this attack into a remotely-controllable privilege escalation by attaching the capacitor to a wire controllable from software and by selecting a victim flip-flop that holds the privilege bit for our processor. We implement this attack in an OR1200 processor and fabricate a chip. Experimental results show that our attacks work, show that our attacks elude activation by a diverse set of benchmarks, and suggest that our attacks evade known defenses.

A2: Analog Malicious Hardware. K. Yang, M. Hicks, Q. Dong, T. Austin, D. Sylvester, Department of Electrical Engineering and Computer Science, University of Michigan, USA.

Comment by Yonatan Zunger, Head of Infrastructure for the Google Assistant:

"This is the most demonically clever computer security attack I've seen in years. It's a fabrication-time attack: that is, it's an attack which can be performed by someone who has access to the microchip fabrication facility, and it lets them insert a nearly undetectable backdoor into the chips themselves. (If you're wondering who might want to do such a thing, think "state-level actors")

The attack starts with a chip design which has already been routed -- i.e., it's gone from a high-level design in terms of registers and data, to a low-level design in terms of gates and transistors, all the way to a physical layout of how the wires and silicon will be laid out. But instead of adding a chunk of new circuitry (which would take up space), or modifying existing circuitry significantly (which could be detected), it adds nothing more than a single logic gate in a piece of empty space.

When a wire next to this booby-trap gate flips from off to on, the electromagnetic fields it emits add a little bit of charge to a capacitor inside the gate. If it just happens once, that charge bleeds off, and nothing happens. But if that wire is flipped on and off rapidly, it accumulates in the capacitor until it passes a threshold -- at which point it triggers that gate, which flips a target flip-flop (switch) inside the chip from off to on.

If you pick a wire which normally doesn't flip on and off rapidly, and you target a vulnerable switch -- say, the switch between user and supervisor mode -- then you have a modification to the chip which is too tiny to notice, which is invisible to all known forms of detection, and if you know the correct magic incantation (in software) to flip that wire rapidly, will suddenly give you supervisor-mode access to the chip. (Supervisor mode is the mode the heart of the operating system runs in; in this mode, you have access to all the computer's memory, rather than just to your own application's)

The authors of this paper came up with the idea and built an actual microchip with such a backdoor in it, using the open-source OR1200 chip as their target. I don't know if I want to guess how many three-letter agencies have already had the same idea, or what fraction of chips in the wild already have such a backdoor in them."
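
A toy model of the trigger dynamic described above (the constants are made up for illustration and have nothing to do with the paper's actual circuit): each toggle of the victim wire adds a bit of charge, charge leaks away every cycle, and the trap fires only when toggles arrive fast enough to outrun the leakage.

    // Charge added per wire toggle, leakage per cycle, firing threshold.
    // All three constants are invented for illustration.
    const CHARGE_PER_TOGGLE = 0.05;
    const LEAK_FACTOR = 0.9; // fraction of charge retained each cycle
    const THRESHOLD = 0.4;

    // Returns the cycle at which the trap fires, or -1 if it never does.
    function simulate(togglePattern: boolean[]): number {
      let charge = 0;
      for (let cycle = 0; cycle < togglePattern.length; cycle++) {
        charge *= LEAK_FACTOR; // charge bleeds off over time
        if (togglePattern[cycle]) charge += CHARGE_PER_TOGGLE;
        if (charge > THRESHOLD) return cycle; // flips the victim flip-flop
      }
      return -1;
    }

    // Occasional toggles never accumulate enough charge (returns -1)...
    console.log(simulate(Array.from({ length: 100 }, (_, i) => i % 10 === 0)));
    // ...but rapid toggling crosses the threshold (cycle 15 here).
    console.log(simulate(Array.from({ length: 100 }, () => true)));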

Programming with a Differentiable Forth Interpreter

On arxiv. Abstract:

There are families of neural networks that can learn to compute any function, provided sufficient training data. However, given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model. Here we consider the case of prior procedural knowledge such as knowing the overall recursive structure of a sequence transduction program or the fact that a program will likely use arithmetic operations on real numbers to solve a task. To this end we present a differentiable interpreter for the programming language Forth. Through a neural implementation of the dual stack machine that underlies Forth, programmers can write program sketches with slots that can be filled with learnable behaviour. As the program interpreter is end-to-end differentiable, we can optimize this behaviour directly through gradient descent techniques on user specified objectives, and also integrate the program into any larger neural computation graph. We show empirically that our interpreter is able to effectively leverage different levels of prior program structure and learn complex transduction tasks such as sequence sorting or addition with substantially less data and better generalisation over problem sizes. In addition, we introduce neural program optimisations based on symbolic computation and parallel branching that lead to significant speed improvements.
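
The core trick, as I understand the abstract, is to relax the stack machine's discrete operations into continuous ones so gradients can flow through program execution. Here is a rough TypeScript sketch of just the forward pass of such a "soft" stack, with the pointer kept as a probability distribution over slots; there is no actual autodiff here, and every name is mine rather than the paper's:

    const D = 8; // stack depth

    // The stack pointer is a probability distribution over positions.
    type Dist = number[]; // length D, entries sum to 1

    // Shift the whole distribution up or down one slot (clipping at the ends).
    function shift(p: Dist, by: 1 | -1): Dist {
      const q: Dist = new Array(D).fill(0);
      for (let i = 0; i < D; i++) {
        const j = Math.min(D - 1, Math.max(0, i + by));
        q[j] += p[i];
      }
      return q;
    }

    // Soft write: each cell updates in proportion to the pointer mass on it.
    function push(stack: number[], p: Dist, value: number): Dist {
      const q = shift(p, 1);
      for (let i = 0; i < D; i++) {
        stack[i] = q[i] * value + (1 - q[i]) * stack[i];
      }
      return q;
    }

    // Soft read: the expected value under the pointer distribution.
    function top(stack: number[], p: Dist): number {
      return stack.reduce((acc, v, i) => acc + p[i] * v, 0);
    }

    // With a one-hot pointer this behaves exactly like an ordinary stack.
    let stack: number[] = new Array(D).fill(0);
    let ptr: Dist = [1, 0, 0, 0, 0, 0, 0, 0];
    ptr = push(stack, ptr, 3);
    ptr = push(stack, ptr, 4);
    console.log(top(stack, ptr)); // 4

Everything is built from additions and multiplications, so blurring the pointer turns reads into differentiable mixtures, which is what lets gradient descent fill in the sketched slots.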

Theory of syntax extensions: does it exist?

I've become interested in the question of syntax extensions: not the general theory of extending syntax, nor the mechanisms for extending syntax (there are many, from the C preprocessor to Racket's extended syntax-case), but some basis for deciding what syntaxes should be provided and how they ought to be organized. Most languages have fixed syntax, either because their creators took it for granted or as a matter of principle. Those that do allow syntax extension usually have cultural constraints against overusing it, and the extended syntaxes provided by the standard library are a rag-bag whose members depend on history, the creator's whims, or both. Where syntax libraries are concerned, we seem to be in the same state we were in during the 1960s for procedure libraries.

Is there state of the art about this that I haven't heard about yet? Please inform and/or enlighten me.

For the record...

This Google-Oracle trial (##googacle) is hysterical. Sarah Jeong seems to be the person to follow.

Are there any PL experts/LtU folk serving as expert witnesses?

STEPS Toward the Reinvention of Programming, 2012 Final Report

It appears that the final report from the STEPS program has at long last been released:

STEPS Toward the Reinvention of Programming, 2012 Final Report

If computing is important -- for daily life, learning, business, national defense, jobs, and more -- then qualitatively advancing computing is extremely important. For example, many software systems today are made from millions to hundreds of millions of lines of program code that is too large, complex and fragile to be improved, fixed, or integrated. (One hundred million lines of code at 50 lines per page is 5000 books of 400 pages each! This is beyond human scale.)

What if this could be made literally 1000 times smaller -- or more? And made more powerful, clear, simple, and robust? This would bring one of the most important technologies of our time from a state that is almost out of human reach -- and dangerously close to being out of control -- back into human scale.

An analogy from daily life is to compare the great pyramid of Giza, which is mostly solid bricks piled on top of each other with very little usable space inside, to a structure of similar size made from the same materials, but using the later invention of the arch. The result would be mostly usable space, and require roughly 1/1000 the number of bricks. In other words, as size and complexity increase, architectural design dominates materials.

The "STEPS Toward Expressive Programming Systems" project is taking the familiar world of personal computing used by more than a billion people every day -- currently reqiring hundreds of millions of lines of code to make and sustain -- and substantially recreating it using new programming techniques and "architectures" in dramatically smaller amounts of program code. This i made possible by new advances in design, programming, programming languages, systems organization, and the use of science to analyze and create models of software artifacts.

...and there seems to be new research activity with Alan Kay starting up (HARC), which might be at least inspired by STEPS, if not use it as a jumping-off point.
