LtU Forum

A2: Analog Malicious Hardware

Nothing much to do with language theory but too cool not to share.

While the move to smaller transistors has been a boon for performance, it has dramatically increased the cost to fabricate chips using those smaller transistors. This forces the vast majority of chip design companies to trust a third party -- often overseas -- to fabricate their design. To guard against shipping chips with errors (intentional or otherwise), chip design companies rely on post-fabrication testing. Unfortunately, this type of testing leaves the door open to malicious modifications, since attackers can craft attack triggers requiring a sequence of unlikely events, which will never be encountered by even the most diligent tester.

In this paper, we show how a fabrication-time attacker can leverage analog circuits to create a hardware attack that is small (i.e., requires as little as one gate) and stealthy (i.e., requires an unlikely trigger sequence before affecting a chip's functionality). In the open spaces of an already placed and routed design, we construct a circuit that uses capacitors to siphon charge from nearby wires as they transition between digital values. When the capacitors fully charge, they deploy an attack that forces a victim flip-flop to a desired value. We weaponize this attack into a remotely-controllable privilege escalation by attaching the capacitor to a wire the attacker can control and by selecting a victim flip-flop that holds the privilege bit for our processor. We implement this attack in an OR1200 processor and fabricate a chip. Experimental results show that our attacks work, show that our attacks elude activation by a diverse set of benchmarks, and suggest that our attacks evade known defenses.

A2: Analog Malicious Hardware. K. Yang, M. Hicks, Q. Dong, T. Austin, D. Sylvester, Department of Electrical Engineering and Computer Science, University of Michigan, USA.

Comment by Yonatan Zunger, Head of Infrastructure for the Google Assistant:

"This is the most demonically clever computer security attack I've seen in years. It's a fabrication-time attack: that is, it's an attack which can be performed by someone who has access to the microchip fabrication facility, and it lets them insert a nearly undetectable backdoor into the chips themselves. (If you're wondering who might want to do such a thing, think "state-level actors")

The attack starts with a chip design which has already been routed -- i.e., it's gone from a high-level design in terms of registers and data, to a low-level design in terms of gates and transistors, all the way to a physical layout of how the wires and silicon will be laid out. But instead of adding a chunk of new circuitry (which would take up space), or modifying existing circuitry significantly (which could be detected), it adds nothing more than a single logic gate in a piece of empty space.

When a wire next to this booby-trap gate flips from off to on, the electromagnetic fields it emits add a little bit of charge to a capacitor inside the gate. If it just happens once, that charge bleeds off, and nothing happens. But if that wire is flipped on and off rapidly, it accumulates in the capacitor until it passes a threshold -- at which point it triggers that gate, which flips a target flip-flop (switch) inside the chip from off to on.
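
A toy back-of-the-envelope model of why only rapid toggling trips the trigger (the constants below are invented purely for illustration; they are not from the paper):

    # Toy model of the trigger idea: some charge per wire toggle, leakage each
    # cycle, and a threshold that only rapid toggling can reach.

    CHARGE_PER_TOGGLE = 0.08   # fraction of full charge added per transition (made up)
    LEAK_PER_CYCLE = 0.02      # fraction of charge lost each clock cycle (made up)
    THRESHOLD = 0.9            # trigger fires when charge exceeds this (made up)

    def simulate(toggle_every_n_cycles, cycles=2000):
        charge = 0.0
        for cycle in range(cycles):
            charge = max(0.0, charge - LEAK_PER_CYCLE)         # capacitor leaks
            if cycle % toggle_every_n_cycles == 0:
                charge = min(1.0, charge + CHARGE_PER_TOGGLE)  # nearby wire toggles
            if charge > THRESHOLD:
                return cycle   # trigger fires: the victim flip-flop would be forced
        return None

    print(simulate(toggle_every_n_cycles=1))    # rapid toggling: fires after a few cycles
    print(simulate(toggle_every_n_cycles=10))   # occasional toggling: never fires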

If you pick a wire which normally doesn't flip on and off rapidly, and you target a vulnerable switch -- say, the switch between user and supervisor mode -- then you have a modification to the chip which is too tiny to notice, which is invisible to all known forms of detection, and if you know the correct magic incantation (in software) to flip that wire rapidly, will suddenly give you supervisor-mode access to the chip. (Supervisor mode is the mode the heart of the operating system runs in; in this mode, you have access to all the computer's memory, rather than just to your own application's)

The authors of this paper came up with the idea and built an actual microchip with such a backdoor in it, using the open-source OR1200 chip as their target. I don't know if I want to guess how many three-letter agencies have already had the same idea, or what fraction of chips in the wild already have such a backdoor in them."

Programming with a Differentiable Forth Interpreter

On arxiv. Abstract:

There are families of neural networks that can learn to compute any function, provided sufficient training data. However, given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model. Here we consider the case of prior procedural knowledge such as knowing the overall recursive structure of a sequence transduction program or the fact that a program will likely use arithmetic operations on real numbers to solve a task. To this end we present a differentiable interpreter for the programming language Forth. Through a neural implementation of the dual stack machine that underlies Forth, programmers can write program sketches with slots that can be filled with learnable behaviour. As the program interpreter is end-to-end differentiable, we can optimize this behaviour directly through gradient descent techniques on user specified objectives, and also integrate the program into any larger neural computation graph. We show empirically that our interpreter is able to effectively leverage different levels of prior program structure and learn complex transduction tasks such as sequence sorting or addition with substantially less data and better generalisation over problem sizes. In addition, we introduce neural program optimisations based on symbolic computation and parallel branching that lead to significant speed improvements.
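
The idea of a "slot filled with learnable behaviour" can be caricatured in a few lines: the slot is a mixture over candidate operations, and its weights are trained by gradient descent to match examples. The sketch below is a toy illustration only (softmax weights, finite-difference gradients), not the paper's implementation:

    # Toy "learnable slot": a softmax mixture over candidate stack operations,
    # tuned by (finite-difference) gradient descent to match example behaviour.
    import numpy as np

    CANDIDATES = [lambda a, b: a + b,   # the slot could behave like +
                  lambda a, b: a - b,   # ... or like -
                  lambda a, b: a * b]   # ... or like *

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def slot_output(logits, a, b):
        w = softmax(logits)
        return sum(wi * op(a, b) for wi, op in zip(w, CANDIDATES))

    # Training examples for an addition-like behaviour.
    examples = [(1.0, 2.0, 3.0), (2.0, 5.0, 7.0), (4.0, 3.0, 7.0)]

    def loss(logits):
        return sum((slot_output(logits, a, b) - t) ** 2 for a, b, t in examples)

    logits, eps, lr = np.zeros(3), 1e-4, 0.5
    for _ in range(300):
        grad = np.array([(loss(logits + eps * np.eye(3)[i]) - loss(logits)) / eps
                         for i in range(3)])
        logits -= lr * grad

    print(softmax(logits))  # weight mass should concentrate on the '+' candidate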

Theory of syntax extensions: does it exist?

I've become interested in the question of syntax extensions: not the general theory of extending syntax, nor the mechanisms for extending syntax (there are many, from the C preprocessor to Racket's extended syntax-case), but some basis for deciding what syntaxes should be provided and how they ought to be organized. Most languages have fixed syntax, either because their creators took it for granted or as a matter of principle. Those that do allow syntax extension usually have cultural constraints against overusing it, and the extended syntaxes provided by the standard library are a rag-bag whose members depend on history or the creator's whims or both. We seem to be in the same state where syntax libraries are concerned that we were in during the 1960s for procedure libraries.

Is there state of the art about this that I haven't heard about yet? Please inform and/or enlighten me.

For the record...

This Google-Oracle trial (##googacle) is hysterical. Sarah Jeong seems to be the person to follow.

Are there any PL experts/LTU-folk as expert witnesses?

STEPS Toward the Reinvention of Programming, 2012 Final Report

It appears that the final report from the STEPS program has been recently released at long last:

STEPS Toward the Reinvention of Programming, 2012 Final Report

If computing is important -- for daily life, learning, business, national defense, jobs, and more -- then qualitatively advancing computing is extremely important. For example, many software systems today are made from millions to hundreds of millions of lines of program code that is too large, complex and fragile to be improved, fixed, or integrated. (One hundred million lines of code at 50 lines per page is 5000 books of 400 pages each! This is beyond human scale.)

What if this could be made literally 1000 times smaller -- or more? And made more powerful, clear, simple, and robust? This would bring one of the most important technologies of our time from a state that is almost out of human reach -- and dangerously close to being out of control -- back into human scale.

An analogy from daily life is to compare the Great Pyramid of Giza, which is mostly solid bricks piled on top of each other with very little usable space inside, to a structure of similar size made from the same materials, but using the later invention of the arch. The result would be mostly usable space, and require roughly 1/1000 the number of bricks. In other words, as size and complexity increase, architectural design dominates materials.

The "STEPS Toward Expressive Programming Systems" project is taking the familiar world of personal computing used by more than a billion people every day -- currently reqiring hundreds of millions of lines of code to make and sustain -- and substantially recreating it using new programming techniques and "architectures" in dramatically smaller amounts of program code. This i made possible by new advances in design, programming, programming languages, systems organization, and the use of science to analyze and create models of software artifacts.

...and there seems to be new research activity with Alan Kay starting up (HARC), which might at least be inspired by STEPS (if not use it as a jumping-off point).

A Farewell to FRP in Elm

Making signals unnecessary with The Elm Architecture

...the big benefit is that Elm is now significantly easier to learn and use. As the design of subscriptions emerged, we saw that all the toughest concepts in Elm (signals, addresses, and ports) could collapse into simpler concepts in this new world. Elm is designed for ease-of-use, so I was delighted to stumble upon a path that would take us farther with fewer concepts. To put this in more alarmist terms, everything related to signals has been replaced with something simpler and nicer.
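
The Elm Architecture itself is simple to sketch: a model, a set of messages, and a pure update function that the runtime drives in a loop. Here is a rough caricature in Python (names and details are illustrative, not Elm's actual API):

    # Rough sketch of The Elm Architecture's shape: no signals, just a model,
    # messages, and a pure update function driven by a runtime loop.
    from dataclasses import dataclass

    @dataclass
    class Model:
        count: int = 0

    def update(msg: str, model: Model) -> Model:
        # A pure function from (message, old model) to new model.
        if msg == "Increment":
            return Model(model.count + 1)
        if msg == "Decrement":
            return Model(model.count - 1)
        return model

    def view(model: Model) -> str:
        return f"count = {model.count}"

    # The runtime feeds messages (from clicks, subscriptions, ...) into update.
    model = Model()
    for msg in ["Increment", "Increment", "Decrement"]:
        model = update(msg, model)
        print(view(model))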

Algebra Of Pointers

I have been thinking a little about pointers, and what kind of mathematical structure they form. They seem to be like a one-dimensional vector (as opposed to a simple scalar). If we use the terms location for a point in n-dimensional space, and distance for a vector, then we get the following properties: adding a distance to a location results in a location; subtracting a distance from a location results in a location; subtracting a location from a location results in a distance.

Where it gets more tricky is that adding a location to a location would seem to be invalid (I don't know what type of thing this is), and yet averaging two locations to get the mid-point seems perfectly reasonable. Topologically, what is going on here? What is the type of a location added to a location?

If we treat all locations as distances with an origin at zero, then everything makes sense again in terms of types, except for zero itself which seems to still be a location and not a distance (or it all gets horribly self-referential).
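
The properties listed above can be written down directly as two types. Here is a minimal sketch (hypothetical Location and Distance classes, written in Python for illustration) in which the midpoint is expressible without ever adding two locations:

    # Sketch of the location/distance ("point/vector") typing described above.
    # Location + Distance -> Location; Location - Location -> Distance;
    # Location + Location is deliberately not defined.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Distance:
        d: int
        def __add__(self, other: "Distance") -> "Distance":
            return Distance(self.d + other.d)
        def scale(self, k: float) -> "Distance":
            return Distance(int(self.d * k))

    @dataclass(frozen=True)
    class Location:
        addr: int
        def __add__(self, dist: Distance) -> "Location":
            return Location(self.addr + dist.d)
        def __sub__(self, other: "Location") -> Distance:
            return Distance(self.addr - other.addr)

    a, b = Location(1000), Location(2000)
    offset = b - a                    # Distance(1000)
    mid = a + (b - a).scale(0.5)      # midpoint without ever adding two Locations
    print(offset, mid)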

Any thoughts on how to make sense of this, what the type of the sum of two locations might be, etc?

Session Types in a Linearly Typed Multi-Threaded Lambda-Calculus

The paper Session Types in a Linearly Typed Multi-Threaded Lambda-Calculus, Hongwei Xi, Zhiqiang Ren, Hanwen Wu, William Blair, 2016.

We present a formalization of session types in a multi-threaded lambda-calculus (MTLC) equipped with a linear type system, establishing for the MTLC both type preservation and global progress. The latter (global progress) implies that the evaluation of a well-typed program in the MTLC can never reach a deadlock. As this formulated MTLC can be readily embedded into ATS, a full-fledged language with a functional programming core that supports both dependent types (of DML-style) and linear types, we obtain a direct implementation of session types in ATS. In addition, we gain immediate support for a form of dependent session types based on this embedding into ATS. Compared to various existing formalizations of session types, we see that the one given in this paper is unique in its closeness to concrete implementation. In particular, we report such an implementation ready for practical use that generates Erlang code from well-typed ATS source (making use of session types), thus taking great advantage of the infrastructural support for distributed computing in Erlang.

Programming by poking: why MIT stopped teaching SICP

A blog post; excerpt:

In this talk at the NYC Lisp meetup, Gerry Sussman was asked why MIT stopped teaching the legendary 6.001 course, which was based on Sussman and Abelson's classic text The Structure and Interpretation of Computer Programs (SICP). Sussman's answer was that (1) he and Hal Abelson got tired of teaching it (having done it since the 1980s). So in 1997, they walked into the department head's office and said: "We quit. Figure out what to do." And more importantly, (2) they felt that the SICP curriculum no longer prepared engineers for what engineering is like today. Sussman said that in the 80s and 90s, engineers built complex systems by combining simple and well-understood parts. The goal of SICP was to provide the abstraction language for reasoning about such systems.

Today, this is no longer the case. Sussman pointed out that engineers now routinely write code for complicated hardware that they don’t fully understand (and often can’t understand because of trade secrecy.) The same is true at the software level, since programming environments consist of gigantic libraries with enormous functionality. According to Sussman, his students spend most of their time reading manuals for these libraries to figure out how to stitch them together to get a job done. He said that programming today is “More like science. You grab this piece of library and you poke at it. You write programs that poke it and see what it does. And you say, ‘Can I tweak it to do the thing I want?'”. The “analysis-by-synthesis” view of SICP — where you build a larger system out of smaller, simple parts — became irrelevant. Nowadays, we do programming by poking.

Also, see the Hacker News thread. I thought this comment was useful:

What should we consider fundamental?

A fair question, and a full answer would be too long for a comment (though it would fit in a blog post, which I'll go ahead and write now since this seems to be an issue). But I'll take a whack at the TL;DR version here.

AI, ML, NLP, and web design are application areas, not fundamentals. (You didn't list computer graphics, computer vision, robotics, embedded systems -- all applications, not fundamentals.) You can cover all the set theory and graph theory you need in a day. Most people get this in high school. The stuff you need is just not that complicated. You can safely skip category theory. What you do need is some amount of time spent on the idea that computer programs are mathematical objects which can be reasoned about mathematically. This is the part that the vast majority of people are missing nowadays, and it can be a little tricky to wrap your brain around at first. You need to understand what a fixed point is and why it matters. You need automata theory, but again, the basics are really not that complicated. You need to know about Turing-completeness, and that in addition to Turing machines there are PDAs and FSAs. You need to know that TMs can do things that PDAs can't (like parse context-sensitive grammars), and that PDAs can do things that FSAs can't (like parse context-free grammars), and that FSAs can parse regular expressions, and that's all they can do.
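
As a small illustration of that hierarchy: a finite automaton is enough for a regular language such as (ab)*, while balanced parentheses need the unbounded counter/stack a PDA provides. Toy recognizers, written here in Python purely for illustration:

    # A finite automaton suffices for the regular language (ab)*, but balanced
    # parentheses need unbounded memory (a counter standing in for a pushdown
    # stack), which is why a PDA can recognize them and an FSA cannot.

    def accepts_ab_star(s: str) -> bool:
        table = {("start", "a"): "saw_a", ("saw_a", "b"): "start"}  # DFA transitions
        state = "start"
        for ch in s:
            state = table.get((state, ch))
            if state is None:
                return False
        return state == "start"

    def balanced_parens(s: str) -> bool:
        depth = 0                      # stands in for a pushdown stack
        for ch in s:
            depth += 1 if ch == "(" else -1 if ch == ")" else 0
            if depth < 0:
                return False
        return depth == 0

    print(accepts_ab_star("ababab"), accepts_ab_star("abba"))   # True False
    print(balanced_parens("(()())"), balanced_parens("(()"))    # True False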

You need some programming language theory. You need to know what a binding is, and that there are two types of bindings that matter: lexical and dynamic. You need to know what an environment is (a mapping between names and bindings) and how environments are chained. You need to know how evaluation and compilation are related, and the role that environments play in both processes. You need to know the difference between static and dynamic typing. You need to know how to compile a high-level language down to an RTL.

For operating systems, you need to know what a process is, what a thread is, some of the ways in which parallel processes lead to problems, and some of the mechanisms for dealing with those problems, including the fact that some of those mechanisms require hardware support (e.g. atomic test-and-set instructions).

You need a few basic data structures. Mainly what you need is to understand that what data structures are really all about is making associative maps that are more efficient for certain operations under certain circumstances.
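
As a concrete illustration of the environment point above (a mapping between names and bindings, chained to enclosing scopes), here is a minimal, hypothetical sketch:

    # Minimal sketch of chained environments: a name is looked up in the innermost
    # frame first, then in the enclosing (parent) frames, the basis of lexical scope.
    class Env:
        def __init__(self, bindings, parent=None):
            self.bindings = dict(bindings)
            self.parent = parent

        def lookup(self, name):
            env = self
            while env is not None:
                if name in env.bindings:
                    return env.bindings[name]
                env = env.parent          # walk the chain of enclosing scopes
            raise NameError(name)

    globals_env = Env({"x": 1, "y": 2})
    locals_env = Env({"x": 10}, parent=globals_env)   # inner scope shadows x

    print(locals_env.lookup("x"))  # 10 (found in the inner frame)
    print(locals_env.lookup("y"))  # 2  (found by following the chain)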

You need a little bit of database knowledge. Again, what you really need to know is that what databases are really all about is dealing with the architectural differences between RAM and non-volatile storage, and that a lot of these are going away now that these architectural differences are starting to disappear. That's really about it.

Was there a language with an explicit call stack?

When writing a runtime system or an interpreter in C (though TBH it's true of all programming languages that I know of) with e.g. moving garbage collection or support for continuations, you find that, surprisingly, the call stack is too abstracted away by the semantics, and you basically have to roll a separate one yourself (without the benefits of hardware support) or resort to things like getcontext(3) or GCC's split stacks, which technically are extensions of the semantics that need their own definition. LLVM tries to support precise GC, but falls short of real-world uses. Another facet of this problem is the unpredictability of stack overflows with alloca(3), C99 variable-length arrays, or just plain old recursion. (I have to count my recursive calls so that my recursive-descent parser doesn't crash on malicious input? Really?)

This motivates the question: was there a language that officially permitted or even required managing one’s own call stack, inspecting stack frames and the like? Note that this doesn’t mean not having some sort of subroutine linkage, but could mean banning recursion unless the user explicitly allocates space for it, so that the implicit stack is statically guaranteed to be bounded. (FORTRAN didn’t have recursion, of course, but I doubt it was for this reason.) Things I’d want to (be able to) implement in such a language are, again, a moving GC, closures, coroutines, and continuations. I’d expect the language to be reasonably ALGOL- or ML-like so that FORTH and its kin are excluded.

On second thought, it looks like the requirements of moving GC are orthogonal to those of continuations and well-defined stack overflow behaviour, but closures seem to be somewhere in between, so I'm leaving it as one question with possibly multiple valid answers.
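
For concreteness, here is a rough sketch (in Python, for brevity) of the kind of explicit, first-class call stack the post asks about: frames are ordinary data, so they can be inspected, bounded, copied for continuations, or walked and relocated by a moving GC. This is only an illustration of the idea, not a feature of any existing language:

    # Explicit call stack: frames are plain data instead of native recursion,
    # so overflow is a predictable, catchable condition and frames are inspectable.

    class StackOverflow(Exception):
        pass

    def run_factorial(n, max_frames=1000):
        stack = [{"fn": "fact", "arg": n}]   # explicit frames
        result = None
        while stack:
            if len(stack) > max_frames:
                raise StackOverflow(f"would exceed {max_frames} frames")
            frame = stack[-1]
            if frame["fn"] == "fact":
                if frame["arg"] <= 1:
                    result = 1
                    stack.pop()
                else:
                    # "call": push a callee frame; the multiply happens on return
                    frame["fn"] = "mul_after_return"
                    stack.append({"fn": "fact", "arg": frame["arg"] - 1})
            elif frame["fn"] == "mul_after_return":
                result = frame["arg"] * result
                stack.pop()
        return result

    print(run_factorial(10))   # 3628800
    # run_factorial(10**6, max_frames=100) raises StackOverflow predictably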
