## Securing the .NET Programming Model

Securing the .NET Programming Model. Andrew J. Kennedy.

The security of the .NET programming model is studied from the standpoint of fully abstract compilation of C#. A number of failures of full abstraction are identified, and fixes described. The most serious problems have recently been fixed for version 2.0 of the .NET Common Language Runtime.

This is highly amusing stuff, of course. Some choice quotes:

> if source-language compilation is not fully abstract, then there exist contexts (think 'attackers') in the target language that can observably distinguish two program fragments not distinguishable by source contexts. Such abstraction holes can sometimes be turned into security holes: if the author of a library has reasoned about the behaviour of his code by considering only source-level contexts (i.e. other components written in the same source language), then it may be possible to construct a component in the target language which provokes unexpected and damaging behaviour.

> One could argue that full abstraction is just a nicety; programmers don't really reason about observations, program contexts, and all that, do they? Well, actually, I would like to argue that they do. At least, expert programmers...

> A C# programmer can reason about the security properties of component A by considering the behaviour of another component B written in C# that "attacks" A through its public API.

This can only be achieved if compilation is fully abstract.

To see the six problems identified by thinking about full abstraction you'll have to go read the paper...

## Comment viewing options

### The C in CLR

The CLR is supposed to support other languages besides C#. Thus the compilation to the CLR Intermediate Language (IL) should be fully abstract for each and every language compiled.

### Full abstraction in multi-language runtimes

Full abstraction would be a difficult feat to achieve in a multi-language runtime. For example, languages in which the parametricity theorem holds require an additional set of abstraction properties. What about a language with abstraction properties based on dependent types? Proof carrying code? Many features unique to certain languages would require additional CLR-level support to maintain full abstraction, so much that this doesn't seem a reasonable property to expect in a multi-language runtime.

Of course, since C# is the canonical language of the CLR, it makes sense to tighten the system to the point where full abstraction does hold for C#, without expecting this of other languages implemented on top of the CLR.

For example, if you were going to build a serious proof-carrying code framework to implement further degrees of security, you would expect the proofs to be checked and the abstraction properties to hold at runtime in some higher-level translator from your language to CLR, rather than expecting them to actually hold in CLR.

Thus CLR may eventually prove "too tight" for high-performance languages that employ proofs and more advanced type systems, since the source language can often eliminate array out-of-bounds checks, null pointer checks, and implicitly thread-parallelize code, while the CLR itself cannot and thus incurs runtime penalties. But this isn't so much a complaint about CLR (which seems to be exactly what it should be), but a statement that it's not the last language runtime the world will ever need.

### Multi-language holes?

> Of course, since C# is the canonical language of the CLR, it makes sense to tighten the system to the point where full abstraction does hold for C#, without expecting this of other languages implemented on top of the CLR.

Would that not mean that components written in other languages could provide "holes", allowing attacks against C# components? (Assuming that the other languages do not provide stronger guarantees than C#.)

### I think you've got it backwards

> Would that not mean that components written in other languages could provide "holes", allowing attacks against C# components? (Assuming that the other languages do not provide stronger guarantees than C#.)

Guarantees made by the high-level language are irrelevant. The only guarantees that count are the ones honoured by the CLR. This is the source of the problem.

For instance, if the semantics of C# state that an object is immutable, a programmer might assume that it is safe to pass it to untrusted code. If the CLR does not honour this, there is a security hole.
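A minimal sketch of the reasoning at stake (the class name and API are hypothetical, not from the paper). At the C# level, no context can change `Limit` after construction, so a C# author might conclude the object is safe to share:

```csharp
using System;

// Hypothetical example: immutable *according to C# semantics* --
// the property has no setter, so no C# source-level context can write it.
public sealed class Quota
{
    public int Limit { get; }   // read-only at the C# level

    public Quota(int limit)
    {
        if (limit < 0) throw new ArgumentOutOfRangeException(nameof(limit));
        Limit = limit;
    }
}
```

That safety argument is sound only if IL-level contexts are equally unable to write the backing field; full abstraction is precisely the guarantee that they are.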

"Full abstraction holds" is another way of saying "guarantees made by the semantics of the high-level language are always honoured at runtime", which is the property needed to write secure code regardless of the attacker's choice of language.

### You're right.

So a language intrinsically weaker than C# could not be implemented as-is on the CLR---it would get the semantics of the stronger guarantees. For example, a component in a language with only mutable objects would get an error when it tried to modify an immutable object passed to it.

On the other hand, a language making stronger guarantees than C# could not allow trust in any components written in C# or weaker languages, since the CLR will not guarantee the stronger semantics.

"Full abstraction" for all languages supported by the CLR seems to be a hard problem. And "Common" seems a little off, too, in this scenario.

### Yup

At least, that's what my understanding is.

### Definition

> "Full abstraction holds" is another way of saying "guarantees made by the semantics of the high-level language are always honoured at runtime", which is the property needed to write secure code regardless of the attacker's choice of language.

Thanks, that's the best definition I've found of full abstraction so far.

### Can we do better than bug-squishing?

This is news to me! Wasn't .NET supposed to be the panacea for all our untrusted vs trusted code headaches? Did this dream die?

[Apologies to those for whom this is old news. Please indulge my ignorance.]

With 20/20 hindsight it's easy to spot the flaw. The permissions afforded by the CLR must, by necessity, be a superset of the permissions required by each high-level .NET language. Thus, code written in, say, C# can be interacted with in ways forbidden by the semantics of C#, and unspeakable mayhem follows.

In his paper, Kennedy finds and fixes six failures of full abstraction in C#, as if they were bugs to be squished. It strikes me that, as bugs go, failures of abstraction must be elusive, and the fixes kludgy. Is there a better way? Are our PL Theory tools up to the task?

### Holes

> Would that not mean that components written in other languages could provide "holes", allowing attacks against C# components? (Assuming that the other languages do not provide stronger guarantees than C#.)

I think that could be avoided completely with small refinements to the CLR. For example, if the CLR supported a genuine bool type and enforced that it only contained true and false (and not other byte values like 67), then other languages could interoperate with C# without breaking the abstraction property in C#.

> This is news to me! Wasn't .NET supposed to be the panacea for all our untrusted vs trusted code headaches? Did this dream die?

The behaviours that Kennedy et al. have pointed out are not security holes in the traditional sense, e.g. the kind Internet Explorer is littered with. These findings offer no way to overwrite random memory, corrupt the stack, hijack a computer, etc.

Rather, they just show that certain ideal properties of the C# language don't hold in the CLR, and suggest ways to improve this. For example, there exist bool values `v` where `v != true && v != false`.
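As a rough illustration of the bool anomaly (requires compiling with `/unsafe`; the exact observable behaviour depends on the compiler's code generation and the runtime version, so treat this as a sketch rather than a guaranteed repro):

```csharp
using System;

class BoolHole
{
    static unsafe void Main()
    {
        byte raw = 2;                // any value other than 0 or 1
        bool v = *(bool*)&raw;       // reinterpret the byte as a bool
        bool t = true;

        // v behaves as "true" in a branch (non-zero), yet the equality
        // below compiles to an integer compare of 2 against 1, so the
        // two "true" values can compare unequal.
        Console.WriteLine(v);        // typically prints True
        Console.WriteLine(v == t);   // may print False on runtimes that don't normalize bools
    }
}
```

The fix suggested in the discussion, having the CLR enforce that a bool holds only 0 or 1, closes exactly this gap.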

I only wish the rest of the industry (and of Microsoft!) were as open and respectable about identifying limitations and potential security issues and working publicly to study and address them.

### Holes

> I only wish the rest of the industry (and of Microsoft!) were as open and respectable about identifying limitations and potential security issues and working publicly to study and address them.

Hear, hear!

> The behaviours that Kennedy et al. have pointed out are not security holes in the traditional sense, e.g. the kind Internet Explorer is littered with. These findings offer no way to overwrite random memory, corrupt the stack, hijack a computer, etc.

Ah... I don't entirely agree with you on that point. You're not comparing like with like.

The .NET framework is supposed to allow us to run untrusted code in the same runtime as trusted code. If I were running untrusted code in the same runtime as Internet Explorer, flaws in IE would be the least of my security concerns!

The point is that a malicious agent can cause unexpected behaviour in security critical code. He has a foot in the door. Whether or not the attacker can escalate this into a full-scale compromise is a matter of luck and the attacker's skill.

Indeed, from a certain perspective, this is worse than an IE buffer overflow. Buffer overflows happen because of sloppy coding, and we might reasonably expect that a more careful, security conscious programmer would not make the same mistakes (for some value of "reasonably"). We are, however, demanding much more of the programmer if we expect him to anticipate attacks that arise from the failure of the CLR to respect the semantics of his programming language.

### Unexpected behavior

> The point is that a malicious agent can cause unexpected behaviour in security critical code. He has a foot in the door. Whether or not the attacker can escalate this into a full-scale compromise is a matter of luck and the attacker's skill.

This opens a wonderful can of worms!

Given `bool b`, should C#'s type system assure that `(b != true && b != false)` is never true? Should it assure that, given `int a, b` with `a > 0` and `b > 0`, `a + b > 0`? That given `object a = 3, b = 3`, `a == b`? None of those properties hold in C#, due to various historical accidents that most programmers educated in this decade don't intuitively understand.
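The latter two surprises, unlike the bool one, are fully within C#'s specified semantics and can be observed in a few lines (unchecked arithmetic is C#'s default; it is made explicit here):

```csharp
using System;

class Accidents
{
    static void Main()
    {
        // Two positive ints whose sum is negative: two's-complement wrap-around.
        int a = int.MaxValue, b = 1;
        Console.WriteLine(unchecked(a + b) > 0);   // False: the sum wraps to int.MinValue

        // Two boxed 3s: == on object compares references, not values.
        object x = 3, y = 3;
        Console.WriteLine(x == y);                 // False: two distinct boxes on the heap
    }
}
```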

### Let's eat worms

Again, I don't think you are making a fair comparison.

The latter examples are instances of programmer confusion within the semantics of the language. One could make a case that this is a flaw in the design of the language (why do we encourage programmers to think of an int as if it was an integer?) but the behaviour can be anticipated just by consulting the standard, and language designs can be improved.

The example of `(b != true && b != false)` evaluating to true is a different can of worms entirely. The standard has been violated. It's like asking an engineer to deal with violations of the Laws of Physics. There's no point in fixing the standard if the standard isn't honoured!

### You're right

You're right. `(b != true && b != false)` is an example of weird, semantically incorrect behaviour, while given `int a > 0, b > 0`, having `a + b` less than zero is weird but semantically correct behaviour.

### Um...

... have we reached a consensus here? :-)