On the (Im)possibility of Obfuscating Programs

I hope this sort of thing is welcome on LtU.

Informally, an obfuscator O is an (efficient, probabilistic) ``compiler'' that takes as input a program (or circuit) P and produces a new program O(P) that has the same functionality as P yet is ``unreadable'' in some sense. Obfuscators, if they exist, would have a wide variety of cryptographic and complexity-theoretic applications, ranging from software protection to homomorphic encryption to complexity-theoretic analogues of Rice's theorem. Most of these applications are based on an interpretation of the ``unreadability'' condition in obfuscation as meaning that O(P) is a ``virtual black box,'' in the sense that anything one can efficiently compute given O(P), one could also efficiently compute given oracle access to P.

In this work, we initiate a theoretical investigation of obfuscation. Our main result is that, even under very weak formalizations of the above intuition, obfuscation is impossible. We prove this by constructing a family of functions F that are inherently unobfuscatable in the following sense: there is a predicate pi such that (a) given any program that computes a function f in F, the value pi(f) can be efficiently computed, yet (b) given oracle access to a (randomly selected) function f in F, no efficient algorithm can compute pi(f) much better than random guessing.

We extend our impossibility result in a number of ways, including even obfuscators that (a) are not necessarily computable in polynomial time, (b) only approximately preserve the functionality, and (c) only need to work for very restricted models of computation (TC0). We also rule out several potential applications of obfuscators, by constructing ``unobfuscatable'' signature schemes, encryption schemes, and pseudorandom function families.


Other forms of obfuscation...

Quite some time ago I had a discussion with two of my professors regarding distribution of 'black box code' to remote machines, the idea being that one transforms both the algorithm A and the problems P1..PN such that:

A' = G(A)
Pk' = H(G,A,Pk)
A' distributed
problems (1,P1')..(N,PN') distributed
Rk' = A'(Pk') computed remotely
(k,Rk') returned
R = H'(G,A,Rk')

The idea was to obfuscate the intent of these remote operations. The problem my professors were encountering was, apparently, computing a powerful enough obfuscator G while keeping H and H' computable and efficient enough that the effort was worth the cost of sending the problems over the wire. G could be created once (possibly using a key), and both G and A could be curried into H and H'.
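
For concreteness, here is a toy sketch of that data flow in Python. The names mirror the scheme above, but the "obfuscation" is just a multiplicative mask (and G is the identity, which is precisely the hard part they were stuck on), so this shows only the plumbing, not any security:

  import random

  def make_scheme():
      k = random.randrange(2, 1000)     # sender's secret, curried into H and H'

      def A_prime(p):                   # A' = G(A) for A(x) = x*x; G is identity here
          return p * p

      def H(p):                         # Pk' = H(Pk): mask the problem before shipping
          return k * p

      def H_prime(r):                   # R = H'(Rk'): unmask the remote result
          return r // (k * k)           # since A'(k*p) = k^2 * p^2

      return A_prime, H, H_prime

  A_prime, H, H_prime = make_scheme()
  problems = [3, 7, 12]
  masked = [H(p) for p in problems]             # the Pk' distributed to the grid
  remote = [A_prime(m) for m in masked]         # the Rk' computed remotely
  assert [H_prime(r) for r in remote] == [p * p for p in problems]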

I never did learn what became of their labors, though I don't believe the basic idea behind it was original research (IIRC, they were focusing on particular algorithms, their strength, and their efficiency).

Anyhow, that should count as obfuscating programs, at least I would imagine so. But it does require two parties.

If the point is a

If the point is a theoretical proof, I'm wondering where the surprise is -- I thought there were already proofs showing that steganography is (in a formal setting) impossible. Adding in the restrictions related to computation, in this sense, is a red herring.

An interesting take on this is the machine learning space (think search engine algorithms or spam filters): can you make practical algorithms that hide their nature? A more game-theoretic or statistical position seems important for algorithms you actually deploy. Tying it into LtU, a far-reaching question is: just as with current trends toward crypto protocol languages, could we get similar machine learning building blocks and transforms verifiably or 'for free'?

The value of steganography

The value of steganography and other crypto-protocols is in raising the expected 'cost' to properly identify and interpret a piece of information, not in making it 'impossible' to guess the proper interpretation. Expected cost can be formally analyzed, and I have not been made aware of any arguments indicating that steganography is 'impossible' in this sense.

Anyhow, the algorithm above falls under the categorization of encryption, not steganography. Only A' and the Pk' are known to the recipient. Only the sender holds G, H, and H', which are created on the fly, likely using random components. The recipient (e.g. a grid of concurrent processors owned by someone you don't trust) likely knows that it is processing encrypted problems. The idea is to keep which problems are being solved (along with any sensitive data held in those problems) out of the hands of the owners of the grid.

It is worth noting that there is not necessarily any mechanism to translate the Pk' back to Pk. That can be a one-way transform. One only needs to properly translate back the result, which is why it is H' rather than H^-1.

Tying it into LtU [...] could we get similar machine learning building blocks and transforms verifiably or 'for free'

Well, I think it is possible to create systems that become progressively better at making predictions on future and forgotten past observations based on processing past observations... and predictions are verifiable. But I don't think that's what you're asking about.

Can you explain a bit more about what you're asking?

Many things are impossible,

Many things are impossible, but, in practice, we solve them anyway via statistical and approximation approaches. Obfuscating computations might be one such thing. However, there is a new trend in machine learning asking whether it is possible to let users use an algorithm (e.g., a spam filter) but not learn how it works (e.g., so they can't write significantly better spam). The question for LtU is whether we can take a combinator-like approach to this, or perhaps some program transformation -- for example, crypto protocol building blocks or the 'faulty lambda calculus', respectively.

The OP notes that

The OP notes that obfuscators are difficult in the general case, but only if you aim to achieve equal performance. So you sacrifice performance -- e.g. by writing the algorithm in a language (perhaps an intermediate language) designed to support both return values and permutation-selected safe, correct, one-way obfuscating transforms.

But, unlike the system I described above, users would have access to an unencrypted and unrestricted 'problem' set and the ability to observe the results. Given these features, they can learn the algorithm's strengths and weaknesses by simple probing... and by use of their own machine-learning tools.

So you can make learning 'how' the spam filter works much more expensive, but you can't prevent learning 'what' the spam filter does well and does poorly. Unfortunately for the spam algorithm, it is the latter set of facts that is more relevant to writing better spam.

Request Signing Programs

The project that led me to this problem was client verification. How do we ensure a user must work with a verified client to access a network service?

We can embed keys in the clients and obfuscate the resulting binary; as an additional measure, we can expire the client's key every hour or day or however long.

I suppose this could even work with JavaScript source code -- it'd be cool if you could do that -- but I'm not sure.

I am particularly interested in reliable time-stamping; I'm just not sure if that can be done. A user can alter their operating system's clock or the hardware clock; I can't think of any way to counter the latter.

We might be able to perform reliable counting in the obfuscated binary; if the binary is restarted and loses its counter, we rely on a matching counter on the server to reject duplicates (protecting against replay attacks) and force the download of a new client with a new key.

Consider Capability Security

The project that led me to this problem was client verification. How do we ensure a user must work with a verified client to access a network service?

I understand there are reasons to do this, especially in distributed gaming. An excellent book on how to secure online games (and similar competitive systems) is Policing Online Games [amazon.com]. The goal is to prevent various forms of 'cheating' while still supporting the operational efficiency required for clients.

But, for almost every other situation I can imagine, the best approach is capability security. Attempting to lock down services so they can only be run through a particular client smells of expending effort, processing power, etc. entirely for the purpose of making one's resulting product less useful and more fragile.

I am particularly interested in reliable time-stamping; I'm just not sure if that can be done.

If you seriously need time-stamping for an untrusted single party, consider providing a trusted external time-notary service with a public key that will take a string, add a time description, secure-hash the result, sign the hash, and return the trio (time, value, signed hash). This will allow users to notarize the time at which a value was constructed, in a manner that proves the value was produced at that time or earlier. Users may send either a string representation of the value or a secure hash of the value.


 TNS = Time Notary Server
 User:
  Constructs value V (e.g. answer to homework problem)
  HV = HASH(V)                    ; secure hash of V
  TReply = (TNS <- HV)            ; send HV to the notary
 TNS:
  receives HV as X
  TM = current_time()
  HX = HASH(TM + ":" + X)
  CHX = SIGN(HX, Private Key)
  reply (TM + ":" + X + ":" + CHX)
 User:
  TReply now available
  sends (ID, V, TReply) to instructor
 Instructor:
  receives (ID, V, TReply)
  (TM, X, CHX) = fields of TReply split at colons
  if (not equal(X, HASH(V))
      or overdue(TM)
      or invalid_signature(TReply))
    reject (ID, V)
  ...
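
A runnable sketch of the same flow in Python, using an HMAC under a notary-held secret as a stand-in for the public-key signature (with a real signature scheme, the instructor would verify against the notary's public key rather than sharing the secret; all names here are illustrative):

  import hashlib, hmac, time

  NOTARY_SECRET = b'stand-in for the notary private key'

  def notarize(hv: str) -> str:
      """TNS: bind a submitted hash to the current time and sign the pair."""
      tm = str(int(time.time()))
      tag = hmac.new(NOTARY_SECRET, f"{tm}:{hv}".encode(), hashlib.sha256).hexdigest()
      return f"{tm}:{hv}:{tag}"

  def verify(treply: str, value: bytes, deadline: float) -> bool:
      """Instructor: check the hash match, the deadline, and the signature."""
      tm, hv, tag = treply.split(":")
      expected = hmac.new(NOTARY_SECRET, f"{tm}:{hv}".encode(), hashlib.sha256).hexdigest()
      return (hv == hashlib.sha256(value).hexdigest()
              and float(tm) <= deadline
              and hmac.compare_digest(tag, expected))

  homework = b"my answer"
  treply = notarize(hashlib.sha256(homework).hexdigest())
  assert verify(treply, homework, deadline=time.time() + 1)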

Time-stamping with two or more parties involved is a bit easier. Simply have both of them put what they think to be the date for the signature on a document, and allow them to decide together when a document goes into effect. The resulting product is easy to audit.

We might be able to perform reliable counting in the obfuscated binary; if the binary is restarted and loses its counter, we rely on a matching counter on the server to reject duplicates (protecting against replay attacks) and force the download of a new client with a new key.

Secure connections, with resistance to replay, can easily be achieved by Diffie-Hellman algorithms and simple session identifiers unrelated to the quality of the client binary. Tying the security of the client to the security of a network connection is a mistake: the extra coupling and complexity will do you no good and potential harm. Solve these problems in isolation.
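
For instance, a minimal sketch of the session-identifier half (illustrative names; a real system would derive the session secret via the Diffie-Hellman exchange and authenticate messages with it):

  import secrets

  class Session:
      """Server-side replay guard: each session gets a fresh random id,
      and messages must carry strictly increasing sequence numbers."""
      def __init__(self):
          self.session_id = secrets.token_hex(16)
          self.last_seq = -1

      def accept(self, session_id: str, seq: int) -> bool:
          if session_id != self.session_id or seq <= self.last_seq:
              return False          # wrong session, or a replayed/stale message
          self.last_seq = seq
          return True

  s = Session()
  assert s.accept(s.session_id, 0)
  assert not s.accept(s.session_id, 0)    # replaying the same message fails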

I am interested in client verification. Really.

Attempting to lock down services so they can only be run through a particular client smells of expending effort, processing power, etc. entirely for the purpose of making one's resulting product less useful and more fragile.

Really, it was not my first choice. As long as I can get the times, I am otherwise okay with a third party client. I posted (and read) the article because I am interested in client verification -- not whether it is a good idea or not.

I am particularly interested in reliable time-stamping; I'm just not sure if that can be done.
If you seriously need time-stamping for an untrusted single party, consider providing a trusted external time-notary service with a public key that will take a string, add a time description, secure-hash the result, sign the hash, and return the trio (time, value, signed hash). This will allow users to notarize the time at which a value was constructed, in a manner that proves the value was produced at that time or earlier. Users may send either a string representation of the value or a secure hash of the value.

The round trip time on that will be too long for my application. I want a time within 10ms of when a user clicked.

Why? My application is a curious one. I would like a lag-levelled interface for online games. I would like to be able to say: "all users will experience 400ms lag". Not one ms less and not one more. That is relatively easy to do: clients with round trip time under 400ms (less some processing time) timestamp their message with a time difference from some reference time (most recent log-in or some such) and send it. Clients with round trip times greater than our limit simply have their requests rejected and are thrown off the server (thus, we set a high limit).

As long as the client's hardware clock is not being put in and out of the refrigerator, we can expect accurate relative times from the client. On the server, we hold each request until enough time has passed and then act on it. Voila -- fair lag. We see that a network hop in between the client and their timestamp would introduce substantial variance in the felt lag.
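
As a concrete sketch of the server side (Python; the names and the 400ms figure are illustrative, with game time in integer milliseconds):

  import heapq

  NOMINAL_LAG_MS = 400   # every request takes effect exactly this long after its stamp

  class LagLeveler:
      """Hold each timestamped request until stamp + NOMINAL_LAG_MS of game time."""
      def __init__(self):
          self.pending = []    # min-heap of (effective_time_ms, request)

      def submit(self, stamp_ms: int, request: str):
          heapq.heappush(self.pending, (stamp_ms + NOMINAL_LAG_MS, request))

      def due(self, game_now_ms: int):
          """Pop every request whose effective time has arrived."""
          out = []
          while self.pending and self.pending[0][0] <= game_now_ms:
              out.append(heapq.heappop(self.pending)[1])
          return out

  lv = LagLeveler()
  lv.submit(10_000, "sell assets")            # stamped at game time 10,000 ms
  assert lv.due(10_399) == []                 # not one ms less...
  assert lv.due(10_400) == ["sell assets"]    # ...and not one more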

Now that we have got the fair lag, we can make all kinds of fair games. Kids in SoCal will not have an advantage over kids in South America when it comes to our game with guaranteed 400ms lag -- perhaps a real-time strategy game (Age of Empires showed us that even half a second is quite playable in certain games).

However, there is a vulnerability here. Those kids in SoCal -- who, believe it or not, are willing to pay for EVE characters and WoW money -- will create a market for hack tools to take advantage of their better network links. With a round trip time of 180ms, a client can pre-date their request a substantial amount -- 120ms or more. We can not catch this pre-dating in all cases. Especially if a client consistently pre-dates, we have no way of distinguishing them from someone on a laggy link.

The cheat that results is simple to model. We have a game called "Market Mayhem". Periodically, we enter a "financial crisis" in which players must select and sell off assets as fast as they can. The pre-dating client is able to act on what is effectively future information; they consistently win by selling before everyone else. By creating the illusion of fairness, we have made lag the basis for a fantastic exploit.

All we really need to trust the client for is the timestamp. For that, it is enough if we have a little obfuscated daemon that accepts requests from the real user interface to stamp and sign them and send them off. The daemon has a key it uses to sign so we know the daemon did it. We obfuscate the daemon to hide the key. We expire the key and replace the daemon every hour or day with a new obfuscated daemon.

I think that, with the addition of heartbeats, we can protect the daemon from clock spoofing (even by resetting the hardware clock). If we demand heartbeats from the daemon with a certain low variance, then a substantial pre-dating will be easy to detect. The server shoots down the daemon, invalidating its key and sending a new daemon to use. This daemon's relative timer is normed against the shifted clock and the adversary is back to square one.
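
A sketch of what that server-side check might look like (Python; the interval and tolerance are illustrative assumptions):

  import time

  HEARTBEAT_INTERVAL = 0.250   # the daemon must beat every 250ms
  TOLERANCE = 0.010            # allowed jitter before we suspect clock games

  class HeartbeatMonitor:
      """Treat a large swing in heartbeat inter-arrival time as evidence of
      clock manipulation: the server shoots down the daemon's key."""
      def __init__(self):
          self.last_beat = None

      def beat(self) -> bool:
          now = time.monotonic()
          ok = True
          if self.last_beat is not None:
              drift = abs((now - self.last_beat) - HEARTBEAT_INTERVAL)
              ok = drift <= TOLERANCE
          self.last_beat = now
          return ok     # False => invalidate the key, send a fresh daemon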

We might be able to perform reliable counting in the obfuscated binary; if the binary is restarted and loses its counter, we rely on a matching counter on the server to reject duplicates (protecting against replay attacks) and force the download of a new client with a new key.
Secure connections, with resistance to replay, can easily be achieved by Diffie-Hellman algorithms and simple session identifiers unrelated to the quality of the client binary. Tying the security of the client to the security of a network connection is a mistake: the extra coupling and complexity will do you no good and potential harm. Solve these problems in isolation.

We seem to have fallen into a pattern common on IRC, help forums and mailing lists. You see through the problem I am interested in to the solution for a much simpler problem I don't actually have. Last I checked, SSL was able to protect against replay attacks from a third party.

In this scenario, the adversary (the user) has access to a signing machine but not the obfuscated secret. The replay attack I'm talking about is a user passing a request to be signed and capturing it to send at a later time. They could just pass a pile of requests they might like to use, get them all signed and then impersonate the trusted client.

In fact, the counter will not work as described. If the stamping binary is not actually able to connect to the server but can be induced to sign requests, then the user can generate every request at every time and every counter value by stopping and restarting the daemon and messing around with the hardware clock; they do all of this on a supercomputer in a short period of time and then merrily impersonate the client till its key expires. Heartbeats, besides being required, may also be enough.

Now, this supercomputer business is really too much for a high school kid; the only people who would invest in this sort of stuff would be gold farmers who sell characters. I think heartbeats defeat all of it, at the cost of throwing players off the server when they take their laptop outside in winter.

All we really need

All we really need to trust the client for is the timestamp.

How does this stop someone from holding onto packets from the server for 120 ms before delivering them, while sniffing through the packets (by running another copy of the application, if necessary) for early information?

Please Explain A Little More

I would like to understand how this is an attack. In my description, no component depends on the value of the real round trip time -- only its variation (not too much or you lose your session) and its bound (not too high or you're kicked).

Non-issue

If implemented correctly in the general case, the situation of having an 'extra' box in the middle would be irrelevant. I.e. it is simply a subset of the general case.

In the general case, the client program will need to hold onto packets for some non-zero period before being allowed to process them. Thus, to avoid 'looking into the future', packets must be encrypted and every system capable of potentially decrypting the packets must only be able to do so at the discretion of a trustworthy component.

The trustworthy component must, therefore, be a distributor either of the packets themselves or of the keys to decrypt the packets. As a service, it may be local or remote; it is easier to ensure that remote services are trustworthy, but supporting the component locally is the ideal case for minimizing latency.

Some DRM and media distribution schemes run into similar problems. But, in the case of gaming, where the worst case for a violated trusted component is pretty minimal (slightly less fair gaming), it is reasonable to trade some trust for improved latency as an engineering decision.

In the general case, the

In the general case, the client program will need to hold onto packets for some non-zero period before being allowed to process them.

I don't see why that is necessary, actually.

I don't see why that is

I don't see why that is necessary, actually.

It is necessary because you said you wanted a guarantee for both no more lag as well as no less lag. To support the latter half, you cannot allow clients to process messages from servers (or other clients in a peer-to-peer game) until it is time to process the messages.

Unless you have a constant-latency trusted overlay network to deliver messages to all recipients exactly on time, they'll need to be queued up in the clients. (If, on the other hand, you do have such an overlay network, then your problem has already been solved.)

server holds

I don't see why that is necessary, actually.
It is necessary because you said you wanted a guarantee for both no more lag as well as no less lag.

The experienced lag is a combination of both round trip time as well as server hold time. The server hold time makes up the difference. Clients with shorter links experience a longer hold time.

Clients on shorter links will have game time that is shifted forward -- they will see things sooner in terms of wall clock time. However, they will not experience shorter lag.

There is a small window that can be exploited -- but from a different source. The server has to send data slightly before it is needed, so as to cover client processing time and the small variance in lag from request to request. This doesn't have to be very much time if we are sending small enough chunks, though; and variance in lag tends to be small (though I am still collecting figures) -- less than 2ms.

Having clients do the same thing at the same wall clock time is actually much harder, since I need to synchronize clocks to do that. The variance in clock synchronization is actually tied linearly to the round trip time so I didn't want to get in to that.

This approach is exploitable

This approach is exploitable by artificially raising one's own latency in the manner Matt was describing, and is thus incomplete.

I.e. if I artificially raise my RTT latency by 100 ms, then the server will compensate by sending me messages, say, 50 ms earlier. Since the raise was artificial, I get these messages in my queue 50 ms before I'm supposed to have them, and I can use these messages to peek into the future. The shorter my link, the more I can exploit.

Even better, it would take a very intelligent system to see past the noise of variance when I occasionally allow my targeting assist script to take advantage of this information... such as targeting the head of someone who is about to pop out from behind cover before I should have been able to see them.

No, to make this work, you need to make it work without guesstimating the distances between server and client.

I never use the value of the round trip time.

The server sends updates to all clients at the same time -- it never compensates for lag by sending the same data at different times to different clients. That is why there is the timeshift I mentioned in my earlier post -- nearer clients see things sooner. They do not, however, see their own actions reflected in the world any more quickly than anyone else. Where is the exploit, as long as I send data to clients all at the same time?

I must emphasize the difference between consistent wall clock times and consistent felt lag. The latter is accomplished entirely by delaying requests based on their timestamp. The responses are not manipulated. I do not strive for consistent wall clock time because, basically, it doesn't matter if all players' responsiveness is levelled.

No, to make this work, you need to make it work without guesstimating the distances between server and client.

Can you describe my algorithm to me and then point out the step with the flaw? If not, we must start with that. Your remark assumes steps I never describe. I do not even mention a mechanism for estimating the round trip time. I use heartbeats to measure its variance (large variance is suspicious). Its value is not used in the algorithm at all.

An uncertainty principle

One simple principle to convince yourself of is that in a two-party communication with different upstream and downstream latencies, and assuming no pre-synchronized clocks, there is no way for either of the two parties to figure out what portion of latency exists on their upstream vs. downstream. Neither party, no matter how elaborate the cooperative protocol, can deduce more than the round trip latency. So delaying requests and delaying responses both have the same discernible effect - they increase the round trip time. (However, note: variability in the upstream is distinguishable from variability in the downstream.) Understanding this point can help simplify the way you think about networking. For example, ignoring the variability issue, you can pretend that the send pipe from server to client is zero latency and all of the latency exists on the return pipe.
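
That symmetry is easy to check numerically: any split of a fixed round trip yields exactly the same observable timeline on both ends (Python; latencies in ms, values illustrative):

  def observed_round_trips(upstream, downstream, send_times):
      """All either party can measure: when each request left and when its
      reply arrived. Only the sum upstream + downstream is visible."""
      return [(t, t + upstream + downstream) for t in send_times]

  sends = [0, 100, 250]
  assert observed_round_trips(10, 90, sends) == observed_round_trips(90, 10, sends)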

But anyway, you asked for a recap of your scheme:
- The server sends world updates to the client
- The client stamps any request he makes with the world time he was observing when he made them
- The server collects requests and applies them to the world at time_stamp + 400 ms

The attack is:
- Client receives a packet and holds and inspects it, immediately presenting its information to the cheating user
- After 120ms, the packet is given to the obfuscated client. The cheating user is already aware of information from 120 ms beyond the world state contained in this packet.
- The obfuscated client sends back requests (based on input from the user who's "seen the future"), including the time stamp of the latest world state it's seen in this packet

Yeah, I suspect Matt's

Yeah, I suspect Matt's attack will still work since you aren't forcing clients to hold onto packets. With unencrypted packets, everything in the 'trusted' client will be operating 120 ms downstream of the player, and the cheating player will be 'reacting to' the future (effectively getting a -120 ms boost in reaction time).

Encryption

The other point I had tried to make was that merely encrypting the packets without some sort of trusted device (dongle or otherwise) to decrypt them doesn't work, since the client itself will necessarily be able to decrypt them and convert them into meaningful output for the player.

Off topic?

Why is any of this thread relevant to LTU? Shouldn't you be discussing this in some sort of networking forum?

PLT is about achieving

PLT is about recognizing and achieving properties of programming languages. Discussions regarding properties of communications protocols are eminently on-topic. They translate directly into properties for distributed programming languages, process calculi, workflow languages, etc. They also have application regarding properties of 'live' programming languages (like Smalltalk).

So I don't view this as off topic for PLT. Indeed, I'd more likely consider the development of new communications protocols aiming to achieve certain properties to be off-topic for a networking forum.

Probably...

I tend to agree that the relationship to PLT was tenuous at the outset and has diminished as the thread has progressed, but Jason's a regular here and it's tempting to post when you have something potentially helpful to say. Which reminds me: I think some kind of sanctioned overflow forum for LtU for discussion that has strayed too far from PLT topicality would be more useful (both to those who'd read it and to those who'd rather not) than e.g. an implementation forum.

No room at the Inn :)

I agree -- it is off-topic for this forum. There are few good forums for this stuff.

I created a distributed systems mailing list some time ago. The only people on it are me and the author of the Opis paper.

A networking forum would not be appropriate for a discussion of distributed algorithms -- any more than Stack Overflow would welcome a discussion of lambda calculus. Unfortunately, distributed algorithms are underserved by forums (and students) at the moment.

The client would delay for variance and variance is small.

In the attack you describe, the client's observation time is assumed to be delayable by a large amount; it also connects the observation time with the stamp time.

  • Client receives a packet and holds and inspects it, immediately presenting its information to the cheating user
  • After 120ms, the packet is given to the obfuscated client. The cheating user is already aware of information from 120 ms beyond the world state contained in this packet.
  • The obfuscated client sends back requests (based on input from the user who's "seen the future"), including the time stamp of the latest world state it's seen in this packet

I expect the user to display the packets as soon as they are available to them. As I mentioned previously, I am willing to accept the very small level of cheating introduced by this. It would be like reading ahead in streaming video -- you won't see very far.

I separate the stamper from the client to allow third party clients. The world state packets are never sent to the obfuscated stamper -- only the requests are stamped. I actually give up knowing what time the player clicked -- I just find out the time the click got to the stamper. As the stamper is very close to them (whether it is a dongle or an obfuscated binary), it is taken to be "close enough".

  • Client receives a packet and inspects it, immediately presenting its information to the cheating user. If they have a very fast machine, they are able to get 5ms lead time. If they are looking for very specific information in the packet stream, they might be able to get 10ms. I am willing to accept this.
  • Clicking madly, they make a request. It is forwarded to the obfuscated stamper.
  • The stamped request is then forwarded to the server (perhaps captured by the adversary first) from the client who has seen a little of the future.

So there is a vulnerability (which I mention in a prior comment). The error is in connecting the potential "lead time" to the client's round-trip time. In fact, the lead is only as much as the server "pre-sends" to handle variance in client lag -- something it does uniformly for all clients.

Your generic discussion of basic networking would be illuminating to undergraduates but not the average reader of LtU -- and not myself. Let us make better use of one another's time. Devote similar attention to demonstrating your supposed vulnerability, and we will all benefit -- me by having a public vetting of my algorithm, you by having vetted an approach to lag levelling.

Another venue?

Your generic discussion of basic networking would be illuminating to undergraduates but not the average reader of LtU -- and not myself.

I agree that was Mickey Mouse stuff - I started with the terse version and you didn't (and still don't) seem to get it. I'm happy to continue the discussion if you want to provide contact info or suggest another venue. Otherwise, as a parting hint, try reading my prior post with "original client" (meaning the entirety of your client-side software system - obfuscated stamper and game application) in place of "obfuscated client".

[And BTW, there's a big difference between obfuscation and dongles - the latter, I agree, are workable.]

Here Is A Google Group

Well, what about a Google group?

The attack you describe relies on assumptions I don't make; of course I don't get it. I have been more than willing to assume the best of you and dmbarbour; I would appreciate it if in the future I could be spared both "hints" and discussions of elementary material.

Spared!

Apologies for the condescending "hint". I found your previous post a bit snippy - elementary as my networking review may have been, it was in good faith. Anyhow, we can hash this out on the google group.

No longer supporting stated goals...

Earlier you said:

I would like a lag-levelled interface for online games. I would like to be able to say: "all users will experience 400ms lag". Not one ms less and not one more. That is relatively easy to do: clients with round trip time under 400ms (less some processing time) timestamp their message with a time difference from some reference time (most recent log-in or some such) and send it. Clients with round trip times greater than our limit simply have their requests rejected and are thrown off the server (thus, we set a high limit).

Your recent posts have been confusing to me. You don't seem to be aiming to support the above stated goal. Instead, you're only aiming to level the average variance in lag (which, to me, means the time-averaged square of the standard deviation of the round-trip time), rather than leveling the average lag (the time-averaged round-trip time), for all users.

As you earlier explained your goal, 'lag leveling' requires that the server hold commands or the client hold world-model updates for a period of time prior to processing them. The amount of time one needs to hold a command or update is directly dependent upon the actual round trip time. That is, if you have a number for how many milliseconds a packet is held, you can automatically derive a number for the estimated actual or relative round-trip-time between clients.

The vulnerability Matt describes is also based on this earlier goal in how you explained 'lag leveling'. If a client artificially raises his lag (by presenting information to the client on a delay) then the server, if written correctly, will begin to estimate a higher lag for that client. A higher estimate for lag results in timestamped commands from that client being processed, by the server, with a reduced delay. The lower delay is intended to make the cheating client 'lag-level' with other clients. The actual effect is the lower delay gives the cheater an 'unfair' reaction time boost.

If your goal is to simply reduce the jerkiness and jumpiness of a game, but you don't care whether or not some players get a huge 'reaction time' improvement over others, then reducing the 'variance' in lag is relevant. But reducing variance does not, as I understand it, significantly impact the 'fairness' of most games... especially the action games, shooters, etc. where reaction time is a much greater resource than smooth gameplay.

What is it you're aiming to affect? Because, if it is variance, then you have been talking past Matt and me for a while now.

The timestamp in the packet tells us how long to wait.

Lag leveling requires that the server allow requests stamped on the client with a game time t_request to take effect in the game world some constant period of time d_nominal later, at t_effective = t_request + d_nominal. The server does not receive a request, note the client it came from and the time, and then guess the lag to know how long to hold it. It knows exactly when to process the request based on a data field in the request.

From this information one could, indeed, estimate the lag. I choose not to do so, and the estimated lag plays no role in the algorithm. I am in fact describing an algorithm for lag-levelling. I require that a variance and a bound (d_nominal) be specified for the algorithm to work. The variance informs us how early we need to be with the streaming and allows us to kick clients with large swings in round-trip time.

If a client artificially raises his lag (by presenting information to the client on a delay) then the server, if written correctly, will begin to estimate a higher lag for that client.

Estimating the lag is not a secure operation. There is no purely algorithmic way to do that. Maintaining a lock-step heartbeat and constant lag is possible, however -- given a bound and a variance. A client whose lag exhibits variance that is too great is just forced to re-establish their local game time. Basically, any large change in round-trip time is treated as malicious. Is this a usability issue? It depends on the variance we set. This variance is a bound on what cheaters can accomplish (and some of it is eaten by processing time on their end).

I invite you to consider continuing this conversation on a Google group.

I do not even mention a

I do not even mention a mechanism for estimating the round trip time. I use heartbeats to measure its variance (large variance is suspicious). Its value is not used in the algorithm at all.

Large variance is common on congested networks, as happens when a player's roommate is busy downloading whole seasons of his favorite television series.

Anyhow, the mechanism for guesstimating round trip times doesn't matter. Fundamentally, players can manipulate any mechanism you develop to estimate RTT, since they ultimately control when they reply to a heartbeat, when they forward updates to a machine, and everything else. And once it has been modified upwards, they can modify it downwards... which would be undetectable if they also artificially raised variance to look like a congested network.

So the trick to getting an accurate measurement is to control the incentives... i.e. to make certain there are no objective advantages in manipulating the RTT.

The only correction is loss of connection :)

The round-trip time does not actually figure anywhere in my algorithm. Why? Because estimating it is so difficult and because clients can easily fake it. We assume a bound on the round trip time and we assume a variance. I make no provisions for adjusting variance or round-trip time upwards or downwards.

If there is significant variance in when the heartbeat arrives, the user must obtain a new stamper. In short, they will spend more time downloading than playing if it happens with any frequency. This may be a design flaw from a usability standpoint -- is 10ms variance enough? -- but it prevents me from having to guess the round-trip time at any point. If a player's roommate starts downloading a whole TV series -- which would make EVE, for example, totally unplayable -- they lose their connection.

Non-issue?

Thus, to avoid 'looking into the future', packets must be encrypted and every system capable of potentially decrypting the packets must only be able to do so at the discretion of a trustworthy component.

In other words, you're requiring a dongle? I don't think that's a non-issue.

Jason: The point is that if you have the program's code and you have the input stream it will be fed, then you can predict what its output will be - you don't have to understand how it works. So the attack is to intercept and hold packets for as long as you can so that the application will correctly report a high latency, and then use your knowledge of the queued packets to give a preview of what's to come.

Dongle is different issue

the attack is to intercept and hold packets for as long as you can so that the application will correctly report a high latency, and then use your knowledge of the queued packets to give a preview of what's to come

... and this attack is only possible if Jason doesn't actually solve the problem. After all, to say 'latency is N milliseconds, not a millisecond more and not a millisecond less' requires that IF applications are allowed to (or even capable of) holding packets THEN they cannot preview them until the required time.

The problem can be solved by breaking that if/then condition: either don't allow them to hold packets (e.g. interception of messages at the edge of a trusted grid with a provable-latency connection to the client), or don't allow them to preview packets (e.g. by encrypting them and delivering the keys to decrypt them to a trusted component on the local machine).

In other words, you're requiring a dongle? I don't think that's a non-issue.

One requires a trusted component to serve the decryption keys or to decrypt whole packets. That trusted component does not need to be a dongle... one could embed a PC card (as Microsoft encouraged for DRM), or use a trusted server remote to the machine (at cost of latency), or trade protection for convenience by using a black box software module that is well enough obscured against manipulation of its runtime or drop-in replacement. (In particular, a black box software module would need to resist manipulations of its clock signal, though this could be done by NTP-style clock synchronization on an encrypted channel.)

But a dongle is a perfectly valid engineering approach, especially when supporting legacy hardware. It is no less valid to require such a trusted hardware facility be available than it is to suggest players should have GPUs, a mouse, a keyboard, and a monitor to get reasonable gameplay.

Dongle As A Service

In the same way that Steam has come to serve as a mechanism for distributing many companies' games, a good dongle could serve as the basis for a "trusted component" service.

You would want to make its functionality fairly limited. Untrusted code execution alert! Partners could provide keys and maybe some validity conditions that get applied to their application's requests. The dongle is updated and managed solely by the dongle's owner.

Trusted Computing...

I am interested in client verification -- not whether it is a good idea or not.

Unfortunately, you cannot trust a client so long as the user of the client has more privileges than does your software. Software verification is mostly capable of protecting users from your client, not the other way around.

That said, what you desire IS possible. But it would require you to ship some dedicated hardware - something that cannot be disassembled without destruction of critical pieces - that carries its own private keys, a temporary counter, its own clock, and its own certificates from a mutually trusted authority.

Any such device will begin to drift after a while, though it could use a trusted NTP protocol/service to resynch, especially if it is partaking also in the delayed decryption of messages.

This device could, perhaps, be emulated by a software daemon that was properly obfuscated... but any such daemon could be compromised far more readily than hardware. If the hardware is mass manufactured for a game, it shouldn't raise the price unacceptably.

The round trip time on that will be too long for my application. I want a time within 10ms of when a user clicked.

How about you aim instead for: "make certain it is in the user's best interest to minimize the lag." Cheating, like other forms of economics, is all about incentives.

I would like to be able to say: "all users will experience 400ms lag". Not one ms less and not one more [...]

Anyhow, you'd need to encrypt the contents of such messages so that the client cannot process them until you give them a key. (This tactic is described in Policing Online Games.) Since you want a low lag for the key, that would be a job you could give to the Trusted Computing device, which could read the key out of each message header and be willing to deliver said key only after a time (listed in clear-text) also in the message header. The device itself could do this without trusting the client because it can trust notarized timestamps from the remote systems.
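
A sketch of the timed key release such a device would perform (Python; the message layout and names are my assumptions, and a real device would unwrap the key rather than hold it in the clear):

  import time

  class TrustedKeyReleaser:
      """Local trusted component: hands out a message's decryption key only
      once the release time in the clear-text header has passed."""
      def __init__(self, trusted_clock=time.time):
          self.clock = trusted_clock      # synced against notarized remote time

      def request_key(self, header: dict):
          if self.clock() < header["release_at"]:
              return None                 # too early: client must sit on the ciphertext
          return header["wrapped_key"]

  releaser = TrustedKeyReleaser()
  msg = {"release_at": time.time() + 0.050, "wrapped_key": b"k"}
  assert releaser.request_key(msg) is None    # the client cannot peek early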

This would require a protocol to get the endpoint Trusted Computing devices communicating and agreeing on how to decrypt packets. E.g. (a toy sketch of steps 1 and 4 follows the list):

  1. Diffie Hellman to set a link opaque to the client
  2. signed packets along with certificates for the signature to establish trust in the device
  3. shipping a common secret key for decrypting+encrypting per-message headers in a manner common to all devices in a session... one that can be changed periodically
  4. keeping per-session/key counters and temporary IDs for encrypting message headers in a way that resists replay
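
A toy sketch of steps 1 and 4 (Python; the parameters are far too small to be secure and exist only to show the shape of the exchange):

  import hashlib, secrets

  P = 0xFFFFFFFB   # toy public prime (2^32 - 5); demo only
  G = 5            # toy public generator

  def dh_keypair():
      priv = secrets.randbelow(P - 2) + 1
      return priv, pow(G, priv, P)

  # Step 1: the two devices derive a shared secret opaque to the client software.
  a_priv, a_pub = dh_keypair()
  b_priv, b_pub = dh_keypair()
  assert pow(b_pub, a_priv, P) == pow(a_pub, b_priv, P)
  session_key = hashlib.sha256(str(pow(b_pub, a_priv, P)).encode()).digest()

  # Step 4: a per-session counter mixed into each header resists replay.
  def header_nonce(counter: int) -> bytes:
      return hashlib.sha256(session_key + counter.to_bytes(8, "big")).digest()

  assert header_nonce(1) != header_nonce(2)   # a replayed header won't match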

Admittedly, I think quite a few users would become paranoid about exactly what this black-box device is doing... but it could all be put into something as small as a USB key, and, further, could be provided by a mutually trusted third party.

We can not catch this pre-dating in all cases.

Not without a trusted, nigh-unhackable open-it-and-it-breaks component on the client machines.

Now that we have got the fair lag, we can make all kinds of fair games.

Not yet. You would still need to solve the problem of user scripts, assistants, and associates that can automate various game labors. And then you need to solve the problem of moles, team sabotage, and sock puppets operating on behalf of another team or player. And after that you'll need to ensure fair accessibility (e.g. equal access to language, equal monitor size...).

I think 'fair game' is a bit too much an ideal. You can go for 'somewhat more fair than it was before'.

In this scenario, the adversary (the user) has access to a signing machine but not the obfuscated secret. The replay attack I'm talking about is a user passing a request to be signed and capturing it to send at a later time.

Sorry, that is not replay. Replay would be sending the same message twice. This is a slightly different problem: cheating the system by keeping, in advance, a pool of signed messages that can be sent later as if from an earlier time.

In any case, there is nothing you can do to prevent artificial delay (except by putting the TCM device directly on an Ethernet link, which would make me very paranoid). You must simply ensure it is not in the user's best interest to do so.

One way to ensure it is not in the user's best interest is to keep a counter for the session, and to audit and occasionally kill clients that seem to be dropping too many messages... i.e. detect and punish rather than prevent the impossible.

What color is your box?

There was some hope, a few years back, that a lot of this trouble could be handled by a technique called White-Box Encoding, where, e.g., one could embed a key in an instance of DES that would hold up to fully-open analysis techniques, including e.g. fault injection attacks and full run traces. Early efforts seemed somewhat promising that this might be a good medium-strength obfuscation technique (albeit with that firm sense of disquietude that every cryptographer carries with them), until the current schemes were broken. Badly. (and disquietude snickers a little)

There is some ongoing work in the area of building a solid theoretical foundation for white-boxing, but the outlook is still grim.

What looks more promising is work along the lines of what Erik Anderson has done, working in a slightly different obfuscation model that involves a small tamper-proof module, as you've suggested.

Steganography/Crypto Links

Thanks, I'll definitely check these sources out.

Game Over?

Thank you for the references. I guess other ciphers are up next:

The success of this cryptanalysis originates from properties which are specific to the DES. The confusion property of the DES S-boxes, the diffusion property of the DES permutation P and the design of the expansion operation are used to extract the key. The analysis, while specific to DES, nevertheless points the way to techniques to analyse other ciphers.

Not sure I can take you on faith like that...

I am interested in client verification -- not whether it is a good idea or not.
Unfortunately, you cannot trust a client so long as the user of the client has more privileges than does your software.

Why is that? Because obfuscators do not actually work? If an obfuscator for a cryptographic protocol exists, we can trust the client.

Custom hardware is not something I dismiss outright -- in fact, I considered that first. However, it is not germane to the matter at hand. To say trusted hardware is a solution does not really say anything about the suitability of obfuscated clients -- only that there is an alternative if they don't work out.

More generally, I must refrain from commenting on your other material about incentives, etc. To go on at length about my game as I did in my last comment was off-topic. I hoped to clarify my motivation but it is all too easy to chatter on about one's pet projects. To transform an accurate user click collector so it is unspoofable is sort of a programming languages topic; but the larger question of handling cheating in a lag-levelled environment is really a game design question.

Obfuscators solve wrong problem

Why is that? Because obfuscators do not actually work? If an obfuscator for a cryptographic protocol exists, we can trust the client.

Obfuscators solve a much smaller problem than the one you're hoping they would solve.

Obfuscators cannot protect their IO boundaries. They don't help all that much against intentionally 'breaking' or 'intercepting' parts of their runtime (e.g. by tweaking the memory layout of the process in which they run) in order to get approvals and denials that otherwise shouldn't have been accepted. They help very little against duplicating the system and running the existing queue in virtual machines in which time is advanced a hundred milliseconds. Etc.

If you are able to find a cryptographically secure, tamper-proof obfuscator for cryptographic protocols - one that resists tampering, replacement, and duplication in addition to reverse engineering - then you could get obfuscators to do what you need.

To say trusted hardware is a solution does not really say anything about the suitability of obfuscated clients -- only that there is an alternative if they don't work out.

True.

the larger question of handling cheating in a lag-levelled environment is really a game design question

Realtime systems of many sorts (especially realtime workflows) would benefit from supporting lag-leveled operations... though in PLT one would typically approach it from the "I'll assume systems will cooperate to achieve this" angle rather than the "I'll assume everyone and their script-kiddie is attempting to break it for benefits to self or group, or potentially to sabotage a group" angle. Competitive systems are more difficult to reason about.

To tell the truth, I am

To tell the truth, I am having a hard time finding concrete implementations of obfuscators that satisfy my needs; but I just started yesterday. The tamper-proofness is what I am looking at first.

Lag-leveling definitely has application outside of games -- for example, a transaction processing system that did not favour firms that were closer to trunk lines. At present, a millisecond is worth a lot of money in that environment. This sort of thing is near the border of PLT and distributed systems. Ultimately, I think of this as just over the border on the distributed systems side. It's too bad there is not a "Pi the Ultimate" devoted to consensus algorithms and such.

But, for almost every other

But, for almost every other situation I can imagine, the best approach is capability security. Attempting to lock down services so they can only be run through a particular client smells of expending effort, processing power, etc. entirely for the purpose of making one's resulting product less useful and more fragile.

Actually, this situation also seems like a good application for some Byzantine fault tolerance. This is a distributed system requiring agreement in the presence of defectors. You just need to reproduce client A's processing on X clients B1, B2, ..., BX, and reach Byzantine agreement. [Edit: it occurs to me that the lag may be too great when processing a client's actions, but it's a possibility nonetheless.]
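
A toy illustration of the agreement step (Python): with 3f+1 replicas of which at most f are Byzantine, a result is accepted only when at least 2f+1 replicas report it. This is just the voting threshold, not a full BFT protocol:

  from collections import Counter

  def byzantine_agree(replica_results, f):
      """Accept a result only if at least 2f+1 of the 3f+1 replicas report it."""
      assert len(replica_results) == 3 * f + 1
      value, votes = Counter(replica_results).most_common(1)[0]
      return value if votes >= 2 * f + 1 else None

  # 4 replicas tolerate f = 1 defector:
  assert byzantine_agree(["fire", "fire", "fire", "move"], f=1) == "fire"
  assert byzantine_agree(["fire", "fire", "move", "move"], f=1) is None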

A capability structure for a virtual world has always seemed a bit too complicated, or required too many trusted servers, but I haven't thought about it in years.

A capability structure for a

A capability structure for a virtual world has always seemed a bit too complicated, or required too many trusted servers, but I haven't thought about it in years.

Capabilities will work well enough for virtual worlds, but if you want to control which computers serve as 'hosts' you might look into ways to limit automated distribution. E.g. mark capabilities as sensitive, such that only certified hosts may directly reference them.

I've spent a while thinking about capability approaches for open virtual worlds for socialization and gaming. I.e. how do you prevent a player from 'seeing' the room he was just in for very long after he's left it? Revocable caps, publish-subscribe systems, rules-based programming for simulations, and FRP mashups of databases for combining worlds all seem to be part of my answer. You might examine Declarative GUI, which touches on these issues.

I'd imagine a closed virtual world is a relatively degenerate case of an open one.

He's pretty heavy, he's my BA

Yeah. BA is pretty heavy, requiring a lot of messages to be exchanged to reach consensus. And if you want deterministic termination you need to toss a PKI and a boatload of signatures into the mix as well. It's likely not the best approach for this problem.