Trickles: A Stateless Network Stack for Improved Scalability, Resilience and Flexibility

Trickles: A Stateless Network Stack for Improved Scalability, Resilience and Flexibility (PDF) by Alan Shieh, Andrew C. Myers, Emin Gun Sirer (2005)

Abstract: Traditional operating system interfaces and network protocol implementations force system state to be kept on both sides of a connection. Such state ties the connection to an endpoint, impedes transparent failover, permits denial-of-service attacks, and limits scalability. This paper introduces a novel TCP-like transport protocol and a new interface to replace sockets that together enable all state to be kept on one endpoint, allowing the other endpoint, typically the server, to operate without any per-connection state. Called Trickles, this approach enables servers to scale well with increasing numbers of clients, consume fewer resources, and better resist denial-of-service attacks. Measurements on a full implementation in Linux indicate that Trickles achieves performance comparable to TCP/IP, interacts well with other flows, and scales well. Trickles also enables qualitatively different kinds of networked services. Services can be geographically replicated and contacted through an anycast primitive for improved availability and performance. Widely-deployed practices that currently have client-observable side effects, such as periodic server reboots, connection redirection, and failover, can be made transparent, and perform well, under Trickles. The protocol is secure against tampering and replay attacks, and the client interface is backwards-compatible, requiring no changes to sockets-based client applications.
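To make the "all state on one endpoint" idea concrete, here is a minimal sketch of a server that keeps no per-connection state. This is not the Trickles wire format or API: the names `seal`, `unseal`, `handle`, and `SERVER_KEY`, and the JSON-plus-HMAC encoding, are assumptions made purely for illustration. The server hands the client an authenticated blob describing where the transfer stands, and the client sends it back with its next request. The sketch covers tamper detection only; replay protection, which the abstract also claims, is not modelled.

```python
import hashlib
import hmac
import json
from typing import Optional, Tuple

SERVER_KEY = b"demo-secret"  # hypothetical key known only to the server
DATA = b"hello, stateless world"  # the resource being served


def seal(state: dict) -> bytes:
    """Serialize state and prepend a MAC so the client cannot alter it."""
    body = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(SERVER_KEY, body, hashlib.sha256).digest()
    return tag + body


def unseal(blob: bytes) -> dict:
    """Check the MAC and recover the state the server issued earlier."""
    tag, body = blob[:32], blob[32:]
    expected = hmac.new(SERVER_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("continuation was tampered with")
    return json.loads(body)


def handle(cont: Optional[bytes], chunk: int = 8) -> Tuple[bytes, bytes]:
    """Serve one chunk; nothing is remembered between calls."""
    state = {"offset": 0} if cont is None else unseal(cont)
    start = state["offset"]
    payload = DATA[start:start + chunk]
    return payload, seal({"offset": start + len(payload)})


def fetch_all() -> bytes:
    """Client side: carry the continuation from one request to the next."""
    received, cont = b"", None
    while True:
        payload, cont = handle(cont)
        if not payload:
            return received
        received += payload
```

Because `handle` depends only on its arguments, any replica holding the same key could answer any request, which is the property that makes failover and reboots invisible to the client in this toy model.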

What you get when you combine continuations and networking. The idea should be obvious to most LtUers.

I really can't believe this hasn't been mentioned on LtU before.

The site is here.

PDF link broken

The PDF link appears to point straight back here to LtU. The PDF can be found here.

By Allan McInnes at Thu, 2007-06-07 05:41

fixed

By Derek Elkins at Thu, 2007-06-07 06:07

Neat

I like it, although there are some downsides:

* The network continuation is 75+12m bytes (where m is the number of "loss events" and usually = [<=?] 1). A TCP header is 20 bytes, so the difference is significant but not terribly so.

* CPU overhead is higher but, they say, "does not pose a server bottleneck even at gigabit speeds." TCP's utilization is shown at ~55%, while Trickles is ~78%, when copying a file from memory.
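For a rough sense of scale, the figures quoted above can be plugged into a few lines of arithmetic. The function names are made up for this sketch, and the numbers are only the ones cited in this comment (75 + 12m bytes, a 20-byte TCP header, ~55% vs. ~78% utilization), not independent measurements.

```python
TCP_HEADER_BYTES = 20          # minimum TCP header, as cited above
TCP_CPU_PERCENT = 55           # approximate figure quoted from the paper
TRICKLES_CPU_PERCENT = 78      # approximate figure quoted from the paper


def continuation_bytes(loss_events: int) -> int:
    """Size of the network continuation for m loss events: 75 + 12m."""
    return 75 + 12 * loss_events


def size_ratio(loss_events: int) -> float:
    """How many times larger the continuation is than a TCP header."""
    return continuation_bytes(loss_events) / TCP_HEADER_BYTES


def cpu_ratio() -> float:
    """Relative CPU utilization, Trickles over TCP."""
    return TRICKLES_CPU_PERCENT / TCP_CPU_PERCENT


if __name__ == "__main__":
    for m in (0, 1, 2):
        print(f"m={m}: {continuation_bytes(m)} bytes, "
              f"{size_ratio(m):.2f}x a TCP header")
    print(f"CPU: {cpu_ratio():.2f}x TCP")
```

With m = 1 the continuation comes to 87 bytes, a bit over four times the TCP header, and the CPU figures work out to roughly 1.4 times TCP's utilization.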

On the other hand, strangely, they do not reference any of the other work on continuations and networking, particularly the webby stuff.

I do wonder how hard it would be to make it stateless on both sides....

By Tommy McGuire at Mon, 2007-06-18 21:32

Stateless both sides

I do wonder how hard it would be to make it stateless on both sides....

I suppose it could be done, but then the connection would drop as soon as a packet was lost. As it stands, it survives packet loss because the client has enough state to retransmit.
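A toy simulation of that point, with every name (`server`, `lossy_send`, `download`) and the drop pattern invented for illustration: the server is a pure function of the request, the simulated network drops some exchanges, and the transfer still completes because the client remembers the last offset it was given and simply asks again. Integrity protection of the state is left out here to keep the sketch short.

```python
from typing import Iterable, Iterator, Optional, Tuple

DATA = b"abcdefghijkl"


def server(offset: int, chunk: int = 4) -> Tuple[bytes, int]:
    """Stateless: everything the server needs arrives in the request."""
    payload = DATA[offset:offset + chunk]
    return payload, offset + len(payload)


def lossy_send(offset: int, drops: Iterator[bool]) -> Optional[Tuple[bytes, int]]:
    """Return None when the simulated network drops the exchange."""
    if next(drops, False):
        return None
    return server(offset)


def download(drop_pattern: Iterable[bool]) -> Tuple[bytes, int]:
    """Fetch DATA, retransmitting from client-held state after each loss."""
    drops = iter(drop_pattern)
    received, offset, retries = b"", 0, 0
    while offset < len(DATA):
        reply = lossy_send(offset, drops)
        if reply is None:
            retries += 1  # the client still holds `offset`, so it asks again
            continue
        payload, offset = reply
        received += payload
    return received, retries
```

If the client also discarded `offset` after sending, a single dropped reply would leave neither side able to say where the transfer stood, which is the failure mode described above.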

By John Stracke at Wed, 2007-06-27 17:58

Well, what if the routers take the responsibility of keeping this state to some extent (as they already do for various things, especially in multicast, and even for TCP connections)?
Yes, the routers then become the bottleneck, but in some cases this shift of responsibility might be useful.

By Andris Birkmanis at Thu, 2007-06-28 14:09

The routers themselves then

The routers themselves then become vulnerable, and IP never guarantees that the same route will be used, so this doesn't seem viable. No, I think the incentives against misuse are properly aligned if the client must maintain the state.

By naasking at Thu, 2007-06-28 15:07

Egress routers only

No, I think the incentives against misuse are properly aligned if the client must maintain the state.

I agree. But "the client" may have a broader meaning than "the software running on the specific chunk of hardware that person is holding on his lap". The client may be an organization, or a unit within it. As long as this organization properly redistributes incentives internally (e.g., by firing abusers :) ), I believe it's perfectly OK to treat the whole organization as responsible to the outside server. If the router we are talking about belongs to the same organization as the end user, then it is not vulnerable. E.g., I could envision egress routers maintaining TCP state for clients residing in their organization. This would factor network state out of the hardware nodes running applications. I am not sure whether the benefits of this are worth the effort, but from a mechanism design POV it looks doable.

By Andris Birkmanis at Sat, 2007-06-30 07:36

Too bad, it starts with SYN, ACK

IMHO, this new protocol should have two variants:
1- a UDP-like question-answer variant. The client can send request data with the first "SYN" packet, and the server is notified of the SYN packet (instead of having the SYN packet processed only by the network stack), so the server can choose either:
* to answer immediately: less latency (only one RTT), but the server is vulnerable to DoS attacks with source IP spoofing, so it's useful only in protected networks.
* or to answer only with an ACK, while in the background the server also starts processing the request (at low priority) to fill a cache. Then, when the client's reply to the ACK arrives carrying the request, hopefully the server can answer immediately because the answer is already in the cache.

This may imply using a new API for the client, though; I'm not sure.

2- the TCP-like protocol described in the article.

Sure, the cache in (1) can be seen as state in the server, but as it's only a cache, it doesn't really matter.

By renox at Sun, 2007-07-01 08:53

Actually, it does support #1 already.

In the Trickles API, the client can write to the socket before connecting. Queued-up data is sent with the very first packet. This usage is of course not backward compatible with TCP sockets.

By andru at Tue, 2008-01-22 19:01

Lambda the Ultimate
