Programming and Scaling

Programming and Scaling, a one-hour lecture by Alan Kay at his finest (and that's saying something!)

Some of my favorite quotes:

  • "The biggest problem we have as human beings is that we confuse our beliefs with reality."
  • "We could imagine taking the internet as a model for doing software modules. Why don't people do it?" (~00:17)
  • "One of the mistakes that we made years ago is that we made objects too small." (~00:26)
  • "Knowledge in many cases trumps IQ. [Henry] Ford was powerful because Isaac Newton changed the way we think." (~00:28)
  • "Knowledge is silver. Outlook is gold. IQ is a lead weight." (~00:30)
  • "Whatever we [in computing] do is more like what the Egyptians did. Building pyramids, piling things on top of each other."
  • "The ability to make science and engineering harmonize with each other - there's no greater music." (~00:47)

And there are some other nice ideas in there: "Model-T-Shirt Programming" - software the definition of which fits on a T-shirt. And imagining source code sizes in terms of books: 20,000 LOC = a 400-page book. A million LOC = a stack of books one meter high. (Windows Vista: a 140m stack of books.)

Note: this is a Flash video; other formats are available.


The video is unwatchable

The video is unwatchable here (bandwidth, flash, meh). The other format is RealVideo, not much of an improvement.

I was intrigued with what Kay had to say about the second point "We could imagine taking the internet as a model for doing software modules." We are doing it, of course; the problem isn't putting everything out there, it is finding things with all the noise, and then fitting them into your project.

MP4

There's also MP4 for the four individual sections of the talk (see phone icon).

You mean the phone icon with

You mean the phone icon with the disabled play button beneath it? Ah, got it, the other talks below are sections and they have enabled phone icons. Crazy UX.

Edit: the MP4s aren't loading either. I don't think many of these sites are PRC-friendly; I'll just wait until I can get access to a proxy.

Slides

Are the slides available separately? The video feed doesn't show them.

Thank you for the links,

Thank you for the links, Manuel.

I was intrigued with what Kay had to say about the second point "We could imagine taking the internet as a model for doing software modules."

Indeed, I was as intrigued as you were. My curiosity peaked after seeing the "Technology / Arts" diagram VPRI had prepared for their presentation (~00:39), which I find, coincidentally, very much in the same vein of thinking as one I had put together two or so years ago (if only for myself at the time).

We are doing it, of course[...]

I would argue "not quite, yet", as we are barely starting to scout these kinds of ideas, imho.

I have been working on this

I have been working on this idea for the last year at least :). I'm finally making some progress on the discovery aspect of this problem.

I agree with the quote, too.

I agree with the quote, too. Internet sites have achieved two very important desiderata for a module system: they are highly compatible with each other, and each site evolves independently, without frequent need for coordinated multi-site refactors. It's another way that the actual Internet has been more effective than the efforts of people specifically focused on language design.

I would point out two properties of Internet sites that make them so effective at modularity. One is that they use wire-level interfaces such as HTTP, JSON, and Protobufs that are highly stable across recompiles. Two is that the engineers simply prioritize compatibility. For example, Apache is far more reluctant to make incompatible changes to HTTP than a typical library maintainer is to make incompatible changes to an API.
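To see why wire-level stability helps, consider the usual JSON discipline: readers ignore unknown fields, and writers only ever add optional ones, so either side can be recompiled and redeployed independently. A minimal sketch (my illustration; the message shape is invented):

    // Hypothetical versioned message; field names are illustrative.
    interface UserV1 { id: string; name: string; }
    interface UserV2 { id: string; name: string; email?: string; } // new field is optional

    // An old (V1) reader tolerates unknown fields, so a V2 writer can
    // deploy without a coordinated multi-site refactor.
    function readUser(wire: string): UserV1 {
      const msg = JSON.parse(wire);
      return { id: msg.id, name: msg.name };
    }

    const fromV2Writer = JSON.stringify({ id: "7", name: "Ada", email: "ada@example.org" });
    console.log(readUser(fromV2Writer)); // { id: "7", name: "Ada" }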

System designers are very

System designers are very close to language designers in that they deal with creating nice abstractions. It shouldn't be any surprise that they would be as good or even better at it than we are.

I was thinking beyond the wire protocols to the more fuzzy mental implications. Google has become part of our memory system; we don't have to remember things so well anymore, since we can retrieve information via Google on demand when we need it. Wikipedia has allowed large numbers of unpaid (crowd-sourced) contributors to put the encyclopedia vendors out of business. Social networks are rapidly evolving what community means. For all that, programming itself is still a very unconnected activity; programmers must manually use the internet to enhance this activity, without much help from language design and tools.

the problem isn't putting

the problem isn't putting everything out there, it is finding things with all the noise, and then fitting them into your project

I agree. That 'noise' can be a real problem - we must consider possibilities such as spam, denial of service, predictable failure modes, resilience, persistence, upgrade, and deprecation. And there are the normal properties we want from modules, such as the ability to reuse them and generic types.

My current designs favor a network of registries, combining the benefits of content centric networking and object capability security. I add constraint-based linking (without names) to support resilience, deprecation, and reuse of a module without a whole dependency forest.
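As a rough sketch of how 'constraint-based linking (without names)' might look (this is my own illustration under stated assumptions; the design above doesn't spell out the details):

    // Hypothetical registry entry: modules carry a content hash
    // (content-centric identity) and metadata, not a global name.
    interface ModuleMeta {
      hash: string;          // content-centric identity
      exports: string[];     // provided interface
      version: number;
    }

    // Linking resolves a constraint to a module, deterministically.
    function link(registry: ModuleMeta[], satisfies: (m: ModuleMeta) => boolean): ModuleMeta {
      const candidates = registry.filter(satisfies);
      if (candidates.length === 0) throw new Error("constraint unsatisfiable");
      return candidates.sort((x, y) => y.version - x.version)[0]; // prefer newest
    }

    // "Anything exporting drawCircle at version >= 2" - no name, and no
    // whole dependency forest pulled in:
    // link(registry, m => m.exports.includes("drawCircle") && m.version >= 2);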

How far along are you on your ideas? Are they based on your YinYang work and context-constrained behaviors?

I think he meant something

I think he meant something different: he means that software should resemble the software that runs the internet (the TCP/IP stack), not that software should be structured like the web. Software should be small and simple yet generally usable for a lot of things. He also mentions this again by saying that we shouldn't have complicated word processors, spreadsheets, presentation software, etc. Instead we should try to find a few simple primitives that can be used to do all of that and more.

I suspected as much, and I

I suspected as much, and I admitted that I (still) couldn't watch the video. Will try to watch it at work.

The DSL examples at the end were interesting

The compactness of specialized languages, i.e. their ability to say a lot very concisely, has always fascinated me. It is also interesting that DSLs were called Problem Oriented Languages 50 years ago.

Language Glue

The idea of extending our languages with DSLs is one I embrace happily. A few smart, slightly obsessive fellows can each provide a language for his or her domain. Within these languages, we can be two orders of magnitude more effective at expressing domain-specific operations.

But this raises a meta-problem: how do we get a variety of independently developed languages to work effectively together in a large project? How do we reason about the resulting system?

A lot of issues seem to cross-cut the DSLs: resource management, incremental computation, reactive computation, concurrency, consistency, determinism, real-time, bounded-space, failure modes, safety, resilience, live programming, distribution, redundancy, persistence, scatter-gather requirements, system discovery and integration, bi-directional editing, collections, optimizations. I'm sure there are more.

When we develop our DSLs, we often lack perspective on how the DSL should integrate with the system. A result is that we often end up pushing a lot of important work into adapter code. Often, this adapter code is specialized, inefficient, inconsistent, verbose, and difficult to reason about. It's also irritating to write - i.e. it smells like boilerplate.

In Programming and Scaling, Alan Kay describes how 'complications' grow faster than complexity. I think most of this is due to the same problem facing DSLs: we develop abstractions suitable to the problem, without the larger-scale perspective that concerns integration requirements. Consequently, we get a lot of complications when integrating our languages, APIs, and frameworks.

The reason the Internet works is because everyone is forced to use the same language glue - essentially, TCP/IP.

Granted, TCP/IP is not great language glue. Concerns regarding latency, type-safety, concurrency control, UI integration, and efficiency systematically pressure developers to build monolithic services and applications. TCP/IP is too general for a lot of possible optimizations, e.g. compared to content centric networking. We'll know we have good language glue when we can compose services and applications from fine-grained modules, and achieve high performance.

If we are to support DSLs effectively, we must first solve the glue language problem. We need a common target for all DSLs - one that systematically pressures DSL developers to properly handle many issues that cross-cut domains even if they aren't familiar with them. Putting effective integration on the 'path of least resistance' is desirable.

By comparison, supporting multiple syntaxes and a tower of staged compilation seems to be quite trivial.

But this raises a

dmbarbour:
But this raises a meta-problem: how do we get a variety of independently developed languages to work effectively together in a large project? How do we reason about the resulting system?

Ah, very good point & question. Thank you for asking! (*politician grin*)

As for how best to answer it (and especially if one's desire for scalability goes well beyond just one project and/or one controlled language toolset)... my feeling is we(*) are only, shyly, starting to contemplate this type of issue's horizon.

AFAIC, it's true: that's exactly the question that has kept me kind of "obsessed" for a couple of years now, and still does, in another thread. :)

(*)well, some of us

Reject 'extensible' syntax in favor of 'user-defined' syntax

I've also been interested in working with a lot of fine-grained languages for several years.

A few years ago I was very interested in 'extensible' syntax, by which I mean that the syntax and grammatical rules provide a mechanism for their own extension (Christiansen grammars, for example). I have since come to disfavor that idea due to its interaction with modularity:

  1. I don't want syntax extensions implicitly crossing module boundaries, since that's a form of leaking implementation details of the modules.
  2. I want parameterized modules. I even like the idea of parameterized languages. But if syntax is imported from parameterized modules, I get a nasty scenario where the client of a module affects how the module is parsed. Some sort of clean staging is critical if we are to parse the modules separately from linking them (which itself is important for IDEs and debugging!).
  3. Syntax management must not become boilerplate, because that would defeat the purpose of syntax extension! Thus, per module, the syntax rules specification - including imports and syntax compositing - must be a very tiny fraction of the total code.
  4. I want modules to be fine-grained, i.e. often on the order of 50-100 lines. This, combined with the previous point, means the maximum tolerance for syntax management is one line per module.
  5. I expect to upgrade my language over time, so a syntax declaration is appropriate anyway to account for different versions of the language. This means the minimum requirement for syntax management is one line per module.

Today, I favor user-defined syntax per module, but where the syntax is immutable within each module.

I currently favor an ML9-like approach for describing modules, where each module starts with a one-liner language declaration. Multiple fine-grained modules could exist in a file, which is okay given content-centric linking; that linking model targets issues of code reuse, resilience, and dependency hell.

Language modules can also be imported for use at runtime, e.g. to access a parser or evaluator or syntax highlighter. (I imagine each language module will export a standard set of composable partial-functions for viewing, translating, processing, partially interpreting code with respect to common abstract machines, describing errors, et cetera.) PEGs are certainly a possibility, there.

An interesting possibility of 'language as module' is that a language may potentially refer to an external service that is maintained independently of your source, so long as your IDE is configured to discover it. This allows the interesting possibility of 'social' languages that grow with the number of active users (i.e. we could model multiple dispatch that works between services and applications). For security reasons, my current language design supports this only indirectly: the @language line cannot access external languages (since it has no discovery capability), but may inject its own import dependencies to obtain the external resource from the client module.
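For concreteness, here is a guess at the shape of that 'standard set of composable partial functions' a language module might export; every name and type below is invented for illustration, not taken from an actual design:

    // Placeholder types for the sketch.
    type Ast = unknown;
    interface ParseError { message: string; position: number; }
    interface Span { start: number; end: number; style: string; }

    // What a '@language'-declared module might resolve to at edit time.
    interface LanguageModule {
      parse(src: string): Ast | ParseError;   // e.g. PEG-based
      highlight(src: string): Span[];         // syntax highlighter for the IDE
      describeError(err: ParseError): string;
      // optional partial interpretation against a common abstract machine
      interpret?(ast: Ast): Ast;
    }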

All that aside, I think it's a mistake to pursue syntax first.

The real problem of integrating languages has to do with integrating their abstractions - those cross-cutting concerns I mentioned earlier, making our languages play nice with concurrency, planning and constraint systems, security requirements, consistency, distribution, etc. Our programming model must accommodate many independent domain models.

One of my big motivations for pursuing RESTful reactive programming (in favor of event-based) is to support continuous compilation and specialization, for live programming upon a living tower of languages. The very idea of compilation is to specialize the slow-moving parts in order to more rapidly process the fast-changing parts - I see no good reason for a static distinction between 'code' and 'data' with respect to continuous compilation.

Examples of effective DSLs?

What are some domains where DSLs have been effective or probably will be effective?

Parsing: Tools like Yacc/ANTLR/OMeta, and internal DSLs like parser combinators (see the sketch at the end of this comment). And of course regular expressions.

Queries: SQL, Datalog, XPath.

Symbolic mathematics: Mathematica, Maxima, Maple.

There must be more.
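To make the parser-combinator item above concrete, here is a minimal internal-DSL sketch; the design and names are illustrative only:

    // A parser consumes a prefix of the input or fails with null.
    type Parser<T> = (input: string) => { value: T; rest: string } | null;

    const char = (c: string): Parser<string> => input =>
      input.startsWith(c) ? { value: c, rest: input.slice(1) } : null;

    const or = <T>(a: Parser<T>, b: Parser<T>): Parser<T> => input =>
      a(input) ?? b(input);

    const many = <T>(p: Parser<T>): Parser<T[]> => input => {
      const out: T[] = [];
      let rest = input;
      let r: { value: T; rest: string } | null;
      while ((r = p(rest)) !== null) { out.push(r.value); rest = r.rest; }
      return { value: out, rest };
    };

    // digit = '0' | ... | '9'; digits = digit*
    const digit = "0123456789".split("").map(char).reduce(or);
    console.log(many(digit)("42x")); // { value: ["4","2"], rest: "x" }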

Effective DSLs

There are a lot more.

You can find DSLs for multimedia of all sorts (sound, video, digital DJ, graphics shaders), display (SVG, HTML, PostScript, LaTeX, PovRay), statistics (R), interactive fiction (Inform 7), simulations of all sorts (biology, physics), robot behavior (Kodu), animation, hardware modeling (LabView)...

And more and more...

I find the DSLs on the tip of one's tongue boil down to experience. The more domains you get exposed to, the more likely you are to find domains where DSLs make sense.

Most LtU'ers, for example, probably wouldn't jump in to mention the enormous industrial significance of relay ladder logic languages for logic controllers. This is unsurprising, as the tech is old school, and one generally only gets exposure to it in areas like industrial automation. And don't forget the next level up, hardware description languages (VHDL and its ilk, though I like your shout-out to LabView/G). Nary a chip gets made without one of those, but again they're not exactly the new hotness to the programs-as-types crowd.

In another vein, I'd also mention the wildly-Turing-incomplete set, such as lightweight markup languages (wiki markup, markdown, etc) and other data descriptors (JSON and that crowd). Though this gets into a philosophical zone over the term "language", they are formal string structures that serve particular domains. Markup, especially, is a huge component of HCI for the umpteen millions using wikis these days.

DSLs and Experience

the DSLs on the tip of one's tongue boil down to experience - I agree. Especially at 2am.

I also thought about mentioning a lot of language-like things - spreadsheets, for example, or domain-specific APIs, protocols, and frameworks. (A protocol is basically a grammar, right?)

If you start counting input and output languages - e.g. configuration files, log formats, etc. - the list would never end.

Inspiring, but...

I enjoyed the talk quite a bit and was very impressed with the ending where he presented the short listings for Nile and the associated compilers it required versus the length of Cairo. Quite inspiring.

I can well imagine how graphic operations would map nicely to that. But how often can that approach be used? I ask myself, is that something I would be able to do in my work to reduce my large projects and make them manageable? The answer: No, it isn't.

Partly, this is because DSLs and compiler-compilers are difficult. So they aren't for average programmers, and they aren't worth bothering with if the payoff isn't big enough.

But mostly, I think that DSLs, while powerful and useful, aren't the missing piece. I cannot imagine that the empire-building of code for Windows 7 is going to go away because of some well-placed DSLs. (I'd love to be wrong, though.)

I use a variety of languages, but mostly Common Lisp. I have written several DSLs to help support my own work, including a compiler/code-generator for targeting different server languages. So I am already leveraging this practice.

For a big application that cuts across several domains I don't see a solution for reducing its complexity (or complicatedness). I know there is no single magic bullet. I try to leverage every productivity methodology I can. But even with all of them, I still have a large body of code that is getting harder to maintain, change, and test, and that gets larger every day.

But that said, it doesn't seem to me that it has to be so large. I feel in my gut that it could all be a lot simpler. If not smaller, it could be approached more simply. But I don't have the answer, just the problem. Does anyone else feel that lines of code filling up a text editor are NOT what the work should entail? That a blinking cursor should not be our sole tool?

He is not only proposing

He is not only proposing changing the languages and tools used, he is also proposing to simplify the requirements for the stuff we write. Obviously exactly recreating Windows and the important applications running on Windows is not going to happen in 20k lines of code.

For example, word processors don't have to be as complicated as Microsoft Word. He proposes that we replace Word, Excel, PowerPoint, and other similar applications with one application that can do all of that and more with a smaller number of orthogonal tools.

One problem is that this works fine as long as you're in your own perfect world, but as soon as you have to interoperate with the external world, you can no longer decide the requirements. If you're writing a web browser you have to follow the web standards; you cannot invent your own simplified format (a web browser could be as simple as a sandbox running the code it gets -- the browser doesn't have to know anything about markup languages and all the stuff that the W3C produces).

The problem

If you're writing a web browser you have to follow the web standards; you cannot invent your own simplified format

The formula changes when you're writing both the browser and the server. You can invent your own simplified format. And supporting 'conversion' to DHTML + XMLHttpRequest or WebSockets is also quite feasible - i.e. with the idea of pushing the complications as far to the edge as possible, so that they don't infect the perfect world.

A more common problem with 'reinventing' systems or developing DSLs is to forget some essential requirement. People end up working around the model rather than with it. As a consequence, the DSLs are slowly extended to become GPPLs, sort of like what happened with HTML or SQL procedures.

For example, Nile sounds nice... but how well does it support animation? or real-time? or incremental update (redraw only dirty rectangles)? or associating a user pointer input with a specific object?

When you try to walk the line of "as simple as possible, but no simpler", it is really easy to fall off the 'too darn simple' side and end up with a 'simplistic' system instead of a 'simple' one.

a web browser could be as simple as a sandbox running the code it gets -- the browser doesn't have to know anything about markup languages and all the stuff that the W3C produces

Like Google NaCl, which is now built into Chrome.

But a sandbox running opaque code is simplistic, not simple. Like all simplistic things, opaque code creates complications for its clients. Consider concerns such as: search, translation, blind accessibility, adaptation to different screen sizes, mashups, bookmarking, history and back-button integration, et cetera.

NaCl isn't any simpler than native apps. The advantage is that it is flexible. People can develop awesome new systems atop it - maybe even simple ones.

The formula changes when

The formula changes when you're writing both the browser and the server. You can invent your own simplified format. And supporting 'conversion' to DHTML + XMLHttpRequest or WebSockets is also quite feasible - i.e. with the idea of pushing the complications as far to the edge as possible, so that they don't infect the perfect world.

Yes, using your own server is what I'd call staying in your perfect world that you control. Conversion from HTML + other web things to your custom format is essentially implementing your own web browser. This cannot realistically be done in 20k lines, so we fail that stated goal immediately if we try to interoperate with the existing web.

You can still use the internet as he mentions, because TCP/IP is simple.

For example, Nile sounds nice... but how well does it support animation? or real-time? or incremental update (redraw only dirty rectangles)? or associating a user pointer input with a specific object?

That is a good point. Even more importantly for competitiveness with existing technology: can it do these calculations efficiently on the GPU?

But a sandbox running opaque code is simplistic, not simple. Like all simplistic things, opaque code creates complications for its clients. Consider concerns such as: search, translation, blind accessibility, adaptation to different screen sizes, mashups, bookmarking, history and back-button integration, et cetera.
NaCl isn't any simpler than native apps. The advantage is that it is flexible. People can develop awesome new systems atop it - maybe even simple ones.

These things can be solved by using common libraries on top of the opaque code. Encoding all these concerns into the language standard that is sent over the wire is bad design, just like encoding a sorting function into the syntax of a programming language is wrong. These are concerns that should be solved by a library, not by the syntax. For example, if there is a common library for drawing text (perhaps a semi-standard one provided by a standards body and hosted on their servers, but not one that people are forced to use), Google could intercept calls to that library and store the drawn text in its indexes. This would probably be even simpler than what they have to do now to extract the text reliably.

Innovation would be much quicker and the web stack would be much simpler. For example, take rounded corners. How long did web developers have to wait until rounded corners were added to the format that is sent over the wire? With a system of sending efficient Turing complete code over the wire that has access to basic libraries for stuff like graphics, these issues become trivial.

In any case I think it would be interesting to put together the minimal functionality that such a system would need, like access to graphics, sound, mouse/keyboard IO, networking, ability to execute code dynamically (for example coming from a server storing common libraries).

Living with Legacy

Conversion from HTML + other web things to your custom format is essentially implementing your own web browser. This cannot realistically be done in 20k lines, so you fail that stated goal immediately.

No, no, no. A legacy adapter doesn't mean you fail the goal immediately. It only means you'll have a transition period before potential success. The idea is to smoothly marginalize use of DHTML - over a couple decades, perhaps - so that you won't need it any longer.

things can be solved by using common libraries on top of the opaque code

Library design essentially is language design - every abstraction is a language modification. So this is the same as saying: "things can be solved by using common language on top of opaque code". You successfully regress the real problem - figuring out a common language or common libraries - up one more mostly-useless layer of opaque code.

Encoding all these concerns into the language standard that is sent over the wire is bad design

There are many problems that should be solved by libraries - certainly. But our ability to solve problems with libraries is ultimately constrained by design of our language. So good language design requires accommodating all relevant concerns, even if not directly answering them.

Google could intercept calls to that library and store the drawn text in its indexes. This would probably be even simpler than what they have to do now to extract the text reliably.

That wouldn't work very well at all. Even assuming people use 'standard' approaches to draw text (as opposed to converting to PostScript and rendering to canvas with low-level commands, or something), the search engine's 'spider' would need to somehow navigate and explore the application to get the rendering to trigger in the first place.

Innovation would be much quicker and the web stack would be much simpler. For example, take rounded corners.

And other properties, such as interop, security, safety, mashups, failure modes, consistency, resource management, and job control may become correspondingly more complicated. If your only measure of simplicity is in terms of 'the web stack', a simplistic solution is guaranteed. True simplicity requires considering many more variables - and might still result in a simpler web stack.

You'll get innovation either way, just in different subjects - i.e. rounded corners and pretty stovepipes vs. mashups and business relationships.

a system of sending efficient Turing complete code over the wire that has access to basic libraries for stuff like graphics, these issues become trivial

Rich code distribution has been my interest since 2003. And not just from server to client - we could achieve a great deal with client-to-server, server-to-server, client-to-client. The ability to abstract new overlay networks would be an enormous win for developers.

Of course, effective code distribution is another of those problems that takes good language design, i.e. so that we can achieve high levels of optimization, so that we can automatically shard an application out not just to a server but the server's dependencies, so that we can reason about security and resource management and prevent denial-of-service risks, and so on.

Turing-complete is useful, but we don't need that power for the vast majority of the language. In my current design, developers would achieve Turing completeness only by modeling incremental computations with explicit state. This allows me to ensure real-time computation, and simplifies job control (which cannot be achieved implicitly at distributed scales).
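A minimal sketch of that idea (my own illustration, not the actual design): keep every function in the core total, and recover Turing completeness only through an external loop over explicit state.

    // Every Step is a total function: it always returns.
    type Step<S> = (s: S) => { next: S; done: boolean };

    // Collatz, phrased as an incremental computation over explicit state.
    const collatz: Step<number> = n =>
      ({ next: n % 2 === 0 ? n / 2 : 3 * n + 1, done: n === 1 });

    // Only this driver - outside the core language - is potentially
    // non-terminating, so the runtime can bound each step in real time.
    function run<S>(step: Step<S>, s0: S): S {
      let s = s0;
      for (;;) {
        const { next, done } = step(s);
        if (done) return s;
        s = next;
      }
    }
    console.log(run(collatz, 27)); // 1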

I think it would be interesting to put together the minimal functionality that such a system would need, like access to graphics, sound, mouse/keyboard IO, networking, ability to execute code dynamically (for example coming from a server storing common libraries).

We call that particular collection of functionality the 'PC'. ;-)

Library design, language

Library design, language design, it's all the same. The difference here is not in that; the difference is that the innovation is done distributively rather than via a standards body for the whole world. Standards bodies don't innovate well, and they aren't even able to recognize and adopt good innovation, as we're seeing with the W3C. They reluctantly accepted (parts of) HTML5 because it was clear that browser vendors were going to implement it anyway.

That wouldn't work very well at all. Even assuming people use 'standard' approaches to draw text (as opposed to converting to PostScript and rendering to canvas with low-level commands, or something), the search engine's 'spider' would need to somehow navigate and explore the application to get the rendering to trigger in the first place.

So you also expose the navigation with an API that the search engines can understand. Or instead of letting the crawler understand the page directly you serve a special page to the crawler that it can understand. There is no problem here because the system is flexible enough to solve it. Moreover we are already seeing the same problem and solution with Ajax applications. This is solved not by waiting 5 years for the W3C to standardize a solution; instead Google is saying "our crawler crawls Ajax links like this". In this case the solution was to put the URL in a special format that the Google crawler recognizes and handles so that Ajax links can be crawled. Wouldn't a general method that exposes all navigation uniformly to search engines be preferable?

And other properties, such as interop, security, safety, mashups, failure modes, consistency, resource management, and job control may become correspondingly more complicated. If your only measure of simplicity is in terms of 'the web stack', a simplistic solution is guaranteed. True simplicity requires considering many more variables - and might still result in a simpler web stack.
You'll get innovation either way, just in different subjects - i.e. rounded corners and pretty stovepipes vs. mashups and business relationships.

Why would those things be harder? I don't see it. Take security. Is it harder to implement a tiny sandbox securely, or to implement thousands of pages of standards securely? Why do mashups become harder? With mashups we are already seeing the trend in the direction of distributed innovation rather than top down standardization. Web services expose an API that returns a community-standardized format like JSON. I fail to see what you mean by the other points; perhaps you could provide concrete examples.

Turing-complete is useful, but we don't need that power for the vast majority of the language. In my current design, developers would achieve Turing completeness only by modeling incremental computations with explicit state. This allows me to ensure real-time computation, and simplifies job control (which cannot be achieved implicitly at distributed scales).

We're talking about client-side applications here, not server-side applications running on a cluster. Across the internet it is impossible to do real-time computation anyway because any network communication can already take arbitrarily long. A Turing complete language can be powerful (obviously) and simple. A non Turing complete language will over time need to acquire more and more features that push its power up, but not so much as to become Turing complete. You can see this in almost any non Turing complete language: HTML acquires more and more features, CSS the same, configuration files sometimes even become Turing complete by accident (for example sendmail configuration files). Turing complete languages like Javascript only acquire a few features. The features it does acquire are either to accommodate higher performance with low-level features like packed arrays, or they are programmer productivity features. Neither is necessary for a low-level code representation that is meant to be compiled to rather than programmed in.

I think it would be interesting to put together the minimal functionality that such a system would need, like access to graphics, sound, mouse/keyboard IO, networking, ability to execute code dynamically (for example coming from a server storing common libraries).

We call that particular collection of functionality the 'PC'. ;-)

No, the PC doesn't meet the requirements at all. If by PC you mean the hardware, then it doesn't support safe untrusted code execution. If by PC you mean hardware plus a mainstream OS like Windows or Linux, then it doesn't support untrusted code execution either, and neither is it even close to minimal. If you mean PC+OS+browser then it does support untrusted code execution via Javascript, but it's neither efficient nor does it have access to the APIs you need. Also PC+OS+browser is maximally complex, not minimal.

A Chromebook with NaCl comes a lot closer, but NaCl is a real-world compromise, not an ideal system.

As Alan Kay might put it: what are the Maxwell's equations of the web?

expose the navigation

expose the navigation with an API that the search engines can understand. Or instead of letting the crawler understand the page directly you serve a special page to the crawler that it can understand. There is no problem here because the system is flexible enough to solve it.

Ah. So your solution is to either write some extra code just for the crawler (i.e. complication and a standard) or to develop a standard declarative interface to access relevant information about the code (which gets us right back to the situation you abandoned by accepting opaque code).

It has been my observation, over eight years, that this pattern of answer applies to every single problem.

we are already seeing the same problem and solution with Ajax applications

Indeed. DHTML became imperative, and opaque, because HTML was inadequate for our purposes.

Why would those things be harder? I don't see it. Take security. Is it harder to implement a tiny sandbox securely, or to implement thousands of pages of standards securely? Why do mashups become harder? With mashups we are already seeing the trend in the direction of distributed innovation rather than top down standardization.

A sandbox gives you security by sacrificing composition. For example, you cannot achieve secure mashups by using sandboxes. The "hard problem" to solve is achieving security without sacrificing flexibility, performance, composition, and so on.

I have never advocated 'thousands of pages of standards', so I'll just let that straw man burn.

I'm all for distributed innovation and development. I've spent time playing with the idea of Wiki-based IDEs, and my more recent notions involve content-centric linking tamed by a network of registries.

We're talking about client-side applications here, not server-side applications running on a cluster.

We're talking about Alan Kay's vision, Jules Jacobs. In that context, you should question everything you think you know.

For example: How valid is your distinction between client and server? Is it not equally valid to understand this as "running part of the server on the client" using DOM as an API? I.e. it's a server-side application, that happens to run in a distributed mode for performance. Cannot the DOM/API also be understood as a service of sorts? In a sense, the service controlling my DOM/API is a client of my DOM/API service. So perhaps we should also put part of this DOM/API service near its client.

The arbitrary distinctions we make should be the first casualties in any attempt to improve on the status quo, especially if simplicity is a goal. If you want to change perspective, first you need to shake it up.

My current vision is of a zero-tier architecture. Services are clients of services, and clients provide their own services in turn. Code distribution for performance and disruption tolerance occurs in every direction, and is transitive across relevant relationships. This is the ideal case.

And it's simple. There are fewer unnecessary distinctions, such as client vs. server.

It can be secure, too. But to make this happen does require abandoning 'opaque code' for something that can be more precisely analyzed, sharded, and composed by intermediate services.

Across the internet it is impossible to do real-time computation anyway because any network communication can already take arbitrarily long.

Real-time on the Internet requires accepting indeterminism - i.e. the behavior of the system is different (but still occurs in time) during a disruption condition. I model a remote system falling outside its time limits with disruption.

Developing real-time computations on the Internet gives a lot of side benefits. For example, bandwidth consumption and latency are much more consistent and easier to predict. Variations and failure cases can be reduced by orders of magnitude. We can use the real-time structure for consistency, as an alternative to transactions or locks or other insecure concurrency control models.

Real-time can be combined with dynamic behaviors to modify timing requirements at runtime.
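A small sketch of what 'modeling disruption' could look like (my illustration): every remote observation reduces, within its deadline, either to a value or to a well-defined disrupted state.

    type Result<T> = { ok: true; value: T } | { ok: false; disrupted: true };

    // The caller gets *some* well-defined answer within ~ms; which
    // branch fires is indeterministic, but the timing is not.
    function withDeadline<T>(op: Promise<T>, ms: number): Promise<Result<T>> {
      const disrupted = new Promise<Result<T>>(resolve =>
        setTimeout(() => resolve({ ok: false, disrupted: true }), ms));
      const value = op.then((v): Result<T> => ({ ok: true, value: v }));
      return Promise.race([value, disrupted]);
    }

    // e.g.: const r = await withDeadline(fetchRemoteState(), 200);
    // (fetchRemoteState is hypothetical)
    // if (!r.ok) { /* behave under the disruption condition */ }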

A non Turing complete language will over time need to acquire more and more features that push its power up, but not so much as to become Turing complete. You can see this in almost any non Turing complete language: HTML

Your examples are spoiled by the fact that you did not name even one 'non Turing complete' general purpose programming language. An alternative hypothesis is that DSLs cannot work alone.

Anyhow, my statement was that the vast majority of the language does not need to be Turing complete. For example, we can design a language that only becomes Turing complete in the presence of a time-step function and a reference to external state.

No, the PC doesn't meet the requirements at all.

To the contrary, it meets every requirement you listed: "access to graphics, sound, mouse/keyboard IO, networking, ability to execute code dynamically".

Even 'untrusted' code will execute just fine. Now, if you want to add some security requirements, perhaps you should be more explicit about it after you say 'minimal functionality'. ;-)

As Alan Kay might put it: what are the Maxwell's equations of the web?

Ah, but that's a silly question. With physics, we're all forced by the great standards body in the sky to use the same equations. But the web? Might as well ask for Maxwell's equations for politics.

Alan Kay doesn't want equations for the web as it exists, but rather a nice set of equations that - if they were broadly accepted - would serve as a simple basis for understanding and controlling the web.

I have something like such a set of equations, with RDP.

Ah. So your solution is to

Ah. So your solution is to either write some extra code just for the crawler (i.e. complication and a standard) or to develop a standard declarative interface to access relevant information about the code (which gets us right back to the situation you abandoned by accepting opaque code).

This absolutely does not get us back to the original situation! The only semi-standard that you need is how a search engine crawler crawls the web. You need that in any case, but this standard can be extremely simple. For example, a standard interface to the crawler: the crawler's runtime environment exposes two procedures, text(str) and link(str, callback), signaling to the crawler that these things appear on the page. This is just like ordinary clients exposing drawing/sound/keyboard primitive operations. Note that because I'm still stuck in a current-WWW-centric view, these are the things I made up. It's likely that there are better options once you abandon all this legacy. But it is already interesting what a generalization of the current system this is. Link takes a callback, not a URL, so this exposes all navigation uniformly, be it running a page of the standard opaque code format from a server, doing an "Ajax" request that returns a special format and displays it, or doing no network call at all and only navigating locally.

Then UI library writers can use that API to create two versions of their libraries: one version for the crawler and one version for the client. The version for the crawler makes nearly all operations do nothing; the exceptions are the operations that draw text, which call the text() procedure, and the operations that draw navigation elements, which call the link() procedure. We do not have to standardize on one UI library; only the essence of the interface to the crawler has to be standard, not every single thing.
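A sketch of that interface in code - text() and link() are taken from the description above; the widget model around them is invented:

    // The crawler's runtime environment exposes just two procedures.
    interface CrawlerHost {
      text(str: string): void;                        // text appears on the page
      link(label: string, follow: () => void): void;  // navigation, as a callback
    }

    // Invented widget model, standing in for a real UI library.
    type Widget =
      | { kind: "label"; value: string }
      | { kind: "nav"; label: string; onActivate: () => void };

    // The crawler backend of the UI library: drawing, sound, and layout
    // become no-ops; only text and navigation are reported.
    function renderForCrawler(host: CrawlerHost, widgets: Widget[]): void {
      for (const w of widgets) {
        if (w.kind === "label") host.text(w.value);
        else host.link(w.label, w.onActivate);
      }
    }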

So you don't have to write your code twice. Currently, if you write Ajax applications and you want Google to index them with their mechanism, you *do* have to write your code twice.

Indeed. DHTML became imperative, and opaque, because HTML was inadequate for our purposes.

Exactly, as any woefully non Turing complete language is bound to become. You get a better end result if you acknowledge this and design for it consciously.

A sandbox gives you security by sacrificing composition. For example, you cannot achieve secure mashups by using sandboxes. The "hard problem" to solve is achieving security without sacrificing flexibility, performance, composition, and so on.

I have never advocated 'thousands of pages of standards', so I'll just let that straw man burn.

I compared it to that because the current system is thousands of pages of standards.

We already have sandboxes in the form of Javascript; the difference is that it is not ideal. Why does this mean you cannot achieve secure mashups, or have flexibility, performance and composition? The code running in the sandbox is allowed to make network calls (like Javascript can) and in this way it can do mashups just like they are done today, but with more flexibility and performance.

And it's simple. There are fewer unnecessary distinctions, such as client vs. server.

It can be secure, too. But to make this happen does require abandoning 'opaque code' for something that can be more precisely analyzed, sharded, and composed by intermediate services.

That is a fine goal, but in the end some instructions are going to run on the client's hardware and some are going to run on the server's hardware. Network communication is not free, and developers are going to want to decide where something runs.

A system like yours where this is decided and integrated automatically could be built *on top* of this improved web, for those who need it and for whom it is performant enough.

Real-time on the Internet requires accepting indeterminism - i.e. the behavior of the system is different (but still occurs in time) during a disruption condition. I model a remote system falling outside its time limits with disruption.

Developing real-time computations on the Internet gives a lot of side benefits. For example, bandwidth consumption and latency are much more consistent and easier to predict. Variations and failure cases can be reduced by orders of magnitude. We can use the real-time structure for consistency, as an alternative to transactions or locks or other insecure concurrency control models.

Real-time can be combined with dynamic behaviors to modify timing requirements at runtime.

In that way any application can be declared real time: if you miss your time schedule we'll just declare that acceptable behavior. The latency and bandwidth of network communication are impossible to predict reliably. You might still be able to get some benefit out of real time guarantees between two network calls, but this is not enough to do distributed consistency, as distributed systems inevitably make network calls. Also many inconsistencies appear not because of unpredictability of the timing of the code, but because of the unpredictability of the timing of humans. For example, take banks that keep account balances. You cannot predict when somebody is going to transfer money from one bank to another, and therefore you have to use some kind of locking/transactions. As with automatic distribution mentioned above, this realtime stuff is something that should be built on top of a simple system by those who need it.

Your examples are spoiled by the fact that you did not name even one 'non Turing complete' general purpose programming language. An alternative hypothesis is that DSLs cannot work alone.

Anyhow, my statement was that the vast majority of the language does not need to be Turing complete. For example, we can design a language that only becomes Turing complete in the presence of a time-step function and a reference to external state.

Can you name a non Turing complete language that you're confident is general purpose for all the applications that we want to run? What is the advantage of using that language? Remember that the fact that you cannot do all analyses on Turing complete languages doesn't mean that you *can* do this on non Turing complete languages. Many analyses on non Turing complete languages are also undecidable, or completely impractical. If you're adding a way to make it essentially Turing complete, then I suppose that would be enough, but the question "what's the advantage of this?" becomes even more pressing. We can also view assembly language as non Turing complete; it only becomes Turing complete in the presence of backwards jumps.

Even 'untrusted' code will execute just fine. Now, if you want to add some security requirements, perhaps you should be more explicit about it after you say 'minimal functionality'. ;-)

...

Ah, but that's a silly question. With physics, we're all forced by the great standards body in the sky to use the same equations. But the web? Might as well ask for Maxwell's equations for politics.

Alan Kay doesn't want equations for the web as it exists, but rather a nice set of equations that - if they were broadly accepted - would serve as a simple basis for understanding and controlling the web.

What gave you the impression that I was asking for Maxwell's equations for the web as it exists? Everything I've written so far is contrary to that.

Alan Kay is using the phrase "Maxwell's equations for X" in that way himself; this is not my invention. If you think his question is silly, you should probably tell him, not me :) While I agree with you that this analogy without context is not perfect, when he explains it in context in the linked video it is very clear what he means.

I have something like such a set of equations, with RDP.

Now that is very interesting. Can you share this description with us? If it really is such a set of equations, chances are somebody will implement it for you ;)

Little Bits of History Repeating

This absolutely does not get us back to the original situation! The only semi-standard that you need is how a search engine crawler crawls the web.

Well, history doesn't repeat itself, but it does rhyme.

The situation with web crawlers is just one of many semi-standards that will eventually be developed. You'll ultimately want more semi-standards for translating natural language text, screen readers and blind accessibility, client-defined stylization (CSS equiv), occasionally offline work (disruption, storage), level-of-detail or zoomable user interfaces, embedding in other applications or services, work mobility between machines, history and wayback, and so on.

It might not seem like much when you're considering just a few more interfaces for the crawler. But say 'just a few more' just a few more times, and you'll have a real mess.

I don't actually mind an API-basis for communication between services (i.e. treating the service as manipulating an API on the client, and vice versa). But the problem of developing a sane API is no different than developing a language.

it is already interesting what a generalization of the current system this is. Link takes a callback, not a URL, so this exposes all navigation uniformly, be it running a page of the standard opaque code format from a server, doing an "Ajax" request that returns a special format and displays it, or doing no network call at all and only navigating locally.

It is also interesting to note the problems this has relative to the current system. Link takes a callback, not a URL and a standard operator; therefore the client cannot be sure what context it utilizes. Issues such as whether we can use a cached state for the link, or manage it in a search engine, keep it in history, or bookmark it, whether it is idempotent or it has side-effects, exclusive choice, and so on are invisible. This opacity also holds uniformly.

Exactly, as any woefully non Turing complete language is bound to become. You get a better end result if you acknowledge this and design for it consciously.

How do you distinguish the 'woefully' non Turing complete languages from the 'merely' non Turing complete ones? ;)

I have never advocated 'thousands of pages of standards', so I'll just let that straw man burn.

I compared it to that because the current system is thousands of pages of standards.

I'm not advocating the current system, either.

We already have sandboxes in the form of Javascript; the difference is that it is not ideal. Why does this mean you cannot achieve secure mashups, or have flexibility, performance and composition?

The problem with sandboxes is composition - two different sandboxes cannot effectively share fine-grained objects and callbacks with one another.

JavaScript doesn't support secure mashups today. There is a reason we have the 'single origin restriction'. ES5 has a 'strict' mode that is intended as a transition to an object capability based version of JavaScript, which would allow secure mashups.

The code running in the sandbox is allowed to make network calls (like Javascript can) and in this way it can do mashups just like they are done today, but with more flexibility and performance.

Ability to make network calls is insufficient for secure mashups. You must consider how the services interact once tightly interwoven (e.g. into the same page or display). In the general case, each app will have secrets that it wants to withhold from the other apps yet reveal to the client. Resolving this specific problem - without separating the display into distinct islands (I said 'mashups', not 'portals'!) - is necessary for secure mashups.

developers are going to want to decide where something runs.

Developers want rather arbitrary levels of control - whatever suits their needs at the moment. My answer to such scenarios is to favor a simple constraint model, i.e. such that developers can declaratively specify distribution requirements and preferences with the precision they desire, and a predictable solution will be found.

Long term, I'd like to eventually support a 'market of resources', e.g. where CPU, bandwidth, memory and such are auctioned at competitive rates. That covers the cost issues nicely.

In that way any application can be declared real time: if you miss your time schedule we'll just declare that acceptable behavior.

You misconstrue the scenario. I said 'indeterminism', but still in the context of 'real-time'. Reducing to a well-defined negative response state in real-time is not a property many applications support today. Requirements in a distributed system will cover disruption - or they were simply not sane requirements.

latency and bandwidth of network communication are impossible to predict reliably

I disagree. People can buy reliable bandwidth from providers if they need it. If you develop with quality of service specifications in mind, you can generally achieve them.

many inconsistencies appear not because of unpredictability of the timing of the code, but because of the unpredictability of the timing of humans

Naturally, a system can only be deterministic up to input. But that doesn't mean we need to introduce 'inconsistencies' in the system. Maybe you're confusing inconsistency with indeterminism?

this realtime stuff is something that should be built on top of a simple system by those who need it

Other way around, I think. It is really hard to build real-time computation atop a non real-time system, but vice versa is quite easy.

If you're adding a way to make it essentially Turing complete, then I suppose that would be enough, but the question "what's the advantage of this?" becomes even more pressing

I pursue a lot of nice compositional properties.

A property is 'compositional' if you can reason about it inductively across subsystems composed by a common set of composition operators. Compositional properties are important: they are the only properties we can reliably reason about at scale, or in open systems, when we lack the ability to peek inside every subprogram.

A simple example of a compositional property is that HTTP GET is supposed to be idempotent, cacheable, and effect-free. This allows us to crawl the web without modifying it, to keep bookmarks, et cetera.

A simple example of a non-compositional property is progress or latency for lock-based concurrency control. Two deadlock-free subprograms might deadlock when composed.
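A sketch of that classic failure (illustrative code; the Mutex is a minimal stand-in): each function always terminates when run alone, yet the composition can deadlock.

    // Minimal FIFO async mutex, for illustration only.
    class Mutex {
      private tail: Promise<void> = Promise.resolve();
      lock(): Promise<() => void> {
        let release!: () => void;
        const turn = new Promise<void>(res => (release = res));
        const prev = this.tail;
        this.tail = prev.then(() => turn);
        return prev.then(() => release);
      }
    }
    const sleep = (ms: number) => new Promise(res => setTimeout(res, ms));
    const a = new Mutex(), b = new Mutex();

    async function p() { // deadlock-free in isolation: locks a, then b
      const ra = await a.lock(); await sleep(10);
      const rb = await b.lock(); rb(); ra();
    }
    async function q() { // deadlock-free in isolation: locks b, then a
      const rb = await b.lock(); await sleep(10);
      const ra = await a.lock(); ra(); rb();
    }
    // Composed: p holds a and waits for b; q holds b and waits for a.
    Promise.all([p(), q()]).then(() => console.log("unreachable"));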

In general, we cannot achieve compositional properties such as progress, latency, incremental computation, job control, resource control (e.g. guaranteed release), eventual consistency, reactivity, resilience, content caching optimization, zoomability and level-of-detail, revocability, coordination, resistance to denial-of-service attacks, security, safety, disruption tolerance, etc. if we have Turing-complete functions internally.

Just switching to 'non Turing complete' isn't a cure, though. It's just one requirement for a good solution.

Can you name a non Turing complete language that you're confident is general purpose for all the applications that we want to run?

Sure. Coq, Dedalus, Bloom, Esterel, Funloft... these languages lack Turing complete functions, yet can model complete systems in the presence of an infinite external input (such as time).

Can you share this description with us? If it really is such a set of equations, chances are somebody will implement it for you ;)

Maxwell's equations on a T-shirt would typically be presented without any context. To actually understand them, you usually need to take a class - in the middle of a long chain of classes.

Anyhow, I don't have fancy notation for RDP's equations, but I do have a small, simple set of formal properties (which is 'something like such a set of equations'): commutativity and idempotence of demand on an agent, duration coupling of demand and response, state as a continuous integral of observations over time, and inability to interact with the past (which actualizes as: references are not part of state).
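For the curious, one plausible way to write a few of those properties down; this is my own formalization of the preceding sentence, not the author's notation. Let resp(a, D) be agent a's response to the multiset D of concurrently active demands:

    % Idempotence and commutativity: the response depends only on the
    % set of active demands, not on duplicates or ordering.
    \mathrm{resp}(a, D \uplus \{d, d\}) = \mathrm{resp}(a, D \uplus \{d\})
    \mathrm{resp}(a, \{d_1, d_2\}) = \mathrm{resp}(a, \{d_2, d_1\})
    % Duration coupling: a response is active exactly while its demand is.
    \mathrm{active}(\mathrm{resp}(a, d), t) \iff \mathrm{active}(d, t)
    % State as a continuous integral of observations over time.
    \mathrm{state}(t) = \int_{-\infty}^{t} \mathrm{observe}(\tau)\, d\tau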

It might not seem like much

It might not seem like much when you're considering just a few more interfaces for the crawler. But say 'just a few more' just a few more times, and you'll have a real mess.

I don't actually mind an API-basis for communication between services (i.e. treating the service as manipulating an API on the client, and vice versa). But the problem of developing a sane API is no different than developing a language.

I partially agree, but many of these semi-standards are not a problem. Only the (UI) library developers need to worry about them. You don't need to standardize literally everything to get these things, and not every UI library needs to have all of them. The stuff you mention happens to be possible with the current web standards. Even if we end up with just as many messy standards as the current web standards (though that is quickly disproved: in the worst case we end up with the current standards minus the "blink" tag), the situation is still better because innovation can go faster for things that were not anticipated by the standards body. If there is a new scenario that needs to be supported, it can be adopted quickly by UI libraries instead of waiting for a standards body that will standardize it without any actual experience with it yet.

Another advantage is that you don't have to perpetually support ancient standards in browsers, which is what makes it pretty much impossible to write a new browser these days. The libraries that a particular web app depends on will not be in the browser; they are also served over the internet. So if a particular thing falls out of favor, we can ditch the cruft instead of keeping it around in browsers. In other words, instead of only accumulating more things, we have the possibility that things will fall out of favor. For example, imagine the web in 100 years. Browsers will still support all the cruft from this age: the blink tag, the b/i/strong/em/etc tags, TIF files, etc. If instead those were implemented in libraries, those libraries would simply go unused after a while, and nobody would need to drag the mess along in new web apps.

Issues such as whether we can use a cached state for the link, or manage it in a search engine, keep it in history, or bookmark it, whether it is idempotent or it has side-effects, exclusive choice, and so on are invisible. This opacity also holds uniformly.

A URL has all of the same problems. If the result of a callback should be cached, the callback will take care of that. If a URL needs to be cached, this needs to be encoded in the standard (and currently it is -- there is a multitude of ways that a server can signal to a browser that a request should be cached). Whether requesting a URL has side effects is also unclear, and the same goes for the other things. Sure, according to the standard a GET request shouldn't have side effects. Remember what happened with Google's Web Accelerator product, which preloaded pages?
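For instance, a minimal Node-style sketch (the Cache-Control and ETag headers are standard HTTP; everything else here is just illustration):

    import * as http from "http";

    // The server, not the client, decides cacheability, and signals it
    // through standard response headers that any browser or proxy understands.
    http.createServer((req, res) => {
      res.setHeader("Cache-Control", "public, max-age=3600"); // reuse for an hour
      res.setHeader("ETag", '"v1"'); // validator, so clients can revalidate cheaply
      res.end("hello");
    }).listen(8080);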

You can of course also encode all this information in the callback object (it doesn't need to be just a function pointer). How much of this information to standardize needs a lot of thought. Perhaps we only need to standardize a mechanism to put arbitrary information on the callback.
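To sketch what I mean (a hypothetical shape, nothing standardized):

    // A callback that carries its own contract as inspectable metadata,
    // so intermediaries (caches, crawlers, history) can reason about it.
    interface AnnotatedCallback<A, B> {
      (arg: A): B;               // the behavior itself
      idempotent?: boolean;      // safe to retry or deduplicate?
      sideEffectFree?: boolean;  // safe to prefetch?
      cacheTtlMs?: number;       // how long a result may be reused
    }

    const lookupPrice = (sku: string): number => sku.length; // stand-in lookup

    const getPrice: AnnotatedCallback<string, number> = Object.assign(
      (sku: string) => lookupPrice(sku),
      { idempotent: true, sideEffectFree: true, cacheTtlMs: 60_000 }
    );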

How do you distinguish the 'woefully' non Turing complete languages from the 'merely' non Turing complete ones? ;)

I inserted that word because you seem to make a distinction between languages that are Turing complete and ones that become Turing complete only because of the environment they execute in. A language that is essentially Turing complete in that way may not acquire new features the way genuinely non Turing complete languages do.

The problem with sandboxes is composition - two different sandboxes cannot effectively share fine-grained objects and callbacks with one another.

JavaScript doesn't support secure mashups today. There is a reason we have the 'single origin restriction'. ES5 has a 'strict' mode that is intended as a transition to an object capability based version of JavaScript, which would allow secure mashups.

[...]

Ability to make network calls is insufficient for secure mashups. You must consider how the services interact once tightly interwoven (e.g. into the same page or display). In the general case, each app will have secrets that it wants to withhold from the other apps yet reveal to the client. Resolving this specific problem - without separating the display into distinct islands (I said 'mashups', not 'portals'!) - is necessary for secure mashups.

Why couldn't two sandboxes share fine-grained objects if the code running in both sandboxes agrees to do so? Perhaps you wouldn't call such a thing a sandbox, but I don't know another word for it. What I mean by a master running servant code in a sandbox is that the master has the ability to decide how much of its own functionality the servant is able to access. In other words: the master doesn't just pass all its capabilities to the servant (like eval in Javascript does), but it can instead decide exactly which capabilities to provide.

In any case this is the question I'm asking, not answering: what is a good execution environment for such a system? Eval isn't good enough -- one option is a safe eval that, along with the code to run, also takes an environment of capabilities passed in from the outside, but perhaps there are better options.
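To fix ideas, here is a rough sketch of the interface I mean (this is NOT a secure implementation - Function-constructed code in today's Javascript still sees globals; a real design has to cut those off too):

    type Caps = Record<string, unknown>;

    // The untrusted code receives exactly one binding: `caps`.
    function capEval(code: string, caps: Caps): unknown {
      const fn = new Function("caps", `"use strict"; return (${code});`);
      return fn(caps);
    }

    // The master decides which capabilities the servant gets:
    const result = capEval("caps.log('hi'), 1 + 2", {
      log: (s: string) => console.log(s), // no network, no screen, just a logger
    });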

I disagree. People can buy reliable bandwidth from providers if they need it. If you develop with quality of service specifications in mind, you can generally achieve them.

That's not how the internet works. You might be able to buy reliable bandwidth to your ISP, but you won't realistically be able to buy reliable end-to-end bandwidth, let alone latency, without completely reworking the whole internet. Even in systems where you completely control everything (say a cluster), people don't build reliable applications that way. For example your bank doesn't rely on timing between its internal servers to make sure that a transaction completes reliably (i.e. that no money is lost or duplicated).

Developers want rather arbitrary levels of control - whatever suits their needs at the moment. My answer to such scenarios is to favor a simple constraint model, i.e. such that developers can declaratively specify distribution requirements and preferences with the precision they desire, and a predictable solution will be found.

Long term, I'd like to eventually support a 'market of resources', e.g. where CPU, bandwidth, memory and such are auctioned at competitive rates. That covers the cost issues nicely.

This is something that should be built on top of this, not at the base. People have already done work in this area when compiling to Javascript. I believe Microsoft has a .NET product that lets you annotate methods with "server" or "client" that decides whether the method will be compiled to Javascript and run on the client or be compiled to machine code and run on the server. Perhaps you know the name?

Other way around, I think. It is really hard to build real-time computation atop a non real-time system, but vice versa is quite easy.

You need to guarantee that some things happen in constant time, for example adding two fixed size ints. But you do not need to provide a whole framework and tools and design everything for real time applications (an extremely small percentage of applications). That can be built on top of the base.

In general, we cannot achieve compositional properties such as progress, latency, incremental computation, job control, resource control (e.g. guaranteed release), eventual consistency, reactivity, resilience, content caching optimization, zoomability and level-of-detail, revocability, coordination, resistance to denial-of-service attacks, security, safety, disruption tolerance, etc. if we have Turing-complete functions internally.

Just switching to 'non Turing complete' isn't a cure, though. It's just one requirement for a good solution.

Like before, you can build this on top of the simple base. There are millions of nice-to-haves for specific applications. Forcing the complexity on everyone by putting it in a global standard is a mistake. The other problem is that we don't yet know how to create compositional building blocks for most of these things, so whatever we standardize would probably be mostly wrong. Also, not all applications need these things in the same ways, so a single global standard doesn't work.

Sure. Coq, Dedalus, Bloom, Esterel, Funloft... these languages lack Turing complete functions, yet can model complete systems in the presence of an infinite external input (such as time).

Right, if something is essentially complete but made non complete with a sleight of hand, that could also work. We're talking about an assembly language for the web. It smells like a (non) Turing tar pit argument: look how easy it is to reason about SK calculus programs, there are only a few simple rules! Until you start to write real programs with SK combinators; then you see that it's not easy to reason about those programs at all. The essential difficulties are still there, and because you wrote it in a less powerful language your program becomes enough more complicated to negate the benefits from the theoretically easier reasoning in the system. It is fine that you can do such reasoning in a language built on top of the assembly language, but you don't put that right into the assembly language.

Maxwell's equations on a T-shirt would typically be presented without any context. To actually understand them, you usually need to take a class - in the middle of a long chain of classes.

Anyhow, I don't have fancy notation for RDP's equations, but I do have a small, simple set of formal properties (which is 'something like such a set of equations'): commutativity and idempotence of demand on an agent, duration coupling of demand and response, state as a continuous integral of observations over time, and inability to interact with the past (which actualizes as: references are not part of state).

You can define Maxwell's equations with mathematics from his age in less than a page. That doesn't mean that you don't need a class to understand all their consequences. I can define a program in less than a page that is formally completely defined (for example, a program that computes a Collatz sequence), but that is extremely hard to understand.
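For example (a direct transcription; a few formally precise lines, yet whether the loop terminates for every input is the open Collatz conjecture):

    // Count the steps the Collatz iteration takes to reach 1.
    function collatzSteps(n: bigint): bigint {
      let steps = 0n;
      while (n !== 1n) {
        n = n % 2n === 0n ? n / 2n : 3n * n + 1n;
        steps++;
      }
      return steps;
    }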

Can you give a complete formal model of what you have in mind (for example an operational semantics)? If you can't do that then your language is either fundamentally new and needs a new kind of mathematics to describe (unlikely), or the model of it you have in your head is not precise yet.

A few questions: what is a demand of an agent? What is an agent? With respect to what are the demands commutative and idempotent? (I suspect with respect to time) What do you mean by duration coupling? What do you mean by continuous integral? What do you mean by continuous? (the topological kind of continuous? -- if so, what is the topology you're working with?).

innovation can go faster for

innovation can go faster for things that were not anticipated by the standards body

Certain classes of innovation can be faster, certainly.

For example, secure code distribution (especially if we can distribute to both servers and clients) would allow us to 'abstract' new network overlays and protocols as we need them. I.e. an application could easily view a network as a distributed machine - no painful configuration or administration requirements, no waiting on standards bodies.

My motivation for pursuing a secure, effective language - i.e. one that lets us reason about denial-of-service attacks and secure mashups, and thus allows rapid innovation for distributed computing without a lot of human oversight - is apparently the same as your motivation for favoring opaque code. I don't think we can both be right.

you don't have to perpetually support ancient standards in browsers, which makes it pretty much impossible to write a new browser these days

Well, flexible support for adapters to a new API is its own challenge, but one I've solved in my design (at least on paper) via a content-centric linking mechanism.

A URL has all of the same problems.

It does not. Since we know that a URL is to be used with a certain protocol and its methods (GET, POST, etc.), we know a lot about what we can do with the URL.

If the result of a callback should be cached, the callback will take care of that.

Whether caching is an optimization or a performance drain depends very heavily on context. It is not a decision that the callback, alone, should make. I'm sure you could develop idioms (e.g. provide a cache capability and policy to the callback), but you need to be careful to avoid over-simplifying these issues.
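One such idiom, very roughly (hypothetical names; just to show the capability flowing in from the context):

    interface CacheCap {
      get(key: string): string | undefined;
      put(key: string, value: string, ttlMs: number): void;
    }

    // The context decides whether caching happens by deciding what to pass;
    // the callback merely uses whatever cache capability it is given.
    function fetchProfile(userId: string, cache?: CacheCap): string {
      const hit = cache?.get(userId);
      if (hit !== undefined) return hit;
      const fresh = `profile:${userId}`; // stand-in for the real work
      cache?.put(userId, fresh, 30_000);
      return fresh;
    }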

Perhaps we only need to standardize a mechanism to put arbitrary information on the callback.

Like cookies? or dynamic scope?

I did favor such solutions years ago, but later found the use of implicit context to be opposed to secure interaction design and to multi-cast or CDN optimizations.

I've since developed a more secure and optimizable approach to accumulating meta-data, based on a simple network of inheritance between fine-grained reactive databases (which I call 'registries'). But it doesn't work the same way - i.e. you would use it with a resource specification, rather than with a 'callback' (or dynamic behavior).

Why couldn't two sandboxes share fine-grained objects if the code running in both sandboxes agrees to do so?

A sandbox uses calls to the environment. Code, scripts, objects, etc. moved between sandboxes would use the new host's environment. If each sandbox has its own network socket, then the sandbox holds too much authority to support secure computation with sensitive client data (i.e. stuff that shouldn't go out to the network).

a master running servant code in a sandbox

If one app is master of another, you have already violated the essential secure mashup requirement - that both mashed apps can keep secrets from one another.

what is a good execution environment for such a system?

Object capability model. (see erights.org)

As I said, JavaScript's future version - Harmony - is pursuing this direction precisely because sandboxing is inadequate. I'm not sure whether JS will ever get there, though - I tend to believe in language improvements only after they're accepted.

Even in systems where you completely control everything (say a cluster), people don't build reliable applications that way. For example your bank doesn't rely on timing between its internal servers to make sure that a transaction completes reliably (i.e. that no money is lost or duplicated).

I think you're forgetting that banks did, in fact, build their own ATM networks in order to achieve some rather precise timing properties.

Anyhow, transactions are ultimately just another way to trade latency for consistency. They can be modeled easily enough with a stateful handshake idiom, if we really want them, but transactions are not a good default. For control systems, timing is used almost exclusively - traffic control, within your car, within an airplane, nuclear power plants, etc.

Obviously, we don't want fragile systems that break after mistiming one update. Resilience and stability describe properties for quickly recovering to a good state after parts of the system gang aft agley. These are properties I pursue quite rigorously, i.e. providing mechanisms for compositional resilience and stability.

[Market integration] is something that should be built on top of this, not at the base.

I wouldn't want market integration built into the network protocol, but secure, fine-grained sharding does need to be accommodated by the code - e.g. via its type system or some sort of annotation and contagion model. Without that, distribution is far too coarse grained.

You need to guarantee that some things happen in constant time, for example adding two fixed size ints. But you do not need to provide a whole framework and tools and design everything for real time applications (an extremely small percentage of applications). That can be built on top of the base.

Contrary to your assertions, most programming domains (including UI, multi-media, games, control systems, databases, etc.) do have either real-time or soft real-time requirements. Even those domains that do not seem to have real-time requirements (such as ray tracing) greatly benefit from real-time meta-facilities such as job control, auditing, debugging, and incremental status in real-time.

This is not something specific to "an extremely small percentage of applications". The fact that we make do without real-time today has more to do with our ineffective tools than a lack of desire.

In any case, if we want 'real-time' to be compositional - across libraries or services - it must be accommodated at the base. Otherwise we are forced to build a 'monolithic' application every time we want real-time.

Furthermore, predicative timing is also a security requirement - against denial-of-service attacks. If we want secure code distribution without constant human supervision, predicative timing is a requirement.

Saying, "but, you can build your base atop mine" is not a valid justification for the extra layer, especially if the converse is also true.

There are millions of nice-to-haves for specific applications. Forcing the complexity on everyone by putting it in a global standard is a mistake.

There are many nice-to-haves that apply across nearly all applications, services, and libraries. Supporting these nice-to-haves in the language design is actually the simpler solution.

To grok this, you must account for how libraries and services are developed and integrated. To avoid monolithic development per project, and complexity of extra code, it is essential that libraries and services be reusable in many applications. Developers working on specific problems typically lack perspective to write good code for broad reuse - i.e. they are focused on a specific app. Thus, the system should guide them to achieving features that they don't immediately understand are necessary - i.e. via compositional reasoning, path of least resistance, and systematic pressure. One might call these 'liberating constraints'.

Making it difficult (indirect, but possible) to express Turing complete computations is just as useful for managing the path of least resistance as it is for supporting compositional reasoning.

The use of opaque code in a sandbox is simplistic. Sweeping complications under a fresh layer of opaque carpeting doesn't mean they go away. It just means that people will stumble over lumpy problems all over again, until they've developed a whole new set of standards to work around them.

Same bad song, stuck on repeat. You are singing it.

we don't yet know how to create compositional building blocks for most of these things, so whatever we standardize would probably be mostly wrong

Knowing - that we need a standard - is half the battle.

If we don't accept that much, we'll just keep singing the same bad song same bad song same bad song

In any case, one can understand NaCl as trying to be a de-facto global standard of sorts, too (one using x86 as its 'compositional building block'). So the goal isn't to be perfect... it's to do better than x86 - or .Net, or JavaScript, or so on.

Anyhow, I've been playing with this problem for many years now, running various user stories and developer stories, and implementing a few of the more promising models. I do have a very promising set of 'compositional building blocks'.

Right, if something is essentially complete but made non complete with a sleight of hand, that could also work. It also smells like a (non) Turing tar pit argument: look how easy it is to reason about SK calculus programs, there are only a few simple rules!

The difference between a simplistic language (like SKI combinators) and a simple one (like Forth) is not computational power, but rather expressive power (ability to express behaviors of other languages using only local transforms) and abstractive power (ability to modify a language from within itself). The same is true for non Turing complete languages.

I favor models that have more expressive and abstractive power than the 'opaque imperative code' you seem willing to settle for.

I don't measure simplicity of reasoning by number of rules, but rather by compositional properties, and straightforward expression for a variety of developer stories. Having few rules contributes to simplicity, but more important is having good rules: orthogonality, duality, generality.

The essential difficulties are still there, and because you wrote it in a less powerful language your program becomes enough more complicated to negate the benefits from the theoretically easier reasoning in the system.

That's how I feel about your 'opaque imperative code'. The essential difficulties are still there. Hiding them doesn't make them go away, it just makes problems and bugs less obvious. And the integration and interop complications negate all proposed simplicity benefits (with interest).

In any case, I certainly am guided by essential difficulties - such as orchestration of sensors, actuators, and data systems. And integration of different problem domains. And job control. And even continuous innovation.

You can define Maxwell's equations with mathematics from his age in less than a page.

How many pages does it take to fully define the notation with which his equations are written?

I believe I could describe all of RDP, including its composition rules and its formal properties, in less than a page - if I had a good notation and did not need to explain or implement it. But that's a pretty big gotcha.

Can you give a complete formal model of what you have in mind (for example an operational semantics)?

Mostly, yes. I'm developing a relatively formal model in Haskell typeclasses, and validating it with an implementation. But Haskell isn't the ideal substrate for such a proof, so I plan to try Coq after I'm satisfied with the Haskell model and implementation.

Regarding your questions about my model: you can look into my blog, but I'll answer here:

  • What is a demand [on] an agent? A signal, pushed to an agent.
  • What is an agent? A hidden behavior model, with constrained side effects (constraints involve commutativity, idempotence, duration coupling, and inability to indefinitely store references)
  • With respect to what are the demands commutative and idempotent? (I suspect with respect to time) Space. I sometimes use the phrases spatially idempotent and spatially commutative. Identical demands have no additional effect. Source of a demand cannot be observed. We can model demands at any given instant with a set.
  • What do you mean by duration coupling? Signals have liveness durations, which often overlap. Duration of response signal (signal from an agent) is precisely equal to duration of demand signal (pushed to an agent).
  • What do you mean by continuous integral? We understand state as an accumulation function in continuous time involving present state and present observations - no dependence on history.
  • What do you mean by continuous? Time is continuous. In some domains, state might also be continuous (e.g. a Newtonian formula of time), or might be a piecewise discrete function of continuous time.
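To make the first two properties concrete, a toy sketch (this is illustration only - not my model, and every name in it is made up):

    // Demands at an instant form a *set*: spatial idempotence means an
    // identical demand adds nothing, and spatial commutativity means the
    // order and origin of demands are unobservable.
    type Demand = string; // stand-in for a signal value at one instant

    function respond(demands: Set<Demand>): Map<Demand, string> {
      const out = new Map<Demand, string>();
      // Duration coupling, per instant: a response exists exactly while
      // the corresponding demand does.
      for (const d of demands) out.set(d, `response to ${d}`);
      return out;
    }

    const ds = new Set(["query:temp", "query:temp"]); // collapses to one element
    console.log(respond(ds).size); // 1 - the duplicate demand had no effect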

Well, flexible support for

Well, flexible support for adapters to a new API is its own challenge, but one I've solved in my design (at least on paper) via a content-centric linking mechanism.

Care to share this with the world? Claiming to have solved something is easy, solving something is hard.

It does not. Since we know that a URL is to be used with a certain protocol and its methods (GET, POST, etc.), we know a lot about what we can do with the URL.

In theory yes, in practice no. Did you not notice the part where I gave an example of this going wrong in practice? In practice, GETs are not idempotent.

I've since developed a more secure and optimizable approach to accumulating meta-data, based on a simple network of inheritance between fine-grained reactive databases (which I call 'registries'). But it doesn't work the same way - i.e. you would use it with a resource specification, rather than with a 'callback' (or dynamic behavior).

A reasonably precise description, or better yet a proof of concept, would be appreciated.

If one app is master of another, you have already violated the essential secure mashup requirement - that both mashed apps can keep secrets from one another.

They can. The master app runs multiple servants, and those can keep secrets from each other. Just like e.g. an OS can isolate different applications from each other, and can selectively allow them to share stuff (e.g. shared RAM in current OSes). Even if one of these apps invoked the other (and so is the master of the other) in Javascript with eval, the code that is being evalled can still keep secrets by using lexical scope to make sure that the data doesn't escape to the outside.
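A minimal sketch of the lexical scope point (assuming a language that actually enforces encapsulation, which today's Javascript only approximates):

    // The servant keeps a secret; only the functions it chooses to
    // return can touch `secret`, so the master never sees it directly.
    function servant() {
      const secret = "do not leak";
      return {
        check: (guess: string) => guess === secret, // an oracle, not the secret
      };
    }

    const s = servant();
    console.log(s.check("nope")); // false; no path from `s` back to `secret`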

I think you're forgetting that banks did, in fact, build their own ATM networks in order to achieve some rather precise timing properties.

They don't rely on precise timing for completing a money transfer correctly. Also, I'm not saying transactions are the answer, as you seem to have interpreted. Transactions are an answer to a limited set of situations. There are many answers for differing requirements. That's why we shouldn't try to put another kitchen sink in a standard, but rather standardize a simple "assembly language" on top of which this can be built.

Saying, "but, you can build your base atop mine" is not a valid justification for the extra layer, especially if the converse is also true.

It is a very valid reason for not putting extra complexity in a standard.

I wouldn't want market integration built into the network protocol, but secure, fine-grained sharding does need to be accommodated by the code - e.g. via its type system or some sort of annotation and contagion model. Without that, distribution is far too coarse grained.

That can be built into languages that are built on top of this. Not a good idea to standardize such highly speculative and so far unsuccessful features.

Contrary to your assertions, most programming domains (including UI, multi-media, games, control systems, databases, etc.) do have either real-time or soft real-time requirements. Even those domains that do not seem to have real-time requirements (such as ray tracing) greatly benefit from real-time meta-facilities such as job control, auditing, debugging, and incremental status in real-time.

This is not something specific to "an extremely small percentage of applications". The fact that we make do without real-time today has more to do with our ineffective tools than a lack of desire.

In any case, if we want 'real-time' to be compositional - across libraries or services - it must be accommodated at the base. Otherwise we are forced to build a 'monolithic' application every time we want real-time.

Furthermore, predicative timing is also a security requirement - against denial-of-service attacks. If we want secure code distribution without constant human supervision, predicative timing is a requirement.

Yet those applications seem to do very well without special features. If you have better tools, do present them. What does predicative timing mean? Google brings up two results: a scammy blog post advertisement and this LTU page. The way you let other people run code on your hardware is generally done by giving them limited computation resources. This would be useful, and the Racket/PLT guys have done things in this area, where you can run a kind of sandbox with limited resources.

Making it difficult (indirect, but possible) to express Turing complete computations is just as useful for managing the path of least resistance as it is for supporting compositional reasoning.

The use of opaque code in a sandbox is simplistic. Sweeping complications under a fresh layer of opaque carpeting doesn't mean they go away. It just means that people will stumble over lumpy problems all over again, until they've developed a whole new set of standards to work around them.

Same bad song, stuck on repeat. You are singing it.

Remember that we are talking about a base, not about a language that people program in. Just as ease of reasoning is a non-goal for an assembly language for a processor, it is also a non-goal for an assembly language for the web.

If we don't accept that much, we'll just keep singing the same bad song same bad song same bad song

In any case, one can understand NaCl as trying to be a de-facto global standard of sorts, too (one using x86 as its 'compositional building block'). So the goal isn't to be perfect... it's to do better than x86 - or .Net, or JavaScript, or so on.

Anyhow, I've been playing with this problem for many years now, running various user stories and developer stories, and implementing a few of the more promising models. I do have a very promising set of 'compositional building blocks'.

It seems that you are singing the same bad song: claiming that you solved issues without actually providing a single shred of evidence for it. I agree that the goal is to be better than x86 etc., but while we're at it we might as well try to do more than just an epsilon better.

I favor models that have more expressive and abstractive power than the 'opaque imperative code' you seem willing to settle for.

I don't measure simplicity of reasoning by number of rules, but rather by compositional properties, and straightforward expression for a variety of developer stories. Having few rules contributes to simplicity, but more important is having good rules: orthogonality, duality, generality.

Remember, this is not about inventing the perfect programming language for web applications. It's about a base on which such languages can be built.

How many pages does it take to fully define the notation with which his equations are written?

I believe I could describe all of RDP, including its composition rules and its formal properties, in less than a page - if I had a good notation and did not need to explain or implement it. But that's a pretty big gotcha.

That depends on how much mathematics you are willing to accept as given. Mathematics from his age: a couple of lines. Just real numbers and their properties: half a page. Nothing but set theory: one page, perhaps two (you define the natural numbers, rationals, real numbers, derivatives, and then you express his equations).

Now, you are free to use all of modern mathematics to define RDP. Surely that should be enough? It is more than enough for all existing programming languages. If it is not enough could the problem be that the idea you have in mind is vague, rather than that mathematics is not powerful enough to express it concisely?

Regarding your questions about my model: you can look into my blog, but I'll answer here:

What is a demand [on] an agent? A signal, pushed to an agent.

What is an agent? A hidden behavior model, with constrained side effects (constraints involve commutativity, idempotence, duration coupling, and inability to indefinitely store references)

With respect to what are the demands commutative and idempotent? (I suspect with respect to time) Space. I sometimes use the phrases spatially idempotent and spatially commutative. Identical demands have no additional effect. Source of a demand cannot be observed. We can model demands at any given instant with a set.

What do you mean by duration coupling? Signals have liveness durations, which often overlap. Duration of response signal (signal from an agent) is precisely equal to duration of demand signal (pushed to an agent).

What do you mean by continuous integral? We understand state as an accumulation function in continuous time involving present state and present observations - no dependence on history.

What do you mean by continuous? Time is continuous. In some domains, state might also be continuous (e.g. a Newtonian formula of time), or might be a piecewise discrete function of continuous time.

I wasn't able to learn much from your blog, unfortunately. There is a lot of text, but it was all too vague for me (much like the explanation above) to infer anything concrete -- but perhaps I'm just not getting it. I was able to find some type signatures, but no actual implementation, nor a description in English or math of what these functions were supposed to do. It would be nice if you provided that too, along with the type signatures, and an explanation of how to actually use the functions to do something compelling.

Necessary Complexity

Care to share this with the world? Claiming to have solved something is easy, solving something is hard.

Deep in an off-topic thread is not the place to share such things with the world. But feel free to look up some of my older posts involving keywords such as 'linkers as constraint solvers' and 'match maker'. I am developing a blog article on the subject, as well, but it's queued up behind several others.

In practice, GETs are not idempotent.

In most cases we can treat them so. The real problem is that GETs depend on cookies, IP address, and other context. Together, those properties severely hinder caching and other optimizations. Turns out, idempotence is not sufficient for the benefits we wanted.

Anyhow, a system should protect its semantics. Rather than limping along and seeming to work, people should quickly realize, "I shouldn't use GET this way because different browsers might cache it! Let's use POST instead" or whatever.

in Javascript with eval, the code that is being evalled can still keep secrets by using lexical scope to make sure that the data doesn't escape to the outside

JavaScript is not yet secure at the lexical scope (.callee, .caller, .arguments, global object, covert channels via dynamic attributes, etc.).

Saying, "but, you can build your base atop mine" is not a valid justification for the extra layer, especially if the converse is also true.

It is a very valid reason for not putting extra complexity in a standard.

I am guessing that you meant: "Avoiding unnecessary complexity in the standard is a valid reason for favoring one 'base' and rejecting the other."

I agree in principle. Like you, I'm opposed to 'kitchen sink' standards that have non-orthogonal primitives competing for purpose. But I think you and I would implement this principle quite differently, due to different understandings of how to judge 'necessary complexity'.

Yet those applications seem to do very well without special features.

It does not seem this way to me. Applications today fail regularly, lack orthogonal persistence, and are inconsistent and immobile between my desktops. Mashups exist (cf. Chrome extensions or Greasemonkey scripts), but are generally not secure (able to read way too much of each page) and not fine-grained (cannot share just a button from application A with app B). Synchronization artifacts between sound and video are common. Scatter-gather of live data is a struggle. It is difficult to make apps "fail safe", esp. in domains of command and control. Most computer games take years to build and then ship with bugs.

I don't believe apps are doing well, today. We just have exceedingly low expectations, which we sometimes fail to meet.

What does predicative timing mean?

Just that you can make firm assertions about time. It's a weaker condition than real-time, but stronger than 'finite time' or 'progress'.

The way you let other people run code on your hardware is generally done by giving them limited computation resources.

I agree that this is part of a good solution. But, taken alone, it has problems with regards to ensuring safe or simple failure modes, and for figuring out who supports what with fine-grained mashups.

Just as ease of reasoning is a non-goal for an assembly language for a processor, it is also a non-goal for an assembly language for the web.

I disagree, on both counts. Even for processors, there is much pursuit of proof-carrying code for reasons such as security or safety. For distributed code, the security and safety requirements are much more relevant. As we move to more fine-grained code distribution, efficient dynamic linking and late-binding of optimizations also become important. There are many reasons we'd want to reason about our assembly languages.

claiming that you solved issues without actually providing a single shred of evidence for it

Evidence for a solution should wait until you've accepted the existence of a problem.

Remember, this is not about inventing the perfect programming language for web applications. It's about a base on which such languages can be built.

I don't believe the distinction you attempt to make is a valid one.

First, no matter what, a base is a language - whether it be x86 machine code or JavaScript.

Second, the 'full abstraction' problem means we must have a very direct correspondence to a high-level language if we are to maintain the safety and security properties of that higher-level language with any sort of dynamic linking or mashups.

So the problem reduces to simply: developing a good base (or 'core') language.

I agree that most of a good language can be put into libraries and modules. In that case, the core is essentially an assembly.

Mathematics from his age: a couple of lines.

Ha ha. Yeah, right.

could the problem be that the idea you have in mind is vague, rather than that mathematics is not powerful enough to express it concisely?

Could be I just don't know the mathematical field that would let me express it concisely (in less than a page). But RDP is not a 'vague' concept. I can provide a GADT concisely enough (though that is really just an implementation of the properties I mentioned earlier).

some type signatures, but no actual implementation, nor a description in English or math of what these functions were supposed to do. It would be nice if you provided that too, along with the type signatures, and an explanation of how to actually use the functions to do something compelling.

Thanks for the feedback. My blog currently assumes familiarity with other models - e.g. FRP or a synchronous reactive programming model such as Lustre. I would like to alleviate that requirement. I'll try to provide some text explaining the type signatures in my next article.

I plan to eventually write articles on examples, as I implement them.

Deep in an off-topic thread

Deep in an off-topic thread is not the place to share such things with the world. But feel free to look up some of my older posts involving keywords such as 'linkers as constraint solvers' and 'match maker'. I am developing a blog article on the subject, as well, but it's queued up behind several others.

I hope your blog article will contain (1) a description of how it actually works and (2) a couple of concrete examples demonstrating the usefulness. In this thread you describe the *syntax* of the system, but neither 1 nor 2. Sean McDirmid there told you the same thing I have told you in this thread: a grammar or type signature doesn't mean anything by itself. Imagine that Church came out with his lambda calculus saying "I have discovered a wonderful model of computation! Here is its syntax: e = id | lam id. e | e e, id = [a-z]+. [end of description of lambda calculus]". At the very least he should provide a semantics for his syntax, and to support his claim he also needs to show how this semantics models computation.

In most cases we can treat them so. The real problem is that GETs depend on cookies, IP address, and other context. Together, those properties severely hinder caching and other optimizations. Turns out, idempotence is not sufficient for the benefits we wanted.

No, the real problem was that the preloader was doing GETs that were deleting data, i.e. "In practice, GETs are not idempotent."

JavaScript is not yet secure at the lexical scope (.callee, .caller, .arguments, global object, covert channels via dynamic attributes, etc.).

Yes, it takes some gymnastics to hide data in Javascript. I believe .caller is going to be removed from future versions of Javascript, so that will make it a lot easier. But of course eval is still far from secure, and cannot be used for safe mashups.

I am guessing that you meant: "Avoiding unnecessary complexity in the standard is a valid reason for favoring one 'base' and rejecting the other."

I agree in principle. Like you, I'm opposed to 'kitchen sink' standards that have non-orthogonal primitives competing for purpose. But I think you and I would implement this principle quite differently, due to different understandings of how to judge 'necessary complexity'.

Right, I prefer to err on the side of not standardizing something. It's impossible to remove something from a standard once you've standardized it. In particular if X can be built on top of Y, then I would only standardize Y and not X+Y, unless there is very clear evidence that standardizing X+Y is better than just standardizing Y.

It does not seem this way to me. Applications today fail regularly, lack orthogonal persistence, and are inconsistent and immobile between my desktops. Mashups exist (cf. Chrome extensions or Greasemonkey scripts), but are generally not secure (able to read way too much of each page) and not fine-grained (cannot share just a button from application A with app B). Synchronization artifacts between sound and video are common. Scatter-gather of live data is a struggle. It is difficult to make apps "fail safe", esp. in domains of command and control. Most computer games take years to build and then ship with bugs.

I don't believe apps are doing well, today. We just have exceedingly low expectations, which we sometimes fail to meet.

I agree completely that there is a lot to be improved. Where we disagree is that a standards body is the way to improvement. For example, orthogonal persistence is nice, but how do you plan to standardize it? There are many design decisions and trade-offs that need to be researched before standardizing it. And even then, the finished system could turn out to be implementable by standardizing only a small interface instead of the whole system. It's nice to standardize a great way to do X, but the problem is finding that great way to do X (where X is orthogonal persistence, real time guarantees for multimedia and games, etc.).

Just that you can make firm assertions about time. It's a weaker condition than real-time, but stronger than 'finite time' or 'progress'.

So it's anywhere between two extremes? That doesn't seem like a terribly useful notion, but perhaps you mean a specific middle between real time and finite time. If so, what middle ground do you mean, and how do you plan to achieve it?

I disagree, on both counts. Even for processors, there is much pursuit of proof-carrying code for reasons such as security or safety. For distributed code, the security and safety requirements are much more relevant. As we move to more fine-grained code distribution, efficient dynamic linking and late-binding of optimizations also become important. There are many reasons we'd want to reason about our assembly languages.

Proof-carrying code is a performance optimization (statically verifying a property instead of dynamically verifying it), and so far it has been unsuccessful. I agree that it might become successful in the future, and that it might also become useful for late binding of optimizations, even in an assembly language that is quite performant by being low level and statically typed, instead of high level and dynamically typed like Javascript (where JIT is already very useful). However, until we find a successful way to do it, it would be a mistake to standardize it. You can't just assume research problems solved.

Evidence for a solution should wait until you've accepted the existence of a problem.

I don't deny that there are problems to be solved, just that a solution should not be standardized before we find such a solution. Again, if you've really found solutions as you claim, please do present them. Until then, saying things like that just sounds incredibly pretentious.

Second, the 'full abstraction' problem means we must have a very direct correspondence to a high-level language if we are to maintain the safety and security properties of that higher-level language with any sort of dynamic linking or mashups.

That is not true, because if it were true we might as well give up immediately. The base we have to work with currently is x86 assembly, which doesn't provide any of these things. So by your claim, it is impossible to support those properties in a system built on top of x86. In a similar way that you can get memory safety in e.g. Javascript on top of x86, you can get additional security guarantees on top of an existing system.

Secondly, security is just one concern, and I already agreed that it is necessary to have building blocks for it in the base. It is not necessary that the assembly language support nice abstractions directly. The things that need it can build on top of this, just like you can build a Javascript runtime on top of x86.

I agree that most of a good language can be put into libraries and modules. In that case, the core is essentially an assembly.

I agree that most of a good language can be put into libraries, but being able to build a good language with libraries is not a requirement of an assembly language. For example you cannot build a good language in ARM assembly by writing ARM assembly libraries. Yet ARM is a good assembly language. That is because we have another option available besides libraries: writing a compiler that compiles down to the assembly language.

Mathematics from his age: a couple of lines.

Ha ha. Yeah, right.

Before you try to make a fool of somebody else you might want to check the facts, especially if you are unfamiliar with the field, to avoid making a fool out of yourself. Maxwell's equations in modern notation are 4 lines. In the notation from his age this becomes 8 lines: two of the equations are vector equations in 3d, and in his age there wasn't a notation to capture 3 such equations in one vector equation. You can check this for yourself by reading Maxwell's famous paper: http://upload.wikimedia.org/wikipedia/commons/1/19/A_Dynamical_Theory_of_the_Electromagnetic_Field.pdf

Of course he does a whole lot more in that paper; not only does he provide more equations than are nowadays called Maxwell's equations, he also uses them to solve many problems in electromagnetism and provides new insights. For example, he uses his equations to show that they exhibit electromagnetic waves, and he calculates the propagation speed of the waves from properties of electricity and magnetism measured previously in a lab by other researchers. The resulting speed comes out very close to the speed of light that other scientists had measured, so he concludes that light is an electromagnetic wave. This is just one of the new insights he provides and open problems he solves, and it is marvelous: the researchers who measured the electrical and magnetic properties did so to design and understand capacitors and magnets that you can hold in your hand, but it turned out that their measurements were enough to determine the speed of light, which other researchers were measuring in the most obvious way: by shining light and measuring how long it takes!

Translated to the current situation, that would mean that you stated the semantics of RDP formally, you explained the meaning and consequences, and used RDP to solve a couple of existing open problems in computer science, as well as providing completely new insights on problems that we didn't even know we had. I would definitely not complain if you presented that, even if it took you 54 pages like Maxwell! I'd also not complain if you just presented the formal model (equivalent to Maxwell just writing down 4 equations, or 8 in his notation). But if Maxwell had just said "I have a set of wonderful equations that describe electromagnetism, the 'type signature' of the equations is real number = real number", people would not have believed him, because for every genius that really did such a thing, there are hundreds that claim to have but in reality didn't.

Could be I just don't know the mathematical field that would let me express it concisely (in less than a page). But RDP is not a 'vague' concept. I can provide a GADT concisely enough (though that is really just an implementation of the properties I mentioned earlier).

That is not an implementation...that is a type signature.

Thanks for the feedback. My blog currently assumes familiarity with other models - e.g. FRP or a synchronous reactive programming model such as Lustre. I would like to alleviate that requirement. I'll try to provide some text explaining the type signatures in my next article.

I plan to eventually write articles on examples, as I implement them.

While I am not an expert on either, to say the least, I am familiar with FRP's and Lustre's models of computation. Indeed, I vaguely recognized some ideas from both, but the description was not enough to get a concrete picture. I eagerly await your implementation.

But perhaps you can even give a sketch here of some example programs that solve compelling scenarios.

Second Base

I hope your blog article will contain (1) a description of how it actually works and (2) a couple of concrete examples demonstrating the usefulness. In this thread you describe the *syntax* of the system, but neither 1 nor 2.

I said how it works - pattern matching, even the precise order - in the paragraphs below the grammar. I have a concrete example later in the thread.

No, the real problem was that the preloader was doing GETs that were deleting data, i.e. "In practice, GETs are not idempotent."

People can write invalid code in any system, including HTTP. But GETs, in valid practice, are safe and idempotent.

Right, I prefer to err on the side of not standardizing something.

But you do standardize something. You just prefer that your standard be powerful, easy to hide bugs in, and difficult to interop with.

It's impossible to remove something from a standard once you've standardized it. In particular if X can be built on top of Y, then I would only standardize Y and not X+Y, unless there is very clear evidence that standardizing X+Y is better than just standardizing Y.

This problem doesn't actually change for APIs. If we build API X atop Y, and X is widely used, it becomes a de-facto standard that is equally difficult to remove. The rule is that 'widely used languages are difficult to remove' (including APIs, protocols, and frameworks in 'language').

And that in turn is just an example of a more general rule that 'widely used platforms are difficult to change', where platforms further include gas, electricity, roads, programming languages, etc.

In any case, when X and Y are both suitable as 'bases', and neither is a superset of the other, you are really faced with X|Y. And that's the issue in our discussion.

So it's anywhere between two extremes?

No. As I said, predicative timing means that you can make firm assertions about time. That is, given a subprogram that exhibits predicative timing, you can give me upper and lower bounds for when it will complete (without running it).

You can logically derive that predicative timing is stronger (more constrained) than progress or finite time conditions, and weaker (less constrained) than real-time.
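To illustrate how such assertions compose (a made-up encoding, not my implementation):

    // Carry static time bounds alongside a computation, so bounds for a
    // pipeline are computed without ever running it.
    interface Timed<A, B> {
      run(a: A): B;
      minMs: number; // firm lower bound
      maxMs: number; // firm upper bound
    }

    function seq<A, B, C>(f: Timed<A, B>, g: Timed<B, C>): Timed<A, C> {
      return {
        run: (a) => g.run(f.run(a)),
        minMs: f.minMs + g.minMs, // sequential composition just adds bounds
        maxMs: f.maxMs + g.maxMs,
      };
    }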

if you've really found solutions as you claim, please do present them. Until then, saying things like that just sounds incredibly pretentious

I claim a very promising set of compositional building blocks, which might become a solution. I claim some properties (constraints, mostly) that a solution must possess - not being pervasively Turing-powerful is right in there with not being pervasively mutable. I might claim a solution if my model fulfills its promise, but I have been careful not to do so before then.

if [the full abstraction problem] were true we might as well give up immediately. The base we have to work with currently is x86 assembly, which doesn't provide any of these things. So by your claim, it is impossible to support those properties in a system built on top of x86. In a similar way that you can get memory safety in e.g. Javascript on top of x86, you can get additional security guarantees on top of an existing system.

We cannot guarantee safety - that developer abstractions are not violated - when linking x86 code. We cannot securely mashup x86 code. We work around the problem by separating the x86 code into constrained sandboxes (e.g. different address spaces). We use obtuse, inconsistent, often inefficient serialization and synchronization protocols between them. The difficulty and expense of interop encourage us to develop monolithic apps rather than use fine-grained services.

We could build another 'base' language to get safety and security above the current system. But doing so is redundant, and wasteful in its own ways. If we have JavaScript, we don't need x86. We can model x86 above JavaScript, and still benefit from JavaScript memory safety. Of course, JavaScript has its own problems - like performance and efficiency and real-time computation - so perhaps we should pursue something different than JavaScript as a base.

It is not necessary that the assembly language support nice abstractions directly. The things that need it can build on top of this, just like you can build a Javascript runtime on top of x86.

Even if we build a JavaScript runtime on top of x86, we still have a lot of interop, safety, and security problems with other things built on top of x86. Thus, you really should understand JavaScript as an entirely distinct base - one that competes with x86, rather than cooperates with it.

The very word 'assembly' connotes elements brought together in concerted purpose.

If we want a good web assembly language, it must effectively support interop at the 'base' level. Different languages and libraries implemented above this base (whether by compiler or interpreter or direct abstraction) should work together effectively, in concert.

.Net attempts to be such a language, with moderate success. It is certainly a superior option to x86 code, modulo concerns about patents and such.

I believe that orchestration languages provide the right direction in developing an 'assembly' that really lives up to the name. It seems to me that a lot of orchestration code is not algorithmic - dataflow, discovery, concurrency, synchronization, laziness, reactivity, resource management, etc. - and benefits hugely from late-binding of optimizations, specializations, distribution decisions based on context.

My own vision calls for the ability to effectively abstract away the base - implement a 'tower of babel' using whatever languages are convenient, then systematically smash it back down to the 'assembly' level via staged computation, entirely from within the language. One could then interop at the assembly level, or do so at a higher level by controlling the data reaching each stage of computation.

In my vision, there is little need for a distinction between abstraction and compilation in any case, because the 'assembly' itself has some built-in support for its own staging. Lisp is probably the earliest implementation of this vision, with its macros. But other concerns prevent me from accepting Lisp as a decent web assembly.

Before you try to make a fool of somebody else you might want to check the facts

I did. Based on what you say here, perhaps I could have been more generous and interpreted "accept the mathematics of his age" as "accept the 'notation' of his age". But my question was very precisely about defining the notation Maxwell uses. Even accepting the mathematics of his age, actually writing out the definitions for the notations is going to take a lot more than a couple of lines.

that would mean that you stated the semantics of RDP formally, you explained the meaning and consequences, and used RDP to solve a couple of existing open problems in computer science, as well as providing completely new insights on problems that we didn't even know we had. I would definitely not complain if you presented that, even if it took you 54 pages like Maxwell!

Sounds like a typical CS thesis. I would like to eventually write something like that, but I feel getting a very solid grip on RDP through implementations is a better start.

That is not an implementation...that is a type signature.

A type signature is a theorem. An implementation is a proof. RDP is not a trivial theorem such as 'Real -> Real' (which is proven by the 'id' function).
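By analogy, in code (with TypeScript notation standing in for the type theory):

    // 'Real -> Real' is a trivial theorem; almost anything inhabits it.
    const id = (x: number): number => x; // a one-line 'proof'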

I grant the GADT there is only part of RDP's theorem (it lacks the equational theories), and is not fully precise (because Haskell lacks dependent types and linear logic, which I could use to model latency and asynchronous arrows in the type system), but I'm just referencing it as evidence that 'vagueness' is not the problem.

RDP is an 'implementation' of the even higher level properties I mentioned earlier in the thread - spatial commutativity, spatial idempotence, duration coupling, and 'no looking back' (cannot use old references, cannot request old state, etc.). For example, one doesn't need 'choice' (:|:) or dynamic behavior in order to achieve those properties. What I'm aiming to achieve with RDP is a very generic model that still obeys those critical constraints, which is why I called the GADT an 'implementation' of these properties.

But perhaps you can even give a sketch here of some example programs that solve compelling scenarios.

I have difficulty thinking of compelling scenarios that wouldn't require much introduction. I'll develop some examples on my blog, but I think the only way RDP will gain acceptance is through developing something like a 'killer app' in it first, then explaining it in bits and pieces in terms of RDP.

So, for now, I plan to finish up the implementation.

Good luck with the

Good luck with the implementation, I'm looking forward to trying it out.

This discussion has so far not led to concrete attempts at a design, so let me make an attempt. Suppose I propose the following alternative to eval as found in Javascript to accomplish mashups. We have a pair of functions serialize and deserialize that can convert any object in the hypothetical language to and from binary data that can be sent over the network. They can serialize integers, strings, arrays, etc. Most importantly, they can serialize closures, and they also recursively serialize the captured environment of the closure. Now we have to make sure that by default the language doesn't have any features, like opening files or access to drawing on the screen, that make calling a function you got over the network unsafe. We have to pass those capabilities in from the outside.

The problem that remains is that calling the function might consume time and memory. I think we could solve that by including a thread system that allows you to create a new thread with a given maximum memory space, and the ability to run a thread for a given amount of time (I think Racket has this). There are some difficulties, like whom you charge the memory to if the inner thread calls a function passed in from the outside, what happens when threads end, etc. Alternatively, we could model memory as a capability as well, by passing in functions that allocate and free memory and that stop working when the parent that passed in these capabilities decides the child has had enough memory. Perhaps we could manage time explicitly as a capability in a similar way, but this seems more difficult, and the method of donating time slices to other threads seems sufficient.
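The memory-as-capability variant is easy to sketch (hypothetical, and it only meters cooperative allocation):

    interface MemCap {
      alloc(bytes: number): ArrayBuffer; // fails once the budget is spent
      free(buf: ArrayBuffer): void;
    }

    // The parent hands the child an allocator with a fixed budget; when the
    // parent decides the child has had enough, revocation is just a matter
    // of the closure's state.
    function makeMemCap(budget: number): MemCap {
      let used = 0;
      return {
        alloc(bytes) {
          if (used + bytes > budget) throw new Error("out of memory budget");
          used += bytes;
          return new ArrayBuffer(bytes);
        },
        free(buf) {
          used -= buf.byteLength;
        },
      };
    }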

You could also have a function that creates a closure (with an empty environment) from an AST. That would give you the ability to dynamically generate and execute code like eval, but without implicit access to the environment. You could even write serialize and deserialize in terms of this given a suitable interface to closures. Or you could decide not to allow direct serialization of closures, but instead have the programmer construct the AST of programs to be sent over the network explicitly.

This leaves the underlying language unspecified. Obviously it would have to be memory safe, and we'd want it to be efficient and be able to compile lots of languages down to it efficiently. For flexibility in passing data around between different applications/closures instantiated dynamically it would be easiest if the language were dynamically typed, or at least we'd want the runtime values to be tagged. It would be a research problem how to achieve this inter-operation in a statically typed yet efficient way (so no serialization/deserialization between boundaries). Or has this been done?

What do you think needs to be added to this model for it to be a local optimum in the design space, for reasonably large values of local? What should be changed, what should be subtracted?

Parameterized effects

For one, I'd get rid of the idea of closed-over capabilities and just send around code that's parametric in its effects. So, for example if you have a website that wants to play sound, the browser would have to handle that effect for you to actually hear anything. The mash-up of two such websites would itself just be a website that wants to play sound.
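
A sketch of what "parametric in its effects" could look like (Python standing in; 'play_sound' is the effect parameter the browser supplies):

    def site_a(play_sound):
        play_sound("a.wav")

    def site_b(play_sound):
        play_sound("b.wav")

    def mashup(play_sound):
        # the mash-up is itself just a site that wants to play sound
        site_a(play_sound)
        site_b(play_sound)

    mashup(lambda clip: print("playing", clip))   # the browser handles the effect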

Can you give an example of the type of dynamic instantiation that you want that you think precludes static typing?

Parametric Effects

With capabilities, I can model a confined app in a straightforward way by accepting a pure AST and then providing my own capabilities to it (sound, etc.). But it is unclear to me how to mash up such applications in a secure manner.

Using parametric effects, how would you express the secure mash-up of two such websites? How do they initiate a relationship? How would you model replies and callbacks, or multi-stage protocols? How would they protect secrets from one another, which they are willing to reveal to the browser?

I'm a bit curious how you imagine this all fitting together.

I doubt your users are willing to wait for static dependent typing between arbitrary mashups...

If you want encapsulation,

If you want encapsulation, you probably shouldn't be sending an AST. When you send code as data, it's open to inspection. So if you want secure composition (in the sense of respecting encapsulation), I'd think you'd need to have the browser in direct contact with the to-be-mashed-up sites. It (or some higher layer, anyway) can deserialize the data and turn it into code, and then supply the mash-up code with an abstraction that respects encapsulation.

BTW, I haven't seen anything in the problem description here that I'd expect to require dependent types. I think you're referring to our offline conversation, but if you go back and look, I was talking about using dependent types to manage sets of typed Processes.

When you send code as data,

When you send code as data, it's open to inspection.

If you send code to a remote host, it's open to inspection by that remote host. Modulo DRM technologies (which I refuse on principle, excepting for competitive games), the most you can do is obfuscate it.

The important bit for secure mashups is that ASTs from two different services are exposed only to a mutually trusted host (e.g. a browser), and not directly to one another. The communication model between mashed services should provide logical encapsulation at runtime.

I was already assuming this much to be the case in your scenario. But it still leaves the issue of how we model the fine-grained communication.

anything in the problem description here that I'd expect to require dependent types [...] I was talking about using dependent types to manage sets of typed Processes

You imply that mashed services either aren't a set, aren't typed, or aren't processes. Which is it?

Mashups typically require at least simple protocols (message/response, where response involves a continuation) and benefit from fine-grained staged protocols built from these (where messages and responses grant fine-grained access to objects to be integrated with user events and invoked later). For secure mashups, these protocols must be protected. From our offline conversation, my impression was that you use dependent types to protect protocols.

I was talking about using

I was talking about using dependent types for managing a variable sized set of Processes, whereas here it sounds like we're talking about mashing up some fixed number of services/websites. Typing the protocol between two communicating Processes doesn't need dependent types. In fact, the types that I assign to Processes I call Protocols, and they probably capture at least some of the things you're interested in typing. The dependent type comes in when you have a set of Processes and you want to talk about the state of the nth Process. So maybe the Browser implementation would use dependent types (it probably wouldn't have to - you can usually work around a use for dependent types with some extra run-time data/checks), but the mashing code wouldn't, as far as I can see.

Variable number

In the general case, mashups are variable - events can trigger loading new web services, discovery mechanisms, mashup makers, browser extensions, etc.

Anyhow, I think you'll face plenty of security issues. You'll just regress them to a system that has less context and awareness of how to make good security decisions. For example, if you have a website that wants to access the network, you'll face the case where the website tries to access a private address space (127.x, 192.168.x, or 10.x) on your network.

Building all the special rules to handle these cases, while allowing legitimate applications and mashups, is a very difficult problem. You've yet to answer any of my questions regarding secure communication within a mashup.

The advantage of capabilities in distributed systems is that they allow us to reason about precisely which network and browser resources each service can access, independently of distribution. This lets us focus on distribution for other reasons (efficiency, latency, scalability, disruption tolerance), while shifting most of the security burden to the service providing the code rather than the client receiving it (whose main security concerns are resource consumption and deciding which services to trust).

Plenty of security issues

For example, if you have a website that wants to access the network, you'll face the case where the website tries to access a private address space (127.x, 192.168.x, or 10.x) on your network.

That assumes a protocol where the sandboxed code asks to open a socket on IP:PORT and the browser decides whether it's legal. I'm in favor of the ocaps concept of an unforgeable reference being used to select remote destinations. I just don't like the spooky action at a distance of ocaps doing things directly.

In the distributed case, you could build a system of trusted resource distribution on top of what I'm proposing, but I wouldn't want all code to use it. For example, if I visit some shady website, I might not want a shard on my machine to be able to send packets to arbitrary other addresses, even though the shard would have that capability if run locally.

if I visit some shady

if I visit some shady website, I might not want a shard on my machine to be able to send packets to arbitrary other addresses, even though the shard would have that capability if run locally

Why not?

Any attempt at a denial of service attack would be easy to shut down and trace, at least for programs constrained by my RDP model. I grant, such attacks are feasible in various other models.

Besides, your ability to distinguish 'shady sites' and properly manage permissions is suspect.

I would prefer to not delude users about the authority of a system. Secure interaction design requires awareness and clarity. Passing data to a redirect shard from a shady site must look the same as directing it to the shady site.

Confined apps could be achieved by the KeyKOS factory pattern and its like, and could be made equally clear to users. But we're rapidly approaching a time when all useful apps are networked, so this would be the exception rather than the rule.

I just don't like the spooky action at a distance of ocaps doing things directly.

The 'directness' of capabilities is a huge advantage for efficiency, brokering and staging. We can logically have data flowing between processes, while physically moving data on a far more direct path. We can establish this dynamically, without requiring wholistic optimizations. For code distribution, we can also split processes across multiple machines, and leverage a standard routing model.

I remain skeptical that your approach to data plumbing between processes will achieve similar benefits and simplicity.

Wholistic remedies

I'm currently working on an implementation in my free time, but I think the costs will be comparable to using direct capabilities without whole program optimization.

Homomorphic cryptography

If you send code to a remote host, it's open to inspection ... the most you can do is obfuscate it.

Homomorphic cryptography seems - to my not even amateurish eyes - to be making rapid progress, see e.g. Can Homomorphic Encryption be Practical?.

Short summary:

We showed that an encryption for the sum of 100 128-bit numbers can be computed from the individual ciphertexts in 20 milliseconds on a laptop

Homomorphic cryptography

In university, I had two professors who were working on this subject. I thought about mentioning it above, but decided it is not yet ready to fulfill that role.

The difficulty is: each 'operation' you wish to support must be handled specially, and supporting the more useful operations (especially those such as comparison, branching, and indexing, which reveal something about how numbers relate) tends to greatly weaken security and increase risk, which in turn requires ramping up the key sizes and performance costs.

Well, that, and performance is still awful. You might be better off paying for physical security and running the computations locally, than paying orders of magnitude more for space and time in a cloud service.

Homomorphic encryption is valuable for many specific encryption tasks, for smart contracts and such. But I think it will never be ready for general purpose programming.

Of course it would work like

Of course it would work like that. If you try to serialize a capability it would probably throw an error. You could also have it work over the network. So if you serialize the sound capability you instead send a piece of code that communicates back over the network to play a sound at the original computer. But that raises all kinds of issues of network failure modes unexpectedly appearing.

Dynamic instantiation does not preclude static typing; it just needs a new type system, so it is a research problem (AFAIK). In general it is unwise to assume research problems solved. Say you want to provide access to a piece of data to another process running on the same computer. How would you statically type that? One way is to serialize and deserialize the data. Then the data will essentially be dynamically type checked on the other side, and if it is valid it is returned as a statically typed value. If you don't want this cost you need to somehow let the type systems on both sides cooperate, and you'd need a trusted third party that type checks both sides.

So if you serialize the

So if you serialize the sound capability you instead send a piece of code that communicates back over the network to play a sound at the original computer. But that raises all kinds of issues of network failure modes unexpectedly appearing.

My suggestion was to get rid of capabilities and use pure code that is parametric in effects instead. If you do that, the problem you mention here just isn't around. If you actually do want to export network access to your sound card, then I'm suggesting you'd have to set that up explicitly.

Regarding static typing, I was actually just trying to understand what you see as the researchy bits. I agree there are some, and I have some opinions but they're probably too half-baked to share now.

Semi-transparent Distribution

that raises all kinds of issues of network failure modes unexpectedly appearing

One of the mechanisms to usefully support distribution in the language is to favor semi-transparent distribution, as in E language.

That is, you logically distinguish remote refs from near refs. These become locations where network disruption is observable. I.e. the network failure modes are no longer 'unexpected'.

A type system or membrane could prevent you from sharing a near-ref across a far ref. My RDP model uses an implicit membrane - you cannot statefully keep references, so capabilities shared at runtime are typically volatile unless one is distributing 'persistent' code (in which case there might be some persistent capabilities to other distributed components of the application).

Yeah, I think it's widely

Yeah, I think it's widely accepted that a fully transparent model doesn't work well.

You'd have to find a way to integrate the semi-transparent model with the closure distribution though (if you decide to have the ability to send closures over the network, that is). It doesn't have to be a huge issue, but a programmer does have to be aware which things are going to be sent locally and which are going to be sent remotely. So instead of doing this:

    closure = () => sound.play(...)

You'd have to do:

    remoteSound = toFarRef(sound)
    closure = () => remoteSound->play(...)

And you'd basically have to write library functions twice:

    def playSong(sound):
       sound.play(...)
       sound.play(...)

And:

    def playRemoteSong(remoteSound):
       remoteSound->play(...)
       remoteSound->play(...)

It would be nice to have some kind of abstraction over that, either by using transparent distribution and having some kind of ability to customly deal with network errors from outside the library, or by having an idiomatic way to send the whole library function over to the other side and run it there. That doesn't work for things that need capabilities from both sides though.

Another problem is that you want to have some implicit way of using capabilities. For example in the first example what you really want is something like this:

    closure = (networkCapability) => networkCapability.sendRequest(remoteSound->play(...))

Here I assume that remoteSound->play() now returns some kind of object that represents a network request to the appropriate computer with the appropriate data. For example you could say that the operator is implemented like this:

   farref -> operation = NetworkRequest(destination = farref.origin, 
                                        content = serialize({"object": farref.object, "operation": operation}))

You have to do this because you can't assume that the closure has implicit access to the network when you send it over to the other side. Furthermore if serialize is a capability, then you'd have to pass that in as well. This gets ugly fast.

Writing it once

In E, you'd simply write the library (its API, anyway) for remote execution if you want it to work both locally and remotely. That's what the 'semi' means in semi-transparent distribution.

def playSong(sound):
  sound->play(...)
  sound->play(...)

This would work for both local and remote 'sound' services.

Distribution in RDP is considerably more transparent than in E. This is for several reasons: (a) it is normal to model disruption of behaviors even locally (e.g. for choice and dynamic behaviors), (b) references provided in RDP are logically revoked during disruption (you couldn't store them, so there is no risk of dead links), (c) RDP is a dataflow model, so there is no concurrency distinction for near vs. far behaviors.

Consequently, I can share a near ref across a far ref transparently, and the disruption semantics are well defined. A proxy cap (to return 'disrupted' status, possibly with a caching option) is necessary only when developers want to take special fallback actions during disruptions, as opposed to shutting down the entire behavior.

You have to do this [complicated messy stuff] because you can't assume that the closure has implicit access to the network when you send it over to the other side.

You're unnecessarily complicating matters. You can assume that a closure has network access (via far refs). The operation sound->play() immediately schedules a 'play' message to sound - placing it in a queue, perhaps for serialization on the network.

If someone wishes to deny you network access, they can expediently refuse to accept closures as arguments, and instead require you send an AST or script. This is what confinement patterns are for. Very few use cases need confinement patterns.

In E, you can write it once

In E, you can write it once because I suppose the networking capability is paired with the sound farref. That doesn't work with serializing closures, because the closure needs to get a different network capability when run on a different machine.

If you assume that closures get this network access implicitly, then how is that done? The system automatically gives network to the closure when the process calling the closure has network access? How does the system know that that process has network access, since it could have been passed in from anywhere. That process might even want to use its own custom network access object, just to see what calling the closure is sending over the network.

Also, closures are even useful when they don't have network access, if they only use local objects.

It seems wrong to lift network access to a special status and create a complicated scheme for providing it implicitly in some situations. It would be much nicer if we had a principled way to do that for all kinds of things we want to pass in implicitly, for example with some kind of dynamic scoping.

Consequently, I can share a near ref across a far ref transparently, and the disruption semantics are well defined. A proxy cap (to return 'disrupted' status, possibly with a caching option) is necessary only when developers want to take special fallback actions during disruptions, as opposed to shutting down the entire behavior.

Yeah, I was thinking something like that for an imperative language. You then distinguish between asynchronous and synchronous references, instead of near and far. A remote synchronous reference just waits for a reply and returns the result instead of returning a promise.

Parametric effects again

Sorry to bring this up again, but parametric effects neatly solve this problem you're raising of implicit effect passing. Another way to look at what I'm suggesting is that the meaning of capabilities should be dynamically scoped, so that you can have a capability "local sound" that just works when run in any local environment. My contention is that every capability should work this way (should be rebindable by the environment), but I suppose you could also try to make this a special mechanism layered upon regular capabilities.

Dynamically Scoped Effects

I would contend that the ability to rebind effects adds a lot of interface complexity and security risks for almost zero benefits. Capabilities capture and hide a lot of fine-grained complex effects. For your solution to work at all, you need to use very coarse-grained effects, such that they can be captured in broad sweeps and do not overly clutter the user interface.

If you have a subprogram that controls multiple sound systems, in general it would be hidden by a network cap anyway. There is no practical way to 'capture' such effects, since the browser cannot 'generally' fake a sensible response to high-level queries or network effects.

The only time that effects can be sensibly controlled is at construction, when developers have a clear understanding of fine-grained intents. Capabilities make this readily accessible, without providing the delusion of security and flexibility that dynamically scoped effects seem to offer.

The role of parametric effects in security

I agree with your characterization that capabilities are more fine grained than the interface to the effect parameters you'll want to have. But the two are dual to each other and I think they complement each other. Whereas you might have a token that denotes the capability "write access to file foo.txt on my local machine", in order to use that capability you would use a more generic effect such as "write to file" that takes the capability as a parameter.

Here are some benefits that this brings:

- Pure semantics. I find this approach more elegant than spooky action at a distance.

- Efficient revocation. The pattern for revocation in ocaps introduces another layer of indirection that, I think, would be quite expensive if used pervasively.

- Coarse types that tell you in broad strokes what capabilities might be used by a Process.

- Deterministic replay. One generic thing you can do with effects is capture interactions with a Process to be replayed later.

- Mocking. You can always stub in handlers for effects for testing purposes.

- Context dependence. This is the big one, so I saved it for last (not that this list is comprehensive... these are just off the top of my head). The pure capability approach is akin to making the environment a singleton. One problem with that is that it means you can't simply instantiate a new instance of a clique of interacting objects if they're interacting with capabilities. For a concrete example, imagine a little ant simulation where ants reference each other with capabilities. How do you make a copy of the simulation? Does it involve manually coding up a 'rebind' operation to hand out the new copies of capabilities to all the ants? Or does it involve threading a "context" explicitly everywhere? Neither of those is a good solution. Or for another example from this thread, consider moving a Process to another machine and hooking up local networking and sound without a bunch of manual plumbing.

Can you elaborate on the security threats you see? Giving the environment instantiating a Process control over what capabilities it can use seems like a security win to me.

Correction

I mostly stand by the points above, but I think the relationship I've described above between effects and capabilities is the wrong one - capabilities should be viewed as constructors for effects, not as parameters to effects. If I have a capability to write to a certain file, that should just mean that I have the ability to construct such an IO effect. That makes the concepts of effects and capabilities quite orthogonal and brings the benefits listed above in a nice way... I think. I'll think more on this tonight and over the weekend.

Capabilities as Effect Constructors

I agree that 'capabilities as effect constructors' - e.g. a capability building a monad or arrow - is a valid way to model them. That can be used for some of the structural enforcement I mention below. But domain-specific security and effects is still a terrible idea, and the points I raise below still hold. Capabilities must be opaque, up to structural type.

No Role for Effects-as-Signals in Security

With capabilities, we do have parameterized effects: if I want to give you access to an effect, I provide it as a parameter. We can easily reason about and control which effects a subprogram can access, assuming you control its origin. To use a service, you typically provide a few of your own caps (callbacks, DOM, etc.), and don't worry about what other caps the service already possesses.

you might have a token that denotes the capability "write access to file foo.txt on my local machine", in order to use that capability you would use a more generic effect such as "write to file" that takes the capability as a parameter.

A typical capability will compose effects of different domains - perhaps "writes to logfile, alerts subscribed observers, and also writes to foo.txt". This sort of arbitrary composition is common for effects. The tight coupling of effects from different domains is often critical for security and safety. For example, to support transactions, we would need to ensure effects happen in a particular order.

In such cases, it is unclear which 'generic effect' a capability should be associated with.

Efficient revocation. The pattern for revocation in ocaps introduces another layer of indirection

Your model does not avoid 'another layer of indirection' for revocation. You'll layer in a stateful process that chooses whether to propagate effects based on state. This is similar to how it is achieved with imperative ocaps.
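
For reference, the ocap indirection in question is the classic caretaker pattern - roughly this sketch, with message sends modeled as plain function calls:

    def make_caretaker(target):
        # forwards messages to the target until revoked; one extra indirection
        state = {"target": target}
        def forwarder(*args):
            if state["target"] is None:
                raise PermissionError("capability revoked")
            return state["target"](*args)
        def revoker():
            state["target"] = None
        return forwarder, revoker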

With RDP, revocation is very efficient and easy: you simply stop granting the capability.

Coarse types that tell you in broad strokes what capabilities might be used by a Process.

Except they don't. Your 'write to file on a remote machine' cap could have any effects at all, depending on who can observe that file or intercept the effects. All you have offered is the illusion of knowledge about effects. Such illusions are extremely harmful for security.

If we're going to tunnel signals between processes, we are much better off if we simply admit that the meaning of these signals is opaque in the semantics, so we can focus on safety, optimizations, and so on.

Besides, if we're 'instantiating' a process ourselves, then we really can get a broad idea of what effects it needs, based on which capabilities it takes as input.

Deterministic replay. Mocking.

Capabilities offer similar advantages, since every application starts confined and only has access to the caps you grant.

Context dependence. The pure capability approach is akin to making the environment a singleton

We can model mobile agents migrating between environments in terms of revoking one environment and providing another, yet still allow agents to hold their own capabilities. An 'environment' is essentially a record, powerbox, or registry of capabilities.

Capabilities are flexible, though. Agents can capture multiple environments and act as bridges between them.

you can't simply instantiate a new instance of a clique of interacting objects if they're interacting with capabilities

Do not fool yourself with illusions of mobility. Mobility and cloning must be carefully handled if you want them as generic features. The state of an environment and a process will co-evolve. You cannot simply copy or move a process into a new environment - attempting to do so would break the established relationships with the environment, such as waiting callbacks and buffers.

A language suitable for moving processes between environments would need to make guarantees about stable states for both, with some sort of well defined 'step' basis. Buffers must be emptied. Handshakes must be completed. Otherwise, developers must rely upon self-discipline - just as they would with capabilities.

For a concrete example, imagine a little ant simulation where ants reference each other with capabilities.

If you're using processes to model individual ants, you have already shot yourself in the foot and created a lame simulation. In domain simulations, the ability to tweak broad 'rules' (whether they be social or physical) is important, in which case the individual ants should be predicates in a database rather than individual encapsulated processes or objects in their own right!

But to follow this example as it was intended, I would model ants as only having access to their environment - i.e. ants are not directly connected to other ants, unless I'm modeling some sort of hive-mind in a fantasy setting.

Can you elaborate on the security threats you see? Giving the environment instantiating a Process control over what capabilities it can use seems like a security win to me.

Except that it's just the illusion of a security win - which, by nature, is a security loss.

As I note above, you're ultimately tunneling signals, but there is no clear indicator of how those signals are understood. Tight coupling of effects from different domains is essential anyway, for both security and safety. Signals must be understood as opaque effects, because - to any local approximation - that's exactly what they are.

Classifying effects by domain (network, sound, filesystem) - and consequently seeking domain-specific security - is ultimately a ridiculous, horrible idea that should be carefully placed in a museum of bad ideas (along with Null) so that other people can learn from it. It seems to be the sort of idea that people with no understanding of security fixate on, over and over and over again.

If we are going to classify effects, we should instead favor qualifiers that can be leveraged or enforced structurally: idempotent, commutative, associative, discrete, continuous, reliable, real-time, limits of disruption, linearity, regions, enforcing protocols and zero-knowledge transfers, etc. There are many useful features we can achieve with structural typing, but none of it is domain specific.

You and I have had roughly this conversation several times now, but you have rarely focused on the security aspect. Hopefully, this will be the time you get it.

In this case you do *not*

In this case you do *not* want the sound to play locally, you only want the local network capability. So you'd need some way to distinguish between what to take from the dynamic environment and what to take from the lexical scope.

If you model everything as a parametric effect, then how do you handle capabilities in a fine grained way? Sometimes you want to pass them around in different ways than via the dynamic environment.

So you'd need some way to

So you'd need some way to distinguish between what to take from the dynamic environment and what to take from the lexical scope.

I'm saying everything should be hooked up dynamically. There are a couple of ways you could encode the ability to play sounds remotely. It could be done as part of an effect that lets you generally use remote capabilities, so that you hold in your possession a reference that refers to "remote sound capability on machine XYZ". You could build such a system and it probably makes sense in distributed domains. I think it's still a good idea to be able to treat the entire network of machines as a parameter.

The second thing you could do is just use a local network capability to communicate with the remote process and have a more ad hoc protocol for telling the other machine to play sound (or more generally to run a Process in its environment).

Either way would probably work fine. See my response to David regarding the 'too coarse grained' comment. We could still use something like 'capabilities' to represent fine grained authorities - they would just be tokens that are assigned meaning by the environment. This separates authority from access.
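
A sketch of that token idea (Python standing in; the unguessable token is the authority, and only the environment's table turns it into access):

    import secrets

    SOUND_TOKEN = secrets.token_hex(16)   # opaque and unguessable

    environment = {SOUND_TOKEN: lambda clip: print("playing", clip)}

    def invoke(token, *args):
        # only the environment binds token -> effect
        return environment[token](*args)

    invoke(SOUND_TOKEN, "clip.wav")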

Dynamic Environment

You could model the dynamic environment, if you need one, e.g. as an extra parameter to the closure. But, as I noted earlier, we do *not* need a network capability to work with remote objects any more than we need a local memory capability to interact with local objects.

If we want to model explicit use of network capabilities just for remote object communications and such, that is possible, but does create a lot of optimization troubles (the runtime cannot batch communications as effectively, since everyone is using different network caps) and distribution troubles (network caps don't transfer easily).

You are still misconstruing the problem. We don't need, and we don't want, the local network cap. We just want the far ref, and a virtual machine with semi-transparent distribution.

Far refs in E

Generic "networking" does involve special capabilities of its own.

But far references are an E language primitive. They are not associated with a specific implementation. If I hand you my sound farref, any methods you call will be sent to me over the network (in the order you send them). If I were to call the same farref locally, it would cause asynchronous calls (via a queue), but would not actually involve the network.

Thus, farrefs do work with migrating closures.

The security risks associated with 'generic' network access do not exist when constrained to farrefs. I.e. without your network capability, the most the closure can do is send messages back to me so I can use my network capability. Thus, even though network caps include special privileges based on location (127, 192.168, and 10.0.0 addresses), I cannot gain any special privileges just by sending a farref to you.

see what calling the closure is sending over the network

This idea is one that people who understand nothing about security seem to fixate on. It has zero value from a security perspective - in general, data on the network will be opaque and inscrutable without context.

For security, it is important that you control arguments to the closure. You should treat it as part of a remote service. If you need confinement, you'll not use a closure - i.e. send a script or AST instead, that has no capabilities at all except those you pass into it.

If the goal is debugging, that should be done remotely anyway - i.e. the sender can augment the closure with its own debugging facilities - but as the host you are privileged to peek inside the closure if you want. If the concern is bandwidth abuse, the proper answer is serializing a sort of e-purse capability along with the closure, and using e-scrip (perhaps specific to you) to buy your resources.

closures are even useful when they don't have network access, if they only use local objects

Closures can be useful even if they just 'mostly' use local objects, i.e. objects from caps passed to the remote service (or to the closure, but that's the same thing). Indeed, one of the big reasons for code distribution is so you can more efficiently manipulate capabilities you already possess on the target machine.

But confinement patterns are available if you want them - e.g. I could model an 'interpreter' object to which you can send 'scripts'.

It seems wrong to lift network access to a special status and create a complicated scheme for providing it implicitly in some situations. It would be much nicer if we had a principled way to do that for all kinds of things we want to pass in implicitly, for example with some kind of dynamic scoping.

The 'principle' is that you can send messages to objects if and only if you have a capability to them. That is, a capability is a necessary and sufficient condition for communication. If an extra 'network' capability was required, that would be a violation of the object capability model.

Besides, using a farref is not the same as generic 'network access' any more than a reference to a local object is the same as generic 'memory access'.

You then distinguish between asynchronous and synchronous references, instead of near and far. A remote synchronous reference just waits for a reply and returns the result instead of returning a promise.

I think this would work poorly in the imperative context.

In E, developers get a lot of security and simplicity benefits from having multiple operations in a common vat (a vat is one thread), but if you wait on remote calls you'll need a lot more fine-grained threads and you start facing a lot of issues regarding race conditions and covert channels.

This idea is one that people

This idea is one that people who understand nothing about security seem to fixate on.

Thanks in advance for keeping ad-hominems out of this discussion.

It has zero value from a security perspective - in general, data on the network will be opaque and inscrutable without context.

For security, it is important that you control arguments to the closure. You should treat it as part of a remote service. If you need confinement, you'll not use a closure - i.e. send a script or AST instead, that has no capabilities at all except those you pass into it.

If the goal is debugging, that should be done remotely anyway - i.e. the sender can augment the closure with its own debugging facilities - but as the host you are privileged to peek inside the closure if you want. If the concern is bandwidth abuse, the proper answer is serializing a sort of e-purse capability along with the closure, and using e-scrip (perhaps specific to you) to buy your resources.

Are you proposing to change the representation of a closure from (AST,environment) to (AST,environment,e-purse), and then have the deserialization hook up the e-purse correctly somehow? Or do you want to involve e-purses in some other way? Please give a concrete code example; otherwise it's not possible to understand what exactly you propose to do.

There are plenty of reasons why you want to keep tabs on what a closure is sending over the network. For example in mobile applications there is often a cost per kilobyte sent over the network, so you might want to set a maximum network traffic cap that the code inside the closure can use. Or you might want to set a rate limit. Or you might want to temporarily stop it from having access to the network for whatever reason, or handle any other scenario that passing in the network capability solves but e-purses don't. If you have to pass in the network capability explicitly, then you can easily build a network capability that deducts e-credits from an e-purse for network use. So the principle we have talked about multiple times applies again: don't unnecessarily bake high-level stuff into low-level stuff when the high-level stuff can also be built on top of the low-level stuff if designed correctly.

By providing network access to the closure implicitly you are intertwining two orthogonal problems: network access and convenient code distribution.

You can see the problem when implementing deserialize. Suppose the representation of a closure is an AST paired with a lexical environment:

def deserialize_closure(stream):
  ast = deserialize_ast(stream)   # read the code
  env = deserialize_env(stream)   # read the captured lexical environment
  return compile(ast, env)

Because the environment might contain farrefs, we need to be able to deserialize them:

def deserialize_farref(stream):
  ip = ... read the ip of the farref's origin host ...
  id = ... read the identifier of the object this farref points to ...
  ??

Now at ?? we need to construct a farref. If the network capability of the farref is inside the farref itself (which seems like a good design to me), we need to have the networking capability in deserialize_farref. That means we have to give deserialize special status to implicitly have access to the network. It cannot be implemented in the user program. This does not seem like a good solution to me. It breaks the idea that distributable closures are a convenience built on compile(), i.e. closures solve the "convenient code distribution" problem.

So there are two things we can do: either we require the network capability to be explicitly passed in to the closure, or we add an extra parameter to deserialize for the network capability. The latter is more attractive to me, because it doesn't uglify the code nearly as much. Perhaps it would be useful to make this mechanism extensible, so that we can rewire arbitrary capabilities (not just network) when deserializing.
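
A sketch of the second option (Python standing in; the toy farref encoding and the network_cap signature are made up for illustration):

    import json

    def deserialize_farref(blob, network_cap):
        data = json.loads(blob)   # toy encoding: {"ip": ..., "id": ...}
        def farref(operation, payload):
            # every call routes through the capability the caller supplied
            message = json.dumps({"object": data["id"],
                                  "operation": operation,
                                  "payload": payload})
            return network_cap(data["ip"], message)
        return farref

    def net(ip, msg):   # the application's own (possibly rate-limited) network capability
        print("send to", ip, ":", msg)

    sound = deserialize_farref('{"ip": "10.0.0.7", "id": 42}', net)
    sound("play", "clip.wav")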

The 'principle' is that you can send messages to objects if and only if you have a capability to them. That is, a capability is a necessary and sufficient condition for communication. If an extra 'network' capability was required, that would be a violation of the object capability model.

As I said above, there are plenty of reasons why you'd want to give deserialized farrefs a different network capability than the unlimited ability to communicate back to the original object whenever and as much as it wishes.

In E, developers get a lot of security and simplicity benefits from having multiple operations in a common vat (a vat is one thread), but if you wait on remote calls you'll need a lot more fine-grained threads and you start facing a lot of issues regarding race conditions and covert channels.

If you don't want to wait, you don't use synchronous calls...

Are you proposing to change

Are you proposing to change the representation of a closure from (AST,environment) to (AST,environment,e-purse), and then have the deserialization hook up the e-purse correctly somehow? Or do you want to involve e-purses in some other way?

The closure itself would not have an e-purse. But the remote host would have a function like:

IllRunYourCodeIfYouShowMeTheMoney(closure,epurse)

This could be sitting alongside other functions like "IllRunYourCodeIfYouShowMeTheProof".

In general, payments would be counterbalanced with a purse going the other direction: IllProvideYouServiceIfYouShowMeTheMoney. The result is that most apps simply pay for themselves to run. While people can make a profit from this, the greater benefit is that people are careful about performance costs... and useless code (like viruses) cannot afford to exist.

If you're interested in details, I suggest you look up: Nick Szabo, Agoric Computing, Smart Contract, Micropayments. There is a lot of material out there that I do not want to discuss in detail here.

I agree that market integration should not be "primitive", but I certainly think it will become 'standardized' one way or another. And it does need to be accommodated in the language. Given your repeated complaints about putting features at the right level, I think you've been misunderstanding what 'accommodate' actually means, so I'll provide an example. With RDP, I accommodate market integration three ways: (1) predictable performance (time, space, etc.), so systems know where to associate costs in a mashup (and can make reasonable estimates); (2) clean disruption semantics - a system can safely shut down a behavior in a small, bounded amount of time; (3) a secure, composable communication model with ocaps, atop which we can build markets.

There are plenty of reasons why you want to keep tabs on what a closure is sending over the network.

Not quite. You offer reasons to keep tabs on when and how much a closure is sending over the network. The what is not a relevant part of that formula!

The network is not special in this regard. You also want to keep tabs on other resources: CPU time (when, how much), memory (when, how much), persistent storage (when, how much), power consumption (when, how much)... I hope you see the pattern!

Whatever solution is developed for keeping tabs on network costs should apply generally to other resources. Your fixation on the network and far refs in particular is simply not justifiable.

If you don't want to wait, you don't use synchronous calls...

Apologies for not adequately communicating my concern.

E has advantages by rejecting the possibility of synchronous calls waiting on the network:

  • Supports coarse-grained 'turns' in the vat. Developers aren't forced to use asynchronous calls due to fear that some synchronous call four levels deep happens to wait on the network.
  • Supports coarse-grained vats. Developers can have many concurrent activities within one vat, without worries that one of those activities will freeze the whole vat while waiting on the network.
  • Supports reasoning about determinism. Every synchronous turn in E is deterministic - the output is strictly a function of the state and the action. Determinism helps reasoning about security, and is also valuable for simplicity reasons.
  • Supports reasoning about disruption and failure modes. Logically, disruption only occurs between turns, and only to asynchronous promises and far refs. Developers can effectively handle these issues and recover.
  • Race conditions in E are a function of vat granularity. If we have a lot of small vats, communication between them is subject to a lot more fine-grained race conditions. This makes the system difficult to reason about. Race conditions are also a vector for covert channels.
  • Performance is also a function of granularity. Within a vat, we can pass mutable objects around and use shared-memory between tasks. Within a turn, memory locality of references is very high, which avoids cache thrashing.

This is what I meant by saying "a lot of simplicity and security benefits". I am wondering whether, and why, you would give up such huge advantages in pursuit of avoiding the distinction between 'near' and 'far' calls.

With RDP, I can reduce the near/far distinction without sacrificing the above list of advantages. But this is feasible only due to radical differences in the programming model, e.g. using data-flow rather than control-flow at the lowest level.

What does

What does IllRunYourCodeIfYouShowMeTheMoney do exactly? Whose epurse is that, and where does it come from? Where does the money come from and where does it go to? How does the IllRun... know how much to charge the epurse? Surely it needs to keep tabs on the network usage somehow. Perhaps a network capability would be a good way to implement that?

How do you use it to implement a policy like "run this code but don't let it get more than 1mb/hour network traffic".

Whatever solution is developed for keeping tabs on network costs should apply generally to other resources. Your fixation on the network and far refs in particular is simply not justifiable.

Who pays for what is not the problem; the problem is controlling network usage. Once you can do that, you can build a system that decides who to charge on top. The problem in most cases is not who to charge, but how much to grant. An app wants to run some code that it got in a closure, but it doesn't want that code to use up as much network as the closure wants. How do you solve that?

The network isn't special, you just made it special by granting special network access to deserialized farrefs. Which I'm saying you shouldn't do, you should treat network access like you treat any capability (i.e. pass it in to the deserialize function or pass it in to the closure).

As for the synchronous/asynchronous calls. There are really two issues at play here that E intertwines: asynchrony and failure.

                Synchronous                                    Asynchronous
Local           Synchronous local call (call on near ref)      Asynchronous local call
Remote          Blocking network call                          Asynchronous network call (call on far ref)

As you can see, E has just 2 of the 4 possibilities. The language Io has both types of local calls. In Java-like syntax, a.foo() calls foo synchronously, while a.@foo() starts a new (lightweight) thread and runs foo() there; in the current thread this immediately returns a future. If foo was coded to do a synchronous network call, then putting an @ in front makes it an asynchronous network call.

Like you say there are reasons to just support the 2 corners of the table that E does, but there are also reasons to make the distinction between local/remote and synchronous/asynchronous separately. Asynchronous but local calls are a natural and useful thing to have, and once you have them remote asynchronous calls just fall out of synchronous calls + an @ sign to make them asynchronous. You want to make some calls asynchronously because they take a long time and you don't want to wait. The reason why they take a long time is not important. It could be a long running computation or a network round trip or disk IO or user input. If you don't want to wait you make the call asynchronously irrespective of the reason why it takes a long time.
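
The '@' idea translates directly into a thread-pool sketch (plain Python; 'at' is a made-up name for the asynchronous call operator):

    from concurrent.futures import ThreadPoolExecutor

    _pool = ThreadPoolExecutor()

    def at(fn, *args):
        # a.@foo() analogue: run fn on a lightweight thread, return a future
        return _pool.submit(fn, *args)

    future = at(sum, [1, 2, 3])   # returns immediately
    print(future.result())        # block only when the value is needed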

The reason why we want to make a distinction between local and remote calls is that they have different failure modes. But asynchronous local calls don't have the same failure modes as asynchronous remote calls, so it's good to have them as a separate concept. Asynchronous local calls are also useful for other reasons you mention, for example access to shared mutable state and better cache locality for performance.

On data flow vs control flow. They are the same thing. Control flow is just a short notation for certain kinds of data flow. The do-notation of Haskell captures the essence of this nicely: when you write programs in do-notation you are using control flow, but under the covers this expands to data flow. For example instead of using direct state you could pass the state around in your program. The do-notation just lets you capture this pattern of data flow in a more concise notation.
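
A small sketch of that expansion (Python standing in for do-notation; each imperative-looking "statement" really just threads the state along as data):

    def fresh(counter):
        # imperative reading: "x = counter; counter += 1; return x"
        return counter, counter + 1

    def two_fresh(counter):
        a, counter = fresh(counter)   # control flow as explicit data flow
        b, counter = fresh(counter)
        return (a, b), counter

    print(two_fresh(0))   # ((0, 1), 2)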

What does

What does IllRunYourCodeIfYouShowMeTheMoney do exactly? Whose epurse is that, and where does it come from? Where does the money come from and where does it go to?

I give you code to run (the closure), and I give you incentive to run it (the purse), and then you run the code and take money from the purse.

Typically, the purse would be filled a little at a time (e.g. a few cents per minute of intense computation). Thus, you would appropriately control my access to resources, and I would control my losses.

Naturally, to validate that my money is being spent wisely, I can spend a little for my closure to 'call home with progress' information. I can do so until your reputation is sufficient that I don't feel as much need to do so. This is especially easy for problems where it is easier to check an answer than it is to create one. It also helps if the programming model is predictable, or deterministic so it can be run in multiple locations to audit your behavior.

The purse can come from anywhere, but naturally it must provide a brand of e-scrip you accept (whether that be JacobsScrip or AmazonScrip or something else). Whether e-scrip is tradeable for cash on an open market would depend on its reputation and stability. (SecondLife cash, for example, is directly tradeable for cash.) Typically, one trades in e-scrip because it avoids a lot of the legal accounting hassles for a large number of small micropayments.

JacobsScrip might be something you mint yourself. E.g. when visiting a website, you might hand that website a deep purse of JacobsScrip so it can put code on your machine. When you no longer care about the website, you cut that purse, and the code on your machine quickly halts. This is, in essence, just a form of job control - market-integrated.

Anyhow, I gave you some keywords. This is a topic that interests me, but not one I know very deeply. Practice your Google-fu.

How does the IllRun... know how much to charge the epurse? Surely it needs to keep tabs on the network usage somehow. Perhaps a network capability would be a good way to implement that?

The function would be at the implementation layer itself - it's exposed by the virtual machine's administrator. Sure, it could use scheduler, network, memory, compiler, linker, etc. caps under the hood, but that has nothing to do with the semantics of the closure. I.e. I am not coupling to a particular mechanism here.

How do you use it to implement a policy like "run this code but don't let it get more than 1mb/hour network traffic".

You put prices on a curve that climbs however steeply you want, and let the client and your local optimizers decide how to most efficiently spend the money within the price policy. The other guy limits his losses by controlling the e-purse and deciding whether or not to distribute.

Who pays for what is not the problem; the problem is controlling network usage.

Controlling who pays for what is a solution to controlling network usage. It just attacks the problem from a different position - one that applies generally to all resources.

An app wants to run some code that it got in a closure, but it doesn't want that code to use up as much network as the closure wants. How do you solve that?

If we make code distribution explicit at the app level, the app would have access to an 'IllRunYourCode' cap of its host (easy enough - it's effectively a public cap) and access to an appropriate purse (maybe its own). Given a purse, one can transparently create an attenuated purse that can deny access when one tries to reach too deeply too often. This would control the target code's resource budget, including network access.
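
A sketch of such an attenuated purse (hypothetical API: 'withdraw' is any function taking an amount, provided by the underlying purse):

    import time

    def attenuate(withdraw, max_per_minute):
        window = []   # recent (timestamp, amount) pairs
        def attenuated(amount):
            now = time.time()
            # deny reaching too deeply too often: a sliding one-minute window
            window[:] = [(t, a) for t, a in window if now - t < 60.0]
            if sum(a for _, a in window) + amount > max_per_minute:
                raise PermissionError("attenuated purse: limit exceeded")
            withdraw(amount)
            window.append((now, amount))
        return attenuated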

However, I prefer other distribution idioms, where user code rarely touches code distribution and leaves most of that work to cooperative optimizers (whose job it is to actually save money, and maybe skim some of the savings).

The network isn't special, you just made it special by granting special network access to deserialized farrefs.

That statement is not logically consistent. I cannot grant "special network access" because there is nothing special about network access! The network isn't special. It is another resource to be concerned about, but CPU, space, energy, etc. are equally concerning.

You make the network special because you FAILED to introduce capabilities for every other resource in the closure. For example, your deserialize could have required a 'memory' capability for every variable, and a 'scheduler' capability for every operation (add, subtract, branch, etc.). Clearly, you singled the network out.

I'll write a second response for the other part.

Anyhow, I gave you some

Anyhow, I gave you some keywords. This is a topic that interests me, but not one I know very deeply. Practice your Google-fu.

I'm not asking about how epurses and similar techniques work; I'm interested in how you envision them being used to solve the network access problem for closures. I can't find your thoughts on Google.

The function would be at the implementation layer itself - it's exposed by the virtual machine's administrator. Sure, it could use scheduler, network, memory, compiler, linker, etc. caps under the hood, but that has nothing to do with the semantics of the closure. I.e. I am not coupling to a particular mechanism here.

So at some level of the system you need to control the closure's network access. Wouldn't it be better to expose the ability to control network access of the closure, instead of just exposing the ability to charge for it? Then you could build the ability to charge for it on top of that. Furthermore, if we can choose a solution where fewer things are special primitives that are not implementable in user land, isn't that better? We can do that by requiring that you pass in a network capability into deserialize. Then deserialization isn't special: you can implement it at the user level instead of implementing it at the system layer.

Another reason why it is the right thing to pass a network capability into the closure is that you can use deserialize to send network requests to arbitrary locations. The serialized representation of a farref will include the ip and object id of the machine where the object that the farref points to lives. You can forge farrefs just like deserialize("farref(ip=234.243.23.12, object_id=34234)"). Then you can use the returned farref to send network traffic to that ip. So either deserialize needs to be a special system primitive that somehow gets a network capability from the system, or you need to give it a network capability in a parameter. The latter is a better idea. Suppose the user decides to limit the network traffic of an application that is also deserializing farrefs. You want to count the network traffic that the farrefs use towards the total network traffic of the application. So you need to pass in the application's network capability and not some system wide master network capability.

That statement is not logically consistent. I cannot grant "special network access" because there is nothing special about network access! The network isn't special. It is another resource to be concerned about, but CPU, space, energy, etc. are equally concerning.

You make the network special because you FAILED to introduce capabilities for every other resource in the closure. For example, your deserialize could have required a 'memory' capability for every variable, and a 'scheduler' capability for every operation (add, subtract, branch, etc.). Clearly, you singled the network out.

Wait a minute. I was suggesting to pass in network access explicitly like any other capability (either into deserialize or into the closure -- and I think that the former is a better idea), and you said that that was a bad idea. You are suggesting to treat network specially by creating a special system primitive to run code so that you can charge an e-purse for network access. If we decide to control memory with a capability then I'd pass that in to deserialize just like the network capability.

Wouldn't it be better

Wouldn't it be better to expose the ability to control network access of the closure, instead of just exposing the ability to charge for it?

It would not be better!

Resources - and their relative costs - must be considered both locally and holistically.

An ideal optimizer should be able to move freely between resources (space, time, network) based on the relative local cost of each. For example, we can often trade space for time by recomputing values at need, and we can trade space for network by choosing an appropriate level of caching.

If you insist on separating out and specially handling the 'network' resource, your structure becomes rigid - the local optimizer is no longer able to trade network for other resources.

Resources are most efficiently managed in aggregate.

A lot of costs can be attenuated or shared between subprograms: garbage collection, network batching, inlining and specialization optimizations, shared caches, shared subscriptions, content-distribution networks, rework after node or transaction or TCP failure, et cetera. To make these work, we need something like 'taxes' and 'government' even down at the runtime level.

Example: if ProcessA downloads a resource and ProcessB later tries to download the same resource but discovers it was cached, it would be wrong to charge ProcessA for the whole download. Instead, you credit ProcessA part of the cost and charge ProcessB the other part.

If we embed a bunch of resource management capabilities directly into each closure, they inherently become very rigid for both internal and composite optimizations. I would conservatively estimate that this will cost us two orders of magnitude for any sort of fine-grained web assembly.

You are still fixated on networking. Apparently telling you "don't fixate on networking" is sort of like telling you "don't think about a purple Elephant".

if we can choose a solution where fewer things are special primitives that cannot be implemented in user land, isn't that better?

Yes, but only if all else is equal - simplicity, modularity, security, optimizability, composition, and so on.

pass a network capability into deserialize. Then deserialization isn't special: you can implement it at the user level instead of implementing it at the system layer.

Exposing serialize and deserialize to regular applications is a very bad idea. Most code should not be granted authority to turn 'bits' into 'caps', nor vice versa, since doing so hinders GC, optimization, security (via covert channels), and safety (similar to turning integers into pointers). I explained this in an earlier post that it seems you missed.

You seem to be assuming that a 'trusted' app is receiving the closure - i.e. an app trusted enough to have a network cap. This is a bad assumption.

The serialized representation of a farref will include the ip and object id of the machine where the object that the farref points to lives. You can forge farrefs

This is not a problem implicit to far references. You've just horribly designed your representation of them.

I use a typical scheme of HOSTID-OBJECTID where:
  • HOSTID is a secure hash of the host's public key.
  • OBJECTID is a secure-random GUID.

There is no inherent dependency on IP or DNS. I expect to use HOSTID as a key in a distributed hash table such as Chord, Pastry, or Tapestry. Since location is via the distributed hash table, developers cannot use HOSTIDs to ping arbitrary network resources. Initial attachment to the distributed table is achievable by any mix of configuration, caching, and multi-cast. Hosts can export a registry of public services, which allows ambient computing and overlays.

HOSTID is self-validating - i.e. I can challenge the remote service to prove it has a public key with the same secure hash as my HOSTID. This way, I won't accidentally share an OBJECTID with a middleman imposter.
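For concreteness, here is a rough Python sketch of that reference scheme (SHA-256, the token size, and the function names are stand-ins for illustration, not a spec):

import hashlib, secrets

def make_hostid(public_key_bytes):
    # HOSTID: secure hash of the host's public key
    return hashlib.sha256(public_key_bytes).hexdigest()

def make_objectid():
    # OBJECTID: secure-random GUID
    return secrets.token_hex(16)

def validate_host(hostid, presented_public_key):
    # self-validating check: does the peer's public key hash to the
    # HOSTID we hold? (a real protocol would also demand a signature
    # to prove possession of the corresponding private key)
    return make_hostid(presented_public_key) == hostid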

The E language protocol (Pluribus) uses a similar reference model, though the protocol is designed differently (mine is tweaked for RDP, overlay extensibility, and ambient programming).

Even assuming secure references, it is a bad idea to share the bits or bits-to-caps features unnecessarily, e.g. due to covert channels.

you need to pass in the application's network capability and not some system-wide master network capability

You do need to know whom to charge for what.

You are suggesting to treat network specially by creating a special system primitive to run code so that you can charge an e-purse for network access.

The function accepting (closure,e-purse) is NOT a primitive! It's a public service. Compare it instead to OpenGL - a foreign service, widely available, accessible through the runtime, standardized for portability. A function that takes an e-purse to run foreign closures would be: a foreign service, provided through the runtime, standardized for interop.

IIRC, one of Church's early

IIRC, one of Church's early formulations of lambda calculus had terms for networking and e-purses, but he eventually decided against them. Does anyone have a reference for this bit of history?

Interesting. I've never

Interesting. I've never heard that. But it would not surprise me to learn Church worked on a communication model or cost model, in more generic terms.

The idea of building cost models or linear logic into our programming models is, however, something I believe will be extremely important for 'Programming and Scaling', especially for integrating planners and AIs and such that could otherwise eat the world. I've been considering such a model for my work on 'generative grammars'. (You can't just generate sentences; you must generate cheap or profitable ones...)

That would be surprising

That would be surprising given that lambda calculus was designed (in the early 1900s) as a basis for mathematics and not for practical computation on computers.

I'm blanking on the name,

I'm blanking on the name, but there are other calculi built in the early 1900s for modeling chemical behaviors that happen to model both computation and conservation. [Edit: I was referring to Petri Nets.]

I long ago stopped being surprised when I learn people even in the 1800s were working on concepts relevant for practical automated computation...

But Matt tells me offline that the above is 'satire'.

tee hee hee

tee hee hee

If you insist on separating

If you insist on separating out and specially handling the 'network' resource, your structure becomes rigid - the local optimizer is no longer able to trade network for other resources.

Again, I don't. I pass in other capabilities just like networking. In fact, because I write deserialize as a user-space library procedure, there is no other option.

You are still fixated on networking. Apparently telling you "don't fixate on networking" is sort of like telling you "don't think about a purple Elephant".

I'm fixated on networking because you are special-casing it by providing special ways to control it, instead of using the regular capability passing that you'd use for other resources. Or are you suggesting that the same IllRun... function also charges the e-purse for all other resources, like time and memory? How then do you plan to let people extend it to new types of resources that come along?

Exposing serialize and deserialize to regular applications is a very bad idea. Most code should not be granted authority to turn 'bits' into 'caps', nor vice versa, since doing so hinders GC, optimization, security (via covert channels), and safety (similar to turning integers into pointers). I explained this in an earlier post that it seems you missed.

You seem to be assuming that a 'trusted' app is receiving the closure - i.e. an app trusted enough to have a network cap. This is a bad assumption.

It seems that you have missed that I am not suggesting exposing anything. I said the opposite: write serialize/deserialize as a library procedure that gets no special status or capabilities from the system. This cannot be dangerous by definition. If you do not have the capability to read the contents of a closure, you cannot write code to serialize that closure (but you can write code to serialize other objects).

This is not a problem implicit to far references. You've just horribly designed your representation of them.

I agree that that is a better design :) Unfortunately, it only solves the problem of access to arbitrary network locations; it doesn't provide a mechanism to control network access (whether used directly by the program or internally by IllRunYourCodeIfYouShowMeTheMoney). I'm interested in the mechanism because I want to see how you plan to solve the same problem for new kinds of resources.

With the distributed hash table design we can split the ability to communicate with remote object X from the ability to use network traffic. Suppose we have a network capability that lets you do network.send(objectID, message), where objectID is represented in the way you describe as HOSTID+OBJECTID, message is the data that you want to send, and network is a capability that lets you use network traffic. Note that network alone doesn't let you do anything; you also need an object to send a message to.

Now you can implement deserialize_farref as follows:

def deserialize_farref(stream, network):
    # parse the two numbers HOSTID and OBJECTID and put them
    # together in an objectID object that holds them both
    objectID = deserialize_objectID(stream)
    return FarRef(objectID, network)

Now whenever you send a message to a FarRef, it does network.send(objectID, message).

You can implement charging for network like this:

class ChargingNetwork:
    def __init__(self, network, epurse, myepurse):
        self.network = network
        self.epurse = epurse
        self.myepurse = myepurse
    def send(self, objectID, message):
        # charge the sender's e-purse before forwarding the traffic
        # (transfer is assumed to report whether the money arrived)
        if self.epurse.transfer(costInDollars(message), self.myepurse):
            self.network.send(objectID, message)

You can then implement IllRunYourCodeIfYouShowMeTheMoney like this:

def IllRunYourCodeIfYouShowMeTheMoney(code, epurse):
    # network and myepurse are the service's own capabilities
    deserialize_closure(code, ChargingNetwork(network, epurse, myepurse))()

You also want to charge for e.g. memory, so you also add a memory capability as a parameter to deserialize. You then pass a ChargingMemory to it in IllRunYourCodeIfYouShowMeTheMoney; a sketch of such a ChargingMemory follows. (If you have a lot of these, or if you want to make the mechanism extensible, you'd wrap all the charging capabilities in a single object instead of adding a new parameter for each of them - or better yet use one of the known extensible parameter-passing techniques - but this is a design issue that's beside the point I'm trying to make.)
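For illustration, a ChargingMemory could mirror ChargingNetwork above; alloc/free and the per-byte price are hypothetical names, not part of the proposal:

class ChargingMemory:
    def __init__(self, memory, epurse, myepurse, price_per_byte=0.000001):
        self.memory = memory
        self.epurse = epurse
        self.myepurse = myepurse
        self.price_per_byte = price_per_byte
    def alloc(self, nbytes):
        # pay for the allocation before performing it
        if self.epurse.transfer(nbytes * self.price_per_byte, self.myepurse):
            return self.memory.alloc(nbytes)
        raise MemoryError("e-purse refused the charge")
    def free(self, nbytes):
        self.memory.free(nbytes)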

As you can see IllRunYourCodeIfYouShowMeTheMoney does not need to be a special primitive *if* you pass in capabilities to deserialize as I suggest. If nothing else you can view this as an implementation technique for IllRunYourCodeIfYouShowMeTheMoney.

The function accepting (closure,e-purse) is NOT a primitive! It's a public service. Compare it instead to OpenGL - a foreign service, widely available, accessible through the runtime, standardized for portability. A function that takes an e-purse to run foreign closures would be: a foreign service, provided through the runtime, standardized for interop.

That is what a primitive is. Of course it's not a machine primitive, but it's a primitive operation from the point of view of the user program because you cannot write it yourself. You can't write OpenGL from scratch in user land because you don't have access to the display. It is special much like print is special but factorial is not.

Network Resource

I treat networking like any other resource - i.e. I do not require capabilities to utilize memory or CPU or network for working with language objects. And I would control network resources by the same mechanism as every other resource - the e-purse.

The idea of simply charging for resources is already generic across every family of resources and usage pattern I've ever imagined. Of course, for a new resource, the relevant parties would need to hammer out the fine details and ontology. But the essential architecture is stable.

It would be ludicrous to require a cap like: local.send(foo,Message). After all, we'd have an infinite regression issue. I think it just as ludicrous to require network.send(foo,Message). Holding 'foo' is, in the ocap model, sufficient proof that you are authorized to access it - whether it be local or remote. So there is no security reason to control network access. The only remaining issue is how to charge - which, with the e-purse, is the same way you would charge for local calls, but sometimes more expensive due to networking.

write serialize/deserialize as a library procedure that gets no special status or capabilities from the system. This cannot be dangerous by definition.

You cannot write serialize/deserialize without special capabilities from the system. If you can, your language is already unsafe - i.e. the ability to turn bits into capabilities (or back) is dangerous by itself, as is the ability to break encapsulation and peek inside closures or ADTs. If you have a static type system of any sort, deserialize is likely not type-safe, and we'd probably want to run some type-checkers while loading the code and integrating with mashups.

Providing those features in user-space - even as special capabilities from the system - is problematic with respect to security, safety, and optimizability.

You can implement charging for network like this:

I actually did pursue something like what you have developed above. That's why I speak so harshly of it today.

Consider:

  • delegation of a capability (or a function accessing it) to another process on the same machine, for use at its own discretion (who should be charged?)
  • receive large data from some local process that is to be sent from your capability (should you be charged for the bandwidth? or should some of the charge be associated with the controller of the data?)
  • obtain cacheable resources that other processes may be accessing (who should be charged?)
  • when we serialize a closure on HOST1 and send it to HOST2, some of the references may now be local to HOST2. Why are you charging network fees for local sends?

And that's just for network resources.

If we add memory and CPU resources, these capability-based policies will be much more complex and even less realistic, and even more optimizations would need to be considered - i.e. space for network, inlining of mashup functions, specialization of your function with someone else's data, etc.

To perform cost-optimization, the charge policy must be written in a fairly simple, declarative way that the optimizer and automated systems can understand. And don't forget that the administrator must also understand these cost policies! The optimizer must also have flexibility to choose between resources (e.g. trade space for bandwidth, with caches). For high performance, we must amortize costs with other code in the local mashup by cross-optimization. Delegation of resources should have a fair cost model depending on whose code is controlling what.

Oh, and all this accounting (including itemized receipts) should be super-cheap. A 5% performance cost is probably acceptable for immunity to viruses and for systematically pressuring people to write really good code. But is 10% acceptable? These cost records need to be amortized and very lightweight.

You will NEVER meet these requirements using per-resource capabilities. That approach is impossible for a typical administrator to grok or modify, rigid and inflexible for optimizations, funky for delegation, and has a ludicrous accounting overhead that you can begin to see even for something as coarse-grained as 'networking'.

There are no redeeming features, except that it seems obvious in a capability system. I would call this a fine example of 'simplistic' - a design that seems simple, but creates a lot of complication down the road.

Even as an implementation technique for IllRunYourCodeIfYouShowMeTheMoney, I denounce it.

it's a primitive operation from the point of view of the user program because you cannot write it yourself. You can't write OpenGL from scratch in user land because you don't have access to the display. It is special much like print is special but factorial is not.

There is a limbo in between primitive and user-space: extension space, filled with men of dubious trust who write high-risk plugins and extensions on your behalf. To accommodate these fine folk, I've developed a neat plugin extensible runtime design, which allows adding new runtime services and making them accessible in a capability-secure manner. Plugins can be embedded in the main process, or separated into different ones.

That is where I imagine 'IllRunYourCodeIfYouShowMeTheMoney' would be implemented, alongside other non-primitives but non-user-space features such as OpenGL, transport-layer extensions, foreign function adapters, and even bootstrap language facilities.

It would be ludicrous to

It would be ludicrous to require a cap like: local.send(foo,Message). After all, we'd have an infinite regression issue. I think it just as ludicrous to require network.send(foo,Message). Holding 'foo' is, in the ocap model, sufficient proof that you are authorized to access it - whether it be local or remote. So there is no security reason to control network access. The only remaining issue is how to charge - which, with the e-purse, is the same way you would charge for local calls, but sometimes more expensive due to networking.

I don't see why this is an issue. As I described a remote ref uses network.send internally when you do a remote call. The API of remote refs doesn't change, just the way they are constructed by deserialize. If you hold a remote ref to an object, you can send it a message because the network capability is encapsulated inside it.

You cannot write serialize/deserialize without special capabilities from the system. If you can, your language is already unsafe - i.e. the ability to turn bits into capabilities (or back) is dangerous by itself, as is the ability to break encapsulation and peek inside closures or ADTs. If you have a static type system of any sort, deserialize is likely not type-safe, and we'd probably want to run some type-checkers while loading the code and integrating with mashups.

Providing those features in user-space - even as special capabilities from the system - is problematic with respect to security, safety, and optimizability.

You can only write serialize for objects that you can peek inside. So if you don't have the ability to peek inside a particular closure, you can't write serialize for that closure. This is the right thing, because if you were able to serialize it you would also be able to peek inside it. Similarly you can only write deserialize for objects you can construct. I don't get what you perceive as a problem here. Perhaps you can give a concrete situation where this wouldn't work or would be insecure.

delegation of a capability (or a function accessing it) to another process on the same machine, for use at its own discretion (who should be charged?)

You decide. If you don't mind being charged, just share the capability directly. If you do mind, wrap it with something that transfers money from the delegate to you when he uses it. This is like the real world. Take internet access: an internet cafe gets charged by its ISP, so it doesn't pass on the capability to use the internet to its clients. Instead, it gives its clients the capability to use the internet for a fee paid to the cafe, and with that money it pays its ISP. The same goes for the next two points: you decide.

For example a local caching service could work like this:

The cache service has a method cache.lookup(key, epurse). The cache itself gets charged for doing the operation (for example, if it's doing network operations with a ChargingNetwork that charges its e-purse). It therefore charges the e-purse passed in, say $2, and remembers that e-purse along with the cached result. When another person later looks up the same thing, the cache charges that person $1 and transfers $1 to the e-purse it stored along with the cached result. This way each lookup ends up costing $1. A sketch follows.
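A minimal Python sketch of that cost-sharing cache, assuming an e-purse exposes transfer(amount, to_purse) as in the code above (all other names hypothetical):

class CachingService:
    def __init__(self, network, my_purse):
        self.network = network      # e.g. a ChargingNetwork charging my_purse
        self.my_purse = my_purse
        self.cache = {}             # key -> (result, purse of first requester)

    def lookup(self, key, epurse):
        if key in self.cache:
            result, first_purse = self.cache[key]
            epurse.transfer(1, first_purse)   # later caller reimburses the first
            return result
        epurse.transfer(2, self.my_purse)     # first caller pays the full fetch cost
        result = self.network.send(key, "GET")  # stand-in for the real fetch
        self.cache[key] = (result, epurse)
        return result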

when we serialize a closure on HOST1 and send it to HOST2, some of the references may now be local to HOST2. Why are you charging network fees for local sends?

I only charge network fees for things that require network...?

There are no redeeming features, except that it seems obvious in a capability system. I would call this a fine example of 'simplistic' - a design that seems simple, but creates a lot of complication down the road.

You have permission to describe an alternative! I'd like to see something that is faster than passing a few capabilities around (which is already extremely cheap) and easier to understand for system administrators (it is not hard because the system administrator just specifies the cost policy for capabilities that the administrator gives out).

There is a limbo in between primitive and user-space: extension space, filled with men of dubious trust who write high-risk plugins and extensions on your behalf. To accommodate these fine folk, I've developed a neat plugin extensible runtime design, which allows adding new runtime services and making them accessible in a capability-secure manner. Plugins can be embedded in the main process, or separated into different ones.

That is where I imagine 'IllRunYourCodeIfYouShowMeTheMoney' would be implemented, alongside other non-primitives but non-user-space features such as OpenGL, transport-layer extensions, foreign function adapters, and even bootstrap language facilities.

Please do share your design!

Alternative to embedding cost caps

I envision every device as a 'participant' in the cloud, and getting paid for it. Each device provides a declarative schedule of resource costs - e.g. a relation of (resource, time period, cost function) that covers the resources measured. Different costs may be available for long-term contracts (i.e. due to better guarantees of payment). Resources may also be auctioned on a more volatile market. All these costs can change over time (with relative levels of advance notice, contract renewal, et cetera). The important bit, however, is that these costs are visible to whomever is considering whether to put code on the device (and how much). Alongside the IllRunYourCodeIfYouShowMeTheMoney service, we need an AndHeresHowMuchIllChargeYou service.
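A toy Python sketch of such a schedule and the query service (resource names, units, and prices are invented for illustration):

# declarative schedule: (resource, time period, cost function over usage)
COST_SCHEDULE = [
    ("cpu_ms",    "2011-Q3", lambda ms:  0.000001 * ms),
    ("ram_mb_hr", "2011-Q3", lambda mbh: 0.00002 * mbh),
    ("net_kb",    "2011-Q3", lambda kb:  0.00005 * kb),
]

def AndHeresHowMuchIllChargeYou():
    # publish the current schedule so a distribution agent can estimate
    # the cost of placing a shard here before committing to it
    return COST_SCHEDULE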

There are essentially two layers of optimization: first, some external system decides which pieces of code to shard to which devices, based on proximity to other devices. Second, the devices themselves can perform local optimizations to leverage machine-specific features and tightly integrate with other code.

For cloud services, there are two incentives to locally optimize code: first, the local servers now have a greater margin for profit - the distance between naive interpretation of code and this optimized form. Second, we can feed some of this margin back to the client, to improve competition or reputation. For mobile devices, there is at least one incentive to locally optimize code: to save precious resources (power, energy, bandwidth).

Information about actual costs feeds back to the distributed optimizer, which adjusts how it estimates the (cost, shard, location, time) relation accordingly.

The 'low level' detail - exactly how to annotate a shard with profiling code - is largely irrelevant to this system. It doesn't need to be visible in user-space, and thus can be decided locally without forcing a standard. Any number of profiling mechanisms would work.

Devices in turn pay for the remote resources and services they use. Costs never quite even out (there is room for profit here) but that's okay. My goals are emergence of desirable features: system-wide optimization, systematic resistance to denial-of-service attacks, and systemic competition between cloud services (i.e. this strongly resists monopolies).

The trick to making this work is to decide where we CAN distribute shards of code. For that, I use a simple constraint model based on annotating code with trust/location requirements and contagion properties. The end result is that the distribution agent knows which remote services may directly observe which capabilities and state.

You decide. [Fee for service. Caches.]

I'm glad you're starting to really grok the e-purse design!

Once you've introduced costs, there is exactly one thing to optimize for: cost. You must assign CPU and space costs, too, otherwise optimizers would (and should!) consider them free, and gladly consume gigabytes of local space to save even one micropayment of networking costs.
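As a toy illustration of optimizing on a single cost axis (all numbers and resource names invented), a local optimizer can trade space for network just by pricing the alternatives:

def estimate(usage, prices):
    # total price of a strategy's resource bundle
    return sum(prices[res] * qty for res, qty in usage.items())

strategies = {
    "recompute": {"cpu_ms": 500, "ram_mb": 1,   "net_kb": 0},
    "cache":     {"cpu_ms": 5,   "ram_mb": 200, "net_kb": 0},
    "refetch":   {"cpu_ms": 5,   "ram_mb": 1,   "net_kb": 300},
}
prices = {"cpu_ms": 0.000001, "ram_mb": 0.00002, "net_kb": 0.00005}

best = min(strategies, key=lambda s: estimate(strategies[s], prices))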

It is true that you could 'decide' these things by this use of capabilities. Your ideas can work. I've pursued that direction before.

Unfortunately, capabilities are inscrutable to optimizers, to administrators, and even to their developers. Seriously, it's the Brainfuck of payment mechanisms. A user-space closure is rarely in a position to know what fair resource costs are. I estimate that fine-grained code distribution and mashups can conservatively gain two orders of magnitude in performance by avoiding this rigidity.

I would suggest that fee-for-service should be a separate layer, mostly orthogonal to fee-for-resources. Fee for service is primarily in the user-space layer. Fee for resources is the domain of distributed optimizers. The two are coupled, indirectly, only because some services are rigidly tied to particular resources.

You can only write serialize for objects that you can peek inside.

User-space should not be able to peek inside closures. Doing so hinders tight optimization, type safety, and the security of mashups.

Please do share your design!

In short: plugins provide sets of objects annotated with sensitivity constraints (public, private, private-but-non-portable, dangerous) and distribution or uniqueness constraints (i.e. unum, device-local, process-local). These constraints are used to sort the objects into different registries. Local applications obtain access to registries based on how much the administrator trusts them. (I literally provide the registry as an argument to the application, for capability security.) These objects can then be 'imported' from the appropriate registry.

I have decided on a content-centric approach to imports, which means I import objects based on their attributes rather than their names. A few meta-data attributes must be added to make this feasible. This allows me orthogonal configuration and policy management when I have more than one choice for a resource (such as Qt vs. GTK). It also gives me a lot of resilience, i.e. I can easily fall back to a different plugin if one decides it's in a bad state and fails. I wrote a bit about this as policy injection [c2 node].

Anyhow, I can easily support multiple languages and new features this way. In the broad sense, there is nothing special about it - people are already familiar with pluggable browsers, web servers, and operating systems (with drivers). Why not a pluggable language runtime? But the content-centric discovery and capability-security mechanisms are rare, and I've never seen them used together.

This plugin mechanism is also orthogonal to my syntax extension mechanism, modulo the need for a bootstrap-syntax module.

E reference mechanics

E reference mechanics offer 3 of the 4 possibilities. Near references can accept both Immediate (foo.bar()) and Eventually (foo<-bar()) calls. Far references only allow Eventually calls. This gives us our semi-transparent distribution: we can treat near objects like far objects, but not vice versa.

Eventually calls are asynchronous, but rarely parallel. There are no threading semantics. An eventually call on a Near object will simply be queued for local processing. An eventually call on a Far object may be queued locally, batched (with promise pipelining), and flushed at end of turn. Implicit batching offers a great deal of performance when multiple calls are made.
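A bare-bones Python sketch of that turn-based queueing (this omits promises, pipelining, and inter-vat transport entirely; it only shows the shape of eventual delivery):

from collections import deque

class Vat:
    def __init__(self):
        self.queue = deque()

    def send_eventually(self, obj, method, *args):
        # an eventual call is queued rather than invoked immediately
        self.queue.append((obj, method, args))

    def run(self):
        # deliver queued messages one at a time; each delivery is a turn,
        # so new events can be interleaved between deliveries
        while self.queue:
            obj, method, args = self.queue.popleft()
            getattr(obj, method)(*args)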

Long-running tasks are often idiomatically modeled in E with incremental, local, eventually steps. Network and user events can thus be processed between steps. This idiom is common for a lot of systems with event queues.

The problem corner is the 'blocking network call'. There seems little reason to accept it, and I provided above a big list of reasons to reject it.

asynchronous local calls don't have the same failure modes as asynchronous remote calls

Their failure modes are essentially identical - i.e. exceptions can break a promise. The causes of failure are slightly wider, but modes are what we observe after failure, which is a broken promise.

On data flow vs control flow. They are the same thing. Control flow is just a short notation for certain kinds of data flow.

Your latter sentence could be rewritten as: control flow is a specialized subset of data flow. I do not believe you can conclude that data-flow is therefore a notation for control flow.

In any case, they certainly connote different programming architectures and different classes of concurrency models.

Eventually calls are

Eventually calls are asynchronous, but rarely parallel. There are no threading semantics. An eventually call on a Near object will simply be queued for local processing. An eventually call on a Far object may be queued locally, batched (with promise pipelining), and flushed at end of turn. Implicit batching offers a great deal of performance when multiple calls are made.

Yes, so far calls on near refs are lazy, but far calls on far refs are parallel. So the table really should have been:

        | Synchronous               | Asynchronous                | Lazy
Local   | Synchronous local call    | Asynchronous local call     | Lazy local call
        | (near call on near ref)   |                             | (far call on near ref)
Remote  | Blocking network call     | Asynchronous network call   | Lazy network call
        |                           | (far call on far ref)       |

So in E you use the same syntax for two different concepts. And E does not provide local asynchronous calls, but it does provide local lazy calls.

The problem corner is the 'blocking network call'. There seems little reason to accept it, and I provided above a big list of reasons to reject it.

Rejecting it makes the language non-orthogonal. We can separate what a method does internally from how it is called (synchronously/asynchronously/lazily), instead of coupling the two.

Their failure modes are essentially identical - i.e. exceptions can break a promise. The causes of failure are slightly wider, but modes are what we observe after failure, which is a broken promise.

The same could be said about a synchronous local call and a synchronous remote call. And I agree completely: this is actually an argument for treating asynchrony uniformly. What we want from an asynchronous call is (1) that it be executed in parallel, so that we don't have to wait if it takes a long time, and (2) that failures and exceptions break the promise. It doesn't matter what the method we call asynchronously is doing internally that takes a long time or that can fail. What I'm saying is that you should not view a remote call as really a remote call. There is no such thing as a remote call: what it really is is a local call that happens to use the network to get its result. And since this can take a long time, we often want to execute it asynchronously.

Your latter sentence could be rewritten as: control flow is a specialized subset of data flow. I do not believe you can conclude that data-flow is therefore a notation for control flow.

In any case, they certainly connote different programming architectures and different classes of concurrency models.

Right, I said it wrong; that's what I meant (subset instead of equal). As long as you have lexical scoping and function application you can do control flow via monads. Even if the underlying way that values interact with function calls is different, the conciseness of the notation for specific kinds of data flow could still be useful to you. For example, in Haskell everything works with lazy values (say Lazy), and in ML everything works with strict values (say Strict). If your programming model works with time-varying values (say Signal), you could still use control flow as a short notation for common patterns of data flow. It might turn out that if you program with signals these monadic data flow patterns don't come up in practice, however.

Eventually is not Lazy

I'm really not liking the table as you've edited it. Eventually is not lazy. And asynchronous is not parallel. The prior table was better.

I agree that rejecting one quadrant of a table does make a language seem 'non-orthogonal'. But that's what semi-transparent distribution is all about. Trying to make it orthogonal seems to be a bad idea, probably because E uses an imperative programming model, which is, by nature, not orthogonal to concurrency and disruption. Why try to force it?

Rather than trying to make an orthogonal language by tacking on harmful features, I would suggest going back and finding the root reasons and fixing those - i.e. seek an alternative to imperative.

With regards to control-flow and data-flow: I agree that dataflow models can often support control-flow or work-flow models - often in rich, ad-hoc ways. But a good dataflow model will rarely need to emulate control flow. (Really, it seems the other way around: some 70-80% of applications is typically control-flow modeling data-flow, often badly.)

I'm really not liking the

I'm really not liking the table as you've edited it. Eventually is not lazy. And asynchronous is not parallel. The prior table was better.

Ah, you're right. I was thinking that the call would be made only when the result was needed. Then that's actually a pretty good design, but I'm still not sure whether not providing the blocking quadrant is a good thing. If you regard a vat as a boundary of synchronous execution, then it's a good thing. But then you treat parallel computations running on one machine the same as parallel computations running on two different machines. It is often useful to distinguish between the two, because the single-machine scenario has fewer communication failure modes. If you run with multiple threads, then blocking calls aren't bad any more: spawning a thread that blocks on the network and then does some more work is a useful pattern. It's really the same as hooking up an event handler on a network request as you do in E, with the difference that computations can run in parallel on a multicore machine.

Rather than trying to make an orthogonal language by tacking on harmful features, I would suggest going back and finding the root reasons and fixing those - i.e. seek an alternative to imperative.

With regards to control-flow and data-flow: I agree that dataflow models can often support control-flow or work-flow models - often in rich, ad-hoc ways. But a good dataflow model will rarely need to emulate control flow. (Really, it seems the other way around: some 70-80% of applications is typically control-flow modeling data-flow, often badly.)

I agree with both points, with the caveat that imperative programming on our imperative machines is how you get good, predictable performance.

BTW, promise pipelining has a comonadic structure:

extract p = p.waitForResultAndReturnIt()
cobind p f = let q = new Promise() in spawn q.fulfill(f p); q

Where spawn spawns a new thread. The implementation is not purely functional, but the resulting interface is (just like the laziness comonad). Sorry for the awful mix of Haskell and Java.
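If it helps, here is a rough Python transliteration of that interface (Promise, wait, and fulfill are hypothetical names; a thread stands in for spawn):

import threading

class Promise:
    def __init__(self):
        self._done = threading.Event()
        self._value = None
    def fulfill(self, value):
        self._value = value
        self._done.set()
    def wait(self):
        self._done.wait()
        return self._value

def extract(p):
    # extract :: Promise a -> a
    return p.wait()

def cobind(p, f):
    # cobind :: Promise a -> (Promise a -> b) -> Promise b
    q = Promise()
    threading.Thread(target=lambda: q.fulfill(f(p))).start()
    return q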

We Dream of Parallel

E could be augmented with some deterministic parallelism. Though it is not well designed for such augmentation, a lot of the work on parallelism in C or Fortress could certainly apply.

But I agree that modeling parallelism and machines separately can be advantageous. I have developed a model of vats in Haskell, and each 'vat' is in a 'host'. This lets me use Haskell's rich types, data-parallelism, and laziness within a host and enforce serializable properties between hosts. Mutable state, however, should not be shared between vats.

with the caveat that imperative programming on our imperative machines is how you get good, predictable performance

I would disagree with this caveat, on the basis that our 'machines' are not generally imperative - not once you account for sensors, GPUs, buffered actuators (e.g. sound), FPGAs, DSPs, and so on.

Our CPUs today are imperative. But how much of our machine is the CPU?

For good, predictable performance, I think we can do much better than imperative, for matching both our machines and software requirements.

promise pipelining has a comonadic structure

I hadn't really thought about that.

I pursued promise pipelining for distributed systems a few years ago, but I decided that promises are ultimately a good, subtle way to surreptitiously freeze a distant process with a denial-of-service attack - not a feature I want in a web assembly. Also, broken promises seem to leak implementation details. (Exceptions have the same problem, but promises take that problem and spread it everywhere.)

I'm still very interested in pipelining and implicit batching between systems, though. That E uses promises to achieve them is clever, but is not the only valid approach.

I would disagree with this

I would disagree with this caveat, on the basis that our 'machines' are not generally imperative - not once you account for sensors, GPUs, buffered actuators (e.g. sound), FPGAs, DSPs, and so on.

Our CPUs today are imperative. But how much of our machine is the CPU?
For good, predictable performance, I think we can do much better than imperative, for matching both our machines and software requirements.

Right, I meant performance on CPU-bound tasks. Interestingly, although imperative languages are close to the hardware we have today, functional and dataflow languages are much closer to the physics. The natural way to do a computation (a+b)*(c+d) in hardware is to have 4 lines of data coming in, feeding two adders, each of which has one line of data coming out, which go into a multiplier. Hardware supports fine-grained parallelism naturally. There is no implicit sequencing between the computations (a+b) and (c+d), i.e. there is no control flow - only data flow. So if you run a functional program on top of a normal CPU, you have very fine-grained parallelism in the program, then you remove it by turning the program into machine code, and run that on a machine that is trying to recover some of that fine-grained parallelism with techniques like out-of-order execution. Unfortunately we're stuck with current hardware for the time being.

I pursued promise pipelining for distributed systems a few years ago, but I decided that promises are ultimately a good, subtle way to surreptitiously freeze a distant process with a denial-of-service attack - not a feature I want in a web assembly. Also, broken promises seem to leak implementation details. (Exceptions have the same problem, but promises take that problem and spread it everywhere.)

Yes, you still need to remember to do timeouts and such. Why do broken promises leak implementation details? Do you mean that they reveal that some computation used a remote resource? As long as you stay in the land of promise pipelining you don't have that problem?

I'm still very interested in pipelining and implicit batching between systems, though. That E uses promises to achieve them is clever, but is not the only valid approach.

Ultimately I think that any pipelining and implicit batching is going to use something similar to promise pipelining. You want a way to compute new values based on values coming from the network. So internally you want to hook up an event handler that goes on with the computation once the value from the network comes in. Do you see a completely different way of doing things?

Broken promises and timeouts

Broken promises in E will also report a 'reason'. To the agent receiving the broken promise, this reason is effectively an implementation detail. If we reported brokenness only as a boolean condition (and did the same for exceptions), this would not be an issue.

Using timeouts means you have a bad semantic model to start with, and doesn't actually solve any denial-of-service or security problem. I could surreptitiously inject yet another unfulfilled promise into each retry.

any pipelining and implicit batching is going to use something similar to promise pipelining

An issue with promises is that you can make a single promise and use it in multiple locations in an ad-hoc manner. Also, we can create new promises, which are fulfilled in ad-hoc manners. If we control structure of pipelining and how we create promises, we can make stronger guarantees about them. Even better if the runtime can determine how long it should take to hear a result.

META: would you mind narrowing your 'deserialize' def, above?

Using timeouts means you

Using timeouts means you have a bad semantic model to start with, and doesn't actually solve any denial-of-service or security problem. I could surreptitiously inject yet another unfulfilled promise into each retry.

I mean the ability to do custom timeouts, not automatically retried timeouts. If you do some operation but it's taking too long, you often want to perform an alternative operation.

An issue with promises is that you can make a single promise and use it in multiple locations in an ad-hoc manner. Also, we can create new promises, which are fulfilled in ad-hoc manners. If we control structure of pipelining and how we create promises, we can make stronger guarantees about them. Even better if the runtime can determine how long it should take to hear a result.

Yes, I took that as a given. Staying in the land of promise pipelining and not using promises directly is one controlled way of doing things. The cobind I gave above is a way to wire things together in a more controlled way. I see promises as an implementation detail for more controlled building blocks.

META: would you mind narrowing your 'deserialize' def, above?

What do you mean by narrowing?

Pipelining without Promises

custom timeouts, not automatically retried timeouts

It doesn't matter whether retries are automatic or not. The surreptitious denial of service still 'succeeded'.

And user-space timeouts, even without retries, still smell of semantic problems - indeterminism, informality, lack of effective control - in addition to efficiency issues regarding polling and orthogonal persistence.

I see promises as an implementation detail for more controlled building blocks.

That seems like an abstraction inversion to me.

I spent a couple hours last night figuring out how to pipeline using a weaker primitive - my VatCB methods - which is just enough for Unix-like piping and tee-junctions, but doesn't allow joins (except via side-effects).

type CBFun cb x y = (x,y->cb)-> cb

class (Vat v, Callable v cb) => VatCB v cb where
  mkcb0 :: (x -> v ()) -> v (x -> cb)

mkcb :: (VatCB v cb) => (x -> v y) -> v (CBFun cb x y)
mkcb op = mkcb0 $ \ (x,ycb) -> op x >>= call0 . ycb 

class (Monoid cb) => Callable m cb where 
  call0 :: cb -> m ()

-- compose two callback-passing methods into a pipeline
pip :: CBFun cb x y -> CBFun cb y z -> CBFun cb x z
pip f g (x,zcb) = let ycb y = g (y,zcb) in f (x,ycb)

-- tee-junction: feed x to a side branch (discarding its response),
-- then pass x onward
tee :: CBFun cb x y_ -> CBFun cb x x
tee f (x,xcb) = f (x, const mempty) `mappend` xcb x

-- call a pipeline and drop the last response.
call_ :: (Callable m cb) => CBFun cb x y -> x -> m ()
call_ f x = call0 (f (x, const mempty))

Now I can easily pipeline a bunch of method-calls between vats. I can:

let xyz = x `pip` y `pip` z in
let abc = a `pip` b `pip` tee xyz `pip` c in
call_ abc arg

Here, a, b, c, x, y, and z denote methods that may be transparently distributed between different vats or IO threads. The output from 'b' goes to both x and c. Advantages: there is no round-tripping with the caller (direct hops between vats); communication is wait-free (control constructors with the 'cb' type); the pipeline does not create new threads (no spawning needed); and methods can be very precise and fine-grained (capability security).

I haven't found much use for it yet, but I suspect this will be valuable for legacy integration, and a useful way to develop multi-threaded applications. While I was tweaking the VatCB model, I also improved IO integration - a Haskell IO thread can call-and-wait or even join the pipelines by use of the Control.Concurrent.Chan type, using class (Callable IO cb) => ChanCB cb where cbChan :: (a -> b) -> Chan b -> (a -> cb).

Vats are sort of incidental to my work in RDP. They are powerful in their own right: vats are a lot like Erlang processes or Actors, except with the flexibility, security, and performance benefits of first-class methods and fine-grained capability security.

RDP is pipelined by nature, using signals in continuous time. Before I decided on using vats in the implementation layer, I did try promises and other things (STM, for example), but found promises make a poor 'building block' for reasons I describe above (and elsewhere). Those prototypes failed, but taught me a lot about what features I want in the implementation and semantics - such as wait-free communication, and absolute control over local glitches and latencies. This led me to my use of vats and temporal semantics just a few months after conceiving of RDP. After committing to temporal semantics, RDP's pipelines offer a lot of very nice properties - static latencies, real-time, duration coupling, anticipation. I find these 'guarantees' vastly preferable to 'promises'.

What do you mean by narrowing?

I mean physically. The '// HUGE COMMENT' requires horizontal scrolling, at least in Chrome and Firefox, and also widens this entire LtU thread to an intolerable degree (esp. for my ASUS EEEPC). You should be careful about how wide you make your '<pre>' blocks.

Proposed Web Assembly Model

serialize and deserialize that can convert any object in the hypothetical language to and from binary data that can be sent over the network

I pursued an approach like the one you describe back in 2004, but found this idea has many significant problems beyond CPU and memory management:

  1. Capabilities protected by serialized objects or closures are now visible as 'bits' in the serialized representation, and accessible through deserialize. This is a security and safety concern.
  2. The extent of a closure or object is generally unclear, i.e. the extent to which it should include referenced objects, or which pieces are 'secret' and should be constrained to a particular machine.
  3. Garbage collection and memory management are difficult when we have references hidden in these opaque structures.
  4. Serialization is essentially crystallization, a snapshot of the past. In general, liveness is a desirable property, especially when we start to capture an environment.

Access to serialization and deserialization 'capabilities' is acceptable, so long as you can control access to them. However, we'd still need to design a language that is 'good for distribution' to solve the extent, GC, and liveness issues. (A good start is to eliminate pervasive mutability, and to annotate code or objects with desired locations (A nearby B) and sensitivity concerns.)

calling the function might consume time and memory. I think we could solve that by including a thread system that allows you to create a new thread with a given maximum memory space, and the ability to run a thread for a given amount of time

This solution is one that has been used before with moderate effectiveness.

But it raises safety issues - e.g. we might stop a thread that is holding an important resource (e.g. a lock, or a callback that some other system needs, etc.) because it runs out of space or time. It is difficult to protect our abstractions or reason about the system when failure can occur at any arbitrary state.

As you note, there are many issues with using this technique at the application assembly level:

  • We might both benefit if I specialize (partially evaluate) one of my functions for one of your inputs. But who pays for this cooperative venture?
  • Who pays for the rework after a transaction failure due to conflict? Who chooses which transaction continues?
  • How do we provide a service the resources to process our tasks? Can we ensure our systems do not become bloated from temptation to skim extra computations off the top?

we could model memory as a capability as well, by passing in functions that allocate and free memory that stop working when the parent that passed these capabilities thinks they got enough memory. Perhaps we could manage time explicitly as a capability in a similar way, but this seems more difficult and the method of donating time slices to other threads seems sufficient.

Providing memory management and scheduling as capabilities is a well-proven technique, e.g. used in KeyKOS. It is one of the approaches I favor. But capabilities do not resolve the cooperative payment or safety concerns mentioned above.
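A minimal Python sketch of the quota-plus-revocation idea quoted above (all names hypothetical):

class MemoryCap:
    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.used = 0
        self.revoked = False
    def alloc(self, n):
        # deny the allocation if over quota or revoked
        if self.revoked or self.used + n > self.limit:
            raise MemoryError("allocation denied by capability")
        self.used += n
        return bytearray(n)   # stand-in for a real allocation
    def free(self, n):
        self.used = max(0, self.used - n)
    def revoke(self):
        # the 'stops working' behavior: the parent cuts the child off
        self.revoked = True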

I'm not dismissing the idea. Actually, I see capabilities as a fine basis for managing resources at a sort of 'compute service' layer - e.g. a cloud. But that is not really the web assembly level, since it needs to make a lot of decisions that cross multiple assemblies.

I would eventually like to see people 'auctioning' a fraction of their spare compute, bandwidth, memory, and persistent storage, and participating in a sort of distributed compute service that touches every endpoint. The 'language runtime' would then be integrated with this compute service at every node that wishes to join in.

You could also have a function that creates a closure (with an empty environment) from an AST. That would give you the ability to dynamically generate and execute code like eval, but without implicit access to the environment.

Isn't that what libraries are for? ;-)

I'm guessing you mean to avoid interpreter overheads - or, in other words, you would like a suitable basis for staged programming.

This is certainly a valuable feature. From a security perspective, staging allows us to achieve confinement wherever we need it. From an abstraction perspective, staging allows us to 'separate concerns' between performance and abstraction. That is, we don't want to pay indirection for every layer of abstraction - we want to 'flatten' the tower of languages and abstractions at runtime. Support for high-level abstractions brings its own performance benefits, since we can do a lot of symbolic optimization for composition at the higher layers.
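As a toy Python sketch of the 'closure from an AST, without implicit access to the environment' idea quoted above (note that Python's empty-__builtins__ trick is famously leaky, so this only illustrates the shape of the API, not real confinement):

import ast

def closure_from_ast(source):
    tree = ast.parse(source, mode="eval")
    code = compile(tree, "<dynamic>", "eval")
    def closure(**explicit_env):
        # no builtins, no enclosing scope: only what the caller passes in
        return eval(code, {"__builtins__": {}}, explicit_env)
    return closure

double = closure_from_ast("x * 2")
print(double(x=21))   # 42; ambient names like 'open' are unavailable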

A trouble with ASTs as the basis for distribution is that we really do want to distribute capabilities. One of the big reasons for putting code on your machine is to efficiently manipulate objects that are on your machine. For security, performance, and simplicity - the best way for me to prove I have legitimate access to those objects is to name them (as opposed to embedding some sort of lookup mechanism in the distributed code). This allows me to separate concerns of resource discovery and code distribution.

Before I developed RDP, I favored an alternative to ASTs: pure functions.

A pure function can keep secrets from both its client (who cannot access secret values hidden in the function) and its provider (who cannot capture sensitive data passed into the function). Some pure functions can return a behavior to be executed, but even then both participants can control which secrets are seen. This assumes, of course, that a mutually trusted compute service is involved somewhere in the middle. But that is not an unusual condition - it's the basis for all mashups.
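A tiny Python illustration of the provider side of that bargain (names invented, and ignoring Python's introspection escape hatches; the client-side guarantee relies on the mutually trusted compute service mentioned above):

def make_pricer(secret_margin):
    # the returned pure function closes over secret_margin;
    # a client holding 'price' is only given the behavior, not the margin
    def price(cost):
        return cost * (1 + secret_margin)
    return price

quote = make_pricer(0.15)   # provider keeps the 15% margin secret
print(quote(200.0))         # 230.0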

Staging with pure functions is feasible: trace compilers, annotations for partial evaluation or specialization, etc.

The problem with pure functions is that they tend to accumulate a lot of secrets (via currying), and it is difficult to distribute just 'part' of a function. For example, if you delegate secret parts of a function to a remote system, you are at severe risk of abstraction violation because network disruption is not a 'pure' effect! We must reject remote values.

My RDP design calls for an AST-based approach (pure data distribution) for persistent code (which means: anything that must survive network disruption). Volatile code is considerably more flexible, for example if you query me for some resource, I might send you a 'rich' redirect, which happens to involve a lot of code - perhaps including some functions and references to other systems that have part of the answer.

By its very definition, volatile code preserves disruption semantics: we treat volatile code strictly as if it's still part of the remote service, so network disruption requires stopping the volatile code, too. The costs of working with volatile code can be hugely mitigated by caching (which can allow us to maintain compiled code across network disruptions).

Volatile code distribution cleanly solves the 'pure function' distribution problem. It is not tightly structured like an AST or pure function - this gives the runtime the flexibility it needs to make good, fine-grained distribution decisions.

Another nice property is that volatile code tends to be very 'voluntary' by nature - i.e. we obtain such code through redirects, through accessing remote services. Thus, volatile code also avoids the issue of 'who pays whom'. (We don't ever need to 'push' volatile code. It is sufficient to push a reference that, when voluntarily requested, returns code saying how to use the reference.)

Developers and IDEs can leverage 'persistent' code distribution - that's how we get mobile agents, distributed applications and overlays, autonomous mission management, and so on. I think this would be a mostly-explicit level of code distribution.

But a lot of 'web assembly' features happen with the volatile code. And if we are to support a lot of fine-grained volatile code, we really need a good, predictable, common model for it.

For flexibility in passing data around between different applications/closures instantiated dynamically it would be easiest if the language were dynamically typed, or at least we'd want the runtime values to be tagged.

I favor a static typing model within each application, and I favor disruption semantics if someone messes up at the border between applications. Failing fast and having a simple fallback plan is, I think, a better approach than limping along.

However, I do plan to augment this with type reflection: for example, if you have a reference to an object, you can query a standard capability (and service, for remote runtimes) for some 'type info' about that reference. This would tell you what an input should look like, what an output would look like, and maybe even give you an idea of what the object does (e.g. with design by contract).

Services leveraging those capabilities can provide dynamic adapters when you need them.

Dynamic typing is not the only approach to flexibility - and very far from the safest or most secure.

What do you think needs to be added to this model for it to be a local optimum in the design space, for reasonably large values of local? What should be changed, what should be subtracted?

Polling needs to go - it is inefficient if things haven't changed, and has high latency if things have changed. If you can read something on the network, you should be able to subscribe to it.

Furthermore, demand-driven resources are valuable - i.e. this is the idea of creating an object or service right when someone asks for it. Subscriptions are a useful way to indicate demand with a clear lifespan.

Stability is a huge concern. One of the potential failure modes for a complex system is something like an epileptic fit. The programming model and language should help developers reason about or protect stability, despite each programmer only thinking about his or her own slice of the pie.

Redundancy is important on the web, as a basis for availability, partitioning tolerance, and scalable performance. Redundancy is difficult if code is indeterministic or tends towards divergence. It is preferable that our web assembly be deterministic to whatever extent is feasible, and that it stabilize on a state that is consistent between observers. This has a huge impact on the programming language - for example, it denies use of the actors model and discourages state-per-observer.

Concurrency is a natural issue on the web, and intersects with almost every other concern - security, scalability, performance etc. A bad concurrency model will enable denial-of-service attacks, introduce covert channels, introduce indeterminism that hinders redundancy, and so on. History proves that, absent a decent concurrency model, developers will reinvent them - many times, inconsistently, and badly. I very strongly suggest a simple concurrency model be built in, though it should be flexible enough that other models can be built above it.

Working with large collections (millions or billions of elements) is a common issue on the web. In the general case these collections are live (they can change over time) and derived (they represent predicates or views, not unique objects). We want to 'pass' these collections between different libraries or services - without unnecessarily serializing them - and perform compositions, queries, and filters as we go. Even if we do this in libraries (which I'm partial to), it would need to be well standardized, and user-level code would probably benefit from syntactic sugar for it.
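
A rough illustration (plain Python generators standing in for whatever the standardized library would be): queries, filters, and projections compose into one deferred pipeline, so the collection is never serialized or materialized as it passes between libraries.

    # A derived, live collection is a description of how to produce
    # elements; composition stacks descriptions, not data.
    def where(collection, predicate):
        return (x for x in collection if predicate(x))

    def select(collection, projection):
        return (projection(x) for x in collection)

    # pipeline = select(where(source(), is_active), extract_id)
    # Consuming ten elements costs work proportional to ten,
    # even if the underlying collection has billions.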

The eternal nature of web services introduces a unique challenge: all web programming is live programming. We cannot ever stop or shut down the web. The desideratum for applications and services on the web is typically 99.999% uptime, which means about five minutes of downtime per year. This suggests we should favor a language (or programming model, at the assembly level) that is a suitable basis for live programming - one we can reason about in the 'continuous present', i.e. just from visible state and visible code. There are many models that are good for this: dataflow, flow-based, procedural, rules-based, temporal logic. OOP is not one of them; objects tend to escape their creators, which makes it difficult to reason about the state of the application in terms of visible state and code. The best models for live programming tend to 'externalize' state, i.e. model it as belonging to the environment (filesystem, database, abstract register machine, whatever).
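
A toy sketch of externalized state (the store here is a plain dict; in practice it would be a database or similar): because the step functions own no hidden state, new code can be swapped in between steps without stopping anything, and the whole system state stays visible.

    state = {"count": 0}              # state belongs to the environment

    def step_v1(store, event):
        store["count"] += 1

    def step_v2(store, event):        # a live upgrade: new code, same state
        store["count"] += 2

    current_step = step_v1
    for event in ["a", "b", "c"]:
        current_step(state, event)
        if event == "a":
            current_step = step_v2    # swapped mid-run; nothing restarts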

The web is ultimately about discovering services and providing them. Today, we are stuck with humans in the middle of the linking process - i.e. we search for services and libraries with Google, then integrate them by hand. If we are to have true 'web assembly' languages, we need support for 'web assembly linkers' - linkers that can locate dependencies on the web and provide services to it. Providing this securely is something I've been pursuing (linkers as constraint solvers, networks of registries). This is clearly most relevant at the programmer-code level, but I expect some de-facto standardization will be required to make it work consistently and securely. (Preferably, linking will be deterministic. However, we might need to settle for 'stable' in order to avoid transitioning state between dependencies too often.)
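
A toy version of the 'stable linking' idea (registries are stubbed as dicts; real ones would be networked, and version ordering would be smarter than a string sort): candidates are ordered by a fixed rule so the same registries always link the same way, and a previously linked choice is kept while it still satisfies the request.

    registries = [
        {"json": ["1.0", "1.2"]},
        {"json": ["1.1"], "http": ["2.0"]},
    ]

    def resolve(name, previous=None):
        candidates = sorted(
            version for reg in registries for version in reg.get(name, [])
        )
        if not candidates:
            raise LookupError(f"no registry provides {name!r}")
        if previous in candidates:
            return previous        # 'stable': keep the existing link
        return candidates[-1]      # deterministic: highest version wins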

Good luck with the implementation, I'm looking forward to trying it out.

I've considered the above issues and more (e.g. enabling continuous innovation and legacy integration). I crunched these requirements for years, going through one design and prototype after another... and, with the benefit of hindsight, all of them were complicated, ugly brutes.

I figuratively stumbled across RDP while doing my 2010 taxes. This felt more like discovery than design, though I doubt I'd have understood it without my prior efforts (most critically, I was trying to reconcile my work on FFI integration for dataflow with both (a) multi-cast optimizations, and (b) a distributed equivalent to dynamic scope). Since its discovery, though, I've certainly been doing a lot of work to make it formal, precise, usable.

RDP seems to handle everything I throw at it far more effectively than I had imagined was feasible - at least at a high level, on paper. It has only improved as I've formalized it. Yet it is so deceptively simple that I can scarcely believe it myself.

I hope to have the implementation finished and tested with various apps by October.

Orchestration Languages

DSLs are certainly part of the missing piece - we'll always benefit from dedicated languages for UI, device control, video processing, music and sound generation, animation, text layout, symbolic math, and so on. The idea is to support DSLs the way we support libraries today.

But that still leaves the bit of making all these things work together. To make a robot dance, you need to detect rhythm in one domain and control actuators in another. How do we glue these things together?

Orchestration is the missing piece you're looking for. I estimate that about 90% of code in large projects is 'adapter' code that ineffectually glues together independently developed domain and data models. If we have a good language for this part of the task, we really can have a silver bullet.

Orchestration languages are based heavily around communication, composition, scatter-gather, data fusion, concurrency, consistency, synchronization, resource management, discovery, failure handling, configuration and policy management, open extension, dataflow, and workflow. Importantly, they must be bi-directional - able to push as well as pull - so they can gather data from sensors and then manipulate actuators and databases.
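
To make one item on that list concrete, here is a scatter-gather sketch using Python's standard library (the service functions named in the comment are placeholders for, say, a rhythm detector and an actuator planner):

    from concurrent.futures import ThreadPoolExecutor

    def scatter_gather(request, services):
        with ThreadPoolExecutor() as pool:
            # Scatter: query every service concurrently.
            futures = [pool.submit(service, request) for service in services]
            # Gather: fuse the responses into a single result.
            return [f.result() for f in futures]

    # e.g. scatter_gather(song, [detect_rhythm, plan_dance_moves])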

A lot of orchestration languages have been developed. But I think we can do a lot better.

If developers build DSLs above a very good orchestration language, then those languages will work together easily - by construction, because the DSL designers were constrained by the orchestration language. We'll be able to avoid 80% of the adapter code and abstract away at least half of the remainder. Since we started with 90% adapter code and 10% domain-specific code, this would leave roughly a 50/50 split and cut us down to about 1/5 of the code we have today. Further improvements would require simplifying the DSLs themselves, making them easier to integrate.
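
Spelling out that arithmetic (taking the estimates above at face value):

    adapter code: 90% x (1 - 0.80) = 18%, then halved by abstraction -> 9%
    domain code:  10%
    total:        9% + 10% = 19% of today's code (about 1/5),
                  split roughly 50/50 between adapter and domain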

I've been doing a lot of work toward excellent orchestration languages, with my reactive demand programming model and further ideas I'd like to build atop it - such as generative-grammar-based programming (which supports complex planning, execution, and reactive replanning across domains) and content-centric matchmakers (even for linkers) as a basis for program modularity.

Orchestration is the new Scripting

I caught the orchestration bug working with the Orc language(1), but my view has expanded with greater exposure to the working practice and history of program orchestration. The vital upshot of that second revelation is that 'orchestration' is yet another new term for concepts that have been kicking around since the dawn of computing. The only way out of such reinvent-the-wheel quagmires seems to be to solve the abstraction-layering problem somehow. Plan 9's "everything is a file" and Unix pipes' "everything reads/writes stdin/stdout" appear to be successful examples. I think it behooves us to follow these patterns and keep finding excellent layers of abstraction. It is quite possibly this kind of orchestration power that has let "scripting languages" and their features grow into everyday programming practice.

With respect to history, I've noted the deep orchestration primitives evident in old, lasting automation technologies. And I'm talking real old: I've recently been exposed to JCL, the 60s-vintage mainframe job control language. It's all about orchestrating input, output, and program allocation - in painful detail, of course, but that's exactly the point; back in the day you couldn't get away with high-level abstractions because of tight hardware limitations. As those limitations eased, we got the higher-level scripting facilities of the shells and the first generation of scripting languages. And don't forget the declarative models: Make is a build orchestration system. It lacks clean abstractions beyond a certain point, which prompted other build tools and then higher-level build-context orchestration systems like Jenkins.

So I like to see work on orchestration systems, and I should add the systems you mentioned to my to-look-at list. But I think the orchestration problem is big and broad enough to need clear delimitation: what abstractions are you targeting? Any system of nontrivial complexity will end up needing multiple layers of context-appropriate orchestration.

(1) Which can now be taken for a spin: http://orc.csres.utexas.edu/