Live programming in APX, an early peek


wow

super cool stuff. the mouse cursor is blocking the text for a while there at the start. why did the (o,o,o) circle appear as it did? is it guaranteed to always default to something visible? even if that means it doesn't always generate the exact same default values for the circle? i really do not like how the text below where you are editing keeps jumping up and down. super cool stuff.

why did the (o,o,o) circle

why did the (o,o,o) circle appear as it did? is it guaranteed to always default to something visible?

That was code completion. And...no, the '?' symbol (rendered here as o) represents a hole that is to be filled with a default (or random) value if possible. It just happens that the arguments of a shape can all be filled. The defaults also depend on the inferred type: even though a radius is a number just as a lerp percent is a number, a ? typed as a radius will lead to a number that is adequate in magnitude to be a radius, while a ? typed as a percent will be between 0 and 1 (probably 0.5). Positions are random but depend on the dimensions of the screen, so if you draw multiple shapes with default values, they will not wind up in the same place.
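
Roughly, in Python (illustrative only; the type names, ranges, and screen size here are placeholders, not APX internals):

import random

SCREEN_W, SCREEN_H = 800, 600

def default_for_hole(inferred_type):
    if inferred_type == "Percent":    # a lerp percent defaults to the midpoint
        return 0.5
    if inferred_type == "Radius":     # a magnitude adequate for a radius
        return 20.0
    if inferred_type == "Position":   # random, but always on screen
        return (random.uniform(0, SCREEN_W), random.uniform(0, SCREEN_H))
    return 0                          # fallback for unknown types

# Two shapes drawn with default positions land in different places:
print(default_for_hole("Position"))
print(default_for_hole("Position"))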

i really do not like how the text below where you are editing keeps jumping up and down.

Ya, vertical jumpiness takes getting used to and might be forever annoying. I took care of showing error messages with a delta, but allocating the v-space occurs as soon as the error occurs...even if you never see it! Maybe I just need to try harder, or we can put the live feedback text somewhere else...like to the side. Or maybe even overlay it with the next line when the line is selected. There are lots of possibilities in the design space, but it is important to get the conversation started.

Looks cool

The effect of drawing other frames at lower opacity -- is that built-in or written in the language? Also, the code for 'inflect' and the 'InflectUtil' that hangs around through the entire session... what does that do? How is it "called"?

The strobe effect is built

The strobe effect is built into the UI library. Glitch will save trace state over time, which can be used to redraw frames at various times in various states (when we scrub through time, we are just redrawing the main frame at the newly scrubbed time). So strobing is an easy feature to support since execution is already indexed to time (likewise for other visualizations like graphing values over time).
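
As a minimal sketch (Python, assuming a trace keyed by tick; not Glitch's actual internals), strobing reduces to replaying a few saved times at decreasing opacity before drawing the current frame:

def draw_strobe(trace, now, render, count=5):
    # trace: dict mapping tick -> frame state; render(state, opacity) draws one frame
    for i in range(count, 0, -1):          # i == count is the oldest strobe
        t = now - i
        if t in trace:
            render(trace[t], (count - i + 1) / (count + 1) * 0.5)   # older -> fainter
    render(trace[now], 1.0)                # the current frame at full opacity

trace = {t: {"x": t} for t in range(10)}
draw_strobe(trace, 9, lambda state, opacity: print(state, round(opacity, 2)))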

InflectUtil is used towards the end of the demo to create the bounce (called at "new.inflect1(p1, ....)"); it is used to "inflect" velocity when position reaches a wall. The functionality was too niche to build in as a standard function (we'd really want full-on collision detection and reaction, but that would be hard to do in a short demo). I also could have written it in real time, but it wouldn't have been very interesting.
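
Roughly, per axis, it does something like this (a Python sketch of the described behavior; the real InflectUtil is APX code and its exact signature differs):

def inflect(pos, vel, low, high):
    # reflect position back into [low, high] and reverse velocity at a wall
    if pos < low:
        return low + (low - pos), -vel
    if pos > high:
        return high - (pos - high), -vel
    return pos, vel

# e.g. a ball that stepped past the right wall bounces back:
x, vx = inflect(810.0, 5.0, 0.0, 800.0)   # -> (790.0, -5.0)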

But would it be possible to

But would it be possible to build the strobe effect from inside Glitch? Is there a way for one computation to observe another computation as being time-indexed?

I saw the bouncing part, but missed the call to inflect (with no voice overlay, I was just hopping around). Since it was there from the beginning, I thought it was always doing something. It does look like reflection code, though. I should have put that together.

In the language there is no

In the language there is no way for computations to observe different times; i.e. if you want access to the past, you'll need to save it appropriately (this isn't FRP). The only reason we can strobe at all is by going through Glitch's internals.

Inflect really should have been a couple of methods, but since Type-less doesn't support generic methods yet, I had to settle for a generic trait instead (and this genericity isn't really needed either).

bool scrubbing

Here is another short video that demonstrates why you'd ever want to scrub boolean constants:

https://www.youtube.com/watch?v=ctT7WIPeHjg

Hanging out the passenger side of his best friend's ride

What does scrub mean in this context?

Digital music/video

It comes from digital music/video tools. Originally, it just applied to time (scrubbing back and forth between multiple frames), but it makes sense to use the term for values also, since the experience is quite similar. I got the term from Bret Victor (http://worrydream.com/ScrubbingCalculator/), so I guess it is canon at this point.

It might not be the right term; tweaking might be more appropriate, since definition-wise it is probably the closest in meaning. I also thought about using the term twerking, seriously...ya, sure.

Well. Scrubbing made

Well. Scrubbing made immediate sense to me. So I would keep it.

fwiw: makes sense to me

For what it's worth, I immediately knew what you meant by "scrubbing". I believe it's a pretty well understood term and your usage seems correct.

scrubbing

Scrubbing refers to the timeline, but in Maya and similar tools you can use sliders on any numeric attribute. In addition, you can set up so-called "set-driven keys" which allow you to treat an attribute as if it were a timeline. See for example this video.

Good point! On the other

Good point! On the other hand, scrubbing on code does not navigate through the execution space, but the solution space. We probably want spatial exploratory scrubbing also, but I'm not sure how to do that beyond using scrubbing to configure which loop iteration has its execution projected in the editor (which is already pretty cool, but...a program is big).

Scrubbing (audio)

I think the term has to do with scrubbing an audio tape across a playback head to better navigate it. The difference between true and false is displayed reactively as the cursor moves up and down, even before the user has ceremoniously decided on a selection.

cool

1) please figure out how to get youtube to not put up really scary horrible things for what it thinks are related videos.

2) i wish the rendering would have (the option of showing) a quickly animated fading out of what code change caused the rendering to change. ideally the diff on the graphics would have a red circle around it and a text bubble with the name of the variable that changed. having the giant gutter in the middle is sort of making things be too far apart.

(1) is based on your own

(1) is based on your own viewing history, so I can't help you there.

(2) is a more difficult problem since lots of things change to cause the rendering to change. There are other mechanisms to debug your code, and we can create connections between rendered output and rendering commands, but not much deeper than that (you'll have to navigate!).

re: (1)

i never knew i wanted so badly to buy gold bullion and bulk up. guess i should listen to that electronic eliza.

If you don't have much of a

If you don't have much of a viewing history, then it might just be giving you default recommendations...welcome to the masses.

Visual Studio can show you

Visual Studio can show you by what code a given HTML element was generated, and vice versa which HTML elements were generated by a given piece of code. Video demonstration. Maybe something like that could work if you manage to generalize it to all code. Whyline/omniscient debugging may be the answer.

Very nice demonstration you've got by the way!

Navigation is easy to do if

Navigation is easy to do if the relationships are direct. That circle on the screen can be mapped to the draw circle command....we can even "jump" to the execution context.

Diffing and animating execution, however, is an unsolved problem.

What I want to look at in the future is branching time so we can compare different forked executions side by side. That gets really close to Repenning's conversational programming model.

Loop debugging

Scrubbing through the iterations of a loop while debugging a method called within. Scrubbing immediately changes the focus of the method being debugged, as you would expect!

https://www.youtube.com/watch?v=WgRGKQ-BdQA

Very interesting. How do you

Very interesting. How do you determine which context to show for things inside the loop body when you switch iterations? What if foo has a loop in it too? I'm thinking about bigger programs where control flow in different iterations might be totally different, and even call totally different functions.

For a method call with a

For a method call with a postfix in a loop where nothing exists for it, you just won't find anything from the execution: the probes are set up, but come back empty after trying to fill them. Not the best failure condition, but sufficiently robust right now, I guess. It would probably be nice to highlight what is active or not, but right now that only applies to if statements.

For a loop, if the selected iteration disappears, it will just default back to the first one.
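
In Python terms, the behavior is roughly (assumed semantics, not the actual probe implementation):

def probe_values(trace, call_site, iteration):
    # trace maps (call site, iteration) -> observed values; empty if never run
    return trace.get((call_site, iteration), [])

def select_iteration(trace, call_site, selected):
    if probe_values(trace, call_site, selected):
        return selected
    return 0   # the selected iteration disappeared: default back to the first

trace = {("foo@line3", 0): [1, 2], ("foo@line3", 1): [3]}
print(select_iteration(trace, "foo@line3", 7))   # 0: iteration 7 no longer exists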

Live programming with

Live programming with multiple bouncing balls...the effect is kind of cool even if performance needs to be improved:

https://www.youtube.com/watch?v=e3LqsIxVax8

Essay is up, please no

Essay is up, please no social sharing:

A Live Programming Experience

Type system motivation unsatisfying

I like the essay, but the motivation for your type system will raise red flags among people familiar with type inference.

Static types offer a lot to live programming in enabling code completion, making sense of live feedback, and boosting performance (responsive live feedback does not come cheap!). However, existing type systems are not well suited to ad-hoc live programming where types, and even constructs being used, are unknown for long periods of time. Even languages that support aggressive type inference, such as Haskell, enforce a linear progression where types are first defined and then used immutably. Terms have types based on how constructs are defined, and therefore something that is unknown necessarily lacks a type.

APX includes a type system called Type-less that is based on "backward" type inference to handle programs with incomplete type information that changes incrementally. By backwards, we mean that the type inferred for a term is based on usage rather than definition: X is typed as a duck because it needs to quack as opposed to typing X as a duck because it is defined to quack. Going backwards allows for open ◌ terms that are unresolved but still have inferred types based on usage. These types can then be used for code completion, scrubbing, or to provide default values that enable earlier execution.
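
(To illustrate the quoted idea concretely -- a rough Python sketch, not Type-less's actual algorithm -- usage adds required traits to a term's type, so types grow from use sites rather than being fixed by definitions:)

class Term:
    def __init__(self, name):
        self.name = name
        self.required = set()    # traits required by usage so far

def use_method(term, method, trait_of):
    # trait_of: method name -> the trait defining it (one namespace, no overloading)
    term.required.add(trait_of[method])

x = Term("x")
use_method(x, "quack", {"quack": "Duck"})
print(x.required)    # {'Duck'}: x is a duck because it needs to quack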

The description of existing type systems is not accurate. In particular, ML systems can perfectly well handle unknown expressions, by assigning them a flexible inference variable and letting unification do the work. For example, the Merlin editor-assistant for OCaml makes use of that to provide type feedback for incomplete or partially-incorrect editor buffers (in fact just running the standard OCaml type-checker on incorrect code will produce partial typing information on the missing/wrong parts, but before Merlin there was no tool to observe and exploit it, just a type error message). In particular, the sentence "Terms have types based on how constructs are defined, and therefore something that is unknown necessarily lacks a type." is recognizably wrong. For example, un-annotated function parameters start with an unknown type, that gets refined by unification on their use-site.

If the type system also has some form of structural sums or records (for example polymorphic variants or objects, using row variables), each use site may contribute a few possible fields/methods or cases to the known type of the variable. To someone that is not familiar with your own work, this would seem to significantly overlap the description you make of a specificity of APX, namely "usage defines objects rather than object definition restricting usage [, which] avoids pre-commitment".

Unfortunately I did not invest enough time to properly understand your Type-less work, so I could not comment on whether those existing type systems could work in your context of live programming. Clearly they would behave differently; for example, you explain that you select ThingA when only .animate is called, even though ThingB also has an .animate method -- a non-principal choice that row-based type inference would not make by itself. (It looks like you use a structural-type-friendly inference method in a nominal system, with some heuristics to bridge the gap. I'm ready to believe nominal is important for your use-cases and the heuristics are reasonable.)

In any case, you have to change the description and motivation of the type system, because as written it misrepresents existing systems in a way that could make your readers think you haven't properly done your related-work homework. It's just a matter of presentation: you should either say more about the difference, to avoid simplified ideas that turn out wrong, or claim less about what existing systems cannot do -- but then it's harder to justify why you went to the trouble of designing your own different thing from scratch.

Also:

Additionally, Type-less includes consistency checks to ensure that conflicting traits are not extended by the same term; e.g. using a number as a shape leads to a type error since 'Shape' and 'Number' cannot co-exist. This check covers many, but not all, of the errors that forward type checking would catch.

At this point I found myself wishing for examples of the claim in the last sentence (both examples of errors caught, and of errors not caught, to get a concrete sense of your "many, but not all" phrase). I understand that could break the flow of the article, so maybe as a footnote or optional tooltip? Generally I was happy with the degree of (im)precision of the text (it did not feel handwavy), but this particular sentence felt under-commented.

To further improve performance, the VTM dynamically compiles APX code using the Dynamic Language Runtime, unboxing and inlining operations on primitive values according to inferred and explicit type information. The result is encouraging even if there is probably still some way to go.

Again, I would expect more details there. (Your earlier point about types being used for unboxing already had some small and natural suspension of disbelief: I would spontaneously assume that boxing does not matter here). Could you measure the time (or maybe the median latency between ticks, or what not) needed on a particular scripted interaction, with and without boxing/inling, to demonstrate that this is not negligible with respect to the other performance costs? (Again, in a tooltip; but a couple representative numbers could be given in the article text itself and would not hurt.) Alternatively (if you don't want to be bothered with performance measurements for type feedback), you may also not talk about unboxing and just mention that you use the DLR and that it is fast enough.

One of the few pieces of good "programming style" advice I have found that I hadn't thought about before is "don't mix different abstraction levels in a single place". I think this also applies to writing. When you are reading the description of a nice interactive visualization of programs, there is something slightly shocking in suddenly reading "oh and, by the way, this programming pattern reduces register pressure" that encourages the reader to believe the claim is actually wrong. (Maybe this is also a reflex acquired by coping with the mistake many beginners make, of claiming at the most improper times that their choice of using `while` rather than `for` loops (or `if` rather than `switch` or whatever) was for efficiency reasons.) When you perform those kinds of jumps between abstraction levels (especially jumping down), you had better have solid data to justify the unpleasant experience.

All in all, I like the essay. I know it may not have been designed for this purpose, but I hope you submit it to some major programming language conference (or several, sequentially), and it gets discussed there! It might help to highlight near the beginning what the contributions are (I would include the contributions from previous unpublished essays, and even possibly mention Bret Victor's innovations eg. "We present Scrubbing and Strobing, introduced as program-visualization techniques by Bret Victor, justify their use cases, and integrate them as general-purpose tools in a programming system"), and to have an implementation available for reviewers to try.

Types: I don't think you can

Types: I don't think you can add members to types in Haskell/OCaml after they have been inferred on terms, not without a recompile. Also, none of those support retractions, or using a type before it's been defined. In fact, none of that even makes sense in the context of those systems; they just aren't focused on interactive use beyond linear REPLs.

I definitely need to write another essay just about Type-less: it is definitely ready for it, and I could fill it with examples like why shape/number are not compatible (I would actually explain exclusive inheritance at that point), self types and f-binding, and inference of type parameter bounds in any context. This type system work really paid off, and now it's easy to just forget about it.

Unboxing is huge at least for physics; my last prototype from the programming with managed time work had to hack it in without the DLR or static types; it was just necessary (and unboxing without static types requires specializing each execution, yuck!). How do you compare non-functioning with functioning? And when unboxing is such a standard optimization these days, why do you need to do more than just mention that it must be done? But really, I want to mention unboxing because it is a nice motivation for types and we actually do it (type-based optimizations are not some potential thing, it is a real thing that is done).

At the end of the day, what you are asking for are comparisons between my own prototypes, and that would just be dishonest. I see people comparing against themselves when they have novel systems with no true points of comparison to appear rigorous, and there is nothing to learn in those tables! My goal is to build a working system, and I have some tricks to eliminate some of the enormous overhead of dependency tracing, versioning, and re-execution (slow/fast paths). But until I claim my X is faster than someone else's Y, there is no point in talking about numbers without any point of comparison. I know the standard is dishonesty: either record some meaningless numbers or hide the limits of your system. But no.

The PL community isn't really interested in this kind of work. I'm 100% focused on release, so no more papers until release.

I'm pushing for a product (an app...) and an OSS release, but I'm still waiting for approval (and even then, how many reviewers run Windows?). Anyways, thanks for the feedback!

Types, benchmarks

The workflow I would envision with a row-variable-based-system on the examples you show is the following:

1. Infer the type of expressions as open structural types, eg. < animate: 'a -> unit; .. > ('a denotes a yet-unknown type variable, and .. indicates that the actual type may have more methods than that). At this point you don't know which implementation of "animate" will be called and thus cannot show any visualization.

2. Inspect the set of classes known to the whole program, and use whatever heuristic you use (eg. as few extra methods as possible) to select a compatible one. You can fail where there is none, or just not show anything yet. (This corresponds to a nominalization transformation, in a sense)

3. Refine the type of the inferred code with the new type information gained from the arbitrary choice ("in fact animate expects a color"), rinse and repeat.

Not sure about your "add members to types after the fact" point, but is that demonstrated in the essay as it stands? (Is it about using new methods on a variable with a previously inferred type? Or adding new methods to an existing class? Why is recompilation a problem in the context of live programming where the program is constantly changing anyway?)

I see people comparing against themselves when they have novel systems with no true points of comparison to appear rigorous, and there is nothing to learn in those tables!

I disagree. A mistake people often make is premature optimization, in the form of implementing sophistication that was in fact not needed to reach their goal. Asking them to validate the fact that the sophistication was indeed necessary (by objectively comparing "system without trick T" with "system with trick T") is an effective way to ensure this. Of course the result you get is local: "the author was right to use T in this context". It does not tell you that all good solutions must use T (maybe there was a fundamental problem in the system, whose resolution would make T unnecessary) or that T makes the system competitive with the rest of the world; but that is not the purpose of this test. If you want me to pay attention to unboxing, you should first convince me that it actually matters.

On the other hand you are right that tables of numbers are boring when the answer is just a binary "yes, useful" or "no, does not matter". Feel free to condense table of numbers all going in the same direction into a few scalars: "adding unboxing always speed programs up, with a geometric average speedup of 5X on numeric programs".

How do you compare non-functioning with functioning?

I don't understand what you mean by "functioning" here, and I think this imprecision is problematic (or conversely, pointing out *where* performance is important for the user experience would be important). Showing real-time recording of the working software, with visible freezes in one case, would probably be enough. Or else you could try to think of a good numeric metric that reasonably reflects perceived usability (eg. just the computation time between two ticks to compare to your expected framerate), and compare this metric.

and even then, how many reviewers run Windows?

Can't you provide a virtual machine?

The type system is nominal

The type system is nominal without overloading, so there are no heuristics: just traits being applied directly. Nothing is immutable in the type system, neither the values that have inferred types nor the types themselves...I'm just not seeing that in existing type systems, which all seem to make some kind of assumption about immutable type/value relationships.

By non functioning, I mean my ball won't even bounce without unboxing and slow/fast path optimization. I learned this the hard way in my last-last prototype, and I had to work hard to get perf to a "presentable" ooh-wow level in the last prototype. When I'm not behind the wall, I can maybe dig out the YouTube videos of before and after and show you. This prototype improves on the last prototype with the type system and the DLR work, and now even more examples are "feasible" vs. painfully slow and embarrassing.

This is not premature optimization, this is blood drawn from 7 prototypes, many profiling sessions, and desperate attempts to make this "work" at all. There are good reasons why no one attempts to trace dependencies dynamically, deal with dependency graph cycles, deal with retractions, version all reads and writes, allow for arbitrary code changes without restarting the program. It is considered very expensive, and it kind of is! There are some tricks that give us hope that maybe it isn't going to be that expensive, that it will be viable.

Now I could compare against those older prototypes, but a million other things are different. I could also spend a week of work to rip out unboxing to prove how important it is, but why? .Net has unboxed since forever; the value of it in certain domains like physics is not controversial.

The slow path/fast path is probably more interesting, and I can turn it off easily, but the system will just crawl, and that raises the question: what other things did you not optimize because you could rely on this other optimization? Say we just do slow paths: then you need to make your slow paths faster. But if you have fast paths and they are the norm, making your slow paths faster won't do much for performance, and you won't be able to measure it in your profiler until the slow paths actually matter.

Again, I don't think anyone is interested in reading about made-up metrics. What is the number on a good experience? Frames per second of what? I think the colliding ball example does a good job of showing where the system begins to break down: you can see the strobes updating non-instantaneously, and getting stuck at the end where collisions actually happen.

Do virtual machines handle WPF well now? They used to have a problem with GPU accelerated graphics.

Hm

The type system is nominal without overloading, so there are no heuristics: just traits being applied directly.

I have a slow network connection and somehow cannot access the corresponding video, but the text explanation about the first set example seems to say that when you call the `.animate` method on an element of the set, then this set is inferred to implement trait `ThingA` (without any extra annotation on the user part). This is a non-principal choice (what I called "heuristic"), as trait ThingB also provides this method (with a different dynamic behaviour). Or are you relying on the fact that because ThingB inherits ThingA, all values with .animate must actually have the trait ThingA? (What would happen if you just had "ThingB :: Object"?)

By non functioning, I mean my ball won't even bounce without unboxing and slow/fast path optimization. I learned this the hard way in my last-last prototype, and I had to work hard to get perf to a "presentable" ooh-wow level in the last prototype. When I'm not behind the wall, I can maybe dig out the YouTube videos of before and after and show you. This prototype improves on the last prototype with the type system and the DLR work, and now even more examples are "feasible" vs. painfully slow and embarrassing.

There is enough meat in that explanation to be believable. Just insert it in the article!

You can grab an offline copy

You can grab an offline copy at this link (only 12 MB with all the videos):

https://onedrive.live.com/download?cid=51C4267D41507773&resid=51C4267D41507773%2111492&authkey=AMwcxdryTyPiuW8

Animate is a fixed method, so just calling it causes the iterator "t" to be a ThingA, which folds up into the set's type to refine the ItemT type parameter, which is then propagated to the new expressions. There is one namespace without overloading, so there is only one animate method in the world, and the choice is unambiguous. ThingB actually overrides the implementation of animate from ThingA, it doesn't define a new one. If ThingB didn't extend ThingA, then there would be two fresh animate methods in the world and ambiguity would arise. Now if ThingA and ThingB weren't compatible (i.e. they couldn't be extended together by the same object), then some other method call or annotation could disambiguate them. Anyways, code completion could also present 2 distinct animate methods with some other kind of disambiguator, which is where this is probably going eventually (methods are GUIDs, names are secondary mappings handled by the IDE, and then...code wiki).
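
In Python terms, the fold-up looks roughly like this (illustrative only; ItemT and the trait names come from the example above):

class SetType:
    def __init__(self):
        self.item_traits = set()    # inferred bounds on the ItemT parameter

def infer_loop_body(set_type, methods_called, trait_of):
    # t.animate() requires ThingA of the iterator, which folds into ItemT
    for m in methods_called:
        set_type.item_traits.add(trait_of[m])

s = SetType()
infer_loop_body(s, ["animate"], {"animate": "ThingA"})
print(s.item_traits)    # {'ThingA'}: any `new` expression flowing into s must satisfy this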

I'm not sure the essay reader is interested in my own blood and sweat journey, but maybe I could articulate the performance story a bit better.

Types: I don't think you can

Types: I don't think you can add members to types in Haskell/OCaml after they have been inferred on terms, not without a recompile.

Unless I'm misunderstanding what you mean, row polymorphism certainly allows this. A type is inferred with a row variable standing for the part of a record of which you aren't aware, i.e. var x : [ foo : int; bar : string | r ] means we know x has foo and bar members and some set of unknown members represented by r. You can add whatever you like to r without recompiling any code dependent on x, and adding members to x only involves compiling any code that accesses x's new member.

To appear at the Strangeloop

To appear at the Strangeloop 2015 future of programming workshop.

Very interesting read, thanks

Do you find that your "meta level" constructs (strobing frequency, overall sampling time) hold at useful levels across different programming tasks?

I'm still at the demoware

I'm still at the demoware phase so I'm honestly not sure, but you'd want at least some configurability on window and strobe sizing, I'd bet! One complicating issue: for non-physics/animated applications, events are much sparser in time. Even if you are running a physics engine, nothing will change if everything is resting. You want to size your windows (by basing them on events?) for the amount of change going on, and something at rest won't have an interesting strobe.

We'd also have to be pretty clever for strobes to make sense at all in a non UI application, but maybe that is just a matter of visualization (very interesting problem to ponder!).

Verlet physics and time travel

I implemented a physics engine over this last week and integrated it into APX. Still kind of early, I need to add position-based constraints next (the advantage of using Verlet), but it is looking really good:

https://www.youtube.com/watch?v=OLE04THYI2Y

mother

that is shaping up (ha ha) to be the mother of some - not all, but a lot of demos.

It is getting there. Still a

It is getting there. Still a lot of features to go between now and December.

Create by abstracting

A new video is up today replicating Bret Victor's Create by Abstracting feature of his Learnable Programming essay. The real action starts at 1:20, but the setup should be interesting also.

The concept works well, except for the differential part that seems to be limited to numbers only. It is worth thinking about how the technique can be extended.

Tuple selection

One thing (unrelated to the main feature of this video) that doesn't look terribly natural is when you select a whole tuple instead of just one component. You have added a green bar to visually mark the selection point, but it is still very small, much smaller than the numbers forming each component, and it looks hard to hit.

The red-bold highlighting

The red-bold highlighting makes it obvious when a tuple is selected or being targeted. It actually feels quite usable, and there seem to be no accuracy issues with hitting the tuple, at least with the mouse (again, the red bolding helps a lot as feedback). I haven't tried this with a trackpad yet, and it's definitely too small for touch!

ui from inference

This video really gives a great feel for how you leverage your inference. Having just plain numbers that are interpreted for scrubber-UI purposes as angles, positions, etc from usage seems to work quite nicely!

Thanks! These are quite

Thanks! These are quite simple inferences and I have yet to come up with a singular example that shows the full power of the type system (that it can infer complex type parameter bindings in the presence of subtyping, a feature that has so far eluded the field). Still much work to be done.

dataflow vs constraints

I've been a detractor of the notion of value-polymorphism or return-type polymorphism, e.g. in Haskell. My chief complaint was that it turns a dataflow problem into a constraint satisfaction problem without paying back its complexity cost in terms of usability benefits. However, you've definitely made me reconsider that cost/benefit analysis.

One thing that I wonder about is how you bound the constraints. When you have a one-way dataflow analysis, there's that referential transparency sort of promise that the inferred type is a local property from the immediate inputs. However, with the backwards data-flow / constraints, where does the sphere of influence end? The whole program at main? Or is there some other boundaries? Can the programmer set boundaries? Is the influence lexical/syntactic? Or can it be dynamic?

Maybe I just need to re-read your paper...

My paper is out of date,

My paper is out of date, don't read it :)

Type checking is "local" in the sense that inferences are not allowed across certain module boundaries, in this case the trait (but in the example, there are no traits, it is one module). Consider:

def foo(x : Number): 
 return

trait TraitA:
 var a
 var b
 def bar0():
  foo(a)

trait TraitB:
 def bar1(c):
  foo(c.b)

In this example, TraitA.a is inferred to require the Number trait (so its type is Number). However, TraitA.b has an empty type, and inference cannot work across TraitB into TraitA...so we get a type error. We could relax this and allow type inference to be global, there is no technical limitation at least, but keeping it local just seems like the reasonable thing to do (all type inference is local, where "local" is a bit more permissive than it is defined in other type systems).

Otherwise, the magic is that the data flow analysis is backwards as well as forwards, that it will work with constraints like "a := b" and "b := a" in any order, with certain tricks to break cycles and be incremental in the presence of non-monotonic changes. The main trick is in how to solve the constraint "a := b" in the presence of subtyping and type parameters; the traditional coercive strategy used in most parametric/subtyping type systems just plain doesn't work and something else was needed.
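
A naive Python sketch of that order-independence (a fixed-point loop; the actual solver's cycle breaking and incrementality are much more involved):

def solve(constraints, seeds):
    # constraints: (lhs, rhs) pairs meaning "lhs := rhs" (lhs absorbs rhs's traits)
    traits = {v: set(ts) for v, ts in seeds.items()}
    changed = True
    while changed:    # iterate to a fixed point; terminates even with cycles
        changed = False
        for lhs, rhs in constraints:
            add = traits.setdefault(rhs, set()) - traits.setdefault(lhs, set())
            if add:
                traits[lhs] |= add
                changed = True
    return traits

# "a := b" and "b := a" together form a cycle, yet still converge:
print(solve([("a", "b"), ("b", "a")], {"b": {"Number"}}))   # both get {'Number'}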

You have arrived

Sean, this is my favourite video so far. Live programming meets micro-refactoring! A very compelling combination...looking forward to seeing how far you can push this.

Thanks! Check out the one

Thanks! Check out the one below also. Still a long way to go, the next step is to think outside the 2D box :)

Proposing generalizations

Maybe this is what you meant by "the differential part", but one interesting thing I saw in that video was dragging a variable into a constant and having it speculate as to the generalizing relationship. So x = 80 dragged into 40 would speculate that 40 should be generalized as x - 40. But then it appeared you could select alternatives from a dropdown list, such as x / 2. I guess this is basically a limited form of "programming by example" which more generally would consider several input -> output pairs when making a suggestion. I can see ML and search augmenting programming environments in powerful ways if these kinds of tools can be made to work well.
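
For the simple arithmetic case, the speculation could be as plain as this (a Python sketch of an assumed mechanism, consistent with the 80-over-40 example):

def candidates(name, value, target):
    # propose expressions over `name` whose current value maps value -> target
    cands = []
    cands.append("%s - %d" % (name, value - target) if value > target
                 else "%s + %d" % (name, target - value))
    if target != 0 and value % target == 0 and value // target > 1:
        cands.append("%s / %d" % (name, value // target))
    if value != 0 and target % value == 0 and target // value > 1:
        cands.append("%s * %d" % (name, target // value))
    return cands

print(candidates("x", 80, 40))    # ['x - 40', 'x / 2']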

Yes, this is something I

Yes, this is something I want to explore further. Scrubbing is hard to generalize since most code isn't very continuous, but using run-time values to guide abstraction seems to be much more promising. You don't want just +, -, *, /, but perhaps an arbitrary existing function that takes the value being dropped and produces the value being dropped to, with some other argument. That could be kind of difficult, but hopefully types can be used to provide for more specialized selections of possible ways to move X to Y.

Direct manipulation + create by reacting

I added a direct manipulation aspect to the last example:

https://www.youtube.com/watch?v=hgpYEOmmlzc

+1


I really like that.

However, it would seem to me to make more sense to be able to drag a component out of an expression and drop it into a scope to turn it into a variable / argument, rather than having to create a variable and then link it to "pull" the subexpression out.

I thought about that, but

I thought about that, but what would the name be and what space would it occupy? Instead, I decided to build all capabilities around '?', just so it was more clear and consistent, even if less convenient. Trade offs.

If it was me (and it's not,

If it was me (and it's not, so I should probably shut up), I'd promote the ? to the name "slot" in that case - obviously this would mean assigning some unique temporary name to the expression under the hood, and an alpha-renaming step later as the expression is properly named. That's different to how you currently deal with ?, though, I guess.

In terms of space, I assume you mean scope? Again, if it was me, "in the scope where the user dropped it", assuming that's a valid scope for existing usages (and, of course, assuming you can determine that scope in your editor, and that scope actually means something in APX).

For me, the action of dragging a line to something implies more "link from this source to this target", rather than the "fill this space from here" approach you're taking.

Trade offs.

Indeed.

Holy terms

I totally get what you are saying. The problem with aiming at a scope is that scopes aren't very tangible; they don't have targets. With '?' we get around that easily, and it becomes more predictable.

They are actual hole terms, but I was thinking about calling them holy terms. They are used when you don't want to specify something yet, or when you don't care about a value: e.g. you don't want a rotation (? as an angle evaluates to zero). Slot is way too verbose and anyways doesn't have the correct meaning.

I have an implementation of

I have an implementation of such dragging somewhere. The way I did it was as follows. Consider:

def foo(list, a):
   s = 0
   for x in list:
     s += (x+a)/a

You select a subexpression, say (x+a), and drag and drop it somewhere. You can drag it here:

def foo(list, a):
   s = 0
   for x in list:
     z = x+a
     s += z/a

where z would actually be a name hole. Right when you drop the expression the cursor gets placed on z and you can type the name that you want.

You can drag it further up:

def foo(list, a):
   s = 0
   def z(x) = x+a
   for x in list:
     s += z(x)/a

Note how it got turned into a function automatically because it contains the x variable which is not in scope in the place where we dropped the expression x+a.

We can drag it even further:

def z(x,a) = x+a

def foo(list, a):
   s = 0
   for x in list:
     s += z(x,a)/a

Now even the a variable is out of scope, so it gets turned into a function with two parameters. You can do the reverse by dragging def z(x,a) back into the inner scope.

Ya, this is a decent option.

Ya, this is a decent option. But the fluidity isn't good: when you drag to a scope, that scope has to have a target, but let's assume you can just highlight the line. Then a name has to be typed at drop (this requires a mode), or a name has to be made up and renamed later, which is better (no mode, I believe Bret does this) but awkward. Separating val/def allocation and code movement into two steps solves this problem a bit, and allows for some reuse (I can always clone code by dragging from a hole to the code to be cloned).

Also note that connecting the return hole of a def sucks up everything below it, down to the selected expression; nothing fancy is going on, like a dependency analysis to determine what code to move or not. It really isn't meant so much as refactoring, but as a simple ability to build abstract methods backwards from concrete examples. So the UX tries to optimize just for that use case.

The way I view it is that

The way I view it is that you want to be able to drag code around anyway, so we might as well make it produce sensible code in more cases.

When you drag the expression you can highlight the drop site as a horizontal line between two existing lines. Another option is to move the other code around while dragging to show the space where the new code will be dropped (like this).

There is no modal editing for the name. Names are just for programmers, like comments. When you drag an expression a new variable is created without a name, but the variable does have an identity. You can later name that variable at any time. Putting the cursor on the variable name when dropping is just a convenience. Note that this model makes rename refactorings automatic. When you rename a variable you are just editing the name of the variable, and that name is displayed at the definition site but also at the use sites.

Maybe for the touch version?

Maybe for the touch version? The editor is not fully structured, and I'm doing a lot of unsafe transformations in movement (it finds the expression, but doesn't consider dependencies; making a method just sucks in everything below the empty method down to what the user selected). As I push into more and more features that I didn't anticipate 6 months ago...the implementation and design become more ad hoc (obvious in hindsight).

Talk dry run

https://www.youtube.com/watch?v=NJTssvFWJt0

The length is about right, but I'm not sure the story really works. Will have to work a bit more at it (and fix bugs).

Strange loop

Strange loop talk:

https://www.youtube.com/watch?v=YLrdhFEAiqo

And that closes out this topic!

Managed consistency

Yay! I like your observation that a pair of continuously-valued things is a continuously-valued thing...that naturally generalises to arbitrary n, perhaps that's what you meant by breaking out of the 2D box?

I enjoyed the scrubbing on the bounds of the loop. Does this rely on a '?' of type Color being chosen non-deterministically in each execution context?

The spatial/temporal angle is interesting...'onTick' gives you a global temporal index and then the enclosing 'for' loop adds a spatial dimension. Although you describe it as an imperative language, I find myself thinking of the 'for' as a universal quantifier, rather than an imperative loop.

I didn't quite understand the direct manipulation part (although I loved the velocity vector example). What is it about an expression on the left that makes it directly manipulable via a draggable circle on the right?

Finally, I like "managed consistency" in preference to "managed time". Managed time was cute but not actually very meaningful.

Good work!

"Breaking out of the 2D box"

"Breaking out of the 2D box" basically means being able to scrub on an infinite number of abstractions in a continuum. Right now, we can only really scrub on values (continuous or finite) or a small number of fixed abstractions. Without this, live programming just gives us a fancy navigable debugging experience for non-UI or numeric applications (and even for UI and numeric applications, it is limited).

? is a random color but is stable with respect to the execution path of the token. So each iteration of the loop re-executes with the same random color, while different iterations have different random colors.
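
Mechanically, that stability could look like this (a Python sketch of an assumed implementation: seed the color from a hash of the execution path):

import hashlib, random

def hole_color(execution_path):
    seed = int(hashlib.md5(repr(execution_path).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return (rng.random(), rng.random(), rng.random())    # r, g, b in [0, 1]

print(hole_color(("onTick", "for", 3)) == hole_color(("onTick", "for", 3)))   # True
print(hole_color(("onTick", "for", 3)) == hole_color(("onTick", "for", 4)))   # False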

Declarative binding/fact assertion and imperative assignment/side effects aren't so far away with managed time semantics (assign once assignment, retractable effects). It is mostly a matter of syntax at that point.

If an expression has a constant hole in it (or a holey term), then you can manipulate it. It is actually a bit more complicated than that, and there is some walking of expressions to determine if an expression can be manipulated, and if so, what part of the expression can be manipulated (basically, it is the problem of 2-way data binding).
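
A toy version of that walk (Python, one hole and + and * only; the real APX logic is more subtle):

def contains_hole(e):
    if not isinstance(e, tuple):
        return False
    return e[0] == "hole" or contains_hole(e[1]) or contains_hole(e[2])

def evaluate(e):
    if not isinstance(e, tuple):
        return e
    op, a, b = e
    return evaluate(a) + evaluate(b) if op == "+" else evaluate(a) * evaluate(b)

def solve(e, target):
    # walk toward the hole, inverting the arithmetic around it
    if isinstance(e, tuple) and e[0] == "hole":
        return target
    op, a, b = e
    known, unknown = (evaluate(b), a) if contains_hole(a) else (evaluate(a), b)
    return solve(unknown, target - known if op == "+" else target / known)

# position = ? * 2 + 10; dragging the shape to x = 110 writes 50 into the hole:
print(solve(("+", ("*", ("hole",), 2), 10), 110))    # 50.0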

I still prefer managed time in a conversational context, since managed consistency introduces too many questions even if it is clearer. I'm thinking about moving everything on to Jefferson's "virtual time" term and "virtual time machine" (like Timewarp). But at this point it probably wouldn't be very helpful (maybe I should just go with "React for Compilers").

Without this, live

Without this, live programming just gives us a fancy navigable debugging experience

For what it's worth, I found this probing and logging by far the most amazing part of the work and I wouldn't call it "just a fancy navigable debugging experience". I found it the most interesting precisely because it generalizes to any programming task. The scrubbing is cool and also generalizes quite well since almost all values either come from some small finite set or a number.

Changing the code by manipulating its output could be kind of interesting for GUI layout, but I think a better language for describing GUI layout is ultimately a better solution than dragging GUI elements around to pixel precision, especially since screen sizes vary and UIs need to adapt to screen size. Dragging an expression on top of a constant to turn it into a formula seems least interesting to me, because I don't think that generalizes at all. No offense to Bret Victor.

Do you have other ideas like logging/probing/scrubbing that generalize to any domain?

One thing I could think of is like the Whyline: for a given expression you often want to know what influences it (i.e. backwards dataflow), and what it influences (i.e. forwards dataflow). I think this would integrate particularly well with scrubbing/probing. Backwards dataflow is: for the expression under the cursor, what are the values/expressions that, if I scrubbed them, would change the value of this expression? Forwards dataflow is: if I scrubbed this expression under the cursor, what are the expressions/probes that would change?

This could extend to the output and log output: highlight the parts of the output and the log entries that would change. For backwards dataflow you could show, for the pixel/object under the mouse cursor, which expressions affected it. Conversely, if you had an "input log", i.e. a log that shows all mouse clicks and other events that happened, then for backwards dataflow you could highlight the entries in that log that caused some expression to have a particular value, and for forwards dataflow, if you put your cursor on an entry in the input log, you see the expressions/output affected by it.
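
A skeleton of those two queries (Python, assuming execution records which expressions each expression read from):

from collections import defaultdict

reads = defaultdict(set)    # reader expression -> the expressions it read

def backward(expr, seen=None):
    # everything that, if scrubbed, could change expr
    seen = set() if seen is None else seen
    for w in reads[expr]:
        if w not in seen:
            seen.add(w)
            backward(w, seen)
    return seen

def forward(expr, seen=None):
    # everything that could change if expr were scrubbed
    seen = set() if seen is None else seen
    for reader, writers in list(reads.items()):
        if expr in writers and reader not in seen:
            seen.add(reader)
            forward(reader, seen)
    return seen

# v = a + b; w = v * 2
reads["v"] |= {"a", "b"}; reads["w"].add("v")
print(backward("w"))    # {'v', 'a', 'b'}
print(forward("a"))     # {'v', 'w'}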

Changing code by manipulating output is hard

Jules, I agree with many of your points, but I would call them "fancy debugging tools." However, I think this is a situation where quantitative improvements in tools can lead to a qualitatively different experience.

I'm still not particularly excited about the scrubbing. Sean, look at the part of the video where you quickly change the location of a shape. You declare that it would be better if this were continuous and proceed to spend an order of magnitude more time trying to make the same change with the mouse. And coordinates are going to be something relatively well suited to scrubbing.

IMO, the important part of the technology is the way it enables feedback oriented development. But I doubt this is ever going to be something that comes working out of the box. "Figure out what code to change to get this output" doesn't scale, as far as I can tell. It quickly becomes a greatly underspecified and intractable constraint satisfaction problem. As language designers, I think the best we can do is make it easy to build custom visualizations and interactions. Automated visualization isn't really feasible.

If you've done some

If you've done some front-end web development you are familiar with the cycle of changing a value in the CSS and refreshing the page, then changing it again and repeat until you get something aesthetically pleasing enough. That's one situation where scrubbing could be a big help, but it's very domain specific.

In a more general context I think scrubbing is most useful not for finding the right values for constants but for understanding code. When you have a function that turns input into output (not necessarily numeric) you may want to try out various inputs and see the output. Often you even just want to know whether changing some part of the input has effect on a given part of the output at all. Scrubbing streamlines that process, but if you look at this from a bit broader viewpoint then what you actually want is a good editor widget for inputs, and automatic re-computation of the output. Scrubbing numbers or values from a finite set is just one example of such a widget.

By the way, I meant that "just" in the sentence "just a fancy navigable debugging experience" is overly humble :)

Agreed

changing a value in the CSS and refreshing the page

Sure, sliders are sometimes useful UIs. If using a slider is the easiest way to tweak your CSS, then use one. I agree it should be trivially easy to place one and have it control the look of your UI.

When you have a function that turns input into output (not necessarily numeric) you may want to try out various inputs and see the output.

But I don't necessarily want to use scrubbing to do that. I may want to embed a view of 5 different interesting cases side by side. Or maybe I want to watch an animation. Or view a graph that allows me to comprehend a trend or behavior at a glance.

if you look at this from a bit broader viewpoint then what you actually want is a good editor widget for inputs

Yes, agreed.

The other point I was making is that I think the path from a UI manipulation to an effect on the program should be more straightforward than having a constraint solver search the program for holes that could be changed to achieve it. I don't think that scales well. Rather, when you're defining the UI, you should specify (perhaps implicitly) what the effect of an edit should be. A constraint solver (if available) should IMO be invoked explicitly if desired.

By the way, I meant that "just" in the sentence "just a fancy navigable debugging experience" is overly humble :)

Fair enough :).

I didn't add snapping to

I didn't add snapping to scrubbing in the editor, but I did do snapping in the drawn output manipulation. Snapping really is essential, since you are generally trying to hit relatively round numbers when expressing constants! Otherwise, you can spread your numbers out in more vertical space, but frankly, snapping is the sweet spot. Also, it is quite different giving a presentation on stage with a laptop with a weird resolution and a substandard mouse experience, which is why I didn't really think about snapping until more recently.
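
The snapping itself can be simple; something like (a Python sketch, with assumed step sizes and tolerance):

def snap(value, steps=(100, 50, 10, 5, 1), tolerance=2.0):
    # prefer the "roundest" nearby number within a fixed tolerance
    for step in steps:
        nearest = round(value / step) * step
        if abs(value - nearest) <= tolerance:
            return nearest
    return value

print(snap(98.7))    # 100
print(snap(13.4))    # 15 (a multiple of 5 beats 13)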

This really isn't automated visualization since everything is manual: you choose to draw or log what you want...we could add more of that but the primary way for programmers to dig out something is to add code themselves. The trick is only to provide programmers with a list of options with any meaningful output at all without specifying additional code (including argument values!). We need to think more about that, but I don't think it's an impossible problem to solve...just apply more creativity :)

Backing up

I shouldn't have jumped in with my concerns without first saying that the demo looked really good and you did a great job with presentation, so please rewind and pretend I started with that :).

Regarding snapping, I'm just thinking that in my experience, I don't ever want to switch to the mouse while developing. Now I ask myself, if I'm editing a counter widget with the keyboard, will I edit it using the up / down arrows? Or will I just retype the new value I want? It's almost always the latter.

You're right that my last comment there about "automated visualizations" doesn't really apply to you. The way that you inserted a manual "draw this vector" command seems good and very general. How does the "find a hole to change" mechanism work? The way you phrased it, I had in my mind search, but maybe it's much more mechanical than that. In that case you can disregard that entire paragraph.

I do think that your time-lapse editor is too specialized. Time should be just another parameter that I get to control. An alpha blended time lapse should just be a type of widget I can choose. I certainly understand why you would start with a more concrete framework, though, for the prototype. Again, the demo looked really nice.

No worries

No worries, your critique points were all valid :)

First, what the mouse does well, the keyboard does horribly, and vice versa. The mouse gets a bad reputation because it is often abused in programming environments to do what the keyboard could do better, but when you are doing anything that involves continuous space, the mouse will have a big advantage over the keyboard! Up and down arrows work for a few movements, but are very poor substitutes even with key repeat. A horizontal touch surface built into your space bar would work much better (though just for 1D).

So the keyboard excels at discrete input, and we shouldn't try to do discrete input with anything else (well, maybe touch IF the keyboard is unavailable). Does the fact that the keyboard sucks so much at continuous input mean that we should just not explore continuous input at all in our programming experiences? Of course not! If you had something good to do with the mouse, you might use it; right now the mouse sucks for everything you want to do. BTW, I heard from someone that MS ran a study and found that people who say they never use the mouse actually use the mouse a lot during...debugging. So there is definitely a mode where at least VS programmers use the mouse a lot, and it is distinct from editing...perhaps debugging can just be replaced by something that supports more mouse-centric continuous editing? (You go through bouts of typing and bouts of mousing pretty much as you do now.)

Right now, we just go backwards looking at the expressions that led us to a value to find a constant or hole to fill in during scrubbing. It won't yet look through procedure implementation, though it definitely could and should.

We only store around the last 200 time points in the current system, so time is obviously not like any other parameter to control. Time with its arrow is an unforgiving mistress, and we must accept the fact that we have to limit the extent of time travel by forgetting the history at some point.
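
That bound is basically a ring buffer (a Python sketch of the assumed bookkeeping):

from collections import deque

class History:
    def __init__(self, limit=200):
        self.frames = deque(maxlen=limit)   # oldest frames fall off the end

    def record(self, time, state):
        self.frames.append((time, state))

    def scrub_range(self):
        return (self.frames[0][0], self.frames[-1][0]) if self.frames else None

h = History()
for t in range(500):
    h.record(t, {"x": t})
print(len(h.frames), h.scrub_range())    # 200 (300, 499)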

Don't get me wrong, fancy

Don't get me wrong, fancy debugging is cool, and it's definitely miles ahead of what we use mostly today. If I can figure out how to put that in C# or JavaScript, it would be a general win and I guess the whole live programming experience isn't necessary. Deterministic replay is a bit of a problem there, though. You would need some sort of programmer specified visualization to take advantage of it, but that is often as simple as printf! And printf can easily be dressed up into bar charts, tables, and node/edge graphs. That is done...we just have to do a bit of hard work to realize it on our day to day programming.

But I really think programming would be much easier if a full live experience with pervasive scrubbing can be realized. We have code completion, which is like a GPS, but if we could scrub through a bunch of choices quickly, observing with little latency their effect in the program, programming would be a much better fluid experience...like painting or tinkering with Legos. Generalization is the problem, but it is a hard challenge problem that allows us to focus our research.

In general, all statements are probably more or less related. The question isn't so much what, but how. However, it would be nice to go from reads to writes and vice versa. This is why I'm so down on direct use of immutable data structures (no problem using them behind the scenes, of course), they just lose so much history that we could really use! But usability is key: whenever I see this work for standard languages, people start talking about programmer intensive slicing and such...it really has to be one action to be used heavily (e.g. goto write from a read).

Changing the code by

Changing the code by manipulating its output could be kind of interesting for GUI layout, but I think a better language for describing GUI layout is ultimately a better solution than dragging GUI elements around to pixel precision, especially since screen sizes vary and UIs need to adapt to screen size.

This is actually quite easy in that you can refactor your absolute pixel code into relative pixel code, allowing more easily for responsive designs. It's like the pill shape example: we start out with a pixel-fixed set of shapes, union them, and then generalize!

Starting with the concrete and making the concrete (as an example) into something abstract is probably easier than starting with the abstract. The fact that we start with the abstract is probably why people see programming as so hard, when it isn't really necessary. Anyways, this is one of the points that Bret Victor got right that I never saw before reading his essay a few times.