Higher abstraction through NLP and automatic code derivation?

While waiting for my plane and thinking about what a VHDL/Verilog killer would look like, I had a very (un!)original idea: describe what is to be done in English, and let the killer do the code derivation.

Here is an example of a description of how I want my blob detection (a computer vision application) to be done on incoming images:

blob detection, image processing
- connect neighboring high-intensity pixels
- high-intensity - values exceeding a threshold
- neighboring pixels of an image
- 8 pixels surrounding the pixel in question
- image is a 2d array of pixels, variable width and height
- pixels come one by one
- pixels are 8 bits, but can be larger
- threshold is variable at run time
- use png files as test images

This description is enough for a human to write the code, but not for a computer, which has no understanding of any of these words.
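For comparison, here is roughly what a human might write from that list. It is only a sketch in Python rather than VHDL, and it makes assumptions the description does not state: that "connect" means counting 8-connected components, that pixels arrive in raster order, and that the width is known before the first pixel. The name `count_blobs` is made up for illustration.

```python
from typing import Iterable, List


def count_blobs(pixels: Iterable[int], width: int, threshold: int) -> int:
    """Count 8-connected blobs in a raster-order stream of pixel values."""
    parent: List[int] = []  # union-find forest over provisional labels

    def find(a: int) -> int:
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    prev_row = [-1] * width  # labels of the row above (-1 means background)
    cur_row = [-1] * width
    for i, value in enumerate(pixels):
        x = i % width
        if x == 0 and i > 0:  # a new row starts: keep only one row of history
            prev_row, cur_row = cur_row, [-1] * width
        if value <= threshold:  # "high-intensity" = strictly above the threshold
            continue
        # Of the 8 neighbours, only west, north-west, north and north-east
        # have already arrived in a raster-order stream.
        seen = [prev_row[x]]
        if x > 0:
            seen += [cur_row[x - 1], prev_row[x - 1]]
        if x + 1 < width:
            seen.append(prev_row[x + 1])
        labels = [n for n in seen if n != -1]
        if not labels:
            parent.append(len(parent))  # start a new provisional blob
            cur_row[x] = len(parent) - 1
        else:
            cur_row[x] = labels[0]
            for other in labels[1:]:
                union(labels[0], other)  # two provisional blobs turn out to touch
    # Each remaining root label is one blob.
    return sum(1 for label, p in enumerate(parent) if label == p)
```

Even this small version had to settle things the English list leaves open, such as whether a pixel equal to the threshold counts as high-intensity.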

Now my question is: is anyone working on this?
If not, why not?

To start working on this, here is what I want to do (in my spare time):
- P. Norvig's PAIP has a chapter about solving simple math questions written in English (the STUDENT program). That's my starting point for deriving some sort of meaning from an English-language description.
- Get from meaning to code. Probably, I will have to search through a code space to satisfy the meaning derived.
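To make "search through a code space" concrete, here is a toy sketch. It assumes the derived meaning has already been boiled down to input/output examples, and it enumerates tiny arithmetic expressions in `x` until one fits. The function name `search`, the constants and the operator set are all made up for illustration; a real code space would be vastly larger.

```python
import operator
from itertools import product
from typing import Callable, List, Optional, Sequence, Tuple

Program = Tuple[str, Callable[[int], int]]  # (source text, callable form)
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def search(examples: Sequence[Tuple[int, int]], max_depth: int = 2) -> Optional[Program]:
    """Return the first enumerated expression in x that fits every example."""
    level: List[Program] = [("x", lambda x: x)]
    level += [(str(c), lambda x, c=c: c) for c in (1, 2, 3)]
    for depth in range(max_depth + 1):
        for text, fn in level:
            if all(fn(i) == o for i, o in examples):
                return text, fn
        if depth < max_depth:
            # Grow the space: combine every pair of known expressions.
            level = level + [
                (f"({a} {sym} {b})", lambda x, f=f, fa=fa, fb=fb: f(fa(x), fb(x)))
                for (a, fa), (b, fb) in product(level, repeat=2)
                for sym, f in OPS.items()
            ]
    return None  # nothing in the enumerated space satisfies the examples


found = search([(1, 3), (2, 5), (3, 7)])  # examples consistent with 2*x + 1
```

The space grows from 4 to 52 to 8164 candidates in two steps, which is the main reason blind enumeration will not scale and some guidance from the derived meaning is needed.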

I'd like to hear your reasons why it is or is not possible!

Your problem really sounds

Your problem really sounds like one of machine learning. Just provide a large enough corpus of the kinds of images you want to recognize.

Even a drek hot programmer isn't going to be able to implement something very good in this area, natural language programming notwithstanding.

Blob detection was just an

Blob detection was just an example of how a user would "program". I want to write a piece of code that automatically gets from a problem description in English to a solution in a programming language (VHDL in this case).

The solution doesn't have to be perfect, super optimized, or even working, but it should be a close approximation to what a human would write.

Again, your only hope is

Again, your only hope is deep learning. The problem basically boils down to Siri.

And what about the second

And what about the second step of going from potential meaning to runnable code?

Now that is an interesting

Now that is an interesting problem. There was an architecture put out by some of the ex-Siri folks a few weeks ago. I'll have to go back and find it.

We've been thinking of the problem like this: you have an object with some is-a and has-a (sub-object) relationships constrained via the utterance, but these are not enough for the object to run: it is abstract. So go and find a combination of traits to add to the (tree of) object(s) to satisfy the utterance with a concrete object, which is then executable. Now, choosing some traits precludes using others, so you want to find (1) a "valid" solution and (2) a "best" solution. What is nice about this formulation is that it is trainable...
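A rough sketch of that formulation, under my own assumptions: the utterance has been reduced to a set of required capabilities, each trait lists what it provides and which traits it precludes, and a per-trait score stands in for whatever a trained model would prefer. All names and numbers here are hypothetical, and the search is brute force.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, Optional, Sequence, Set, Tuple


@dataclass(frozen=True)
class Trait:
    name: str
    provides: FrozenSet[str]
    excludes: FrozenSet[str] = frozenset()  # names of traits this one precludes
    score: float = 0.0  # placeholder for a learned preference


def is_valid(combo: Sequence[Trait], required: Set[str]) -> bool:
    """Valid = every required capability is covered and no trait is precluded."""
    names = {t.name for t in combo}
    provided = set().union(*(t.provides for t in combo))
    return required <= provided and all(not (t.excludes & names) for t in combo)


def best_composition(traits: Sequence[Trait], required: Set[str]) -> Optional[Tuple[Trait, ...]]:
    """Among all valid combinations, return the one with the highest total score."""
    best: Optional[Tuple[Trait, ...]] = None
    for size in range(1, len(traits) + 1):
        for combo in combinations(traits, size):
            if is_valid(combo, required) and (
                best is None or sum(t.score for t in combo) > sum(t.score for t in best)
            ):
                best = combo
    return best


# Hypothetical traits for a streaming image-processing object.
TRAITS = [
    Trait("line_buffer", frozenset({"storage"}),
          frozenset({"frame_buffer", "two_pass_labeller"}), 0.9),
    Trait("frame_buffer", frozenset({"storage"}), frozenset(), 0.3),
    Trait("two_pass_labeller", frozenset({"labelling"}), frozenset(), 0.8),
    Trait("one_pass_labeller", frozenset({"labelling"}),
          frozenset({"two_pass_labeller"}), 0.6),
]
chosen = best_composition(TRAITS, {"storage", "labelling"})
```

With these numbers the two highest-scoring traits cannot be combined, so the best valid answer pairs `line_buffer` with `one_pass_labeller`. Training would amount to adjusting the scores.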

Do I understand it correctly

Do I understand it correctly that it would require some coarse-grained, predefined/trained mappings from description to code? Something like implementations of solutions to some well-known subproblems that can be used as building blocks or templates for solving bigger problems. And the meaning->code step is done by checking for similarity in the existing set of solutions.
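If I picture that correctly, the lookup could be as crude as the sketch below: a small library of building blocks indexed by keywords, with word overlap (Jaccard similarity) picking the closest one. The library contents and block names are invented for illustration, and there is no stemming, so "pixels" does not match "pixel".

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical library: keyword set -> name of a predefined building block.
LIBRARY: Dict[FrozenSet[str], str] = {
    frozenset({"threshold", "pixel", "compare"}): "threshold_stage",
    frozenset({"connect", "neighboring", "pixel", "label"}): "connected_components_stage",
    frozenset({"read", "png", "file"}): "png_reader_stage",
}


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b) if a | b else 0.0


def closest_template(description: str) -> Tuple[str, float]:
    """Return the best-matching block and its similarity score."""
    words = frozenset(description.lower().replace("-", " ").split())
    # On a tie (including "nothing matches") max() keeps the first library entry.
    key = max(LIBRARY, key=lambda k: jaccard(words, k))
    return LIBRARY[key], jaccard(words, key)


block, score = closest_template("connect neighboring high-intensity pixels")
```

The score is low even for the right answer, which hints at how much the mapping would depend on the quality of the predefined set.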

I wonder if taking these two problems head-on is doable. Maybe it's better to start with a DSL that allows for some ambiguity but not quite as much as a natural language does.

You can implement the traits

You can implement the traits in code; it's just their composition that needs to be trainable.

Natural language can be a

Natural language can be a huge thing to define; you might encounter thousands of rules. You can simplify things a lot by using Attempto Controlled English (ACE), a version of English with a smaller set of rules.

You can see it in action here. Just click on the RACE link.
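To show why a smaller rule set helps, here is a toy parser that accepts exactly two sentence shapes and rejects everything else. This is not ACE's grammar, only an illustration of the controlled-language idea; the patterns and the `parse` function are my own invention.

```python
import re
from typing import List, Tuple

Fact = Tuple[str, str, List[str]]  # (relation, subject, objects)

IS_A = re.compile(r"^(?:an?|the) (\w+) is (?:an?|the) (\w+)\.$", re.IGNORECASE)
HAS_A = re.compile(r"^(?:an?|the) (\w+) has (.+)\.$", re.IGNORECASE)


def parse(sentence: str) -> Fact:
    """Turn one controlled sentence into an is-a or has-a fact."""
    s = sentence.strip()
    m = IS_A.match(s)
    if m:
        return ("is-a", m.group(1).lower(), [m.group(2).lower()])
    m = HAS_A.match(s)
    if m:
        # Split "a width, a height, and a depth" into its parts.
        parts = re.split(r",?\s+and\s+|,\s*", m.group(2))
        fields = [
            re.sub(r"^(?:an?|the)\s+", "", p.strip(), flags=re.IGNORECASE).lower()
            for p in parts
            if p.strip()
        ]
        return ("has-a", m.group(1).lower(), fields)
    raise ValueError(f"sentence is outside the controlled grammar: {sentence!r}")


fact = parse("An image has a width and a height.")
```

Anything outside the two shapes, such as "Pixels come one by one.", raises an error instead of being guessed at, which is the trade-off a controlled language makes.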

when apps fly

if i get it right, an important term would be "expert system".
lately i found this neat explanation:
how inform 7 works could also be worth looking into:
(hell, after the last update, inform 7 itself might be worth reevaluating)
the problem with ACE as i understand it is there would be no idiomatic way to write procedural code.
in any case, i've come to the conclusion that a structural editor is necessary to allow any such system to have the usability i imagine and to scale well, so that's what i've been hacking on for some time now. please do browse through my terrible notes:


At least one interesting attempt at this is Osmosian. The example source code is:

The background is a picture.

A button has a box and a name.

To clear the status:
  Clear the status' string.
  Show everything.
To create the background:
  Draw the screen's box with the white color.
  Pick a spot anywhere in the screen's box.
  Pick a color between the lightest gray color and the white color.
  Dab the color on the spot.
  If a counter is past 80000, break.
  If the counter is evenly divisible by 1000, refresh the screen.
  Extract the background given the screen's box. \or Create the background from the screen. Or something.
To create a work given a URL:
  Allocate memory for the work.
  Put the URL into the work's URL.
[.. A LOT MORE CODE ..] 
A painting is a picture.
To pick a spot anywhere near a box:
  Privatize the box.
  Outdent the box given 1/8 inch.
  Pick the spot anywhere in the box.
To print:
  If the current work is nil, cluck; exit.
  Show "Printing..." in the status.
  Begin printing.
  Begin a sheet.
  Center the current work's painting in the sheet.
  Draw the current work's painting.
  Center the current work's painting in the screen's box.
  End the sheet.
  End printing.
  Show "Printed" in the status.
The print button is a button.
To quit:
  Relinquish control.
The quit button is a button.
To run:
  Start up.
  Initialize our stuff.
  Handle any events.
  Finalize our stuff.
  Shut down.
To show everything:
  Hide the cursor.
  Draw the background.
  Draw the status.
  Draw the print button.
  Draw the quit button.
  Draw the text.
  Draw the current work.
  Refresh the screen.
To show a string in the status:
  Put the string into the status' string.
  Show everything.
The status has a box and a string.
The text has a box and a string.
A work is a thing with a URL and a painting.
The works are some works.

Honestly, though, I think English code isn't much more comprehensible than any other language after you hit a few hundred lines. For clarity and comprehensibility, the real problem to address (mitigate or automate) is overwhelming complexity.

Actually, the real problem

Actually, the real problem to address is that people don't really know what they want, so programming is more like a conversation with the computer, not dictating to it. Allowing the computer to ask questions would go a long way to making programming like this work.
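A minimal sketch of what "asking questions" could mean, assuming the description has already been parsed into named slots: every slot still empty turns into a question back to the user. The slot names and question texts are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical slots a blob-detection description needs before code can be derived.
QUESTIONS: Dict[str, str] = {
    "pixel_bits": "How many bits per pixel should be assumed?",
    "connectivity": "Are diagonal neighbours connected (8) or only edge neighbours (4)?",
    "threshold_source": "Is the threshold a constant or a run-time input?",
}


def open_questions(spec: Dict[str, Optional[object]]) -> List[str]:
    """Return one question for every slot the description left unfilled."""
    return [q for slot, q in QUESTIONS.items() if spec.get(slot) is None]


spec: Dict[str, Optional[object]] = {"connectivity": 8, "threshold_source": "run-time"}
pending = open_questions(spec)  # only the pixel width is still unknown
```

The conversation ends when `open_questions` returns an empty list, though a real system would also have to notice when an answer changes what else needs asking.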

no. i mean, yes. i mean no.

Hey, there are many times when I know what I want, and I still can't get any joy from the bloody development systems we're mostly saddled with! :-}

But yes in general I would agree that we need to be embracing more 'agile' approaches all the way through. And of course even when I know what I want, as soon as I have it I will want to revise it to make it better.

not bad

If true, a few hundred lines is actually pretty good. All the programming / formal / artificial languages I've encountered seem to turn into gibberish in fewer than 10 lines.