The Free-Form Linguistics Revolution in Mathematica

From Wolfram via Slashdot:

With the release of Mathematica 8 today, the single most dramatic change is that you don’t have to communicate with Mathematica in the Mathematica language any more: you can just use free-form English instead.
...
In each case, the Wolfram|Alpha engine will synthesize Mathematica code to do what you asked, then apply this code to the result you had before.
Well, this is the beginning of something very remarkable: the ability to do programming with free-form linguistic input.
We’re just at the beginning of this process. But already the Wolfram|Alpha engine can handle a wide variety of Mathematica concepts. Like list manipulation, image processing, string manipulation, import-export, and even user interface construction.
...
I think this is all a pretty big deal. You see, in the past, if you wanted to do any serious programming, you really had no choice but to learn a precise formal programming language. But now you can just tell the computer what you want to do using plain English.
And the big effect of this is going to be that the barrier between programmers and non-programmers will come down. Everyone is going to be able to be a programmer.

I also think this is a big deal. The natural language parser they are using seems to be more powerful than anything I've seen before, though, as with Alpha, it seems to have definite limitations.

Interactive Language

I imagine that the immediate feedback users see will be critical to making this practical. I played with Wolfram Alpha, and my impression was very much 'guess the verb'. It helped that it provided answers for many different interpretations.

Natural language feedback

I imagine that the immediate feedback users see will be critical to making this practical.

Agreed. For a variety of reasons, natural language is not feasible as the "preferred form" for inputting or specifying a program. Nevertheless, I wonder about automated natural language generation as a kind of "semi-literate programming" which would also allow immediate feedback during development.

For practical programs, this would entail deep integration between natural language, pseudocode and symbolic representations, in order to manually choose the best form among, say: "Returns the continued fraction of the reciprocal of the mathematical constant e" vs. "Returns ContFrac(1/e)".

Fortran made the same claim

And the big effect of this is going to be that the barrier between programmers and non-programmers will come down. Everyone is going to be able to be a programmer.

I wanted to look up the continued fraction of 1/e the other day, and turned to Wolfram Alpha. (After seeing how easy the answer was, I felt a little silly that that factoid hadn't previously registered in my mind; but that's another issue entirely.) I think I tried "what is the continued fraction of 1/e", but that got me nowhere. I ended up just searching for "1/e", which got me the answer I was looking for.

I see that they fixed my original query in the last few days, so either they improved their software, or maybe it was a transient glitch.
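For reference, the expansion in question is easy to check by hand. The following is only a rough Python sketch, assuming e is approximated by the first 30 terms of its Taylor series (my choice for illustration, not a claim about how Wolfram Alpha or Mathematica computes it); the helper name continued_fraction is made up for this example.

```python
from fractions import Fraction
from math import factorial


def continued_fraction(x, max_terms):
    """Return up to max_terms continued fraction terms of the Fraction x."""
    terms = []
    for _ in range(max_terms):
        whole = x.numerator // x.denominator  # integer part (floor)
        terms.append(whole)
        frac = x - whole
        if frac == 0:
            break  # x was rational and the expansion has ended
        x = 1 / frac
    return terms


# Assumption: 30 Taylor terms give far more precision than the first
# ten continued fraction terms need.
e_approx = sum(Fraction(1, factorial(k)) for k in range(30))

terms = continued_fraction(1 / e_approx, 10)
print(terms)  # [0, 2, 1, 2, 1, 1, 4, 1, 1, 6]
```

After the leading 0, the terms are just those of e itself (2, 1, 2, 1, 1, 4, 1, 1, 6, ...), which is why the answer looks so easy in hindsight.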

I knew that I could have gotten the same information from Mathematica, but as I'm not a Mathematica guru and don't use it regularly, using Wolfram Alpha saved me the time and effort of consulting the help browser. And Wolfram Alpha also lets you see how to compute what you asked for in Mathematica, which is an essential feature. It's something of a cross between a super-charged help tool and a useful computing environment.

So while I definitely appreciate the utility of Wolfram Alpha, I do not believe it will ever be a good option for writing robust software. It might be an invaluable tool to assist in the act of programming, and a great way of learning how to program in Mathematica. But if your program has to work consistently and repeatedly, I really don't think this is a good idea. You can't get away with ignoring the formalisms of programming languages.

Though as I write this, there seems to be a strong similarity between what I have said and the arguments that were trotted out against Fortran and/or for machine language. This makes me feel like I haven't really articulated what I want to say, but I do think it would be very interesting to compare and contrast my argument with the old arguments against Fortran.

Everyone makes the same claim

Every 5 years or so someone makes this claim, be it natural language programming or visual programming or workflows... and it always fails. Changing the syntax aids in code readability (and thus maintainability) but the programmer still needs to know the concepts. It doesn't matter how nice the syntax is if you can't express what you want to do.

Mathematica might find some measure of success here due to its domain, but even though I'm a proponent of more natural-language-like programming languages, I don't think it will be because some barrier to entry came down.

Reminds me of Inform 7

Inform 7 is a programming language for interactive fiction (i.e. text adventures) which is similar to natural language in some respects. At least it tries very hard. And while people have written robust and substantial programs in it, the shortcomings of so-called natural language interfaces are apparent at the same time: you still need to know how to tell the compiler what you want. So in essence the language is not so natural; it just seems that way, since you can sort of read the program as prose, but writing it is not the same as writing prose. Having a search engine that assists you in finding the right formulation for what you're trying to say is probably a step in the right direction.

Most IDEs have search engines everywhere

They're just not that powerful.

Most code completion, for example, can be thought of as dynamically mapping a dot in a dotted-form context-free grammar template into a search box. But the search strategies in most IDEs are pretty naive: they can only search directly for a specific name.

To illustrate this, consider:

String c = "Hi";
c.BeginsWi
         ^
         .----------------------.
         | Clone             |\/|
         | CompareTo         |--|
         | Contains          |  |
         | CopyTo            |  |
         | EndsWith          |  |
         | Equals            |  |
         | GetEnumerator     |  |
         | GetHashCode       |  |
         | GetType           |--|
         | GetTypeCode       |/\|
         '----------------------'

vs.

String c = "Hi";
c.BeginsWith("Hello");
  ~~~~~~~~~~

.---------------------------------------------------------.
| Error 1                                                 |
| Description: 'string' does not contain a definition for |
|              'BeginsWith' and no extension method       |
|              'BeginsWith' accepting a first argument of |
|              type 'string' could be found (are you      |
|              missing a using directive or an assembly   |
|              reference?)                                |
| File:        C:\Users\jzabroski.BEDFORD\Documents       |
|              \Visual Studio 2008\Projects\Project1      |
|              \Project1\Program.cs                       |
| Line:        96                                         |
| Column:      19                                         |
| Project:     Project1                                   |
'---------------------------------------------------------'

The String class in the CLR's mscorlib has a StartsWith() method but no BeginsWith(), so when I typed BeginsWith on my object reference c, the IDE couldn't tell that it was any closer to StartsWith than to Whisky Tango Foxtrot, for that matter. In this sense, "IntelliSense" is not very intelligent.

There has been some exploration in this area; see, for example, Greg Little's Programming With Keywords thesis.
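To make the complaint concrete, here is a rough Python sketch of what a slightly less naive lookup might do: split the typed name into keywords, normalize a few synonyms, and rank members by shared keywords. This is not how IntelliSense works and it is not the algorithm from the thesis above; the names STRING_MEMBERS, SYNONYMS, keywords and suggest are all made up for illustration, and the synonym table is a hand-written assumption.

```python
import re

# Member names copied from the completion popup above, plus StartsWith.
STRING_MEMBERS = [
    "Clone", "CompareTo", "Contains", "CopyTo", "EndsWith", "Equals",
    "GetEnumerator", "GetHashCode", "GetType", "GetTypeCode", "StartsWith",
]

# Hypothetical synonym table; a real tool would need a much larger one.
SYNONYMS = {
    "begins": "starts",
    "begin": "starts",
    "start": "starts",
    "finishes": "ends",
}


def keywords(identifier):
    """Split a CamelCase identifier into lowercase words, normalizing synonyms."""
    words = re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", identifier)
    return {SYNONYMS.get(w.lower(), w.lower()) for w in words}


def suggest(typed, members, limit=3):
    """Rank members by the number of keywords they share with the typed name."""
    wanted = keywords(typed)
    scored = []
    for name in members:
        shared = len(wanted & keywords(name))
        if shared:
            scored.append((shared, name))
    # Most shared keywords first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


print(suggest("BeginsWith", STRING_MEMBERS))  # ['StartsWith', 'EndsWith']
print(suggest("BeginsWi", STRING_MEMBERS))    # ['StartsWith']
```

Even this toy version gets from BeginsWith to StartsWith, while Whisky Tango Foxtrot still matches nothing. The hard part, of course, is where the synonym table comes from.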