## Forth in hardware ("not dead yet")

Apparently Intellasys is alive and kicking out Forth-based CPUs, some of which have a rather Transputer like smell to them. I wonder what their VentureForth language is like? Apparently there are freebies to play with.

(I have zero to do with the company, I'm just curious about little languages and multi-core-ness.)

I know stack based languages are perhaps not always considered the height of PLT (interesting to read 'I remain adamant that local variables are not only useless, they are harmful'), but maybe there is a time and a place even for Forth?

## Comment viewing options

### Interesting, but not convincing

I dunno; he expresses a lot of odd opinions, with not much to back them up. The one that will probably get the least LtU sympathy is his belief that nobody should have a deep stack; on his Forth chip, the i21, the stack depth is limited to 18. To me, that suggests he doesn't use recursion (he does have recursion in his language, but it seems to be limited to tail recursion, used to implement loops).

I was interested to see what he meant about local variables, but it turns out he simply prefers globals.

### Avoiding naming can be useful...

...names occasionally bind to the wrong thing, after all.

But otherwise, the "local variables are harmful" strikes me as arrant nonsense. Items on an implicit stack are simply local variables without names (or else they are "named" with a natural number which varies depending on where in the code you are). By naming them, and letting the compiler figure out where on the stack (or wherever else) a particular local is, makes for much more reliable code. Even typed stack languages can't prevent referencing the wrong stack cell if happens to have the same type as the right one.

### reliability

Using just the stack to hold local values is very much like programming Haskell in point-free style. It doesn't make the program less reliable, using the wrong stack element would most probably be a catastrophic error that even cursory testing would find.

Not using local variables is pretty much inevitable if you use very short words in good Forth style. In a short word, you lose little by omitting local variables, and whatever you lose in readability you can make up in comments (though it is always a good idea to make the code itself obvious), but adding them would increase the code length dramatically.

I find it a much saner style than the huge, intricate functions with dozens of local variables that is common in mainstream imperative languages.

### Using just the stack to hold

Using just the stack to hold local values is very much like programming Haskell in point-free style. It doesn't make the program less reliable, using the wrong stack element would most probably be a catastrophic error that even cursory testing would find.

I can't be the only person here who's wasted hours on an utterly confusing debugging session because the two confused values were close enough that I didn't realise they'd been swapped even after the error. It's not often I do it, and it almost always could've been prevented by a typing discipline I would've seen as slightly paranoid at the time...

### Having just finish Thinking

Having just finish Thinking Forth I tend to think there's substance in what he says, even though he didn't spell it out in that whirlwind talk. I'm certain that you have not understood his meaning on local variables :-)

### Two guesses

Did Moore mean that local variables complicate predicting what machine code is emitted when a fragment of Forth is compiled? That would seem to be in line with what was going on with his i2l simplifications.

If that is right, does Moore say anything about achieving the same kind of simplification in higher-level languages using an appropriately engineered VM? I'm guessing he thinks it is better than nothing, but that something important is lost in terms of knowing what is going on with the hardware.

### Niche market?

Good luck making the economics work! With 24 little 1-GHz cores it could be useful for PDAs, printers, cameras, music/video players and such. I don't think Forth is a major selling point though. I could see an embedded developer coding DSP routines in Forth, and using a high(er) level language for almost everything else. The chip can almost certainly do it. But if Intellasys is so focused on Forth that they don't support more popular languages, developers will just choose some other chip unless this one offers a major performance or efficiency advantage.

### Forth has some very nice qualities and some interesting problems

Forth has some very nice qualities: it allows for short and elegant programs, much like FP languages.

On the other hand, the lack of typing can seriously hurt productivity. Types in a Forth program are implicit. I wonder if type infererence works on stacks. Perhaps C. Diggins can have a few interesting comments here, with his Cat language.

I had written only a couple of Forth programs in my life though, and only for educational purposes, so perhaps my view is not correct.

### Stacks and type inference

Forth has some very nice qualities and some interesting problems

Well put.

I wonder if type infererence works on stacks.

I suppose it could; a Forth word is just a function whose parameters are numbered rather than named. The fact that you can write a function with a variable number of parameters might be a problem.

### words are quasi-combinators, not functions

a Forth word is just a function whose parameters are numbered rather than named.

I think that's a misleading discription. Words aren't passed just the top few values on the data stack, they're passed the whole stack. Sometimes the way a Forth program uses the stack has no relevant resemblence to parameter passing (e.g., imagine writing a shift-reduce parser).

I would paraphrase Moore's philosophy of programming language design and programming this way:

a) Model the memory hierarchy in simple, general, realistic (i.e., close to the hardware) ways.

b) Regard the state of the machine as nothing more than your model of the memory hierachy.

c) Provide primitive state transitions (virtual CPU instructions) which map machine states to machine states.

d) Permit defining new virtual CPU instructions by sequencing primitives, choosing between alternative sequences, and unconditional branching -- i.e., instructions are defined by finite state automata.

That's most of language design. What he's done there is to construct a highly portable and cheap-to-implement yet fairly high-level general purpose framework for describing an operational semantics. The particular choice of primitive instructions determine an operational semantics

Programming philosophy: don't bother with any higher-level language -- just write in and extend the operational semantics directly by adding new virtual machine instructions. You'll be forced to think more clearly about what your programs do. You'll encounter fewer "impedence mismatches" where you have to fight against the programming language to say what you mean (e.g., tail calls in C). You'll probably come up with results that are much, much more economical in all but time to market.

So:

Models of the memory hierachy (the stacks and registers in forth) can be usefully understood as a particular, inductively defined set of fully reduced terms.

Primitive words are true combinators, mapping memory hierachy states to memory hierarchy states.

Defined words, if they use the return stack in only "ordinary" ways and do not modify code on the fly, are themselves true combinators. However, Forth permits reflection on the return stack and manipulation of where code is stored so, I give up and call words "quasi-combinators" (you could slice things up differently and get back to having them be true combinators but your model would drift away from how I sense good forth programmers think about them as they write them).

All of our (quasi-)combinators have the same domain and range here: machine states to machine states. Variable names would be excessive because, in effect, we only ever need one -- everything else is done with relative references to the value of that one variable.

In many other high-level languages, variable names are used specifically to abstract away from the operational model. It's then the compiler's job to, for example, translate a basic block into the operational model. That's irrational laziness in the hard-core forth view, as I understand it: the complexity costs too much, on many levels -- you wind up with seriously expensive and risky junk that way, even though, sure, you often get a shorter time to market than with approaches that try to automate things this other way.

We don't have much, compared to say Java, to tell us how well the forth approach works. We do have a lot, though: Moore's own work product is pretty interesting evidence.

There's something interesting about modularity in forth: In forth, each word's author is expected to be cognisant of and responsible for the whole state of the machine. For the price of assuming cooperation and trust between components, you get enormous flexibility and power.

-t

### Stacks and type inference: Citations

I Googled a bit and found some existing work on type inference in stack-based languages:

Type Inference in Stack Based Languages, Bill Stoddart, Peter J. Knaggs, 1993.

The Cat Type Inference Algorithm (blog post). The Cat language is apparently a stack-based language with type inference.

### But what about callbacks executed at arbitrary positions?

Isn't that an undecidable problem?

### Yes, but

Maybe, but I see no reason to think that a stack-based language has to support callbacks. Or, if it did, it could do type checking to ensure that callback functions left the stack unchanged.

### Support for callbacks is essential for almost any program.

Callbacks are necessary for almost any application. I don't see how they can be left out. Do stack-based languages have a different concept that offers equal functionality?

### That turns out not to be the case

Callbacks are necessary for almost any application.

That's simply not true. They're common in GUI apps, but most other apps don't use them at all. Even in the GUI world, there are examples of toolkits which aren't callback-based, such as the original Mac API, where an app was built around an event loop.

### Not really, I must insist.

Callbacks are used in many instances besides GUI:

1) in web frameworks
2) in subroutines like C's qsort
3) in parsing APIs
4) in asynchronous programming APIs
5) in timers
6) in unix signals

etc

As for the original Mac API, every GUI on the planet is built around an event loop. Some GUI toolkits hide the loop, but in either case, it's not possible not to use callbacks (unless you are crazy enough to have a giant switch statement on the widget's id).

### Never necessary

In all but one of those cases, I can think of an API that doesn't use callbacks. The exception is Unix signals, which are an abomination; they should never be used except for emergency exit. (Think: if you're in a system call when a signal comes in, it fails with EINTR. So you have to check for EINTR every time you make a system call. Almost nobody does, though, which means you can't trust all the libraries you use to do it right.)

As for the original Mac API, every GUI on the planet is built around an event loop. Some GUI toolkits hide the loop, but in either case, it's not possible not to use callbacks (unless you are crazy enough to have a giant switch statement on the widget's id).

That is exactly what we had to do in those days (except that, much of the time, we didn't have widgets; we had to draw most of the window by hand). Early Windows GUI programming was the same way.

### I am not saying it's not

I am not saying it's not possible to do all those things without callbacks. I am saying that callbacks makes those things vast easier to use and therefore are deemed necessary.

Take an event loop using widgets ids: isn't it absurd to have to remember each widget's id, whereas using a callback would provide a much better mechanism for handling events?

### Stoddart & Knaggs: just a beginning

The Stoddart & Knaggs paper lays a foundation for type inference, but doesn't present a complete inference algorithm. In particular, they don't consider recursion, which is where things get complicated.

### Cat: well, not quite

The Cat language is apparently a stack-based language with type inference.

I took a look, and it seems that their typing system is pretty limited. The type of a function is simply a tuple: (# of elements popped from the stack, # pushed on the stack); it can protect against stack underflow, but it doesn't address the types of the elements.

### I took a look, and it seems

I took a look, and it seems that their typing system is pretty limited. The type of a function is simply a tuple: (# of elements popped from the stack, # pushed on the stack); it can protect against stack underflow, but it doesn't address the types of the elements.

See the actual paper for more details. Cat inference/typing is not limited purely to argument counts.

### Cat

Cat and its type inference capabilities have been discussed a number of times on LtU. Here are just a few links to previous discussions:

Christopher Diggins, the creator of Cat, is a regular LtU contributor (as cdiggins).

### Thank you!

Thank you very much for gathering these links Allan.

I've gathered some interesting links related to the development of the Cat type system from even further back.

In the end it is always worth noting that Andreas Rossberg played a particularly important role in formalizing the Cat type system. There were also many helpful insights in these discussions provided by other members of the LtU community.

### Cat does address the types of the elements

it seems that their typing system is pretty limited. The type of a function is simply a tuple: (# of elements popped from the stack, # pushed on the stack); it can protect against stack underflow, but it doesn't address the types of the elements.

Thanks for looking at my paper, but it is unfortunate that it gave you that impression. The Cat type system is in reality very sophisticated.

Cat most definitely does address the types of elements. The following gives a type error:

define f { 42 "hello" add_int }


Cat can also handle sophisticated higher-order type checking for expressions such as:

  define g { [1] [add_int] compose apply }


The type is "(int -> int)" just as you would expect. The Cat type system is polymorphic and supports inference of equirecursive function types. The type inference algorithm can infer the type of any semantically valid Cat program. The type annotations are in fact sanity checks, and are simply compared with inferred types. I'd like to point to my implementation of combinator birds, to demonstrate the Cat type system working on some hard problems.

What sets my work on Cat apart from Stoddarts and Knaggs work is that Cat can infer types for higher order functions which Forth lacks.

Thanks for correcting me; I must have skimmed the paper too quickly.

### Rubish

1x Forth should be read in the context of Forth (obviously). Most of the statements in 1X Forth make a lot of sense in the context of Forth, but out-of-context for a non-Forth programmer it surely looks like rubish.
That's why, the other way round, comments on Moore quotes - and more generally on Forth - by non-Forth programmer are usually rubish; just the things I read here, sorry.

Please keep the language civil. Informed criticism is, of course, welcome.

### sorry

Here is a little bit more useful comment in order to let you forgive me.

Moore says that "locals" are harmful in Forth because the name of the game in Forth is to use these two stacks, in particular when running on a "stack-oriented" (as opposed to register-oriented) chip.
Locals are too often just a way to walk arround the real problem, generally a bad program design and/or bad factoring. The cases where having locals avoid awful stack jugglings are extremely uncommon: the fact is that usely if things one starts to juggle with stacks, one can look at the program and select a few meaningful data to be put in Forth (global) variables, and then it also makes sense that Forth' normal variables are global.

As for type inferrence and such IMHO this again a wrong idea from Forth programming perspective. The key advantage of Forth is the extreme simplicity of the design of the interpreter. It makes it easy to port and easy to modify by a single person, easy to adapt and fit to a particular target. That's why Forth was used early in the embedded world, a long time before the embedded hardware and compiler technology allowed one to use C, and is still in use nowdays.
From this point of view, Type inference or GC brings more problems than it solves: the GC and type inference gears would be bigger and more complex than the Forth interpreter itself; and for Forth experts, who take advantage of the on-target interactivity (and sometimes even compilation), they wouldn't have very few benefits.

### Type inference FTW

As for type inferrence and such IMHO this again a wrong idea from Forth programming perspective.

Type inference can be useful to verify code before sending it to a chip. You can use it to find bugs and to optimize the code. Not every implementation has to support it: if one interpreter receives pre-optimized and type-checked code from a trusted source, it can simply do its job, without any of its own type-checking.

### The unbearable lightness of type-checking

Not every implementation has to support it: if one interpreter receives pre-optimized and type-checked code from a trusted source, it can simply do its job, without any of its own type-checking.

Quite. It is wrong to suppose the possibility of type-checking a language necessitates a more complex implementation. There is no reason why a type checker could not be an ordinary Forth program.

I'm guessing it would be easy to write a cat interpreter in Forth.

### that's not the Forth philosophy

The forth philosophy as I understand it, is to try to Do The Right thing in the first place, and the Right By Design idea(l).
We don't rely on the compiler to find our bugs, we try to avoid them by adopting a sane programming discipline, which is BTW very close to extreme programming.

### Your language shows bias - a

Your language shows bias - a sufficiently powerful type system can (and generally does, whether checked automatically or not) form part of a sane programming discipline. This is in no way /relying/ on the compiler finding bugs.

Right By Design won't catch it when you mess up a piece of syntactic manipulation or other rote work somewhere in the process.

### Why use Forth for lowlevel programming only?

I would certainly love to use a Forth-like language for 'normal' apps if it boosted my productivity.

### I suspect part of the answer

I suspect part of the answer there is that there are other languages that suit this better and get your filthy high-level hands off Forth itself :-) Joy and Factor are probably both worth your attention if you want something to look at.

### That's what it's good for

Writing applications in Forth is like building superhighways with hand tools.

You have to develop a lot from scratch, because Forth lacks effective standards, modules, namespaces, data types, etc. Not that that's a problem. Simplicity is Forth's defining quality.

Forth is great for learning the mechanics of compilers and interpreters without getting bogged down in theory. The syntax is nice in certain circumstances. It gives you interactivity and programmability with very low overhead. But there are much better tools for dealing with complexity.

### Missing the Point of Forth

Perhaps you are missing the point of Forth. The Forth philosophy seems to me to be: There are no complicated problems - only complicated solutions.

Are you saying that Mr. Moores's VLSI design tools, the ones that I believe he used to create the microprocessor under discussion, are not real applications?

The main point I took away from the 1x Forth talk was this quote:

"The whole point of Forth was that you didn't write programs in Forth you wrote vocabularies in Forth. When you devised an application you wrote a hundred words or so that discussed the application and you used those hundred words to write a one line definition to solve the application. It is not easy to find those hundred words, but they exist, they always exist."

The size of the problem does not matter. The solution can be described by a limited, domain specific vocabulary, which can make use of other domain specific vocabularies.

### RE: vocabulary

My current lesson learned / food for thought from the discussions: It does seem like it is both the Forth and FP philosophy for making a good solution to aim at making a vocabulary or DSL that really fits the problem at hand. (Presumably this is true of whatever paradigm you are using. It isn't a ground-breaking idea, but I think it is interesting to see how people can go around the world by more than one route with a very similar destination in mind.)

### Economic choices

The key advantage of Forth is the extreme simplicity of the design of the interpreter. [...] That's why Forth was used early in the embedded world,

I can appreciate that simplifying your language makes it easier to target an embedded system—I wouldn't attempt to get full-fledged Haskell running on an 8-bit microcontroller. But that's an economic decision; once you've chosen to use limited hardware, you're more or less stuck with a limited language. Not surprisingly, LtU tends to attract people who are interested in more capable languages.

### Also unsurprisingly, the

Also unsurprisingly, the code necessary to program microcontrollers to do what they need to is also incredibly simple, and doesn't require a lot of "fancy" things, such as modules, higher order functions, etc.

### Yes and no

Maybe, but I didn't want to go there; it seemed rude to tell astrobe his chosen problem space was trivial.

Besides, I'm sure there's plenty of nontrivial problems in the embedded space—otherwise it'd all be canned components by now. Consider the problem of writing a TCP stack on a microcontroller.

### Complexity and microcontrollers

Going from programming large systems (Standard ML, TCL, Perl, Python, C++, Java, etc) to 8-bit microcontrollers was quite an illuminating experience for me. Suddenly my main concerns were efficiency, safety and liveliness.

Efficiency: You can't just port your vanilla encryption algorithm, you have to study it, dissect it and exploit the features of your language/environment (C, Forth, Assembly) to get it to run efficiently. Is floating point too heavy? Used to having malloc() around? You have a limited amount of time and space to do stuff (especially in interrupts).

Safety: Your code can't crash. There may not be an OS to handle recovery and you don't (typically) have log/dump files. Also, your code pretty much has to work out of the gate. When your code is embedded in hundreds/thousands of deployed devices, there are few opportunities for "field updates".

Liveliness: You can't 'hang'. You have watchdogs to make sure of that, but if your (reset)watchdog is being activated, you've got issues. (Potentially a debugging nightmare).

### Indeed

I've made the same transition myself in the last 8 months or so, and have found it similarly illuminating. And I think you've nailed the major concerns pretty well. The need for safety has definitely made me appreciate even C's limited static-checking that much more (not to mention all of the neat stuff you can do with splint). If only SPARK-Ada was available for the PIC (or is it?).

### ...but we need LtU type people in the embedded world!

Being "stuck with a limited language" is not necessarily dictated by the limited hardware. There is a reason why Forth programmers love the building block approach (and the ability to manipulate/re-wire the language itself).

Within the constraints of C there are lots of language tricks being employed (sometimes by exploiting Duff's Device, setjump/longjump, the preprocessor, etc). This is often done to make it easier to program state machines or introduce very lightweight concurrency. But, they are hacks.

What I want is a higher level language but with some control over code generation.

Forth is quite capable of this... but is hampered by, well, being Forth-like ;-)

### Hardware is always limited

I can appreciate that simplifying your language makes it easier to target an embedded systemâ€”I wouldn't attempt to get full-fledged Haskell running on an 8-bit microcontroller. But that's an economic decision; once you've chosen to use limited hardware,

Hardware is always limited. That's obvious. Even in PC-Land efficiency matters because if your language is 10% more efficient this means you can check 10% more files for viruses for instance. That may mean a top rank in benchmarks and may 10% more sells.
Economic choice, yes, because unless you're programming for the fun it's a major concern when selling a products. That's one reason why so many new software is still written in C/C++.

... you're more or less stuck with a limited language.

Perhaps, but it's certainly easier to upgrade a "limited" language like Forth than slimming down a "fat" language. Yes, Forth is running on 8bits CPUs as well as under Windows, and you can use it to build GUI applications. Being really portable on a wider range of platform is a major advantage for a general purpose language.
In my opinion, the most limited language is not the smaller one, that can be extended, but the bigger one, that cannot be scaled-down without losing features.

Not surprisingly, LtU tends to attract people who are interested in more capable languages.

And it seems that there's also readers interested in what happens when you throw away the type-checker and GC sidewheels and thus maybe more capable programmers.

### Understanding Moore

If you've never read any of his stuff before, here are a few things to keep in mind:

* He's focused entirely on embedded CPUs & hardware of his own design and PCs where he boots directly into his own code and doesn't use an operating system.
* He thinks in terms of entire systems, not just individual parts. By "system" I mean his application, his own compiler for his own flavor of Forth, and the hardware Forth engine of his own design that runs the code. Sometimes he'll remove features from his Forth because he can simplify the hardware as a result. Not a lot of people think at this level, and I agree it seems pretty hardcore!
* He views 1000 lines of code as an unthinkably large program :)