Non-English-Based Programming Languages

The recent discussion of the Chinese natural language included some speculation about what CS would look like had it not been dominated by English discourse.

The critique was raised that one could postulate any number of alternate histories. However, we can do better than pure speculation by looking at natural-language-inspired programming languages developed in non-English-speaking cultures.

To date I have come across only a few references to Japanese-based experiments in end-user-friendly programming tools. Unfortunately, they referred to non-English papers that weren't accessible to anyone unable to cross the natural-language barrier.

This work was of particular interest because it went beyond the mere transliteration of keywords in Western programming languages. But alas, the references I encountered were little more than second-hand existence proofs of work we should know more about.

If anyone is familiar with such efforts, any information and observations you could share would be most deeply appreciated.


Peter J. Wasilko, Esq.
Executive Director & Chief Technology Officer
The Institute for End User Computing, Inc.

These comments are not official IEUC positions unless otherwise noted.

Logo.

Back when Papert lived in Montreal, the province of Quebec was afire with a French version of Logo: pour would replace for, avance would replace forward, etc. I made a game with it in the late 90s, when the environment was clearly out of date (and Papert was back in the US).

OCaml was mostly developed in France, but its keywords are clearly English. Ditto for Prolog (AFAIK).

This is a case of

This is a case of transliteration of keywords. It isn't the only one. Algol 68 had provision for different "representation languages", which not only handled differences in character sets but could also include the translation of keywords. The French translation of the Algol report thus proposed French keywords along with the English-based ones. (The Algol 68 implementation I am familiar with allowed only the English keywords; I don't know about the others.)

AFAIK, the common wisdom on these kinds of adaptation(*) is that they bring more problems than they solve. An infamous (for those who had to suffer it) example is that at least some versions of MS-Excel had localized function names. That meant that importing a sheet from a different language version didn't work.

(*) At least among speakers of other Indo-European languages. A possible exception, as in the case of Papert, is when the target audience is younger children. Only possible, because having keywords in another language helps by not bringing out all their different everyday meanings.
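
The keyword translation discussed above can be sketched as a table that maps localized names back to a canonical form. This is only a minimal illustration in Python, not how Algol 68, Logo, or Excel actually did it. The pair avance/forward comes from the Logo comment in this thread; the other table entries and the names FRENCH_TO_CANONICAL and canonicalize are assumptions made up for the example.

```python
import re

# Hypothetical table mapping localized keywords to canonical English ones.
# "avance" -> "forward" is taken from the Logo example in this thread;
# the remaining entries are illustrative assumptions.
FRENCH_TO_CANONICAL = {
    "avance": "forward",
    "recule": "back",
    "droite": "right",
    "gauche": "left",
}

def canonicalize(source, table):
    """Replace each localized keyword with its canonical form.

    Words not found in the table (numbers, user-defined names) are left alone.
    Lookup is case-insensitive, as Logo keywords traditionally are.
    """
    def swap(match):
        word = match.group(0)
        return table.get(word.lower(), word)

    # [^\W\d_]+ matches runs of letters, including accented ones.
    return re.sub(r"[^\W\d_]+", swap, source)
```

For instance, canonicalize("avance 50 droite 90", FRENCH_TO_CANONICAL) yields "forward 50 right 90". A tool that stored only the canonical form and localized names on display would presumably sidestep the import problem described above; a tool that stores the localized text needs the right table on every machine that opens the file.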

Hmm, German

Hey, maybe German is pretty well suited to being the base for a programming language, what with prepositions at the end of sentences and the prevalence of long composite words (could this be a Germanic way of object orientation?).

Something like: if we define Person, then we could say:

Firefighter = PersonWithJobFightsFires.
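
One loose way to read that compound word is as a type assembled from smaller parts. The sketch below does this with Python mixin classes. The class names follow the pseudocode above, but the attribute job and the method fight are invented for illustration, and nothing here claims that German grammar actually works like this.

```python
class Person:
    """Base part of the compound: anything with a name."""
    def __init__(self, name):
        self.name = name

class WithJob:
    """Mixin part: something that has a job title (hypothetical attribute)."""
    job = "unspecified"

class FightsFires:
    """Mixin part: something that can fight fires (hypothetical method)."""
    def fight(self, what):
        return f"{self.name} extinguishes the {what}"

# Build the "compound word" by gluing the parts together, in order.
Firefighter = type(
    "Firefighter",
    (Person, WithJob, FightsFires),
    {"job": "firefighter"},
)
```

With this, Firefighter("Anna").fight("barn fire") returns "Anna extinguishes the barn fire", and the instance is still a Person.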

Prolog and Planner, Life and Oz

Even if the keywords are in English, the language designer's native language shows through to some extent. I don't have more than anecdotal evidence for this, though. For example, Prolog was designed by a native French speaker, Alain Colmerauer, and one could argue that the Cartesian rigor and austere holism of French show through, especially when compared to the ball-of-mud, pragmatic style of its American predecessor, Planner.

From personal experience, I can compare the styles of two excellent language designers, Hassan Ait-Kaci (designer of Login and Life) and Gert Smolka (designer of Oz and Alice), who collaborated in the early 1990s. Ait-Kaci would design a beautiful language feature, such as Life's psi-terms or its residuation, which he would weave with the other language features into a coherent whole. Whereupon Smolka would take these holistic ideas and factorize them into their parts, a reductionistic trait which irked Ait-Kaci a bit! Needless to say, Ait-Kaci is a native French speaker and Smolka is a native German speaker.

Now that many languages (a)

Now that many languages (a) use Unicode extensively and (b) are very dynamic, I imagine there are many languages that can use non-English words for all their tokens. One that springs to mind is Nu (http://programming.nu).
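
As a small illustration of the Unicode point (in Python rather than Nu, which I won't try to reproduce here): Python 3 accepts Unicode letters in identifiers, so user-defined names and aliases for built-ins can be non-English. Its keywords are fixed by the grammar, so this falls short of "all tokens". Every name below is made up for the example.

```python
# Aliases for built-ins: possible because functions are ordinary values.
長さ = len          # Japanese alias for len (illustrative name)
afficher = print    # French alias for print (illustrative name)

def carré(x):
    """Return x squared; the accented name is a valid identifier."""
    return x * x

def 合計(数列):
    """Sum a sequence of numbers using non-English local names."""
    結果 = 0
    for 数 in 数列:   # "for" and "in" stay English: they are keywords
        結果 += 数
    return 結果
```

Here carré(4) gives 16 and 合計([1, 2, 3]) gives 6. A language whose control structures are themselves ordinary functions or macros could go further and rename those as well.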

Block-based languages

I'm designing a language around square tiles. There is trouble fitting three short English words vertically into a tile at a font size that makes them easy to scan. However, four large Chinese characters fit in it very nicely, as they are a fixed size.

Something to think about. If you were programming in Chinese, you wouldn't have to worry about formatting so much.

Very interesting.

What do we think might be different?

So what actually varies between natural languages, and how could it affect programming languages in the first place?

Vocabulary and pronunciation changes. A no-brainer: vocabulary affects keywords. That's probably why everyone brings it up. I can't imagine how pronunciation could affect a programming language.

Word order and conjugation. These allow speakers to understand streams of words as denoting complex relationships. They don't work precisely, as there are generally many ways to understand a sentence. They're definitely not precise enough for a programming language. I wouldn't expect them to affect one much at all.

Self-consistency. Natural languages have varying degrees of this. Programming languages need much more of it than natural languages do, though they too have it in varying degrees. This might actually make a small difference.

Idioms and preferences. Idioms can't possibly show up as themselves. Preferences... I don't know enough about other languages. I do know of one example: English speakers tend to dislike repeated words and phrases. I don't see how this could affect PL design, though.

Now, the mathematical background, preferences, preferred ways of solving problems, etc., of the PL designer should play a huge role. Claiming that these are a function of natural language is called the Sapir-Whorf hypothesis. For natural languages, this hypothesis, at least in its strong form, is generally regarded as bunk.

The eastwest (toy) language

The eastwest (toy) language could, originally, let you write a program in any language you want. Here's a program, written in eastwest, inspired by SICP and written in Japanese, to find square roots:

http://www.youtube.com/watch?v=vwgvVpCRecE

Here's the website (with more videos):

https://sites.google.com/site/rathereasy/eastwest