Typographical extensions for programming languages: breaking out of the ASCII straitjacket

A paper by Paul W. Abrahams circa 1993, abstract:

Using extended typography, we can design programming languages that utilize modern display and input technologies, thus breaking out of the ASCII straitjacket. We assume that a language has three representations: a visual representation that describes its displayed form, an internal representation defined for each implementation, and an interchange representation, expressed in pure ASCII, that is defined across all implementations. Using extended typography we can use distinctive typefaces to indicate keywords, thus removing the need to reserve them, and can introduce a variety of new symbols more meaningful than those used in most current programming languages. One benefit is the possibility of arbitrary user-defined operators. We can also introduce new kinds of brackets and methods of pairing brackets visually. Extended typography also helps to solve the problems of writing programs in languages other than English.

Sorry I couldn't find a non-digital-library link, but maybe this would help. I'm surprised no one has looked at this topic again since, at least in academia. Thoughts?


Fortress explores many of

Fortress explores many of these possibilities.

I didn't think Fortress

I didn't think Fortress provided rich typography as part of the programming process; it was just used in post-programming pretty printing.

It was intended for the user

It was intended for the user to use a clever editor rather than typing the ASCII backward-compatibility text manually.

They had an emacs extension, at least.

https://code.google.com/p/fortress-mode/source/browse/trunk/trunk/fortress-mode.el

Proof assistants...

...make heavy use of similar ideas. You can use Unicode for identifiers in both Agda and Coq. I think that this is an obvious enough step that it's not really publication-worthy; possibly you could dig up some papers on controlling the size of lexer tables for DFA-based Unicode lexing/regexp matching.

There has been significantly more work on making grammars extensible. You're probably aware of most of this, but I'll list it anyway.

In addition, Agda supports mixfix operator definitions (see here for how they parse them). Coq supports operators through a grammar-update mechanism called notations. Essentially it works by letting you update the LL parse tables of the language within a scope. As is generally the case, Agda's mechanism is more elegant, and Coq's mechanism is more powerful. (In particular, Agda does not support recursive notations.)

Of course, both of these are reminiscent of Lisp macros, and the Racket folks have a JFP paper under review, Macros that Work Together, which I found to be a very readable description of the Racket macro system. Jon Rafkind and Matthew Flatt had a GPCE paper last year about how to push this out to languages with richer syntax than Lisp, Honu: Syntactic Extension for Algebraic Notation.

Agda, Intensional Programming, ...

[Sorry, didn't manage to locate non-paywall links]

Neel has already mentioned Agda. Haskell has {-# LANGUAGE UnicodeSyntax #-}, but lacks Agda's extensible mixfix notation.
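For anyone who hasn't tried it, here is a minimal sketch of what the pragma buys you (the module and function names are made up; ExplicitForAll is added so the ∀ is accepted). With UnicodeSyntax, GHC treats →, ∷, ⇒, and ∀ as alternate spellings of ->, ::, =>, and forall:

    {-# LANGUAGE UnicodeSyntax, ExplicitForAll #-}
    module Pretty where

    -- Unicode variants of the ASCII syntax, accepted alongside the originals.
    applyTwice ∷ ∀ a. (a → a) → a → a
    applyTwice f x = f (f x)

    describe ∷ Show a ⇒ a → String
    describe x = "value: " ++ show x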

But if you're thinking of something more general than user-defined brackets and Unicode symbols, you could add to Neel's list:

There's a large body of work on language workbenches, DSLs and extensible syntax (Sdf has some good pointers) which probably touches on this too, but I haven't checked. And then there are structure editors and visual programming languages... it depends on what you mean by "breaking out of the ASCII straitjacket".

Free links for a couple...

(The Davies & Kiczales paper seems to be unavailable.)
Intentional software
Expressive programs through presentation extension

The second link is broken.

The second link is broken.

Fixed.

Thanks, it was missing its "http://". (Works in the address bar because the browser adds it, but in a link it ends up a relative link.)

Readability is paramount; extensibility is important

Readability is a paramount consideration. The idea is to use syntax to make the intent and operation of programs more apparent to those who are already fluent in the language.

Extensibility is also important. The idea is to use syntax to make programming languages more extensible. For example, currently popular programming languages have an ever growing list of reserved words. Whenever a new reserved word is introduced, it can invalidate existing programs.

ActorScript (http://arxiv.org/abs/1008.2748) uses syntax for readability and extensibility.

For readability, ActorScript uses a different Unicode character for each primitive language construct so that it is visually more obvious which one is being used regardless of the context in which it appears. Also, it is possible to provide much better grammar diagnostics than is possible for more ambiguous syntax. For each Unicode character, there is an ASCII equivalent so that programs can be conveniently typed on a conventional keyboard.

For extensibility, ActorScript uses bold face for reserved words so that new ones can be introduced without invalidating existing programs.

Something like ActorScript

Something like ActorScript is what I'm looking for, which seems to have features similar to what Abrahams described. But it still doesn't look great; did someone who was experienced in typography, layout, and color ever give any input into the design?

Typography and Unicode

Choosing Unicode for programming language constructs remains a work in progress. Do you have any suggestions for improvement?

I was actually just

I was actually just wondering if anyone had ever worked with a graphic designer in this area. There might be a lot of theory in visual chunking, prominence, font mixing, and layout that can be applied here that is way out of our field of expertise.

I really REALLY don't want fonts to be semantically significant.

I cheerfully use and provide unicode syntax for characters that don't have an ascii equivalent such as Greek letters that AREN'T visually ambiguous with something in the Roman alphabet, set membership operators and boolean combinators, but I don't want to sort out programs that have three different versions of 'a' or distinct capital Alpha and capital A that mean different things.

'No lookalike characters' is a rule that Unicode does not respect but which programming languages should.

To some of us, typography is not at all obvious. You have to get my attention with different letter forms, or I'm going to fail to understand the code (or the equation) correctly every time.

Are bold and italic sufficiently distinguishable?

Are bold and italic sufficiently distinguishable to be semantically significant?

Yes in strings, but not in single characters or mixed.

I don't have a problem recognizing an italic word as opposed to a regular word, but I do have a problem recognizing an italic character as opposed to a regular character when using single-character identifiers, or noticing that one letter in a word is italic/regular when the rest of the word is regular/italic.

Bold and italic are easily distinguished in most fonts

Bold and italic are easily distinguished in most fonts, even for single characters. I agree that mixing bold and italic in a single name is very dubious.

Math uses 1-char bolding...

E.g. a bolded letter for a vector with an unbolded equivalent for magnitude. I think it works quite well... the provisos being (a) the two symbols should refer to related concepts; and (b) you'd want a reasonably heavy bold typeface.

edit: sorry, you already said you find it hard to read. *shrug* I guess it depends on the reader (ideally, you'd like to be able to control concrete syntax with something like a user stylesheet, so a programmer could tweak any syntax they found difficult to read.)

Font size matters along with

Font size matters along with font smoothing technology. But with the advent of high-resolution displays like that on the MacBook Pro with Retina display, the definition of "hard to read" is changing.

I wouldn't use italics/bold

I wouldn't use italics/bold to distinguish between different things. In document typography, italics and bold have special meanings: italics often mark a term being introduced, and bold often marks emphasis (as do italics). From Wiki:

By contrast, bold font weight makes text darker than the surrounding text. With this technique, the emphasized text strongly stands out from the rest; it should therefore be used to highlight certain keywords that are important to the subject of the text, for easy visual scanning of text. For example, printed dictionaries often use boldface for their keywords, and the names of entries can conventionally be marked in bold.

I would hope that these two concepts could be adapted to programming in a similar manner. We could come up with a bunch of rules on how to apply typographic styles to code, but I don't think they should be based on semantic differentiation.

Unicode bold and italics are for math, not typography

The whole reason that Unicode has bold and italic letters (and Old English and double-struck and other stuff) is for mathematical uses, where bold vs. non-bold and italic vs. non-italic is a semantic difference. Thus sin (in italics) is s times i times n, whereas sin (in roman) is the sine function. In physics, bold variables are vectors, italic variables are scalars. Likewise, script H is the Hamiltonian, but if you use a Roman H, you get an integral equation instead. See "Unicode Support for Mathematics" section 2.2, Mathematical Alphabets.
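In LaTeX terms, these are the distinctions you would otherwise mark with explicit commands; a few illustrative lines:

    % roman sin is the function; italic s i n is a product of variables
    $\sin\theta$   versus   $s\,i\,n\,\theta$
    % bold for vectors, italic for scalars
    $\mathbf{F} = m\,\mathbf{a}$
    % script H (e.g. a Hamiltonian) versus an ordinary H
    $\mathcal{H}$  versus  $H$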

and that is the reason why I took freshman calc twice.

First, understanding that there WERE separate versions of characters that meant different things, and second, figuring out how to distinguish them and rewrite the equations, held me back for a year in mathematics.

I was the guy in the back row with the laboriously-made flash cards that had specimens of various characters in different fonts, going through equations one character at a time and comparing, to figure out which 'n' I was looking at, before I could even start work on the problem.

I really don't want to repeat that horrible experience in programming.

Wrong metric

Horrible for you, useful for practitioners. Same thing with macros, types, and most other things -- they get abused, but are useful in the right scenarios.

sort of like color blind people

like being color blind and dealing with traffic lights.... except nobody's bothered to position-code them so they can be easily distinguished.

although I could probably whomp up a name mangler...

Using those alphabets will mean I can't work on code unless I whomp up a name mangler to render all the names lexically distinct. Although I could do that. In programming it would be a lot easier than it was in math classes where you didn't even necessarily get the problems in a useful electronic form so a script could distinguish them.

But then it would render printed books about programming that used such a language (and the math-focused side of programming, which produces the best books, would use it immediately) completely useless.

By using an IDE, a name mangler is not required

By using an appropriate IDE (Integrated Development Environment), a name mangler is not required for bold, italic, etc.

I don't know what you mean.

An "appropriate IDE" will not fix whatever it is in my head that makes me effectively blind to the differences between italicized, standard, and bold letters.

Rendering bold and italic for the challenged

An IDE could have a mode in which the reserved word let is rendered by underlining or by any other decoration of your choice.

I "want" programming

I "want" programming languages to support scare quotes.

They already do. They look

They already do. They look like `\\`.

This reminds me of

This reminds me of ColorForth, where color was semantically significant.

"Typography of Code" research project at Adobe MAX 2010

I'm surprised no one has looked at this topic again since, at least in academia.

Adobe was looking into this a few years ago, but I don't think it ever went anywhere there. Here's a YouTube video of the presentation, with a live demo. I'm not sure if the people at Adobe behind this ever published anything about this anywhere, since they might not have been from one of Adobe's publishing research groups.

I couldn't find a good textual description of the system described in the video from anyone involved in the project.

Yes! This is exactly what I

Yes! This is exactly what I wanted, thanks!

I think we are getting way hung up on using typography to distinguish between different identifiers (Abrahams' proposal to be sure) and not enough on using it to make code more readable, chunkable, and scannable.

The video ends way too quickly though. All I can find out beyond that is that the talk was given by David Durkee. Previous reddit discussion.

This is a nice start, but I'm sad there was no follow up. Is there any feasible way to break the chains of dumb text? (without going full-on structured/graphical, of course)

(fun) reminds me of Stroustrup's whitespace overloading paper

☎✆ anyone?

How about

λ, ¬, etc.

Design principles for the enhanced presentation of computer prog

Here is another interesting paper on the subject (CHI 86):

In order to make computer programs more readable, understandable, appealing, memorable, and maintainable, the presentation of program source text needs to be enhanced over its conventional treatment. Towards this end, we present five basic design principles for enhanced program visualization and a framework for applying these principles to particular programming languages. The framework deals comprehensively with ten fundamental areas that are central to the structure of programming languages. We then use the principles and the framework to develop a novel design for the effective presentation of source text in the C programming language.

The principles are:

  • Typographic Vocabulary: Choose a small number of appropriate type styles of suitable legibility, clarity, and distinguishability, applying them thoughtfully to encode and distinguish various kinds of program tokens. Within each typeface, choose or design a set of enhanced letter forms and symbols with which to represent the text effectively.
  • Typesetting Parameters: Adjust to enhance readability the typesetting parameters of text point size, headline size and usage, word spacing, paragraph indentation, and line spacing. Whereas Principle I concerns itself with the selection of symbols, Principle II deals with their application in clusters.
  • Page Composition: Bring out the program structure by carefully structuring the page composition through the use of grids, the application of explicit and implicit lines of varying thickness (typographic “rules”), and the inclusion of appropriate white space.
  • Symbols and Diagrammatic Elements: Integrate appropriate symbols and diagrammatic and graphic elements to elucidate essential program structure. In this way we achieve non-textual augmentations of the source code proper.
  • Metatext: Augment the source text with mechanically generated supplementary text, additional data and commentaries that enhance the comprehensibility of the source text.

Book-length version of the paper

Baecker and Marcus published an expanded version of this paper as a book: "Human factors and typography for more readable programs", Addison-Wesley, 1990.

It's been a long time since I've read it, but I remember that the book felt a lot like a published version of someone's thesis (I assume Marcus's). Using a small C program as the input, they start with the "standard" printed view of the code and then present various combinations of formatting options (layout, whitespace, type style and font selection, "links" to called procedures) that they then tried out in experiments. I can't remember the details of the experiments, but the "best" performing option is then presented at the end of the book as a set of formatting guidelines.

It was great work, and I'm amazed that it's been completely forgotten. It would be really nice if LaTeX modes for setting program text had an option to follow their recommended style.

Amazon link
Google Books

Sir, please step away from ASR-33

Poul-Henning Kamp had a fun article in CACM titled "Sir, please step away from ASR-33" (Nov 2010, doi 10.1145/1839676.1839693). Closing quotation:

For some reason computer people are so conservative we still find it more uncompromisingly important for our source code to be compatible with a Tele-type ASR-33 terminal and its 1963-vintage ASCII table than it is for us to be able to express our intentions clearly.

He's off by a mile, though, when he says: "C++ is probably the language that milks the ASCII table most by allowing templates and operator overloading". In Haskell, operators can be defined from one or more symbol characters, being any of !#$%&*+./<=>?@\^|-~: or "any [non-ascii] Unicode symbol or punctuation". (Haskell 2010 language report)
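To make that concrete, here is a small Haskell sketch with made-up operator names; both the ASCII soup and the Unicode math symbol are legal operator names in GHC:

    module Ops where

    import Data.List (union)

    -- An operator built purely from ASCII symbol characters:
    (|+|) :: [a] -> [a] -> [a]
    (|+|) = (++)

    -- A Unicode math symbol (U+222A) used as an operator name:
    (∪) :: Eq a => [a] -> [a] -> [a]
    (∪) = union

    infixr 5 |+|, ∪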

input usability

oh, right, i'm sure it will be fun to learn how to enter unicode for every blasted thing i want to do. i for one am glad that i don't have to write in kanji.

so at least there'd better be a way to convert round-trip-live between verbose ascii and compact unicode. :-)

Unicode is fairly easy to

Unicode is fairly easy to render after the fact; e.g. rendering -> as a right arrow rather than as typed (and converting it back to -> during relexing so you can type ->>-). The main problem I have with unicode symbols is that it is often ugly; our fonts are not designed to organically include unicode symbols. This is easy enough to solve with some font design, but chicken meets egg!

Edit: forgot that the lt symbol was poison.
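A toy Haskell sketch of that render/relex round trip, where the stored text stays ASCII and the glyphs exist only in the view (the digraph table is purely illustrative):

    module Render where

    import Data.List (isPrefixOf)

    digraphs :: [(String, Char)]
    digraphs = [("->", '→'), ("<-", '←'), ("=>", '⇒'), ("::", '∷')]

    -- Display direction: substitute glyphs for ASCII digraphs.
    toGlyphs :: String -> String
    toGlyphs [] = []
    toGlyphs s@(c:cs) =
      case [(a, g) | (a, g) <- digraphs, a `isPrefixOf` s] of
        ((a, g):_) -> g : toGlyphs (drop (length a) s)
        []         -> c : toGlyphs cs

    -- Relex direction: turn glyphs back into the ASCII interchange form.
    toAscii :: String -> String
    toAscii = concatMap back
      where back c = maybe [c] id (lookup c [(g, a) | (a, g) <- digraphs])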

rendering options

On a side-note, Smalltalk-80 used a single "<-" left arrow glyph for assignment in the UI, but this was represented as an underscore '_' in the ascii source code interchange format, which looked really weird. [Incidentally, remember to use &lt; for the less-than character in LtU's markdown.]

If you find a free copy of the paper, I'll read it. All I've seen so far is the abstract offered freely, which you quoted above. It notes a pure ascii interchange representation is assumed always present:

... a language has three representations: a visual representation that describes its displayed form, an internal representation defined for each implementation, and an interchange representation, expressed in pure ASCII, that is defined across all implementations.

So presumably you can always type an ascii representation when desired, unless a UI dev has an obstructionist attitude about what lowly users are permitted to do.

Rendering code in different ways is an interesting idea. Unicode glyph options might be too narrow a focus, when you can also think about showing code in graphical form, which strikes me as useful. But maybe it's not a whole lot different from an IDE that suddenly puts UML in your face. Some things would be easier to grasp as diagrams than as text. (At work I manually turn FSM tables into ascii art drawings to really grasp what they do.)

There's a user accessibility issue involved, not terribly unlike that in web page presentation, where one wants to let end users override how things are rendered by changing defaults and preferences to avoid being stuck with fonts and sizes problematic for any given individual. Ray might be representative of a large constituency anticipating a need to tune how things render.

The left-arrow reflected ...

... older versions of ASCII, where it was the glyph corresponding to the modern _ character. ASR-33's, indeed, printed left arrows.

and likewise, the up-arrow.

Pascal used an up-arrow glyph for dereferencing pointers, in pretty much the same way that C'ish languages use ->.

Later versions of ASCII (and therefore Pascal) replaced the up-arrow with ^.

What's interesting here is that the appearance of code changed but not the underlying representation or meaning. Thus, if you had a hardcopy printout, and were ignorant of the history, you might not figure out what representation would give you valid code. I can picture someone in the next generation with an ancient dusty greenbar printout, working their way through all of the unicode up-arrow glyphs, trying to figure out what the heck the compiler wants....

IIRC, there were four arrow glyphs in early ASCII. I don't remember where they all were now; just the two I used in coding.

Making ascii aliases for many unicode chars.

I'm just using character names between backslashes as a method for entering the corresponding unicode character. For example, if I type \lambda\ I get λ, which my lexer regards as a name.

I definitely feel the pain though; Unicode was not designed to fit our puny keyboards. I've already assigned names to over 300 'useful' characters, and even though they are mostly XML entity codes and existing practice from other familiar sources, it's hard to keep all of those names in my head. And I'm going to want at least a hundred more (mostly mathematical symbols) before I'm done.

It's a semi-extravagant use of keystrokes but does render code in an admirably compact visible form onscreen when you can just use the one-character mathematical symbol for an operation or predicate instead of making up some more-verbose ascii name.
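For readers following along, here is a small Haskell sketch of the backslash-delimited alias idea (not the author's actual implementation, which is the C++ 'escapist' class discussed further down); the table below is a made-up three-entry stand-in for the 300-odd real ones:

    module Escapes where

    import qualified Data.Map as Map

    aliases :: Map.Map String Char
    aliases = Map.fromList [("lambda", 'λ'), ("forall", '∀'), ("element-of", '∈')]

    -- Expand occurrences of \name\ into the aliased character;
    -- unknown or unterminated escapes are left untouched.
    expand :: String -> String
    expand ('\\':rest) =
      case break (== '\\') rest of
        (name, '\\':more) ->
          case Map.lookup name aliases of
            Just c  -> c : expand more
            Nothing -> '\\' : name ++ "\\" ++ expand more
        _ -> '\\' : rest
    expand (c:cs) = c : expand cs
    expand []     = []

With that table, expand turns the input \lambda\ x into λ x.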

Do you keep all the

Do you keep all the definitions in the lexer or the grammar? Or do you have a syntactic form for introducing new aliases like this?

Seems the latter would be nice because library authors could add new ones if they came up with a good use for another character.

Tradeoff is it might be too confusing, I guess. Similar to how naming your own entities in your DTD didn't really catch on very much (AFAIK).

there is an 'escapist' object.

An object of the 'escapist' class, which manages all the aliases and lookups, is one of the arguments to the lexer's constructor.

It has a public method for adding a new name/character alias, but I haven't (yet) hooked it up to a routine callable from the language. There'd be nothing particularly difficult about doing so, though.

The 'escapist' constructor creates an empty escapist (no character/name aliases at all). There is a public method for adding the "standard" set of 300-odd aliases, which, so far, I'm just calling immediately after the ctor.

If you're interested, I can email C++ code for the escapist class, or put it up on a blog. It has a dependency on the STL, but aside from the list of aliases at one per line, it's fairly compact.

I'd be interested mostly to

I'd be interested mostly to look at the list of aliases and steal the good ones. If you're OK with that and have time to put it up on one of the code hosting sites or somewhere that would be awesome. :)

Edit:
And also to see how you deal with the age-old subset vs. proper subset symbology question: ⊂⊆⊊⫋.

Hmmm. It's slightly ambiguous from your link...

I don't know whether you meant specifically the set relations symbols or the fact that some symbol aliases are proper substrings of others. The link you gave is an example of both.

The latter problem is simply dealt with by having the trailing backslash. Because aliases in the input stream are delimited by backslashes there's never ambiguity about where an alias ends.

The former, I haven't dealt with very much yet; remember I said I was about to import another hundred symbols from the 'math' section? Well, set relations are some of them. I'm very likely to define aliases for all of those characters, and will later decide which of them (or which few) will be used as the names of standard predicates.

Set relations

I meant the set relations specifically; especially whether ⊂ is permitted as an operator and whether it is bound to subset semantics or to proper subset semantics.
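For concreteness, one common convention, sketched here in Haskell over Data.Set, is to bind ⊆ to subset-or-equal and reserve ⊂ for proper subset; this is just one possible choice, not necessarily what the language above will do:

    module Subsets where

    import qualified Data.Set as Set

    (⊆) :: Ord a => Set.Set a -> Set.Set a -> Bool
    (⊆) = Set.isSubsetOf

    (⊂) :: Ord a => Set.Set a -> Set.Set a -> Bool
    (⊂) = Set.isProperSubsetOf

    infix 4 ⊆, ⊂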

Oh heck, steal as many as you want.

Oh, heck; steal as many as you want. I did. Most of these are stolen, very self-consciously, from XML, LaTeX, and ASCII character tables. I decided to go for complete coverage of ASCII, which may be unnecessary or even fugly, but enables some things. I also pretty much took everything users might be familiar with from XML, with the exception of one that XML got blatantly wrong. In some cases I've added things I think will be easier to remember or easier to interpret when discovered in an ascii-converted program dump, and made them the default form.

Here is the permalink to the source code for the escapist class on my blog: Unicode, some more.

The title refers to the fact that the previous post on the blog was also about Unicode in programming languages, so that may be of interest here too. So here is that permalink.

Critiques welcome; I don't use c++ very often, and I'm getting back up to speed with it as I wrestle with this language implementation. That said, it's a very simple class. The only really obvious thing I couldn't remember/figure out how to do reliably was using the I/O library to read a formatted escape sequence using a hex number. So I did it the brutal way.

Menus for Unicode Characters

Menus are often used for Unicode characters, but you would want to have some way to organize them if you have a lot of characters.

When I learn Chinese

When I learn Chinese writing, I don't remember the strokes for the character, but how it is pronounced, since I'm used to pinyin input methods anyway (as are most people in China!). Something similar applies here.

Emacs has it

Emacs has something very similar in the TeX input method (activate once with C-x RET C-\ tex RET, then use C-\ to toggle). Since the names are the same as in LaTeX, many of the names are likely already known to the user.

I would have preferred a modern space cadet keyboard though.

I do not want to have to type Unicode

For all the let's-make-our-code-pretty-by-using-Unicode, I certainly do not want to have to type arbitrary Unicode. Even if my editor has added support for writing arbitrary Unicode characters, I do not want to have to type in character names by hand or, worse yet, select Unicode characters from menus. Hence I do not welcome attempts to use arbitrary Unicode in code, as seen in Agda and, to a lesser extent, in Haskell. I do not see why anyone thinks this is a good idea - maybe because they are only considering reading code and not typing it.

We read code more often than

We read code more often than we write or edit it, so having it pretty to read might work out better overall. But I do understand not wanting to write unicode by hand on a modern keyboard.

Keyboard macros could probably help quite a bit, though. And there may be new input devices - e.g. based on sketching or gestures - that could enable a wider variety of characters and symbols. When I think about the input devices I'd want for tweaking code in an augmented reality system or smart home, I think keyboards won't make it. But there will always be a role for symbolic representation of code.

You only have to type it once

I think a reasonable compromise is to have keyboard friendly string aliases for the Unicode symbols, which you set up once at definition sites. As in, replace "pi" with the symbol after it's typed.

Seconded

I use the Emacs "LaTeX" input method that lets you write the LaTeX names and get them replaced on the fly by the matching unicode character: \pi, \forall, etc. In my experience, writing this way is as convenient as the existing plain-ASCII syntaxes when they exist.

One downside of those mappings is that there is a learning curve (when you fall outside the character set that you're familiar with from LaTeX), and not all LaTeX symbols map conveniently to unicode characters. Agda-mode has something similar that might be more principled.

We shouldn't be typing at all

I'd rather create programs with a gaming controller, making atomic refactor moves between consistent code states and navigating intelligent picklists. This paper's move towards model-view division of program representation is a great step away from archaic punchcard-looking editors and even more archaic everywhere-serialized representation of code. I would love to discard vim and git for an immersive visual interface over a relational database.

Intelligent pick lists or

Intelligent pick lists or menus are problematic in their own ways: an option you're seeking tends to move around based on context in difficult-for-humans-to-predict ways, thus requiring developers to actually read the lists each time. Intelligent menus only pay for themselves if the option selected is almost always among the first three in the list.

Though, something like that may work if you can somehow narrow down the options in a consistent, high-level manner, e.g. based on buttons or pie-menu directions. But I'm not sure what high-level concepts could be applied across a broad range of code snippets. Maybe something library-specific could be achieved.

In the age of touch, why

In the age of touch, why would the option have to be one of the first....three?

Touch is among the worst

Touch is among the worst interfaces for pick lists, intelligent or otherwise. The hand easily covers half the list, interfering with the reading thereof. The list covers a portion of the screen, interfering with contextual knowledge of where you're applying the option. The human visual focus is much smaller than the human hand at arm's length, so we inevitably end up covering with our hands exactly the things we're trying to read or influence. IMO, touching the same screen used for visual feedback isn't the brightest of HCI concepts, though it works well enough for low rate human-to-computer communication. (If we added some extra haptic feedback, e.g. touch screens that raise fluid buttons, this could be mitigated.)

Assuming you have abundant visual and haptic space for presenting as many options as you could ever want (e.g. using Oculus Rift for AR plus Myo or Leap for gestures) the remaining issue with 'intelligent options' is that they're intelligent. To a human, they're unpredictably located, and therefore inaccessible to muscle memory, and therefore require extra use of sensory organs (usually the eyes) to scan among the options available.

Scanning options costs time and mental fatigue. And these costs grow super-linearly with the amount scanned before making a choice. (The super-linear nature is due to making more comparisons between choices.)

Meanwhile, humans have excellent spatial memories and spatial processing abilities. With spatial memory, we can handle a stable 'organized chaos' quite well. With spatial processing abilities, we can handle 'automatic layout' fairly well (so long as it's simple). These work well with the sense of proprioception and the so-called 'muscle memory', which are both internal to the human and thus don't require scanning the environment. The mental fatigue of muscle memory is quite low, and largely subconscious.

Ultimately we must weigh the alleged benefits of 'intelligent' options against the benefits of 'dumb but predictably spaced' options. I've explored this some in my day job, albeit in terms of 'context-dependent' options. My informal user-studies suggested that context-dependent options are okay for 3-5 options, fewer if the options are relatively complex (require more time to scan and grok). This isn't to say that ALL hits must be within the first three, but the vast majority should be.

There are ways to mitigate issues with intelligent lists. E.g. if you can rapidly narrow down the list based on another input that doesn't involve looking at the options (perhaps voice or a secondary touch model) then it becomes easier to reduce the options to just a few for active scanning. OTOH, the same methods can work for narrowing items in a predictable automatic layout.

Picking from a list is better than touch typing?

I do not like the idea of having to search through lists to find items or, for that matter, relying on an "intelligent" system to guess for me what it thinks I might want; I, being a touch typist, much prefer being able to directly type what I want, where I do not even have to think much about the individual keys I am pressing. (In editors like Eclipse, I do not appreciate the menus they pop up, because searching through those menus is more of a hassle than simply typing, and the menus themselves often cover parts of the screen that I am trying to read as I type.)

This even applies to editors that attempt to be "intelligent" and make my life easier by adding extra characters ahead of time for me or dynamically changing what individual key-presses actually do, such as Eclipse. I often find myself deleting extra characters and reordering characters with such editors, and trying to read ahead of my typing to see just how the editor is going to try to "help" me is an unnecessary hassle; if I actually wanted what the editor thought I wanted I would have typed it in the first place! I want my editor to only handle indentation and syntax highlighting for me, and to let me type what I want rather than guessing for me what it thinks I might want ahead of time.

Re: Picking from a list is better than touch typing?

Touch typing has a lot of nice properties, but it also has a few disadvantages - e.g. it's hard to touch type without also having a place to sit and a rather bulky keyboard. And for component-based programming, grabbing from a list is often more convenient than rewriting the component or memorizing a GUID or similar.

Even though picking from a list is an awful interface for tablets and smartphones, it is often more convenient than hauling around a keyboard.

In an augmented reality setting, we might be able to project a keyboard whenever we need one, but component-based visual programming is also more feasible (due to larger virtual workspaces and more readily accessible component inventories).

Anyhow, if I were limited to the traditional desktop setup and relatively free-form text, then I agree with your observations. I've experienced similar troubles with auto-completion.

Bluetooth keyboard, cheap, fits in pocket.

I got one of these:
http://www.amazon.com/Bluetooth-Wireless-Keyboard-Android-Smartphone/dp/B0043862N4/ref=sr_1_2?ie=UTF8&qid=1374970254&sr=8-2&keywords=mini+bluetooth+keyboard

It fits in your pocket with the phone, and it's far less miserable than typing on a touchscreen. Being a thumbboard, it isn't as nice as a full-size keyboard, but it's good enough to make typing into a non-problem for most purposes.

I would far rather use a device like this than trying to program by picking stuff from lists. I can thumb text in without thinking (or while thinking about the next problem), so being slower than a regular keyboard isn't a problem unless my hands fall far behind my mind, which with programming doesn't happen all that often.

Bear

Program with a gaming controller you say?

I'd rather create programs with a gaming controller,

Back when the 6502 was king...

http://www.flickr.com/photos/25885309@N02/sets/72157604661612578/