Arc is released

Make of it what you will, but Arc is now officially released.

This part of Graham's announcement is a gem:

I worry about releasing it, because I don't want there to be forces pushing the language to stop changing. Once you release something and people start to build stuff on top of it, you start to feel you shouldn't change things. So we're giving notice in advance that we're going to keep acting as if we were the only users. We'll change stuff without thinking about what it might break, and we won't even keep track of the changes.

I realize that sounds harsh, but there's a lot at stake. I went to a talk last summer by Guido van Rossum about Python, and he seemed to have spent most of the preceding year switching from one representation of characters to another. I never want to blow a year dealing with characters. Why did Guido have to? Because he had to think about compatibility. But though it seems benevolent to worry about breaking existing code, ultimately there's a cost: it means you spend a year dealing with character sets instead of making the language more powerful.

I love Alan Perlis' epigram

I love Alan Perlis' epigram #19: "A language that doesn't affect the way you think about programming, is not worth knowing."

There certainly seem to be interesting new variants of Lisp (here's a worthy-sounding avenue of exploration) but from the tutorial it's not obvious what makes Arc one of them. So what makes Arc worth knowing?

Bah

[[I love Alan Perlis' epigram #19: "A language that doesn't affect the way you think about programming, is not worth knowing."]]

Bah, this is obviously an 'ivory tower' remark: Java and D don't much affect the way you think about programming compared to C++, yet they're worth learning, the first because it may get you a job, the second because it's an improvement over C++.

Mu

[When I started this comment, I disagreed much more with you than I do now :)]

Java and D don't much affect the way you think about programming compared to C++,

I agree that neither Java nor C++ affects much the way I think about programming in general, and this is how I interpret "Perlis' Razor". (I left out D because I have not done anything substantial in it.)

However, note that (t)his view is philosophical, as it implies that only expressivity counts. So, yes, this is perhaps indeed what you would call an "ivory tower" remark. But that means your economics argument ("yet they're worth learning, the first because it may get you a job") is orthogonal to the discussion, and it only muddies the waters here, really.

(As an aside, I do design and program quite differently in Java than in C++ (and other languages, too), so they very much affect the way I think while programming.)

I wish Perlis had said "concept" instead of "language". I don't think there is really any programming language in particular that is worth knowing, but rather concepts like (in no particular order) iteration, recursion, closures, objects, condition systems, continuations, monads, macros, all sorts of polymorphism, types, concurrency, etc. All of these affect how I think about solutions when shaping them.

Software has no price...

...when your time has no value.

Keeping track of the changes

I like Paul and his writings, but I have to admit that I really cringed when I read "and we won't even keep track of the changes". Unless, that is, his goal is to make sure that nobody ever uses Arc, in which case he's doing the right thing ;-)

The language spec, which I have yet to fully read, has a few things I like (such as a neat, concise syntax for a kind of lambda expression) and some things I didn't like (such as the automatic indexing of data structures when you use a data structure as a function). Overall, though, I see little to get excited about. It's a more concise dialect of Lisp, but that's about it AFAICT. But time will tell.

Spec

The language spec, which I have yet to fully read, has a few things I like...

Do you mean the tutorial?

Yup.

No hygiene

According to the announcement, Arc's macros are not hygienic (no surprise), but it also has no module system. How does Arc avoid hygiene problems?

With symbol generation

Or maybe the answer is "what problems?" Here's a quote (taken from here) from Paul Graham:

Classic macros are a real hacker's tool-- simple, powerful, and dangerous. It's so easy to understand what they do: you call a function on the macro's arguments, and whatever it returns gets inserted in place of the macro call. Hygienic macros embody the opposite principle. They try to protect you from understanding what they're doing. I have never heard hygienic macros explained in one sentence. And they are a classic example of the dangers of deciding what programmers are allowed to want. Hygienic macros are intended to protect me from variable capture, among other things, but variable capture is exactly what I want in some macros.

gensym isn't enough

This is a classic misunderstanding of macros and hygiene. There are cases where gensym is simply not expressive enough to do what hygienic macros can do. If I have a macro that expands into

(list 1 2 3)

and someone else wants to use it in a context where list has been shadowed, there's nothing the client can do, and nothing I can do, to get the macro to work.

"So what?" you say. Maybe you write your macro only to expand into fully-qualified module-resolved names that--depending on what language you're using--may not be shadowable. Or maybe you just expand into obfuscated names that nobody's likely to shadow... Nice try, but that doesn't work when you write macro-defining macros like define-inline.

Paul makes two points that are true: 1) Sometimes you want to capture. True, but anyone familiar with real hygienic macro systems knows that this is perfectly possible; you can selectively capture with hygienic systems like syntax-case. 2) Nobody knows how to explain hygiene well. This is definitely an issue, in part because as a research community we're still getting a handle on the subject. (My forthcoming ESOP paper with Mitch Wand is work in that direction.)

Dear Mr. Graham,

I have never heard hygienic macros explained in one sentence.

An hygienic macro is one where the meanings of symbols that aren't parameters to the macro are bound at the definition site rather than the expansion site.

-------------

Okay, a bit flip, and there is real complexity in explaining what any particular hygienic system promises and what it doesn't. So maybe he was saying that he's never heard any specific hygienic system explained accurately in one sentence.

I wonder why he would be opposed to a macro system that supports selective capture. I mean, besides the explanation bit.

James' definition of hygiene

An hygienic macro is one where the meanings of symbols that aren't parameters to the macro are bound at the definition site rather than the expansion site.

Very nice.

An hygienic macro is one

> An hygienic macro is one where the meanings of symbols that aren't
> parameters to the macro are bound at the definition site rather than the
> expansion site.

That's a *very* LISP-based definition of hygiene.

I think part of the problem with hygiene is that in common usage it conflates two entirely separate concepts. Put simply: hygiene says there shouldn't be unintended variable capture (where "unintended" varies from language to language); referential transparency says that variables in a macro that are lexically bound in that scope should remain lexically bound to that scope even when they're inserted elsewhere.

If you want to see an approach to these issues that differs from the traditional CL/syntax-case one, have a look at my Converge language. I don't know whether Converge's approach is the best possible approach, because ultimately this is a language design issue more than it is a fundamental theoretical issue.

parameters to the macro

That notion is commonly used as a sort of best-approximation heuristic for understanding hygiene, but it's more like a rule of thumb than a specification. In fact, the notion of "provided as a parameter to the macro" isn't even well specified. For example, let's say I want to write an unhygienic loop macro:

(loop (begin
        (when (prime? i)
          (break i))
        (set! i (+ i 1))))


Why's it unhygienic? By your definition, it's because break is captured by the macro, but it's not provided as an argument to the macro. So this is impossible to write in a totally hygienic macro system like syntax-rules, right?

But! The body of the loop is an argument to the macro. So you can't really claim that break wasn't provided as an argument. It's just buried somewhere inside the body. If the macro can find an instance of the identifier break in the body, it can copy that identifier into a binding position, thereby "hygienically" implementing an "unhygienic" macro. (Deriving ⊥ is left as an exercise for the reader.)

This isn't an academic point -- the loop macro was thought to be impossible to implement in syntax-rules until Al Petrofsky showed how in 2001.

I claim there is a more precise but still relatively simple intuition for what hygiene means:

Hygienic macro expansion preserves α-equivalence.

One sentence, even!

Don't get me wrong -- making this formal and precise is still hard. The tricky part is defining precisely what α-equivalence means for programs with macros. See my aforementioned paper for a formal treatment of the subject.

But intuitively, the idea of "intended" vs. "unintended" variable capture has to do with the binding structure of a macro. If you can make that intention explicit by providing a specification of the binding structure of your macros, then you can actually define α-equivalence and provide a correctness criterion for hygienic expansion. Without such specifications, we are left with heuristics. Essentially, what existing algorithms do is dynamically (i.e., at expansion time) infer this binding structure.

+1

PG on hygiene

I think it is fair enough to say that implementations of hygiene are not trivial, but there is a tradeoff. Best practice in CL for macro writing results in macro definitions that are generally significantly more obtuse and verbose than comparable macros in Scheme (possible with syntax-case), and which still carry some subtle hygiene risks. Will Clinger has opined that CL does not run into problems with hygiene because of its package system (he was one of the architects of the CL package system). The tradeoff seems to be between ease of understanding the implementation on the one hand, and simplicity and brevity of code on the other.

What PG has written so far does not reassure me that he has not made a mistake here. I'd like to know more about how much code has been written in Arc, and whether Arc faces problems with macros in practice that CL does not.

Felleisen on toys

Felleisen blogged about this preëmptively over at the PLT Scheme blog almost a year ago. If you accept his premise...

1. The macro system of Common Lisp (and, by extension, Arc) is totally lame compared to that of PLT Scheme. [I am paraphrasing here, see his post for the exact wording.]
2. Because we support macros-as-abstractions, implementing classes, mixins, and traits as macros is not only feasible, it's a joy. Indeed, implementing an entire language such as Arc is doable, and is not just a toy (as it would be if implemented in a primitive macro system).

... then the inescapable conclusion is that Arc's own macro system is too feeble to implement Arc itself. Pretty amusing, if you ask me.

What's even more amusing is that Arc's current implementation does not seem to use any of the Scheme macrology:

$ curl -O http://ycombinator.com/arc/arc0.tar
$ tar xf arc0.tar
$ grep -cE 'syntax-(rule|case)' arc0/*.scm
arc0/ac.scm:0
arc0/as.scm:0
arc0/brackets.scm:0


Backwards Compatibility

Languages aren't the only things that suffer when they have to remain compatible with previous versions. I'm sure that many applications have had their ability to innovate stifled by having to consider the release already out on people's machines. It will be interesting to see just how painful it will be to migrate from one version of Arc to the next, and whether the frequent changes were worth the cost.

Early alpha

This is clearly not meant to be the final answer. From the site:

Number one, expect change. Arc is still fluid and future releases are guaranteed to break all your code. In fact, it was mainly to aid the evolution of the language that we even released it.

ASCII Only?

Doesn't good language design have more to do with learning from the weaknesses of other languages than with coming up with cool new ideas?

Does it really bode well for a language to be announced with a quote about someone else's character-support mistake while proudly supporting only ASCII? (Isn't this a little like the American electoral tendency to vilify learning and wisdom?)

Unicode isn't perfect, but can anybody argue it's a bad thing? Haven't we had Unicode libraries and support long enough that implementation is no longer a significant issue? Can anyone really claim that supporting only plain ASCII when you're building a language from scratch has any virtue (except maybe laziness and impatience)?

Not offended, but certainly not interested

I realize that supporting only Ascii is uninternational to a point that's almost offensive, like calling Beijing Peking, or Roma Rome (hmm, wait a minute). But the kind of people who would be offended by that wouldn't like Arc anyway.

I'm not offended by his choosing to support only ASCII, but the kinds of apps I'm writing often deal with non-English data. It isn't political incorrectness that would keep me from ever using Arc, it's that it is intentionally designed to be the wrong tool for my problem domains.

(What kind of people does that make me?)

As an aside...

English isn't the only language which refers to foreign locales by names other than what the locals call 'em. The French refer to Germany as "Allemagne". The Chinese call themselves "zhong guo" (middle kingdom), and the US "mei guo" (a sort of transliteration of "America"). And "Peking" isn't really incorrect; it just is a spelling that uses a now-deprecated system of romanizing the Chinese language.

I suspect that Arc not supporting Unicode may relate to the difficulty of case-folding semantics in many other languages, which have nastier rules than English does. Most LISPs are case-insensitive, so Unicode support would either have to break with a longstanding LISP convention to be done cleanly, or be messy and nasty. Unicode identifiers and normalization are a nasty enough problem as it is.

and I should mention...

that the Germans, of course, call their own country "Deutschland". Just to be PC and all. :)

Is ASCII good enough for *anybody*?

the kinds of apps I'm writing often deal with non-English data.

I'm English. I live in England. I usually only develop websites and applications used by English people living in England. ASCII isn't even good enough for me because it doesn't include the currency symbol used by English people in England, '£'.

Even if you only care about Anglophones, ASCII still isn't suitable.

One should remember what ASCII stands for:

American Standard Code for Information Interchange.
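The point is easy to verify: '£' (U+00A3) sits just outside ASCII's 7-bit range. A quick Python sketch (Python used purely for illustration here, not Arc):

```python
# The pound sign is beyond the 0-127 ASCII range, so an
# ASCII-only string type simply cannot represent it.
text = "Price: £5"
assert ord("£") == 163

try:
    text.encode("ascii")
except UnicodeEncodeError as e:
    print("not representable in ASCII:", e.reason)

# One extra bit is enough: ISO-8859-1 (Latin-1) and UTF-8 both handle it.
assert text.encode("latin-1") == b"Price: \xa35"
assert text.encode("utf-8") == b"Price: \xc2\xa35"
```

So even a strictly Anglophone application breaks on ASCII the moment it prints a price.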

Extended ASCII?

I'm a little surprised, because I naively assumed that pg meant extended ASCII -- which does indeed contain the £ character. But it seems I'm wrong, as this is the result of a simple test:

arc> ("1£2" 1)
#\?

I'm curious why it doesn't at least support code page 437 or ISO-8859-1. Since those require only one more bit and have fewer gotchas than UTF-8 (correct me if I'm wrong), it doesn't seem like they should be so difficult to support.

ASCII Only?

I can sympathize with that decision, as it avoids bloating a language specification with the mess that is Unicode. For example, the specification of "valid Unicode string" is too complex to express in most type systems, whereas a valid ASCII string is simply an array of ASCII characters. There's also the question of encoding: Do you expand all characters to 32 bits using UTF-32? Even then you still have to deal with combining characters, etc., so that one UTF-32 element doesn't coincide exactly with one character. Or do you use a UTF-8 style encoding, where the notion of 'character' is ill-defined, since a character may require multiple UTF-8 elements? Even if those problems didn't exist, Unicode doesn't define a mapping to glyphs, so yet another layer of translation is required to manipulate or display them.
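The encoding trade-offs above can be made concrete. Here is a minimal sketch in Python (used purely for illustration, not Arc or Scheme): even fixed-width UTF-32 doesn't make one encoded element correspond to one user-perceived character, because a combining sequence spans several code points, while UTF-8 is variable-width even for a single code point.

```python
precomposed = "\u00e5"   # å as a single code point
combining = "a\u030a"    # 'a' plus COMBINING RING ABOVE; renders the same glyph

# UTF-32 is fixed-width per code point...
assert len(precomposed.encode("utf-32-be")) == 4
# ...but the combining sequence still occupies two 32-bit elements:
assert len(combining.encode("utf-32-be")) == 8

# UTF-8 is variable-width even for one code point:
assert len(precomposed.encode("utf-8")) == 2
assert len("a".encode("utf-8")) == 1
```

So whichever encoding a language picks, "one element = one character" fails somewhere, which is exactly the complexity the parent post is pointing at.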

Except...

In this case Paul Graham doesn't need to decide anything, as the infrastructure he uses, PLT Scheme, already has a full Unicode implementation. In fact given that PLT Scheme supports Unicode, and Paul states “For example, Arc's read is MzScheme's” he must have done extra work just to break Unicode support!

MzScheme supports unicode in

MzScheme supports Unicode in one way, but not necessarily the best one: it represents strings as sequences of code points, so glyphs made up of several combining characters will be represented as several Scheme characters. Which is kind of painful -- look to comp.lang.scheme for much gnashing of teeth as R6RS made the same decision.
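Python strings are also sequences of code points, so the same pain is easy to demonstrate there (illustrative sketch only, not MzScheme code):

```python
# One glyph on screen, two "characters" in the string:
s = "e\u0301"            # 'e' followed by COMBINING ACUTE ACCENT; renders as é
assert len(s) == 2
assert s[0] == "e"

# The precomposed form of the same glyph is a single code point:
assert len("\u00e9") == 1
```

Any string operation that indexes or slices by code point can therefore split a glyph in half.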

I think this is exactly what Graham is getting at: hacking up _a_ unicode implementation is easy enough, but developing the best one requires lots of work, and the consequences of prematurely standardising on an inferior approach are unfortunate.

Factor..

.. has recently added support for 24 bit strings capable of representing all unicode code points. Details here.

Seems like a decent way to do this to me. Also, another contributor is writing a unicode library.

But the language is 4 years old.. hmm.

The mess that is Unicode...

...is unfortunately, necessary. At least much of it is.

Like it or not, there are many thousands of useful symbols used around the world, many of which look similar but have different meanings. Different locales have different conventions for case folding, string sorting, and what constitutes a "letter" -- ch, ll, and rr are all considered single "letters" in Spanish, whereas in English they are digraphs. Then there are "backwards" scripts such as Hebrew, bidirectional scripts, and vertical scripts. I'm sure you're all aware of this.

(Though I'm curious: How does Epic handle internationalization of its games?)

Unicode is messy because natural language is messy and political. It may not be an optimal solution, but it is a satisfactory one; given the politics inherent in representing the world's natural languages (and the symbolic languages of many non-computer problem domains), I'm not sure that an optimal solution is possible. *That* is one wheel that I'm more than willing to have a committee invent once, rather than having to deal with myself. I'm unaware of any clean alternatives to Unicode which don't achieve their cleanliness by discarding whole swaths of human communication.

Many useful things in the domain of real life can't be expressed in toto by the type systems of most languages. That to me is an argument for stronger type systems, not for taking Occam's razor to reality just so it can be more easily coded.

One of the nice things about Unicode--especially at the level that most PLs operate--is that most of the work is already done for you. The Unicode consortium already has lots of helpful suggestions as to what level of normalization should be applied to identifiers, etc.

PG took a decision that

PG took a decision that seems to be unpopular, but I completely understand: multiple character sets should be managed by the application, not by the language.

Languages are complex beasts, and there is no consensus in the world about how to develop multi-language applications. Languages are too cultural to be defined by a Unicode committee. I'm grateful Unicode created some standards, but some of the decisions they took define a way of working (call it a framework) that I dislike. Sometimes I would prefer to manage the mess of having multiple languages myself.

Obviously, if you think PG's point of view is wrong, I recommend you not use Arc in your next multi-language development, but I'm sure that if you want to develop an English and Chinese application it will be much more helpful to have some Chinese programmers on the team than some Unicode experts.

Finally, what happens when languages change? In Spanish, ch and ll have not been letters since 1994 -- we changed the grammar (rr was never a letter). Of course, all of them are digraphs.

Actually...

...internationalization is best handled by a combination of programming languages, the platform (including components such as the OS, the UI framework/environment, web browser, or whatever else you have), and the application. Mostly the middle part.

Languages should provide means of representing characters (symbols), and strings thereof. If you provide a means of representing characters, the encoding should be defined (or be parametrically variable)--being agnostic on this point is dangerous. Certain operations on strings, likewise, may need to be provided by the language for its own purpose.

I'll wholly agree that a "core" programming language (i.e. a set of constructs sufficient to define any program concisely--excluding domain-dependent or higher level services) shouldn't need to worry about how to render Chinese or Arabic properly. That sort of stuff can be handled by other parts of the software stack (though the application shouldn't do it--it's specialized common code that should be shared by whatever apps run on a given platform).

But Graham could have (and should have, IMHO) undertaken the basic steps to make Arc Unicode-friendly. Unicode is here to stay--the debate as to its merits was settled years ago. Program readers should understand UTF-whatever. The storage for a single character should not be limited to 8 bits. Perhaps Arc won't have problems migrating in the future, I don't know.

But PL support for Unicode isn't difficult. It really isn't. And much of the code and tables you need to do it can be downloaded for free here.

Pity that Arc isn't, and apparently won't be, on this list

And I respectfully disagree that there is "no consensus" on how to develop multi-lingual applications. Perhaps there isn't unanimous agreement, but if you write apps for Windows, there *is* a way to do it. 'Twas dictated FTMP by Microsoft, but there it is. Likewise in the FOSS world. Likewise in the Java software stack. Likewise in the browser/Javascript environment. International software is a reality today. It works. And on every modern platform you can think of where internationalization of a global nature is found (i.e. not just Western European stuff, not just the local language + English)--you will find Unicode.

A programming language of a modern design which ignores Unicode is like a PL of twenty years ago deciding to ignore ASCII. Bad move.

I'm surprised by this interest in low-level thinking...

...it seems the problem is the limitation of 8 bits of storage for a single character.

Internationalization is not a technical problem, it's a cultural problem.

I'm not interested in how a single character is stored. I want enough data types to develop multi-language applications without being limited by the Unicode framework: too heavy, too complex, and difficult to develop with. Having Unicode-only strings is something I'm not interested in.

For developers working on applications in Western languages, there is no problem developing an application in English, French, German, and Spanish without Unicode. Cyrillic languages (and modern Greek) are slightly more problematic, but mainly because the programmer has no idea of their alphabet.

As soon as you move to Semitic languages, things get incredibly harder. The right-to-left nature of these languages creates some problems Unicode is not able to solve (and should not), and the ligatures of Arabic script are a problem not yet solved by anyone in the computer science/engineering world.

My only experience with Asian languages is Korean, a fairly easy language to work with. But I think Asian speakers are not completely happy with Unicode.

Unicode gives me a single character set. This is enough for me. I'm happy with a UTF-whatever data type so all my applications can interoperate.

Unicode support at the library/application level is the right equilibrium between usefulness and complexity. Moving Unicode into the language definition creates a lot of problems I'm not interested in solving.

And no, I don't think it is a programmer's right to be able to write diacritical marks in source-code variables if you're Italian.

Lions, Tigers and Bears -- Oh My!

...if you want to develop an English and Chinese application it will be much more helpful to have some Chinese programmers in the team than some Unicode experts.

This is something of a strawman -- you don't need "Unicode experts" in either case. If you use Unicode, you just need programmers that understand a little.

Internationalizing an application by hiring staff from all over the world for each version does not strike me as a good use of time -- or talent. Most of your core logic is the same in the English and Chinese versions (as well as the Korean and Gaelic and Gothic versions). So these "Chinese programmers" would end up being translators, unless there were some serious programming problems introduced by the Chinese character set -- the kind of problems that might come up if your language did not have pervasive Unicode support.

The mess that is Unicode...

I agree that international language support requires a standard with a large character set with far more complex handling of case, directionality, display, and order than ASCII.

But Unicode ought to have defined an unambiguous encoding, rather than allowing different strings of e.g. UTF-32 codes to produce the same character. IEEE 754 made a similar mistake in defining +0 and -0 as "distinct but equal" values. These specs are incompatible with extensional equality.

Also, Unicode doesn't define any sort of mapping to glyphs, which to me is one of the important things such standard ought to do.

Many useful things in the domain of real life can't be expressed in toto by the type systems of most languages.

This is true, but character strings are a simple enough domain that they could be precisely characterized by simple types. So, it really sucks that the standard defines it in such a way that this can't be done.

To answer your question: We store all text as UCS-2, and went to great effort a while back to move everything from ASCII to 16-bit Unicode, right before the consortium realized that 16 bits wasn't enough. Nowadays, Windows supports neither UTF-32 nor UTF-8, so we're stuck with 2X bloat yet still not enough bits to represent all characters directly.

Tim wrote:But Unicode ought

Tim wrote:

But Unicode ought to have defined an unambiguous encoding, rather than allowing different strings of e.g. UTF-32 codes to produce the same character. IEEE 754 made a similar mistake in defining +0 and -0 as "distinct but equal" values. These specs are incompatible with extensional equality.

I'm assuming you are referring to the issue of precombined forms, such as à, vs. combining a with a grave accent modifier? To me, obviously, the use of combining forms ought to be preferred (they're more flexible); but many character sets that Unicode has to be backwards compatible with (Latin-1, for instance) have precombined forms.

Fortunately, even though two Unicode strings (encoded in the same representation) aren't necessarily bitwise-comparable, they are normalizable; the normalized forms can be bitwise-compared.
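Normalization is exactly what rescues comparison here; a small sketch using Python's `unicodedata` module (for illustration only):

```python
import unicodedata

precomposed = "\u00e0"   # à as one code point
decomposed = "a\u0300"   # 'a' + COMBINING GRAVE ACCENT; renders identically

# The two strings render the same but are not bitwise equal:
assert precomposed != decomposed

# After normalizing both to the same form, they compare equal:
assert unicodedata.normalize("NFC", decomposed) == precomposed
assert unicodedata.normalize("NFD", precomposed) == decomposed
```

NFC collapses combining sequences to precomposed code points where possible; NFD does the reverse. Either form works for comparison, as long as both operands use the same one.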

I won't get into +/-0 in IEEE754; 'tis beyond my numerical chops to comment on that.

Also, Unicode doesn't define any sort of mapping to glyphs, which to me is one of the important things such standard ought to do.

Unicode does give "suggestive" glyphs for printable code points, but even coming up with an "idealized" glyph for a code point has to take markup into account. Font specification and rendering is a nontrivial problem in itself; a glyph for a 12-point font at a given DPI generally won't look good when blown up at a higher point size. You want a different glyph. Scalable fonts do a lot more stuff than just expand glyphs, after all.

To answer your question: We store all text as UCS-2, and went to great effort a while back to move everything from ASCII to 16-bit Unicode, right before the consortium realized that 16 bits wasn't enough. Nowadays, Windows supports neither UTF-32 nor UTF-8, so we're stuck with 2X bloat yet still not enough bits to represent all characters directly.

Are you using characters outside the basic multi-lingual plane (or that otherwise can't be represented by a single 16-bit number in UTF-16)? Given the storage demands of multimedia elements within a modern videogame, I'm surprised that two bytes per character of text seems to be regarded as an extravagance. I do embedded systems work with a fraction of the resources that you have available; yet UTF-16 works fine for what I do. And besides--complaints about mandatory UTF-16 on your development platform of choice ought to be taken up with MS, not the Unicode committee. (What do you use on non-Windows platforms like the PS3, Linux, or OSX?)
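The "outside the BMP" case is easy to demonstrate: in UTF-16, any code point beyond U+FFFF takes a surrogate pair, i.e. two 16-bit code units. A Python sketch, purely for illustration:

```python
ch = "\U00020000"        # a CJK ideograph outside the Basic Multilingual Plane

encoded = ch.encode("utf-16-be")
assert len(encoded) == 4  # two 16-bit code units, not one

high = int.from_bytes(encoded[:2], "big")
low = int.from_bytes(encoded[2:], "big")
assert 0xD800 <= high <= 0xDBFF   # high (lead) surrogate
assert 0xDC00 <= low <= 0xDFFF    # low (trail) surrogate
```

So a 16-bit-unit representation can encode everything, but only by giving up the fixed "one unit = one character" property.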

Performance.

If you do a lot of text processing, Unicode support blows your performance -- regexp matching etc. gets VERY slow compared to extended-ASCII handling.

Also, according to the people that live in the affected areas (Asian countries), you need shift handling anyway.

shift handling

What do you mean, "shift handling"? You mean, handling upper case and lower case Chinese characters? I don't understand what that means.

I Guess it Scratches Paul Graham's Itch...

...so, good for him. But I certainly don't see how this advances Lisp at all.

Agree

I couldn't have put it any more clearly than you just did.

Also agree

I was hoping that he would come up with a radically different macro system, but I suppose that hasn't happened yet (or ever will, given that Paul doesn't seem to like hygienic macros).

I will admit that the negation syntax and the function-composition syntax are nifty, but really it seems to me that all of Arc's features have been implemented in SRFIs, some for more than half a decade. Every other feature that isn't in a SRFI could be implemented even with define-syntax without too much difficulty. Similar arguments could be made in favor of CL over Arc.

More power to Paul, but if I wanted a new Lisp to replace Scheme, I would use Qi. (is that considered a Lisp dialect?) But in both cases, I'd rather use Haskell anyway.

Another thing that I'm not huge about is the lack of an official language standard. I hope it doesn't end up like Perl or Python, with only one interpreted and/or bytecode compiled implementation without any practical alternatives. But as others have pointed out, this situation might be ideal for a "batteries included" language.

Inertia

Another thing that I'm not huge about is the lack of an official language standard.

I think a language standard counts as "forces pushing the language to stop changing"...

I note that PG's Six Principles for Making Things is a sort of follow-up to this release notice. It doesn't really help me see what needed solution Arc provides.

I figured that Paul would

I figured that Paul would say something like that. :)

I suppose that his reasoning makes sense, but it sounds a bit frightening for programmers writing large programs in his language (as far as compatibility is concerned). Not to mention that if his implementation isn't a success, the language will languish.

Though, to be fair, if it ends up not having many libraries, presumably Arc programmers could leverage PLT scheme libraries. And I do think Paul Graham is extremely committed to his language, so I think he will do whatever it takes to make it succeed.

Ahh, the burdens of supporting a userbase.

Must be the reason that the best programming languages (or, to quote Bjarne Stroustrup, the ones that nobody complains about) are those which nobody uses.

Alpha releases are all well and good. Arc will be better off if Graham considers the feedback of users--and it might even have been better off had he released it earlier; inertia is a barrier to improvement (or at least to change; not all changes are improvements).

But at some point, the language will have to stabilize (and have changes follow a disciplined strategy) if it is to succeed. (What "success" will mean for Graham, I don't know).

Any good summary yet...

of the differences between Arc and other Lisp dialects? In what way is it an improvement (for some tangible class of applications--changes which are globally better are unfortunately rare) over CL or Scheme? Looking at the tutorial, some surface differences (e.g. if instead of cond) stand out... but that's hardly compelling.
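To illustrate one of those surface differences: per the Arc tutorial, Arc's if takes alternating test/expression pairs, flattening what CL writes with cond — roughly:

```lisp
;; Arc: clauses are flattened, no per-clause parens
(if (odd n) 'odd
    (even n) 'even
             'huh)

;; The CL equivalent
(cond ((oddp n) 'odd)
      ((evenp n) 'even)
      (t 'huh))
```

A real saving in parentheses, but a cosmetic change rather than a semantic one.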

The main advantage (or possible advantage) which comes to mind for Arc is a (as yet) single implementation (something which has greatly enabled Python, Perl, and similar)--as opposed to multiple implementations with known compatibility issues, as is the case of both CL and Scheme. But that assumes that a) the language attracts users to begin with, and b) Graham has the technical and organizational skills to permit the language to evolve gracefully and serve the needs of its users. The technical skills, I've no doubt. The organizational skills...that is the big reason why guys like Linus Torvalds, Guido, and Larry Wall have succeeded in leading wildly successful projects, whereas other projects that are arguably technically superior have failed. It's why Jimbo Wales was the right guy to lead Wikipedia and Larry Sanger wasn't, despite the latter's far superior academic credentials.

If Arc is to succeed, Graham must, at some point, let it go. And that might be the hardest challenge for a brilliant guy who is a well-known perfectionist with a well-chronicled dislike for compromise.

Maybe I'm new here...

Here's why Arc is great: because we're all talking about it here and so is everyone else. It's great because I am excited to try it. Can you remember the last time you felt that way about a dialect of Lisp? Was it recently?

Unicode vs. ASCII:

I felt like it was pretty clear that PG put out this version of Arc with the idea that it was *not* finished and further that it was a way to flesh out programs quickly and do some exploratory programming. The combination of those two things makes it not having Unicode support a non-issue for me. I don't prototype a program to support 20 languages. I don't expect something that is brand new and actively stated as being unfinished to be finished and polished. Why does everyone else? And finally, if you want Unicode support so bad, get to work, PG has already said on the Arc site that he will roll your changes into the next version. What more could you want? If you're a Lisp programmer are you really telling me that you expect someone else to do all the dirty work for you? Isn't the point of Lisp that you get to do your own dirty work and at 10 times the speed?

But it doesn't even slice bread!

I felt like it was pretty clear that PG put out this version of Arc with the idea that it was *not* finished
...
I don't expect something that is brand new and actively stated as being unfinished to be finished and polished. Why does everyone else?

Hey, don't go making sane comments around here! ;) That's a good question, which I was thinking about myself.

The most specific answer in the Unicode case is that Graham's announcement wasn't particularly tactful on this front. If he had said "we don't have Unicode support yet, but <something about the future Unicode strategy>", the response might have been more muted. But really, because of Unicode PCness, he would probably have still had to justify that decision further, which is part of what he did in the announcement. Faced with the certainty of criticism about Unicode support, perhaps a preemptive strike against complainers is as rational a strategy as any.

Slightly more generally, Arc just had so much by way of expectations built up amongst some people, that anything short of a revolution in language design — one which was immediately and obviously recognizable as brilliant and attractive — would have been a disappointment. Plus of course there are a lot of people who just reflexively criticize anything done by someone with Graham's reputation (i.e. tall poppy syndrome).

An even more general problem, which provides a backdrop for both of the above points, is that most people are very bad at understanding the concept of a work in progress. The whole idea of alpha and beta software is still new, historically speaking. Most people don't actually ever create anything significant themselves, so have little concept of the incompleteness, rough edges and slow progress that characterize the development of almost anything non-trivial. When they actually get to see something in an early stage of development, warts and all, they don't know what to make of it, and we see the results of that confusion in the various strange reactions.

That's not to say that there aren't any legitimate questions that can be raised about the purpose of Arc and some of the technical decisions that have been made. I'm a bit suspicious of the non-hygienic macro thing myself, but again, there are definite pragmatic reasons for not trying to solve the macro problem once and for all before releasing a new language. CL is evidence that dirty macros may be good enough from a "worse is better" perspective, especially if you decide you're not going to rely as heavily on macros as e.g. PLT Scheme does.

Arc doesn't seem to be providing (or claiming to provide, afaict) any major technical breakthroughs. The focus seems to be on language usability for a particular type of user. What will make it most interesting, as others have suggested, is if it attracts enough of a userbase to turn it into a "popular Lisp", with a single implementation, which would be quite an achievement.

On the question of "how this advances Lisp", I think it can be seen as similar to the venture capital model of advancing the economy: you create a large number of experiments, exploring the solution space, and although many of the experiments are failures, they can still teach us a lot. There've been plenty of Lisp dialects in the past that haven't gone anywhere but are interesting to study for various reasons. How else is Lisp going to advance? (Of course, there's a faction of CL users who think the language attained perfection sometime in the past couple of decades, but they don't read LtU.)

I agree with Scott's point that for Arc to achieve significant popularity, some "letting go" will ultimately be necessary — e.g. I think that a barrier to wider popularity for some Scheme implementations is that they're still controlled in a somewhat cathedral-like way by their creators, even if they're open source; or else the implementations are "too advanced" and not as hackable by ordinary mortals as some of the more popular open source languages. Releasing Arc at this relatively early stage seems like a good step in the direction of letting go, and it seems premature to speculate further on that.

[Edit: I think Raganwald's take on this is pretty good.]

Arc == Total Perspective Vortex

The focus seems to be on language usability for a particular type of user.

Looks like that user would be...

Paul Graham!

That's the stated spec

Yes, he has said that in so many words. Anyone who's surprised by that now wasn't paying attention. The kibbitzers are as responsible as anyone for the perspective vortex in this case.

Quite recently

...because I am excited to try it. Can you remember the last time you felt that way about a dialect of Lisp? Was it recently?

But still, I get your point, you're talking about a possible benefit whereby Arc increases interest in Lisps and Lisp-like concepts in general. That's not totally out of the realm of LtU, it's just less of core interest than programming language theory, design, and implementation. So you won't see quite as much discussion about what role Arc might play in increasing awareness of functional programming, syntactic abstraction, "code as data", etc.

Unicode vs. ASCII: I felt like it was pretty clear that PG put out this version of Arc with the idea that it was *not* finished

Some of the wording in the announcement is a little vague. It's not clear whether he meant "it's just ASCII for now, and that's a good thing since it allows us to get it out the door fast" or "it's just ASCII, and that's a good thing because that's all Arc needs for its long-term design goals and anything else will just be a waste of time." A lot of people are taking it to mean the latter and trying to hash out why that could possibly be a good design decision.

redefined

"Here's why Arc is great: because we're all talking about it here and so is everyone else."

I think you have redefined "great" to mean "this week's new toy".

No, we're not all talking about it. I bet not even most PLT people in either industry or academia are talking about it (there were things like the 2008 Lang.NET symposium going on). Programmers in industry are definitely not talking about it.

It seems to me that the only people who are talking about it are those who have followed Paul Graham's writings over the years and wanted to see what it, the thing, the real deal would look like.

"This is it?" is what some ask. And no, it's not because of the lack of Unicode, which is a total distraction.

Not that big a deal...

Start with a lexically-scoped Lisp-1. Toss in setf, only call it =. Shorten a few other function/special form names. Change the syntax on some binding forms and a few commonly used control structures. Mix in non-hygienic macros and a few items that support web-based dispatch and transformation.

Oh yeah! ASCII only :-).

Wrap in lots of hype, ship.

Almost all Lisp users build/modify their own Lisp environments at one time or another in their lives. In most cases, these forays into the new are not treated like the second coming of McCarthy. Nothing particularly interesting here.

Ya know

second coming of McCarthy.

Too bad. If it was, he might have a chance to meet himself.

What have other languages done?

What have other languages done in the last years that has nothing to do with types?

Well...

there's concurrency... mobile code... proof-carrying code... stronger and better module systems... metaprogramming... increased use of deductive rather than inductive reasoning...

Lots of interesting directions in PLT research besides type systems.

And on the industrial side of things, many newer languages have combined the lessons learned from theory with increasingly powerful toolsets. Things like O'Caml, Scala, or F# aren't all that interesting to theorists--but as they help bring new techniques to the masses, they are useful nonetheless.

So yes, many languages have accomplished new things. Right now, the main compelling thing about Arc seems to be the faint glimmer of hope that legions of CL and Scheme hackers (or even the users of one of these) will abandon the several mutually-incompatible versions of both, and re-unite the Lisp community, or some segment thereof, on a single implementation. Other than that, it appears to be another Lisp dialect. Nothing wrong with that, but many people seem to be expecting more from Paul Graham, given the time he took and his reputation.

Er...

Concurrency, mobile code, proof-carrying code, module systems, and metaprogramming are "besides type systems?"

OCaml and Scala aren't all that interesting to theorists?

Are we on the same LtU? :-)

Touche...

to call the things I mentioned "besides" type systems is perhaps a bit of a stretch. All of them are, of course, informed or improved by type theory research.

But they are axes of PL improvement/capability that are interesting even without considering their type-theoretical aspects. Plus, they are all routinely done (and studied) in the context of dynamically typed (or untyped, if you prefer) systems. Erlang, Javascript, Smalltalk, Ruby--oh, and numerous flavors of Lisp--are all interesting languages for various of the reasons listed above, and these are all dynamically typed.

*That* is what I meant.

Occasionally one encounters the (rather provincial IMHO) view that CL and/or Scheme represent the pinnacle of PL research, and that the interest that PL theorists have in powerful type systems and such is ultimately unproductive and uninteresting. The question "what have other languages done in the last years that has nothing to do with types" can be interpreted either as an innocent query as to the fruits of PLT labors, or as a derisive putdown of PLT. I assumed the former in answering the question; I mention the latter possibility to you only in explanation. :)

Regarding O'Caml, etc.... perhaps I got a bit ahead of myself. PLT research *is* done using these, and/or extensions to these. OTOH, much of their value is in moving established ideas from research to production environments. Which is not a bad thing. Of course, the same can be said about Java itself--despite the fact that Java is not a terribly interesting language from a theoretical point of view (I think I'm on firmer ground here); it's a great tool for experimentation due to the infrastructure that exists all around it.

I'll go with that summary.

I'll go with that summary. He might as well have called it Yarc, where RC expands into any noun phrase suggestive of mild underwhelmingness.

Move on -- nothing to see here

This could be a test of how willing geeks are to follow a self-declared guru who does nothing more than what every hacker does all day at home.

Come on... renaming some stuff and adding prehistoric HTML output doesn't make a new language.

What we see here is just the result of a bored millionaire killing some spare time, imho.

!

How to become a bored millionaire.

First, you'll need to go make a million dollars...

(OK, it's an old joke, but I couldn't resist).

too terse

I think many of the functions (particularly the higher-order ones) have really cryptic names, such as "rem", "pos", and "trues" (which I think is filter-map). Using "mac" for a macro when "macro" is already really short seems like overkill, at least to me.

Same thing goes for "def" instead of "define"/"defun". It would have been cool if he had used = instead of "define"/"defun", but then he would have had to either omit or redesign his side-effect functions. Then again, since most Lisps have mutable functions, he could generalize functions AND side effects under the same symbol, which would be kind of neat.
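For reference, here is roughly how those forms look in the Arc tutorial, where = already plays the role of CL's setf and def replaces defun:

```lisp
; Arc, per the tutorial
(= x 10)              ; assignment, like CL's (setf x 10)
(def average (a b)    ; function definition, like defun
  (/ (+ a b) 2))
(average 2 4)         ; => 3
```

So = is already taken for assignment, which is presumably why function definition got its own (terse) name.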