LtU Forum

Help with study of functional programmers

Are you currently developing or maintaining a medium to large-sized
program written in a functional language, such as Haskell, F#, OCaml,
or Lisp? I'm a PhD student doing a study of functional programmers, as part of
a research internship at Microsoft, and I would like the opportunity to look over
your shoulder while you do debugging or coding on your project.

I'm looking for people with at least a year's experience doing
functional programming, and who are currently working on a real
project (i.e. for some purpose other than learning functional
programming). I'd simply come
watch you work, and ask a few questions along the way. You'd do
whatever you would normally be doing. If you're near Seattle or
Portland, I'd come to your office for a couple of hours. If you're
not near Seattle or Portland, then we'd set you up with LiveMeeting
or some other remote screencast software so I can watch you from here.

Obviously security concerns are an issue - I will not share any
proprietary information that I learn about while visiting you.

In exchange for your help, Microsoft will offer you your pick of free
software off its gratuity list (which has about 50 items, including
Visual Studio Professional, Word for Mac, XBOX 360 games) or any book
from MS Press.

We're doing this because expert functional programmers have not been
studied much. We plan to share our findings through academic
publications, to help tool developers create debugging tools that are
genuinely helpful in real-world settings.

I'm hoping to finish my observations by August 8th, so please contact
me immediately if you're interested!

Thank you,

Chris Bogart
425-538-3562
t-chribo@microsoft.com

(Correction 7/18/08: I was mistaken in my original posting: I *can* use
participants in Europe as well as the US)

Lisp-like language for Domain Specific Languages prototyping

A new version of MBase framework is released, along with a DSL building tutorial. MBase is a DSL prototyping/implementation framework for Microsoft .NET - a Lisp-like language with some unique features.

Tutorial:
http://www.meta-alternative.net/calc.pdf

MBase documentation, including some highlights on the language architecture:
http://www.meta-alternative.net/MBase-0.6-doc.pdf

A bigger example - a subset of Paul Graham's Arc language compiler for .NET, a literate program:
http://www.meta-alternative.net/pseudoarc-Jul2008.zip

Project page, where MBase distribution can be downloaded:
http://www.meta-alternative.net/techpreview.html

Creating a markup language compiler

Hi everyone,
I'm undertaking creating a proof of concept for a software idea I have.
The basic idea is to create a language/program/platform/framework in which a developer can write a program that can then compiled (or otherwise made) to run on multiple platforms. Specifically, an instruction set consisting of three instructions/functions would be allowed. A developer (which in this case would be me) would use these three instructions to form a program. This simple program would be compiled so that the program could be executed on one or more different platforms.
One analogy I can think of is, for example, if the platforms were mobile computing platforms. So the instruction set might be

  • Display calendar (parameter 1: date;)
  • Create new calendar event
  • Display warning if new event conflicts with current event

While the most basic and logical program that can be created using these three instructions is in the order above, theoretically, the compiler should be able to take any combination of them and generate program code to run on the selected platform. The selected platform would be one or more platforms from a predefined set such as

  • Windows mobile X.X
  • Palm X.X

At this point, I'm thinking that the language would be a markup language similar to OpenLaszlo's LZX or Facebook's FBML. But I also know my disposition is due entirely to my current involvement in UI development.

What would be the simplest way to create this proof of concept?

A couple of ideas come to mind.

  1. Write a sample program in each of the platforms' native languages
  2. Examine the generated assembly code? Examine the generated machine code?
  3. Play around to try to determine what the output is for a given function in the native language
  4. At this point, great. Assuming I know what code is being generated from a given function/piece of functionality, wouldn't I still have to figure out what additional code is required to make the program function?

Integration:
This, I imagine would be one of the more difficult parts. Figuring out how to take the same program and have a compiler generate multiple versions of it capable of running on multiple platforms.

So, what am I in for? My undergrad is computer engineering. But it was more hardware than software focused. To get this done in the shortest amount of time – I’m trying to get this idea funded so I can hire “real developers” and assume the proof of concept won't survive beyond funding - do I learn:

  • Compiler design?
  • Reflection?
  • Compiler reverse engineering?
  • Other things?
  • Programming language specification?
  • Programming languages (theory)?

Which language is best suited for this? The platforms I’m going to target are nearly all Java based. Is there any open source project whose code I could benefit from learning?
As small as this is in scope (three instructions, 2-3 platforms to run on) this would be the most advanced programming project I’d ever undertaken and am nervous about doing so. But it’s something I believe in as well that believing that I can do anything anyone else can (however that’s a function of time), I’m going to keep going forward.
Any advice is well appreciated and I look forward to keeping you informed on my progress.
Shmuel

In search for a programming language to replace spreadsheets.

Hello,

I'm in the search for a good programming language to replace much of what I use spreadsheets for today:
It should be constructed to easily define constants / sets of data, and calculating new data from other data, using user-defined formulas.

The advantage of a spreadsheet is that it's extremely easy and fast to work with. But a spreadsheet can soon become complex with:
- lots of long, hard-to-read formulas
- the same formula definitions duplicated all over the document
- data and formulas mixed up all over.

I would like a language that can be used like this:

You define a data field to be a function of multiple arguments like:

Amount Income(Company, Month, Year)

You can then derive other fields like:

Amount Income(Month, Year) = sum(Income(Company, Month, Year))
Amount Income(Year) = sum(Income(Month, Year))

You can also define a lot of data:

Amount Income(Company, Month, Year) = [
  ("Microsoft", "01", "2008") => 1234,
  ("Microsoft", "02", "2008") => 15664,
  ("Sun", "01", "2008") => 1564,
  ("Sun", "02", "2008") => 1652
]

You can now ask for

Income("2008")

and get the appropriate amount: (= sum [for all months] [for all companies] Income(Company, Month, "2008"))

I know many imperative languages, but I've only experimented very little with other programming paradigms. I'm thinking of learning a declarative language like Prolog for this .. but are there better alternatives? I like the idea of Haskell being strongly typed, but I don't know if it will be good for this.

The idea is to be able to model very complex systems, while keeping the calculations concise. Another advantage would be that you are able to reuse formulas. Ideally there should be a nice IDE with the possibility of getting a graphical/tabular view of the calculated data.

Have any of you thought of something like this, or even made or heard of something like this?
I would very much like to know your views on this, and if you know any good languages for this problem domain..

Parser Generators Supporting Astral Characters

It's important to me that the programming language I'm working on has full Unicode support, which includes supporting the 'astral' characters from planes 2 through 16.

I'm not very interested in parsing, so I would really prefer to use a parser generator instead of writing a parser myself. However, the parser generators I've looked at only support plane 0, the Basic Multilingual Plane, despite claiming to have "full" Unicode support.

As a side note, they often don't even warn you about this when specifying astral characters in the grammar. They happily accept code points over 0xFFFF and output broken code. The generated code doesn't even check for this stuff at runtime, but rather just behaves unexpectedly. For example, when allowing [0x010000..0x10FFFF] characters in identifiers, SableCC output code that parsed my entire input file as a single gigantic identifier.

So I'm wondering, what popular parser generators support astral characters? Or if there aren't any, what unpopular ones? Am I not going to be able to use a preexisting parser generator to get what I want?

MISC: An experimental LISP-like language

MISC started when I got sick of XML's lack of consistent structure and it's heavy markup. I wanted something consistent, clean and executable like LISP but with a more general basic data structure. This was the initial spark "LISP with maps instead of lists".

  • [if [ 5 10] then:[+ 5 10] else:[- 5 10]]
  • [let '[square:[lambda '[x:1] '[* x x]]]
      '[square 12]
    ]

  • [take 20 [numbers from:0]]

The result is not a language for real-world use but one that I hope embodies many different unique and novel design decisions in a way that triggers thoughts and ideas in those that take the time to play with it. I've written up some of my experiences of designing the language and a fair bit of documentation. It also runs as an applet with some neat inline documentation.



What's different or why you should be interested:

  • Novel LISP-like language
  • Homoiconicity (ie. source code that is data) using maps
  • A great lazy data-language
  • Cool stuff with metadata
  • A metacircular– interpreter and syntax-colourer

Find out a bit more about MISC, and run it directly in an applet.
(NB. All the examples are clickable and use alt-return to execute code)

Of Generics and Erasure and, of all things, GC and memory layout

I'm trying to grok the difference in generics implementation between Java and C++, specifically the implications of type erasure in Java.

Most of the info I'm finding is blogger stuff, so I'm wondering if there's any better-than-most analysis of the issue out there.

This issue also makes me wonder about the impact of having to store type info along with data - a required "vtable pointer" in C# lingo if I get that right - on some traditional memory layout schemes.

Does this put an end to BIBOP like value typing schemes, if many small words are type parameterized? Box[T], Pair[T,U] and so on???

Or how about the two word Cons cell tricks (via distinguished non-Cons object header word) vs. List[T] where one needs a vtable word in every Cons cell along with head/tail (car/cdr whatever), which with 2 word alignment is going to make for big fat 4 word cons cells?

Inquiring minds want to know :-)

Scott

Implementing fast interpreters

The webkit community announced Squirrelfish last month, showing important performance improvements.

The post has some links LtU readers may find of intererst (such as The Structure and Performance of Efficient Interpreters, by M. Anton Ertl and David Gregg, November, 2003).

I am thinking of building a Scheme interpreter using a (TR) SECD machine, and I would be most grateful if any LtU readers would be so kind as to provide some additional links about improving interpreter performance.

PyPy's prolog-based JIT prototype

PyPy was discussed couple of times here before. One of the interesting (recent) additions to this project is a prototype JIT generator written in prolog. Paper is titled Towards Just-In-Time Compilation and Specialisation of Prolog.. More details about choice of prolog as a prototyping language are on the PyPy blog.

Cheers,
fijal

type-checking programs with unknown types

I find myself in need of a type system which can deal well with types that are unknown at compile time, by doing as much static type-checking as possible and then inserting downcasts to deal with unknown types at runtime. Is anybody aware of such a system already in use? Let me motivate this by a short example in some functional pseudo-code:

useNumber(number) = number + 2
useList(list) = head(list)
googleURL = "http://api.google.com/GoogleSearch.wsdl"
google = Webservice(googleURL)
results = google.search("testing")
a = useNumber(results)
b = useList(results)

Clearly the type of a dynamically-generated proxy like google may be, in general, unknowable at compile time. The obvious way to handle this is to use a Univ type which is downcast, either implicitly or explicitly, as needed. That approach seems to be typical for current languages from Java to Haskell. However I'd like to go a bit further and reject programs like the above where it's clear that some variable (in this case results) could never satisfy the constraints on its type.

This sounds to me (perhaps deceptively) like a simple problem which someone is bound to have solved already, but if so I have yet to find the solution in the literature. Maybe that's because it's so simple that it's not worth publishing, and I just need to think about it a bit.

Here are some examples of ideas I have reviewed and rejected:

  • Static Typing Where Possible, Dynamic Typing When Needed -- a provocative title, unfortunately not borne out by the content of the paper.
  • Cartwright and Fagan's "Soft Typing" (1991). If I have understood this paper, soft typing cannot distinguish between programs which it merely cannot prove are correct and programs which are obviously incorrect. In both cases, types simply fail to unify. However I would like to make such a distinction, so that at least some programs can be definitively rejected at compile time, and I don't mind forcing programmers to make some concessions to the type system to allow this.
  • Boo is a language I've seen mentioned as supporting both static and latent typing with type inference. Based on a cursory review of their documentation, it seems they just have a Univ type with implicit casting (and a somewhat ad hoc type system in general).
  • HaXe is another such language, which I rejected when I saw that their idea of type inference is "the first time the function is used, the type of the parameter will be set to the type it was used with".

Thank you for your help.

XML feed