source code conversion

I'm looking for a tool or library that lets you combine different front ends and back ends to convert source code from one PL into another. Even if the translation is incomplete, a tool like this could take care of a lot of tedious work.

The black art of translation

In the distant past, I wrote full-blown translators for Pascal-to-C and RPG-to-VB. I've also written scripts to approximate a VBScript-to-C# translation, and I am converting many of the examples from CTM from Oz to Alice by hand.

I don't know of any general-purpose PL translators; you'd need to be more specific about which PLs are involved. The biggest hassle with translation is the tradeoff in the approximation. If you make the translation produce runnable code, it'll likely be unmaintainable. If you approximate the idiom of the target language, it'll likely be unrunnable.

Have you written any thing

Have you written anything to convert from C to Java? The only thing I've found out there that actually does anything is called Jazillian. There's plenty of good documentation on their site.

Pascal -> C

Translating Pascal to human-readable C would seem to be pretty straightforward. Pascal's inner function definitions would be pretty easy to handle: you could pass a pointer to the variables that are neither local nor global, which is a special case of closure conversion. Strings would be a minor challenge; you'd probably have to accept less-than-idiomatic C.
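As a rough sketch of that closure-conversion step, written in Python for brevity rather than in Pascal and C: the inner function is lifted to top level and receives an explicit frame object, which stands in for the pointer a C translation would pass. All names here (count_evens_nested, CountEvensFrame, visit_lifted) are made up for illustration.

```python
from dataclasses import dataclass

# Before: an inner function updates a variable of its enclosing function,
# much like a Pascal procedure nested inside another procedure.
def count_evens_nested(values):
    count = 0

    def visit(v):
        nonlocal count  # neither local to visit nor global
        if v % 2 == 0:
            count += 1

    for v in values:
        visit(v)
    return count


# After: the enclosing function's shared variables live in a frame record.
@dataclass
class CountEvensFrame:
    count: int


# The inner function is now top-level and takes the frame explicitly,
# the way translated C code would take a pointer to a struct.
def visit_lifted(frame, v):
    if v % 2 == 0:
        frame.count += 1


def count_evens_converted(values):
    frame = CountEvensFrame(count=0)
    for v in values:
        visit_lifted(frame, v)
    return frame.count
```

Both versions should compute the same result; the converted one just makes the hidden link to the outer scope visible, which is why the output of such a translator tends to look a little less idiomatic than hand-written code.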

But it would be virtually impossible to translate C to Pascal without leaving significant parts of the translation incomplete, or abandoning either human-readable code or cross-compiler portability. Function pointers and setjmp/longjmp would be very difficult to handle, not to mention pointer arithmetic and typing, or less common idioms such as Duff's Device.

Of course, there is a fair bit of C code that's hopelessly tied to a particular implementation, so no matter what you do, you can find a "pathological" C program that will break the translator. And there are a lot of pathological C programs floating around out there.

Not sure what you are looking for in a C -> Java translator, but considering the difficulties above for a problem that's probably somewhat easier, C -> Java would seem to be a fool's errand...

Not worth it from what I have seen

Perhaps you can write one that does better, but in my experience, auto-generated code (which is what you are talking about) is harder to maintain in the target language than code you get by simply rewriting everything. If you just need one or two small patches, it isn't too big a deal to find where to put them. When you want to make a larger change, it becomes hard.

Do you really need the translation? Can you just work with a bridge between the old and the new? SWIG allows you to easily call C/C++ from a dozen other languages. Perhaps you can get by with that (writing front ends or back ends as required to get the functionality you want).

If your problem is porting some obscure language to a new platform where no compiler exists, you have to make some decisions. It might be a good idea to do a re-write from the ground up, just to solve architectural problems in the code. Or maybe just write a front-end for gcc.

Well, translating from XYZ to C/C++

is sometimes called compiling ;-)

A "Holy Grail": An emacs command "M-x translate-to-" that will convert one high level language to another (and back again!). That way I wouldn't be forced to work in Perl, Python, Ruby etc. I could do Scheme all day long!

I'm reminded of the old

I'm reminded of the old Babelfish game of translating something from one language to another and back again (over and over) and being entertained by the gibberish that resulted.

For language translation we can assume that the semantics will remain, but it seems that the chosen implementation details may deteriorate in quality between translations.

yes i was dreaming probably, but ...

I thought that if languages aren't too different, a lot of code could be reused. I'm not longing for the perfect translation, but for something that takes the tedious task of converting one syntax to another off my hands. Probably a lot could be done with little Perl scripts, but I was wondering if a more scientific approach would be even more helpful. As an example, think about translating between two C clones like Java and PHP. I know there are a lot of differences between the two, but if your codebase is big enough, translating the things that are handled the same way would still be helpful.
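To make the "little scripts" idea concrete, here is a minimal sketch in Python of one such rewrite: turning Java-style typed variable declarations into PHP-style `$` variables. The function name java_to_php_vars and the tiny list of types are assumptions for illustration only. It works on raw text, so it will also rewrite matching names inside string literals and comments, which is exactly the kind of gap a parser-based tool would close.

```python
import re

# Assumption: only these few Java types are recognised in this sketch.
TYPES = r"(?:int|double|boolean|String)"

# A type keyword followed by a name that is not a function name.
DECL = re.compile(r"\b" + TYPES + r"\s+([A-Za-z_]\w*)\b(?!\s*\()")


def java_to_php_vars(source):
    """Drop Java type keywords and prefix the declared variables with '$'."""
    names = set(DECL.findall(source))
    if not names:
        return source
    # Pass 1: remove the type keyword from each declaration.
    source = DECL.sub(r"\1", source)
    # Pass 2: prefix every occurrence of a declared name with '$'.
    uses = re.compile(r"\b(" + "|".join(sorted(map(re.escape, names))) + r")\b")
    return uses.sub(r"$\1", source)


java = "int total = 0;\ntotal = total + 1;"
print(java_to_php_vars(java))
```

Everything the script does not recognise passes through unchanged, so the result is a starting point for hand editing rather than a finished translation.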

Dream No. 2:
We have an XML format to express statements of programming languages, and every compiler is able to read and write this format.

Dream No. 2b

Or how about Dream No. 2b, Where everyone programs in Lisp?

SharpDevelop - A C#↔VB.NET→Boo

For what it's worth, SharpDevelop will translate nicely back and forth between VB.NET and C#. The Wikipedia page claims it can translate an entire project, and translate to Boo, apparently.

To quote from Wikipedia on the subject of the included parser source code (at least for code completion):

For code completion SharpDevelop uses its own parsers for C# and VB.NET. They were generated using a grammar description and a modified version of the Coco/R compiler generator of the University of Linz. The source code contains this generator.

CLR conversion not that big a deal?

I suspect the reason that is do-able is because all the languages that target .Net / CLR get shoe-horned into it. Witness the rewrite of VB so that it would work on CLR. I don't know if Eiffel.Net and F# etc. disprove my pet theory...

C# / VB / Boo not that big a deal

Well, as far as I remember, VB is currently a subset of C#. I even heard rumors (probably unfounded) that they actually shared the same compiler with just a different parser. As for Boo, a hunch tells me that it's actually a (mostly untyped) superset of C#.

Not precisely

There are constructs that can't be translated to C# (like named indexers). But yeah, VB.Net is really much more like C#/Java than like VB6.

Java --> C#

Even better than that: .Net comes with a tool that converts from Java to C# with very little clean-up needed! So you could convert your Java to C# and then convert again to VB.Net! This doesn't convert the user interface, though.

o:XML

The o:XML project aims at doing something akin to what you ask. They basically rewrite every language to an XML AST, then transform it to some "universal" XML dialect with XSLT, then transform that universal dialect to the target language's AST with XSLT, and finally serialize this to source code in the target language.
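For a feel of what the first step of such a pipeline (source to XML AST) might look like, here is a small sketch using Python's own ast module and xml.etree.ElementTree. The element and attribute layout is invented for this example and is not o:XML's actual format; the XSLT stages are not shown.

```python
import ast
import xml.etree.ElementTree as ET


def to_xml(node):
    """Convert a Python AST node into an XML element (layout is an assumption)."""
    elem = ET.Element(type(node).__name__)
    for field, value in ast.iter_fields(node):
        if isinstance(value, ast.AST):
            # A single child node goes into a wrapper named after the field.
            ET.SubElement(elem, field).append(to_xml(value))
        elif isinstance(value, list):
            # A list of child nodes shares one wrapper element.
            wrapper = ET.SubElement(elem, field)
            for item in value:
                if isinstance(item, ast.AST):
                    wrapper.append(to_xml(item))
        elif value is not None:
            # Plain values (names, constants) become attributes.
            elem.set(field, str(value))
    return elem


def source_to_xml(source):
    """Parse Python source and return its AST as an XML string."""
    return ET.tostring(to_xml(ast.parse(source)), encoding="unicode")


print(source_to_xml("x = 1 + 2"))
```

Note that this sketch is lossy (constant types and lists of plain strings are not preserved), so going back from the XML to source code would need a richer encoding.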

As a side note, I personally believe that this is a waste of time. At best, this is reinventing Lisp. But, well, I might be wrong.

Concatenative Languages for Translation

I'm working on a concatenative language called Unimperative, intended for universal translation (among other purposes). It is a high-level, pure-function, stack-based language, heavily influenced by Joy. Since most modern languages are in effect stack based, a pure concatenative language is a natural vehicle for translation.

The plan is to use it for my other language project, Heron.

but they're stack based in a

but they're stack based in a completely different way! just because "stack" is a common data structure used in two different technologies doesn't mean the two technologies are otherwise related...

more precisely, what about state? concatenative languages repeatedly transform a single block of (ordered) state. in contrast, other languages directly address values on the heap.

don't get me wrong - i think concatenative languages are cool, use the term "stack based" myself, and have even written one. but i don't follow your logic here.

Stack Based Languages

I say many high-level languages are stack-based because they can be easily translated to a low-level language that is clearly stack based (8086 assembly, JVM byte code, etc.), and conceptually they can be thought of as manipulating a stack. A function call roughly corresponds to a stack frame.
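A minimal sketch of that kind of translation, in Python: an arithmetic expression is compiled to postfix stack instructions and then run on a tiny stack machine. The instruction names (push, add, sub, mul) and the functions compile_expr and run are invented for this example; a real target such as JVM byte code is of course far richer.

```python
import ast
import operator

# Supported binary operators: AST node type -> (instruction name, function).
OPS = {
    ast.Add: ("add", operator.add),
    ast.Sub: ("sub", operator.sub),
    ast.Mult: ("mul", operator.mul),
}


def compile_expr(source):
    """Translate an infix arithmetic expression into stack instructions."""
    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return [("push", node.value)]
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            # Operands first, operator last: this is the postfix order.
            return emit(node.left) + emit(node.right) + [(OPS[type(node.op)][0], None)]
        raise ValueError("unsupported expression")

    return emit(ast.parse(source, mode="eval").body)


def run(program):
    """Execute the instructions on an explicit stack and return the result."""
    funcs = {name: fn for name, fn in OPS.values()}
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(funcs[op](left, right))
    return stack.pop()


program = compile_expr("(2 + 3) * 4")
print(program)
print(run(program))
```

The sketch only covers pure expressions. Variables, heap values, and side effects are where the translation stops being this mechanical.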

The problem of heap manipulation is similar to that of any other desired side effect (e.g. system calls, I/O, etc.), and one solution is to have API calls do such work for you.

I don't know all the common solutions for mixing side-effect-dependent code with pure/clean code, but one approach that I think has promise is to use a side-effect tagging system.

Stratego

Have you looked at Stratego/XT? It looks like a very mature language and toolset for program translation. It will compile your translator to C.

ANTLR

If you would consider a toolkit instead of a finished tool, you might look at Terence Parr's book The Definitive ANTLR Reference, in which Parr describes using the ANTLR parser generator to do source-to-source translation and other similar tasks. The ANTLR web site (http://www.antlr.org/) has a link to the list of target languages already supported by ANTLR.