LtU Forum

Owl: A parser generator for visibly pushdown languages.

Owl is a parser generator which targets the class of visibly pushdown languages. It is:

  • Efficient - Owl can parse any syntactically valid grammar in linear time.
  • Understandable - like regular expressions, its parsing model (and error messages) can be understood without talking about parser states, backtracking, lookahead, or any other implementation details.
  • Easy to use - using Owl's interpreter mode, you can design, test, and debug your grammar without writing any code. An Owl grammar compiles to a single C header file which provides a straightforward parse tree API.

The restriction to visibly pushdown languages imposes restrictions on recursion. Rules can only refer to later rules, not earlier ones, and all recursion must be guarded inside guard brackets.

[ begin ... end ]

The symbols just inside the brackets are the begin and end tokens of the guard bracket. Begin and end tokens can't appear anywhere else in the grammar except as other begin and end tokens. This is what guarantees the language is visibly pushdown: all recursion is explicitly delineated by special symbols.

https://github.com/ianh/owl

Note: I am not the Owl author, but I found it to be a useful tool for my own projects. I play around with toy programming languages as a hobby and I am not well-versed on automata theory.

I am posting here because I would be interested to know what your thoughts are about the class of visibly pushdown languages, specifically the constraints it imposes on recursion, how that compares to recursive descent, and any pointers to variations or similar approaches that you know of.

Why is there no widely accepted progress for 50 years?

From machine code to assembly and from that to APL, LISP, Algol, Prolog, SQL etc there is a pretty big jump in productivity and almost no one uses machine code or assembly anymore, however it appears there's been almost not progress in languages for 50 years.

Why is that? Are these the best possible languages? What is stopping the creation of languages that are regarded by everyone as a definite improvement over the current ones?

Functional Constructors in Theme-D

I redesigned the constructors in Theme-D version 2.0.0, see [1]. The design is inspired by the OCaml object system. Constructors are special procedures used to create instances (objects) of classes. They are not defined like other procedures but Theme-D creates them using the construct statement in a class and field initializers in a class. The translator-generated default constructor is sufficient in many cases. For example, consider the class <complex> defined in the standard library:
  (define-class <complex>
    (attributes immutable equal-by-value)
    (inheritance-access hidden)
    (fields
     (re <real> public hidden)
     (im <real> public hidden))
    (zero (make <complex> 0.0 0.0)))
The translator-generated default constructor takes two real arguments and sets the first to the field re and the second to the field im.

The programs objects1.thp and objects2 in subdirectory theme-d-code/examples of the source code [1] demonstrate user-defined constructors. Here is the first example:

  (define-class <widget>
    (fields
     (str-id <string> public module)))

  (define-class <window>
    (superclass <widget>)
    (construct ((str-id1 <string>) (i-x11 <integer>) (i-y11 <integer>)
		(i-x21 <integer>) (i-y21 <integer>))
	       (str-id1))
    (fields
     (i-x1 <integer> public module i-x11)
     (i-y1 <integer> public module i-y11)
     (i-x2 <integer> public module i-x21)
     (i-y2 <integer> public module i-y21)
     (i-width <integer> public module (+ (- i-x21 i-x11) 1))
     (i-height <integer> public module (+ (- i-y21 i-y11) 1))))
The constructor of class <window> passes the first argument str-id1 to the constructor of its superclass <widget>. The constructors also initialize the fields using their arguments. Note that the field initializers may contain more complex expressions than just copying an argument variable.

Here is the second example:

  (define-class <widget>
    (construct ((str-id1 <string>)) () (nonpure))
    (fields
     (str-id <string> public module
	     (begin
	       (console-display "new widget: ")
	       (console-display-line str-id1)
	       str-id1))))

  (define-class <window>
    (superclass <widget>)
    (construct ((str-id1 <string>) (i-x11 <integer>) (i-y11 <integer>)
		(i-x21 <integer>) (i-y21 <integer>))
	       (str-id1) (nonpure))
    (fields
     (i-x1 <integer> public module i-x11)
     (i-y1 <integer> public module i-y11)
     (i-x2 <integer> public module i-x21)
     (i-y2 <integer> public module i-y21)
     (i-width <integer> public module (+ (- i-x21 i-x11) 1))
     (i-height <integer> public module (+ (- i-y21 i-y11) 1))))
Here we log the calls to the constructor of <widget> to the console. Note that we have to declare the constructors as nonpure if they have side effects.

[1] Theme-D Homepage

Deterministic Concurrency

Toward a deterministic treatment of concurrency for the general case.

Various desired forms of reasonableness

A small study in the vagaries of Rgular types in C++.

Abseil blog post.

How can we get better statically typed assurances around the nuances of such concepts? Which languages help the most? Things like Rust or ATS come to mind.

Bjarne Stroustrup interview on Youtube.

Lex Fridman just interviewed Bjarne Stroustrup on youtube. It's long; a bit over an hour and a half.

But they're talking extensively about C++ and its origins and evolutions and guiding principles.

Here's a handy link to the interview.

Type Mapping in Source-To-Source Translation

I'm in the middle of writing an Object-Pascal-to-Java translator. One interesting aspect is the mapping between source and target types. Is anybody aware of literature about this topic, books, papers, etc.?

I would be interested in things like rules for "widening" a target type over the source type; constraints that must hold on operations applied in the target language so that the result complies with the result in the source language; maybe a formalism where you could even show (prove) that, given some sets of types and operations; etc.

I do not aim to set things up this way, but would enjoy reading some theoretical material about what I'm doing. Is there serious research in the area of source translation, anyway?

Histogram: You have to know the past to understand the present by Tomas Petricek

Histogram: You have to know the past to understand the present by Tomas Petricek, University of Kent

Programs are created through a variety of interactions. A programmer might write some code, run it interactively to check whether it works, use copy and paste, apply a refactoring or choose an item from an auto-complete list. Programming research often forgets about these and represents programs as the resulting text. Consequently, thinking about such interactions is often out of scope. This essay shifts focus from programs to a more interesting question of programming.

We represent programs as lists of interactions such as triggering an auto-complete and choosing an option, declaring a value, introducing a variable or evaluating a piece of code. We explore a number of consequences of this way of thinking about programs. First, if we create functions by writing concrete code using a sample input and applying a refactoring, we do not lose the sample input and can use it later for debugging. Second, if we treat executing code interactively as an interaction and store the results, we can later use this information to give more precise suggestions in auto-complete. Third, by moving away from a textual representation, we can display the same program as text, but also in a view inspired by spreadsheets. Fourth, we can let programmers create programs by directly interacting with live previews as those interactions can be recorded and as a part of program history.

We discuss the key ideas through examples in a simple programming environment for data exploration. Our focus in this essay is more on principles than on providing fine tuned user experience. We keep our environment more explicit, especially when this reveals what is happening behind the scenes. We aim to show that seeing programs as lists of interactions is a powerful change of perspective that can help us build better programming systems with novel features that make programming easier and more accessible. The data exploration environment in this interactive essay may not yet be that, but it gives a glimpse of the future.

Rope

This is my very first post on this site. Also this is my first post regarding my new PL idea: Rope

First, an introduction to me (in a hopefully non-narcissistic way):
Skip over this if you want to avoid hearing someone talk about themselves
My name is Jocob (it was a typo from Jacob; we didn't learn about it until I was 19 and by then I was a legal adult so I had the choice to keep it). I've been programming for about 5 years. First Excel macros, then C# in Unity, and from there I decided to get a career in the field. Now I'm a "DevOps Engineer" (I don't know why they prepended a paradigm onto a job title). Professionally, I work mainly with Python, HTML/JavaScript/CSS, PowerShell, Bash, and increasingly GoLang and Ruby. Personally, I like to use C# both for Windows Apps and for Unity. I'm also building a site on AWS using Django/Bootstrap/PostgreSQL.

I have no experience or education designing programming languages.

If I have seen further than others, it is by standing upon the shoulders of giants. - Isaac Newton

I get it, and I agree with it for the most part; why waste your time doing what has already been done? What progress would we make if everyone started from how to make fire and tried making it to the moon? But there is something that I don't agree with. (Not with Newton, but with a secondary application) Should we all only blaze trails where a previous trailblazer left off? My personal experience has taught me that, if you disagree, then you'll learn more and, rarely, discover something amazing. It is with that mentality that I want to introduce "Rope."

Second, an introduction to ROPE:
Skip over this if you get annoyed reading about people's naive hopes and dreams
The idea is simple: I want a "language to create languages." Here are the main features I'm aiming for:

  • Doesn't inherit from Fortran: I want the language itself to help anyone understand exactly how they're instructing a machine to operate, so I want it to inherit directly from machine language. The dream is that someone without a computer science degree could understand how ones and zeros translate to a UI on a monitor if they took the time to read the source code.
  • Intention over implementation: Each basic function will have an input/output criteria with a standard implementation. But if someone can propose a better, faster implementation, then that function can be replaced without impacting existing code. This would mean, ideally, that people would never need to worry about losing functionality or rewriting their code in the event of an update. They could just update, and things would work better.
  • Both a Script and a Markup Language: While someone would be able to write any application they want in Rope, they would primarily use it to define a higher level language. I'd hope to eventually create ports for other major languages that would allow Rope to import and export to them. Subsequently, this would enable a "translate" feature that recreates scripts in another language. (Substituting raw Rope syntax wherever there is no translation, which would indicate to the Rope developer that the port needs to be improved.
  • Prepackaged IDE, Compiler, and Interpreter: Rope would come with everything you need to write and play right out of the box. I know personally that the faster I can see the output of my code, the quicker I can learn it. The compiler would be able to produce an executable file if you write as a script, or it could produce another compiler if you write as markup. The interpreter behaves the same as the compiler, but could also read and run ported languages as long as their is a markup file exists. This would, theoretically, allow for non-interpreted languages to be used live scripts.
  • Visually incarnatable: Because the core of the language would be simple, and to help make the language easier to use in other Countries, Rope would be able to be represented entirely visually.
  • Built-in basic kernel: To help with being cross-platform, and because Rope aims to be able to directly bridge human and machine, Rope would have an ability to interact directly with hardware. Markups would be written to allow it to read device drivers for various platforms. Ideally this would mean that it could use *Nix drivers and Windows drivers at the same time.
  • Community linked at every aspect: Each implementation, for example the "Output to Console" implementation, would have both an "intention ID" and an "implementation ID." The IDE would integrate with an online resource that allows people to discuss and collaborate on each piece of the language itself. You could, in this example, click on the "print" (or whatever it will be called) function, and see the discussion and versioning of that core function. It would allow the language to grow and be discussed much more rapidly than through mediums like GitHub, because it could be done in the middle of coding. (Though that may be distracting...) There would also be a like/dislike voting system for everything, as well as "Implementation Alternates" thought could be seamlessly swapped out during development and then automatically aggregate bench-marking from all willing participants.
  • Syntax to define syntax: This is a necessary part of the "language to create languages" goal, but there's no reason why that couldn't be used within the code itself to make things easier. For example, someone could want one script with lots of commenting, and another separate script with lots of string literals. Rope would allow them to configure each script at the start to handle those syntactical preferences.

That's all I got for now. But I'm sure I'll come up with more. The lofty goal behind the "language to create languages" philosophy is that, at the start, everyone tries to make their PL perfectly suited for its purpose. But as we've seen, languages keep expanding and descendants get spawned as a result of either success or failure. So if we had to do it all over again, wouldn't we want a language that intended to have descendants. Wouldn't it have been nice to have a language that lived for its children? That's the idea I want to explore.

Oh and as a final thought. The origin of the name is this:
Recall
Observe
Prove
Express
In terms of how information is handled, it essentially means that it can pull from either memory or the outside world, process it, and then output it back into the world. The name is meant to represent every state of information handling, because that's essentially what a computer is, and be a snappy acronym as well. In regards why "create" isn't included in my theory of information handling, there's actually a philosophical reason for that. I don't think anything in human memory comes from absolute nothingness, and that creativity is probably an illusion.

NDArray/multi-columnar with efficient CRUD operations?

I'm in the hunt for material or implementations of NDArray and/or columnar structures with not-bad support for CRUD operations (insert-update-delete) that could work in-memory.

I'm aware of kdb+ but my understanding is only for big append-only loads then calculations on it.

Currently, I have for my own little relational lang (http://tablam.org) several structures that backed the relations (BTreeMaps - Vectors - Table - Scalars). The main is a table:

https://bitbucket.org/tablam/tablam/src/41b3b0676d6062031998af48b683f8f0062bf279/core/src/types.rs#lines-221

#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
pub struct Table {
pub schema: Schema,
pub data: Vec,
}

and this is ok. But wonder what other options I could explore. (The allure for me in use just a NDArray-like container is that I can reduce my implementations to just 2 BtreeMaps - NDArray)

XML feed