CUE: An open-source data validation language

There are two core aspects of CUE that make it different from the usual programming or configuration languages:

- Types are values
- Values (and thus types) are ordered into a lattice

These properties are relevant to almost everything that makes CUE what it is. They simplify the language, because many concepts that are distinct in other languages fold together. The resulting order independence simplifies reasoning about values for both humans and machines.

The lattice also forces formal rigor on the language, such as defining exactly what it means for a value to be optional, to be a default, or to be null. Making sure all values fit in a value lattice leaves no wiggle room.
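To make the lattice idea a bit more concrete, here is a minimal Python sketch. It is not CUE's implementation, only an illustration under my own assumptions: the names `Top`, `Bottom`, `Kind`, `Atom` and `unify` are hypothetical, and only scalar types and concrete values are modelled. Types and values live in one ordering, and combining two of them takes their greatest lower bound, so the order in which constraints are applied does not matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Top:
    """Least specific element: places no constraint at all."""


@dataclass(frozen=True)
class Bottom:
    """Conflict: no value can satisfy both constraints."""


@dataclass(frozen=True)
class Kind:
    """A basic type such as int or str, treated as an ordinary value."""
    pytype: type


@dataclass(frozen=True)
class Atom:
    """A concrete value such as 42 or "hello"."""
    value: object


def unify(a, b):
    """Return the greatest lower bound (meet) of two lattice elements."""
    if isinstance(a, Top):
        return b
    if isinstance(b, Top):
        return a
    if isinstance(a, Bottom) or isinstance(b, Bottom):
        return Bottom()
    if a == b:
        return a
    if isinstance(a, Kind) and isinstance(b, Atom):
        # A concrete value sits below its type in the ordering.
        return b if type(b.value) is a.pytype else Bottom()
    if isinstance(a, Atom) and isinstance(b, Kind):
        return unify(b, a)
    # Two different kinds or two different atoms have nothing in common.
    return Bottom()


# Applying the same constraints in either order gives the same result.
print(unify(Kind(int), Atom(42)))
print(unify(Atom(42), Kind(int)))
print(unify(Kind(str), Atom(42)))
```

In this toy model a "type" is simply a less specific value, which is the sense in which the two concepts fold together.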

Set-theoretic types are becoming increasingly mainstream, as they turn out to be very effective for (partially) typing dynamic languages (e.g. TypeScript). Other popular examples are Scala 3 and Kotlin, together with less mainstream examples such as Ceylon, Typed Racket and the Avail programming language.

I dabbled in Avail a few years ago, and was very impressed by Avail's type-driven compiler, but not so much by its (extreme) DSL support. In contrast to Avail, I believe CUE is much more pragmatic while not allowing any 'tactical' compromises.

CUE has its origins at Google: the ability to compose CUE code and specifications is really useful at the scale at which Google operates. The creator of CUE - Marcel van Lohuizen - recently left his job at Google to become a full-time developer of CUE. I think Marcel has created something really interesting, so I understand his decision!

A nice overview

Maybe a video will whet your appetite a bit more, or maybe not?

I think this video really shows the interesting bits and pieces of CUE.

Question: does CUE represent an interesting new approach in the realm of programming languages? I believe so, because I have a hard time finding related work that is close to CUE.
Feedback on related work would be much appreciated.

Marcel van Lohuizen - the creator of CUE - traces CUE back to the graph unification model that is (or was) in common use in computational linguistics. Maybe we can apply lessons learned in computational linguistics (especially about scaling up to 100,000 lines of constraints) to improve programming language design?
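For readers unfamiliar with unification of feature structures, here is a rough Python sketch of the basic idea. It is a simplification under my own assumptions, not how CUE or any linguistics toolkit implements it: structures are plain nested dicts, the names `Conflict` and `unify_struct` are hypothetical, and reentrancy (shared substructures, which real graph unification handles) is ignored.

```python
class Conflict(Exception):
    """Raised when two structures assign incompatible values to a field."""


def unify_struct(a, b):
    """Merge two nested dicts, failing where leaf values disagree."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            # Fields present on both sides are unified recursively;
            # fields present on one side only are carried over as-is.
            out[key] = unify_struct(out[key], value) if key in out else value
        return out
    if a == b:
        return a
    raise Conflict(f"cannot unify {a!r} and {b!r}")


# Hypothetical example data: two partial descriptions of the same object.
base = {"kind": "Deployment", "spec": {"replicas": 3}}
extra = {"spec": {"image": "nginx"}}

# Both orders yield the same merged structure.
print(unify_struct(base, extra) == unify_struct(extra, base))
```

Partial descriptions either combine into a more specific one or fail with a conflict; nothing is silently overridden, which is what makes composition at scale tractable.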

By the way, Lambda the Ultimate has been really slow and unpredictable lately; I'm not sure why. Posting is a bit of a challenge.

ltu

This is the first time that I've been able to get to ltu in ~6 months ...

Programming with lattice structure

If you are interested in where the theory is going, I recommend looking at Flix, an ambitious PL aimed at supporting applications in static analysis, and Datafun, a more theoretically oriented PL, one of whose two developers is our very own neelk. Both are extensions of Datalog with support for defining fixpoints.

Nice references

Thank you, Charles, for your comment and the references to modern related work.

Indeed, the theory behind Flix and Datafun is very solid, so that means CUE is in good company!

I've encountered Flix and Datafun in the past. Flix really stood out for me because it's so close to Scala, and I like Scala a lot.
That said, Flix's Datalog embedding feels a bit second-class compared to the rest of the language.
A really neat hack in Flix is that it uses intersection types to 'contagiously' inject effect types as a parameter and return type. I think other languages (Scala and CUE!) should consider adopting Flix's hack for tracking effects.

But what I really like about CUE is its pragmatism and the immediate value its tooling already brings to validating JSON and YAML.
I also think CUE's semantics are different from Datalog's (they appear to be closer to propagation networks). But it could well be that an efficient implementation of CUE will be Datalog-based.