Graphics primitives?

I want to recreate most graphics primitives (lines, rectangles, circles, filling, gradients, ...) myself from scratch, starting from the deepest zero, like composing useful functions from combinators. What are the analogues of combinators or λ-abstractions in graphics creation?

My first thought was "gotoxy, setpixel, getpixel", but that can deal only with a raster canvas. The basic "pendown, penup, moveto" is not so bad, but how do you express filling an area with it, or drawing raster images (or curves, or thick lines)? Furthermore, I have mentioned only monochrome drawing; what are the primitives when dealing with colors? RGB, CMYK, 16-bit color components? What ground has been found in computer graphics research that can be extended into something useful?
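To make the comparison concrete, here is a minimal sketch (illustrative names, nothing standard) of the "pendown, penup, moveto" model as a deep-embedded command language in Haskell, interpreted into line segments:

```haskell
-- Sketch only: the pen/turtle model as a command data type plus an
-- interpreter that tracks pen position and up/down state.
data PenCmd = PenUp | PenDown | MoveTo Double Double

type Point   = (Double, Double)
type Segment = (Point, Point)

-- Interpret a command list into the segments actually drawn.
runPen :: [PenCmd] -> [Segment]
runPen = go (0, 0) False
  where
    go _   _    []             = []
    go pos _    (PenUp   : cs) = go pos False cs
    go pos _    (PenDown : cs) = go pos True  cs
    go pos down (MoveTo x y : cs)
      | down      = (pos, (x, y)) : go (x, y) down cs
      | otherwise = go (x, y) down cs
```

Filling, curves, and thick lines don't fall out of this interpreter, which is exactly the limitation in question.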

The language to deal with it is Haskell (or Haskell-like). I expect to recreate the canvas-drawing abilities of frameworks like SVG, OpenGL, and LOGO, plus GUI-like interfaces, and to have enough extensibility to reimplement all other frameworks (GTK, Qt, DirectX, ..?). (The last part is a joke, indeed; I just wanted to show the list of possible uses.)

Maybe there is no unifying ground for everything I mentioned. Then the question is: what are the branches of computer graphics, and what is the state of foundational research in those branches?

Adobe PDF is one standard

Adobe PDF is one standard you can look at. It's complicated and includes other things you don't need to worry about, but it's complete, general, and well documented.

Nile/Gezira is doing

Nile/Gezira is doing something similar. It was also previously mentioned here.

Nile is well worth a look

I'd second that. It's a dataflow-style DSL for describing graphics (and sound, etc) pipelines.

No single framework

I think you'll find that there is no single unifying foundation that will serve you for all of graphics, but it already sounds like you are restricting yourself primarily to 2D vector graphics.

When talking about the "combinators" of graphics, I can imagine you'd want to think about:

  • The Porter-Duff compositing algebra (here)
  • Constructive Solid Geometry (CSG) is, in a sense, the extension of the ideas from Porter-Duff to 3D solids - an algebra over 3D volumes.
  • PostScript is a stack-machine language for defining 2D graphics, and is the foundation from which PDF arose. From what I understand, it uses the kind of "pen-oriented" model you are talking about, but it can also do filled areas.
  • You should consider Conal Elliott's "Functional Images" (here). Really, a lot of Elliott's work, and other work on using FRP for graphics, is relevant even if you don't consider animation/interactivity, simply because there is an emphasis on clean compositional semantics (see also Elliott's "Programming Graphics Processors Functionally").
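To make the "combinator" idea concrete, here is a minimal sketch in the spirit of Porter-Duff and Elliott's functional images (illustrative names, not any real library's API): an image is a function from points to premultiplied RGBA, and `over` is a pointwise combinator.

```haskell
type Point = (Double, Double)
data RGBA  = RGBA { r, g, b, a :: Double } deriving (Eq, Show)  -- premultiplied
type Image = Point -> RGBA

-- Porter-Duff "over" on premultiplied colors: c = cTop + (1 - aTop) * cBot.
over :: Image -> Image -> Image
over top bot p = RGBA (f r) (f g) (f b) (f a)
  where
    t = top p
    s = bot p
    f ch = ch t + (1 - a t) * ch s

-- A filled disc with a given center, radius, and color; transparent outside.
disc :: Point -> Double -> RGBA -> Image
disc (cx, cy) rad col (x, y)
  | (x - cx)^2 + (y - cy)^2 <= rad * rad = col
  | otherwise                            = RGBA 0 0 0 0
```

Because images are just functions, compositing, transformation, and cropping all become ordinary function combinators.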

For colors, I'm not sure that there are particular references to follow, but a "foundational" system would probably not be tied to just one color space like RGB or CMYK. Really you want to think in terms of spectra (functions from wavelength to intensity/reflectance/etc.), and then you can work in terms of multiplication and addition of spectra.
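A sketch of the "colors as spectra" idea (illustrative names, not a real color-science library): a spectrum is a function from wavelength to intensity, and lights and filters combine pointwise.

```haskell
type Wavelength = Double   -- nanometers
type Spectrum   = Wavelength -> Double

-- Light reflected by a surface: multiply illuminant by reflectance.
reflect :: Spectrum -> Spectrum -> Spectrum
reflect illuminant reflectance w = illuminant w * reflectance w

-- Two lights superimposed: add their spectra.
mix :: Spectrum -> Spectrum -> Spectrum
mix s1 s2 w = s1 w + s2 w

-- Only at the end do we project to a device space such as RGB, by
-- integrating against a (crudely sampled) sensitivity curve per channel.
toChannel :: Spectrum -> Spectrum -> Double
toChannel sensitivity s = sum [sensitivity w * s w | w <- [380, 390 .. 740]]
```

The point is that multiplication and addition of spectra are physically meaningful, whereas multiplying RGB triples is a device-space approximation.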

Thanks for the links!

Conal Elliott's works are great to study. Also, thanks for the idea of representing colors as combinations of spectra.

As for PDF/PostScript, it is good but complicated. I am sure it can be simplified to some "core" graphics-creation language.

As for PDF/PostScript, it

As for PDF/PostScript, it is good but complicated. I am sure it can be simplified to some "core" graphics-creation language.

Yes and no. In one sense that is trivially true, as it includes things related to file formats, compression, printing, backward compatibility, etc. I assume we are just talking about the imaging model contained within it. If you need to describe operations on 3D or 4D primitives, then PDF doesn't get the job done at all. What about fonts? If you don't care about the subjective appearance of text, then you can throw out a lot of other stuff right away. If you do, then things get very complex very quickly. If you didn't care about pixels and only needed to stay within a 2D vector space then a lot of complexity also vanishes. On the other hand, if you could fix everything on a single pixel output space, it would be simpler not to deal with vector graphics in the first place. Summary: a lot depends on what you want to do.

Vector imaging is all

With vector image-creation primitives I can create a filled shape called a "pixel", and so reinvent raster graphics on top of vector graphics. The same applies to displaying fonts. As for 3D/4D, that is really difficult not only in PDF, so I don't want to go that way now.

The output that people see

The output that people see is built out of pixels (they are not really squares, but that is a side issue). The vector graphics descriptions are convenient for mathematical manipulation, but they have to be sampled as pixel output in order to do the job of creating visible images. Also, some graphics elements already come in the form of pixelized images or images represented as signals (like jpeg) and they need to be first transformed and then resampled to create an output of a given size and location. General principles of digital image processing do a good job of saying how to do these things when the shape/contrast signal is low frequency (wrt the output pixel space) and not symbolic. When the thing to be sampled is high frequency and symbolic (like a TrueType font), then special principles come into play to create really good looking output.
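To illustrate the reconstruction half of that story, here is a sketch (hypothetical types, and without the prefiltering a real resampler needs to avoid aliasing) of rebuilding a continuous image from raster samples by bilinear interpolation, so it can be resampled at any output size and position:

```haskell
import Data.Array

-- Grayscale samples indexed by (column, row).
type Raster = Array (Int, Int) Double

-- Reconstruct a continuous image from pixel samples by bilinear
-- interpolation, clamping lookups at the raster's edges.
bilinear :: Raster -> Double -> Double -> Double
bilinear img x y =
    lerp fy (lerp fx (px i  j)     (px (i+1) j))
            (lerp fx (px i (j+1))  (px (i+1) (j+1)))
  where
    (i, j)            = (floor x, floor y)
    (fx, fy)          = (x - fromIntegral i, y - fromIntegral j)
    ((x0,y0),(x1,y1)) = bounds img
    px u v            = img ! (clamp x0 x1 u, clamp y0 y1 v)
    clamp lo hi       = max lo . min hi
    lerp t u v        = (1 - t) * u + t * v
```

Given `bilinear img`, sampling an output pixel grid at any scale or offset is just evaluating this function at the transformed coordinates.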

Color Perception

Be careful here -- color is not determined by the wavelength of light.

Due to our perceptual systems it is largely determined by regional contrast.

This is important for things like color constancy; color constancy is what allows us to perceive an object as having the same color under a wide range of illumination conditions.

This makes color one of the hardest things to model well.


The Diagrams library is an EDSL for Haskell that supports composition of images and tracks precise boundaries. (It supports rendering via the Cairo bindings.)

GPipe, Spark, Vertigo

OpenGL, PostScript, Cairo, etc. are very imperative approaches to drawing. We can model them in a functional system (using monads or arrows) but it seems difficult to reason about the resulting images or compose them.
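For a sense of what "modeling them with a monad" looks like, here is a sketch using a hand-rolled Writer-style monad over a command list (illustrative names, not a real canvas API). The meaning of a drawing is its command trace, which composes easily but is hard to reason about as an image:

```haskell
data Cmd = Line (Double,Double) (Double,Double)
         | SetColor Double Double Double
         deriving (Eq, Show)

-- A tiny Writer monad: each action yields a value plus emitted commands.
newtype Draw a = Draw { runDraw :: (a, [Cmd]) }

instance Functor Draw where
  fmap f (Draw (x, w)) = Draw (f x, w)

instance Applicative Draw where
  pure x = Draw (x, [])
  Draw (f, w1) <*> Draw (x, w2) = Draw (f x, w1 ++ w2)

instance Monad Draw where
  Draw (x, w1) >>= k = let Draw (y, w2) = k x in Draw (y, w1 ++ w2)

emit :: Cmd -> Draw ()
emit c = Draw ((), [c])

-- A small drawing "program": its denotation is the trace it emits.
square :: Draw ()
square = do
  emit (SetColor 1 0 0)
  mapM_ emit [ Line (0,0) (1,0), Line (1,0) (1,1)
             , Line (1,1) (0,1), Line (0,1) (0,0) ]
```

Contrast this with the image-as-function approach, where the denotation is the picture itself rather than a script that produces it.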

But under the hood, OpenGL and our graphics accelerators have grown powerful programmable pipelines, which can be understood and composed in a more functional and declarative manner. There is a lot of promise here, though some challenges remain, such as maintaining VBO objects in video memory, declarative animation, and motion blur.

Related work includes GPipe, Spark, and Vertigo.

Thanks for the links!

Surely, I have to look at Diagrams. GPipe, Spark, and Vertigo are interesting too, but I think they can hardly be used for canvas drawing (by canvas I mean the HTML5 canvas). Thanks for the constraint idea!

Features for Graphics Systems

You'll never find a unifying ground, because there are so many features that might be desirable, and complicated feature interactions make it difficult to avoid tradeoffs.

Some features I've been interested in:

Scalable or Zoomable: the ability to index elements and figure out (with fair precision) whether they'll contribute to a particular scene, and how much they contribute; avoiding load-zones and visual artifacts on zooming, such as pop-in. Zoomability is difficult to achieve with imperative graphics models because one doesn't know how much of the scene a subprogram will affect until after running it.

Animation and Motion Blur: naively I could just draw each frame independently, but that can be inefficient if it means updating vertex buffers every frame, and also can result in temporal aliasing. It is feasible to put simple animations directly on the GPU via some shader code and appropriate annotations in the vertex attributes.

Interaction: associate a pixel back to a set of possible targets. A lot of simple drawing models tend to forget where the pixel comes from, or fail to make this robust. Obtaining precise feedback in the face of hardware-accelerated animation is a challenge. One possibility that interests me is generating an extra PBO each frame that simply associates pixels with an object identifier associated with a vertex.

Reaction: update objects on the GPU in response to user or network events, which means maintaining relationships and state over time. To be efficient, the state should really be incremental with the events. Declarative approaches tend to make this difficult to model - even FRP, for example, wouldn't make it easy to update just the changed buffer objects on the GPU.

Collisions: detecting overlaps between elements, perhaps for physical modeling or automatic layout. This tends to constrain your graphical primitives, since it can be difficult to detect precise collisions for arbitrary shapes.

Parallel Rendering: rendering different sections in parallel, perhaps across multiple GPUs.

Feel free to list any features I missed.

Anyhow, despite conflicts, I would think a simple two-language approach might work fairly well, e.g. sort of like HTML5 with WebGL canvases. One language provides a lot of nice declarative features. The other provides power, performance, and flexibility when we need it.

2 conflated things here?

There's drawing pixels vs. grokking shapes. Things like collision/overlap detection are more properly in the "grokking shapes" world; drawing a line, even with blending etc., is more in the "drawing pixels" world. Flood fill can be in either or both.

2 entangled things here.

I think most developers in graphics space have a solid handle on the distinction between shapes and pixels. I also think that the relationships between them tend to exist in complicated knots, especially when influenced by concerns like zoomability, interaction, or animation.

If you want an example of how the ideas get entangled at the extreme, see Minecraft. ;)

Primitives are dictated by hardware

You should probably take as primitives whatever the hardware you want to run on draws efficiently. Setpixel-like primitives are far too slow nowadays. The most obvious choice is drawing a bunch of triangles + shaders on a GPU.

Conal Elliott has done very interesting work about abstractions you can build on top of this. His paper Programming Graphics Processors Functionally is highly recommended.

You should have a basic

You should have a basic understanding of pixel and vertex shaders; Conal's Vertigo work does a great job of exploring what can be done with that hardware (mostly centered around implicit surfaces). Lighting and such can also be expressed as math. The nice thing about a programmable graphics pipeline is that there are no primitives: you just program the pipeline.

Life gets weird as we hit the last-generation and current-generation pipelines (these aren't used much in games because consoles don't support them). Geometry shaders are very imperative, as they basically stream out new and transformed triangles (they might be a dead end, too). Hull shaders are also a bit weird, but work great for surfaces that can be expressed with continuous functions (Bézier curves). There is a lot to explore with new hardware that hasn't happened yet.

"A pixel is not a little square"

As covered in great detail by Alvy Ray Smith. It would be a dangerous fallacy to think that you can recreate raster graphics "on top of" vector graphics by filling little squares.

For a system to span both raster and vector models, it needs to have a clear model of both sampling and reconstruction. Fortunately, both of these fall under the topic of signal processing and are well understood mathematically - albeit through continuous math rather than the discrete variety we computer scientists may prefer.
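A minimal sketch of the sampling view (illustrative names): a vector shape is a coverage function, and a pixel's value is an area average estimated by supersampling, rather than a 0/1 test of a "little square":

```haskell
-- A shape as a point-membership predicate.
type Shape = Double -> Double -> Bool

circle :: Double -> Shape
circle rad x y = x*x + y*y <= rad*rad

-- Estimate the pixel's coverage by averaging an n*n grid of sample points
-- over the unit pixel whose corner is at (px, py). A production rasterizer
-- would use a better reconstruction filter than this box filter.
coverage :: Int -> Shape -> Int -> Int -> Double
coverage n shape px py =
    sum [ 1 | i <- pts, j <- pts
            , shape (fromIntegral px + i) (fromIntegral py + j) ]
      / fromIntegral (n * n)
  where
    pts = [ (fromIntegral k + 0.5) / fromIntegral n | k <- [0 .. n-1] ]
```

Edge pixels come out with fractional coverage, which is exactly the gray-level antialiasing that the "filled little squares" model cannot express.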

Subpixel Rendering

If you know the actual structure of your pixels, there are many interesting techniques available. Unfortunately, they tend to require a lot of specialized code for different displays.

cf. subpixel rendering on Wikipedia

Generative Modeling Language

You may find Sven Havemann's GML interesting (see his site for some nifty examples and various references). Its surface syntax is much like PostScript, but it is a 3D modeling language. From his thesis:

The central problem is the absence of semantic information in the model ... A single triangle that is part of a large triangle soup does not know whether it is part of a wall, a door, or a car. ...

The solution to this problem offered by generative modeling is not to store the result of a design process, but to represent a model by the design process itself: Store the tools used rather than the models created. The model can be re-generated from the operations whenever it is needed, and at any resolution that is needed.
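A toy version of "store the tools, not the model" (hypothetical constructors, not actual GML): keep the shape as the operations that built it, and tessellate at whatever resolution is requested.

```haskell
-- A model is the record of its construction, not a fixed mesh.
data Model = Circle Double                  -- radius
           | Translate Double Double Model  -- offset an existing model

-- Regenerate polygon vertices from the operations, at n segments;
-- call again with larger n for a finer result.
tessellate :: Int -> Model -> [(Double, Double)]
tessellate n (Circle rad) =
  [ (rad * cos t, rad * sin t)
  | k <- [0 .. n-1]
  , let t = 2 * pi * fromIntegral k / fromIntegral n ]
tessellate n (Translate dx dy m) =
  [ (x + dx, y + dy) | (x, y) <- tessellate n m ]
```

The same `Model` value yields a coarse or fine mesh on demand, which is the "re-generate at any resolution" property the thesis describes.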


Context Free Art

Yet another generative image paradigm:

Regarding Shape Grammars

Bakul Shah and Tegiri Nenashi both point to shape grammar techniques for describing scenes. Such techniques seeded my interest in more general applications of generative grammars. I wouldn't really call them `graphics primitives`, but it is just as important to have good composition mechanisms as to have good primitives.

A common problem with shape grammars is that it is difficult to encode practical requirements - such as physical laws or building codes. In a recent discussion of generating 3D worlds for FoNC, Devon Sparks says it well:

There's a trend in architecture schools to offload the form-finding "creative burden" to computers with the use of shape grammars. Though they're a driving force in many departments, some will admit behind closed doors that they're also a bit of a red herring, and that years in the spotlight have yet to bear fruit. My own observations are that, rather than easing the burden, shape grammars have shifted the focus of labor: students trade their Olfa knives for a keyboard and mouse, and spend hours debugging Rhino scripts instead of erasing lines. Because most grammars are agnostic to physical law, they also generate needlessly inefficient, material-laden architecture, which rightfully sends the building scientists into the streets screaming blasphemy.

GML does define graphics

GML does define graphics primitives, but its focus is on generative, procedural composition rules. It is used to describe architectural structures, though I suppose you could use it for a scene graph. I'm not sure how GML fits into the larger world of architecture. Your note reminds me to take another look at William Mitchell's "The Logic of Architecture: Design, Computation, and Cognition" (MIT Press), which, IIRC, talks about a language of architectural design with its own design operators and grammar rules. I think he even used Prolog to describe some of this! (The book was published in 1990.) But I don't see why something like GML can't be used to describe, say, a reference model of a house, with other systems such as electrical wiring described in a parallel model somehow overlaid with the graphics model. Thanks for the FoNC reference.