Developer Guided Code Splitting

Google Web Toolkit, which compiles Java to JavaScript for running code in the browser, now includes Code Splitting, for reducing application download time:

To split your code, simply insert calls to the method GWT.runAsync at the places where you want the program to be able to pause for downloading more code. These locations are called split points.

A call to GWT.runAsync is just like a call to register any other event handler. The only difference is that the event being handled is somewhat unusual. Instead of being a mouse-click event or key-press event, the event is that the necessary code has downloaded for execution to proceed.
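
To make the "just another event handler" analogy concrete, here is a minimal sketch of the callback shape. In real GWT the call is GWT.runAsync(new RunAsyncCallback() { ... }), with the two callback methods onSuccess() and onFailure(Throwable). So that the snippet runs without the GWT toolchain, the interface and the runAsync method below are local stand-ins; the downloadSucceeds flag and the openSettings name are hypothetical, and the stand-in invokes the callback immediately instead of after a real download.

```java
import java.util.ArrayList;
import java.util.List;

class SplitPointSketch {
    /** Local stand-in with the same two methods as GWT's RunAsyncCallback. */
    interface RunAsyncCallback {
        void onFailure(Throwable reason);
        void onSuccess();
    }

    /**
     * Hypothetical stand-in for GWT.runAsync. In a real GWT application the
     * call site becomes a split point and the callback fires later, once the
     * extra code has been downloaded. Here the "download" is only simulated
     * and the callback is invoked right away.
     */
    static void runAsync(boolean downloadSucceeds, RunAsyncCallback callback) {
        if (downloadSucceeds) {
            callback.onSuccess();
        } else {
            callback.onFailure(new RuntimeException("simulated download failure"));
        }
    }

    /** An event handler containing a split point; returns a log of what happened. */
    static List<String> openSettings(boolean downloadSucceeds) {
        final List<String> log = new ArrayList<>();
        log.add("clicked");
        runAsync(downloadSucceeds, new RunAsyncCallback() {
            @Override
            public void onFailure(Throwable reason) {
                // The code could not be fetched, so report it instead of proceeding.
                log.add("failed: " + reason.getMessage());
            }

            @Override
            public void onSuccess() {
                // Everything needed past the split point is now available.
                log.add("settings shown");
            }
        });
        return log;
    }

    public static void main(String[] args) {
        System.out.println(openSettings(true));
        System.out.println(openSettings(false));
    }
}
```

The point of the shape is that everything reachable only from onSuccess can be left out of the initial download, while onFailure gives the program a place to recover if the fetch fails.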

Lagoona, component-orientation

(I thought I read a comment on LtU about Franz, but I can't find it now.)

Lagoona apparently addresses some issues around component-oriented approaches to systems development, and has roots in Niklaus Wirth's works such as Oberon. Is anybody on LtU more familiar with this stuff, such that they could opine / compare / contrast?

(tags like: "component-oriented", "distributed extensibility", "decentralized", sorta beyond-OO, that kind of fun kool-aid.)

BSGP: bulk-synchronous GPU programming

A SIGGRAPH paper by Qiming Hou, Kun Zhou, Baining Guo, abstract:

We present BSGP, a new programming language for general purpose computation on the GPU. A BSGP program looks much the same as a sequential C program. Programmers only need to supply a bare minimum of extra information to describe parallel processing on GPUs. As a result, BSGP programs are easy to read, write, and maintain. Moreover, the ease of programming does not come at the cost of performance. A well-designed BSGP compiler converts BSGP programs to kernels and combines them using optimally allocated temporary streams. In our benchmark, BSGP programs achieve similar or better performance than well-optimized CUDA programs, while the source code complexity and programming time are significantly reduced. To test BSGP's code efficiency and ease of programming, we implemented a variety of GPU applications, including a highly sophisticated X3D parser that would be extremely difficult to develop with existing GPU programming languages.

The language aims to simplify CUDA programming (CUDA reminds me of assembly code, even though it uses C syntax) by providing, among other things, a higher-level memory model and implicit data flow, so you don't have to explicitly partition your code into separate kernels. Here is one trick that really impressed me:

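findFaces(int* pf, int* hd, int* ib, int n) {
  spawn(n*3) {
    rk = thread.rank;   // one thread per index-buffer entry
    f = rk/3;           // face id
    v = ib[rk];         // vertex id
    thread.sortby(v);   // reorder the threads by vertex id
    require
      owner = dtempnew[n]int;  // temporary list
    rk = thread.rank;   // rank after sorting
    pf[rk] = f;
    owner[rk] = v;
    barrier;
    // first thread for vertex v records where its faces start
    if (rk == 0||owner[rk-1] != v)
      hd[v] = rk;
  }
}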

After the call to sortby, the threads' ranks have been reassigned so that the threads are ordered by their values of v. There is no need to explicitly sort a list or some other auxiliary data structure that would have to be allocated in memory. In other words, the call forces a reality where all the threads are coincidentally arranged in the way we want them to be... an interesting PL concept.
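
For contrast, here is a sequential sketch in plain Java of what findFaces computes when the sort is written out by hand. This is my own emulation, not code from the paper: the class name and the order array are hypothetical, I assume vertex ids lie in [0, hd.length), and I use a stable sort (the text above does not say whether sortby is stable). The order array is exactly the kind of auxiliary structure that sortby makes unnecessary.

```java
import java.util.Arrays;

class FindFacesSketch {
    /**
     * ib: index buffer of n triangles (3 vertex ids per face).
     * pf: output of length 3n, face ids grouped by vertex id.
     * hd: output, hd[v] is the position in pf where vertex v's faces start.
     */
    static void findFaces(int[] pf, int[] hd, int[] ib, int n) {
        int total = n * 3;

        // One entry per "thread", identified by its original rank.
        Integer[] order = new Integer[total];
        for (int i = 0; i < total; i++) {
            order[i] = i;
        }

        // The explicit sort that thread.sortby(v) replaces.
        // Arrays.sort on an object array is stable.
        Arrays.sort(order, (a, b) -> Integer.compare(ib[a], ib[b]));

        for (int rk = 0; rk < total; rk++) {
            int oldRank = order[rk];
            int v = ib[oldRank];   // vertex id
            pf[rk] = oldRank / 3;  // face id
            // The first entry for vertex v marks where its face list begins.
            if (rk == 0 || ib[order[rk - 1]] != v) {
                hd[v] = rk;
            }
        }
    }

    public static void main(String[] args) {
        int[] ib = {0, 1, 2, 2, 1, 3};  // two triangles sharing vertices 1 and 2
        int[] pf = new int[ib.length];
        int[] hd = new int[4];
        findFaces(pf, hd, ib, 2);
        System.out.println(Arrays.toString(pf));
        System.out.println(Arrays.toString(hd));
    }
}
```

In the BSGP version none of this bookkeeping appears in the source: each thread simply reads thread.rank again after the sortby call.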