## Operational Research (OR)/ Constraint Programming (CP)using GPUs

Dear all:

Recently one sees lot of work on porting computational extensive algorithums on GPUs. I wonder if there is a work or interest with in the CP/OR area in this direction?

Thanks,
malik

## Comment viewing options

### Not so suited for CP

Most constraint programming propagators are not well suited for implementing on a GPU. The problem is that GPUs are made for data-parallel processing, while propagators are very seldom data parallel.

### True...but you could

True...but you could parallelize the solution process: the propagator creates a highly parallel network of equations that are then fed to the GPU. I don't understand CP so well, so I might be completely misunderstanding this, but I do know I get pretty good performance using a simple system of assignments whose solutions are not visible until the next step (then we just apply the steps until we converge to a solution). This is how GPU-based physics engines work.

### What equations?

I'm not sure what kind of equations you would like to construct. A constraint in a CP system is typically implemented by a propagator that removes inconsistent values from the set of possible values for its variables. A constraint may be a simple equation (3x+y=z), a set of equations (i.e., all_different(x1,...,xn), meaning AiAj.xi!=xj), or some complex constraint (e.g., some variables form a tree, a set of tasks form a disjunctive schedule, a sequence of variables belong to a regular language R).

My guess is that you mean that one should use a GPU to execute the propagators themselves in parallel. There are unfortunately several problems with this. For example, the set of propagators is typically very heterogeneous, with large differences in size, complexity, and scope. Another problem is that the propagators operate on a relatively small set of shared variables, leading to either costly synchronization issues or to costly merge operations. A big problem is also that many interesting propagators use lots of loops, dynamic memory allocation, and pointer based data structures. All things that are ill-suited for GPUs.

As a general comment, I'm not at all averse to GPU-based computation, I just think it is unlikely that CP will fit onto that kind of hardware. I'd love to be proven wrong.

### As I said, I really don't

As I said, I really don't understand this field so well, but take 3x + y = z, we can then transform that into three assignments:

x' := (z - y) / 3
y' := z - 3x
z' := 3x + y


Each assignment reads old values, where x', y', z' do not become x, y, z until after the entire block of assignments has occurred. This allows us to represent x, y, z, as two GPU textures: one being read and one being written, where we "swap" the textures after each assignment block retires.

The above assignment obviously won't converge as they don't retain previous results (necessary for iterative convergence), so we modify the assignments to only take so much of the new values, say 33 percent:

x' := x * .66 + ((z - y) / 3) * .33
y' := y * .66 + (z - 3x) * .33
z' := z * .66 + (3x + y) * .33


Now, run this assignment block 100 times and the values of x, y, and z will approximately conform to the constraint. Additional constraints can easily be added as additional assignments, and the GPU will really excel when there are hundreds/thousands of constraint-related variables with similar sets of constraints (issues like branch coherence become an issue if the constraints are too dis-similar).

The CPU can just naively generate a bunch of assignments from the set of constraints that we want to maintain, then we use the brute force of the GPU to find approximate answers for the variables using these assignments by executing them lots of times.

Anyways, this is how my physics engine works, which is a special/limited case of constraint programming. I'm curious how that would apply to constraint programming in general, but I'm not very knowledgeable in this area.

### CP variables contain sets of values

In a constraint programming system, a variable contains the set of values that are still possible for that value. This means that the translation into equations that you proposed isn't really meaningful unfortunately.

As a simple example on how CP works, consider a 9Ã—9 Sudoku board (nowadays the standard example in CP). You can represent each field by a variable with the initial set of values {1,2,3,4,5,6,7,8,9}. The constraints are an "all different" on each column, row, and major block. Say you have a row of variables x1,...,x9. The puzzle specification says that x3 should have the value 2. Then 2 is removed from all other variables in the row (each now having a domain {1,3..9}). There are also more advanced filterings used. For example, if the union of the domains of k variables from the row has exactly k values, then no other variables can use those values. For example, let the domains be x1={1,3,7}, x2={1,7}, x4={1,3,7}, x5..9={1,3..9}. Then all variables x5..9 can be updated to have the new domain {4..6,8..9}.

For the interested, a nice overview of how to propagate the all different constraint is the survey by Willem-Jan van Hoeve.