Little language for use within Java, suited for users to define "rules"

OK. I'm faced with a bunch of decent choices for a problem at work, and it might be a general enough problem to be of interest to others.

I have a program that needs to scan through an annoying log file to pick out and reassemble individual low-level events into "operations" -- one action taken by a user or administrator might result in a large number of low-level events, but we're really interested in what the user thought he or she was doing, hence the reassembly.

So far, I've made the log interpretation work fairly well by "pretty-printing" various event and attribute names using the terminology users prefer, rather than the system's names. Now, the customer wants to go further: I need the code to check for certain combinations of attributes within the events and then label the operation records appropriately.

I can do this in code. I could do this sort of check efficiently if I wanted to add a rule-based inference system, but I fear that this is more solution than is needed or desired. For various reasons, I don't want the customer writing code, but I need good performance (there are a *lot* of operations per day).

In rough order of increasing complexity and power:

  • Simple expression language, like JXPath
  • "Full" expression language, like the EL from the JSP spec
  • Object graph navigation language (OGNL), which includes a limited lambda function capability
  • Inference engine, hopefully with a Rete implementation (e.g. Drools)
  • Java-based little language, like Groovy or Jython
  • Embedded Scheme interpreter, with some sort of user-friendly DSL
  • Java "fragment" compiler (Janino or some sort of direct byte code writer)

So, at the start of the list we have a bunch of expression languages, limited, but decent at expressing relationships between parts of an object tree (the operation and its associated attributes). In the middle, we start seeing rules as a DSL, and at the end, we get full languages shoehorned into a seemingly simple log interpretation program.

I'd like to keep this program fast and relatively simple, while providing the maximum flexibility to users creating rules or expressions to filter out the interesting events. I'm wary about full languages -- too powerful, possibility of writing code with side-effects or thread issues -- but I don't know whether ordinary expression languages will do the trick. The middle alternative of defining a limited language that can be analyzed or checked for problems without being run sounds appealing.
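To make the "limited language that can be checked without being run" idea concrete, here is a minimal sketch in Java. Everything in it is hypothetical and not from any of the libraries listed above: the names Condition, Rule, and check, the choice of only two operators, and the assumption that an operation's attributes can be viewed as a Map of strings. Because a rule is plain data rather than code, it can be validated against the set of known attribute names before it ever sees a log record, and it cannot have side effects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** One test on a single attribute of an operation record (hypothetical model). */
final class Condition {
    enum Op { EQ, NE }

    final String attribute;
    final Op op;
    final String value;

    Condition(String attribute, Op op, String value) {
        this.attribute = attribute;
        this.op = op;
        this.value = value;
    }

    /** A missing attribute never equals the value, so EQ fails and NE succeeds. */
    boolean matches(Map<String, String> attributes) {
        boolean equal = value.equals(attributes.get(attribute));
        return op == Op.EQ ? equal : !equal;
    }
}

/** A label applied to an operation when every condition holds. */
final class Rule {
    final String label;
    final List<Condition> conditions;

    Rule(String label, List<Condition> conditions) {
        this.label = label;
        this.conditions = conditions;
    }

    /** Static check: reports problems without evaluating the rule on any data. */
    List<String> check(Set<String> knownAttributes) {
        List<String> problems = new ArrayList<>();
        if (conditions.isEmpty()) {
            problems.add(label + ": rule has no conditions");
        }
        for (Condition c : conditions) {
            if (!knownAttributes.contains(c.attribute)) {
                problems.add(label + ": unknown attribute '" + c.attribute + "'");
            }
        }
        return problems;
    }

    /** Evaluation: true only if all conditions match the given attributes. */
    boolean matches(Map<String, String> attributes) {
        for (Condition c : conditions) {
            if (!c.matches(attributes)) {
                return false;
            }
        }
        return true;
    }
}
```

A rule that misspells an attribute name is rejected by check() when it is loaded, rather than silently never matching at run time. Whether two operators are enough is exactly the open question in the post; the point of the sketch is only that the rule stays inspectable.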

Thoughts? Examples? References?

Jess?

I picked up a book called Jess In Action a little while back and it seems to fit your bill pretty well. Basically a DSL Lisp dialect meant to be embedded in Java. Pretty powerful, lispy, and well maintained.

License

If you do look at Jess, look at the license first. We looked at it some time ago for possible use in an academic project, and the license terms were quite restrictive, if I remember correctly.

Hecl

I'm a little bit leery of publicizing this much right now, given that I still don't see it as being ready for prime time, but what the heck. Hopefully this is a forum where people will not just kick it if they don't like it, but give me ideas for improving it.

I've written a 'little language' myself, called Hecl, which aims to be small, dynamic, and easy to learn.

http://dedasys.com/freesoftware/hecl/

I'm not sure it's what you need, and you may even cringe at my code, because I haven't got a lot of Java experience. It was a fun project though, and I do think the idea is worthwhile enough that I will keep working on the system.

jacl

Forgive me for not having spent enough time to read the documentation to work out this answer for myself, but maybe the answer is of interest to other LtU readers: how is hecl different from jacl, in terms of design goals and use?

Re: Jacl

The key difference seems to be first class references, which Jacl (and indeed, Tcl) does not have. Looks like David has dropped the [expr] command in favour of individual math commands, changed [uplevel] to [upeval], and if/while etc now take commands instead of expressions for the condition part. At least, that's what I can make out from the examples. It looks interesting, but I wonder if it is different enough to, well, make a difference.

Smaller, simpler

Hecl is smaller, simpler, and, well, maintained.

I've certainly borrowed ideas from Jacl, which is nice code, but while I was at it, I wanted to do some things differently from Tcl.

I like Tcl, generally, but also needed something smaller, to fit in a reduced environment like that of my cell phone. I'd like to build the core for that and then add extensions.

Aside from some effort on Neil's part, Jacl isn't really maintained as far as I can tell (it doesn't even compile on my system, using gcj). Also, Jacl will always be constrained by trying to keep up with Tcl.

Actually, this might make for an interesting discussion in and of itself: how to grow a standard high-level language with different implementations (Jython/Python, for instance). Especially languages like Python, Perl, PHP and Tcl that have, from the outset, counted on C libraries to do heavy/fast lifting for certain tasks.

I do not mention Tcl or Jacl on the web page because I think that may be a bit of a liability in terms of marketing the idea, unfortunately. Jacl is, of course, mentioned in the NOTICE file to give credit where it's due.

I've had the idea in my head for a while... the original version of Hecl was written in Erlang, of all things, to provide an environment for scripting web pages for those not wanting to deal with what they viewed as a strange language.

Maintained

I'm not sure about the state of Jacl these days. (Is anyone?) My own contributions stopped some time ago, as I rarely use Java these days. There was a flurry of activity on the mailing list about 6 months or a year ago, but I haven't seen anything since.

I'd like to look into Hecl some time, especially how references work in the language. I have some tutorial chapters to look at before then, though... ;)

Hecl looks nice

To make variable access more concise, you could treat variables like functions:

var       instead of    set var
foo 1     instead of    lindex &foo 1
foo a     instead of    hget &foo a

Confusion

Down that path lies confusion...

How do you tell commands and variables apart?

$foo is basically equivalent to 'set foo', so I'm not opposed to adding in a bit of syntax sugar, just as long as things don't start to look like line noise.

http://dedasys.com/davidw/

Easy


"How do you tell commands and variables apart?"

You don't. You just remember their names. :-)

Frink again

You might play with Frink as an option.

Frink has its roots in another language that I worked on many years back, which was designed to let users write rules for a rule-based water reservoir simulation. The language was designed to help prevent users from writing rules with lots of side-effects.

It's not quite clear what type of rules your users may be implementing, and Frink may well be overkill for your situation.

Frink is intended to be rather easily embeddable in Java programs, or vice versa--you can call any Java methods from within Frink.

It would be possible to write rules with side-effects in Frink if someone used the object-oriented features. Otherwise, functions are generally free of side-effects (there are no global variables, for instance.)

It certainly won't be as fast as a language written specifically for the task, of course. I'd usually lean toward a custom language, depending on your problem. Writing your own language is either a last resort, or an optimal solution, depending on your outlook.

Embedding Frink in a Java program can be rather easy. In some cases, it's two lines of code. See the embedding documentation, especially the documentation of the Frink class for more information.

In any case, it sounds like a fun problem. You have some interesting references in this article, too. Thanks!

SISC Works For Me

As part of the Introduction to AI course that I TA, I have written a lot of SISC and Java code for spam filtering. Several techniques the students try are rule based. We also do a bit of deduction using Schelog as the logic engine (I suggest you use Kanren though; it should be faster). I'm quite happy with SISC for this task and recommend it. It should be easy for you to create a little language if you don't want to expose the full power of Scheme. Note that Kanren doesn't use continuations, so you may achieve a speed-up by using one of the Scheme->Java compilers (e.g. Bigloo, Kawa). The only caveat is that Kanren and Schelog are backward-chaining systems, which may not be optimal for your application.

Drools

There is also Drools, which has a Rete implementation in Java. I haven't used it myself, but it seems quite popular in the Java/business rules communities.

Sorry

Should have re-read the initial post before adding that.

UI for expressions

You might want to look at how "smart playlists" work in iTunes and how mail filter rules work in Outlook and Apple Mail. This style of user interface is proven to be understandable by non-technical users, and it allows reasonably complicated rulesets.
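For what it's worth, the data behind that kind of UI is small. Below is a rough sketch in Java of one way to model it, assuming each row of the dialog is a (field, operator, value) triple and the whole filter is set to match either all or any of its rows. The names Operator, Criterion, and Filter, and the three operators chosen, are made up for illustration and are not taken from iTunes, Outlook, or Apple Mail.

```java
import java.util.List;
import java.util.Map;

/** The comparison picked from the middle drop-down of a row (hypothetical set). */
enum Operator {
    IS {
        boolean test(String actual, String expected) { return actual.equals(expected); }
    },
    CONTAINS {
        boolean test(String actual, String expected) { return actual.contains(expected); }
    },
    STARTS_WITH {
        boolean test(String actual, String expected) { return actual.startsWith(expected); }
    };

    abstract boolean test(String actual, String expected);
}

/** One row of the dialog: field, operator, value. */
final class Criterion {
    final String field;
    final Operator operator;
    final String value;

    Criterion(String field, Operator operator, String value) {
        this.field = field;
        this.operator = operator;
        this.value = value;
    }

    /** A record that lacks the field fails the row. */
    boolean test(Map<String, String> record) {
        String actual = record.get(field);
        return actual != null && operator.test(actual, value);
    }
}

/** The whole dialog: "match ALL / ANY of the following". */
final class Filter {
    enum Mode { ALL, ANY }

    final Mode mode;
    final List<Criterion> rows;

    Filter(Mode mode, List<Criterion> rows) {
        this.mode = mode;
        this.rows = rows;
    }

    /** With no rows, ALL matches every record and ANY matches none. */
    boolean test(Map<String, String> record) {
        return mode == Mode.ALL
                ? rows.stream().allMatch(row -> row.test(record))
                : rows.stream().anyMatch(row -> row.test(record));
    }
}
```

A structure like this is easy to build from a form and easy to store, at the cost of not supporting nested and/or groupings unless Filter itself is allowed as a row.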