Syntactic/Semantic Control in MDA Framework

Hi at all,
I'm an italian student and I'm working on a thesis about semantic verification of xml models in MDA (Model Driven Architecture) context.
MDA is an OMG standard based on visual programming languages and automatic generation of code.
In particoular i'm working on a framework that allows programmers to define SOA three tiers web applications.
The programmer can define frontend and backend of these applications in terms of workflows (i.e. graphs).
The problem is that the syntax of these visual programming languages (the graphs for frontend and backend) is underconstrained, so a programmer can define senseless combinations of states.
There are at least two types of errors:

1) Two (type of) states are connected with a transition, but this transition is semanticaly senseless

2) Two (type of) states are connected with a transition, but a pre/post-condition involving other states doesn't hold. For example a transition between two (type of) states is admitted if and only if there is a transition from the fomer/latter state to another state.

The first type of constraints is easy to impose: a simple table could be added to the framework so these type of senseless cuold be checked on-the-fly.

The second type of constraints involve more complex reasoning, and i think there are basically two types of solutions:

a) To check the models (graphs) defined by the programmers, using techniques inherited from Model Checking (formal verification). In this case the control is a post-development control.

b) To define a precise syntax and semantic using graph grammars. I'm quite ignorant in the domain of graph grammars, so i don't know if these instuments are intended for this purpose.

I'm writing to you 'cause i think the problem about syntax and semantic of visual programming languages is undertheorized.
The first solution i proposed has some problems: model checking is intended to check concurrent system, and not to check the problem i described. On the other and, logic, and CTL logic in particoular, is general enough to represent every type of constraint. Still, there are some computational complexity problems (i.e. state explosion).
The second solution is only a vague idea: i don't know graph grammars...
I would like some opinions, references, suggestions or something else.
I know this post is quite general, but there are too much questions that i will describe later, on demand, in order to avoid to write a too long post.
Best regards,

Alberto

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

That's quite a difficult

That's quite a difficult problem you're aiming at! Based on my reading, what you're aiming for is something like a workflow typechecker that can handle protocol and state transitions based on predictable interactions between services. I've looked at this a few years back, but ended up deciding to put it on the backburner for my own distributed programming language project.

There are multiple service processes in a network, each with incoming and outgoing connections. Modulo caching, other such optimizations, and explicit forms of service interaction, you can assume that each connection gets its own 'copy' of a service. A connection with a service may therefore assume an initial state in a manner isolated from the other users of the system. This state will change in response to messages received, as shall the state of the connecting service (e.g. both the browser and the webserver will change). It is meaningless to deliver certain messages when the recipient is in certain states. Thus, you feel some need to identify messages and statically ensure that the mutual states of both sender and receiver will be compatible. (This becomes much more complicated when dealing with more than two services interacting.) In addition to plain messages, delivery and management of 'streams' of data is also a critical concern that should be handled properly by this model, especially if any sort of multimedia is to be handled.

It's a good thing you aren't concerning yourself with resource management, delay and disruption tolerance, survivability, graceful degradation, resilience and graceful recovery, vulnerabilities to DOS and other attacks, overlay network protocols, QoS, security, anonymity, etc. Those are what overloaded and fried my brain on this subject. Overlay network protocols, especially, are problematic given the need to verify against flooding and other problems. There are reasons that not many communications model checkers or automated systems have not been implemented yet.

What you need is something like a type-system, but modified to support state dependent types. There is no way graph grammars could really help you... at least not without the same sorts of enhancements that would turn them into full automated theorem proving systems like type-checkers. Graph grammars essentially solve the same problem as linear grammars, just in two or more dimensions. They can parse or generate graphs to obtain abstract structures capable of processing... but you still need to do the processing, the same sort of thing you're banging your head against.

Now I've been doing some work on integrating type checkers into the grammar (mostly because I believe type-dependent syntaxes will be a really cool feature for typeful programming) but even figuring out how to do that is enough to give me headaches. So I won't call it impossible to use graph grammars to solve the problem; it probably can be done. But it will not be the easiest or most straightforward of solutions.

For now, I suggest you continue on the path of logical analysis and keep in mind the rules of optimization:

  1. Don't Optimize
  2. (Experts Only:) Don't Optimize Yet

Go ahead and transform the graph model into a bunch of logical statements, include your own set of axioms, and check for inconsistencies. Perhaps you can make certain checks by walking a graph or whatever, but those can be done later when you know exactly what you're trying to do.

Metamodels and model-checking

Tools such as Vanderbilt's Generic Modelling Environment address your first problem by defining a visual language in terms of a metamodel that uses a combination of UML and OCL to provide a precise definition of allowable constructions. Users building a diagram in particular visual language are then constrained to only being able to build diagrams that meet the constraints. You could, for example, define several different subtypes of a generic state element, and define constraints that prevent transitions from being made between different types of states.

On the model-checking front, there have been a number of projects that have looked at translating UML state diagrams into various model-checkable formalisms (Promela and CSP are the ones I'm most familiar with). Papers such as Jussila et al. Model Checking Dynamic and Hierarchical UML State Machines and Ng and Butler, Towards Formalizing UML State Diagrams in CSP might give you a place to start from.

Having said all of that, I think you're right that these kinds of problems are somewhat undertheorized, and certainly the model-checking approach does run into problems with state explosion that often require human intervention to build more abstract system models. So there's probably plenty of scope for doing more work in this area.

More precisely...

Dear David, dear Allan,

thank you so much for your answers.
Although my description was too general, you have caught the core of the problem.
Now i will try to explain better the situation.

MDA standard, as specified by OMG, provides for a four layer abstraction where:

- m0 layer is the level of "reality"
- m1 layer is the level of the model of reality.
- m2 layer is the level of the model of the model of reality: in other words it's the model of the modeling language used to model the reality. This layer is called meta-model.
- m3 layer is the last layer of abstraction: following the rationale, it's called meta-meta-model or MOF. (It's self-defined)

Focusing on m2 layer, OMG prescribes to use UML metamodel and, if necessary, a couple of UML profiles (Domain Specific Languages) to reduce the semantic gap between the expressive power of UML metamodel (too general) and what a modeler wants to describe in a specific contexts.
The standard doesn't impose to use UML metamodel: one could define his own metamodel provided that it's MOF (m3 layer) compatible.

The project on which i'm working on is inspired by MDA, but not follows exactly what it prescribes, so probably we have to call it MDE, or, as Johan den Haan has suggested, MDx .

In my case at m2 layer we find a custom metamodel (not MOF compliant) expressed in XML (an XSD, XML Schema) which defines four Domain Specific Languages (we could call them "profiles", analogously with UML) containing the models that the users of the framework can handle graphically to define their (SOA three tiers) web applications.
These DSL's are:

- Front-end DSL
- Back-end DSL
- Queries DSL
- Datamodel DSL

Exluding Queries and Datamodel DSL's, where programmers work filling predefined forms, the problems arise in the context of front-end and back-end DSL's, where the concepts have to be expressed in terms of graphs.
Focusing on back-end, a programmer can define a Service as a set of Actions (methods), where every action is expressed in terms of a graph.
So the programmer can choose from a certain number of (type of) states, and connect them in order to express what he want to express.
The problem is there isn't any control, so one can define semantically wrong combinations of states.

From a linguistic point of view, we can describe this situation in this way: the XML Schema is the core of the metamodel and represents a grammar that generates a language.
Concrete phrases of this language are the models, which are the elements that constitute the four DSL's above-mentioned.
The fact is these phrases are not "simple phrases" of a language, but constituate themselves a sort of underconstrained grammar where:

-> the lexical level is well-defined ('cause it's XML-based)
-> the syntactical level is "under-defined", in the sense that a programmer can define more complex phrases (graphs) constrained by what the framework allows him to do from a graphical-interactional point of view. So the syntactical level is underdefined 'cause it precisely expresses that a state can have transitions, but it does not express that a certain type of state can't have a transition towards another certain type of state.
-> the semantical level lacks (the semantic is implicit in the head of the programmer, at the moment).

Starting from this analysis, there are at least three approaches to the problem:

1) MDA supply only a "Linguistic Metamodel", and literature suggest to define an Ontological Metamodel on which one could define a reasoner to reason about models defined by programmers. Still, how one could define this reasoner? I think there is a lot of work to do in this direction (see also)

2) In analogy with compiler theory, where the syntactical level is based on a abstract syntactical tree and a parser, one could think to define a sort of syntactical tree and a parser for the visual-graph-based languages under study. In this sense I have thought about graph-grammars.

3) In order to compensate the lack of the syntactical level, a valid solution could be to define a list of syntactical/semantical constraints that a model has to respect. In particoular model checkers seems to be a good solution: I could associate a logical formula to every type of state, and then express syntactical/semantical constraints in terms of (CTL) logical formula. The main problem is that the computational complexity of the labelling algorithm on which model checking is based on, is O(f * V * (V + E)), where f is the number of the logical connectives in the formula, V the number of states and E the number of the transitions.

Before your answers I was undecided between these tree ways of reasoning, although the last one seemed the be the more practicable.
Now I think the last one is the solution on which I will concentrate.
Thank you a lot,

Alberto