Call by Meaning

An exciting new paper in my Google Scholar feed by Hesam Samimi et al. Abstract:

Software development involves stitching existing components together. These data/service components are usually not well-understood, as they are made by others and often obtained from somewhere on the Internet. This makes software development a daunting challenge, requiring programmers to manually discover the resources they need, understand their capabilities, adapt these resources to their needs, and update the system as external components change.

Software researchers have long realized the problem why automation seems impossible: the lack of semantic “understanding” on the part of the machine about those components. A multitude of solutions have been proposed under the umbrella term Semantic Web (SW), in which semantic markup of the components with concepts from semantic ontologies and the ability to invoke queries over those concepts enables a form of automated discovery and mediation among software services.

On the other hand, programming languages rarely provide mechanisms for anchoring objects/data to real-world concepts. Inspired by the aspirations of SW, in this paper we reformulate its visions from the perspective of a programming model, i.e., that components themselves should be able to interact using semantic ontologies, rather than having a separate markup language and composition platform. In the vision, a rich specification language and common sense knowledge base over real-world concepts serves as a lingua franca to describe software components. Components can query the system to automatically (1) discover other components that provide needed functionality/data (2) discover the appropriate API within that component in order to obtain what is intended, and even (3) implicitly interpret the provided data in the desired form independent of the form originally presented by the provider component.

By demonstrating a successful case of realization of this vision on a microexample, we hope to show how a programming languages (PL) approach to SW can be superior to existing engineered solutions, since the generality and expressiveness in the language can be harnessed, and encourage PL researchers to jump on the SW bandwagon.
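
To make the three capabilities listed in the abstract concrete, here is a minimal sketch of what "calling by meaning" might look like. It is not the paper's implementation (which builds on a large common sense knowledge base); the toy ontology `IS_A`, the conversion table `CONVERT`, the `REGISTRY`, and every component and method name below are hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical toy ontology: each concept maps to its parent concept.
IS_A = {
    "celsius_temperature": "temperature",
    "fahrenheit_temperature": "temperature",
}

# Hypothetical conversions between two representations of one concept.
CONVERT = {
    ("celsius_temperature", "fahrenheit_temperature"): lambda c: c * 9 / 5 + 32,
    ("fahrenheit_temperature", "celsius_temperature"): lambda f: (f - 32) * 5 / 9,
}


def is_a(concept: Optional[str], ancestor: str) -> bool:
    """Walk up the ontology to check whether concept is a kind of ancestor."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


@dataclass
class Method:
    name: str                   # the provider's own name; callers never need it
    returns: str                # concept describing the returned value
    call: Callable[[], float]


@dataclass
class Component:
    name: str
    methods: List[Method]


# Assumed global registry of components annotated with concepts.
REGISTRY: List[Component] = [
    Component("weather_station", [
        Method("read_temp_f", "fahrenheit_temperature", lambda: 212.0),
    ]),
]


def find(concept: str) -> List[Tuple[Component, Method]]:
    """(1) and (2): find components and the methods that yield the concept."""
    return [(c, m) for c in REGISTRY for m in c.methods
            if is_a(m.returns, concept)]


def get_as(wanted: str) -> float:
    """(3): fetch a value and present it in the form the caller asked for."""
    general = IS_A.get(wanted, wanted)
    for _component, method in find(general):
        value = method.call()
        if method.returns == wanted:
            return value
        convert = CONVERT.get((method.returns, wanted))
        if convert is not None:
            return convert(value)
    raise LookupError(f"no component provides {wanted}")
```

A caller that wants a Celsius reading would write `get_as("celsius_temperature")` without knowing that the only provider is called `weather_station`, that its method is `read_temp_f`, or that it reports Fahrenheit. The hard part the paper is after, of course, is doing this against a rich knowledge base rather than a two-entry dictionary.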

Amusing only to me?

On the other hand, programming languages rarely provide mechanisms for anchoring objects/data to real-world concepts.

Except all of the common languages that are primarily OOP...

I mean it's certainly a good point that people should be using interfaces rather than names for behavior discovery, but I don't see how anything here is discovering problems that haven't been known and practically solved for years now. Hell, there's been utils to try and auto-convert data for years even.

Prolog + Meaning Annotations?

This seems similar on the surface to what I am working on: an imperative language with a logic language for combining the components. However, I am interested in statically typing the components and using the logic language for processing axioms and concepts, whereas this seems to be adding additional annotations to objects and using a big database of existing facts to reason about them.

It seems to be Prolog with a large general knowledge base, and a very 1960s approach to language reasoning (Cyc is 30 years old). I suspect 30 years ago Cyc needed its customised reasoning approach to be performant on old hardware, but I suspect Prolog on modern hardware would be plenty fast enough.

This Prolog approach seems fine for restricted domains of knowledge (for example, proving distributivity of a Heyting algebra), but I don't think it's the solution to natural language processing and interpreting human meaning. To interpret human meaning requires hard AI; it requires an understanding of the semantics, not just symbolic processing. (Although maybe, if Wittgenstein was right, combining enough simple symbolic language games together has the overall effect of understanding language.)

Symbolic vs statistical AI

I don't believe much in symbolic games, but it does seem likely that using clever statistics on large amounts of text will have the overall effect of understanding language. The success of Google Translate is a good example.

Also I suspect that the need for "interpreting human meaning" is a bit overrated, because actual humans often seem to skip it! They rely on pattern-matching in conversations instead, as described in the post We Are Eliza by Phil Goetz.

Faster Processing + Probabilistic Approach

It would be a somewhat interesting outcome if the original symbolic approach to AI only failed due to lack of resources (database size, inferences per second, etc.). That said, I think the original true/false approach is insufficient; the statistical approach is the strongest and most likely place for success. I find the connectionist / brain modelling approach too inefficient and lacking synergy with the realities of computational hardware.

The term linguists use for

The term linguists use for pattern matching behavior is construction grammar.

Phil Goetz's idea of aggressive stupidity, "Whenever they are confronted with a user action which makes no sense, they should make assumptions until it makes sense.", actually sounds like a good thing for an intelligent agent to do.
It also matches well with predictive coding ideas for cortical processing. http://users.monash.edu/~naotsugt/Tsuchiya_Labs_Homepage/Bryan_Paton_files/Commentary%20.pdf

elephants don't play chess

Symbolic reasoning has been out of fashion since the last AI winter. Cyc and company (WordNet, VerbNet, FrameNet, etc.) are probably the last hand-constructed KBs we'll see; machine learning is getting good enough to take over, including automatically inferring ontological IsA and HasA relationships (yep, machine-learned models are often OO, along with biological life in general).

Machine learning has one big advantage: it bypasses our bias about how we think the world should work and goes straight to what is. On top of that, it can generalize over TBs of data that normal humans would choke on.

We're still constructing knowledgebases by hand, but indirectly

the last hand constructed KBs we'll see

I think projects like DBpedia are a strong counterexample to the claim that we're not constructing KBs by hand. Big KBs are not necessarily being constructed purely by hand, but the data comes, eventually, from human authors filling in templates and infoboxes, and with a bit of normalization, this becomes some pretty regular, well-structured data.
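
As a rough illustration of the "bit of normalization" step, here is a minimal sketch that turns hand-filled infobox fields into subject/predicate/object triples. This is not DBpedia's actual extraction code; the function `infobox_to_triples`, the simplified wikitext handling, and the `SAMPLE` infobox are assumptions made up for this example.

```python
import re
from typing import List, Tuple

# Illustrative sample of a hand-filled infobox (simplified wikitext).
SAMPLE = """{{Infobox programming language
| name = Prolog
| Designed by = [[Alain Colmerauer]]
| first appeared = 1972
| paradigm = [[Logic programming|Logic]]
}}"""

# Matches a "| key = value" line inside the template.
FIELD = re.compile(r"\s*\|\s*([^=|]+?)\s*=\s*(.*?)\s*$")
# Matches [[Target]] or [[Target|Label]] and captures the visible text.
WIKILINK = re.compile(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]")


def infobox_to_triples(subject: str, text: str) -> List[Tuple[str, str, str]]:
    """Normalize infobox fields into (subject, predicate, object) triples."""
    triples = []
    for line in text.splitlines():
        match = FIELD.match(line)
        if match is None:
            continue  # template header, closing braces, or free text
        key, value = match.groups()
        if not value:
            continue  # the author left this field empty
        predicate = key.lower().replace(" ", "_")  # regularize key spelling
        obj = WIKILINK.sub(r"\1", value)           # strip link markup
        triples.append((subject, predicate, obj))
    return triples
```

Calling `infobox_to_triples("Prolog", SAMPLE)` yields triples such as `("Prolog", "designed_by", "Alain Colmerauer")`: the facts themselves were typed in by a human, and the code only regularizes them.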