another new language, Chomsky hierarchy Type-0

I would like some of you to read my short article on "Ameba, a Calculator and Metalanguage" at the following URL:
http://computer.wikia.com/wiki/Ameba,_a_Calculator_and_Metalanguage

I have tried to find a language like the one I have unveiled by mind experiments. My research has not discovered a language like it, but I am an amateur, I know no one who can proofread my work, and since I am dyslexic this document's quality is questionable. I will answer questions, correct and clarify as needed. I did spend at least a day per page proofreading and rewriting, but I just cannot see all the errors.

I am interested in comments about the technology, not particularly about proofreading errors, but the two probably cannot be separated.

I believe it to be a context-sensitive metalanguage. In general, argument syntax depends on runtime state.


Semantics?

I've read through your article but I'm afraid I haven't got much out of it. Use of context-sensitive grammars sounds interesting, but you don't explain the parsing algorithm and how you resolve ambiguities, which would be the tough part.

Also, there is no explanation of Ameba's semantics. Does the context-sensitivity refer only to the syntax of your language, or does it impact its semantics as well? If it's only the syntax, well most programming languages have a context-sensitive syntax. If Ameba's semantics relies on context-sensitive grammars, you should explain how. That shouldn't require too much typing, if that's your problem. Some formal reduction rules (i.e., small-step operational semantics) would suffice, if you can express the semantics that way. That's how grammars are already written, after all.

Thanks Mario

Sometimes simple things are conceptually difficult.

Parsing (lexical and syntactic) and semantics are recursive descent processes, with a twist. The main interpreter loop calls both built-in routines and Ameba operators to process lexemes, syntax and semantics. Although the article doesn't say so, an Ameba operator can also do lexical scanning.

My belief is that Ameba is exactly as context-sensitive as natural languages, but I do not know how to prove it.

Ameba lexical scanning, syntax analysis and semantic processing are completely contained within built-ins and Ameba operators. There is no external grammar and no compiler-compiler, just code.

Every operator is responsible for doing its own syntax analysis. Whoever codes an operator can use any syntax they desire for trailing operands. Preceding operands are already processed and on the accumulator stack.
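
To illustrate what I mean, here is a rough sketch in Python, not Ameba code; the calculator syntax, the function names and the dispatch are only my illustration, not a definitive implementation:

def evaluate(tokens, acc=None):
    # the main loop only dispatches; each operator decides what it consumes
    while tokens:
        tok = tokens.pop(0)
        if tok == '(':
            acc = op_paren(tokens, acc)   # "(" runs its own sub-parse up to ")"
        elif tok == '+':
            acc = op_plus(tokens, acc)    # "+" parses one trailing operand itself
        else:
            acc = int(tok)                # a constant just loads the accumulator
    return acc

def op_plus(tokens, acc):
    # the preceding operand is already on the accumulator; the trailing
    # operand is parsed by this operator itself, so "( ... )" also works here
    tok = tokens.pop(0)
    rhs = op_paren(tokens, None) if tok == '(' else int(tok)
    return acc + rhs

def op_paren(tokens, acc):
    # collect the parenthetical subexpression up to the matching ")" and
    # evaluate it recursively
    depth, sub = 1, []
    while depth:
        tok = tokens.pop(0)
        depth += 1 if tok == '(' else -1 if tok == ')' else 0
        if depth:
            sub.append(tok)
    return evaluate(sub, acc)

print(evaluate('1 + ( 2 + 3 )'.split()))   # prints 6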

Each operator needs to create a symbol wherever it is called. For example, the open-parenthesis operator will create a symbol for each unique parenthetical expression. Only succeeding operators will be included in these symbols. The symbol table should be saved from session to session to reduce processing time.

The operators that get arguments can create a string from the operator and arguments, and the "return" can store the expression. In the case of a parenthetical expression "return" occurs when the closing parenthesis is processed.

I don't have an answer for handling ambiguities; in fact, I am still learning about them. Whoever codes in Ameba will ultimately be responsible for creating ambiguities and for fixing them.

The math in http://en.wikipedia.org/wiki/Operational_semantics is beyond me. I am a self-taught computer language amateur, and have not previously delved into operational semantics. Sorry.

It would be possible to write this article like a microprocessor manual, using only numeric op-codes and explanations of what the op-codes do. Just as there are a variety of microprocessors with different op-codes, it is possible to make an Ameba interpreter with various built-ins. A precise definition of Ameba syntax and semantics is not actually important. I included pseudocode, much as a microprocessor manual provides mnemonic equivalents for numeric instructions, to improve readability.

My answers probably aren't what you expected, partly because Ameba is a bit outside the box, and partly because I lack a classical compiler education. My degree is in Electronics Engineering. I've actually only taken three computer science classes: Fortran, Lisp and AI. When I attended the University of Texas at Austin, circa 1970, there was no undergraduate computer science department. Since then much technology has been discovered and developed.

P.S.

The operators that get arguments do have requirements that are fairly rigid, because they are key to context sensitive argument processing. In addition, I believe lambda expressions and context sensitive arguments are incompatible, because a lambda expression requires a prototype before run time.

Is it really context sensitive?

It looks to me as if the symbols the syntax expects can be made conditional on arbitrary code, so that based on an arbitrary test a symbol might be either a class or a binary operator.

Example:

if some_test
  then !(f = !(o = 1) class)
  else !(f = +)
fi
f.o ...

Then if I understand correctly, whether f.o is accepted (syntactically correct) depends on which branch of the conditional was taken.

If this is right, then the language will fall outside the context sensitive languages.

Thanks Charles

The same type of if statement can be written in English. I thought English was context sensitive; if not, then I suppose Ameba is not.

If English is context sensitive, isn't it possible to define a word that is context free, until it is overloaded with another definition? A more complex example is using English to define a context-free grammar, which people embed in English documents.

If your example means Ameba is not context sensitive, then I don't understand what context sensitive means, which is OK; I can stop saying it is context sensitive.

Nonetheless, Ameba operators can be written to use both surrounding syntax and run time state to affect syntax and semantics. That does not mean that every operator will do so; in fact, there is no requirement any operator must do so.

A can of worms

You can create an analogous problem in English: In this picture we see the glorious monarch of Ruritania, wearing her fabulous crown.

The grammatical gender of the term "glorious monarch of Ruritania" here is put into agreement with the possessive pronoun "her", but the proper gender of the term is not available to the syntax, at "compile time", if you like. Instead, we grasp the gender outside of syntax, in how our conceptual system deals with noting that the picture shows either King Stephan or Queen Amelia, at "evaluate time".

As an aside, an expression like Some of the drummers from the women's team were unhappy, one of whom —call him drummer X— was later to take drastic action causes a problem for the claim that English is context free: the disagreement here appears to be syntactical (it is, provided the femininity of grammatical gender is a lexical property accessible to syntax), but look at the chain of inferences you have to make to reach the conclusion that "one of whom" must be feminine. Have fun constructing a pushdown automaton that accepts all and only such gender-agreeing sentences.

Maybe it's true that the language is context sensitive except for dynamic issues to do with typing. But I think it's worth putting some thought into exactly what you are claiming here. In particular, if type safety is not characterisable by a context-free grammar, what is?

TeX, FWIW, is a language that fails to be context sensitive in a very bad way. There was a discussion of this in an answer, Parsing TeX is Turing complete, to a question on the TeX StackExchange.

Strengths of type safety

By a nice coincidence, a question, Context Sensitive Grammars and Types, has appeared on the TCS StackExchange.

Neel's answer is worth thinking about. C is expressive, and I think it has the good property (I think!) that its type-safe programs are checkable with a context-free grammar plus lookup of declared identifiers. This is possible because C has a very correctness-unfriendly notion of type safety, where all casts are type safe, and type doesn't really mean anything beyond the compiler's intentions with respect to processing bits it might find at the other end of a pointer.

Ameba can't do this trick, because the syntax that is permissible after an identifier depends on the type of the identifier, and that is dynamic. But lots of respectable languages are in the same boat, as Neel documents there.

Thanks for the reference to the TCS question.

I do not have a strong attachment to either strict type checking or otherwise; both are necessary. If you want stricter type checking in Ameba, then I believe it is possible to issue a message from your objects when someone tries to do something outside the rules of strict type checking.

If it helps, and if by

If it helps, and if by "context sensitive" one is first concerned with deciding whether or not a phrase belongs to the language generated by the grammar (before further semantic considerations),

(as in your "Then if I understand correctly, whether f.o is accepted (syntactically correct), depends on which branch of the conditional was taken")

then, yes, maybe it's worth recalling the unambiguous (yes, pun intended :) most general form of the productions for a context sensitive language's grammar (well, per Wikipedia at least, which I reckon may or may not be your preferred source for this purpose):

[...]Type-1 grammars (context-sensitive grammars) generate the context-sensitive languages. These grammars have rules of the form

α A β -> α γ β

with A a nonterminal and α, β and γ strings of terminals and nonterminals. The strings α and β may be empty, but γ must be nonempty. The rule S -> (empty) is allowed if S does not appear on the right side of any rule[...]

an important aspect of which, at parsing time, being:

how to have the parsing algorithm take into account (properly) these α, β chains on both sides, specifically as they surround the left-hand side non-terminal (A) which should be reduced, "at some point", with α...β seen as the syntactic context for the reduction of A, given α γ β from the input (in my understanding).

That Wikipedia reference has

That Wikipedia reference has not helped me enough to prove Ameba is context sensitive. The form α A β -> α γ β and the Wikipedia page for context (language use) seem incompatible, which makes me unsure how to use them.

Well, not only I can believe

Well, not only can I believe this

That Wikipedia reference has not helped me enough to prove Ameba is context sensitive

but I can also relate to it, actually.

FWIW, what I figured out eventually is that this general definition of a context sensitive language grammar's productions (which I copied/pasted above from WP) is likely not meant at all to give us clues about how a context-sensitive parsing algorithm should be designed and/or concretely implemented.

In my understanding (again) it's instead, I believe, more like what should ideally be the computational outcome of such an algorithm when parsing a chain equivalent to the right-hand side of such productions, reducing towards the A non-terminal of the left-hand side, within that alpha...beta syntactical surrounding (or, "context").

No matter what has been said already in this thread, I find your work/reflections about this problem space pretty interesting, in fact, since I can relate them to my own, as well, to some extent.

So, if you're interested/if it can help/inspire, I've also made an attempt at tackling this type of thing in a more-or-less generic fashion, my way.

See this experiment of mine, for instance, dealing with the absence vs. presence/usage of certain "magic" tokens in the input, to alter the remainder of its (then ongoing) parse -- there, implemented in the perspective of PEG-based parsing, and where the magic token is the adjective "so-called", which triggers a dynamic update of the expression grammar being used.

The raw idea is the following:

"so-called" is a terminal part of the initial PEG, and some productions are insensitive to its absence in chains of the form "...Adj Adj Adj (N times)... Noun", like in "...nice big black cat", but not to its presence, like in "...so-called sweet old connoisseur".

Thus, I've implemented a scheme so that when such a token is encountered, as in the latter (note "so-called" is just an example; you can define whatever "trigger pattern" you want for the same effect), I dynamically augment the pattern giving the list of acceptable nouns (itself defined elsewhere in the PEG) to include "connoisseur" as part of it (while it's not seen as a valid token if "so-called" hasn't been encountered yet in the input, read from left to right).

Finally, note these "augmentations" from/to the PEG are of arbitrary complexity, at your discretion, and not just limited to adding tokens to existing pattern definitions: you can also add brand new pattern predicates of the kind "sequence...", or "some..." (one-or-more), or "any..." (zero-or-more), etc., or even completely replace an existing one, possibly changing its kinds of predicates in use, while you're at it.
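
A toy sketch in Python of the raw idea (hand-rolled just for illustration, not my actual PEG implementation; the token sets and names here are made up):

ADJECTIVES = {"nice", "big", "black", "sweet", "old"}
NOUNS = {"cat", "dog"}         # "connoisseur" is *not* initially an acceptable noun
TRIGGER = "so-called"

def parse_noun_phrase(tokens):
    # accepts chains of the form Adj* Noun, reading left to right; the trigger
    # token augments the noun pattern while the parse is still going on
    nouns = set(NOUNS)
    for i, tok in enumerate(tokens):
        if tok == TRIGGER:
            nouns.add(tokens[-1])          # the phrase-final word becomes an acceptable noun
        elif tok in ADJECTIVES:
            continue
        elif tok in nouns:
            return i == len(tokens) - 1    # the noun must end the phrase
        else:
            return False                   # unknown token, the parse fails
    return False

print(parse_noun_phrase("nice big black cat".split()))              # True
print(parse_noun_phrase("sweet old connoisseur".split()))           # False
print(parse_noun_phrase("so-called sweet old connoisseur".split())) # True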

Thanks for your

Thanks for your interpretation of α A β -> α γ β, which seems a logical conclusion. My strategy was to look for a source document that resulted in α A β -> α γ β from the definition of context, and to read and try to understand it.

It appears to me that α A β -> α γ β only describes syntactic (i.e., verbal) context and ignores run time (i.e., social) context. If true, proving Ameba context sensitive seems trivial, but that is only half of the required proof.

It seems a trivial proof by thinking of γ as an Ameba operator with α and β as its arguments, which means A is an Ameba expression that defines γ and contains operators that get the arguments α and β. Unfortunately, I am not experienced in proving theorems, and do not know if the preceding statement is a proof.

The operators that get prior arguments take them from the accumulator stack, which may not satisfy using the α context. The operator getNextCode uses the return address to get succeeding Codes from the call line. It is possible to make an operator that uses the return address to get previous Codes from the call line. The combination of getPriorValue and getPreviousCode should satisfy using the α context, as getNextValue and getNextCode should satisfy using the β context.

I might illustrate semantic context-sensitivity by writing an Ameba program that gets arguments based on real-time inputs and state values. Again, I do not know whether that is a proof.

Your experiment sounds interesting, but I am basically ignorant of natural language processing. It is a complex subject, and I wish you well.

Well, It seems a trivial

Well,

It seems a trivial proof by thinking of γ as an Ameba operator with α and β as its arguments, which means A is an Ameba expression that defines γ and contains operators that get the arguments α and β. [...] The combination of getPriorValue and getPreviousCode should satisfy using the α context, as getNextValue and getNextCode should satisfy using the β context.

I'm not super concerned with being able to give formal proofs of this type of thing early in the reflection/experimentation process.

But as you know, most PLs as old as C, for example, are de facto context sensitive, if only because you generally have at some point semantic actions to decorate your AST with things you cannot represent directly and strictly with "just" a CFG (be it expressed in (E)BNF or whatever) -- the canonical example being the declaration and/or definition of an identifier, say, "myCoolFunction", before its use later on in the program source (at call sites).

But this "traditional" making of parsers for context sensitive languages (in the purely syntactical sense) by means of use of context free grammar definitions augmented with semantic information is thus kind of ad hoc since we have to rely on whatever code we need to figure out for this to happen during the parsing evaluation proper (which basically consists in reading the input and reducing non-terminals until we find the CFG's axiom is reached or not).

My feeling is that, pretty much like myself, you are trying instead to enable this context-sensitiveness upfront and somewhat more generically, trying to decouple it from what should be "strictly semantic" (e.g., type checking, constant folding in the AST maybe, etc.) and thus have fewer "artificially semantic" tasks to perform (like the handling of declarations of identifier symbols before their use in other constructs in the input, as I recalled above).

Your experiment sounds interesting, but I am basically ignorant of natural language processing.

Sorry about the confusion, but NLP is not my interest, actually. My interest is instead the design and implementation of DSLs for Turing machines. It's just that I chose that NLP example as a borderline demo, on the edge of the subject, since people are aware and familiar upfront with the context-sensitiveness aspect of natural language parsing, while they tend to assume a sort of implicit preference for context free grammar definitions (or assimilated) for computer languages.

and I wish you well

Same to you!

Domain Specific Language

Yes, a generically context sensitive language facilitates constructing much of a DSL using paraphrase forms and lessens the need to use orthophrase and metaphrase forms (Standish).

Metaphrase forms are used to overcome the rigid syntax of languages such as C++, Lisp and Python, whereas Ameba does not have any immutable syntax. Lessening the need for metaphrase layering simplifies the process of making a DSL, but cannot make it simple.

Same to you!

Thanks. My ability to develop Ameba is limited by my health. I suffer chronic pain and take narcotics to mitigate that pain.

My article on Ameba contains enough information for someone to develop it, but I am concerned it is not written well enough, because of Mario B's reply. I believe that papers can be written and published based on Ameba, but such things are beyond my capacity. I am willing to help others, if anyone wants to continue with Ameba. Otherwise....

Some of such "context

Some of such "context sensitive" grammars should be classicifid in type-0 rather type-1 grammars.

I feel type-1 grammars are not useful in practice. Type-2 (CFG) and type-3 (RE) grammars are simple and have worked for many years. If I wanted more power, I would go for a type-0 grammar, which gives me all the freedom. It is rare, if ever, that one needs a type-1 grammar.

Families of type-1

The point is that if the grammar appeals to, say, a "has this been defined already?" constraint as the only way that it is outside the realm of context free, then it violates context-freeness in a very limited and practicable way.

Families of type-1

What are families of type-1?

http://en.wikipedia.org/wiki/Context-sensitive_grammar lists "indexed" and "mildly context-sensitive."

Prior art includes polymorphism as context-sensitive. However, I believe Ameba allows one to write generally context sensitive expressions.

I don't know if a similar system can be done with recursive ascent.

Families of type-1

I feel that a warning is necessary before this discussion goes any further. Ameba programs can, in general, have either a context-sensitive, context-free, or regular syntax. Everyone here seems to agree that this is true. However, it is similarly true that Ameba programs may have an unrestricted, or type-0, syntax. I didn't mention this earlier because type-0 and type-1 languages have very similar properties. However, this fact now seems relevant to the discussion.

Not proof

My attempted proofs are incomplete or incorrect.

A can of worms :)

It doesn't take a context sensitive language to make a can of worms, as illustrated by "The International Obfuscated C Code Contest." And, billions of people communicate with natural languages, and most strive for clarity rather than obfuscation. A few have the gift of composing poetry and song, and sometimes use ambiguity to enhance meaning.

Many things both natural and man made can be used for either good or bad. Laws and custom prevent a few things from being used by people. However, some of the restrictions make life worse for people, for example the Prohibition Amendment to the USA Constitution, which was repealed after about 14 years.

Often these issues lead to flame wars. Is it impossible to devise a more scientific method to evaluate the qualities of languages and language features?

On a personal note, I have often been frustrated by a language that could not express a solution in a simple manner, thus forcing me to write a more complex solution. I am not sympathetic with those who would censure language features. On the other hand, I am sympathetic with the motivation to improve the overall quality of code, but the solution is elusive if there is one.

Three takes on the censure of features

I am not sympathetic with those who would censure language features.

This sentiment I think is generally correlated with the view that there is such a thing as optimal language design, which I rather doubt. Rather there are many local optima, but there are no optima that are best for everyone. This position we might call pluralist.

We pluralists like to censure language features, but only in the context of languages that have some particular purpose. We might contrast this position with the unitarians, who don't censure, but want a broad language that is right for everyone, and the cenobites, who seek the language that is right for the few, and censure the wishes of the unworthy.

I hope that everyone can find something in my pluralist/unitarian/cenobite trichotomy to object to...

Something everyone can object to

I don't think your hopes are misplaced. :-)

I do appreciate the pluralists' view on constraining languages with a particular purpose (DSLs). But it's a little unclear what 'feature' means. Often, constraint is necessary for a feature, allowing improved reasoning about the program under stress or composition. If so, then what does it mean to censure a feature? ...

I think if you study any rich set of interesting language properties (real-time, orthogonal persistence, concurrency, distribution, live programming, generic programming, automatic memory management, pluggable modularity, scalability, etc.) you end up with a lattice or partial ordering of language designs. Some of these properties will be especially relevant to some domains or user-stories, and not so relevant to others. By sacrificing or adding properties, we can shape which domains and user-stories our languages are most suitable for. Figuring out which properties are important is one of the most difficult parts of language design - but one can look at design patterns, frameworks, boiler-plate, and a wealth of existing research to get some hints.

Additionally, even with a fixed set of interesting language properties, we'll have a lot of local maxima on the infinite plane of language design. Even moving horizontally across the lattice of language properties is rarely going to involve incremental changes to the existing language design - design considerations for interesting language properties are very often cross-cutting. Further, it is rarely clear whether improvement on any particular subset of properties is possible. On the other hand, moving down the lattice - making a language worse in every interesting property - isn't especially difficult (until you're near the bottom and need to get creative to compete with INTERCAL and its brethren). Since we can so easily move down the lattice, that does offer some hope that there might exist a path up, even if we haven't yet discovered it.

All this is certainly compatible with a pluralist's view. Where a unitarian (such as myself) and a pluralist might come into conflict is in deciding the set of interesting language properties. And perhaps we might also disagree upon the extent to which in-the-large (modular, scalable, cross-domain integration) programming should be sacrificed to provide a simpler language in-the-small ("have some particular purpose"). As a unitarian, I see a lot of pluralists building toy languages they'll never be able to effectively use because they don't integrate well, and I think: it would be better to embed a DSL inside a unitarian language. I suspect pluralists look in my direction and see a pipe dream.

I can't relate at all to the cenobites.

Well said

For a personal programming project, I would like a language that allows me to select from that infinite-plane.

Variety is a natural force that affects populations; my personal opinion does not. A variety of languages is inevitable and some can be used for the good. Some will be popular and some will not. What will be will be.

I can't relate to the cenobites either.

Political trichotomies

Well, if you follow Kent Pitman's analogy of programming languages as political parties, then it makes sense to say that every sufficiently lively programming language will have many factions, some of which will likely have tendencies pointing along each arm of my trichotomy.

So Tcl would be my paradigmatic example of a pluralism-leaning programming language, with Ousterhout's system/scripting language distinction. Python would be my example of a unitarian programming language, whose community leans towards the view that if your language has enough batteries programmed in, there is no need to look elsewhere.

I confess that in my tender youth I had a somewhat cenobite outlook, following my discovery of Miranda-like languages and their support for algebraic reasoning. How foolish, I thought, was the widespread obsession with execution efficiency, a veil for ignorance of whether code was correct or not. The enlightened sought adequate performance by taking care to understand what they really needed the machine to do. I was very naive about what kind of problems the OS that ran underneath my chosen programming language needed to solve.

it would be better to embed a DSL inside a unitarian language — An admirably unitarian perspective. The pluralist might prefer to embed scripting support, Lua-style, inside their domain-specific problem-solving engine.

But my trichotomy is advanced in a functional sense: there's no unitarian doctrine, say, but instead something like a moral outlook that will be more compatible with some sets of doctrines than others.

re: ameba

Hello, Ed.

Are you aware that there is significant inherent complexity in the type of language which you have described? This complexity largely stems from two properties of Ameba. The first property is that there are no separate lexing or parsing stages. The second property is that there is no distinction between codes whose syntax is context-free and codes whose syntax is context-sensitive. The result of these two properties is that the meaning of almost any string of characters in a nontrivial Ameba program can be fundamentally altered by any other part of the program. This makes Ameba code extremely fragile, and also makes the task of writing an efficient Ameba interpreter prohibitively difficult.

complexity and efficiency

You haven't said specifically what you mean by "significant inherent complexity" and "makes the task of writing an efficient Ameba interpreter prohibitively difficult." Thus, I will convey my thoughts about the two.

Complexity

Ameba evaluates codes with built-in definitions as a microprocessor evaluates its instruction codes. Evaluating a Code with a built-in definition calls the built-in, as a microprocessor calls microcode. The built-in is coded as a recursive descent routine that returns void, and its side effects alter memory, primarily the accumulator.

Some built-ins will do other or additional things; for example, the built-in for ( pushes the accumulator and possibly a pending operator, evaluates the subexpression until ), and then evaluates the pending operator using the pushed accumulator and the subexpression's value. This is recursive descent processing.

Evaluating a Code whose definition is an Ameba Code string basically requires expanding each Code in the string that does not have a built-in or constant-value definition. The expansion recurs until all Codes in the string have built-in or constant-value definitions. Then the expanded expression is evaluated by calling built-ins that use constant values.

However, the basic idea of expanding everything and then evaluating will not work. It is necessary to expand and evaluate left-to-right, as a calculator evaluates an expression. In other words, the first Code in a Code string is expanded, and if the first Code of the expansion needs to be expanded, then it is expanded, and so on recursively down until the first Code is defined as either a built-in or a constant value. The process continues left-to-right except for subexpressions.

The getNextCode operator considers Codes as constant values, and getNextValue evaluates Codes, which may be expanded (Code string), may be called (built-in), or may be used (constant-value).
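
Here is a rough Python sketch of that left-to-right expansion and evaluation, as I imagine it; getNextCode, getNextValue and getPriorValue follow the article, while the dictionary layout and everything else are only illustrative assumptions, not a definitive implementation:

class Interp:
    def __init__(self, dictionary, codes):
        self.dictionary = dictionary   # Code -> ('builtin', fn) | ('const', value) | ('codes', [...])
        self.codes = list(codes)       # the call line, consumed left to right
        self.acc = None                # the accumulator

    def get_next_code(self):
        # getNextCode: take the next Code from the call line, unevaluated
        return self.codes.pop(0)

    def get_next_value(self):
        # getNextValue: expand/evaluate the next Code down to a constant value
        return self.eval_code(self.get_next_code())

    def get_prior_value(self):
        # getPriorValue: the already-processed left context, i.e. the accumulator
        return self.acc

    def eval_code(self, code):
        kind, defn = self.dictionary[code]
        if kind == 'const':
            return defn
        if kind == 'builtin':
            return defn(self)                    # the built-in fetches its own operands
        # 'codes': splice the definition into the call line and keep going left to right
        self.codes = list(defn) + self.codes
        return self.eval_code(self.get_next_code())

    def run(self):
        while self.codes:
            self.acc = self.eval_code(self.get_next_code())
        return self.acc

def plus(interp):
    # a binary +: prior operand from the accumulator, next operand from the call line
    return interp.get_prior_value() + interp.get_next_value()

dictionary = {
    '1': ('const', 1), '2': ('const', 2),
    '+': ('builtin', plus),
    'three': ('codes', ['1', '+', '2']),   # a Code defined by a Code string
}
print(Interp(dictionary, ['three', '+', '1']).run())   # prints 4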

It is a simple process; however, for me to realize the simplicity took many years of mind experiments. But, your understanding may be better than mine; please tell me if I have missed something.

Efficiency

I will grant you that evaluating Ameba from source will be less efficient than a typical macro language. But, some (perhaps most) inefficient processes can be mitigated. For example, it is possible for a programmer to mark code as either "context-sensitive" or "context-free." Meta processes, such as optimizers and translators, can use such marking when processing Ameba Code. For example, context-free code can be compiled into a plug-in, and become very efficient.

The symbol table contains a phrase tree of the code, which is time consuming to create, and source can be reconstructed from it by expanding Codes. If this code is optimized, for example by partial evaluation, even more time is spent. Thus, Ameba should process source code only once, and work from Symbols thereafter, at least for context-free code. This process can save CPU time.

Programmers will be aware that context-sensitive code requires more processor resources (CPU cycles and memory) than context-free code, which will discourage them from using context-sensitive code. But, context-sensitive code can be used when required.

Computers today are extremely powerful compared to the ones I have used since circa 1970; thus, context-sensitive code can be processed relatively fast. As computer power increases, inefficient context-sensitive code will become less problematic. Processing a natural language in real time is likely to require much more CPU power and advancements in natural language processing.

I don't have all the answers, and expect discovering inefficiencies and mitigating them will be an on-going project, perhaps worthy of a paper.

Conclusion

I believe your dire predictions are too harsh. Recursive descent parsers do not have separate lexing and parsing stages; recursive descent is a method of implementing languages that is conceptually simple. That Ameba allows one to write context-sensitive code does not force anyone to do so; thus Ameba code will not necessarily be fragile. And, IMO, writing a reasonably efficient Ameba interpreter will not be prohibitively difficult.

complexity and efficiency

The complexity I was referring to does not lie in the concepts which form the Ameba programming language, but rather in how those concepts combine to create programs. It stems from the fact that the semantics of an arbitrary section of Ameba code is completely dependent upon the semantics of all previously executed code. Thus, while the language itself may be simple, the code written in the language can be complex.

Regarding efficiency, please note that, in general, the problem of optimizing away the inefficiencies introduced by Ameba's execution model is undecidable. However, upon further reflection, it appears that there are several effective workarounds. Thus, in this case I am willing to concede that my previous statement on the subject - "the task of writing an efficient Ameba interpreter [is] prohibitively difficult" - was unfounded. That being said, I am still of the opinion that the very existence of such optimization avenues is telling. This is because there exist languages of similar expressive power as Ameba which start out at a level of efficiency that Ameba must be optimized to obtain.

In summary, I believe that Ameba demonstrates a very intriguing approach to programming. However, for most situations I do not believe that the gains outweigh the losses.

complexity and efficiency

While "the semantics of an arbitrary section of Ameba code is completely dependent upon the semantics of all previously executed code" is a possibility, it is not an inevitability, and it is under programmer control. The built-ins described in my article are either context free or context sensitive only by polymorphism.

Your statement about efficiency seems to accurately describe efficiency of the Ameba interpreter. However, it does not necessarily describe the efficiency of an application written in Ameba, because many, perhaps most, applications can be compiled into machine language plug-ins that can have very fast run times.

Typically, a language designer argues that programs written in his/her language and compiled to machine language are about as efficient as any other compiled language. I believe experience with compiling various languages supports this contention. C compiles to superbly efficient machine code. However, many other languages that do not compile to superbly efficient machine code are used instead of C, because other language factors are more important than absolute efficiency.

At one time a similar argument was made about compiled code vs hand coded assembler, but no more. Today compiled code is likely to be better than hand coded assembler. In any case, the benefits of writing code in a high level language outweigh the gains of writing assembler code.

Your summary point, which is an overall cost-benefit assessment, is premature, IMO. I have given much thought to cost-benefit, but cannot make a cost-benefit analysis. Instead, I will share some things that I believe are benefits of Ameba. Without an Ameba language to demonstrate viability, conjecture is necessary.

Consider an Ameba application as a version control system, somewhat like a CVS server. However, Ameba would be provided with language front-ends to parse the languages and store phrase trees of programs in its dictionary. Versions can be maintained by mangling symbols with version data.

For simplicity, assume one server is sufficient to contain all systems for an enterprise, and that the server has enough processing power to satisfy service requests.

All the programs for all the systems are stored as Code strings that form phrase trees in the Ameba symbol table. Meta programs can process these systems, not only individually but as a whole. A program slice might trace the sources and uses of data to programs in several systems. A dataflow diagram might be generated for the entire enterprise. A corporate staff of meta programmers could work on the system of systems, instead of a specific system, and they could analyze, report, refactor, optimize, test, etc. I do not know all things possible, and can only guess what benefits might be derived.

I could go on, but hope these potential benefits are enough food for thought that you may revise your latest summary about Ameba.

complexity and efficiency

It appears that there is some confusion regarding my statement that "[the] semantics of an arbitrary section of Ameba code is completely dependent upon the semantics of all previously executed code." You seem to think that this statement is not universally true. Is this correct?

complexity and efficiency

Correct. For example, a binary operator such as + gets the value of the next succeeding Code (getNextValue, e.g., 1) and gets the value of the accumulator (getPriorValue, e.g., 2), adds them, and stores the result in the accumulator. The process is no different than that done by a standard calculator.

The + operator is not context sensitive, except for type polymorphism. To make a context sensitive operator requires using a state value, such as the value of another symbol or an input value from a file that affects getting arguments or affects the algorithm of the operator.

Unless someone writes a context sensitive operator, Ameba doesn't do anything context sensitive. The way getNextCode, getNextValue, getPriorCode and getPriorValue are used determines whether an operator is context sensitive or not. The example below illustrates a sequence that would make an operator context sensitive.

if X=0 then getNextCode else getNextValue fi
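
A rough Python illustration (my own, hypothetical; not Ameba code) of what that sequence does: the same operator consumes its trailing argument either as an unevaluated Code or as an evaluated value, depending on the runtime state X.

def context_sensitive_op(tokens, env, X):
    tok = tokens.pop(0)
    if X == 0:
        return tok        # getNextCode: keep the Code itself, unevaluated
    return env[tok]       # getNextValue: evaluate the Code to its value

env = {'answer': 42}
print(context_sensitive_op(['answer'], env, X=0))   # 'answer' (the Code)
print(context_sensitive_op(['answer'], env, X=1))   # 42 (its value)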

The mechanism to make a context sensitive language is simple, but realizing that fact was not easy for me. A better writer might create an explanation that people can quickly understand. Whereas, I need feedback to know whether my writing is understood or not.

Thanks for your help, and I hope this time we have a mutual understanding of the process. I fear that everyone reading about Ameba expects a complex solution to make a context sensitive language, and that they are having difficulty realizing it is a simple process.

complexity and efficiency

Your statement that "unless someone writes a context sensitive operator, Ameba doesn't do anything context sensitive" is incorrect. Consider the case in which a context-free operator is redefined, but remains context-free. At any given point in time, the syntax of the operator would be context-free. However, the question of which context-free syntax the operator has depends on runtime state. Thus, the syntax of the program as a whole would be context-sensitive, even if it is composed solely of context-free operators.

Redefinition

The slice of the program affected by the redefinition is context sensitive, not necessarily the whole program.

Redefinition

A context-free language concatenated with a non-context-free language cannot be context-free. If a "slice" of the program is not context-free, the program cannot be context-free.

Redefinition

Makes sense.

Hmm...

It appears to me that the misunderstanding is the distinction between "can depend" and "does depend." It seems that:

  1. an arbitrary section of code universally can depend on all previously executed code, but
  2. it is not universally true that an arbitrary section of code does depend on all previously executed code.

(1) is sufficient to make efficient implementation extremely difficult and to thwart any meaningful attempts at analysis or auditing, but probably (2) is sufficient to prevent carefully written programs from becoming unmanageably confusing.

But I haven't read the original material carefully, so maybe the misunderstanding is on my end.

Hmm

Carefully written Ameba code that is either context free or merely polymorphic will be as manageable as similar code in languages such as C++ and Java. The interpreter is recursive descent; no special context sensitive code is required to implement it.

I chose a calculator interpreter for several reasons, including the following:

1) A calculator syntax is easier to implement with recursive descent than a syntax such as C's.

2) Many more people use a calculator than use any programming language.

3) For me, the LR(0) parsing order is easier to analyze for a context-sensitive language than others, and extending to LR(1) and LR(k) was also easy.

Other context sensitive syntaxes are possible, for example a C-like language that has getNextValue and getNextPointer instead of prototypes. However, a C-like syntax is problematic. For example, C does not have prior arguments, not even for polymorphism. A context sensitive language requires the ability to process both preceding and succeeding syntax. Thus, the C-like language would have to be changed significantly, and IMO such a process is more complex and difficult than using a calculator syntax. But, it is possible.

Context sensitivity is accomplished by replacing the function prototype with functions to get arguments; for our C-like language they would be getNextValue, getPriorValue, getNextPointer and getPriorPointer. Otherwise, C-like would be the same as C, with the same manageability issues.

I do not concur with your conclusion (2), at least for Ameba.

If your conclusion (1) applies to natural languages, such as English, then I agree; otherwise, I do not. I find writing English difficult, but authors of literature and song apparently do not (e.g., Stephen King). Natural languages are context sensitive, and many (perhaps most) sentences are. To write with a context sensitive language, whether a programming language or natural language, will be similarly easy or difficult depending on the writer (or reader).

In summary, a context sensitive language is for poetry, and a context free language is for precision. I expect code will be mostly context free syntax except for specific context sensitive enhancements, such as polymorphism. I expect that to be the case even if a generally context sensitive language were to be used for writing code.

That expectation is based on man's use of both natural languages and math for thousands of years.

Hmm...

When I say "x depends on y," I mean that, to determine x, one must first determine y. To determine the semantics of a piece of Ameba code, one must first determine the semantics of all previously executed code. Thus, an arbitrary section of Ameba code depends upon all previously executed code.

Hmm

When x depends on y, it is necessary to evaluate the thread that determines the state of y, but not necessarily all of the program.

Hmm

The semantics of a piece of Ameba code - in particular, its syntax - depends upon all previously executed Ameba code. This may or may not include all of the program.

Hmm

We are approaching a mathematically correct description, and all context sensitive languages share it, such as C++ for polymorphism.

What claims should be made about Ameba?

It seems to me likely that Ameba has desirable syntactic properties, but that these are not exactly the ones now described on the wikia.com page.

I think part of this is a staging problem: it is very unfortunate for the claims being made that definitions are unscoped commands that can be freely mixed into code. Yes, this is the first of the worms in my above-mentioned can, and English has just the same problem, but linguists have moved beyond Chomsky's (1957) Syntactic Structures, and you don't want to get into the kind of theories they use now.

It would be easier to analyse an Ameba-like language where all declarations were either top-level unconditional definitions, or were some sort of scoped declaration, perhaps with a with <declarations> do <code block> end construct. Doing so would not be incompatible with preferring the current Ameba, even if it comes to be known as the syntactically unsafe Ameba.

Claims: Yes, Things that are Easier and Simpler

Built-ins should be minimal, enough to do integer arithmetic, string manipulation, and load pre-compiled plug-ins.

Plug-ins should be like built-ins except loaded at run time. A plug-in library can provide orthophrase extensions to build operators that satisfy required claims. And, several libraries might be developed to provide different Ameba dialects, other languages, graphics, an operating system interface, etc.

Ameba objects augment plug-ins to make a dialect, language, or other application.

The loop syntax you suggest is a good one: simple, easy and effective.

My meta-programming knowledge is limited, including techniques for designing and implementing an object oriented language. There are many people better qualified than I to complete the definition and implementation of an Ameba dialect.

Ameba's future

I'm proposing with <declarations> do <code block> end as a syntax for lexically scoped local declarations, not as a looping construct.

There are many people better qualified than I to complete the definition and implementation of an Ameba dialect.

Nonetheless, neither definition nor implementation will ever be finished if you don't do them. The world has a terrible shortage of language design and implementation experts who want to work on other people's unimplemented programming languages.

Ameba's future

Oh, sorry about the misunderstanding. Lexical scoping seems at odds with context sensitivity. People using a natural language get by with semantic scoping, as in, "the following EBNF is context-free: one = 1."

Natural languages, specifically English, have inspired me to describe Ameba. They are generally context sensitive and used by nearly everyone.

If I had discovered Ameba ten years earlier in my life, I might have tried to develop it into a successful programming language. But, I do not have the time and health necessary for that kind of project.

I can develop a minimal Ameba implementation. It would be a context sensitive language development system. But, I feel as if such an effort would be wasted, because there is no one to carry on. Why should I spend the last few years of my life developing code that will be thrown away?

On the other hand, if I can prove Ameba is as context sensitive as a natural language, there is a possibility of publishing a paper. That would assure the technology is not totally lost. However, I am unsure about my capability to produce the proof and write an acceptable paper.

I intend to revise my article about Ameba with the benefit of discussion in this forum, and spend $35 to register a copyright in the US Library of Congress. That is insurance that our thoughts will not be totally lost.

Otherwise, I am undecided about my future; I will need something to get me away from TV.

Ameba's future

"Natural languages, specifically English, have inspired me to describe Ameba. They are generally context sensitive and used by nearly everyone."

Most natural languages have a context-free, or mostly context-free, syntax.

Can of worms, revisited

Chomskian structuralists do generally believe that English is context sensitive, but they can only do so by throwing out all kinds of apparently grammatical phenomena and saying they are not syntax. This is widely criticised.

I really don't see that appealing to claims in linguistics that are controversial is helpful in trying to show that a programming language is well-designed.

-

Did you mean to say "context free"?

No

X-bar grammars have attributes.

No

I see. You mentioned grammatical elements which cannot be represented with a context-sensitive grammar. Which ones would those be?

Chomsky hierarchy considered of marginal interest

I don't know of anyone who has seriously proposed a formal grammar beyond context sensitive as the grammar of natural language, and Chomskian structuralists generally want to constrain context sensitivity. But the first example I gave earlier is one where the acceptability of the sentence comes down to information outside the sentence. My understanding is that structuralists say that acceptability here is not grammatical acceptability, but a semantic acceptability that takes place outside the grammatical system.

My general take is that there are so many things going on when we talk that it isn't very useful to talk about the English parsing algorithm that takes a sequence of tokens into a labelled tree that shows their roles in a sentence. The picture of cleanly separated language-digesting processes here has to be pretty far removed from reality. Outside of computational linguistics, which has its own set of special problems, I don't hear so many linguists talk about the Chomsky hierarchy.

revisited

Both context sensitive and context free syntax occur in natural language. The controversy you mention about throwing out grammatical phenomena is a bit of a surprise because I was unaware of it. On the other hand, I am aware that natural language processing is not perfected; thus, controversy is likely for more than one phenomenon.

I'm not sure what "well designed" means and don't recall posting such a remark.

I do argue that long expressions are more difficult to compose than shorter ones, whether context free or sensitive. Furthermore, an interpreter, whether human or electronic, will either understand or not understand an expression. Whether the expression is understood or not, the interpreter has finished with it and can set up to process another expression. The next expression may or may not be context sensitive.

E. E. Cummings wrote poetry without punctuation and capitalization, which demonstrates that expressions can be and often are semantically scoped. In other words, the use of period-space-capitalization to mark the scope of context sensitive sentences is not always necessary.

If the interpreter doesn't

If the interpreter doesn't understand an expression, how can it know that it has finished?

if the interpreter doesn't

It issues an error message and resets to parse a top level expression. There may be remnants of the previous expression, which cause additional errors, and it is possible the interpreter cannot find the beginning of another expression. Natural language statements often have redundant information to indicate the extent of a statement.

Ultimately, an error needs to be fixed by someone, to allow the interpreter to successfully process code.


Ameba fission

My suggestion is that you have two versions of Ameba, a restricted version, call it Lexical Ameba, as I described, and the existing language, call it Dynamic Ameba. You can make sharply defined claims about Lexical Ameba, and more hand-wavy claims about Dynamic Ameba.

You need an implementation in order for your claims to be taken seriously, but it need only be a toy implementation.

fission, thanks

I suppose it is time to make a minimal implementation; from it the Lexical Ameba can be built.

Type-0

My motivation was not to make a type-x language. Rather, my motivation was to make a language easy to extend, and Ameba occurred to me from simple assumptions and logic. That makes it sound easier than it really was.

Computability theory is not one of my strengths, and I cannot prove Ameba's type. Its syntax is like that of a type-1 language, because arguments can precede or succeed operators. And, I believe Ameba is Turing complete.

Several people have argued that a context sensitive language is not practical. My belief is otherwise, and my reasoning is simple and falls into two arguments.

First, natural language can be context sensitive, and people manage quite well.

Second, writing a correct program is difficult, and writing a correct context-sensitive program is at least as difficult. When a program fails with an error, the language processor will reset and eventually recover.

Type-0

That is interesting and exciting news, thanks.