The Next Mainstream Programming Languages

I didn't see anyone post them yet, so here are the slides from Tim Sweeney's POPL talk entitled "The Next Mainstream Programming Languages: A Game Developer's Perspective". I know Tim and I aren't the only game developers who follow LtU, and I figure even non-game developers might find them quite interesting!

http://www.cs.princeton.edu/~dpw/popl/06/Tim-POPL.ppt

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

PDF Version of the slides?

Are these slides available as PDF-Version as well?

PDF version

I found a pdf version on the homepage of the "Advanced Functional Programming" seminar course at Saarland University

(I haven't been able to download the .ppt (server seems down), so I'm not sure it's exactly the same.)

PDF of Tim Sweeney's POPL talk

The PDF you found was created from the PowerPoint slides using Keynote -- so they are the same, except for animations.

Good points...

As an ex-game coder I agree with his empasis on concurrency, this is only going to be more and more important and not just in games. This is the major angle people interested in functional programming can leverage with people interested in performance.

Then again the only success I had in getting functional code into a game engine was a scene-graph scripting language which would have been hell to debug if it had side-effects.

Not sure about being able to eliminate all numerical/array index etc. exceptions though. A lot or most maybe but not all IMO.

Also not sure about lenient eval' either.

Very interesting and important stuff.

The conclusion is that we need better 'type' systems that can catch more requirements and programmer assumptions. This ties in beautifully with the other thread about static vs dynamic typing. Programming languages should generally go the static way, doing as much as they can in compile time, but also leave the door open for dynamic stuff, should it be necessary.

I agree with most of the points raised, but I have questions on the following:

1) I do not see how variable length integers are that important for games. Wouldn't 64-bit integers be enough? what game value would go over 64 bits?

2) bounds checking is not needed if the language makes indices be strong types. ADA has this feature very nicely done.

3) I did not understand how horizontal subclassing will offer more than vertical subclassing, as the new features introduced by horizontal subclassing will not be used by the rest of the game engine anyway without modifying the engine.

4) I do not agree that garbage collection is the end to memory management. I think there exists a 'calculus' for resource management (or effects management). After all, if we take a program and lay down its instructions one after the other, it is pretty obvious when an effect takes place.

I liked the part that shows how ineffective "modern" programming languages are. In just 3 lines of code, there are 7 possible sources of error or uneccessary overhead. If only programmers could realize this...

A few things

Storage size is important for performance - smaller storage requirements improves usage of cpu caches and can help prevent paging and keep minimum sys requirements down.

For some things just having a form of non-null reference would be handy. In fact wouldn't using C++ references prevent the problem that the parameters could be null here?

C++ refs: You'd think so, wouldn't you?

Consider the following code:

T* f() {...}
void g(T& y) {...}

void h()
{
  T* x=f();
  T& y=*x;
  g(y);
}

I've seen this happen, when f() and g() were written in different epochs of a team's development practices. Say, the original developers didn't like references, so they wrote f() to return a pointer; later hires managed to get the culture changed, so new code, such as g(), was written to take references. Then h() was written that had to talk to both.

C++

Null references are undefined behavior but they can still be created. An example

agreed !

It seems to be a way to get around non-extensible classes in a static class world.
--
Trojan Encyclopedia

Horizontal Subclassing

3) I did not understand how horizontal subclassing will offer more than vertical subclassing, as the new features introduced by horizontal subclassing will not be used by the rest of the game engine anyway without modifying the engine.

I think the idea of "horizontal subclassing" is to enable modification, extension, and redefinition of an already existing class. For example, altering an Actor to give it another field that *all* classes extending Actor will inherit and without editing the Actor source code.

It seems to be a way to get around non-extensible classes in a static class world.

Tim, am I close?

Open classes?

I think the idea of "horizontal subclassing" is to enable modification, extension, and redefinition of an already existing class.

Is not this usually called open classes?

How is it different from just adding new fields to class Actor?

How is that different from just adding new fields to class Actor? If all classes extending Actor will need those new fields, then it makes sense to add the field to class Actor. Already compiled code will have no use for those fields anyway. If, on the other hand, only certain classes need those fields, a simple refactoring (i.e. a new base class inheriting from Actor with those fields) solves the problem.

In either case, horizontal subclassing seems reduntant.

Modifying class Actor

Modifying class Actor directly would solve the immediate problem, but remember that for a game developer using "UnrealEngine3," class Actor belongs to the engine code you liscensed from Epic. If two months later Epic comes out with "UnrealEngine3.1" with cool new features you want to use, then you have try to merge your local changes into the new code.

I think Sweeny is arguing for "horizontal subclassing" so that liscensees of his engine can customize it to their liking while still being able to take advantage of future releases of the product.

Hi!

Hi!

would a good version control system not solve Sweenys immediate problems?
(of changing the basecode of the Unreal Engine and then merge the Unreal Engine changes back into the old codebase.)

Is there a version control system like Monticello (Smalltalk) for c++?

Merging: not a great solution

(wince) I don't know about Monticello, but in my random experience merging code is time-consuming and error-prone, even with good tools. (I like Araxis Merge, combined with any decent version control system.) Merging code is a common solution to a range of problems, but that doesn't mean it's pretty.

Choice of programming language definitely affects one's ability to cope with change. If programming language designers found better ways to support this (alternatives to merging), they'd be welcome.

My question is still not answered.

I still do not understand how horizontal subclassing will solve the problem. What is the difference between horizontal subclassing and vertical subclassing? Sweeny's problem could be solved by introducing a class between Actor and Actor-derived classes. Changing all references of 'new Actor' to 'new MyActor' is a matter of using the right tool...

Not That Easy

It's really not that easy to modify a large class hierarchy in C++ (or, presumably, UnrealScript) in such a way that you don't risk introducing bugs when the base class changes, which, again, Tim has said happens with their licensees every couple of weeks! To gain a better understanding as to why that might be, read Oleg Kiselyov's excellent OOP Critique or Peter Norvig's Java IAQ. Virtual classes are a great solution to the problem, but are somewhat difficult to keep type-safe. The paper "On the (Un)reality of Virtual Types," mentioned elsewhere in this thread, contains an excellent explanation of the challenges inherent in a type-safe virtual class system while at the same time offering multiple type-safe solutions that are extremely easy to express in O'Caml. The executive summary is that any system with parametric polymorphism, structural typing, correct binary method typing, and open self-typing allows for the easy implementation of virtual classes. It's worth noting that Scala also has no difficulty expressing virtual classes.

But why

I have to concur, I don't see the benefit of extending classes horizontally.

Considering that the base level engine will never make use of the extensions introduced in Actor, then the only users of those extensions are in the specific application that defines them. The underlying Unreal framework would not even know those extensions were present.

So, that being the case, do you not achieve the same results by creating a MyActor with the enhancements needed that subclasses Actor, then require that this be the Actor class that your application is allowed to use? Then if the Unreal engine changes, your modifications are isolated.

What does horizontal extension provide that normal vertical inheritance doesn't?

What if Actor isn't a leaf class?

Imagine there are other classes like Player that extend Actor in the engine framework. How are you going to create MyPlayer such that it extends both MyActor and Player? (If you said Multiple Inheritance, then now you have two problems)

Use delegation?

You're right, an open class model would simplify this situation, one I hadn't considered. But, assuming a small number of possible child classes off the non-leaf class you wish to modify, can you not also do it via a delegation model?

If you define a simple object, MyNewFeatures that handles your new features and an interface IMyNewFeatures for those methods, then define MyActor extending Actorand MyPlayer extending Player and havet both implement IMyNewFeatures. Then if you delegate their interface method calls to an instance of MyNewFeatures, do you not have the same result?

Again, neither MyActor nor MyPlayer are used by the base engine, only in your higher level code, so you can actually treat them in your code as IMyNewFeature objects. It won't matter to your code which one it receives, so long as it fullfills the required contract. You wouldn't be able to treat MyPlayer as an instance of MyActor, which depending on design, could be an issue, but you would be able to treat both as IMyNewFeatures.

Obviously if there are a large number of child types that you want to inherit the new features, the work effort here can increase, so I'm not arguing that open class models are unnecessary. In a newly designed langauge, that type of feature would be something to consider from the beginning. But, with existing lanaguages, there are alternatives that can get you a good portion of the way to what you need, albeit with more work.

Workarounds

I agree there are workarounds, but piling everything into one big IMyNewFeature interface looks unattractive to me.

In a real engine, the root of the inheritance hierarchy probably won't be Actor, but rather Thing or Object, and you can easily have a somewhat large number of classes (even non-leaf ones) in the hierarchy. Also, there will likely be a desire for more than two levels of framework - for example separating the First-person Shooter logic into its own layer inheriting from the generic game layer.

Also, your suggestion lets you cast down to the IMyNewFeature interface as needed, but you haven't explained how you're going to cast back to either Player or Actor when you need to interact with the engine layer. It sounds like that will either involve an unnecessary runtime check or a supression of the type system.

It looks to me that without a language improvement you end up either with a system that's pretty verbose or that lacks some nice property. With the right language improvement you can have straightfoward code with nice properties. And note: Named frameworks with inheritance are more flexible than just open classes in that you can have e.g. ShooterGame and PuzzleGame frameworks that extend GenericGame's classes in different ways.

Base classes can be extended with interfaces?

If we take what Sweeney wrote literally on slide 26, then I think it can already be done using interfaces? This avoids the multi-inheritance duplicates problem. For example in HaXe:

http://haxe.org/ref/packages
http://haxe.org/ref/oop

package Engine;
interface IActor {}
class Actor implements IActor {}
interface IPlayer implements IActor {}
class Player extends Actor implements IPlayer {}

package GearsOfWar;
interface IActor implements Engine.IActor {}
class Actor extends Engine.Actor implements GearsOfWar.IActor {}
class Player extends GearsOfWar.Actor implements Engine.IPlayer {}

Then throughout the licensed framework use only the abstract interface types IActor and IPlayer for references, arguments and return values.

But did Sweeney mean something else?

Implementation not inherited

class Player extends GearsOfWar.Actor implements Engine.IPlayer {}

That is obviously an error since it doesn't inherit the implementation of Engine.Player. I remember noticing that, and in my haste made a conflation of concepts in my mind.

If the Engine is supplied as a pre-compiled library, I can think of no way to insert the inheritance of GearsOfWar.Actor into Engine.Player because the vtable pointer and data schema of Engine.Actor in Engine.Player is hard-coded in each instance by the compiler. We need a need a new compiler that does not assume a fixed size for that data schema in the instance, then we can insert GearsOfWar.Actor into instances of Engine.Player that are inherited by GearsOfWar.Player. Sweeney did not suggest a syntax for this:

package GearsOfWar; class Actor extends Engine.Actor {} class Player extends Engine.Player extends GearsOfWar.Actor replaces Engine.Actor {}

If the Engine is supplied in source code, then one could take a different approach that can work with existing compilers, where the Engine classes would be empty by default and could be edited by licensee:

package Engine; class Actor extends EngineCore.Actor { /*empty default*/ } class Player extends EngineCore.Player { /*empty default*/ } package EngineCore; class Actor { /*default implementation*/ } class Player extends Engine.Actor { /*default implementation*/ }

OOP did not fail as he claimed

Paul Snively wrote:

...read Oleg Kiselyov's excellent OOP Critique...

foo() uses the clone of CSet (or what ever type is input to foo) and gets a different result from version of foo() that always uses a CBag. So what is the complaint? The interface is working exactly as it should, because CSet's virtual put() doesn't allow duplicates but CBag's does. That is the way virtual methods work, when you hold a reference to a CBag, it doesn't mean you will get the behavior of a CBag.

Because it violates the

Because it violates the Liskov Substitution Principle, and thus when clients depend on CBag semantics, they may very well break when provided a CSet. This is explained on the page as well, see "Subtyping vs. Subclassing" and "Subtyping and Immutability".

OOP never promised LSP...

OOP never promised LSP (no language can because LSP is generally "undecidable"), programmers shouldn't be relying on CBag semantics when they hold a reference to a CSet.

Isn't that SOBO (statement of the blatantly obvious). I don't mean to be rude, but I was disappointed I wasted my time trying to figure out what that webpage was criticizing. Disappointed the author didn't state it succinctly in first paragraph as I did, so save me time of reading the entire page and analyzing the code. Trying to save others the time.

In C++, T& does not mean you hold a reference to a type T. It means you hold a "weak" reference to an instance of a class that inherits from type T, possibly class of type T itself but not always. Maybe someone can argue that the syntax of C++ should be more explicit about that. Maybe we should have another syntax for expressing a "strong" reference to a type T only, i.e. T&&.

In short the author is aiming to solve something that is "undecidable", thus he solving a non-problem. The only problem there was human error, as we were tricked by our eyes "CBag&", because we forgot what the semantics of T& are.

OOP promised a lot

and didn't deliver much...

OOP didn't promise, humans idolize semantics

If we respect the natural law limitations of semantics, then we won't preach surety nor insurance.

It is really about the power of discernment. I see so many models thrown about on this site, and frankly I can't wrap my head around all of them in finite time. I find that when I follow the principles of the natural law, all the models end up falling into my lap naturally. That is how I was able to fake that I might know something about transactions in the link above. I was relying on the principles of the natural law holding out to be true by the end of the debate.

Granularity of inheritance breaks LSP

Type safety via the orthogonality of interfaces is only guaranteed if the granularity of the interfaces are sufficient:

http://lists.motion-twin.com/pipermail/neko/2010-January/002717.html

Thus C++ did not break or conflate-- it is always up to the programmer to do that. Granted that interfaces and mixins are superior to "all or nothing" sub-classing with virtual method dynamic typing, with respect to aiding in "un-conflating" (i.e. increasing the orthogonal granularity of) inheritance. I think we were talking past each other before and I hope the above helps aid our mutual understanding and agreement. I am confident that we are in agreement.

OOP never promised LSP,

OOP never promised LSP, programmers shouldn't be relying on CBag semantics when they hold a reference to a CSet. Isn't that SOBO (statement of the blatantly obvious).

If you can't reason about code written in a given language, then that language is frankly useless. LSP is a principle that makes reasoning about compositions of abstractions tractable. If C++ or OOP in general does not respect LSP, then reasoning about code is very hard, as it is not possible to understand what a piece of code does locally without global knowledge of the entire program. I think that's a very important point to make, and the conflation of subtyping and subclassing is exactly the problem.

Reason must know itself or it is not reason

We can reason, as long as we respect what reason is and is not. Reason with respect to semantics can only render to us what is semantics domain (which is always local and subject to random interference from the infinite external state diagram, see below).

The semantics of the language never conflated subclassing and subtyping, the human error did.

We can change the semantics of the language, but as I understand the problem is that LSP insurance is unachievable from the semantics of a Turing complete language. Correct?

We can try to use hacks (heuristics as the author suggested) to weed out specific instances of error, but this just causes the aliasing error to get pushed into other spaces. It is the exact same problem of "aliasing errors that I am mentioning are random and are a result of the inability to sample for infinite time all the code paths of the algorithm". Fundamentally semantics can not make any guarantees because of the Halting problem, which is another way of stating the infinite sampling problem. Look also at Gödel's Theorem, as it seems logically analogous to the work I did recently to reveal that sampling theory tells us we never know the true signal, thus science is faith (evidence of a shared reality only), that seems to piss a lot of "well reasoned" people off (many invent all sorts of heuristics to try to push away the implications relative to faith in general, but four of the world's smartest men Carl Sagan understood "Albert Einstein—considered God to be essentially the sum total of the [natural law]...An atheist has to know a lot more than I know."). I deduce also Isaac Asimov and Marvin Minksy, "If you wire it randomly, it will still have preconceptions of how to play. But you just won't know what those preconceptions are." (reality is true only to extent it is shared). An example of false security of erroneous overextended reason (idols), Edge Conditions of local state diagram provides the illusion that the randomness of natural law is contained or bounded, but that ignores the infinite external state diagram. Plato demonstrated that scientific models are only approximations of real world.

It is common human trait to want something impossible, and then blame something else when we don't get it.

Coming to accept the natural law is maturity, contrasted against what comes to my mind are the kids who lay down in aisles of Toys R Us and scream until they get what they want. Unfortunately most people never seem to accept it, even in their 60s or 80s. It is as if reason is some magic wand that makes someone mature and wipes away the randomness of nature. I wish otherwise intelligient people wouldn't fall for the-little-red-guy-with-horns deception ("I will give you reason, so you can feel superior and you won't be subject to the natural law of ancient relic uncivilized beasts").

This is why all of humanity has moved to fiat financial system for the first time in history of world (the Romans shaved coins so it was similar, but orders-of-magnitude increase in statism fraud with electronic monetary dilution tax), and when financial system implodes within next 5 - 10 years (or less!) the entire world will blame some thing other than their insatiable desire to idolize surety (in ignorance/defiance of natural law). The fractal chaos will strike globally again...

Drifting

This discussion seems to be drifting off-topic, particularly given that this is an old thread (2006). We leave old threads open on LtU to allow for continued discussion relevant to the thread in question. But if every old thread is resurrected for ongoing philosophical discussion, the useful content on the site will become increasingly difficult to extract.

Ehud's Blog not forum! post addresses some of these issues.

Extremely relevant

The subject of this forum topic is Tim Sweeney's presentation, which at it's core philosophy pondered about some perfection in semantics that would wipe away the programming aliasing errors in various spaces (typing, initialization, concurrency, composability, ...).

I am showing you the theory that goes back to Turing, as to why the random error of the natural law can not be wiped away with any new semantics. Instead I am pointing out that we have to design tools and methods which aid in increasing the human comprehension of the external state diagram (natural law).

You may not like the implication of the Halting problem or the Gödel's Theorem, but that hardly constitutes drift. Oh I do realize it is the interleaving of Biblical concepts that constitutes drift in your mind, but the reality is that what people believe is what drives their comprehension and lack thereof to repeat the same incorrect focus of effort over and over again. That is why I linked to Minsky's new Emotional Machine (because our faith drives our reason). Yet the point is well received, that will never be resolved here. But hardly could we say that Minsky is off topic from CompSci, nor can we say that my points were off topic of Sweeney's proposals.

And, I am running out of time to post :D I came here to resolve Tim Sweeney's proposals in my mind. Done.

IMHO, Ehud's post is unrealistic (violates 1856 law of thermo, that highest quality/order will always be dispersed into lower quality/order). We have Wikipedia for that conical snapshot of cite-able political correctness. If he wants to make the site more readable, he could do some programming, so we can collapse leaves of the discussion. And so it can show just the first few sentences of each post by default.

One of the main points I made since I joined 2 days ago, was that algorithms trump implementation solutions. Implementation can never remove the errors made in algorithms. IMHO, Ehud should look for an algorithmic solution to his desire, not a force of law or statism or political action. I think it was Turing who said we can always change the machine/algorithm if we do not like it. Programmers should never need political coercion, unless they are being lazy! If it can't be done with a program, then it can't be done! Eliminating all errors (via semantics or any other method) is one of those things that can NEVER be done. Actually it is a good thing...but that is another philosophical discussion...

The semantics of the

The semantics of the language never conflated subclassing and subtyping, the human error did.

No, the point is that sometimes you want subclassing only, and sometimes you want subtyping only, but with languages which conflate the two, you're screwed because you get both. Whether LSP is enforceable is not as important as whether the programmer can even express his intent to subtype or inherit, which C++ does not permit. That's a real problem with real consequences, which Oleg identified quite well in that article. Even if only the intent to follow LSP is captured syntactically, that is a marked improvement over this conflation.

I also second Anton's point that this thread is meandering, so let's try to keep the discussion focused. I don't see how general comments on science, faith, finances or psychology are even remotely relevant to LSP, subtyping or OOP, except as some sort of personal metaphysics, in which case this is not the appropriate venue for publishing such thoughts, even if you do connect them to computation in some way.

C++ did not conflate subtyping and subclassing!

First, the correct orthogonal OOP definitions:

1) Classtype is a collection of functional interfaces (i.e. functions or methods). Wikipedia's definition is hopelessly conflated.

2) Datatype is a collection of data (i.e. state). I am being more orthogonal than Wikipedia's definition, as I am not including (conflating) operators, which I assert are part of the classtype interfaces.

3) Subtyping means that we can apply the classtype for a datatype to an inherited datatype, i.e. subtyping keeps the functional interfaces constant for datatypes that add data members. Subtyping is an abbreviation for subdatatyping.

4) Subclassing means that we can apply an inherited classtype to the datatype for the original classtype, i.e. subclassing keeps the datatypes constant for classtypes that have different (not just added) functional interfaces.

==============
UPDATE 10/26/2009:
Subsequent discussion revealed:
1) The above definitions are not in tune with mainstream.
2) The definitions used by the mainstream do nothing to solve LSP risk of conflation.
3) No definitions for OOP concepts can solve LSP risk of conflation.
4) The mainstream concept of subclassing (new class) orthogonal to subtyping (new interface) is useful for re-implementing interfaces with different classes. However it does nothing to prevent semantics conflation of virtual inheritance, which is one of the sources of LSP risk of conflation. In short, the makes type declaration orthogonal to type semantics (implementation), which actually does nothing to solve conflation in the LSP. This furthers my point that C++ is not lacking any feature that is causing LSP conflation. Besides C++ has abstract types and private inheritance, which makes subtyping and subclassing orthogonal (I never claimed C++ was the most elegant).
==============

...whether the programmer can even express his intent to subtype or inherit, which C++ does not permit...

Incorrect, we can express this in C++, using the classtype :: operator:

bool foo(const CBag& a, const CBag& b, const CBag& c)
{
  CBag & ab = *(a.clone());
  ab.CBag::operator += (ab, b);
  bool result = ab.CBag::operator <= (ab, c);
  delete &ab;
  return result;
}

Perhaps you are complaining that C++ is too granular in it's subtyping and subclassing orthogonality?

Realize that technically you have two competing forces here in your desires. If you decrease the granularity of the expression of the orthogonality (i.e. interfaces in newer languages, actually C++ has templates), then you actually increase conflation in other scenarios.

And this gets directly back to my important philosophical (actually technically programming patterns paradigms) point, which is that people want something impossible, then fail to appreciate why they fail. Please understand that there is a trade-off between the programming pattern (more technically the category) that you choose and the way that aliasing errors due to Halting problem or Gödel's Theorem will manifest. NEVER will any chosen category for programming paradigm erase aliasing errors of natural law. Even pure functional programming will not.

I don't see how general comments on science, faith, finances or psychology are even remotely relevant to LSP, subtyping or OOP, except as some sort of personal metaphysics...

I hope you see now. The natural law is immutable and has been since it was written down at least about 2000 years ago. Halting problem and Gödel's Theorem are just manifestations of it. It is far from a private or personal philosophy.

Perhaps you can claim that it is off-topic from PL (programming logic), when I extend the above understanding to the fact that the superset of the Halting problem actually applies to all signals in the universe (i.e. sampling theory says we do not know if we have sampled the true signal until we've sampled infinite time), and thus that all science is faith. That disturbs people and turns them off, and they want to label it a "personal metaphysics" to try to get it out of their mind. But the fact remains true and I think coming to grips with it emotionally is necessary to fully come to accept the implications Halting problem for our profession in PL. Then we do not make the emotional mistake of the easy accusation that C++ was the problem of our human conflation of subtyping and subclassing. It is an emotional maturity to come to expect random error and to understand that science in all forms is and will always be a faith. Then we strive to apply reason correctly and not emotionally leap to incorrect reason, because we first search for the sources of random error, not lash out at the wrong victim. For example, right now society is preparing to blame the "lack of regulation" (actually a programming logic paradigm) for the financial crisis. But the fact is that the root cause of the problem is the insatiable human desire for debt and the creation-of-money/debt-out-of-thin-air (i.e. fiat) system which enables it. No amount of regulation, not even switching to a gold standard, will eliminate that aliasing error because of the natural laws. For me the superset of all natural law is the 1856 law of thermo which concludes that the universe is trending to maximum disorder (randomness). If you accept my proof of that (essentially the universe can not be infinite otherwise), then you come to expect disorder and see it as natural and thus desireable. Also we come to appreciate that exponential increases in order (local order) are mechanisms to break down order in larger systems, so as the overall trend is intact. This is related to PL in that we can see how our big wins where we create new methods to exponentially increase the amount of local complexity we can handle, actually create new opportunities (and demands) for complexity externally.

Whether LSP is enforceable is not as important...

Disagree. Per what I wrote above, it is the key thing we must accept. We must always be analyzing where the aliasing error in our paradigm is going to manifest (instead of trying to blame the manifestations on some fault of the language or paradigm...LSP says there will never be a way to avoid the risk of conflation in some code path!). This is what I was doing past few days in discussions about STM, transactions atomicity error propagation, and concurrency.

Apologies if I wasted any one's time with a long post. That is not my intent nor motivation to force my personal conviction on any one else. I think I am making a globally important point, but readers have to decide for themselves (that makes it a free market). I will try to shut up now, or be terse with small snippets of technical discussion.

science is repeatable observations

all science is faith

science is not blind faith nor is it an absolute certainty. It is repeatable observations verifiable to a discernible degree. And predictions thereof.

The overstretching of reason is precisely a departure from scientific method which (method) prescribes degree of certainty to every knowledge; a measure of fitness for each pattern match. Human intelligence is largely about pattern matching (automatic mode) and pattern creation (learning mode). Mis-application of previously learned patterns often leads to overstretching, but that's the trade-off that was apparently advantageous to the previous 100,000 years of human evolution or so as a sociable animal. IOW it is much easier for us to believe than truly understand, maintaining the proof base for each and every one of our assumptions.

Which might yet prove fatal at some point in the future.

Obviously some trade-offs had to be made, for efficiency's sake. Just like we do today with programming.

True intelligence/reason is like a self-aware Bayesian network, constantly re-checking assumptions at base of its knowledge. It's when we stop this re-checking that we overstretch our patterns, or as we call them, "understanding". Assumptions should not be confused with certainty.

So this mis-understanding is not necessarily erratic.

The world is very complex so over-reliance on assumptions might lead to wildly distorted visions of reality, that nevertheless maintain good degree of local match to the data selected (unconsciously) by the individual(s) interested in maintaining said vision (i.e. denial of fact). IOW it will be successful until a sudden break-off point where the force-fit mismatch finally ruptures, leading to a seemingly erratic break-down event. Wrong vision of reality will lead to wrong policies, which will fail, eventually.

As for the second law of thermodynamics, it only speaks of closed systems. Ours has an energy influx which pumps it up, so that natural selection provides for ever growing complexity. How that is applicable to programming, I do not know.

Repeatability is an illusion of shared reality (White Swans)

This isn't proper forum to get in a long debate about "science is faith". Can anyone suggest the best forum (that won't call one side or the other a "kook" or "quacker")? I suggest any reply should be a link to another forum where you have posted your reply.

Bottom line is you go back to Shannon-Nyquist theorem or sampling theory in general, and you see that the signal being sampled can not be known until it has been sampled for infinite time. The difference is called aliasing error. That is true whether you use wavelets or fourier (or any other) model for reconstruction. Analogously in programming logic, due to Halting problem and Gödel's Theorem, it doesn't matter what model you use for your construction (e.g. functional programming, imperative), you still can not avoid the fact that you can not sample all code paths in finite time, therefor you can not say that something that was repeatable N times, will certainly be repeatable on the N+1th test. Even proofs of simple FP programs are prone to error in the proofs themselves.

If you doubt that we can't have another event where a major theory is proven wrong (e.g. universe doesn't revolve around earth, earth wasn't flat, space & time are not orthogonal, space-time doesn't apply to quantum mechanics), then ponder the fact that since the time the baby boomers were born, the computing power on the planet has increased a million-trillion fold, yet this isn't even yet Avogrado's number (molecules in 1 gram of oxygen). Universe scale is hard for people to appreciate.

I already had this debate Eric Raymond's blog:

http://esr.ibiblio.org/?p=1271#comment-240650 (read the entire page, including sublinked discussion at Wikipedia)
http://esr.ibiblio.org/?p=1271#comment-240672 (same page)
http://esr.ibiblio.org/?p=1271#comment-240687 (same page)

But "I" (impossible to ban a person on internet, as new identities are cheap, only you can manually censor their ideas with infinite work to monitor all content) am now banned (at my request, because Eric Raymond doesn't yet understand you can't ban ideas) so won't be able to debate you there:

http://esr.ibiblio.org/?p=1332#comment-241469

...overstretching of reason is precisely a departure from scientific method which prescribes degree of certainty to every knowledge; a measure of fitness for each pattern match...

Sadly I suspect you missed my entire key point, even though your tone and intent is so upbeat and appreciated. There is no foundation point (think of a dog chasing his tail). You can't be sure that which you are basing your measurement of reason on is true. This is why Carl Sagan said he is not smart enough to be an atheist (to prove there is no God, only he could be agnostic, meaning he doesn't know).

But there is one immutable truth. It is the natural law. If you understand that it says randomness is natural, then you get out of the trap of thinking your reason is ever sure. You stop trying to be sure, and try to live for each day and to break down order as often as you can. This is why open source and Unix programming paradigm wins, because it is a dynamic system that is constantly changing to it's wild (randomly changing) environment. "Given enough eyeballs, all bugs are shallow" really means that the errors are always changing as the wild environment does. Order is a cluster of statism. More smaller modules interact with EXPONENTIALLY more permutations than less larger modules. Go look at the equation for entropy (in all it's disciplinary forms).

The 1856 law of thermo most definitely does not apply to closed systems. It says the entire universe is trending to maximum disorder (Wikipedia it!). The universe by definition is a closed system, else it isn't the entire universe.

The whole notion of falsifiability is nonsense because nothing can ever be proven true nor false with 100% certainty, and it follows from what I wrote above. Even a math proof relies on some arbitrary foundations (i.e. what is a number, etc). Again a debate for another forum.

Please take it upon yourself to move the debate to another forum if you don't want it to proliferate here. I will follow you to the other forum. All the best.

C++ did not conflate

C++ did not conflate subtyping and subclassing!

Ok, then show me how to reuse a class implementation without implementing a subtype or duplicating that class's code.

The only way to reuse code is via inheritance, and inheritance also implements a subtyping relationship, which you may NOT want. That's the conflation. If you could publicly hide an inheritance relationship, such that an inherited type is no longer seen as a subtype of the base class, then there would be no conflation. This is not expressible in C++ or most OO languages.

Private or protected

Private or protected inheritance plus "using..." may be what you are seeking.

C++ subclassing does subclasstyping, subdatatyping, or both

C++ gives you the power to subclasstype, subdatatype, or to conflate the two.

...show me how to reuse a class implementation...

I parse that to be, "show me how to reuse a classtype" per my prior definitions of the terms.

...without implementing a subtype or duplicating that class's code.

I parse that to be, "without implementing a subdatatype or duplicating the code for the classtype", again per my prior definitions of the terms.

In C++ you use class inheritance. It does not duplicate the code for the functionals (methods) from the superclass, nor does it force the creation of a subdatatype. The subclass has a new name (identifier). You then override or add new functionals (methods) in the subclass to make it subclasstype.

But I understand what is bothering you is that C++ also gives you the power to create a subdatatype simultaneous with a subclasstype. All we have to do is add data members when we do the above inheritance. But this conflation is not forced on you. You can just as easily inherit and not override nor add new functionals (methods) in the subclass, but add data members, so then you have subdatatyping without conflation.

I understand that it is bothers people that this is not more explicit, so it could be checked by the compiler somehow. I have not pondered ramifications and how that could or could not be improved without ending up with breakage or forced conflation.

Only?

>> C++ did not conflate subtyping and subclassing!
> Ok, then show me how to reuse a class implementation without implementing a subtype or duplicating that class's code.

I see two way:
1) private inheritance: C++ provide it.

2) A kind of wrapper class which 'reuse' code of a class H from another class A by implementing a 'hidden object' h of class H when an object of class A is created and calling the method of h when a method of A is called.
IMHO this is also a form a code reuse..

1) private inheritance: C++

1) private inheritance: C++ provide it.

This makes all inherited members private. Not what I want. I want to inherit the implementation, perhaps overriding some behaviour, but without creating a subtyping relationship. IIRC, this is what Oleg was after in his article: he wants to reuse the CBag implementation and enforce the stricter invariants of a set, but avoid a subtyping relationship because the semantics of sets differ from the semantics of bags.

2) A kind of wrapper class which 'reuse' code of a class H from another class A by implementing a 'hidden object' h of class H when an object of class A is created and calling the method of h when a method of A is called.

Code reuse by delegation, which forces you to duplicate all the methods and delegate to the wrapped class. Unfortunately, this is not very convenient in any class-based OO language as you must duplicate at least a portion of the code from the base class.

What is your request??

I want to inherit the implementation, perhaps overriding some behaviour, but without creating a subtyping relationship.

Please describe in what way CSet did not do that?

IIRC, this is what Oleg was after in his article: he wants to reuse the CBag implementation and enforce the stricter invariants of a set, but avoid a subtyping relationship because the semantics of sets differ from the semantics of bags.

Could you restate this so it is not ambiguous? I guess you have something in your mind that is important to you, but I can not figure out what you are requesting, that C++ can not do.

* CSet did reuse the "implementation" (classtype) of CBag.
* It did enforce the stricter conditions of the set.
* It did avoid a subtype relationship per my prior posts on that.

Please describe in what way

Please describe in what way CSet did not do that?

CSet is a subtype of CBag, so you can pass a CSet anywhere a CBag is expected. Knowing that a CSet can violate CBag semantics, what if I want to reuse CBag's implementation overriding some of its behaviour, but without establishing a subtyping relationship between them, so clients cannot erroneously handle CSet as if it were a CBag.

Others here have suggested that private inheritance solves this problem, though unless I'm misunderstanding something, this might require quite a bit of boilerplate which makes it less useful than it might at first appear.

C++ does what you requested

...what if I want to reuse CBag's implementation overriding some of its behaviour, but without establishing a subtyping relationship between them, so clients cannot erroneously handle CSet as if it were a CBag...

That is exactly what CSet does, including "so clients cannot erroneously handle CSet as if it were a CBag".

When you get a CBag reference to a CSet, you still get CSet's virtual interface semantics, i.e. "cannot erroneously handle CSet as if it were a CBag".

I suspect what you are trying to say is you don't want the user to be able to create a CBag reference that points to a CSet?

I suspect what you are

I suspect what you are trying to say is you don't want the user to be able to create a CBag reference that points to a CSet?

Yes, implicit coercion of a base class reference to a super class reference, known everywhere as a subtyping relationship.

C+ can express non-inheritable methods

Simply do not make the methods virtual in CBag. That will force the CBag reference to always access the methods of CBag.

As I said, no conflation inherent in C++. You have flexibility to choose what you want to express.

Now that is just as bad

I create a set, put in 1, pass it to somebody else's function F as a bag. F happens to add 1 to the bag. Now there are two 1s in it, violating the internal invariant of a set object. When I later remove 1 from the set I may or may not remove both, depending on the implementation details.

Perfect recipe for subtle bugs.

What is problem?

He asked to force CBag to always have CBag semantics, so he gets what he asked for.

Except that...

...you allow bag semantics being applied to sets now, breaking sets. What he really asked for is to be able to prevent sets from being used as bags altogether.

C++ can express private (local/hidden) inheritance

What he really asked for is to be able to prevent sets from being used as bags altogether.

Then make CBag a private data member of CSet :).

There is no conflation of contracts either.

Bingo!

Now I think you are starting to understand. This is the only possible solution in C++. What one would like, however, would be to inherit the implementation of CBag automatically, (i.e., without having to manually delegate method calls to a private data member, and having the option of overriding some behavior, as well as access to protected state/methods of CBag). But one would like to do this without creating a subtyping relationship between CSet and CBag.

AFAIK, this is possible in Eiffel. It seems that you have accepted C++'s limitations as necessary and part of the structure of natural law (!), and it seems that this has hindered your ability to understand those same limitations.

Zigzag is not a valid win in Bingo :) (Illogic 102)

What one would like, however, would be to inherit the implementation of CBag automatically, (i.e., without having to manually delegate method calls to a private data member...

That is not "conflation" which was the original complaint that I disagreed with.

I am not disagreeing with feature requests for C++.

...having the option of overriding some behavior, as well as access to protected state/methods of CBag). But one would like to do this without creating a subtyping relationship between CSet and CBag.

Yes you can do that in C++ with the method I described (or inheriting CBag private and delegating) in my prior post that you replied to.

It seems that you have accepted C++'s limitations as necessary and part of the structure of natural law (!)...

You stated no valid limitation.

You are idolizing logic so much (making it your God), striving so hard to make it all consistent every where without exception (which of course the natural law says is impossible), that you've tied yourself in a illogic pickle spaghetti and the natural law is blowing illogic aliasing error out your typing fingers (an assumption granted but probably correct).

Don't Feed The Troll is why I won't reply to that other post of yours. Nor the graduate of Illogic 101 troll. Make factual logical posts please.

===============
UPDATE: here is rebuttal to the reply that follows this message by naasking:

Subject: Conflations of topic proliferations

...a non-trivial amount of boilerplate for explicit delegation is required for subclassing-without-subtyping...

But that is NOT CONFLATION of subclassing and subtyping, which was the only technical claim I was disagreeing with from the beginning to the end of this topic.

...and as Andreas pointed out, subtyping-without-subclassing is entirely impossible.

Disagree. Do inheritance to an abstract class (virtual = 0) if you need a re-implementable type. It may not be perfect, but it isn't impossible.

But the salient point is that if foo() takes an interface IBag (in your perfect language of choice), this does not prevent someone from
implementing an IBag that has CSet semantics. Remember the interface of CBag and CSet are the same, it is the semantics which differ. So
interface conflation is not the issue that Oleg was faced with. He was faced with the fact that foo() accepts an interface that has virtual inheritance.

I keep trying to point out that you are presenting an academic argument that 1) I never disagreed with, C++ can do what interfaces do and it can do private inheritance, 2) that has no bearing on the problem that Oleg raised.

How many times am I going to have to say that LSP is causing the problem that Oleg raised (as he admitted) and all the orthogonality you want to invent in your perfect language will never prevent it if the user is determined to conflate. The LSP can not be fixed with a language
construct. Period. That is an academic conclusion.

Now please stop badgering me. Am I not clear already that no perfect language can solve LSP? And C++ is not unable to conflate subclass and subtype in both directions?

I added an update to this definitions post from early in this discussion:

C++ did not conflate subtyping and subclassing!

Yes you can do that in C++

Yes you can do that in C++ with the method I described (or inheriting CBag private and delegating) in my prior post that you replied to.

Inheritance is intended to promote code reuse. Requiring so much boilerplate to express a solution because declaring subtyping and subclassing relationships are a combined operation instead of separated violates that goal; as you admitted, a non-trivial amount of boilerplate for explicit delegation is required for subclassing-without-subtyping, and as Andreas pointed out, subtyping-without-subclassing is entirely impossible.

As far as I can see, nothing significant is gained by keeping them combined, but much is gained by separating them. I'm not sure why you would be so resistant to separating two concepts that are clearly orthogonal, since separating them strictly increases the number of safely expressible compositions of code.

There is significant literature covering these concepts, and you are arguing with people that are familiar with this literature. Your use of non-standard terminology and the fact that you repeatedly ask for examples which have been covered extensively elsewhere, and even linked from the article you first read, implies that you aren't familiar with it. It's hard to have a productive discussion as a result.

If you are still interested in this topic, I would recommend that you read the literature for yourself to see what all the fuss is about. If you still think you have trivial solutions to these problems, you can easily publish a paper to much acclaim, or write up a post on your blog or website and post a link here. Many here would be very interested in reading about any such solutions.

[admin]

LtU's probation system can be circumvented by editing old comments, but that rather misses the point, and is more likely to lead to banning.

In any case, it seems unlikely that the spirit of LtU will be served by continuing this discussion here. If anyone wants to continue it, please consider finding a different venue and posting a link here.

Corrupt record?

Is that why Shelby's profile says that he has been a member for 39 years and 44 weeks?

Glitch in the matrix

Shelby requested that his account be blocked, and we've done so. Essentially, the account no longer exists but the comments and author's name persist. That profile page behavior occurs with any nonexistent user id, and is a bug in the version of Drupal we're using.

Ask the other way round

You asked the question the wrong way round. You actually can have inheritance without subtyping in C++, via private inheritance. However, like in most mainstream class-based OOPLs, you cannot have subtyping without subclassing.

C++ does subtyping w/o subclassing

...like in most mainstream class-based OOPLs, you cannot have subtyping without subclassing.

You just have to get your definitions of terms correctly orthogonal:

C++ subclassing does subclasstyping, subdatatyping, or both

I think your confusion is because the same syntax is used in C++ for both subclassing the type and the data. But the choice is in what you put between the {} of the subclass.

That is why I do not like the terms "subclass" and "subtype". Interfaces (classtypes) are types too. I prefer "subclasstype" and "subdatatype". And the action of inheriting is "subclassing", which can be inheritance of "classtype" or inheritance of "datatype" or both.

Strictly speaking, my conflation of "interface" with "classtype" is not correct. The interface could remain constant, while the behavior of the interface could change. So my definitions for that still need to be fixed.

Call it what you want

The fact remains that you cannot treat objects of type T as also having type S ("subtyping" in standard terminology) unless you have defined both types as classes/interfaces and proactively derived T from S ("subclassing").

You are conflating, not C++

class sub : super {} is agnostic as to whether it is subclassing the interface+semantics or data schema or both.

And all this is subclassing

You seem to be redefining the meaning of "subtyping" in a non-standard manner. That's not very helpful for understanding what people are talking about.

Show me how I misdefined sub(data)typing?

?

The misunderstanding is that you have arbitrarily put it in your mind that class sub : super {} means a new interface and/or virtual behavior (overridden methods), when in fact C++ does not force that.

C++ is entirely flexible and agnostic about it. When you subclass, you can subclass the data schema (subtyping aka subdatatyping) or you can subclass the interface and/or virtual methods, or you can do both.

Why is that so hard to wrap your mind around?

Let me give code examples of each:

class TopLeft { int x; int y; virtual void Clear() { x = 0; y = 0; } }; class BottomRight : TopLeft { void Clear() { x = 640; y = 480; } } class Rectangle : TopLeft { BottomRight br; void Clear() { br.Clear(); this->TopLeft::Clear(); } } class Rhombus : Rectange { int skew; }

So BottomRight is a subclass of virtual method semantics (i.e. subclasstype) and Rectangle is a subclass of data schema (i.e. subtype aka subdatatype) and virtual method. Rhombus is the pure subtype (aka subdatatype) example, although Rectangle could be argued to be also since Clear() semantics remained consistent.

I haven't written C++ code in several years (literally this was first time), so excuse any syntax errors.

The definition I gave for classtype a few posts back, is problematic. Also when subclassing of datatype, you really need to modify methods to support the new data members, but that is not a conflation with subclassing of classtype. I need to improve my definitions.

CSet subclassed the semantics but did not subclass the datatype of CBag. What is the conflation that was forced upon you by C++?

Be nicer.

Why is that so hard to wrap your mind around?

It would serve you well to be more civil. You should bear in mind the possibility that it may be you who is wrong. (You should especially bear it in mind in this case, because in this case, it is true.)

Finally, I ask you to pause and notice that in the past week you have literally dominated all discussion on this site in terms of the number of words posted. This state of affairs is far from ideal. I don't plan to argue with you about it, but I wish you would consider whether the length and topic of your posts is appropriate, and consider who is the intended audience. You asked earlier where to find an appropriate forum for this type of philosophical discussion. I don't have an answer to that question, but just because a better forum does not exist does not make it appropriate to conduct the discussion on LtU.

For what it's worth, I don't have any problem with the topicality of the current post, as long as you're willing to learn and not just to promulgate your own eccentric views.

Facts?

Please tell me what is wrong specifically? Give me a clear statement of something C++ can not do that is conflation. I really want to know. Until you do, I can not be shown to be wrong.

(apologies for the wrap your mind around comment, but it is frustrating to repeat the same answer several times and still not get a clear statement of something C++ can not do with regard to this conflation accusation)

as long as you're willing to learn

Okay teach me? I am waiting........

Who was wrong?

You should bear in mind the possibility that it may be you who is wrong. (You should especially bear it in mind in this case, because in this case, it is true.)

C+ can express non-inheritable methods

Done.

It would be nicer if you do not put it in your mind that I am evil person out to do harm to someone. I also wish I didn't have to make so many posts to get to the resolution. Regards. Hope this topic can die, I want to go do other things away from this site.

Not done.

This is exactly what I'm talking about. You thought you were right before, and you still think you're right now, but so what? It is still possible that you are wrong now, just as it was possible then. Keeping this in mind would make you much easier to talk to. If you would like the topic to die, you can simply stop posting invective. (If you spent more time reading, I suspect the entire thread could have been avoided.)

Briefly

I (kind of) understand what you are trying to say. However, that has little to do with what other people mean when they talk about subtyping vs subclassing - or types vs classes, for that matter (*). In standard terminology, all your latter classes are subclasses of TopLeft, and in languages like C++ or Java they all introduce types that will be treated as subtypes of TopLeft by the type system. This coupling is what folks have been criticising as problematic (since more than 20 years, in fact).

If you really want to understand this better then why not follow some of the pointers given on, say, Oleg's pages?

(*) Briefly, a type is something that describes objects in a type system, and thereby defines their legal uses (in OO e.g. in terms of the messages the object understands). A class OTOH is a language mechanism for creating objects. Notions like "abstract classes" or Java's "interfaces" have been invented to (barely) work around the conflation of the two.

Real problems have solutions, imaginary problems may not

I appreciate you kinda trying to understand my point.

I am trying to read your synopsis of a problem but (not to criticize you, just to be frank) it sounds like "purist" theory mumbo-jumbo non-problem to me-- an invented problem in imaginary world that doesn't exist in real world. I have written huge C++ projects years ago and I do not ever recall getting stuck in some widespread conflation problem of this nature. My main complaint with C++ was I guess off top of my head, the perceived lower productivity than with dynamically typed languages.

We can go round and round in circles forever on theory, but the point is that no one yet has given me an simple example of code that couldn't easily express what they requested to express in the C++ syntax relative to subclass of data or semantics.

I read that long website from Oleg trying to figure out what is such a problem, and I only had to change two lines in his foo() to fix the problem that he probably spent days or weeks researching and writing about.

How can I respect the complaints of people if after several posts they still can't show me a real problem in code that I can not easily fix? Why should I go off reading Oleg's references when he wasted so much time developing heuristics to fix something that I fixed by changing 2 lines of code? I mean I hardly respect his power of discernment now. No set of heuristics can insure LSP in all scenarios.

If the problem was so pervasive, why can't any one show me an example?

Oleg's example

Oleg's example is representative of something very real. If you tried to "fix" it by changing foo then you have missed the whole point. Which is that bags come with an implicit contract that is broken by sets, while no client of Set is prevented from passing sets as bags, thereby potentially allowing them to (accidentally?) break arbitrary other clients of Bag who rely on that contract (here represented by foo). More seriously, OOPLs actually tend to encourage such dubious uses.

C++ subclass contracts do not force conflation

Excellent statement of the complaint. Thank you.

Which is that bags come with an implicit contract that is broken by sets, while no client of Set is prevented from passing sets as bags, thereby potentially allowing them to (accidentally?) break arbitrary other clients of Bag who rely on that contract (here represented by foo).

No contract is broken in Oleg's first foo(). If a client feeds a CSet or CBag to foo() he gets CSet or CBag semantics respectively.

No contract is broken for the programmer of foo() when foo() expects CBag, because the programmer can see that the methods of CBag are virtual, and thus can not expect he will always be operating with CBag semantics. If the programmer of foo() wants to force CBag semantics, then he can, in which case he should document that or throw exception when the input is not a CBag. If the implementer of CBag wants to force always CBag semantics he can.

Sorry I do not see any hidden breaking of contracts nor any forced conflation of subclassing. Could you give me a real example of errors in code that proliferates in a project and can't be done correctly with forethought?

The designer of CBag and CSet should implement the overrides such that semantics is preserved. The programmer who uses overridden methods should depend on those documented semantics which should hopefully be consistent (e.g. an "Add" is still an "Add" even if it only adds unique elements) even though they will be different. That is not conflation, that is the way inheritance is supposed to work.

Can it cause collisions in logic and composability? Yes of course, but that depends on the designer (human error). That was the whole point of my recent diatribes about aliasing error and keeping modules small and orthogonal. But that is not forced conflation, that is just the nature of complexity (the natural law, the universe is always trending to maximum disorder, if you stand in it's way by adding too much order, you get trampled after not going too far).

I do not know how you could expect inheritance to work any other way? Are there are alternative models to inheritance? Okay I know about interfaces and I think that is great, but I do not see how that is any different that forcing foo() to use CBag semantics, which I showed how to do.

Non-solutions

You suggest two choices:

(1) Use virtual inheritance. Allows clients of Set to Break clients of Bag.
(2) Use non-virtual inheritance. Allows clients of Bag to break Set.

That is, both are non-solutions. Both allow some clients to break either the contract of Bag or the one of Set. Both fail to deliver on OO promises about encapsulation and compositionality. The only sane solution is to disable subtyping. Two possibilities:

(3) Don't use inheritance. Basically abandons OO (although YMMV).
(4) Use inheritance, but not subtyping. That is, if the language properly separates them.

C++ private inheritance gives something like 3.5, btw. As naasking rightly pointed out, it doesn't really give you proper inheritance.

(I won't go into the possibility of inheritance with dynamically checked contracts, as that has its own set of problems and limitations.)

How do we explain orthogonality to someone who can't see it?

How can there be solutions to a real problem you've never articulated. You keep talking about theoretical breakage that doesn't exist unless the designer wants breakage. I have given solutions to every request for expression and explained why there is no breakage nor contracts (unless the designer intends them...they are not forced by C++).

I am afraid that your way of thinking about the issues is so conflated, that I may not be able to explain to you why there is no conflation forced by C++. I will give it a try, maybe there is hope for our mutual understanding (would that be love?). Of course someone can always conflate anything in any language (LSP insures it!!! or did you miss that key point), but it doesn't mean we were forced to by the language.

(1) Use virtual inheritance. Allows clients of Set to Break clients of Bag.

C++ allows the designer to create conflation, but C++ doesn't force conflation or breakage. It is up to the designer of the semantics of overloaded method. And it is not the clients of CSet that would be causing the breakage, but it would be the designer of the overloaded method.

I think you need to define breakage in order to understand this. If the Add() method for CBag and CSet are both consistent with adding, there is not breakage. The clients of CBag and CSet both get Add() the way they each expected, when Add() is called virtually within foo(). Your accusation of a victim of breakage would only be valid for the expectation of the programmer of foo() that when he uses a CBag& reference that he gets CBag semantics. But the programmer of foo() can see that Add() is virtual and he did get virtual Add() semantics, which is what the designer of the CBag and CSet Add() semantics intended. I do not see any breakage at all.

(2) Use non-virtual inheritance. Allows clients of Bag to break Set.

Read above that there is no breakage in that foo() example that should be causing us to choose non-virtual inheritance. If you do choose non-virtual inheritance, then you do it for some design reason. The feature of non-virtual inheritance is that Add() for CBag is always used when CBag& reference is used, and vice versa for CSet. That is a feature, not conflation. You choose the feature if it helps your situation. So the concept of breakage does not have any meaning in this context of choosing a feature to match a design goal.

Until I see a real world example of a problem that can not be unconflated, then it is just whistling in the wind for us to try to debate this. If you give me real problems, I will show you how C++ can do them without conflation. If you give me imaginary problems with words ("breakage") which have no meaning in the context of a specific design decision that has not yet been articulated, then we have no foundation with which to discuss. In short, you are way too high level to get any meaning from this. We will see only aliasing error in this discussion. It is analogous to thinking you know how to "purely" attack a problem, then digging in and realize you need a totally different approach. How many times that I thought purity was going to be important (farting in the wind, wasting our time like we are doing right now), and it was just a waste of my brain cells. Real problems (experience) make us better programmers.

Speaking of wasting my time, I am not going to bother to reply further to this. If you can't see it by now, it is your problem not mine. I have more important work to do than fart around here trying to teach novices. Go get a real job with C++ and learn how to unconflate.

I just punted the ball into the canal. Have fun getting yourself muddy from here. I won't be around to clean it off.

Final replies have been inserted in prior posts:

No conflation in C++, LSP is never solved by your perfect ideals

C++ did not conflate subtyping and subclassing!

[admin] Probation

Due to continued violations of site policies, Shelby Moore III's account been placed on probation, which means that further comments from him will be reviewed for compliance with the policies before being published.

Since Andreas didn't mention it...

I appreciate you kinda trying to understand my point.

He said he (kind of) understood what you'd written, not that he kinda tried.

I haven't read everything you've written on LtU (I doubt anyone has), but the two "solutions" I've seen from you (don't use virtual functions and use an explicit scope specifier) are total hacks. So it's not much of a surprise that you're not excited by the "purist theory mumbo jumbo" that's being discussed or that you've never had a problem with C++. I get worse is better, but some of us like better better.

Illogic 101

I appreciate you kinda trying to understand my point.

He said he (kind of) understood what you'd written, not that he kinda tried.

It wasn't a conscious grammatical juxtaposition. Consider it a bug. Thus, your insinuation is you never make small grammar bugs :).

...two "solutions" I've seen from you (don't use virtual functions and use an explicit scope specifier) are total hacks.

They are not hacks. It depends what the designer wants to choose (more than 2 paradigms). If not using virtual keyword was a hack, then the keyword doesn't need to exist, it would just be hard-coded into every method :).

Just a little bit of logic for ya :) bud.

I would hope not

Thus, your insinuation is you never make small grammar bugs :).

Poor grammar isn't tolerated in the fortune cookie business.

I can see how that would

I can see how that would also be necessary too, but I had thought private inheritance made all inherited members private, which doesn't seem like what I was after.

I'm thinking of going into

I'm thinking of going into the fortune cookie business. What's the copyright status of this post?

Programming is to some degree random like a fortune cookie

After reading this, do you still think randomness of nature law does not apply to PL and your application of it in a very real way?

I will try to shut up now, or be terse with small snippets of technical discussion.

Whoops this post was a transaction atomicity error :).

After reading this

I think that what he must think is 'Ce que l'on conçoit bien s'énonce clairement, et les mots pour le dire viennent aisément'.

Or more simply, I would advise you to try to be terser&clearer in your posts otherwise I doubt that you'll find many interesting discussions..

Regards.

LSP exists at the design level

LSP can't be encoded in the type system. LSP is a design principle for working with OO polymorphism. That a language can let you write code which is poorly designed is neither news to anyone, nor is it possible to avoid.

Sorry. Can you explain, what

Sorry. Can you explain, what does "horizontal subclassing" and "vertical subclassing" means?

These aren't the normal

These aren't exactly standard terms used, but...

Vertical subclassing is normal subclassing.

Horizontal subclassing is a method of altering and extending an already existing class without altering its source.

The units of extension are

The units of extension are different.

In vertical subclassing you have a class extending the class.

In horizontal subclassing you have a class library extending other class library. Or module/package extending another one.

I think UML2 package extension mechanism specifies this model of work. They have quite a funny scenarios defined. ML modules might also come close, but my knowledge of ML is quite limited.

The idea is actually might be useful in other contexts, so I suggest looking into it. I borrowed the idea as it comes in UML2 for ETL to implement grammar extension mechanism.

How does Ada do it?

2) bounds checking is not needed if the language makes indices be strong types. ADA has this feature very nicely done.

I looked around for information on Ada's solution, but couldn't find what I needed. How does Ada do it?

Dependently typed array bounds would guarantee no out-of-bounds accesses without having to perform any dynamic checks.

Run time...

Not all bounds checking is static. The Ada model supports runtime checks for various error conditions, and array bounds are a part of this model.

"Lenient" evaluation

-- versus eager or lazy.

Can anyone point to more information about this?

I see that there was a discussion of this about 1.5 years ago (here, which also makes mention of "the next mainstream programming language(s)"), but haven't seen much else.

"How Much Non-strictness do Lenient Programs Require?"

How Much Non-strictness do Lenient Programs Require?

While not dedicated solely to description of what is lenient evaluation strategy, this paper is the only (free) online resource I could find on the topic. I didn't look hard, so there may be others as well...

Lenient

Lenient evaluation is just non-stict, non-lazy implementation, aka:

1. Inputs to a function need not be defined/valid -- not strict

2. Unnecessary work may be performed -- not lazy

My understanding is that most non-strict languages are "lenient" ones.

ID, on the mosoon

You can draw a 2x2 grid:

 strict   non-strict    |
-------------------------
 C, ML     Id (lenient) | eager evaluation
           Haskell      | lazy evaluation

(credit for the diagram goes to... I don't recall. SimonPJ perhaps.)

Of course, once you've drawn that, you wonder what goes in the last corner, and what a lazy strict language would look like.

Bounds Checking and Array Comprehensions

Check out SAC for some good work on dimension intedependent array comprehensions and a type system for array indexes.

The slides were interesting,

The slides were interesting, but without hearing the lecture it is hard to understand the deatils. Nonetheless, I feel he is right about the importance on concurrency.

A couple of quotes I found interesting

There is not a simple set of “hotspots” to optimize!

A lot of people falsely assume that in any project there's always a small core that needs to run fast, which can be coded in C, etc., but the rest can be written in your favorite but slow language.

90% of integer variables in Unreal exist to index into arrays

Wow !

Garbage collection should be the only option

This is very interesting coming from a game dev.

Garbage collection could be controlled by game state.

Garbage collection could be controlled by game state.

Imagine if you will, you burst into a room. Monsters are everywhere and you have only a couple seconds to respond before you're toast. Do you want the GC to run now? Hell no!

What if you could defere the GC run until the room was cleared. As soon as there are no monsters in sight, go ahead and process that long overdue memory sweep. Or run it whenever the player opens a menu or otherwise pauses the game.

By defering the cost of memory deallocation until appropriate times, it seems possible to more than make up for the cost for GC. Obviously there will be times when a memory sweep cannot wait, but careful tunning should at least reduce those occurances.

Real-time GC

Presumably what you really need is some form of real-time garbage collection, to ensure that your frame-rates never drop too low. There's plenty of research in this area: you could try here, here, or here as starting points, or just hit Google.

I like that idea. I wonder

I like that idea. I wonder how users would react to a tip that they could increase performance of their game by occasionally pausing it. Most people expect a pause while they quicksave their FPS game -perhaps a good time to GC as well.

GC at the end of a level

Some games with GC-ed scripting languages do a garbage collection between levels. Obviously this means there needs to be an upper bound on how much garbage can be created, but I know of several shipped games that used this technique. (The trend at the moment is to not have levels, though, and have a big seamless world.)

GC would be fine here

You're taking the performance of superscalar computers that can perform billions of operations per second and putting it into a very loose human frame of reference. I'd say a long GC, provided you don't have an enormous heap or you're doing generational collections, would be 2 milliseconds. That's less than 1/8th of a total frame at 60fps. This would be completely fine in an action game like Quake.

Especially with C++

Yes, GC would probably be excellent - in a language like C++ where stack allocation is extensive you don't get the same problems as in Java/Python where nearly everything is an object and tons of temporary objects are created behind the scenes so putting excess load on the GC. A well managed game engine that kept allocation to a minimum combined with coarse-grained objects and stack-allocated temporary objects would probably require very few GCs, and they'd be very fast.

Note that there is research

Note that there is research on Java compiler's optimisation to allocate objects on the stack when it can see that the object is not referenced at the end of the function.
This would reduce the pressure on the GC.

Escape analysis

Right, escape analysis can be used to prove that an object is stack allocatable, and this is the "clean" way to do it as it means the developer does not have to think about these things themselves. And actually the bulk of the proof is showing that it cannot be accessed by other threads.

But, stack allocation is still not with us in production/shipping JVMs, and 10 years ago C++ developers were still benefiting from this. So you could argue that waiting around for the clean solution (in this and other ways) did Java a great deal of damage in the public eye, and unfortunately this damage is often genearlised out to all garbage collection.

Holy cow, this almost

Holy cow, this almost doesn't look like SPAM! Are the bots really getting that smart?

And most other languages too

Again, you're jumping to the "GC is slow in a human frame of reference" conclusion. Tons of temporary objects is no big deal--provided you're using a compacting collector that only cares about live objects.

"magic stack GC" versus live-object GC

I'm not so sure:
Allocating objects on the stack allows them to all be deallocated once you leave the stack frame ("free" GC) - as long as you KNOW they're not needed past that stack-frame (and thus don't need to scan them). I find this very useful in C++, like alloca() in C. (of course, it would be nice to have some guarantee about the "KNOW"ing :)

..whereas allocating "Tons of temporary objects" is still taking up memory until your GC runs.

If nothing else, I would think the latter would give more page-faults & cache-misses, and not be favored by memory-bound apps.

I suppose using the stack for objects with known lifetimes is something like "Linear Lisp", which I wish there were more of..

Region inference

Allocating objects on the stack allows them to all be deallocated once you leave the stack frame ("free" GC) - as long as you KNOW they're not needed past that stack-frame (and thus don't need to scan them).

Region inference can do this kind of thing for you automatically in many cases and also generalizes the idea dramatically. It's been discussed on LtU several times in the past. A good primer on this material is Mads Tofte's introductory slides.

Programming with Regions in the MLKit

I've been reading it during the last couple of days (on the train to work and back). While regions (in MLKit) have some nice properties, there also seems to be some gotchas that the programmer needs to be aware of. I'm not sure I would want to give up the flexibility/freedom of expression that garbage collection gives (luckily MLKit offers both regions and gc). However, regions (without gc) might be an excellent approach in some special domains (real-time / embedded), where resources (time / space) are limited.

Hot-spots

A lot of people falsely assume that in any project there's always a small core that needs to run fast, which can be coded in C, etc., but the rest can be written in your favorite but slow language.

Yes, this is an all too common fallacy. I used to work for Tim on Unreal Engine 2 and 3 and other projects, and one of the programmers who often worked on optimizations, Dan Vogel, was fond of pointing out that UE2 had a basically flat profile. It didn't start out that way, but when you've been optimizing your code for a while it inevitably converges towards this performance profile. The reason many non-game apps don't experience this is that they have a very different idea of what "sufficient performance" means.

Sufficient performance

I'm curious - by the time a game gets to the flat profile, does it usually have sufficient performance? I can understand how repeated profiling would lead software to converge to a flat profile: if you always optimize the hotspots, then eventually there will be no hotspots, because they'll already be coded in C and tightly optimized. But this is only a problem if you need to do further optimization - otherwise, you're done, and profiling has done its job.

Sufficient performance

Well, it better, or you're screwed! Like Tim mentioned, once all feasible code optimizations have been attempted/implemented, you generally start making compromises on the art/level content to increase performance.

Another thing: for a console game the minimum bar is probably a rock solid frame-rate at around 25 Hz. But higher frame-rates generally lead to smoother, better looking experiences, so game programmers are often chasing them.

By the way, there are forces at work that help to flatten the profile which I didn't mention. Especially when optimizing graphics code you are dealing with a pipelined architecture and thus a bottleneck in one of the pipeline stages means that you can shove more workload into the other non-bottlenecked stages without noticably decreasing performance. So, for instance, if you are fill-bound you can push more polygons for "free". Or if your GPU performance is memory bandwidth bound from fetching texture data, you can use more complicated pixel shaders (as long as the extra complexity is arithmetical and doesn't involve extra pixel fetches). Taking advantages of these kinds of oppurtunities also serves to flatten the performance profile.

SPAM

Ah, I see. They are mining previous comments to create new comments with embedded SPAM. Really sneaky!

It's worse than that...

The newest spam-bots hack into accounts of regular contributors and accuse random posts from years ago of being spam.

Deep philosophy

If spam is deleted in the forest, and Matt M doesn't read it, was it ever there? ;-)

Yes... We need to fix the

Yes... We need to fix the spam deletion procedure so that replies get deleted too).

I figured...

...it was something like that. But this approach seemed more fun than just asking what I was missing :).

Games can make compromises that others cannot though

Remember we are talking about a game. Games are not important. If my game spends 90% of its time in arcTangent() [which of course would use Newton's method or some variation to find results to many decimal places] I can write a new arcTangent() that isn't as accurate and call it good enough. (most likely a lookup table)

If my bridge simulation turns out to spend 90% of its time in arcTangent(), there may not be much I can do about it without risking a bridge collapsing.

In the real world most programs are not bottlenecked by the CPU these days. CPUs are fast, programmer time is expensive. Nobody cares. So long as are are not stupid in choosing algorithms things will be fast enough.

That's just wrong...

... any game engine developer would tell you otherwise.

Tripple-A games always push the envelope and will always try to maximize the performance (especially in the game console world, where most of the money is). Therefore, CPUs are *always* slow and yes, we do very much care about performance. That won't change.

What *has* changed is that we've realized that most of the game *logic* processing can be executed using slower methods. This is 50-80% of the code, and *productivity* is by far the most important factor. Therefore, higher-level languages are quite suited for these kinds of tasks.

Not necessarily slower methods...

(well, since apparently spam bots resurfaced this thread...)

Not necessarily slower methods...

Some game companies are moving towards the Mono VM for game logic processing. Interpreted languages (like Lua) carry really high performance consequences, but a VM can JIT the instructions. Moreover, a VM lets you choose from a wider variety of High-Level Languages. With an interpreted language, you pretty much lock yourself into that one langauge. What happens if you want to create an External DSL? Then you are writing another interpreter. Presumably, you might do what Quake 3 did and effectively LLVM the game logic DSL at load-time - this assumes you can stabilize the game logic around basic patterns that are easy to compile down to efficient machine instructions.

The more important void here

[eh didn't feel like continuing this conversation, sorry -- wish i could delete a post, not just edit it =/ ]

GC for memory, ok but for resources?

While 'GC only' for memory seems possible, there is also the problem when an object is used to control the access of some OS (scarse) resources such like files, in which case RAII seems (to me) the only sensible approach..

So I'd say GC only seems to be too restrictive..

PS:
Never heard before about 'lenient evaluation' interesting read even though I'm just starting.

RAII

While 'GC only' for memory seems possible, there is also the problem when an object is used to control the access of some OS (scarse) resources such like files, in which case RAII seems (to me) the only sensible approach..

RAII mixes together initialization, memory management and resource acquisition, which is an undeniably useful idiom in C++. In garbage collected languages you can still get much of the same effect. For instance, in Lisp-like languages it's common to use WITH-* macros. In C#/.NET you can implement the IDisposable interface for your class X and then do

using (X x = f()) { ... }

which will automatically call the Dispose() method on x when the section is exited. So there are plenty of ways to deal with resources that don't confound resource management with memory allocation.

On RAII in higher-order languages

Using fxCop in addition to the C# compiler

Sweeney's presentation lists a number of errors that could be detected at build time but currently aren't:

  • Accessing arrays out-of-bounds
  • Dereferencing null pointers
  • Integer overflow
  • Accessing uninitialized variables

I wonder if he tried tools such as fxCop which can check for a number of these and can be extended to verify more.

FxCop can't be fully complete and sound.

FxCop does what it can with what it's given, but you need more typing information to fully guarantee safety.

You can never fully

You can never fully gaurantee safety.

Garbage collection

Garbage collection performance in games:

Though an Unreal Engine 3 game can easily consume 512MB memory (on console) or up to 2GB (on PC, in the editing tools), less than 10% of that memory consists of data that contains pointers. The rest is bulk binary resources - like texture maps, sounds, animations, which don't require scanning for references.

50MB of data is well within the range of a mark-and-sweep garbage collector taking a few milliseconds within a game running at 30 frames per second. This is what we do. Realtime garbage collection is another possibility, but we didn't want to try to implement that within a primarily C++ engine.

Garbage collecting memory:

I do believe this is completely practical as the sole memory management solution, even in a realtime application like a game.

Garbage collecting resources (file handles, etc):

This is a very different question, because resource freeing has observable consequences beyond performance and memory consumption -- unlike garbage collection, which is justified by the realization that if you don't have any pointers to a value in memory, then it can simply dissapear (or not) without observable consequences.

For example, if you have a file handle open for writing, then other applications can't open that file. You want such resource usage to be clear and deterministic, so that files don't just remain open for a random duration depending on the garbage collector's internals.

In general, I would not advocate garbage collection of OS resources or any other thing requiring explicit cleanup. For example, I think that Java/C# finalizers are a misguided idea, because they have observable, nondeterministic consequences. That is exactly the sort of feature a high-level, secure language should avoid. The bizarre finalization state diagrams for those languages should be enough to indicate that something is wrong here!

For resource freeing, constructs which guarantee that every resource is freed look promising (think of wrapping a file handle in an abstraction like a Haskell State monad). Or just plain old handles with explicit closing (and thus the risk of not closing a handle, closing it when it's not open, etc).

FxCop:

These sorts of tools are useful for analyzing code written in existing/legacy languages. Microsoft has some scarily powerful analyzers along these lines that they use internally, as well. But they're all built on the idea that you have code written in a language without strong safety guarantees, and you want to statically analyze them to figure out some of the things that might possibly go wrong at runtime. That is useful, but it's not comparable to the guarantees one would have in a language with strong safety guarantees.

I remember writing game code in assembly language a long time ago. Then macro assemblers came out and made assembly code easier to write correctly. But the eventual solution was C, which led to ~5X higher programming productivity and a comparable reduction in bugs. Ultimately tools like this are just workarounds for legacy problems to which genuine solutions are known ... and on the verge of coming into commercial practice.

Garbage collection without type information?

C++ garbage collection would greatly benefit if it was assisted by the compiler. During a scan, only pointers to objects need to be scanned. Unfortunately, it is not possible without assistance from the compiler. That is the reason third-party solutions are not good enough for performance-intensive applications.

Memory management and resource management is not the same. Resources other than memory are usually few in a program and not interconnected. A program may have files, sockets, windows and other types of handles, but usually these handles are 'dead ends', i.e. they do not contain references to other handles. Memory, on the other hand, is a totally different beast: a block of memory usually contains references to other blocks of memory. So the idea of finalizers is a bad one, since resources other than memory should be freed by RAII techniques (for deterministic resource management), but memory should be handled by garbage collector.

I will say it again, because no one seemed to notice it: could it be that there exists a "calculus" for resource management?

Resource management calculus

In general this is not possible in a Turing-complete language, because of the halting problem. Presumably the goal of your resource management calculus is to determine if there is some statement within the program that can still access the resource, and free that resource iff there is not. Now replace that statement with "halt()", and your resource management calculus can determine if any arbitrary Turing machine can halt. But we know this is impossible, so such a resource management calculus cannot exist.

Garbage collection, reference counting, RAII, and WITH-blocks are all conservative approximations. They tell you if it's safe to delete a given resource, but they do not guarantee they'll always find every safe situation. That's why it's still possible to leak memory in a GC'd app: just hold onto references longer than they're needed. RAII and WITH-blocks can also fail, but they usually do this because the program went into an infinite loop within the stack frame in question, so you have bigger and more obvious problems.

There may be better resource-management solutions (and personally I think Eros OS has an interesting approach: the disk is just treated as one more level in the memory hierarchy, and everything is automatically persisted. Wonder how they deal with garbage), but "perfect" resource management is impossible.

How about a 'limited' resource management calculus?

Of course, it can not be determined if an algorithm halts or not, but I think it is doable to infer from the source code when some resource is no longer needed.

First of all, humans do it, by running an approximation of the program in their brain.

Secondly, I think it is possible to improve resource management by annotating (either manually or automatically by inference) if an object is stored in a temporary or another long-lived object.

Here is an example of a double-linked List class that shows the above concept with annotations(in Java style). The annotation stored means that an object is stored in another object; The annotation temp means an object is not stored:

private class Node {
    private Node m_prev;
    private Node m_next;
    Object m_object;
}

private class ListIterator implements Iterator {
    public ListIterator(Node node : stored) {
        m_node = node;
    }

    public boolean hasNext() {
        return m_node.m_next != null;
    }

    public Object getNext() {
        Object result = m_node.m_object;
        m_node = m_node.m_next;
        return result;
    }

    Node m_node;
}

public class List {
    public Iterator find(Object object : temp) : temp {
        for(Node node = m_first; node != null; node = node.m_next) {
            if (((Comparable)node.m_object).compare(object) == true) return new ListIterator(node);
        }
        return null;
    }

    public void add(Object object: stored) {
        Node node = new Node;
        node.m_prev = m_last;
        node.m_object = object;
        if (m_count) m_last.m_next = node; else m_first = node;
        m_last = node;
        m_count++;
    }

    public void remove(Iterator t: temp) {
        ListIterator it = (ListIterator)t;
        if (it.m_node.m_prev) it.m_node.m_prev.m_next = it.m_node.m_next; 
        else m_first = it.m_node.m_next;
        if (it.m_node.m_next) it.m_node.m_next.m_prev = it.m_node.m_prev; 
        else m_last = it.m_node.m_prev;
        m_count--;
    }

    public void remove(Object object: temp) {
        Iterator it = find(object);
        if (it != null) remove(it);
    }

    private Node m_first;
    private Node m_last;
    int m_count;
}

Let's see using the above:

class Main {
    public static main(String[] args) {
        List list = new List;
        list.add(new Integer(1));
        Iterator it = list.find(new Integer(1));
        if (it != null) list.remove(it);
    }
}

By analysing the method main using the annotations, we can infer the following:

public static main(String[] args) {
    List list = new List;                      //list is temp
    list.add(new Integer(1));                  //new Integer(1) is stored
    Iterator iter = list.find(new Integer(1)); //new Integer(1) is temp
    if (iter != null) list.remove(it);         //iter is temp
}

We can see that objects list, new Integer(1)(the one used in find) and iter can be safely deleted after execution of main.

I think with a little more work on this, garbage collection could be made much more effective (and in many programs, eliminated completely).

Refcounting

That's basically reference counting, with additional programmer annotations that tell the compiler "Don't refcount this" (the temp annotation). At least, I hope it's refcounting, because otherwise you can run into big problems. What happens if you put an object into two different containers?

I also don't understand some of the defaults you're using. When you create an object within a function as "new Node", is it assumed to be stack allocated? What happens with the assignment to m_last then? Increment the reference count? What happens if you have cycles? Who owns objects that are returned from functions? (Obviously not the callee, but the caller might not be collection, and may do arbitrary things to the reference like passing it to multiple functions.)

Anyway, I don't really see the point, because this system is:

  • Unsafe. If a programmer forgets a temp or stored annotation, they end up with a dangling pointer
  • Inflexible. If you look at large Java projects (say, Netbeans or Eclipse), most objects are nothing more than a collection of references to other objects that manage their behavior. Yet they're not containers of those other objects, more like clients of their services. For example, a ClassPath in Netbeans is used in all sorts of places, from resolving imports to suggesting autocompletions to flagging type errors to compiling the project. This is very different from how C++ programs work, and I doubt that IntelliJ/Netbeans/Eclipse could have been written in C++. Look how Microsoft had to rewrite portions of Visual Studio in C# to get some of the features that were cropping up in Java IDEs.
  • Slow. This'll probably surprise you, but Java allocation is faster than C++.

I'd encourage you to try and pursue this - who knows, maybe you'll find something - but IMHO garbage collection's the best approach out there for memory management, and garbage-collection + orthogonal persistence + formalized concurrency calculi like CSP are the best approach for general resource management.

It is not reference counting.

What happens with this system it is that the compiler does the checks at compile time, instead of run time.

If you put an object in two different containers, the object's lifetime is equal to the longest lifetime between the two containers.

"new Node" is presumably allocated on the heap, but the compiler finds out that it does not "escape" beyond a certain block of code, so it can be safely deleted after this block of code is executed.

There is no problem with cycles, since cyclic objects also have a lifetime.

No one owns an object that is returned from a function. What I presented is a solution to improve garbage collection by doing optimizations based on code flow.

If a reference is passed to arbitrary functions, then the parameters of those functions are also annotated, thus making the compiler aware of how they use their data.

The system is not unsafe, because the annotations were only for comprehension. The compiler will insert those annotations.

I know that Java allocation is faster than in C++, since Java allocation is linear. But C++ wins due to many objects allocated on the stack.

If you put an object in two

If you put an object in two different containers, the object's lifetime is equal to the longest lifetime between the two containers.

How do you determine a container's lifetime? Does making the container (or its reference) a global result in its lifetime being that of the process?

Will the programmer be able to implement their own containers? As I understand your proposal, certain common collection operations (e.g: remove an object, empty a collection, etc.) will require the collection to somehow delete objects (e.g: those that are no longer in the collection). Without access to manual memory management (to delete when appropriate, etc.), I don't see how you'll be able to support cyclic objects as you'll need to treat reference as containment and thus might not be able to infer lifetimes (a < b < a). If collections cannot delete things when necessary, I don't see how you can avoid a space leaks in common patterns such as using a queue to store units of work for threads in a pool.

It's not possible to

It's not possible to determine all things statically, but a great deal of things can be done about it that are not touched now.

How do programmers manage manual memory management in their mind? they abstract the program as a sequence of operations without loops, and thus they are able to understand when something should be deleted.

The same thing could be done by a compiler, if the compiler was fed with the appropriate information.

My opinion is that whatever the human brain can detect in a program, a compiler can also detect. The problem lies that no one has sat down yet and defined the set of information a compiler needs so as that it can do the same work as the human brain.

Before you say that the human brain is inherently more powerful than a CPU, I will say that when I am trying to understand when I should delete something in a C++ program, I do not use all the information at hand, but only certain things in a program that make sense to the problem I have to deal with. In other words, I use the proper abstractions. If those abstractions are fed to a compiler, the compiler could do the same job I do, but without the distractions (i.e. with 100% success instead of 90%+).

But what if the abstractions

But what if the abstractions that you typically use in your reasoning typically ignore certain details which in boundary cases become very important? Then the compiler too will only cover that same 90%. How would we represent these abstractions in the compiler and how do we convince ourselves that the abstractions are sound? We can't just use a fixed logic, because then there would always be useful statements that can't be encoded in that logic. The alternative to a sound logic doesn't seem like a good idea to me, because a compiler that makes tranformations based upon assumptions that we can't show to be true is a compiler that'd I want to be very careful when using.

Not all cases can be

Not all cases can be covered, of course. In very complex algorithms, the confusion of the programmer will be transferred to the compiler. The difference with using an automated tool to do the check is that if there is at least one contradiction, the tool will find it, no matter how complex the algorithm is...whereas we humans we may not be so lucky.

The reason we humans cover the 90% and not the 100% is that we get distracted from external things, we forget, our emotions get in the way etc. A compiler would not have such problems. For example, I can easily assume that a variable X never gets value 0 in a program, but two weeks later I can come back and write 'X := 0'...or another programmer may come in and write that for me.

I'm not quite following

I'm not quite following where the statement regarding emotions comes into play....

Think about what can distract you when you are "programming".

There are many things that can distract a person who's designing/programming/debugging. It's easy for humans to lose focus; machines have the advantage in this domain.

It's not possible to

It's not possible to determine all things statically, but a great deal of things can be done about it that are not touched now.

I agree. There are so many avenues that haven't be explored in programming languages, and programming environments for that matter.

How do programmers manage manual memory management in their mind? they abstract the program as a sequence of operations without loops, and thus they are able to understand when something should be deleted.

I suspect it's a bit more complicated than that. It's my belief that the haulting problem afflicts humans as much as computers and that the same way we work around it can be applied to program analysis. Obviously the times where even we can't figure it out statically will likely be impossible for the computer, as well, but getting a compiler to utilize a fraction of our ability could yield a large boost in lowering memory usage for a GCed language.

profiling memory use

Curtis W: ...but getting a compiler to utilize a fraction of our ability could yield a large boost in lowering memory usage for a GCed language.

To lower high water memory usage for a language with gc, I think one focus should be standardized profiling in a language, showing both source of use and patterns of growth.

This leverages a human mind's ability to process patterns by simply making them visible. Then a coder can reason about causes and experiment scientifically by making changes and observing results.[cf]

Visibility is key to getting the mind into play. Highly expressive languages make it easy to request unintended effects like excessive memory use. Good profiling and debugging tools should make it easy to observe effects, so a coder can compare actual to expected results.

Abstractions...

One of the things that help me decide where I can safely delete an object is the "linearity" abstraction: I draw a line where I put all significant milestones in the lifetime of an object, including its birth (start of line) and its death (end of line). Then I try to fold this line in one of the program's loops. I guess something similar can work for a compiler.

Has there ever been a study about what abstractions programmers use in their minds?

linear types

One of the things that help me decide where I can safely delete an object is the "linearity" abstraction: I draw a line where I put all significant milestones in the lifetime of an object, including its birth (start of line) and its death (end of line). Then I try to fold this line in one of the program's loops. I guess something similar can work for a compiler.

Sounds like linear types to me, which are definitely useful for automatic memory management and regions, etc.

Boehm GC

The Boehm GC can be made less conservative by giving it type information in C++ apps, typically with the help of the compiler. I think there are smart pointers you can use that let the collector know to only scan those values, but I don't recall the details.

Anyway, it's certainly possible, and these days nothing stops you using G++ or LLVM and creating your own C++ compiler modified however you wish :)

C++ Runtime Data

Something to look at here might be item #9 of Paul Wilson's OOPS Group Publications at the University of Texas: "Portable Runtime Type Description for Conventional Compilers." An implementation is apparently available via FTP, but it's not obvious from where.

Note that item #9 is related to item #14, "Real-Time Non-Copying Garbage Collection." This is definitely real; it's used in the RScheme system.

some random comments

some random comments

can you get aspects for c++? i would have thought so. they seem like a possible solution to the problem of extending base classes, based on my limited experience with java (especially if the extensions are orthogonal to the original classes, so really it's more like multiple inheritance from a bunch of base classes).

lazy evaluation seems a rather extreme solution to uninitialized variables - couldn't the compiler just warn instead?!

lenient evaluation *is* eager evaluation, isn't it?

arrays are preferable to lists? doesn't that mean imperative counters? have you considered lists as an alternative to checking array bounds via dependent types? maybe there's some kind of hybrid implementation that gives close-to-array efficiency for simple cases? [later - list comprehensions only give you map, not fold, afaik]

type inference doesn't scale for modules - have you looked at ml?

Type inference and scaling.

type inference doesn't scale for modules - have you looked at ml?

Since Tim is obviously reading this forum it would be interesting to hear a clarification of that statement, "Type inference doesn't scale". The slide in question seems to mix that statement along with some "scary" error messages - which is of course a valid observation, but in my humble opinion not the same problem as "type inference doesn't scale".

two more things.

two more things.

first, i read elsewhere a talk by another games programmer claiming that the majority of time was spent in searching (A* tree search). anyone know where/how that fits with the numbers here?

second, a day later, i don't remember any reference to queues. it seems like they might help with the "interacting objects" problem. both in the future (erlang, lightweight thread per object?) and now (allowing you to "straighten out" the inter-dependencies so that single threading doesn't go round in loops between closely coupled objects).

i would certainly look at erlang and ml, rather than haskell, if thinking about high performance multi-threaded applications. seems to me that you want a strongly typed erlang, basically (how close is alice to that?)

A* search, message-passing

first, i read elsewhere a talk by another games programmer claiming that the majority of time was spent in searching (A* tree search). anyone know where/how that fits with the numbers here?

I think there's basically no way that claim could be true for a modern game if he was referring to A* search specifically, even for something like an RTS that does a crap load of path-finding.

If you take an extremely broad definition of 'search' that encompasses everything from frustum culling to collision detection to A* search and other AI-related activities, then it would have at least a chance of being true.

second, a day later, i don't remember any reference to queues. it seems like they might help with the "interacting objects" problem. both in the future (erlang, lightweight thread per object?) and now (allowing you to "straighten out" the inter-dependencies so that single threading doesn't go round in loops between closely coupled objects).

He mentions message-passing in there. Tim's definitely familiar with Erlang-style concurrency.

I would think that the games

I would think that the games programer talking about A* search meant games as in chess.

We were using A* for path

We were using A* for path finding in RTS (Real-Time Strategy) games 8+ years ago. For the unintiated RTS games involve building bases, creating units and getting them to fight each other. Click on unit or units click on the other end of the map and they figure out how to get there.

Oh, thats intressting, but

Oh, thats intressting, but even if you did use A* i doubt that your games spent more than half there time doing it, as was claimed above. By the way can you tell which games?

I can easily see any game

I can easily see any game targetting hardware from, say, 1996-2000 in which hundreds of units are finding paths across large maps on a regular basis spending half its time doing pathfinding. The RTS genre's traditionally gone easy on the graphics for much this reason, where possible getting improvements in look from better artwork.

Yep

During that time period at least (I don't know about now) tweaking A* to get acceptable performance was a fairly common topic in game development resources that I looked at. Generally using A* to obtain an optimal path simply took too much time for real-time games where you don't want the computer (or the user's unit) to grind to a halt for a second or two, so they'd reduce the quality of the path to get acceptable response times. You didn't even need more than one unit back then--computing a single path over a complicated map could take more than the few hundred milliseconds that you had available, and that would be enough to make the game unplayable.

Pretty much any real-time game that involved moving stuff around a map had to deal with these problems, for example games like Baldur's Gate and Fallout (real time RPG's), Warcraft and Command & Conquer (RTS's). All had generally abysmal path finding quality.

20-30% in our case if memory

20-30% in our case if memory serves me correctly. We did have to precompile maps (thereby precluding dynamic maps), and make the algorithm incremental to achieve that though. The was also a fudge to get units moving before the search was completed.

Haven't worked on an RTS in years, last time I heard Epsilon-Admissible Iterative-Deepening A* was the next big thing but this could be out of date especially now that dynamic maps are being tried.

lenient = non-strict + eager

After reading the mentioned paper I think that lenient is not an evaluation strategy, but a pair of (non-strict) semantics and (eager) evaluation. Similar to lazy evaluation (which implies, AFAIK, non-strict semantics). Other non-lazy evaluation strategies which satisfy non-strict semantics are: speculative and optimistic.

Anyone here knows about a good paper talking about the relationship between all these evaluation strategies?

No, but CPS is a good bet

Unless you are asking just about some fact-sheet (this strategy scores 5 for expressivity, etc.).
My worst concern is that most of these terms are abused and never standardized in the first place - lazy of one researcher may well mean non-strict of another.

As a side note, it would be interesting to see how various evaluation strategies interact with various delimited continuation operators...

Full comparison

No, I really want to understand what are the real differences behind those, because we have much research in optimistic evaluation and I have no idea if the results are valid to lenient evaluation.

Just curious

because we have much research in optimistic evaluation

Do you mean the CS community as a whole, or some proprietary research?

The reason I ask is I am very interested in (business) process semantics, and in some use cases they require something that feels like optimistic evaluation.

We, the people...

The CS community has lots of it, I have zero. Actually I have a bit more than zero, but in PLT and type systems not implementation strategies.

Oops, CPS is a bad bet

It just occurred to me that while CPS-transforming to lambda terms is appropriate to define call-by-value or call-by-name, but if one wants to express lenient evaluation strategy faithfully, then the target language must have some notion of concurrency. Which means, translating to pi calculus (or join calculus, m-calculus, etc.) is a better way to compare various evaluation strategies.

On the other hand, "CPS" in some papers does not assume the target is lambda calculus, so I might have pretend I didn't mean that as well :-)

Futures, maybe

Although I cannot point you to any paper that mentions this explicitly, all three evaluation strategies can trivially be expressed with futures. For example, in Alice ML, the three forms of reduction for application can be employed as follows:

strict: f(e)
lazy: f(lazy e)
lenient: f(spawn e)

That is, the different forms of future directly correspond to the different evaluation strategies.

Futures are great...

but what about speculative and optimistic? IIUC you need compiler suport (or some sort of staged introspection) to decide if you should evaluate some parameters. Also what happens in the lenient case if e diverges but f neither use it nor diverge? Do we have to explicitly kill the divergent thread in the implementation of f?

Good questions

Re speculative and optimistic: I'd say these are different because they rather are interesting on the level of implementation than semantics. The distinction is not a clear one, though, because it depends on the language. Lazy and lenient can both be considered implementation strategies for non-strictness, but they are semantically distinguishable in a language with side effects. In contrast, for speculative and optimistic evaluation observability usually is not desired, as far as I know.

The case of f(spawn e) where e diverges but is unused will have the correct behaviour semantically, but it would indeed leave a junk thread running forever, unless you kill it explicitly. This of course might be undesirable operationally. But I consider it a rather pathological situation and would be surprised if it posed much of a problem in practice.

Garbage-collecting threads?

The case of f(spawn e) where e diverges but is unused will have the correct behaviour semantically, but it would indeed leave a junk thread running forever, unless you kill it explicitly.

Isn't it possible to collect junk threads in PLs without (undeclared) side-effects? By junk thread I would define a thread of the future that is garbage itself (unreachable from root set) - in absence of side-effect this thread can do nothing but consume resources and return the result that will be ignored, so it is semantically just to kill it.

But I consider it a rather pathological situation and would be surprised if it posed much of a problem in practice.

I suspect there are a lot of use cases - parallelOr being the simplest (granted, in this simplest case GC is not the best option performance-wise).

[on edit: ha, so it was not a deja vu: I already asked about something like that in the past... I have to return and rethink that...]

Isn't it possible to

Isn't it possible to collect junk threads in PLs without (undeclared) side-effects? By junk thread I would define a thread of the future that is garbage itself (unreachable from root set) - in absence of side-effect this thread can do nothing but consume resources and return the result that will be ignored, so it is semantically just to kill it.

Maybe not the exact same scenario, but interesting nonetheless: The Incremental Garbage Collection of Processes...

We discuss an approach to a problem that arises in this context: futures which were thought to be relevant when they were created become irrelevant through being ignored in the body of the expression where they were bound. The problem of irrelevant processes also appears in multiprocessing problem-solving systems which start several processors working on the same problem but with different methods, and return with the solution which finishes first. This parallel method strategy has the drawback that the processes which are investigating the losing methods must be identified, stopped, and reassigned to more useful tasks.

The solution we propose is that of garbage collection. We propose that the goal structure of the solution plan be explicitly represented in memory as part of the graph memory (like Lisp's heap) so that a garbage collection algorithm can discover which processes are performing useful work, and which can be recycled for a new task.

An incremental algorithm for the unified garbage collection of storage and processes is described.

In principle, yes


Isn't it possible to collect junk threads in PLs without (undeclared) side-effects?

In principle, yes. Although I would expect most threads to not be free of side-effects, even in pure languages. In any case, you'd need an effect system or monadic type system to find out.

OTOH, Alice can, in some situations, collect threads that are blocked for eternity - e.g. when they wait for a future that is not reachable by any runnable thread (normal garbage collection actually does that almost for free).


I suspect there are a lot of use cases - parallelOr being the simplest (granted, in this simplest case GC is not the best option performance-wise).

In general, sure. I was referring to the particular situation of using threads to emulate lenient function calls - I don't think you normally would want to apply a function that does not use its argument to an argument whose evaluation diverges (unless you plan to do real lazy programming).

The difference lies in application

Interpreting Functions as Pi-Calculus Processes: A Tutorial

The paper gives encodings of CBV, CBN, and parallel CBV into pi calculus. See section 7 for a variant of encoding that differentiates evaluation strategies only by the translation of application (though for some reason this section ignores parallel CBV :-( ).

Why virtual classes?

How is that different from just adding new fields to class Actor? If all classes extending Actor will need those new fields, then it makes sense to add the field to class Actor.

In a one-developer world, you would just add the new functionality to your base class. But in the case of Unreal Engine 3, we maintain and distribute a large software framework used by a hundred or so other developers.

We're continually adding new features to the framework, so if an external developer adds a new member function to our base class's source in their local codebase, then they'll need to merge their changes into all subsequent versions of our framework, and we release frequently - every week or two.

In this case, a virtual class solution that enables a team to extend an entire class hierarchy in a framework, without modifying the framework's source, would be very valuable.

Obviously there are lots of workarounds available (see the slides) - adding extra data components, dynamically casting them to the expected type, etc. But they're quite error-prone.

by the time a game gets to the flat profile, does it usually have sufficient performance?

When all commercially-practical code optimizations have been exausted, further optimizations have to come from the game's content: reducing the number of visible objects, enemies, etc. At that point, it's just a performance vs quality tradeoff, as opposed to performance vs cost.

Arrays are preferable to lists? doesn't that mean imperative counters?

Comprehensions and folds are fine substitutes for imperative iteration.

Type inference doesn't scale

My 1-line example produced an incomprehensible error message in HUGS, but it was easy enough to see what was wrong with that code. Given a multi-hundred like Haskell program with fairly complex typing, it can be combinatorially more difficult to figure out what's wrong, given such an error message. This property doesn't seem at all tenable in a mainstream programming language.

Note that I'm only arguing against type inference: inferring x::Numeric in "f x = x+1". Type propagation, deriving the typing x::Numeric from "x = 7", is not problematic.

On Virtual Types

Note that notions of virtual types/classes are addressed very well in Scalable Component Abstractions and On the (un)reality of virtual types. You'll have to scroll down a bit on the latter to find the reference.

Update: I should have pointed out that one upshot of the above is that Scala and O'Caml lend themselves particularly nicely to maintaining groups of classes that are extended independently of each other.

Nested Inheritance

I think Nystrom, Chong and Myers's OOPSLA 2004 paper Scalable Extensibility via Nested Inheritance is also worth mentioning here (it's been mentioned before, but didn't get the attention I thought it deserved).

It allows you to subclass (implying subtyping) a group of classes, so that, just like when you override virtual methods, references to the overridden classes in the original group of classes now refer to the overridden versions.

A key difference with virtual classes and the like, is that nested inheritance works at the class-level, not the object-level: a class is an attribute of the enclosing class, not of an instance of this class. This made it easier to prove the soundness of their approach. (And arguably, it also might make their language extension easier to use than full-blown virtual classes.. that's just wild speculation, though..)

However, when the nested inheritance paper was written, the soundness of gbeta's and Caesar's virtual classes-approach hadn't been proven. This has changed recently (at POPL 2006, to be exact :-)): A Calculus of Virtual Classes by Ernst, Ostermann and Cook looks very interesting, but I haven't had a chance to study it yet. (I skimmed it briefly and they discuss the nested inheritance paper in their related work. At first glance, they seem to have some valid points about how nested inheritance actually isn't easier to use than their approach...)

As far as I can tell from your slides, both approaches (especially nested inheritance) seem to be pretty close to what you're looking for.

responses

My 1-line example produced an incomprehensible error message in HUGS, but it was easy enough to see what was wrong with that code. Given a multi-hundred like Haskell program with fairly complex typing, it can be combinatorially more difficult to figure out what's wrong, given such an error message. This property doesn't seem at all tenable in a mainstream programming language.

Incomprehensible? Such an error is easy to read for any programmer who has been using Haskell for a little while. It's telling you that there's no instance of Num for strings, so f can't be applied to them. When you look at the definition of f, that makes perfect sense, since it adds the components of the pair that you pass it. As an alternative to looking at the source for f, you can simply ask hugs or ghci to give you the inferred type. I don't see how the error is any more or less confusing than the one you're getting in C. Here's what GHC gives:

test.hs:2:7:
No instance for (Num [Char])
arising from the literal `3' at test.hs:2:7
Probable fix: add an instance declaration for (Num [Char])
In the first argument of `f', namely `(3, "4")'
In the definition of `a': a = f (3, "4")

Lazy evaluation isn't really all that costly -- it's more the case that most programmers are used to thinking in strict languages, and expect the lazy language to behave similarly with regard to performance. I think it's commonly forgotten that there are two ways to optimize a Haskell program: make it stricter, or make it lazier. Lots of people forget the second one. It's actually not so hard to get decent performance from lazy code if you're used to thinking in terms of demand. Sure, the compiler could also be a lot better in terms of the code that it produces, but compilers for strict languages have been around a lot longer than for lazy languages, and it's an easier problem there in the first place. I don't think it's a fundamental problem with non-strict semantics, and the benefits of non-strict semantics in terms of compositionality of programs are so incredibly huge, that it's worth a closer look. :) Laziness only when explicitly specified will defeat the whole usefulness of it in the first place, because strict functions will be all over the place and will prevent the gain in compositionality that laziness gives you.

There is a right place for strictness, but it's actually rather small compared to the cases when laziness wins out. You want strictness when reducing large data structures into small ones in a way which uses all of the large structure (i.e. not a search). I'd claim that in all other cases, laziness is actually preferable or equivalent to strictness in terms of performance. Strict higher order functions can be provided to deal with cases where you really want the strictness: see foldl' (note the prime) in Data.List as a good example.

If you want to know why lists are so popular in Haskell, ask why loops are so popular in C. :) The reason is mostly the same. Due to laziness, lists, along with the usual higher order functions are essentially reified loops which can be transformed all at once. In the absence of laziness, yes, arrays may be preferable. It's laziness that allows you to think of data structures like control structures.

Here's a nice (perhaps too small) example of where laziness improves compositionality. I could give much larger ones, and Why Functional Programming Matters gives lots -- from numerical computation to game AI.

Consider writing a function which checks if a particular string x occurs as a substring of another string y. A natural way to break this problem up is to check if x is a prefix of any of the tails of y. In an imperative language, you end up writing a couple of nested loops, with early breaks to proceed to the next tail as soon as it becomes obvious that x doesn't occur there. In Haskell, we can write:

isSubstringOf x y = any (isPrefixOf x) (tails y)

If Haskell was strict, this would be a very stupid way to write this function. Computing all the tails of y, then checking every single one of them to see if x is a prefix, then summarising that to see if any of the tests returned true is asymptotically much worse, and usually quite impractical. With lazy evaluation, the above code is essentially the same as the nested C loops with the early break out tests. We can successfully reuse existing tools like 'any', 'isPrefixOf' and 'tails', whereas if the language was strict, we'd be back to square one.

Another odd thing is that you complain about the syntax. The syntax of Haskell is actually one of the things which I like most about it. I suppose this is a matter of taste, but could you be more specific here? I think the syntax is really clean and easy on the eyes.

I'm still not really sure what you mean about type inference not scaling. Perhaps it doesn't scale to certain extensions of the type system (higher rank types / GADTs / existential types), but when combined with a relatively small number of programmer annotations, it's incredibly nice. Most Haskell programmers write type signatures on important top-level bindings anyway, as a form of documentation, to aid in thought about how to write the function, and to improve error messages. I still wouldn't want to lose type inference altogether though, because it would be a horrible pain to get anything done. You really don't want to explicitly type everything, and it's very often nice to be able to omit type signatures when they're unwieldy, or you're just writing something quick.

- Cale

Incomprihensible error messages

Incomprehensible? Such an error is easy to read for any programmer who has been using Haskell for a little while.

I am going to side with Tim on this one. It is incomprihensible, and more research into type error reporting is needed. Bastian Heeren's thesis has a lot of interesting information on the subject, if you were to skip all chapters but one I would recommend atleast skimming Chapter 12 of it.

Lazy evaluation vs. Streams

If Haskell was strict, this would be a very stupid way to write this function.

Isn't it a property of lists in Haskell having two uses: (algebraic) lists and (co-algebraic) streams? In a strict language with streams you can write similar code and have similar results.

I'm not sold on non-strictness as a default for improving compositionality, IMHO all you need is a way to declare codata in you language.

The C++ traits pattern

Re: virtual classes, have you looked at the C++ template traits pattern?

I know templates get a lot of flack for their syntactic ugliness, but this is one problem they can really solve quite well.

Basically you make everything a template, and inherit from a traits class passed in as one of the template parameters. I can then instantiate an Actor<MyCustomData>, which can inherit custom methods, fields, etc. There is a lot of knowledge in the C++ community about how to deal with whole familes of relations between such constructs, for instance how to implement things like:

	Collision<CustomCollisionData> Actor<CustomActorData>::DetectCollision();

and have the code be able to query the CustomActorData traits to automatically determine the appropriate type for CustomCollisionData.

Modern C++ Design, by Andrei Alexandrescu, is pretty much the definitive text on this stuff: check out http://erdani.org/

What about the verbosity problem?

Paul Graham's article about what makes hackers prefer some programming languages over others makes the interesting point that programmers are lazy -- concise syntax is highly desireable. I think this is part of why I enjoy coding in Python but hate coding in Java. Tim's idea of adding more annotations seems like it would make this problem even worse. I've been thinking over compromises but I'm curious if anyone else has thought about this.

The verbosity "problem"...

One way to address the verbosity problem is simply to change your point of view. In a language with a sufficiently expressive type system, it no longer makes sense to think of these annotations as being redundant verbosity. Instead people tend to think of them as part of the program itself. This is one of the things that people mean by "typeful programming".

I disagree: verbosity is a

I disagree: verbosity is a problem, it reduce readability.
But with local type inferencing, you can have static typing and nearly the same readability as with dynamic typing.
Sure, the 'type inference' remove a little bit of the safety of static typing, but not so much IMHO.

Tim's idea of adding more

Tim's idea of adding more annotations seems like it would make this problem even worse.

I'm a huge fan of concise syntax.

I'm not such a fan of syntax that's so concise that it hides potential problems. For example, "a[i]" in Java looks like you're just reading an element from an array, but actually translates into something like: "if a is NULL, throw an exception; if i<0 or i>a.length then throw an exception, otherwise read the element from the array".

So the goal of further type annotations would be to introduce more explicitness in type declarations in exchange for making other parts of the program clearer by virtual of eliminating all of the hidden things that can go wrong with them.

Obviously, this isn't what you want for, e.g. a Python-like scripting language, where coders want to write simple code with minimum effort. But for the sort of code we write -- hundreds of thousands of lines of game and engine code -- the explicitness and extra compile-time checking would be a net productivity improvement in the lifecycle of writing, extending, debugging, testing, and shipping complex software.

which constraints must the new language fullfill?

The last days i thought about your talk. you want a new language that's better suited for your task. Which constraints do you have?
(a good bridge to c++ ?)

what do you think of:

ocaml / F#

Alice

Mlton with CML

and last:

Titanium

Titanium is an explicitly parallel dialect of Java developed at UC Berkeley to support high-performance scientific computing on large-scale multiprocessors, including massively parallel supercomputers and distributed-memory clusters with one or more processors per node. Other language goals include safety, portability, and support for building complex data structures.

The main additions to Java are:

* Explicitly parallel SPMD control model
* Flexible and efficient multi-dimensional arrays
* Built-in types for representing multi-dimensional points, rectangles and general domains
* Unordered loop iteration to allow aggressive optimization
* User-defined immutable classes (often called "lightweight" or "value" classes)
* Zone-based memory management (in addition to standard Java garbage collection)
* Augmented type system for expressing or inferring locality & sharing properties of distributed data structures
* Compile-time prevention of deadlocks on barrier synchronization
* Library of useful parallel synchronization operations and collectives
* Operator-overloading
* Templates (parameterized classes)
* Cross-language application support

What we don't need/want is another Java

What we don't need/want is another Java or any bastardization of Java, what we need is a new language(s) with a less ad-hoc, formalized & expressive type system, a language which is fundamentally purely functional as it's basis/core not the other way round. As Tim mentions most numerical computations in game engines are purely functional anyways and there may only be a few places where state-full computation maybe necessary.

We do not need/want any kind of C++ templates, we can have parametric and ad-hoc polymorphism on one side and a proper macro system for metaprogramming on the other side (if necessary).

I think what we really need to be doing is pay attention to languages like Epigram because of it's dependent type system and languages like Alice ML for all the useful/necessary stuff it brings to large-scale massively concurrent/parallel applications. I think some kind of new dialect and/or hybrid of Epigram, Alice ML, and abit of haskell would be the way to go.

Not everybody agrees

It is interesting to me that folks deep in the bowels of game development are happy doing what appears to me to be banging their heads against rocks. (I'm not meaning to pick on anybody, this is just email I stumbled across.)

That pesky real world keeps on creeping in

From your link:

OK, looonggg, e-mail, but it covers a lot of ground in terms of why
I believe the "Ditching C++" thread is only somewhat viable. I'll ditch
C/C++ in a heartbeat given a better "proven" system though.

It's doubtful you're in a position to "pick on" someone like Tim Sweeney or John Carmack. Like most of us, game developers have to co-exist in the real world where you don't have the luxury to bank a $10 million project on unknowns, especially when you have to interface with 3rd party libraries.

It's doubtful you're in a

It's doubtful you're in a position to "pick on" someone like Tim Sweeney or John Carmack. Like most of us, game developers have to co-exist in the real world where you don't have the luxury to bank a $10 million project on unknowns, especially when you have to interface with 3rd party libraries.

Felix seems like a viable alternative in that case.

The average project

The average project manager's response is going to be "WTF is that?!", followed by an immediate no. It's still essentially an unknown.

$ doesn't mean Right

i'm guessing the same general attitude was around about whatever people used before they used Erlang or O'Caml or even, shudder, Java at some point (and probably still is, in some cases). part of the problem with there not being a tipping point vs. there being one is that people try to only play it safe.

(i don't particularly disagree with what you wrote, i just think it is all pretty subjective in the end. in other words, there are always costs and pains, and you have to choose which fight you feel the most interested in fighting. personally i don't like the C/++ fights, and would rather fight e.g. porting the JVM to a new host so I can write Clojure or Scala, etc. but as you say i am not J.C. etc.)

So what are we waiting for?

I'v seen things like this before.
My question is : If this things are so cool, what are you waiting for? Go for it!
(Personally I think there should be a Microsoft or Sun or other buzz-generator guys to push something.
There are many things in computer science that can be practically useful. But no one is aware of them.)
After all nothing is going to happen.

Social engineering

Personally I think there should be a Microsoft or Sun or other buzz-generator guys to push something.

You've got it. Everyone can write a programming language. The hard part is to social engineer the mainstream.

Google "site:lambda-the-ultimate.org tim sweeney"...

...and you'll see that Tim isn't waiting for anything. Keep an eye on Unreal technology 4, and encourage him to spin his language research off from it.

2009 stopping in to say hi!

Welp, 2009 is here, and he says there's still no 20+ CPUs to deal with and that C++ still does everything fantastically.

At work I use 16 and 32 core

At work I use 16 and 32 core CPUs. More interesting, on my desk, is a quadcore CPU targeted for phones. There's an increasingly chicken-egg scenario between these devices and software for them.

C++ hasn't done "everything

C++ hasn't done "everything fantastically" at any point in history.

ADA

Ada was built from the ground up for reliability, so much so that it specializes in applications that people's lives depend on. It compiles to native code, supports useful features like OOP and operator overloading, is one of the easier languages to use for interfacing with C and C++ code, has built-in support for concurrency, and is very portable to different platforms ranging from desktops to embedded systems. If Ada can fly an airplane I don't see why it can't run a flight sim as well.

The main drawback I can see? Irritatingly verbose syntax, probably in part due to Pascal's influence and in part due to the desire to avoid characters (like {}) not present on a European 7-bit terminal's keyboard (whence C's oft-lamented trigraphs).

I'm sure there are other drawbacks that I'm not aware of (I admit I haven't worked much in the language myself) but it does seem to me that Ada is an oft-overlooked alternative to C++ that supports its most important features while avoiding most of its misfeatures. (If you can tell me differently, I'm all ears.)

re: ada syntax

i (still) want a Lisp Flavo{u}red Ada front-end (with repl, of course).

Some languages solve all problems

but not all problems are best solved in those languages.

But, to stay in the mood of the current swing. I hereby vote for assembly. ;)

Don't Scoff

We - that is game developers - use assembly quite frequently. No reason for any sarcasm, assembly is sometimes the right tool for a job.

Wasn't

It was a joke/response to the C++ post. As I said, the problem defines the (solution) language, not the reverse. I agree with you.

We - that is game

We - that is game developers - use assembly quite frequently. No reason for any sarcasm, assembly is sometimes the right tool for a job.

It might be because you don't have the proper generator tools to write the assembly for you. I often wind up writing generators to generate GPU code, but its much better than the alternative (writing the GPU code raw, even GLSL/HLSL is often too low level for comfort).

Actually, I would be interested to know the extent of generative programming in game development: is it common? Is it found to not be useful (easier to write the assembly rather than the generator for the assembly)? Or maybe they often just don't have the PL people on staff to write the generators?

We use a shader builder for

We use a shader builder for microcode generation written in C++. The main problem with it is that if you want to change a shader (add some feature, optimize etc.) you first have to change the C++ code that generates the shader, compile (takes forever, being C++) and then run the shader generator on the input (a Maya shader node), and then load it in the game, and only after this ~5 minute process is complete can you see the result.

The upside is that generating snippets of highly optimized microcode that can be pasted together (with lots of tricky glue code) is usually a hell of a lot more efficient than HLSL (factors of 2x are not uncommon, but on average it's probably more in the range of 10-15%), but it's a bit of a nightmare to iterate with it.

So we're moving away from this. From now on we'll concentrate on making it easier to write shaders manually (in HLSL or microcode) and iterate rapidly. It's possible that we'll end up with fewer, but more general, shaders where we rely on static branches (essentially boolean flags) to turn features on and off, and in theory this will lead to crappier shaders (especially register usage), but the smaller number of shaders (as they're not generated), OTOH, will hopefully lead to larger batch sizes and fewer state changes which is probably a much bigger problem in the grand scheme of things.

I'm toying with the idea of writing my own HLSL compiler since the stock one is so rubbish. That's probably a fruitful place to spend effort.