Concerning introspection and compilation.

My knowledge in the area of languages is limited, which is why I am always visiting this site. Some of the papers and terminology are very helpful and give me many interesting 'over-my-head' reads, so to the editors and commenters: you have my thanks for the education.

I don't have a detailed grasp of introspection, but I feel I have enough knowledge to appreciate its practicality. My main question arises from a separation I see, in dealing with introspection, between so-called compiled languages like C and C++, and dynamic languages like Python and Ruby. Does a language make sacrifices to gain the ability to introspect? And if so, what kind of sacrifices does it have to make? Can languages like C or C++ gain introspection?

Though I guess the ultimate question would be, would languages like C and C++, and in turn their programmers, gain anything substantial from introspection?

I am sorry if my question is not completely clear, so please prod me with questions if the questions need to be refined for better answers.

Regards,

MJ Stahl


Looking for some reflection examples.

I broached the subject of reflection in Alice ML recently and got a couple of responses (here, here, here, and here). If anyone's got some good examples of where reflection and introspection come in handy, I'd be interested.

Java's reflection is necessary in a lot of cases because you can't assign an interface to an object unless the object explicitly lists it as implemented. In Alice ML, the type information in the package allows you to assign a Signature that was not known by the module when it was packed. The unpack routine does a runtime type check to verify that the Signature can match the module before being loaded.

Bytecode manipulation & intercession, metadata & introspection

Java's reflection is necessary in a lot of cases because you can't assign an interface to an object unless the object explicitly lists it as implemented.
One way around this is to subclass the class in question and to list the needed interface (the resulting class is sometimes called a proxy). This can be done at runtime, of course (just play with classloaders and bytecode, or use one of the existing libraries for that). Depending on your requirements this may be better or worse than classical reflection (e.g., if there is some code explicitly referring to the original class, you probably should not proxy it; if all access is via factories, those factories are the ideal place to create proxies).
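A related but different technique from the bytecode-generated subclass described above is the JDK's own java.lang.reflect.Proxy, which builds an object implementing a given interface at runtime and forwards each call reflectively. The following is only a minimal sketch of that variant; Greeter, LegacyGreeter and adapt are hypothetical names made up for illustration, not anything from the thread.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class ProxySketch {
    // Hypothetical interface that the existing class never declared.
    public interface Greeter {
        String greet(String name);
    }

    // Hypothetical class with a matching method but no "implements Greeter".
    public static class LegacyGreeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Wraps target in a runtime-generated proxy that implements iface and
    // forwards every call, matched by name and parameter types, to target.
    static <T> T adapt(Object target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            Method m = target.getClass()
                    .getMethod(method.getName(), method.getParameterTypes());
            return m.invoke(target, args);
        };
        Object p = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
        return iface.cast(p);
    }

    public static void main(String[] args) {
        Greeter g = adapt(new LegacyGreeter(), Greeter.class);
        System.out.println(g.greet("world")); // prints "Hello, world"
    }
}
```

Unlike the subclassing approach, the proxy here is not an instance of LegacyGreeter, so it only helps where callers depend on the interface alone.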

The main use of reflection that comes to mind other than proxying is, um, introspection :-) I mean finding information about a runtime object without having this information statically known to the introspector (e.g., UI builder). The boundary between introspection and just metadata may be pretty fuzzy sometimes. In JavaBeans, JDO, and some implementations of EJB the same information can be obtained by either introspection or reading of explicit "descriptors".
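As a rough illustration of what a UI builder might do, here is a sketch that discovers an object's properties at runtime by looking for JavaBeans-style getters. The Component class and the readProperties helper are hypothetical, and the "get" prefix rule is a simplification of the real JavaBeans conventions.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class IntrospectSketch {
    // Hypothetical component whose properties are not known to the caller.
    public static class Component {
        public String getName() { return "pump"; }
        public int getCapacity() { return 40; }
    }

    // Collects property name -> current value for every public no-argument
    // getter, skipping methods inherited from Object (such as getClass).
    static Map<String, Object> readProperties(Object bean)
            throws ReflectiveOperationException {
        Map<String, Object> props = new TreeMap<>();
        for (Method m : bean.getClass().getMethods()) {
            String n = m.getName();
            boolean isGetter = n.startsWith("get") && n.length() > 3
                    && m.getParameterCount() == 0
                    && m.getReturnType() != void.class
                    && m.getDeclaringClass() != Object.class;
            if (isGetter) {
                String prop = Character.toLowerCase(n.charAt(3)) + n.substring(4);
                props.put(prop, m.invoke(bean));
            }
        }
        return props;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(readProperties(new Component()));
        // prints {capacity=40, name=pump}
    }
}
```

An explicit descriptor would carry the same name-to-type information in a separate file, which is why the line between introspection and plain metadata gets fuzzy.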

Motivation and background

You might want to follow the references from this thread to gain more background (not sure which of the B* languages is the best to start with, though).

Reflection, Java and C++

Reflection is more needed in Java than in C++, because the target audience is different: Java is a more dynamic environment...C++ is a 'low level' programming language with 'high level' constructs.

The cool thing about Java reflection is that the reflection bytecodes are recognized by the JVM and replaced with the actual code, and therefore optimized as if the reflected method call was written in a direct way. This is not possible with C++ because all compilation is static.

Sometimes reflection is needed even in C++, though. Here is an example: in one of the apps we have at work, the user is able to build scenarios out of predefined components. These components have properties. A previous version of the app that worked without reflection spanned many thousands of lines, because the GUI for every component property had to be coded manually. The latest version uses reflection to find out the properties of each component and builds the GUI automatically at run time: a change in a property is directly visible in the GUI. The whole thing cut the code size drastically (thanks to Qt's reflection).

Since Qt uses reflection so easily, I suspect that reflection/introspection can live in languages like C++ without many sacrifices; especially in the case of objects that already contain a pointer to class information, i.e. the vtable.

Reflection bytecodes

The cool thing about Java reflection is that the reflection bytecodes are recognized by the JVM and replaced with the actual code, and therefore optimized as if the reflected method call was written in a direct way.
Which of JVM instructions are specific to reflection?

Or did you mean that the implementation is free to replace calls to specific methods (like Method.invoke()) by a direct invocation of the reified method? Where in the spec is such an option given?

Or did you mean that the implementation is free to replace interpretation by compilation to native code (which is of course true, but is not specific to reflection, though interplaying with it)?

I meant that a JVM can replac

I meant that a JVM can replace 'Method.invoke(object, foo)' with '((SomeClass)object).foo()' (SomeClass is the class/interface that declares the 'foo' method).
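To make the two forms concrete, here is a small sketch of the reflective call next to its direct equivalent. Note that in the actual java.lang.reflect API the Method object is the receiver, so the call reads foo.invoke(object) rather than Method.invoke(object, foo). SomeClass and the two helper methods are hypothetical names, and the sketch says nothing about what a given JVM really does internally.

```java
import java.lang.reflect.Method;

class InvokeSketch {
    // Hypothetical class that declares the 'foo' method.
    public static class SomeClass {
        public String foo() { return "foo called"; }
    }

    // Looks up foo by name at runtime and calls it through reflection.
    static Object callReflectively(Object object)
            throws ReflectiveOperationException {
        Method foo = SomeClass.class.getMethod("foo");
        return foo.invoke(object);
    }

    // The direct call that the reflective one is equivalent to.
    static String callDirectly(Object object) {
        return ((SomeClass) object).foo();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Object object = new SomeClass();
        System.out.println(callReflectively(object).equals(callDirectly(object)));
        // prints true
    }
}
```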

I doubt that

...because to be able to do that the JVM must be aware of Method.invoke() semantics (which is not part of the JVM spec, but a part of the Java language spec). What if I come up with my own language that runs on the JVM, and happens to have a class Method with a method invoke that has semantics completely different from those defined by Java?

Speeding up reflection

happens to have a class Method with method invoke

I don't know about that, but you can rely on the fact that the JVM (well - Sun's implementation in particular) does indeed replace certain Java library methods with inline processor code. There is, for example, an entire set of methods in java.lang.Math which are "intrinsified" by the JVM. You can hit up Azeem Jiva - one of Sun's Hotspot compiler engineers - for a complete list.

I am fairly certain that the optimizations to reflection in 1.4 were of a similar kind - i.e. specializing the VM's treatment of particular library calls. In general, I think you'd be surprised at the number of Java-language specific optimizations made in JVMs.

-- Java.Next

More on reflection

Or did you mean that the implementation is free to replace calls to specific methods (like Method.invoke()) by a direct invocation of the reified method?

I believe that from a bird's eye view, this is what happens, but the actual technique is more sophisticated and slightly more expensive.

Where in the spec is such an option given?

Any implementation techniques described in the JVM specification would be non-normative. You're free to obtain the correct results in whichever way you choose to implement your JVM. That said, let me belabor the point that the dirty details of performing reflection optimization involve more than a simple replacement of Method.invoke( foo, bar ) with foo.bar().

-- Java.Next

Space

A large part of introspection is just having type metadata. "Language-level introspection support" is what you call it when the compiler creates all that data for you.

As far as I can tell, a general-purpose programming language doesn't have to make any fundamental "sacrifices" to accommodate introspection. At the level of C++'s RTTI, the only additional cost is the space that the extra metadata takes up.

At the level of Java's reflection, I think you have to start making tradeoffs: are you willing to reduce the efficiency of "normal" code to speed up the performance of the introspection functionality?

Once you get into Python-style reflection, where you can modify types at runtime, the tradeoffs become more drastic (though I don't know if this is still considered "reflection"). Such invasive runtime capabilities preclude many optimizations.

Sacrifices of reflection

Your code can be reverse engineered more easily if you support reflection (some would see that as a benefit!). Obfuscation can mitigate this, but in practice symbols are often tied into the implementation in such a way that the code cannot be obfuscated (e.g. using reflection to link database field names of a record with attribute names of an object).
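A small sketch of the database-field case may make the coupling clearer. Here a record (modelled as a plain Map, an assumption for illustration) is copied into an object by matching keys against field names; Customer and fromRecord are hypothetical. If an obfuscator renamed the name or age fields, the lookup by string would fail at runtime.

```java
import java.lang.reflect.Field;
import java.util.Map;

class RecordMapperSketch {
    // Hypothetical object whose attribute names mirror the record's field names.
    public static class Customer {
        public String name;
        public int age;
    }

    // Creates an instance of type and sets each field whose name equals a
    // key in the record. Throws NoSuchFieldException if a key has no
    // matching field, which is what happens once field names are obfuscated.
    static <T> T fromRecord(Map<String, Object> record, Class<T> type)
            throws ReflectiveOperationException {
        T obj = type.getDeclaredConstructor().newInstance();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            Field f = type.getDeclaredField(e.getKey());
            f.setAccessible(true);
            f.set(obj, e.getValue());
        }
        return obj;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Map<String, Object> record = Map.of("name", "Ada", "age", 36);
        Customer c = fromRecord(record, Customer.class);
        System.out.println(c.name + " " + c.age); // prints "Ada 36"
    }
}
```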