Hello,
New here, so if I missed the point of this place, apologies. I'm a hobbyist that's interested in languages in general.
I've been researching the concepts behind parsers and compilers for a few years and I'm even working on a few projects
associated to such research: one related to Parser generation, very incomplete, LL(*) attempt will likely use
pre-calculated look-ahead table, I think it'd be similar to PEG logic, but I'd have to actually thoroughly research
that to answer it, and another project related to a generalized high-level abstract syntax tree which provides an
abstract typing model with a Common Language Infrastructure (.NET) implementation and an intermediate code
implementation.
The focus of this topic is the latter project, the compiler framework. The end goal of this project is to provide
a near C#-level representation of a code document object model (not to be confused with Microsoft's sad CodeDOM
framework) for static languages (aptly named, the Static Language Framework, SLF for short.)
*end introduction*
I've been searching for a while for a place that can give me some realistic feedback about a project I've been
working on called Abstraction: a Static Language Framework, I hope to enable everything you'd find in a high-level
static language like C#, classes, structs, et al, lambdas, LINQ, Extension methods, iterators, and so on. Into
this, I want to add a few features to the CLI's definition of Generic type parameters: duck typing.
As of now, I've mostly constructed the high-level intermediate code representation, a unified typing model, a
majority of the expression/statement representations, right now I'm laying the ground work for the linker, which will
be irksome because the available namespaces in scope can change method to method, by design, to accommodate for a
situation that's likely been encountered by a few C# users: extension method overlap. Below is a list of what I hope
to be able to pull off with this project, mind you it's focused on a Document Object Model to describe code and then
compile it, so there's no language-specific logic tying it to VB or C# exclusively (a reason for that below.)
- Multiple File Assemblies
- Duck Typing (Through type-parameters)
- Parametered Constructor Constraints
- Property Constraints
- Method Constraints
- Indexer Constraints
- Event Constraints
This functionality will work by creating a public, but non editor browsable interface
on which 'bridges' can be constructed. Call sites which use the duck-typed generics will
generate a bridge for the types accordingly. Prior to use of the generic the compiler
will check the bridge, for a given closure of the generic-parameters in use, and inject
the compiler-generated implementation of the interface, which will be a basic
member-mapping.
The information associated to the constraints will be embedded on a per-type-parameter private
empty struct which will be linked to the type-parameter through an attribute that points to that
private struct, I know of no other better way since constraint signatures which refer to other
type-parameters can't be embedded in attributes alone: attributes can't typeof(TParam) legally.
- Lambda Expressions
- From my understanding of lambda expressions, closures are created out of the given scope of the
lambda, and based upon scope nesting, the locals of the method are hoised into a generated
class, the lambda expression being a method within that class, the internal links to the locals
within the method are redirected to a singleton instance (relative to the method call) of the
locals associated to the scope.
- Language Integrated Query
- LINQ is syntactical sugar which is a series of extension methods which resolve based upon
the active extensions in scope. Once rewritten the identity binding is based off of the
methods that match the given sequence of method calls. IQueryable<T> resolves in light
of being IEnumerable<T> because of it being higher in the interface inheritance chain.
- Iterators and Asynchronous methods
Iterators, and Asynchronous Methods respectively, work in an abstract sense, by taking
a given method and chopping it up into sections based off of a predefined state change signaling
symbol of some kind. In iterators it's yield *, in asynchronous methods its await.
In the case of asynchronous methods, it gets a bit trickier, primarily because await is allowed
within expressions, whereas yield * is only allowed at the start of a statement. My guess is this will
require a bit more in-depth analysis on the expression level, adding private locals to the state-machine
that's generated to hoist the previously calculated elements of the, potentially pending, method call.
Example: F(G(), await N()), a local would need to be created to hoist the result of the call to G
to ensure that the execution order is preserved. Exceptions can be made to fields since there is no chance of
state change occuring, as there is no way to capture gets from fields.
The last four main items above are the major points I wanted to list because they're the main items I might have
misconceptions about, and they're the toughest of the bunch, from what I can tell.
The reason there's no language-specific semantics implied within this framework is a secondary goal: code-generation.
To ensure that when you write code for a specific language you get the proper result when encoding expressions,
language-specific binary expressions are available (to enforce C# operator precedence, you'd use C# expressions
of such.)
I've already done some previous testing with things like constant folding, short circuit logic and the like.
I'm more hopeful that the folks here can provide some, much needed, insight, perhaps language suggestions (being most
people here are interested in language design or languages in general, it should keep the suggestions realistic.)
Recent comments
22 weeks 6 days ago
22 weeks 6 days ago
22 weeks 6 days ago
45 weeks 14 hours ago
49 weeks 2 days ago
50 weeks 6 days ago
50 weeks 6 days ago
1 year 1 week ago
1 year 6 weeks ago
1 year 6 weeks ago