A friend of mine used to write machine code for Crays, which (as he told it, and as I remember it) had vector operations built in, along with pipelined parallel execution of some sort: you would issue one instruction to load up a vector, do some other work while that was happening, then come back and collect the results. I imagine that writing a compiler for that architecture would present some interesting challenges. Still, I don't see why a reasonably smart compiler shouldn't be able to recognise for-loops like the ones above and optimize accordingly. I know C/C++ ostensibly exposes "real" pointers to "real" memory addresses for manipulation, but that's no reason why a compiler shouldn't treat what looks like pointer-hopping traversal over a linked list as an abstraction rather than as implementation gospel. In other words, forget that C is "portable assembler" and pretend it's a somewhat higher-level language than it really is. As is no doubt evident, I have no idea how difficult this would actually be to do...
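To make the contrast a bit more concrete, here is a rough sketch in C of the two kinds of loop I have in mind. The names (`struct node`, `scale_array`, `scale_list`) are made up for illustration and don't come from anything earlier in the thread. As far as I understand it, compilers like GCC and Clang can already turn the first kind of loop into vector instructions at higher optimization levels, because every element's address is known up front. The second is the hard case: the address of each node only becomes known once the previous node's `next` pointer has been loaded.

```c
#include <stddef.h>

/* Hypothetical list node, for illustration only. */
struct node {
    float value;
    struct node *next;
};

/* Contiguous array: the address of a[i] is just base + i, so the
 * iterations are independent and several elements can, in principle,
 * be processed by one vector instruction. */
void scale_array(float *a, size_t n, float k)
{
    for (size_t i = 0; i < n; i++)
        a[i] *= k;
}

/* Linked list: the address of the next element comes out of the
 * current one (p->next), so each step has to wait for the previous
 * load. This is the "pointer-hopping" traversal. */
void scale_list(struct node *head, float k)
{
    for (struct node *p = head; p != NULL; p = p->next)
        p->value *= k;
}
```

Both functions do the same arithmetic; only the memory layout differs. Treating the list as "an abstraction" would presumably mean the compiler proving it could lay the nodes out contiguously without the program noticing, which is the part I can't judge the difficulty of.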