automatic program parallelization for multicore cpus as a software problem

I just came across an article about Intel having massively multicore CPUs ready in the near future. CPUs of this kind will require a fundamental change in how we write programs, or so they say (there are plenty of discussions around about this).

But do we really have to change the way we write software? I wonder if there could be a part of the chip that automatically makes different threads of execution out of a single thread, by correlating data dependencies at run time. Just like branch prediction, a similar piece of logic could be used to 'predict' data correlations and thus separate the instruction stream into different threads, as if the code were multithreaded. A special lookaside buffer could be used to prevent simultaneous access to memory locations: since data are always fetched from the cache, the cache itself could contain flags that are used to monitor simultaneous access. The CPU would catch the simultaneous access event and modify its threads and prediction statistics accordingly, so that the simultaneous access is avoided next time.

I am asking LtU about this because I think the problem is essentially a software problem. In my mind, an instruction always targets a memory location (either directly or indirectly through registers), so the CPU could monitor dependencies between instructions and separate the instruction stream into different threads of execution accordingly. The mechanism could be extended to registers, since registers are essentially memory locations inside the CPU.

Assuming that dependency tracking could take place at run time, what programming language semantics are required in order to help the hardware run the software more efficiently? Are C's semantics enough? For example, Fortran loops can be automatically vectorized because the language lets the compiler assume that arrays passed to a procedure do not alias.

Finally, are purely functional programming languages better, at the semantic level, for automatic parallelization? On the surface, it certainly seems so, but what about languages that are translated to C (or equivalent) code? Aren't those programs under the same constraints as C? Would we have to eliminate C as the middleman and encode parallelization hints (coming straight from the FP semantics) directly in the instruction stream?

By Achilleas Margaritis at 2007-06-15 14:21 | LtU Forum
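To make the aliasing question concrete, here is a minimal sketch in C99. The function names `add_one` and `add_one_restrict` are made up for illustration and do not come from any library or from the article mentioned above. The first version must behave correctly even when `dst` and `src` overlap, so the compiler has to keep the iterations in order, or guard a vectorized version with a run-time overlap check. The second version uses `restrict` to give the compiler roughly the no-alias promise that Fortran provides by default.

```c
#include <stddef.h>

/* Hypothetical example functions, not taken from any real API. */

/* dst and src may overlap. If dst == src + 1, iteration i reads the value
 * that iteration i-1 just wrote, so the loop carries a dependency and the
 * iterations cannot be freely reordered or run in parallel. */
void add_one(float *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1.0f;
}

/* restrict is the programmer's promise that dst and src do not overlap.
 * Every iteration is then independent, and the compiler is free to
 * vectorize. Calling this with overlapping arrays is undefined behavior. */
void add_one_restrict(float *restrict dst, const float *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1.0f;
}
```

Calling `add_one(a + 1, a, 4)` on a zero-filled array of five floats leaves 0, 1, 2, 3, 4 in `a`, because each iteration consumes the previous one's result. That is exactly the kind of dependency the proposed hardware would have to discover at run time, and the kind that `restrict` (or Fortran's argument rules, or purity in a functional language) lets the compiler rule out ahead of time.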