High level languages with optimal code generation

One of the issues that arises during the implementation of programs that need to have high performance, is that a large percentage of time is spent doing and debugging optimizations.

These range from reorganizing data, reorganizing control flow, and sometimes even going as far as using inline assembly.

This is work seems to be very error prone and quite difficult to do, with a lot of specialized knowledge that is necessary to achieve good performance.

A good example of the difficulties of this process can be seen in the series handmade hero.
From about episode 590 to about episode 618. Much of the work done in the episodes is about doing optimizations.

How can we make it easier for compilers to help generate faster code automatically?

I found out about superoptimization, but this seem to be very limited in capability. Specifically, it is very slow, can't deal with control flow and can't do anything about data structures.

Regarding this last point, most languages force us to decide on them, and so a compiler can't even do much about it. A different strategy is used in Compilation of Bottom-Up Evaluation for a Pure Logic Programming Language, where a Turing complete Datalog inspired language is proposed. With this approach, the data structures are selected by the compiler.

What other research looks at this problem, and what other ideas do you have?

Is relational algebra + mapreduce a good way to approach this problem?

How can we have program descriptions that compile into optimal or close to optimal code? Or at least how can we decouple optimizations from the problem description?

P.S. This is a bit of a sequel to the Why is there no widely accepted progress for 50 years?. The answers I got in that thread were very thought provoking and helped to clarify much of my thinking. Thank you very much to all that participated in the discussion!