Punctuated equilibrium in the large scale evolution of programming languages

Sergi Valverde and Ricard Solé, "Punctuated equilibrium in the large scale evolution of programming languages", SFI working paper 2014-09-030

Here we study the large scale historical development of programming languages, which have deeply marked social and technological advances in the last half century. We analyse their historical connections using network theory and reconstructed phylogenetic networks. Using both data analysis and network modelling, it is shown that their evolution is highly uneven, marked by innovation events where new languages are created out of improved combinations of different structural components belonging to previous languages. These radiation events occur in a bursty pattern and are tied to novel technological and social niches. The method can be extrapolated to other systems and consistently captures the major classes of languages and the widespread horizontal design exchanges, revealing a punctuated evolutionary path.

The results developed here are perhaps not that surprising to people familiar with the history of programming languages. But it's interesting to see it all formalized and analyzed.

Raw data

Wow, more like this please :)

I find Figure 1 quite confusing, as it is meant to illustrate the directed relationships between languages. My confusion is that it is clearly a spring-model layout of an undirected graph, so it does not seem to convey much information about the raw data that the paper is based on. Skimming Wikipedia, I can see that they have scraped the "Influenced-by" metadata from each of the programming language pages; this seems to be quite an informative dataset. An alternative view of the same data can be found here, which perhaps gives some intuition about their dataset.
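To make the direction of the "Influenced-by" relation concrete, here is a minimal Python sketch of turning such metadata into a directed edge list. The three entries are a hand-picked toy sample for illustration only, not the paper's dataset, and the names `influenced_by`, `build_edges` and `out_degree` are hypothetical.

```python
from collections import defaultdict

# Toy sample, hand-picked for illustration; NOT the dataset used in the paper.
# Each key is a language, each value lists the languages that influenced it.
influenced_by = {
    "B": ["BCPL"],
    "C": ["B"],
    "C++": ["C", "Simula"],
}


def build_edges(influenced_by):
    """Return directed edges (source, target), read as 'source influenced target'."""
    return [
        (source, language)
        for language, sources in influenced_by.items()
        for source in sources
    ]


def out_degree(edges):
    """Count, for each language, how many languages it directly influenced."""
    counts = defaultdict(int)
    for source, _target in edges:
        counts[source] += 1
    return dict(counts)


edges = build_edges(influenced_by)
```

The point of keeping `(source, target)` pairs is that `("C", "C++")` and `("C++", "C")` are different edges, which is exactly the information an undirected layout throws away.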


I have to admit I'm a little skeptical about the source (Wikipedia) of their dataset. OTOH, I'm not sure where else you'd get a similar kind of dataset that is (a) as comprehensive, and (b) as widely reviewed by so many people.

evolution and natural selection of languages?

"Darwin’s theory of natural selection has been often used as a basic blueprint for understanding the tempo and mode of cultural change, ..."

And where did Darwin get this idea of 'descent with modification'? Er, actually from philologists who studied the descent of ancient languages.

With genomic change, at least influence is wholly attributable to two parents. With natural language change, maybe a handful. One thing that messes up those nice language 'inheritance' trees is linguistic 'sideways' borrowing of vocabulary -- for example, all those Latin and French words that came into English, whose grammar and sound pattern are more of Germanic/Teutonic descent.

The thing is: all programming languages have come (and gone) within living memory (very nearly).

Darwin's evolutionary approach (wherever it came from) just doesn't apply.

Evolution doesn't require bisexual reproduction.