Agglutination and ethnocide

My editorial

Yesterday, in my update in French, I started discussing some literature, which I came by recently, devoted to the issue of quantitative research in long-term social changes (see “La guerre, l’espace, et l’évolution des sociétés” ). As we are talking long-term, this stream of research comes mostly from history. I am currently reviewing one of those papers, entitled ‘War, space, and the evolution of Old World complex societies’ (Turchin et al. 2013[1]). To me, science is, at the end of the day, a method of discovering things. When I see a piece of research done by other scientists, I most of all look for methods. In this precise case, the method is quite illuminating for my own purposes in research. At the baseline of their methodology, Turchin et al. divide big populations in big territories into basic, local cells, equivalent to local communities, and assess three essential probabilities, namely that of coordination occurring between two or more cells, as opposed to the probability of disintegration in such coordinated structures, as well as the probability of at least one cell being destroyed by others. Two other probabilities come as instrumental in calculating the fundamental three: the probabilities of social mutation. Turchin et al. construe the concept of social mutation around that of valuation. At any given moment, there is a set of traits, in a society, which make this society optimally competitive, accounting for requirements stemming from the environment. Any given society develops its own traits through valuating them in its own culture, or, conversely, disintegrates some traits by culturally denying their value. As I understand this methodology by Turchin et al., the concept of valuing some societal traits or disvaluing them is a compound, covering both the strictly spoken ethical valuation, and the actions informative about it (investment, creation or disintegration of specific social structures etc.).

In short, social mutation is supposed to be something akin to genetic mutation. There is a set of traits, in a society, and each of those traits can be switched on, or switched off. This is the social code. I am trying to represent it below, in a semi-graphical example, where ‘1’ stands for the given trait being switched on, and ‘0’ for its deactivation.

Trait A     >> 1

Trait B     >> 0

Trait C     >> 0

Trait D     >> 1 etc.

Each society has such a social code, and, in the background, there is some kind of implicitly optimal code, the one that makes the top dog in the pack of societies. The local social code, observable in any given society, displays some Euclidean distance from this optimal code. Putting it simply, in this world, when you are a society, you have every interest in having the right traits switched on at ‘1’, with the not-quite-favourable ones switched off, i.e. at ‘0’. What Turchin et al. assess, for any given local society studied empirically, is the probability of favourable traits passing from 0 to 1 (µ01: functional mutation), or, conversely, being deactivated from 1 to 0 (µ10: dysfunctional mutation). This specific methodology allows setting baseline probabilities as well. If the general assumption is that societies have a tendency to f**k things up, rather than figuring them out correctly (this is, by the way, what Turchin et al. assume), then µ10 > µ01. If, on the other hand, we have some optimism as for collective intelligence, we can settle for µ10 < µ01. Of course, µ10 = µ01 is a compromise at the weakest possible level of assuming anything. Anyway, the proportions between those probabilities, namely µ01 and µ10, determine the overall likelihood for the emergence of large political structures, the ‘ultrasocial’ ones, as Turchin et al. call them (you know: army, taxes, government etc.) in a given set of local communities. Those chances are calculated as: u = µ01/(µ01 + µ10). The ‘u’ symbol comes from that ‘ultrasocial’ adjective. The baseline probabilities in the model, as they come from empirical tests, are: µ01 = 0,0001 and µ10 = 0,002. That makes the likelihood u = 0,05. In other words, in a given set of local communities, a priori not connected by ultrasocial institutions, which, in turn, could stimulate the emergence of political systems, the likelihood that such institutions are triggered is about 5%.
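To make those two ideas tangible, here is a minimal sketch of mine, not anything taken from the paper. It computes the Euclidean distance between a local social code and an optimal one, and then the likelihood u from the baseline probabilities quoted above. The ‘optimal’ code and the function names are purely hypothetical, and decimal commas are written as dots, as Python requires.

```python
import math

def code_distance(local, optimal):
    """Euclidean distance between two 0/1 trait vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(local, optimal)))

def ultrasocial_share(mu01, mu10):
    """u = mu01 / (mu01 + mu10), the formula quoted in the text."""
    return mu01 / (mu01 + mu10)

local_code = [1, 0, 0, 1]    # traits A, B, C, D from the example above
optimal_code = [1, 1, 0, 1]  # hypothetical optimum, for illustration only

distance = code_distance(local_code, optimal_code)
u_baseline = ultrasocial_share(0.0001, 0.002)  # baseline values from the text

print("distance from the optimal code:", distance)
print("baseline u:", round(u_baseline, 4))
```

With one trait out of four set the wrong way, the distance is 1, and the baseline u comes out just under 0,05, which is the rounded figure used in the text.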

On the grounds of these findings by Turchin et al., I start my own reasoning. Just hold on to something, ‘cause my reasoning, it can really get some swing, on account of me having that curious ape inside of me. Anyway, I am translating that tiny u = 0,05 likelihood into the possible behaviour of large human populations living in a territory. Some 5% of those humans, whoever they are, are likely to develop social traits, which can turn them into ultrasocial political systems. This, in turn, means that in every large collection of local communities a relatively small, ultrasocial core is likely to emerge, and this core is going to agglutinate around itself consecutive local communities, to make something really political. Still, the actual empirical results obtained by Turchin et al. are way above those baseline probabilities. The likelihood of turning on the right ultrasocial genes in a given society turns out to be like 0,47 ≤ µ01 ≤ 0,51, and the probability µ10 of switching them off ranges from 0,49 to 0,52. That makes the likelihood u = µ01/(µ01 + µ10) float consistently close to 50%. For comparison, if you take 1 million primitive, proto-political people (voters), the baseline likelihood of some among them turning into serious political players, i.e. of turning on the right ultrasocial traits, is like µ01 = 0,0001, whilst the probability of them consistently not giving a s*** about going political is µ10 = 0,002, which, in turn, makes that likelihood u = 0,05 of anything seriously political going on in those 1 million people. Now, my internal curious ape spots a detail in the article: those baseline probabilities correspond to something that Turchin et al. call ‘equilibrium’. As an economist, I have a very down-to-earth approach to equilibriums: it would be nice if they existed in reality, but most of the time they don’t, and we have just a neighbourhood of equilibrium, and even that only if we are lucky.

I put, now, those two sets of numbers back to back, i.e. the parameters of equilibrium against those empirically inferable from actual historical data. One conclusion jumps to the eye: in real life, we, humans, tend to be some 10 times more prone to do politics in large structures, than we are technically expected to be in the state of equilibrium (whatever is being balanced in that equilibrium). By the way, and to be quite honest in relation to that article by Turchin et al., agglutination around the political core is not the only option actually available. Ethnocide is another one, and, sadly enough, quite recurrent in the historical perspective. Recurrence means, in the results obtained by Turchin et al., a likelihood of ethnocide varying between emax = 0,41 and emax = 0,56 in communities, which had not triggered on their ultrasocial traits at the right moment. This is sad, but seems to be rock solid in that empirical research.
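The ‘some 10 times’ can be checked with a few lines of arithmetic. This is only a sketch under my own assumption about how to pair the extremes of the two empirical ranges (lowest µ01 with highest µ10, and the other way round); the paper itself is not quoted as doing it this way.

```python
def ultrasocial_share(mu01, mu10):
    """u = mu01 / (mu01 + mu10), the formula quoted in the text."""
    return mu01 / (mu01 + mu10)

# Baseline (equilibrium) probabilities quoted in the text
u_baseline = ultrasocial_share(0.0001, 0.002)

# Empirical ranges quoted in the text: 0.47 <= mu01 <= 0.51, 0.49 <= mu10 <= 0.52
u_low = ultrasocial_share(0.47, 0.52)   # least favourable pairing (my assumption)
u_high = ultrasocial_share(0.51, 0.49)  # most favourable pairing (my assumption)

ratio_low = u_low / u_baseline
ratio_high = u_high / u_baseline

print(f"empirical u between {u_low:.3f} and {u_high:.3f}")
print(f"that is {ratio_low:.1f} to {ratio_high:.1f} times the baseline")
```

Both ends of the range land at roughly ten times the baseline likelihood.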

[1] Turchin, P., Currie, T. E., Turner, E. A. L., Gavrilets, S., 2013, War, space, and the evolution of Old World complex societies, Proceedings of the National Academy of Sciences, vol. 110, no. 41, pp. 16384–16389

I cannot prove we’re smart


I am preparing an article, which presents, in a more elegant and disciplined form, that evolutionary model of technological change. I am going once again through all the observation, guessing and econometric testing. My current purpose is to find simple, intelligible premises that all my thinking started from. ‘Simple and intelligible’ means sort of hard, irrefutable facts, or, foggy, unresolved questions in the available literature. This is the point, in scientific research, when I am coining up statements like: ‘I took on that issue in my research, because facts A,B, C suggest something interesting, and the available literature remains silent or undecided about it’. So now, I am trying to reconstruct my own thinking and explain, to whomever would read my article, why the hell did I adopt that evolutionary perspective. This is the point when doing science as pure research is being transformed into scientific writing and communication.

Thus, facts should come first. The Schumpeterian process of technological progress can be decomposed into three parts: the exogenous scientific input of invention, the resulting replacement of established technologies, and the ultimate growth in productivity. Empirical data provides a puzzling image of those three sub-processes in the modern economy. Data published by the World Bank regarding science, research and development show, for example, a consistently growing number of patent applications per one million people in the global economy (see ). On the other hand, Penn Tables 9.0 (Feenstra et al. 2015[1]) make it possible to compute a steadily growing amount of aggregate amortization per capita, as well as a growing share of aggregate amortization in the global GDP (see Table 1 in the Appendix). Still, the same Penn Tables 9.0 indicate unequivocally that the mean value of Total Factor Productivity across the global economy decreased consistently from 1979 through 2014.

Of course, there are alternative views of measuring efficiency in economic activity. It is possible, for example, to consider energy efficiency as informative about technological progress, and the World Bank publishes the relevant statistics, such as energy use per capita, in kilograms of oil equivalent (see ). Here too, the last decades do not seem to have brought any significant slowdown in the growth of energy consumption. The overall energy-efficiency of the global economy, measured with this metric, is decreasing, and there is no technological progress to observe at this level. A still different approach is possible, namely that of measuring technological progress at the very basic level of economic activity, in farming and food supply. The statistics reported by the World Bank as, respectively, the cereal yield per hectare ( see ), and the depth of food deficit per capita (see ), allow noticing a progressive improvement, at the scale of global economy, in those most fundamental metrics of technological performance.

Thus, the very clearly growing effort in research and development, paired with a seemingly accelerating pace of moral ageing in established technologies, occurs together with a decreasing Total Factor Productivity, decreasing energy efficiency, and just very slowly increasing efficiency in farming and food supply chains. Now, in science, there are basically three ways of apprehending facts: the why, the what, and the how. Yes, I know, there is a fourth way, the ‘nonsense!’ one, currently in fashion as ‘this is fake news! we ignore it’. Still, this fourth way is not really science. This is idiocy dressed fancily for an electoral meeting. So, we have three: the why, the what, and the how.

The why, or ‘Why are things happening the way they are?’, is probably the oldest way of starting science. ‘Why?’ is the intuitive way we have of apprehending things we don’t quite understand, like ‘Why is this piece of iron bending after I have left it close to a furnace?’. Probably, that intuitive tendency to ask for reasons reflects the way our brain works. Something happens, and some neurons fire in response. Now, they have to get social and to inform other neurons about that something having happened. Only in the world of neurons, i.e. in our nervous system, the category ‘other neurons to inform’ is quite broad. There are millions of them, in there. Besides, they need synapses to communicate, and synapses are an investment. Sort of a fixed asset. So, neurons have to invest in creating synapses, and they have a wide choice as for where exactly they should branch. As a result, neurons like fixed patterns of communication. Once they make a synaptic connection, they just use it. The ‘why?’ reflects this predilection, as in response we expect ‘Because things happen this way’, i.e. in response to this stimulus we fire that synaptic network, period.

The problem with the ‘why?’ is that it is essentially deterministic. We ask ‘why?’ and we expect ‘Because…’ in return. The ‘Because…’ is supposed to be reassuringly repetitive. Still, it usually is not. We build a ‘Because…’ in response to a ‘why?’, and suddenly something new pops up. Something, which makes the established ‘Because…’ look a little out of place. Something that requires a new ‘Because…’ in response to essentially the same ‘why?’. We end up with many becauses being attached to one why. Picking up the right because for the situation at hand becomes a real issue. Which because is the right because can be logically derived from observation, or illogically derived from our emotional stress due to cognitive dissonance. Did you know that the experience of cognitive dissonance can trigger, in a human being, a stronger stress reaction than the actual danger of death? This is probably why we do science. Anyway, choosing the right because on the illogical grounds of personal emotions leads to metaphysics, whilst an attempt to pick up the right because for the occasion by logical inference from observation leads to the next question: the ‘what?’. What exactly is happening? If we have many becauses to choose between, choosing the right one means adapting our reaction to what is actually taking place.

The ‘what?’ is slightly more modern than the ‘why?’. Probably, mathematics was historically the first attempt to harness the subtleties of the ‘what?’, so we are talking about settled populations, with a division of labour allowing some people to think about things kind of professionally. Anyway, the ‘what?’ amounts to describing reality so that the causal chain of ‘because…’ is decomposed into a sequence. Instead of saying ‘C happens because of B, and B happens because of A’, we state a sequence: A comes first, then comes B, and finally comes C. If we really mean business, we observe probabilities of occurrence and we can make those sequences more complex and more flexible. A happens with a probability of 20%, and then B can happen with a probability of 30%, or B’ can happen at 50% odds, and finally we have a 20% chance that B’’ happens instead. If it is B’’ that happens, it can branch into C, C’ or C’’ with the respective probabilities of X, Y, Z etc.
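Such a branching sequence is easy to write down as numbers. The sketch below is just an illustration of the example above: it takes the 20% for A and the 30% / 50% / 20% for the three B-branches, treats the latter as conditional on A having happened (my reading of the sentence), and multiplies along each branch. The C-level is left out, since X, Y and Z are not specified.

```python
# Probability that A happens at all
p_a = 0.20

# Probabilities of each branch, given that A has happened
p_b_given_a = {"B": 0.30, "B'": 0.50, "B''": 0.20}

# The conditional branches must exhaust all possibilities after A
assert abs(sum(p_b_given_a.values()) - 1.0) < 1e-9

# Probability of each complete path 'A, then branch'
joint = {branch: p_a * p for branch, p in p_b_given_a.items()}

for branch, p in joint.items():
    print(f"P(A then {branch}) = {p:.2f}")
```

The three path probabilities add back up to the 20% of A happening in the first place, which is a handy sanity check on any such tree.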

Statistics are basically a baby of the ‘what?’. As the ‘why?’ is stressful and embarrassingly deterministic, we dodge and duck and dive into the reassuringly cool waters of the ‘what?’. Still, I am not the only one to have a curious ape inside of me. Everyone has, and the curiosity of the curious ape is neurologically wired around the ‘why?’ pattern. So, just to make the ape calm and logical, whilst satisfying its ‘why’-based curiosity, we use the ‘how?’ question. Instead of asking ‘why are things happening the way they are?’, so instead of looking for fixed patterns, we ask ‘how are things happening?’. We are still on the hunt for links between phenomena, but instead of trying to shoot the solid, heavy becauses, we satisfy our ambition with the faster and more flexible hows. The how is the way things happen in a given context. We have all the liberty to compare the hows from different contexts and to look for their mutual similarities and differences. With enough empirical material we can even make a set of connected hows into a family, under a common ‘why?’. Still, even with such generalisations, the how is always a different answer from ‘because…’. The how is always context-specific and always allows other hows to take place in different contexts. The ‘because…’ is much more prone to elbow its way to the front of the crowd and to push the others out of the way.

Returning to my observations about technological change, I can choose, now, between the ‘why?’, the ‘what?’, and the ‘how?’. I can ask ‘Why is this apparent contradiction taking place between the way technological change takes place, and its outcomes in terms of productivity?’. Answering this question directly with a ‘Because…’ means building a full-fledged theory. I do not feel ready for that, yet. All these ideas in my head need more ripening, I can feel it. I have to settle for a ‘what?’, hopefully combined into context-specific hows. Hows run fast, and they change their shape, according to the situation. If you are not quick enough to run after a how, you have to satisfy yourself with the slow, respectable because. Being quick, in science, means having access to empirical data and being able to test your hypotheses quickly. I mean, you can be quick without access to empirical data, but then you just run very quickly after your own shadow. Interesting, but moderately productive.

So I am running after my hows. I have that empirical landscape, where a continuously intensifying experimentation with new technologies leads, apparently, to decreasing productivity. There is a how, camouflaging itself in that landscape. This how assumes that we, as a civilisation, randomly experiment with new technologies, kind of whichever idea comes first, and then we watch the outcomes in terms of productivity. The outcomes are not really good (Total Factor Productivity keeps falling in the global economy) and we still keep experimenting at an accelerating pace. Are we stupid? That would be a tempting because, only I can invert my how. We are experimenting with new technologies at an increasing pace as we face disappointing outcomes in terms of productivity. If technology A brings, in the long run, decreasing productivity, we quickly experiment with A’, A’’, A’’’ etc. Something that we do brings unsatisfactory results. We have two options then. Firstly, we can stop doing what we do, or, in other words, in the presence of decreasing productivity we could stop experimenting with new technologies. Secondly, we can intensify experimentation in order to find efficient ways to do what we do. Facing trouble, we can be passive or try to be clever. Which option is cleverer, at the end of the day? I cast my personal vote for trying to be clever.

Thus, it would turn out that the global innovative effort is an intelligent, collective response to the unsatisfactory outcomes of previous innovative effort. Someone could say that it is irrational to go deeper and deeper into something that does not bring results. That is a rightful objection. I can formulate two answers. First of all, any results come with a delay. If something is not bringing the results we want, we can assume it is not bringing them yet. Science, which allows invention, is in itself quite a recent invention. The scientific paradigm we know today took definitive shape in the 19th century. Earlier, we had basically been using philosophy in order to invent science. It has been only some 150 years that we have been able to use real science to invent new technologies. Maybe it has not been enough to learn how to use science properly. Secondly, there is still the question of what we want. The Schumpeterian paradigm assumes we want increased productivity, but do we really? I can assume, very biologically, what I already signalled in my previous posts: any living species tends to maximize its hold on the environment by absorbing as much energy as possible. Maybe we are not that far from amoeba, after all, and, as a species, we collectively tend towards maximizing our absorption of energy from our environment. From this point of view, technological change that leads to increasing our energy use per capita and to engaging an ever growing amount of capital and labour into the process could be a perfectly rational behaviour.

All that requires assuming collective intelligence in mankind. Proving the existence of intelligence is both hard and easy. On the one hand, culture is proof of intelligence: this is one of the foundational principles in anthropology. From that point of view, we can perfectly assume that the whole human species has collective intelligence. Still, an economist has a problem with this view. In economics, we assume individual choice. Can individual choice be congruent with collective intelligence, i.e. can individual, conscious behaviour change in step with collective decisions? Well, we did the Renaissance, didn’t we? We did electricity, we did vaccines, we did religions, didn’t we? I use the expression ‘we did’ and not ‘we made’, because it wasn’t that one day in the 15th century we collectively decided that from now on, we sculpt people with no clothes on and we build cathedrals on pseudo-ancient columns. Many people made individual choices, and those individual choices turned out to be mutually congruent, and produced a coherent social change, and so we have Tesla and Barbie dolls today.

Now, this is the easy part. The difficult one consists in passing from those general intuitions, which, in the scientific world, are hypotheses, to empirical verification. Honestly, I cannot directly, empirically prove we are collectively intelligent. I reviewed thoroughly the empirical data I have access to and I found nothing that could serve as direct proof of collective intelligence in mankind. Maybe this is because I don’t know how exactly I could formulate the null hypothesis, here. Would it be that we are collectively dumb? Milton Friedman would say that in such a case, I have two options: forget it or do just as if. In other words, I can drop entirely the hypothesis of collective intelligence, with all its ramifications, or construe a model implying its veracity, so treating this hypothesis as an actual assumption, and see how this model fits, in confrontation with facts. In economics, we have that assumption of efficient markets. Right, I agree, they are not necessarily perfectly efficient, those markets, but they arrange prices and quantities in a predictable way. We have the assumption of rational institutions. In general, we assume that a collection of individual acts can produce coherent social action. Thus, we always somehow imply the existence of collective intelligence in our doings. Dropping entirely this hypothesis would be excessive. So I stay with doing just as if.

[1] Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), “The Next Generation of the Penn World Table” American Economic Review, 105(10), 3150-3182, available for download at

Money essentially doesn’t give a s***


I am interpreting my empirical findings about that evolutionary model of technological change. So far, it seems to make a logical structure, all those econometric tests. Yesterday, as I was presenting a research update in French (see “L’invention mâle de modèles”), I wanted to test the hypothesis that different social structures yield different selection functions, with different equilibriums between the number of patent applications and the capital invested in fixed assets. I added to my initial model two more variables, which I consider as informative about social structure: the density of population, and the depth of food deficit. It turned out quite interesting, although with no surprises. Higher density of population favours a greater number of patent applications, whilst the food deficit works in the opposite way. In the first case, the corresponding correlation seems to be rock-solid in terms of significance (p < 0,001). The second structural variable, the depth of food deficit, seems a bit wobbly in its correlation, though. With a significance level p = 0,124, the null hypothesis is dangerously close.

You probably already know that I have three inside of me: the curious ape, the austere monk, and the happy bulldog. Those last days, the bulldog could really have had some fun, with all that quantitative data to rummage through and test. The ape and the monk are sitting now, observing the bulldog running after sparse pieces of data, and they are having a conversation. ‘You know what, ape?’ the monk opens up, ‘I am thinking about how far is the obvious from the truth. Catching my drift, somehow, are you?’. ‘Ooookh’, answers the ape. ‘Sure, you are absolutely right’, continues the monk, ‘When I have cut off the bullshit, with that Ockham’s razor, there is still plenty of knowable things left. Take this case: the model looks nice, on the whole, and still I have some doubts. Aren’t we leaving some truth behind?’. ‘Ooookh, oookh!’, the ape is definitely developing a theoretical stance here, which inspires the monk. ‘Right you are, once again, ape. That significant role of labour compensation, in our evolutionary model, suggests that we could consider labour, not capital, as the set of female organisms, which recombine the genetic code of technologies, transmitted in male patent applications. Good! So we take away capital, put the supply of labour instead, in the model, and we see what happens’.

The monk gets on his feet, eager to start. The ape smiles, taps him gently on the shoulder, and forces him to fold his razor (the Ockham’s razor) back into his pocket. Safety first. The ape points at the bulldog. The monk nods. This is going to be bulldog’s task, too. A bit of play with the data, again. So I start, me with those three in me. I reformulate my basic hypothesis: evolutionary selection of new technologies works as an interaction between a set of female organizations of labour force, and a set of male organisms generating intelligible blueprints of new technologies. First, just as I studied the velocity of capital across patentable inventions, I study now the velocity of labour. I take my database made of Penn Tables 9.0 (Feenstra et al. 2015[1]) and additional data from the World Bank. In the database, I select two variables: ‘emp’ and ‘avh’.  The first one stands for the number of jobs in the economy, the second for the number of hours worked, on average, by one employee in one year. As I multiply those two, so as I compute ‘emp’*’avh’, I get the total supply of labour in one year.

Now, I make a ratio: supply of labour per one resident patent application. This is my velocity of labour across the units of patentable invention. I compute the mean and the variance of this variable, for each year separately. I put it back to back with the mean and the variance of capital ‘ck’ per one patent application. Right here you can download the corresponding Excel spreadsheet from my Google Disc. Interesting things appear. The mean values of those two ratios are significantly correlated: their Pearson product-moment correlation coefficient is r = 0,557779846. Their respective variabilities, or the thing that happens when I divide the square root of variance by the mean, are even more significantly correlated: r = 0,748282791. Those two ratios seem to represent two inter-correlated equilibriums, which share a common structure in space (variability for each year is variability between countries). Thus, it would be interesting to follow the logic of the production function, in that evolutionary modelling of mine.
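For anyone who wants to replay that procedure on their own data, here is a rough sketch of the steps: labour supply as ‘emp’ times ‘avh’, the two velocities per patent application, their yearly mean and variability, and the Pearson correlation between the yearly series. The rows are made-up toy numbers, not my database, the function names are hypothetical, and I use the population variance here, which is an assumption (the sample variance would work just as well).

```python
import math
from statistics import mean, pvariance

def variability(values):
    """Square root of the variance divided by the mean."""
    return math.sqrt(pvariance(values)) / mean(values)

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical toy rows: (year, emp, avh, ck, resident patent applications)
rows = [
    (2000, 10.0, 1800.0, 500.0, 100.0),
    (2000, 20.0, 1700.0, 900.0, 400.0),
    (2001, 11.0, 1790.0, 520.0, 120.0),
    (2001, 21.0, 1690.0, 950.0, 380.0),
    (2002, 12.0, 1780.0, 560.0, 110.0),
    (2002, 22.0, 1680.0, 990.0, 450.0),
]

labour_by_year, capital_by_year = {}, {}
for year, emp, avh, ck, patents in rows:
    # Velocity = production factor per one patent application
    labour_by_year.setdefault(year, []).append(emp * avh / patents)
    capital_by_year.setdefault(year, []).append(ck / patents)

years = sorted(labour_by_year)
r_means = pearson_r([mean(labour_by_year[y]) for y in years],
                    [mean(capital_by_year[y]) for y in years])
r_variability = pearson_r([variability(labour_by_year[y]) for y in years],
                          [variability(capital_by_year[y]) for y in years])

print("correlation of yearly means:", r_means)
print("correlation of yearly variabilities:", r_variability)
```

With real data, each year would hold one observation per country, so the yearly variability is indeed variability between countries.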

The next thing I do is to play the surgeon. I remove carefully the natural logarithm of physical capital, or ln(ck), from my model, and I put the natural logarithm of labour supply, or ln(emp*avh) instead. I do it ceteris paribus, meaning that I leave everything else intact. Like a transplantation. Well, almost ceteris paribus. I have to remove one redundancy, too. The share of labour compensation, in the model I had so far, is clearly correlated with the supply of labour. Stands to reason: the compensation of labour is made of total hours worked multiplied by the average wage per hour. Cool! Organ transplanted, redundancy removed, and the patient seems to be still alive, which is a good thing in the surgery profession. It kicks nicely, with n = 317 valid observations (food deficit sifts away a lot of observations from my database; this is quite a sparse variable), and it yields a nice determination, with R² = 0,806. Well, well, well, with that new variable instead of the old one, the patient seems even more alive than before. Somehow leaner, though. It happens. Let’s have a look at the parameters:

variable | coefficient | std. error | t-statistic | p-value
ln(delta) | -1,398 | 0,433 | -3,229 | 0,001
ln(Energy use (kg of oil equivalent per capita)) | 2,133 | 0,108 | 19,79 | 0,000
ln(Density of population (people per sq km)) | 0,481 | 0,077 | 6,243 | 0,000
ln(Depth of the food deficit (kilocalories per person per day)) | -0,42 | 0,055 | -7,646 | 0,000
ln(emp × avh) | 1,01 | 0,069 | 14,649 | 0,000
constant | -24,416 | 2,091 | -11,677 | 0,000
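A quick way to read such a table: each t-statistic is simply the coefficient divided by its standard error. The short check below recomputes them from the rounded figures reported above (decimal commas written as dots); the variable labels are shortened by me, and the tolerance of 0,05 is my own choice.

```python
# (coefficient, std. error, reported t-statistic), copied from the table above
table = {
    "ln(delta)": (-1.398, 0.433, -3.229),
    "ln(energy use)": (2.133, 0.108, 19.79),
    "ln(density of population)": (0.481, 0.077, 6.243),
    "ln(depth of food deficit)": (-0.42, 0.055, -7.646),
    "ln(emp x avh)": (1.01, 0.069, 14.649),
    "constant": (-24.416, 2.091, -11.677),
}

gaps = {}
for name, (coefficient, std_error, t_reported) in table.items():
    t_recomputed = coefficient / std_error
    gaps[name] = abs(t_recomputed - t_reported)
    print(f"{name}: recomputed t = {t_recomputed:.3f}, reported t = {t_reported}")

largest_gap = max(gaps.values())
print("largest gap:", round(largest_gap, 3))
```

The recomputed values stay within a few hundredths of the reported ones; the small gaps are what one would expect from working with coefficients and standard errors rounded to three decimals.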

Interestingly, the supply of labour seems to have jumped into the seat previously occupied by physical capital with almost the same coefficient of regression. Structural variables (i.e. those describing the social structure) keep their bearings, and even seem to gain some gravitas. The depth of food deficit has lost its previous wobbliness in correlation, and displays a proud p < 0,001 in terms of significance. Oh, I forgot to remove this one: energy intensity. It had its place in the previous model, with capital as peg variable, because at one moment in time, the residual constant from an early version of the model displayed a significant correlation with energy intensity. I just sort of left it, as it did not seem to be aggressive towards the new peg variable, labour. Still, it was basically an omission, on my part, not to have removed it. Then again, mistakes bring interesting results. When left in the model, energy intensity keeps its importance and its sign, whatever the peg variable, capital or labour.

Is it possible that we, humans, have a general tendency to favour technologies with high energy intensity? From the engineering point of view, it sounds stupid. Any decent engineer would look for minimizing energy intensity. Still, a species, understood as a biological mass with no official graduation in engineering, could be looking for appropriating as much energy from its environment as possible. So could we. That could mean, in turn, that all the policies aiming at minimizing the consumption of energy go essentially against the basic selection functions, which, in turn, animate our technological change. Systematically trying to minimize energy consumption means carving a completely new selection function.

The coefficient attached to the natural logarithm of ‘delta’, or the rate of depreciation in fixed assets, has undergone an interesting personal transformation in that new model. It has changed its sign: in the model with capital as peg variable, its sign was positive, now it is negative. When I assumed that capital chooses inventions, the selection function seemed to favour technologies with a shorter life (higher depreciation). Now, as I assume that labour chooses technologies, the selection process favours technologies with a longer life, or a lower rate of depreciation. We have two opposing forces, then: the push to rotate technologies faster, in the selection function based on capital, and the striving to keep those technologies alive as long as possible, in the selection process based on labour. The respective velocities of the two production factors across the units of patentable invention are closely correlated, so I have some kind of economic equilibrium, here. That would be the equilibrium between investors wanting new technologies to pop up all the time, and organizations (groups of workers) desiring technological standstill.

Ok, it had to happen. This is what happens when you let the bulldog play with data, unattended. It sniffed the money. I mean, the supply of money. I explain. In my evolutionary models, capital and labour, so the production factors, play the role of some primal substance of life, which gets shaped by technological innovation. Logically, a sexual model of reproduction needs a device for transmitting DNA from male organisms to the female ones, for further treatment. In biological reality, that device consists in semen for animals, pollen and seeds for plants. I could not figure out exactly, how to represent a spermatozoid in economic terms, and so I assumed that in economic evolution, money is the transmitting device. Each dollar is a marker, attached to a small piece of valuable resources. What if the transmitting mechanism had brains of its own? What if it was a smart transmitting mechanism? Well, for what I know about our transmitting mechanism, it is not very smart. I mean, it takes one billion spermatozoids to make one zygote. It is a bit as if it took one billion humans to make one new technology. We would be still struggling with the wheel. Yet, plants seem to be smarter in that respect. The vegetal pollen, and especially vegetal seeds, display amazing intelligence: they choose other organisms as conveyors, they choose the right place to fall off the conveyor etc.

So, what if money was a smart mechanism of transmission in my evolutionary model? What if there was a selection function on the part of money? The problem with ‘what-ifs’ is that there is an indefinite multitude of them for each actual state of nature. Still, nobody forbids me to check at least one, right? So I take that model ln(Patent Applications) = a1*ln(Supply of broad money) + a2*ln(delta) + a3*ln(Energy use) + a4*ln(Density of population) + a5*ln(Depth of food deficit) + residual ln, and I test. Sample size: n = 494 observations. Could have been worse. Explanatory power: R2 = 0,615. Nothing to inform the government about, but respectable. Parameters: in the table below.

variable | coefficient | std. error | t-statistic | p-value
ln(delta) | 0,24 | 0,421 | 0,57 | 0,569
ln(Energy use (kg of oil equivalent per capita)) | 0,834 | 0,134 | 6,222 | 0,000
ln(Density of population (people per sq km)) | 0,406 | 0,1 | 4,041 | 0,000
ln(Depth of the food deficit (kilocalories per person per day)) | -0,218 | 0,063 | -3,469 | 0,001
ln(Supply of broad money, % of GDP / 100 × rgdpo) | 0,791 | 0,06 | 13,102 | 0,000
constant | -9,529 | 1,875 | -5,081 | 0,000

It is getting really interesting. The supply of broad money takes the place of capital, or that of labour, as smoothly as if it had been practicing for years. Same sign, very similar magnitude in the coefficient, rock-solid correlation. Other variables basically stand still, even the structural ones. One thing changes: ‘delta’, or the rate of depreciation, seems to have lost its bearings. A significance at p = 0,569 is no significance at all. It means that even if, with other variables constant, money were completely indifferent to the life expectancy of technologies available for development, a coefficient at least this far from zero would still turn up by pure chance about 56,9% of the time. So, wrapping it up: there are three selection functions (probably, there is very nearly an infinity of them, but those three just look cool, to an economist), which share a common core – dependence on social structure, preference for energy maximization – and differ in their preference for the duration of life-cycle in technologies. Capital likes short-lived, quick technologies. Labour goes for those more perennial and long-lasting. Money essentially doesn’t give a s*** (pronounce: s-asterisk-asterisk-asterisk). Whatever aggregate I take as the primal living substance of my model, the remaining part of the selection function remains more or less the same.
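For anyone who wants to see where a figure like p = 0,569 comes from, here is a minimal sketch in Python. It assumes a normal approximation to the t-distribution, which is reasonable with n = 494 but is a simplification of mine, not necessarily what the statistical package computed; the function name two_sided_p is hypothetical.

```python
import math

def two_sided_p(coefficient, std_error):
    """Two-sided p-value for H0: coefficient = 0.

    Assumption: normal approximation to the t-distribution (large sample).
    """
    t = coefficient / std_error
    # P(|Z| >= |t|) for a standard normal variable Z
    return math.erfc(abs(t) / math.sqrt(2))

# ln(delta) in the money-based model: coefficient 0.24, std. error 0.421
p_delta = two_sided_p(0.24, 0.421)
# ln(Supply of broad money): coefficient 0.791, std. error 0.06
p_money = two_sided_p(0.791, 0.06)
print(round(p_delta, 3), p_money)
```

The first value lands close to the 0,569 in the table, and the second is vanishingly small, which is what a p-value printed as 0,000 stands for.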

[1] Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), “The Next Generation of the Penn World Table” American Economic Review, 105(10), 3150-3182, available for download at

Primitive, male satisfaction with bigger a size

Short introduction via You Tube

I have a nice structure for that book about innovation and technological change, viewed mostly in evolutionary terms. For the moment, I want to focus on two metrics, which progressively came out of the research and writing that I did over the last two days. In my update entitled ‘Evolutionary games’, I identified the ratio of capital per one patent application as some sort of velocity of capital across units of scientific invention. On the other hand, yesterday, in my update in French, namely ‘Des trucs marrants qui nous passent à côté du nez, ou quelques réflexions évolutionnistes’, I started to nail down another one, the ratio of money supplied per unit of fixed capital. All that in the framework of a model, where investors are female organisms, able to create substance and recombine genetic information, whilst research and development is made of male organisms, unable to reproduce or mix genes, but able to communicate their own genetic code in the form of patentable inventions. As I have been a bit obsessed with monetary systems these last months, I added money supplied from the financial sector as the conveyor of genetic information in my model. Each unit of money is informative about the temporary, local, market value of something, and in that sense it can be considered analogous to a biological marker in an evolutionary framework.

Now, I am returning to one of the cornerstones of a decent evolutionary model, namely to the selection function. Female investors select male inventions for reproduction and recombination of genetic code. Characteristics of the most frequently chosen inventions are being remembered and used as guidelines for creating future inventions: this is the component of adaptation in my model. In the update entitled ‘Evolutionary games’, I started to nail down my selection function, from two angles. I studied the distribution of a coefficient, namely the ratio of physical capital per one resident patent application, in my database made of Penn Tables 9.0 (Feenstra et al. 2015[1]) and additional data from the World Bank. The first results that I got suggest a strong geographical disparity, and a progressive change over time in a complex set of local averages. In general, combined with the disparity of that ratio across different classes of food deficit in local populations, my working assumption is that the ratio of physical capital per one patent application, should it have any relevance, characterises the way that the selection function works locally.

The second angle of approach is linear regression. I tested econometrically the hypothesis that the number of patent applications depends on the amount of physical capital available locally. In evolutionary terms, it means that the number of male inventions depends on the amount of substance available in the female set of capital holders. I started with nailing down a logarithmic equation in my dataset, namely: ln(Patent Applications) = 0,825*ln(ck) + residual ln -4,204, in a sample of 2 623 valid observations in my database, where ‘ck’ stands for the amount of physical capital available (that’s the original acronym from Penn Tables 9.0). That equation yields a coefficient of determination R2 = 0,478, regarding the variance of empirical distribution in the number of patent applications.

This time, today, I want to meddle a little bit more with that linear regression. First of all, a quick update and interpretation of what I have. The full regression, described kind of by the book, looks like that:

variable | coefficient | std. error | t-statistic | p-value
ln(ck) | 0,825 | 0,019 | 43,19 | 0,000
constant | -4,204 | 0,259 | -16,239 | 0,000

The low p-values mean that, if the null hypothesis were true, the probability of obtaining estimates this far from zero would be below 0,001. In other words, it is very unlikely that the number of patent applications is unrelated to the amount of capital. Analogously, it is very unlikely that the constant of the equation is really zero; as estimated, it corresponds to a residual, unexplained number of e^(-4,204) = 0,014935714 patent applications when ln(ck) = 0.
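To make the two coefficients a bit more tangible, here is a small sketch in Python that translates the logarithmic equation back into levels. The function name predicted_patents and the capital values of 1000 and 2000 are illustrative assumptions of mine; only the two coefficients come from the table above.

```python
import math

SLOPE = 0.825       # coefficient of ln(ck), from the table above
INTERCEPT = -4.204  # constant, from the table above

def predicted_patents(ck):
    """Patent applications implied by the fitted log-log line, residual ignored."""
    return math.exp(INTERCEPT + SLOPE * math.log(ck))

# With ln(ck) = 0, i.e. ck = 1, only the constant is left: exp(-4.204)
baseline = predicted_patents(1.0)

# In a log-log model the slope is an elasticity: doubling capital
# multiplies the predicted number of applications by 2 ** 0.825
doubling = predicted_patents(2000.0) / predicted_patents(1000.0)
print(baseline, doubling)
```

The practical reading is that the slope works as an elasticity: 1% more physical capital goes together with roughly 0,825% more patent applications, whatever the starting level.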

The amount of physical capital available locally explains some 47% of the overall variance, observable in the distribution of resident patent applications. This is quite substantial an explanatory power, and it confirms the basic intuition of my whole evolutionary reasoning that the amount of genetic information communicated in the system (number of patent applications) is significantly proportional to the amount of organic substance (physical capital) available for recombination with the help of said genetic information. Still, I have that more than 52% of variance, left unexplained.

In econometrics, as in many other instances of existence, size matters. The size of a model is measured with its explanatory power, or its coefficient of determination R2. My equation, as I have it now, is medium in size. If I want it to be bigger in explanatory power, I can add variables on the right side. In my database, I have that variable called ‘delta’ in the original notation of Penn Tables 9.0, and it stands for the rate of depreciation in fixed assets. The greater that rate, the shorter the lifecycle of my physical assets. A few words of explanation for the mildly initiated. If my rate of depreciation is delta = 20%, it means that one fifth of book value in my assets goes out of the window every year, due to both physical wear and tear (physical depreciation), and to obsolescence in comparison to more modern assets (moral depreciation). If my delta = 20%, it basically means that I should replace the corresponding assets with new ones every five years. If my delta = 15%, that lifecycle climbs to 1/15% = 6,67 years, and with delta = 40%, it shortens to 1/40% = 2,5 years.
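The arithmetic behind those lifecycles is simply the inverse of the depreciation rate. A minimal sketch in Python, where the function name lifecycle_years is my own and the three rates are the ones used in the paragraph above:

```python
def lifecycle_years(delta):
    """Implied replacement cycle, in years, for an annual depreciation rate delta."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return 1.0 / delta

for delta in (0.15, 0.20, 0.40):
    print(f"delta = {delta:.0%} -> {lifecycle_years(delta):.2f} years")
```

This is the straight-line reading of depreciation; under a declining-balance reading the asset never quite reaches zero, so 1/delta should be taken as a rough life expectancy rather than an exact expiry date.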

In my evolutionary framework, ‘delta’ is the opposite of average life expectancy, observable in those technologies, which female capital is supposed to breed when fecundated by male inventions. I am positing a working hypothesis, that the amount of male inventions, serving to fecundate female capital, is inversely proportional to the life expectancy of my average technology. The longer one average technology lives, the less fun is required between male inventions and female capital, and vice versa: the shorter that lifecycle, the more conception has to go on between the two sides of my equation (capital and patent applications). In other words, I am hypothesising that the number of patent applications is directly proportional to the rate of depreciation ‘delta’. Let’s check. I am dropping the natural logarithm of ‘delta’, or ln(Depreciation), into my model, and I am running that linear regression ln(Patent Applications) = a1*ln(ck) + a2*ln(delta) + residual ln by Ordinary Least Squares. In return, I have R2 = 0,492, and the coefficients, together with their descriptive statistics (standard error and significance test) are as shown in the table below:

variable | coefficient | std. error | t-statistic | p-value
ln(ck) | 0,843 | 0,019 | 43,587 | 0,000
ln(delta) | 1,371 | 0,172 | 7,986 | 0,000
constant | -0,203 | 0,561 | -0,362 | 0,718

My hypothesis has been confirmed: there is a significant, positive correlation between the rate of depreciation in technologies, and the amount of patent applications. In other words, the shorter the lifecycle of technologies in a given country and year, the greater the number of those male inventions ready to conceive new baby technologies. Interestingly, my residual constant in the model has gone feral: it is no longer statistically distinguishable from zero. For a given amount of physical capital, and a given rate of depreciation, a constant this far from zero would show up by pure chance with a probability of p = 71,8%, even if its true value were nil.
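For the curious, this is roughly what Ordinary Least Squares does under the hood when there are two explanatory variables. The sketch below is in plain Python and solves the normal equations by elimination; it is an illustration under my own assumptions, not the software I actually used. The data are synthetic, generated from the coefficients in the table above plus random noise, so the estimates only show that the procedure recovers what was put in.

```python
import math
import random

def ols(rows, y):
    """Ordinary Least Squares via the normal equations (X'X) b = X'y.

    rows: one list of regressors per observation, WITHOUT the constant;
    a column of ones is added here. Returns [constant, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic data only: generated FROM the reported coefficients, not real observations
random.seed(0)
ln_ck = [random.uniform(8, 16) for _ in range(500)]
ln_delta = [math.log(random.uniform(0.02, 0.08)) for _ in range(500)]
ln_patents = [-0.203 + 0.843 * c + 1.371 * d + random.gauss(0, 1)
              for c, d in zip(ln_ck, ln_delta)]

const, a1, a2 = ols(list(zip(ln_ck, ln_delta)), ln_patents)
print(const, a1, a2)
```

In real work a statistical package would also report standard errors and p-values; this sketch stops at the coefficients on purpose.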

At this point, I can try a different technique of empirical research. I compute that residual component for each of the 2 623 observations separately, and thus I get a statistical distribution of residuals. Then, I look for variables in my database, which are significantly correlated with those residuals from the model. In other words, I am looking for pegs, which I can possibly attach that rebel, residual tail to. In general, that logarithmic tail is truly feral: there is very little correlation with any other variable, except with the left side of the equation (number of patent applications). Still, two moderately strong correlations come forth. The natural logarithm of energy use per capita, in kilograms of oil equivalent, comes as correlated with my logarithmic residual at r = 0,509, where ‘r’ stands for the Pearson product-moment coefficient of correlation. The second correlation is that with the share of labour compensation in the GDP, or ‘labsh’ in the original notation of Penn Tables 9.0. Here, the coefficient of correlation is r = 0,491.
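That peg-hunting procedure can be sketched in a few lines of Python. Everything numeric below is made up for illustration, except the three coefficients, which come from the table above; the function names residuals and rank_pegs are hypothetical.

```python
import math

def pearson_r(a, b):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def residuals(ln_patents, ln_ck, ln_delta, const, a1, a2):
    """Observed minus fitted ln(Patent Applications), one value per observation."""
    return [y - (const + a1 * c + a2 * d)
            for y, c, d in zip(ln_patents, ln_ck, ln_delta)]

def rank_pegs(resid, candidates):
    """Rank candidate variables by the absolute correlation with the residuals."""
    scores = {name: pearson_r(resid, values) for name, values in candidates.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Made-up observations, for illustration only
ln_patents = [5.1, 7.9, 6.4, 9.8, 8.2]
ln_ck = [9.0, 12.0, 10.5, 13.5, 12.5]
ln_delta = [-3.2, -3.0, -3.1, -2.9, -3.3]
resid = residuals(ln_patents, ln_ck, ln_delta, const=-0.203, a1=0.843, a2=1.371)
pegs = rank_pegs(resid, {
    "ln(Energy use)": [6.8, 7.9, 7.1, 8.4, 7.6],
    "ln(labsh)": [-0.7, -0.5, -0.65, -0.45, -0.6],
})
print(pegs)
```

Whatever comes out on top of that ranking is a candidate for the next version of the equation, nothing more: a correlated residual tells me where to look, not what I will find.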

You don’t argue with significant correlations, if you want to stay serious in econometric research, and so I drop those two additional, natural logarithms into my equation. I am testing now the validity of the proposition that ln(Patent Applications) = a1*ln(ck) + a2*ln(delta) + a3*ln(labsh) + a4*ln(Energy use) + residual ln. I get n = 2 338 valid observations, and my explanatory power, i.e. the size of my explanation, grows bigger, up to R2 = 0,701. I will be honest with you: I feel a primitive, male satisfaction with that bigger size in my explanatory power. Back to the nice and polite framework of empirical investigation, I have that table of coefficients, below:

variable | coefficient | std. error | t-statistic | p-value
ln(ck) | 0,847 | 0,017 | 49,012 | 0,000
ln(delta) | 2,256 | 0,16 | 14,089 | 0,000
ln(labsh) | 2,782 | 0,157 | 17,693 | 0,000
ln(Energy use (kg of oil equivalent per capita)) | 0,643 | 0,036 | 17,901 | 0,000
constant | -0,854 | 0,561 | -1,522 | 0,128

I can observe, first of all, that adding those two variables to the game pumped some size in the coefficient ascribed to depreciation, and left the coefficient attached to the amount of physical capital almost unchanged. It could suggest that the way those two additional variables work is somehow correlated with the lifecycle of technologies. Secondly, I have no clear and unequivocal clue, for the moment at least, how to interpret the significant presence of those two additional variables in the model. Maybe the selection function, in my evolutionary model, favours inventions with greater share of labour compensation in their production functions, as well as with more energy intensity? Maybe… It is something to dismantle into small pieces carefully. Anyway, it looks interesting.

The third thing is that new residual in the new, enriched model. It still has pretty low significance (the null hypothesis of a zero constant cannot be rejected, at p = 12,8%), and so I repeated the same procedure: I computed local residuals from this model, and then I checked the correlation of the distribution of residuals thus obtained with other variables in my database. Nope. Nothing. Rien. Nada. This constant residual is really lonely and sociopathic. Better leave it alone.

[1] Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), “The Next Generation of the Penn World Table” American Economic Review, 105(10), 3150-3182, available for download at

Evolutionary games

My editorial for today

My mind is wandering a bit, this morning. I am experiencing that peculiar state of creative lack of focus, as if my brain was hesitating which compound parcel of information to treat. I am having a quick glance at ‘Théorie de la spéculation’ by Louis Bachelier (Bachelier 1900[1]), as I am almost always interested in practical applications of uncertainty. That, in turn, prompts me to add some data about stock markets to my database anchored in Penn Tables 9.0 (Feenstra et al. 2015[2]). On the other hand, I would like to continue on the path that I have been developing for a few days, about innovation and technological change understood as a game of adaptation in a complex social structure. As I am trying to connect those two dots, I end up reading papers like that by Frederic Scherer (Scherer 1996[3]) on the probability of big, fat profits from an invention. Finally, I have that idea of reconciling the game-theoretic approach with the evolutionary one, in modelling processes of adaptation under uncertainty.

Maybe if I start from the end and advance towards the beginning, as I love doing in my research, I will come to some interesting results? Good, let’s waltz. So I am studying evolution. Evolution of anything is about breeding, i.e. about having fun, first, and being a responsible parent in subsequent periods. At a given time ‘t’ , in a place (market) MR, we have a set N of n inventions, N = {N1, N2, …, Nn}, and each of those inventions can give rise to some fun on the spot, and to responsible parenthood next, or, in other words, it can be implemented as a technology, and maintained in exploitation for a given period of time. Right, so I need a set of established technologies, as well. I call this set TC = {TC1, TC2, …, TCm}, and this long name with curly brackets in it means that my set of established technologies contains ‘m’ of them.

Let’s suppose that inventions are male and established technologies are children. We need females, if we want to be serious about evolution. A female organism is what gets fecundated by some seed coming from the male. Logically, in this specific process, investment capital is female. Thus, I imagine a set CH = {CH1, CH2, …, CHk} of ‘k’ capital holders, who are logically female but who, in fact, can be male by physicality, or even be kind of transgender if they are corporations or governments. Anyway, the set N of inventions is having fun with the set CH of capital holders, which subsequently gives rise to responsible parenthood regarding the set TC. Some inventions stay out of the evolutionary game for lack of proper mating, just as some capital holders remain infecund for lack of the right opportunity. I know, it sounds cruel, but this is not the first time I learn how far from conventional decency are the actual ways of the world.

What is interesting is the process of selection. As we are in the world of technological change, lipstick, short skirts, six-pack abs and Ferraris are being replaced by rules of selection, which give systematic preference to some matches between i-th invention and j-th capital holder, to the detriment of others. As I read some recent evolutionary literature (see for example Young 2001[4]), there is some inclination, in the theory, to consider the female mechanisms of selection, i.e. those applied by females regarding males, as dominant in importance. In the logic I have just developed, it generally holds: I can safely assume that capital holders select inventions to finance, rather than inventions picking up their investors. Yes, it is a simplification, and so is a cafe Americano, but it works (when you order an espresso, a good barista should serve it with a glass of cold, still water in accompaniment; still, when the barista can suspect you are not really a gourmet in coffee, they basically heat up that water, mix it with a slightly thinned espresso, and you get cafe Americano).

Anyway, we have j-th capital holder CHj picking up the i-th invention Ni, to give birth to o-th established technology TCo. Evolutionary theory assumes, in general, that this process is far from being random. It has an inherent function of selection, and this function is basically the stuff that makes hierarchy in the social structure of the corresponding species. The function of selection defines the requirements that a successful match should have. In this case, it means that a hierarchy of inventions is being formed, with the top inventions being the most likely to be selected, and subsequent levels of hierarchy being populated with inventions displaying decreasing f****ability. As usual in such cases, spell: ‘f-asterisk-asterisk-asterisk-asterisk-ability’.

As I am a scientist, I am keen on understanding the mating mechanism, and so I take my compound database made of Penn Tables 9.0, glued together with data from the World Bank about patent applications. I hypothesise something very basic, namely that the number of resident patent applications per year, in a given country, significantly depends on the amount of capital being invested in fixed assets. Patent applications stand for inventions, and capital stands for itself. Now, I can do two types of tests for this hypothesis. I can go linear, or I can follow Milton Friedman. Being linear means that I posit my selection function as something in the lines of:

n = a*k + residual

Of course, we keep in mind that ‘n’ is the number of inventions (patent applications in this case), and ‘k’ stands for the number of capital holders, which I approximate with the aggregate amount of capital (variable ‘ck’ in Penn Tables 9.0). Now, I squeeze it down to natural logarithms, in order to account for non-stationarity in empirical data, and I test: ln(Patent Applications) = a1*ln(ck) + residual ln. For the mildly initiated, non-stationarity means that data usually jumps up and down, sometimes even follows local trends. It generates some noise, and natural logarithms help to quiet it all down. Anyway, as I test this logarithmic equation in my dataset, I get a sample of 2 623 valid observations, and they yield a coefficient of determination R2 = 0,478. It means, once again for the mildly initiated, that my equation explains some 47,8% of variance observable in ln(Patent Applications), and more specifically, the whole thing spells: ln(Patent Applications) = 0,825*ln(ck) + residual ln -4,204.
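Here is a minimal sketch, in plain Python, of how such a logarithmic equation can be fitted and how R2 comes out of it. The function name fit_loglog and the five pairs of numbers are hypothetical, made up for illustration; they are not taken from my database.

```python
import math

def fit_loglog(x, y):
    """Fit ln(y) = a1*ln(x) + c by Ordinary Least Squares.

    Returns (a1, c, r_squared). All values must be strictly positive,
    since the logarithm of zero is undefined.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    syy = sum((b - my) ** 2 for b in ly)
    a1 = sxy / sxx
    c = my - a1 * mx
    r_squared = sxy ** 2 / (sxx * syy)  # share of variance in ln(y) explained
    return a1, c, r_squared

# Hypothetical country-year observations: capital stock and patent applications
capital = [1e3, 1e4, 1e5, 1e6, 1e7]
patents = [6, 25, 210, 1100, 9500]
a1, c, r2 = fit_loglog(capital, patents)
print(a1, c, r2)
```

One practical consequence of going logarithmic: country-years with zero patent applications drop out of the sample, which is one reason the number of ‘valid’ observations is smaller than the database itself.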

I am going to return to those linear results a bit later. Now, just to grasp that nice contrast between methodologies, I start following the intuitions of Milton Friedman. I mean the intuitions expressed in his equation of quantitative monetary equilibrium. Why do I follow this path? Well, I try to advance at the frontier of economics and evolutionary theory. In the latter, I have that concept of selection function. In the former, one of the central theoretical concepts is that of equilibrium. I mention Milton Friedman, because he used to maintain quite firmly that equilibriums in economic systems, if they are true equilibriums, are expressed by some kind of predictable proportion. In his equation of monetary equilibrium, it was expressed by the velocity of money, or the speed of circulation of money between different units of real output.

Here, I translate it as the velocity of capital regarding patent applications, or the speed of circulation in capital between patentable inventions. In short, it is the ratio of fixed capital ‘ck’ divided by the number of resident patent applications. Yes, I know that I labelled capital as female, and inventions as male, and I propose to drop the metaphor at this very point. Velocity of circulation in females between males is not something that decent scientists should discuss. Well, some scientists could, like biologists. Anyway, I hypothesise that if my selection function is really robust, it should be constant over space and time. In probabilistic terms, it means that the mean ratio of fixed capital per one patent application, in millions of 2011 US$ at current Purchasing Power Parities, should be fairly constant between countries and over time, and that it should display relatively low and recurrent variability (standard deviation divided by mean) inside places and years. The more disparity I notice, whether as arithmetical distance between local means or as local variability around them, the more confidently I can assume there are many selection functions in that evolutionary game.
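The check described above boils down to grouping the ratio by place (or by year, or by any other class) and computing its mean and its variability inside each group. A minimal sketch in Python follows; the record layout, the function name velocity_by_group and all the numbers are hypothetical, and I use the population standard deviation as a simplifying assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev

def velocity_by_group(records, key):
    """Mean and variability of capital per patent application, per group.

    records: dicts with 'ck', 'patents' and the grouping key (e.g. 'country').
    Returns {group: (mean ratio, standard deviation divided by mean)}.
    """
    groups = defaultdict(list)
    for rec in records:
        if rec["patents"] > 0:  # the ratio is undefined without any application
            groups[rec[key]].append(rec["ck"] / rec["patents"])
    result = {}
    for group, ratios in groups.items():
        m = mean(ratios)
        result[group] = (m, pstdev(ratios) / m)
    return result

# Made-up records, for illustration only
records = [
    {"country": "A", "year": 2010, "ck": 900.0, "patents": 30},
    {"country": "A", "year": 2011, "ck": 1100.0, "patents": 25},
    {"country": "B", "year": 2010, "ck": 400.0, "patents": 80},
    {"country": "B", "year": 2011, "ck": 450.0, "patents": 75},
]
print(velocity_by_group(records, "country"))
print(velocity_by_group(records, "year"))
```

If the means differ a lot between groups, or the variability inside groups is high, that speaks for many local selection functions rather than one universal function.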

So I computed the mean value of this ratio and its variance across countries (get it from this link), as well as over years (this is downloadable, too). On the whole, the thing is quite unstable. There is a lot of disparity between mean values, as well as around them. Still, geographical disparity seems to be stronger, in terms of magnitude, than changes observable over time. It allows me to formulate two hypotheses, quite tentative and still empirically verifiable. Firstly, at a given time, there are many different, local selection functions between capital holders and patentable inventions. Secondly, the geographical disparity in these selection functions is relatively recurrent over time. We have an internally differentiated structure, made of local evolutionary mechanisms, and this structure seems to reproduce itself in time.

Returning to my linear model, I could go into nailing down linear functions of ln(Patent Applications) = a1*ln(ck) + residual ln for each country separately. Still, I have one more interesting path: I can connect that evolutionary game to the issue of food deficit, which I explored a bit in my previous posts, namely in ‘Cases of moderate deprivation’ and in ‘Un modèle mal nourri’. I made a pivot out of my database and I calculated mean capital per patent application, as well as its variance across different depths of food deficit. Interesting results turn out, as usual with this cruel and unpleasant variable. Between different classes of food deficit, regarding the ratio of capital per one patent application, disparities are significant, and, what seems even more interesting, disparity around the local means, inside particular classes of food deficit, is strongly idiosyncratic. Different depths of food deficit are accompanied by completely different selection functions in my evolutionary game.

So far, I have come to more interesting results by applying the logic of economic equilibrium in my evolutionary game, than by using plain linear regression. This is the thing about correlations and regressions: if you want them to be really meaningful, you need to narrow down your hypotheses really tight, just as Milton Friedman used to say. By the way, you can see here, at work, the basic methodology of statistical analysis, as presented, for example, by a classic like Blalock[5]: first, you do your basic maths with descriptive statistics, and only in your next step you jump to correlations and whatnot.

[1] Bachelier, Louis. Théorie de la spéculation. Gauthier-Villars, 1900.

[2] Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), “The Next Generation of the Penn World Table” American Economic Review, 105(10), 3150-3182, available for download at

[3] Scherer, Frederic M. “The size distribution of profits from innovation.” The Economics and Econometrics of Innovation. Springer US, 2000. 473-494.

[4] Young, H. Peyton. Individual strategy and social structure: An evolutionary theory of institutions. Princeton University Press, 2001.

[5] Blalock, Hubert Morse. Social Statistics: 2d Ed. McGraw-Hill, 1972.