I have a nice structure for that book about innovation and technological change, viewed mostly in evolutionary terms. For the moment, I want to focus on two metrics, which progressively came out of the research and writing that I did over the last two days. In my update entitled ‘Evolutionary games’, I identified the ratio of capital per one patent application as some sort of velocity of capital across units of scientific invention. On the other hand, yesterday, in my update in French, namely ‘Des trucs marrants qui nous passent à côté du nez, ou quelques réflexions évolutionnistes’, I started to nail down another one, the ratio of money supplied per unit of fixed capital. All that in the framework of a model where investors are female organisms, able to create substance and recombine genetic information, whilst research and development is made of male organisms, unable to reproduce or mix genes, but able to communicate their own genetic code in the form of patentable inventions. As I have been a bit obsessed with monetary systems these last months, I added money supplied from the financial sector as the conveyor of genetic information in my model. Each unit of money is informative about the temporary, local market value of something, and in that sense it can be considered as analogous to a biological marker in an evolutionary framework.
Now, I am returning to one of the cornerstones of a decent evolutionary model, namely to the selection function. Female investors select male inventions for reproduction and recombination of genetic code. Characteristics of the most frequently chosen inventions are remembered and used as guidelines for creating future inventions: this is the component of adaptation in my model. In the update entitled ‘Evolutionary games’, I started to nail down my selection function, from two angles. I studied the distribution of a coefficient, namely the ratio of physical capital per one resident patent application, in my database made of Penn Tables 9.0 (Feenstra et al. 2015) and additional data from the World Bank. The first results that I got suggest a strong geographical disparity, and a progressive change over time in a complex set of local averages. In general, given also the disparity of that ratio across different classes of food deficit in local populations, my working assumption is that the ratio of physical capital per one patent application, should it have any relevance, characterises the way that the selection function works locally.
The second angle of approach is linear regression. I tested econometrically the hypothesis that the number of patent applications depends on the amount of physical capital available locally. In evolutionary terms, it means that the number of male inventions depends on the amount of substance available in the female set of capital holders. I started with nailing down a logarithmic equation in my dataset, namely: ln(Patent Applications) = 0,825*ln(ck) + residual ln -4,204, in a sample of 2 623 valid observations in my database, where ‘ck’ stands for the amount of physical capital available (that’s the original acronym from Penn Tables 9.0). That equation yields a coefficient of determination R2 = 0,478, regarding the variance of empirical distribution in the number of patent applications.
This time, today, I want to meddle a little bit more with that linear regression. First of all, a quick update and interpretation of what I have. The full regression, described kind of by the book, looks like that:
The low p-values in the significance tests mean that, if the null hypothesis were true, a coefficient this far from zero would turn up with a probability below 0,001. In other words, it is very unlikely that the number of patent applications is unrelated to the amount of physical capital. Analogously, the constant of the equation is significantly different from zero at the same level, p < 0,001. Back-transformed from logarithms, that constant corresponds to e^(-4,204) = 0,014935714 patent applications.
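As a quick sanity check on that constant, the log-log equation can be back-transformed into levels. The sketch below uses only the two coefficients quoted above; the function name `predicted_patents` and the capital values fed into it are made up for illustration, and decimal commas are written as decimal points.

```python
import math

# Coefficients as quoted in the text (0,825 and -4,204 in decimal-comma notation).
SLOPE = 0.825       # coefficient on ln(ck)
INTERCEPT = -4.204  # constant term of the log-log equation


def predicted_patents(ck: float) -> float:
    """Back-transform ln(Patents) = SLOPE * ln(ck) + INTERCEPT into levels."""
    return math.exp(INTERCEPT) * ck ** SLOPE


# The constant expressed in levels rather than logarithms (roughly 0.0149).
print(math.exp(INTERCEPT))

# Illustrative capital values only: doubling ck multiplies the prediction
# by 2 ** 0.825, whatever the starting level.
ratio = predicted_patents(2_000.0) / predicted_patents(1_000.0)
print(ratio, 2 ** SLOPE)
```

Read this way, the coefficient 0,825 is an elasticity: a 1% increase in physical capital goes with roughly a 0,825% increase in patent applications.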
The amount of physical capital available locally explains some 48% of the overall variance observable in the distribution of resident patent applications (R2 = 0,478). This is quite a substantial explanatory power, and it supports the basic intuition of my whole evolutionary reasoning, namely that the amount of genetic information communicated in the system (number of patent applications) is significantly related to the amount of organic substance (physical capital) available for recombination with the help of said genetic information. Still, that leaves more than 52% of the variance unexplained.
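For readers who want to see where such a split between explained and unexplained variance comes from, here is a minimal sketch of the coefficient of determination. The four observed and fitted values are invented for the example and have nothing to do with my database; only the 0,478 at the end comes from the text.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented toy values, not taken from the dataset discussed in the text.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(observed, fitted)
print(f"explained: {r2:.1%}, unexplained: {1 - r2:.1%}")

# The same split for the R2 reported in the text.
print(f"unexplained share for R2 = 0.478: {1 - 0.478:.3f}")
```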
In econometrics, as in many other instances of existence, size matters. The size of a model is measured by its explanatory power, or its coefficient of determination R2. My equation, as I have it now, is medium in size. If I want it to be bigger in explanatory power, I can add variables on the right side. In my database, I have a variable called ‘delta’ in the original notation of Penn Tables 9.0, and it stands for the rate of depreciation in fixed assets. The greater that rate, the shorter the lifecycle of my physical assets. A few words of explanation for the mildly initiated. If my rate of depreciation is delta = 20%, it means that one fifth of the book value of my assets goes out of the window every year, due both to physical wear and tear (physical depreciation) and to obsolescence in comparison to more modern assets (moral depreciation). If my delta = 20%, it basically means that I should replace the corresponding assets with new ones every 1/20% = 5 years. If my delta = 15%, that lifecycle climbs to 1/15% ≈ 6,67 years, and with delta = 40%, it shortens to 1/40% = 2,5 years.
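The arithmetic of that lifecycle is simply the reciprocal of the depreciation rate. A tiny sketch, with a helper name (`lifecycle_years`) that is mine and not part of the Penn Tables notation:

```python
def lifecycle_years(delta: float) -> float:
    """Implied replacement cycle, in years, for an annual depreciation rate."""
    if not 0 < delta <= 1:
        raise ValueError("delta must be a rate in the interval (0, 1]")
    return 1.0 / delta


# The three rates discussed in the text.
for delta in (0.15, 0.20, 0.40):
    print(f"delta = {delta:.0%} -> {lifecycle_years(delta):.2f} years")
```

This prints 6.67, 5.00 and 2.50 years respectively.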
In my evolutionary framework, ‘delta’ is the inverse of the average life expectancy observable in those technologies which female capital is supposed to breed when fecundated by male inventions. I am positing a working hypothesis that the number of male inventions serving to fecundate female capital is inversely proportional to the life expectancy of my average technology. The longer one average technology lives, the less fun is required between male inventions and female capital, and vice versa: the shorter that lifecycle, the more conception has to go on between the two sides of my equation (capital and patent applications). In other words, I am hypothesising that the number of patent applications is directly proportional to the rate of depreciation ‘delta’. Let’s check. I am dropping the natural logarithm of ‘delta’, or ln(Depreciation), into my model, and I am running the linear regression ln(Patent Applications) = a1*ln(ck) + a2*ln(delta) + residual ln by Ordinary Least Squares. In return, I have R2 = 0,492, and the coefficients, together with their descriptive statistics (standard error and significance test), are as shown in the table below:
My hypothesis holds: there is a significant, positive correlation between the rate of depreciation in technologies and the number of patent applications. In other words, the shorter the lifecycle of technologies in a given country and year, the greater the number of those male inventions ready to conceive new baby technologies. Interestingly, my residual constant in the model has gone feral: once the amount of physical capital and the rate of depreciation are both in the equation, the constant is no longer significantly different from zero, with p = 0,718 in its significance test.
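For the curious, the mechanics of such a two-variable regression in logarithms can be sketched with nothing but the standard library. Everything numeric below is an assumption made for the illustration: the grid of capital and depreciation values is synthetic, and the "true" coefficients are invented, not my estimates. A statistical package would normally do this job and also report standard errors, which this sketch does not.

```python
import math


def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary Least Squares via the normal equations (X'X) beta = X'y.

    `rows` holds the regressors only; a constant column is prepended here,
    so the first returned coefficient is the constant.
    """
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic grid of (capital, depreciation rate) pairs; NOT the author's data.
sample = [(ck, d) for ck in (1e3, 1e4, 1e5, 1e6) for d in (0.02, 0.04, 0.08)]
TRUE = (-3.0, 0.8, 0.5)  # invented constant, a1 and a2, for illustration only

rows = [(math.log(ck), math.log(d)) for ck, d in sample]
y = [TRUE[0] + TRUE[1] * r[0] + TRUE[2] * r[1] for r in rows]  # noise-free

const, a1, a2 = ols(rows, y)
print(const, a1, a2)  # should recover the invented values up to rounding error
```

Because the synthetic data carry no noise, the fit hands back the invented coefficients; with real data the residual term is exactly what is left over after this step.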
At this point, I can try a different technique of empirical research. I compute that residual component for each of the 2 623 observations separately, and thus I get a statistical distribution of residuals. Then, I look for variables in my database which are significantly correlated with those residuals from the model. In other words, I am looking for pegs to which I can possibly attach that rebel, residual tail. In general, that logarithmic tail is truly feral: there is very little correlation with any other variable, except with the left side of the equation (number of patent applications). Still, two moderately strong correlations come forth. The natural logarithm of energy use per capita, in kilograms of oil equivalent, is correlated with my logarithmic residual at r = 0,509, where ‘r’ stands for the Pearson product-moment coefficient of correlation. The second correlation is that with the share of labour compensation in the GDP, or ‘labsh’ in the original notation of Penn Tables 9.0. Here, the coefficient of correlation is r = 0,491.
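That peg-hunting procedure boils down to computing a Pearson correlation between the residuals and each candidate variable, then ranking the candidates by absolute correlation. A minimal sketch follows; the residuals, the candidate names `ln_energy` and `unrelated`, and all their values are invented for the example and are not drawn from my database.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation between two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def rank_candidates(residuals, candidates):
    """Order candidate variables by the absolute value of their correlation."""
    scores = {name: pearson(residuals, vals) for name, vals in candidates.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Invented toy numbers, for illustration only.
residuals = [-1.0, -0.5, 0.0, 0.5, 1.0]
candidates = {
    "unrelated": [1.0, -1.0, 0.0, -1.0, 1.0],
    "ln_energy": [5.0, 5.6, 5.9, 6.6, 7.0],
}

ranking = rank_candidates(residuals, candidates)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

In the toy data, `ln_energy` comes out on top and `unrelated` sits at zero; in the real exercise, the variables worth adding to the equation are those that surface at the head of such a ranking.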
You don’t argue with significant correlations, if you want to stay serious in econometric research, and so I drop those two additional, natural logarithms into my equation. I am testing now the validity of the proposition that ln(Patent Applications) = a1*ln(ck) + a2*ln(delta) + a3*ln(labsh) + a4*ln(Energy use) + residual ln. I get n = 2 338 valid observations, and my explanatory power, i.e. the size of my explanation, grows bigger, up to R2 = 0,701. I will be honest with you: I feel a primitive, male satisfaction with that bigger size in my explanatory power. Back to the nice and polite framework of empirical investigation, I have that table of coefficients, below:
|Variable|Coefficient|Standard error|t-statistic|p-value|
|ln(Energy use (kg of oil equivalent per capita))|0,643|0,036|17,901|0,000|
I can observe, first of all, that adding those two variables to the game pumped some size in the coefficient ascribed to depreciation, and left the coefficient attached to the amount of physical capital almost unchanged. It could suggest that the way those two additional variables work is somehow correlated with the lifecycle of technologies. Secondly, I have no clear and unequivocal clue, for the moment at least, how to interpret the significant presence of those two additional variables in the model. Maybe the selection function, in my evolutionary model, favours inventions with greater share of labour compensation in their production functions, as well as with more energy intensity? Maybe… It is something to dismantle into small pieces carefully. Anyway, it looks interesting.
The third thing is that new residual in the new, enriched model. It is still not significant (p = 0,128, so the null hypothesis for this constant cannot be rejected at conventional levels), and so I repeated the same procedure: I computed local residuals from this model, and then I checked the correlation of the distribution of residuals thus obtained with other variables in my database. Nope. Nothing. Rien. Nada. This constant residual is really lonely and sociopathic. Better leave it alone.
Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), “The Next Generation of the Penn World Table”, American Economic Review, 105(10), 3150-3182, available for download at http://www.ggdc.net/pwt