As it is ripe, I can harvest

I keep revising my manuscript titled ‘Climbing the right hill – an evolutionary approach to the European market of electricity’, in order to resubmit it to the journal Applied Energy. In my last update, titled ‘Still some juice in facts’, I used the technique of reverted reading to break the manuscript down into a chain of ideas. Now, I start reviewing the most recent literature associated with those ideas. I start with Rosales-Asensio et al. (2020)[1], i.e. with ‘Decision-making tools for sustainable planning and conceptual framework for the energy–water–food nexus’. The paper comes within a broader stream of literature, which I already mentioned in the first version of my manuscript, namely the so-called MuSIASEM framework, where energy management in national economies is viewed as a metabolic function, and socio-economic systems in general are equated to metabolic structures. Energy, water, food, and land are considered in this paper as sectors in the economic system, i.e. as chains of markets with economic goods being exchanged. We know that energy, water and food are interconnected, and all three are connected to the way that our human social structures work. Yet, in the study of those connections, we have been building ever more complex theoretical models, hardly workable when applied to actual policies. Rosales-Asensio et al. propose a method to simplify theoretical models in order to make them functional in decision-making. Water, land, and food can be included in economic planning as soon as we explicitly treat them as valuable assets. Here, the approach by Rosales-Asensio et al. goes interestingly against the current of something that can be labelled ‘popular environmentalism’. Whilst the latter treats those natural (or semi-natural, in the case of the food base) resources as invaluable, and therefore impossible to put a price tag on, Rosales-Asensio et al. argue that it is much more workable, policy-wise, to do exactly the opposite, i.e. to give explicit prices and book values to those resources. The connection between energy, water, food, and the economy is modelled as a transformation of matrices, thus as something akin to a Markov chain of states.

The next article I pass in review is by Al-Tamimi and Al-Ghamdi (2020), titled ‘Multiscale integrated analysis of societal and ecosystem metabolism of Qatar’ (Energy Reports, 6, 521-527, https://doi.org/10.1016/j.egyr.2019.09.019). This paper presents interesting findings, namely that energy consumption in Qatar, between 2006 and 2015, grew at a faster rate than GDP over the same period, whilst energy consumption per capita and energy intensity grew approximately at the same rate. That could suggest some kind of trade-off between productivity and energy intensity of an economy. Interestingly, the fall in productivity was accompanied by increased economic activity of Qatar’s population, i.e. the growth of the professionally active population, and thence of the labour market, was faster than the overall demographic growth.

In still another paper, titled ‘The energy metabolism of countries: Energy efficiency and use in the period that followed the global financial crisis’ (Energy Policy, 139, 111304, https://doi.org/10.1016/j.enpol.2020.111304, 2020), Professor Valeria Andreoni develops a line of research where rapid economic change, even crisis-like change, contributes to reducing the energy intensity of national economies. Still, some kind of blueprint for energy-efficient technological change needs to be in place at the level of national policies. Energy-efficient technological change might be easier than we think, and yet, apparently, it needs some sort of accompanying economic change as its trigger. Energy efficiency seems to be correlated with competitive technological development in national economies. Financial constraints can hamper those positive changes. Cross-sectional (i.e. inter-country) gaps in energy efficiency are essentially bad for sustainable development. Public policies should aim at closing those gaps, by integrating the market of energy within the EU.

Velasco-Fernández, R., Pérez-Sánchez, L., Chen, L., & Giampietro, M. (2020), in the article titled ‘A becoming China and the assisted maturity of the EU: Assessing the factors determining their energy metabolic patterns’ (Energy Strategy Reviews, 32, 100562, https://doi.org/10.1016/j.esr.2020.100562), bring empirical results somewhat similar to mine, although obtained with a different method. The number of hours worked per person per year is mentioned in this paper as an important variable of the MuSIASEM framework for China. There is, for example, a comparison of energy metabolized in the sector of paid work, as compared to the household sector. It is found that the aggregate amount of human work used in a given sector of the economy is closely correlated with the aggregate energy metabolized by that sector. The economic development of China, and its pattern of societal metabolism in using energy, displays an increase in the level of capitalization of all sectors, with a reduction of human activity (paid work) in all of them except services. At the same time, the amount of human work per unit of real output seems to be negatively correlated with the capital intensity (or capital endowment) of particular sectors of the economy. Energy efficiency seems to be driven by decreasing work intensity and increasing capital intensity.

I found another similarity to my own research, although under a different angle, in the article by Koponen, K., & Le Net, E. (2021): ‘Towards robust renewable energy investment decisions at the territorial level’ (Applied Energy, 287, 116552, https://doi.org/10.1016/j.apenergy.2021.116552). The authors build a simulative model in Excel, where they create m = 5000 alternative futures for a networked energy system, aiming at optimizing 5 performance metrics, namely: the LCOE (levelized cost of electricity), greenhouse gas emissions (GHG) as the climate metric, the density of PM2.5 and PM10 particles in ambient air as the health metric, power generation capacity as the technological benchmark, and the number of jobs as the social outcome. That complex vector of outcomes has been simulated as dependent on a vector of cost uncertainties, more specifically: the cost of CO2, the cost of electricity, the cost of natural gas, and the cost of biomass. The model was based on actual empirical data as for those variables, and the ‘alternative futures’ are, in other words, 5000 alternative states of the same system. Outcomes are gauged with so-called regret analysis, where the relative performance on a specific outcome is measured as the residual difference between its local value and, respectively, its overall minimum or maximum, depending on whether the given metric is something we strive to maximize (e.g. capacity) or to minimize (e.g. GHG). Regret analysis is very similar to the estimation of residual local error.
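Just to make that mechanism tangible, here is a minimal numpy sketch of regret analysis as I understand it from the paper. The numbers, the ranges, and the minimax selection rule at the end are placeholders of my own, not the authors’ actual scenario generator:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical scores for 5000 alternative futures on 5 metrics.
# Columns: LCOE (minimize), GHG (minimize), PM (minimize), capacity (maximize), jobs (maximize).
futures = rng.uniform(low=[40, 100, 5, 200, 1000],
                      high=[120, 600, 40, 900, 8000],
                      size=(5000, 5))
minimize = np.array([True, True, True, False, False])

# Regret: distance of each future's score from the best score achieved
# across all futures, metric by metric.
best = np.where(minimize, futures.min(axis=0), futures.max(axis=0))
regret = np.where(minimize, futures - best, best - futures)

# Normalize regrets per metric, then pick the future with the smallest worst-case regret.
regret_norm = regret / regret.max(axis=0)
worst_case = regret_norm.max(axis=1)
print("Most robust future (minimax regret):", worst_case.argmin())
```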

That short review of literature has the merit of showing me that I am not completely off the picture with the method and the findings which I initially presented to the editor of Applied Energy in that manuscript: ‘Climbing the right hill – an evolutionary approach to the European market of electricity’. The idea of understanding the mechanism of change in social structures, including the market of energy, by studying many alternative versions of said structure, seems to be catching on in the literature. I am progressively wrapping my mind around the fact that in my manuscript, the method is more important than the findings. The real value for money of my article seems to reside in the extent to which I can demonstrate the reproducibility and robustness of that method.

Thus, probably for the umpteenth time, I am rephrasing the fundamentals of my approach, and I am trying to fit it into the structure which Applied Energy recommends for articles submitted to their attention. I should open up with an ‘Introduction’, where I sketch the purpose of the paper, as well as the main points of the theoretical background which my paper stems from, although without entering into a detailed study thereof. Then, I should develop the ‘Material and Methods’ section, with the main focus on making my method as reproducible as possible, and then comes the time to develop, respectively, ‘Theory’ and ‘Calculation’, thus elaborating on the theoretical foundations of my research as pitched against literature, and on the detailed computational procedures I used. I guess that I need to distinguish, at this specific point, between the literature pertinent to the substance of my research (Theory), and that oriented on the method of working with empirical data (Calculation).

Those four initial sections – Introduction, Material and Methods, Theory, Calculation – open the topic up, and then comes the time to give it a closure, with, respectively: ‘Results’, ‘Discussion’, and, optionally, a separate ‘Conclusion’. On top of that logical flow, I need to add sections pertinent to ‘Data availability’, ‘Glossary’, and ‘Appendices’. As I get further away from the core and substance of my manuscript, and deeper into peripheral information, I need to address three succinct ways of presenting my research: Highlights, a Graphical Abstract, and a structured cover letter. Highlights are 5 – 6 bullet points, 1 – 2 lines each, sort of an abstract translated into a corporate slide presentation. The Graphical Abstract is a challenge – as I need to present complex ideas in pictographic form – and it is an interesting challenge. The structured cover letter should address the following points:

>> what is the novelty of this work?

>> is the paper appealing to a popular or scientific audience?

>> why the author thinks the paper is important and why the journal should publish it?

>> has the article been checked by an expert native speaker?

>> is the author available as reviewer?

Now, I ask myself fundamental questions. Why should anyone bother about the substance and the method of the research I present in my article? I have noticed, both in public policies and in business strategies, a tendency to formulate completely unrealistic plans, and then to complain about other people not being smart enough to carry those plans out and up to a happy ending. It is very visible in everything related to environmental policies and environmentally friendly strategies in business. Environmental activism consumes itself, very largely, in bashing everyone around for not being diligent enough in saving the planet.

To me, it looks very similar to what I did many times as a person: unrealistic plans, obvious failure which anyone sensible could have predicted, frustration, resentment, practical inefficiency. I did it many times, and, obviously, whole societies are perfectly able to do it collectively. Action is key to success. A good plan is a plan which utilizes and reinforces the skills and capacities I already have, makes those skills into recurrent patterns of action, something like one good thing done per day, whilst clearly defining the skills I need to learn in order to be even more complete and more efficient in what I do. A good public policy, just as a good business strategy, should work in the same way.

When we talk about energy efficiency, or about the transition towards renewable energies, what is our action? Like really, what is the most fundamental thing we do together? Do we purposefully increase energy efficiency, in the first place? Do we deliberately transition to renewables? Yes, and no. Yes, at the end of the day we get those outcomes, and no, what we do on a daily basis is something else. We work. We do business. We study in order to get a job, or to start a business. We live our lives, from day to day, and small outcomes of that daily activity pile up, producing big cumulative change.   

Instead of discussing what we do completely wrong, and thus need to change, it is a good direction to discover what we do well, consistently and with visible learning. That line of action can be reinforced and amplified, with good results. The review of literature so far suggests that research concerning energy and the energy transition is progressively changing direction: from the tendency towards growing complexity and depth of study, dominant until recently, towards translating those complex, in-depth findings into relatively simple decision-making tools for policies and business strategies.

Here comes my method. I think it is important to create an analytical background for policies and business strategies, where we take commonly available empirical data at the macro scale, and use this data to discover the essential, recurrently pursued collective outcomes of a society, in the context of specific social goals. My point and purpose is to nail down a reproducible, relatively simple method of discovering what whole societies are really after. Once again, I think about something simple, which anyone can perform on their computer, with access to the Internet. Nothing of that fancy stuff of social engineering, with personal data collected from unaware folks on Facebook. I want the equivalent of a screwdriver in positive, acceptably fair social engineering.

How do I think I can make a social screwdriver? I start with defining a collective goal we think we should pursue. In the specific case of my research on energy, it is the transition to renewable sources. I nail down my observation of achievement, regarding that goal, with a simple metric, such as the percentage of renewables in total energy consumed (https://data.worldbank.org/indicator/EG.FEC.RNEW.ZS) or in total electricity produced (https://data.worldbank.org/indicator/EG.ELC.RNEW.ZS). I place that metric in the context of other socio-economic variables, such as GDP per capita, average hours worked per person per year, etc. At this point, I make an important assumption as regards the meaning of all the variables I use. I assume that if a lot of humans go to great lengths in measuring something and reporting those measurements, it must be important stuff. I know it sounds simplistic, yet it is fundamental. I assume that quantitative variables used in social sciences represent important aspects of social life, which we do our best to observe and understand. Importance translates as significant connection to the outcomes of our actions.
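In practice, the first step is nothing more than assembling a panel of country-year observations. A minimal sketch of that step could look like the snippet below; the file names are hypothetical placeholders for indicator extracts downloaded beforehand (e.g. from the World Bank or the Penn World Table), each in long format with columns country, year, value:

```python
import pandas as pd

# Hypothetical file names: one indicator per file, long format (country, year, value).
files = {
    "renewables_share": "EG_FEC_RNEW_ZS.csv",   # % of renewables in final energy consumption
    "gdp_per_capita":   "gdp_per_capita.csv",
    "hours_worked":     "avh.csv",               # average hours worked per person per year
}

panel = None
for name, path in files.items():
    df = pd.read_csv(path).rename(columns={"value": name})
    panel = df if panel is None else panel.merge(df, on=["country", "year"])

panel = panel.dropna()      # keep complete country-year observations only
print(panel.describe())     # quick sanity check of the assembled dataset
```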

Quantitative variables which we use in social sciences represent collectively acknowledged outcomes of our collective action. They inform about something we consistently care about, as a society, and, at the same time, something we recurrently produce, as a society. An array of quantitative socio-economic variables represents an imperfect, and yet consistently construed representation of complex social reality.

We essentially care about change. Both individual human nervous systems, and whole cultures, are incredibly accommodative. When we stay in a really strange state long enough to develop adaptive habits, that strange state becomes normal. We pay attention to things that change, whence a further hypothesis of mine that quantitative socio-economic variables, even if arithmetically they are local stationary states, serve us to apprehend gradients of change, at the level of collective, communicable cognition.

If many different variables I study serve to represent, imperfectly but consistently, the process of change in social reality, they might zoom in on the right thing with various degrees of accuracy. Some of them reflect better the kind of change that is really important for us, collectively, whilst some others are just sort of accurate in representing those collectively pursued outcomes. An important assumption pops its head from between the lines of my writing: the bridging between pursued outcomes and important change. We pay attention to change, and some types of change are more important to us than others. Those particularly important changes are, I think, the outcomes we are after. We pay the most attention, both individually and collectively, to phenomena which bring us payoffs, or, conversely, which seriously hamper such payoffs. This is, once again on my path of research, a salute to the Interface Theory of Perception (Hoffman et al. 2015[2]; Fields et al. 2018[3]).

Now, the question is: how to extract orientations, i.e. objectively pursued collective outcomes, from that array of apparently important, structured observations of what is happening to our society? One possible method consists in observing trends and variance over time, and this is what I had very largely done up to a point, and what I always do now, with a fresh dataset, as a way of data mining. In this approach, I generally assume that a combination of relatively strong variance with strong correlation to the remaining metrics makes a particular variable likely to be the driving undertow of the whole social reality represented by the dataset at hand.
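A minimal sketch of that screening heuristic, assuming the panel assembled earlier, could be as simple as ranking variables by their coefficient of variation combined with their average absolute correlation to the rest of the dataset:

```python
import pandas as pd

def screen_variables(panel: pd.DataFrame) -> pd.DataFrame:
    """Rank variables by combining relative variance with average absolute
    correlation to the other variables (the data-mining heuristic described above)."""
    numeric = panel.select_dtypes("number")
    cv = numeric.std() / numeric.mean().abs()            # coefficient of variation
    corr = numeric.corr().abs()
    avg_corr = (corr.sum() - 1.0) / (len(corr) - 1)      # mean |r| with the other variables
    score = cv.rank() + avg_corr.rank()                  # simple, rank-based composite
    return (pd.DataFrame({"cv": cv, "avg_abs_corr": avg_corr, "score": score})
              .sort_values("score", ascending=False))
```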

Still, there is another method, which I focus on in my research, and which consists in treating the empirical dataset as a complex and imperfect representation of the way that collectively intelligent social structures learn by experimenting with many alternative versions of themselves. That general hypothesis leads to building supervised, purposefully biased experiments with that data. Each experiment consists in running the dataset through a specifically skewed neural network – a perceptron – where one variable from the dataset is the output which the perceptron strives to optimize, and the remaining variables make the complex input instrumental to that end. Therefore, each such experiment simulates an artificial situation when one variable is the desired and collectively pursued outcome, with other variables representing gradients of change subservient to that chief value.
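Just to fix the idea, here is a minimal sketch of one such supervised experiment. The architecture below is a simplified stand-in (a single-layer perceptron with hyperbolic tangent activation, the error of estimation fed forward into consecutive rounds), not the exact network used in the manuscript:

```python
import numpy as np

def run_experiment(X: np.ndarray, output_col: int, rounds: int = 3000,
                   seed: int = 1) -> np.ndarray:
    """One supervised experiment: the column `output_col` is the outcome which the
    perceptron strives to optimize, the remaining columns are its input.
    Returns the dataset transformed over consecutive learning rounds."""
    rng = np.random.default_rng(seed)
    data = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))   # normalise to [0, 1]
    n_obs, n_var = data.shape
    inputs = np.delete(np.arange(n_var), output_col)
    w = rng.uniform(-0.1, 0.1, size=len(inputs))

    transformed = data.copy()
    for r in range(rounds):
        i = r % n_obs                                   # cycle through observations
        h = transformed[i, inputs] @ w                  # aggregate input signal
        estimate = np.tanh(h)                           # neural activation
        error = transformed[i, output_col] - estimate   # error of estimation
        w += 0.05 * error * transformed[i, inputs]      # learn from the error
        transformed[(i + 1) % n_obs, inputs] += 0.01 * error   # feed the error forward
    return transformed
```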

When I run such experiments with any dataset, I create as many transformed datasets as there are variables in the game. Both for the original dataset, and for the transformed ones, I can calculate the mean value of each variable, thus construing a vector of mean expected values, and, according to classical statistics, such a vector is representative of the expected state of the dataset in question. I end up with both the original dataset and the transformed ones being tied to the corresponding vectors of mean expected values. It is easy to estimate the Euclidean distance between those vectors, and thus to assess the relative mathematical resemblance between the underlying datasets. Here comes something I discovered more than assumed: those Euclidean distances are very disparate, and some of them are one or two orders of magnitude smaller than all the rest. In other words, some among all the supervised experiments done yield a simulated state of social reality much more similar to the original, empirical one than all the other experiments. This is the methodological discovery which underpins my whole research in this article, and which emerged as pure coincidence, when I was working on a revised version of another paper, titled ‘Energy efficiency as manifestation of collective intelligence in human societies’, which I published with the journal ‘Energy’ (https://doi.org/10.1016/j.energy.2019.116500).
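The distance test itself is a few lines of code. A sketch, assuming the `panel` and the `run_experiment` function from the previous snippets:

```python
import numpy as np

def distance_to_original(X: np.ndarray, transformed_sets: dict) -> dict:
    """Euclidean distance between the vector of mean expected values of the
    original dataset X and that of each single-variable-optimizing clone."""
    mean_original = X.mean(axis=0)
    return {name: float(np.linalg.norm(S.mean(axis=0) - mean_original))
            for name, S in transformed_sets.items()}

# Example of use:
# X = panel.to_numpy(dtype=float)
# X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))   # same scale as the clones
# sets = {col: run_experiment(X, i) for i, col in enumerate(panel.columns)}
# ranking = sorted(distance_to_original(X, sets).items(), key=lambda kv: kv[1])
```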

My guess from there was – and still is – that those supervised experiments have disparate capacity to represent the social reality I study with the given dataset. Experiments which yield mathematical transformations relatively the most similar to the original set of empirical numbers are probably the most representative. Once again, the mathematical structure of the perceptron used in all those experiments is rigorously the same, and what makes the difference is the focus on one particular variable as the output to optimize. In other words, some among the variables studied represent much more plausible collective outputs than others.

I feel a bit lost in my own thinking. Good. It means I have generated enough loose thoughts to put some order in them. It would be much worse if I didn’t have thoughts to put order in. Productive chaos is better than sterile emptiness. Anyway, the reproducible method I want to present and validate in my article ‘Climbing the right hill – an evolutionary approach to the European market of electricity’ aims at discovering the collectively pursued social outcomes, which, in turn, are assumed to be the key drivers of social change, and the path to that discovery leads through the hypothesis that such outcomes are equivalent to a specific gradient of change, which we collectively pay particular attention to in the complex social reality, imperfectly represented with an array of quantitative socio-economic variables. The methodological discovery which I bring forth in that reproducible method is that when any dataset of quantitative socio-economic variables is transformed, with a perceptron, into as many single-variable-optimizing transformations as there are variables in the set, 1 – 3 among those transformations are mathematically much more similar to the original set of observations than all the other thus transformed sets. Consequently, in this method, it is expected to find 1 – 3 variables which represent, much more plausibly than others, the possible orientations, i.e. the collectively pursued outcomes of the society I study with the given empirical dataset.

Ouff! I have finally spat it out. It took some time. The idea needed to ripen, intellectually. As it is ripe, I can harvest.


[1] Rosales-Asensio, E., de la Puente-Gil, Á., García-Moya, F. J., Blanes-Peiró, J., & de Simón-Martín, M. (2020). Decision-making tools for sustainable planning and conceptual framework for the energy–water–food nexus. Energy Reports, 6, 4-15. https://doi.org/10.1016/j.egyr.2020.08.020

[2] Hoffman, D. D., Singh, M., & Prakash, C. (2015). The interface theory of perception. Psychonomic bulletin & review, 22(6), 1480-1506.

[3] Fields, C., Hoffman, D. D., Prakash, C., & Singh, M. (2018). Conscious agent networks: Formal analysis and application to cognition. Cognitive Systems Research, 47, 186-213. https://doi.org/10.1016/j.cogsys.2017.10.003

Still some juice in facts

I am working on improving my manuscript titled ‘Climbing the right hill – an evolutionary approach to the European market of electricity’, after it received an amicable rejection from the journal Applied Energy, and, at the same time, I am working on other stuff. As usual. Some of that other stuff is a completely new method of teaching in the summer semester, sort of a gentle revolution, with glorious prospects ahead, and without guillotines (well, not always).

As for the manuscript, I intend to work in three phases. I restate and reformulate the main lines of the article, and this is phase one. I pass in review the freshest literature in energy economics, as well as in the applications of artificial neural networks therein, and this is phase two. Finally, in phase three, I plan to position my method and my findings vis-à-vis that latest research.

I start phase one. When I want to understand what I wrote about 1 year ago, it is very nearly as if I were trying to understand what someone else wrote. Yes, I function like that. I have pretty good long-term memory, and it is because I learnt to detach emotions from old stuff. I sort of archive my old thoughts in order to make room for the always slightly disquieting waterfall of new thoughts. I need to dig and unearth my past meaning. I use the technique of reverted reading to achieve that. I read written content from its end back upstream to its beginning, and I go back upstream at two levels of structure: the whole piece of text, and individual sentences. In practical terms, when I work with that manuscript of mine, I take the last paragraph of the conclusion, and I actively write it backwards word-wise (I keep proper names unchanged). See for yourself.

This is the original last paragraph: ‘What if economic systems, inclusive of their technological change, optimized themselves so as to satisfy a certain workstyle? The thought seems incongruous, and yet Adam Smith noticed that division of labour, hence the way we work, shapes the way we structure our society. Can we hypothesise that technological change we are witnessing is, most of all, a collectively intelligent adaptation in the view of making a growing mass of humans work in ways they collectively like working? That would revert the Marxist logic, still, the report by World Bank, cited in the beginning of the article, allows such an intellectual adventure. On the path to clarify the concept, it is useful to define the meaning of collective intelligence’.

Now, I write it backwards: ‘Intelligence collective of meaning the define to useful is it concept the clarify to path the on adventure intellectual an such allows article the of beginning the in cited World Bank by report the still logic Marxist the revert that would that. Working like collectively they ways in work humans of mass growing a making view of the in adaptation intelligent collectively a all of most is witnessing are we change technological that hypothesise we can? Society our structure we way the shapes work we way the hence labour of division that noticed Adam Smith yet and incongruous seems thought the workstyle certain a satisfy to as so themselves optimized change technological their of inclusive systems economic if what?

Strange? Certainly, it is strange, as it is information with its pants on its head, and this is precisely why it is informative. The paper is about the market of energy, and my last paragraph of conclusions is about the market of labour, and its connection to the market of energy.

I go further upstream in my writing. The before-last paragraph of conclusions goes like: ‘Since David Ricardo, all the way through the works of Karl Marks, John Maynard Keynes, and those of Kuznets, economic sciences seem to be treating the labour market as easily transformable in response to an otherwise exogenous technological change. It is the assumption that technological change brings greater a productivity, and technology has the capacity to bend social structures. In this view, work means executing instructions coming from the management of business structures. In other words, human labour is supposed to be subservient and executive in relation to technological change. Still, the interaction between technology and society seems to be mutual, rather than unidirectional (Mumford 1964, McKenzie 1984, Kline and Pinch 1996; David 1990, Vincenti 1994). The relation between technological change and the labour market can be restated in the opposite direction. There is a body of literature, which perceives society as an organism, and social change is seen as complex metabolic adaptation of that organism. This channel of research is applied, for example, in order to apprehend energy efficiency of national economies. The so-called MuSIASEM model is an example of that approach, claiming that complex economic and technological change, including transformations in the labour market, can be seen as a collectively intelligent change towards optimal use of energy (see for example: Andreoni 2017 op. cit.; Velasco-Fernández et al 2018 op. cit.). Work can be seen as fundamental human activity, crucial for the management of energy in human societies. The amount of work we perform creates the need for a certain caloric intake, in the form of food, which, in turn, shapes the economic system around, so as to produce that food. This is a looped adaptation, as, on the long run, the system supposed to feed humans at work relies on this very work’.

Here is what comes from reverted writing of mine: ‘Work very this on relies work at humans feed to supposed system the run long the on as adaptation looped a is this food that produce to around system economic the shapes turn in which food of form the in intake caloric certain a for need the creates perform we work of amount the societies human in energy of management the for crucial activity human fundamental as seen be can work. Energy of use optimal towards change intelligent collectively a as seen be can market labour the in transformations including change technological and economic complex that claiming approach that of example an is model MuSIASEM called so the economies national of efficiency energy apprehend to order in example for applied is research of channel this. Organism that of adaptation metabolic complex as seen is change social and organism an as society perceives which literature of body a is there. Direction opposite the in restated be can market labour the and change technological between relation the. Unidirectional than rather mutual be to seems society and technology between interaction the still. Change technological to relation in executive and subservient be to supposed is labour human words other in. Structures social bend to capacity the has technology and productivity a greater brings change technological that assumption the is it. Change technological exogenous otherwise an to response in transformable easily as market labour the treating ne to seem sciences economic Kuznets of those and Keynes […], Marks […] of works the through way the all Ricardo […]’.

Good. I speed up. I am going back upstream through consecutive paragraphs of my manuscript. The chain of 35 ideas which I write here below corresponds to the reverted logical structure (i.e. from the end backstream to the beginning) of my manuscript. Here I go. Ideas listed below have numbers corresponding to their place in the manuscript. The higher the number, the later in the text the given idea is phrased out for the first time.

>> Idea 35: The market of labour, i.e. the way we organize for working, determines the way we use energy.

>> Idea 34: The way we work shapes technological change more than vice versa. Technologies and workstyles interact

>> Idea 33: The labour market offsets the loss of jobs in some sectors by the creation of jobs in other sectors, and thus the labour market accommodates the emergent technological change.

>> Idea 32: The basket of technologies we use determines the ways we use energy. Work in itself is human effort, and that effort is functionally connected to the energy base of our society.

>> Idea 31: Digital technologies seem to have a special function in mediating the connection between technological change and the labour market

>> Idea 30: the number of hours worked per person per year (AVH), the share of labour in the GNI (LABSH), and the indicator of human capital (HC) seem to make an axis of social change, both as input and as output of the collectively intelligent structure.

>> Idea 29: The price index in exports (PL_X) comes as the chief collective goal pursued, and the share of public expenditures in the Gross National Income (CSH_G) appears as the main epistatic driver in that pursuit.

>> Idea 28: The methodological novelty of the article consists in using the capacity of a neural network to produce many variations of itself, and thus to perform evolutionary adaptive walk in rugged landscape.

>> Idea 27: The here-presented methodology assumes: a) tacit coordination b) evolutionary adaptive walk in rugged landscape c) collective intelligence d) observable socio-economic variables are manifestations of the past, coordinated decisions.

>> Idea 26: Variance observable in the average Euclidean distances that each variable has with the remaining 48 ones reflects the capacity of each variable to enter into epistatic interactions with other variables, as the social system studied climbs different hills, i.e. pursues different outcomes to optimize.

>> Idea 25: Coherence: across 48 sets Si out of the 49 generated with the neural network, variances in Euclidean distances between variables are quite even. Only one set Si yields different variances, namely the one pegged on the coefficient of patent applications per 1 million people.

>> Idea 24: the order of phenomenal occurrences in the set X does not have a significant influence on the outcomes of learning.

>> Idea 23: Results of multiple linear regression on natural logarithms of the variables observed are compared to the application of an artificial neural network to the same dataset – to pass in review and to rework – lots of meaning there.

>> Idea 22: the phenomena assumed to be a disturbance, i.e. the discrepancy in retail prices of electricity, as well as the resulting aggregate cash flow, are strongly correlated with many other variables in the dataset. Perhaps the most puzzling is their significant correlation with the absolute number of resident patent applications, and with its coefficient denominated per million of inhabitants. Apparently, the more patent applications in the system, the deeper is that market imperfection.

>> Idea 21: Another puzzling correlation of these variables is the negative one with the variable AVH, or the number of hours worked per person per year. The more an average person works per year, in the given country and year, the less likely this local market is to display harmful differences in the retail prices of electricity for households.

>> Idea 20: On the other hand, variables which we wish to see as systemic – the share of electricity in energy consumption and the share of renewables in the output of electricity – have surprisingly few significant correlations in the dataset studied, just as if they were exogenous stressors with little foothold in the market as yet.

>> Idea 19: None of the four key variables regarding the European market of energy: a) the price fork in the retail market of electricity (€) b) the capital value of cash flow resulting from that price fork (€ mln) c) the share of electricity in energy consumption (%) and d) the share of renewables in electricity output (%) seems to have been generated by a ‘regular’ Gaussian process: they all produce definitely too many outliers for a Gaussian process to be the case.

>> Idea 18: other variables in the dataset, the ‘regulars’ such as GDP or price levels, seem to be distributed quite close to normal, and Gaussian processes can be assumed to work in the background. This is a typical context for evolutionary adaptive walk in rugged landscape. An otherwise stable socio-economic environment gets disturbed by changes in the energy base of the society living in the whereabouts. As new stressors (e.g. the need to switch to electricity, from the direct combustion of fossil fuels) come into the game, some ‘mutant’ social entities stick out of the lot and stimulate an adaptive walk uphill.

>> Idea 17: The formal test of Euclidean distances, according to equation (1), yields a hierarchy of alternative sets Si, as for their similarity to the source empirical set X of m= 300 observations. This hierarchy represents the relative importance of variables, which each corresponding set Si is pegged on.

>> Idea 16: The comparative set XR has been created as a sequence of 10 stacked, pseudo-random permutations of the original set X, assembled into one database. Each permutation consists in sorting the records of the original set X according to a pseudo-random index variable. The resulting set covers m = 3000 phenomenal occurrences.

>> Idea 15: The underlying assumption as regards the collective intelligence of that set is that each country learns separately over the time frame of observation (2008 – 2017), and once one country develops some learning, that experience is being taken and reframed by the next country etc. 

>> Idea 14: we have a market of energy with goals to meet, regarding the local energy mix, and with a significant disturbance in the form of market imperfections

>> Idea 13: special focus on two variables, which the author perceives as crucial for tackling climate change: a) the share of renewable energy in the total output of electricity, and b) the share of electricity in the total consumption of energy.

>> Idea 12: A test for robustness, possible to apply together with this method, is based on a category of algorithms called ‘random forest’.

>> Idea 11: The vector of variances in the xi-specific fitness function V[xi(pj)] across the n sets Si has another methodological role to play: it can serve to assess the interpretative robustness of the whole complex model. If, across neural networks oriented on different outcome variables, the given input variable xi displays a pretty uniform variance in its fitness function V[xi(pj)], the collective intelligence represented in equations (2) – (5) performs its adaptive walk in rugged landscape coherently across all the different hills considered to walk up. Conversely, should all or most variables xi, across different sets Si, display noticeably disparate variances in V[xi(pj)], the network represents a collective intelligence which adapts in a clearly different manner to each specific outcome (i.e. output variable).

>> Idea 10: the mathematical model for this research is composed of 5 main equations, which, at the same time, make up the logical structure of the artificial neural network used for treating empirical data. That structure entails: a) a measure of mathematical similarity between numerical representations of collectively intelligent structure b) the expected state of intelligent structure reverse engineered from the behaviour of the neural network c) neural activation and the error of observation, the latter being material for learning by measurable failure, for the collectively intelligent structure d) transformation of multi-variate empirical data into one number fed into the neural activation function e) a measure of internal coherence in the collectively intelligent structure

>> Idea 9: the more complexity, the more the hyperbolic tangent, based on the expression e^(2h), is driven away from its constant root e^2. Complexity in variables induces greater swings in the hyperbolic tangent, i.e. greater magnitudes of error, and, consequently, longer strides in the process of learning (a toy numerical sketch of this point follows right after this list).

>> Idea 8: Each congruent set Si is produced with the same logical structure of the neural network, i.e. with the same procedure of estimating the value of output variable, valuing the error of estimation, and feeding the error forward into consecutive experimental rounds. This, in turn, represents a hypothetical state of nature, where the social system represented with the set X is oriented on optimizing the given variable xi, which the corresponding set Si is pegged on as its output.

>> Idea 7: complex entities can internalize an external stressor as they perform their adaptive walk. Therefore, observable variance in each variable xi in the set X can be considered as manifestation of such internalization. In other words, observable change in each separate variable can result from the adaptation of social entities observed to some kind of ‘survival imperative’.

>> Idea 6: hypothesis that collectively intelligent adaptation in human societies, regarding the ways of generating and using energy, is instrumental to the optimization of other social traits.    

>> Idea 5: Adaptive walks in rugged landscape consist in overcoming environmental challenges in a process comparable to climbing a hill: it is both an effort and a learning, where each step sets a finite range of possibilities for the next step.

>> Idea 4: the MuSIASEM methodological framework – aggregate use of energy in an economy can be studied as a metabolic process

>> Idea 3: human societies are collectively intelligent about the ways of generating and using energy: each social entity (country, city, region etc.) displays a set of characteristics in that respect

>> Idea 2: adaptive walk of a collective intelligence happens in a very rugged landscape, and the ruggedness of that landscape comes from the complexity of human societies

>> Idea 1: Collective intelligence occurs even in animals as simple neurologically as bees, or even as the Toxo parasite. Collective intelligence means shifting between different levels of coordination.
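The numerical sketch announced under Idea 9, for illustration only. It is a toy demonstration, not code from the manuscript, assuming that a richer set of input variables widens the spread of the aggregate signal h:

```python
import numpy as np

# Idea 9, numerically: the aggregate input h feeds the activation
# tanh(h) = (e^(2h) - 1) / (e^(2h) + 1). More (and more dispersed) input variables
# push |h| further from zero, so the activation swings harder, the error signal
# grows, and the learning strides get longer.
rng = np.random.default_rng(0)
for n_variables in (5, 20, 50):
    # toy aggregate input: spread grows with the number of weighted variables summed
    h = rng.normal(size=10_000, scale=0.1 * np.sqrt(n_variables))
    activation = np.tanh(h)
    print(f"{n_variables:>2} variables -> mean |tanh(h)| = {np.abs(activation).mean():.3f}")
```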

As I look at that thing, namely at what I wrote something like one year ago, I have a doubly comforting feeling. The article seems to make sense from the end to the beginning, and from the beginning to the end. Both logical streams seem coherent and interesting, whilst being slightly different in their intellectual melody. This is the first comfortable feeling. The second is that I still have some meaning, and, therefore, some possible truth, to unearth out of my empirical findings, and this is always a good thing. In science, the view of empirical findings squeezed out of their last bit of meaning, and yet still standing as something potentially significant, is one of the saddest perspectives one can have. Here, there is still some juice in facts. Good.

Second-hand observations


I keep reviewing, upon the request of the International Journal of Energy Sector Management (ISSN 1750-6220), a manuscript entitled ‘Evolutionary Analysis of a Four-dimensional Energy- Economy- Environment Dynamic System’. I have already formulated my first observations on that paper in the last update, ‘I followed my suspects home’, where I mostly focused on studying the theoretical premises of the model used in the paper under review, or rather of a model used in another article, which the paper under review heavily refers to.

As I go through that initial attempt to review this manuscript, I see I was bitching a lot, and this is not nice. I deeply believe in the power of eristic dialogue, and I think that being artful in verbal dispute is different from being destructive. I want to step into the shoes of those authors, technically anonymous to me (although I can guess who they are by their bibliographical references), who wrote ‘Evolutionary Analysis of a Four-dimensional Energy- Economy- Environment Dynamic System’. When I write a scientific paper, my conclusion is essentially what I wanted to say from the very beginning, I just didn’t know how to phrase that s**t out. All the rest, i.e. introduction, mathematical modelling, empirical research – it all serves as a set of strings (duct tape?), which help me attach my thinking to other people’s thinking.

I assume that people who wrote ‘Evolutionary Analysis of a Four-dimensional Energy- Economy- Environment Dynamic System’ are like me. Risky but sensible, as assumptions come. I start from the conclusion of their paper, and I am going to follow upstream. When I follow upstream, I mean business. It is not just about going upstream the chain of paragraphs: it is going upstream the flow of language itself. I am going to present you a technique I use frequently when I really want to extract meaning and inspiration from a piece of writing. I split that writing into short paragraphs, like no more than 10 lines each. I rewrite each such paragraph in inverted syntax, i.e. I rewrite from the last word back to the first one. It gives something like Master Yoda speaking: bullshit avoid shall you. I discovered by myself, and this is backed by the science of generative grammar, that good writing, when read backwards, uncovers something like a second layer of meaning, and that second layer is just as important as the superficial one.
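The mechanical part of that trick is trivially easy to automate. A possible two-line helper (punctuation and proper names still need a manual touch afterwards):

```python
def invert_syntax(paragraph: str) -> str:
    """Rewrite a paragraph from its last word back to the first one."""
    words = paragraph.rstrip(".?!").split()
    return " ".join(reversed(words)).capitalize() + "."

print(invert_syntax("Bullshit avoid shall you."))   # -> "You shall avoid bullshit."
```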

I remember having applied this method to a piece of writing by Deepak Chopra. It was almost wonderful how completely meaningless that text was when read backwards. There was no second layer. Amazing.

Anyway, now I am going to demonstrate the application of that method to the conclusion of ‘Evolutionary Analysis of a Four-dimensional Energy- Economy- Environment Dynamic System’. I present paragraphs of the original text in Times New Roman Italic. I rewrite the same paragraphs in inverted syntax with Times New Roman Bold. Under each such pair ‘original paragraph + inverted syntax’ I write my second-hand observations inspired by those two layers of meaning, and those observations of mine come in plain Times New Roman.

Let’s dance.

Original text: Although increasing the investment in energy reduction can effectively improve the environmental quality; in a relatively short period of time, the improvement of environmental quality is also very obvious; but in the long run, the the influence of the system (1). In this study, the energy, economic and environmental (3E) four-dimensional system model of energy conservation constraints was first established. The Bayesian estimation method is used to correct the environmental quality variables to obtain the environmental quality data needed for the research.

Inverted syntax: Research the for needed data quality environmental the obtain to variables quality environmental the correct to used is method estimation Bayesian the established first was constraints conservation energy of model system dimensional four 3E environmental and economic energy the study this in system the of influence run long the in but obvious very also is quality environmental of improvement the time of period short relatively a in quality environmental the improve effectively can reduction energy in investment the increasing although.

Second-hand observations: The essential logic of using Bayesian methodology is to reduce uncertainty in an otherwise weakly studied field, and to set essential points for orientation. A Bayesian function essentially divides reality into parts, which correspond to, respectively, success and failure.

It is interesting that traits of reality which we see as important – energy, economy and environmental quality – can be interpreted as dimensions of said reality. It corresponds to the Interface Theory of Perception (ITP): it pays off to build a representation of reality based on what we want rather than on what we perceive as absolute truth.    

Original text: In addition, based on the Chinese statistical yearbook data, the Levenberg-Marquardt BP neural network method optimized by genetic algorithm is used to energy, economy and environment under energy conservation constraints. The parameters in the four-dimensional system model are effectively identified. Finally, the system science analysis theory and environment is still deteriorating with the decline in the discount rate of energy reduction inputs.

Inverted syntax: Inputs reduction energy of rate discount the in decline the with deteriorating still is environment and theory analysis science system the finally identified effectively are model system dimensional four the in parameters the constraints conservation energy under environment and economy energy to used is algorithm genetic by optimized method network neural Levenberg-Marquardt Backpropagation the data yearbook statistical Chinese the on based addition in.

Second-hand observations: The strictly empirical part of the article is relatively the least meaningful. The Levenberg-Marquardt BP neural network is made for quick optimization. It is essentially the method of Ordinary Least Squares transformed into a heuristic algorithm, and it can optimize very nearly anything. When using the Levenberg-Marquardt BP neural network we risk overfitting (i.e. hasty conclusions based on a sequence of successes) rather than the inability to optimize. It is almost obvious that – when trained and optimized with a genetic algorithm – the network can set such values in the model which allow stability. It simply means that the model has a set of values that virtually eliminate the need for adjustment between particular arguments, i.e. that the model is mathematically sound. On the other hand, it would be intellectually risky to attach too much importance to the specific values generated by the network. Remark: under the concept of ‘argument’ in the model I mean mathematical expressions of the type: [coefficient]*[parameter]*[local value in variable].

The article conveys an important thesis, namely that the rate of return on investment in environmental improvement is important for sustaining long-term commitment to such improvement.  

Original text: It can be better explained that effective control of the peak arrival time of pollution emissions can be used as an important decision for pollution emission control and energy intensity reduction; Therefore, how to effectively and reasonably control the peak of pollution emissions is of great significance for controlling the stability of Energy, Economy and Environment system under the constraint of energy reduction, regulating energy intensity, improving environmental quality and sustainable development.

Inverted syntax: Development sustainable and quality environmental improving intensity energy regulating reduction energy of constraint the under system environment and economy energy of stability the controlling for significance great of is emissions pollution of peak the control reasonably and effectively to how therefore reduction intensity energy and control emission pollution for decision important an as used be can emissions pollution of time arrival peak the of control effective that explained better be can.

Second-hand observations: This is an interesting logic: we can control the stability of a system by controlling the occurrence of peak states. Incidentally, it is the same logic as that used during the COVID-19 pandemic. If we can control the way that s**t unfolds up to its climax, and if we can make that climax somewhat less steep, we have an overall better control over the whole system.

Original text: As the environmental capacity decreases, over time, the evolution of environmental quality shows an upward trend of fluctuations and fluctuations around a certain central value; when the capacity of the ecosystem falls to the limit, the system collapses. In order to improve the quality of the ecological environment and promote the rapid development of the economy, we need more measures to use more means and technologies to promote stable economic growth and management of the ecological environment.

Inverted syntax: Environment ecological the of management and growth economic stable promote to technologies and means more use to measures more need we economy the of development rapid the promote and environment ecological the of quality the improve to order in collapse system the limit the to falls ecosystem the of capacity the when value central a around fluctuations and fluctuations of trend upward an shows quality environmental of evolution the time over decreases capacity environmental the as.    

Second-hand observations: We can see more of the same logic: controlling a system means avoiding extreme states and staying in a zone of proximal development. As the system reaches the frontier of its capacity, fluctuations amplify and start drawing an upward drift. We don’t want such a drift. The system is the most steerable when it stays in a somewhat mean-reverted state.  

I am summing up that little exercise of style. The authors of ‘Evolutionary Analysis of a Four-dimensional Energy-Economy-Environment Dynamic System’ claim that relations between economy, energy and environment are a complex, self-regulating system, yet the capacity of that system to self-regulate is relatively the most pronounced in some sort of central expected states thereof, and fades as the system moves towards peak states. According to this logic, our relations with ecosystems are always somewhere between homeostasis and critical disaster, and those relations are the most manageable when closer to homeostasis. A predictable, and economically attractive rate of return in investments that contribute to energy savings seems to be an important component of that homeostasis.

The claim in itself is interesting and very largely common-sense, although it goes against some views, that e.g. in order to take care of the ecosystem we should forego economics. Rephrased in business terms, the main thesis of ‘Evolutionary Analysis of a Four-dimensional Energy-Economy-Environment Dynamic System’ is that we can manage that dynamic system as long as it involves project management much more than crisis-management. When the latter prevails, things get out of hand. The real intellectual rabbit hole starts when one considers the method of proving the veracity of that general thesis.  The authors build a model of non-linear connections between volume of pollution, real economic output, environmental quality, and constraint on energy reduction. Non-linear connections mean that output variables of the model –  on the left side of each equation – are rates of change over time in each of the four variables. Output variables in the model are strongly linked, via attractor-like mathematical arguments on the input side, i.e. arguments which combine coefficients strictly speaking with standardized distance from pre-defined peak values in pollution, real economic output, environmental quality, and constraint on energy reduction. In simpler words, the theoretical model built by the authors of ‘Evolutionary Analysis of a Four-dimensional Energy-Economy-Environment Dynamic System’ resembles a spiderweb. It has distant points of attachment, i.e. the peak values, and oscillates between them.

It is interesting how this model demonstrates the cognitive limitations of mathematics. If we are interested in controlling relations between energy, economy, and environment, our first, intuitive approach is to consider these desired outcomes as dimensions of our reality. Yet, those dimensions could be different. If I want to become a top-level basketball player, it does not necessarily mean that social reality is truly pegged on a vector of becoming-a-top-level-basketball-player. Social mechanics might be structured around completely different variables. Still, with strong commitment, this particular strategy might work. Truth is not the same as payoffs from our actions. A model of relations energy-economy-environment pegged on desired outcomes in these fields might be essentially false in ontological terms, yet workable as a tool for policy-making. This approach is validated by the Interface Theory of Perception (see, e.g. Hoffman et al. 2015[1] and Fields et al. 2018[2]).

From the formal-mathematical point of view, the model is construed as a square matrix of complex arguments, i.e. the number of arguments on the right, input side of each equation is the same as the number of equations, whence the possibility to make a Jacobian matrix thereof, and to calculate its eigenvalues. The authors preset the coefficients of the model, and the peak values, so as to test for stability. Testing the model with those preset values demonstrates an essential lack of stability in the such-represented system. Stability is further tested by studying the evolution trajectory of the system. The method of evolution trajectory, in this context, seems to refer to life sciences and the concept of phenotypic trajectory (see e.g. Michael & Dean 2013[3]), and shows that the system, such as modelled, is unstable. Its evolution trajectory can change in an irregular and chaotic way.
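For readers who, like me, prefer to see the mechanics rather than take them on faith, the stability test boils down to linearizing the system at a fixed point and checking the signs of the eigenvalues’ real parts. A generic sketch, not the authors’ actual computation:

```python
import numpy as np

def numerical_jacobian(f, state, eps=1e-6):
    """Finite-difference Jacobian of a vector field f at a given state."""
    state = np.asarray(state, dtype=float)
    n = state.size
    J = np.zeros((n, n))
    f0 = np.asarray(f(state), dtype=float)
    for j in range(n):
        bumped = state.copy()
        bumped[j] += eps
        J[:, j] = (np.asarray(f(bumped), dtype=float) - f0) / eps
    return J

def is_locally_stable(f, equilibrium):
    """A fixed point is locally stable when all eigenvalues of the Jacobian
    have strictly negative real parts."""
    eigenvalues = np.linalg.eigvals(numerical_jacobian(f, equilibrium))
    return bool(np.all(eigenvalues.real < 0)), eigenvalues
```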

In a next step, the authors test their model with empirical data regarding China between 2000 and 2017. They use a Levenberg–Marquardt Backpropagation Network in order to find the parameters of the system. With thus-computed parameters, and the initial state of the system set on data from 1980, the evolution trajectory of the system proves stable, in a multi-cycle mode.

Now, as I have passed in review the logic of ‘Evolutionary Analysis of a Four-dimensional Energy-Economy-Environment Dynamic System’, I start bitching again, i.e. I point at what I perceive as, respectively, strengths and weaknesses of the manuscript. After reading and rereading the paper, I come to the conclusion that the most valuable part thereof is precisely the use of evolution trajectory as theoretical vessel. The value added I can see here consists in representing something complex that we want – we want our ecosystem not to kill us (immediately) and we want our smartphones and our power plants working as well – in a mathematical form, which can be further processed along the lines of evolution trajectory.

That inventive, intellectually far-reaching approach is accompanied, however, by several weaknesses. Firstly, it is an elaborate path leading to common-sense conclusions, namely that managing our relations with the ecosystem is functional as long as it takes the form of economically sound project management, rather than crisis management. The manuscript seems to be more of a spectacular demonstration of method rather than discovery in substance.

Secondly, the model, such as is presented in the manuscript, is practically impossible to decipher without referring to the article Zhao, L., & Otoo, C. O. A. (2019). Stability and Complexity of a Novel Three-Dimensional Environmental Quality Dynamic Evolution System. Complexity, 2019, https://doi.org/10.1155/2019/3941920 . When I say ‘impossible’, it means that the four equations of the model under review are completely cryptic, as the authors do not explain their mathematical notation at all, and one has to look into this Zhao, L., & Otoo, C. O. A. (2019) paper in order to make any sense of it.

After cross-referencing those two papers and the two models, I obtain quite a blurry picture. In Zhao & Otoo (2019) we have a complex, self-regulating system made of three variables: the volume of pollution x(t), real economic output y(t), and environmental quality z(t). The system goes through an economic cycle of k periods, and inside the cycle those three variables reach their respective local maxima and combine into complex apex states. These states are: F = max_k[x(t)], E = max_k[y(t)], H = max_k(∆z/∆y) – i.e. the peak value of the impact of economic growth y(t) on environmental quality z(t) – and P stands for the absolute maximum of pollution possible to absorb by the ecosystem, thus something like P = max(F). With those assumptions in mind, the original model by Zhao & Otoo (2019), which, for the sake of presentational convenience, I will further designate as Model #1, goes like:

d(x)/d(t) = a1*x*[1 – (x/F)] + a2*y*[1 – (y/E)] – a3*z

d(y)/d(t) = -b1*x – b2*y – b3*z

d(z)/d(t) = -c1*x + c2*y*[1 – (y/H)] + c3*z*[(x/P) – 1]
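To make the stability test described above more tangible, here is a minimal sketch in Python of the procedure: encode the right-hand side of Model #1, take a numerical Jacobian at a chosen state, and check the signs of the real parts of its eigenvalues. All numerical values are hypothetical placeholders of mine, not the coefficients preset by the authors.

```python
import numpy as np

# Hypothetical coefficients and peak values; the manuscript presets its own.
a1, a2, a3 = 0.3, 0.2, 0.1
b1, b2, b3 = 0.1, 0.2, 0.1
c1, c2, c3 = 0.1, 0.3, 0.2
F, E, H, P = 5.0, 4.0, 3.0, 7.0

def model_1(state):
    """Right-hand side of Model #1: d(x, y, z)/dt."""
    x, y, z = state
    dx = a1 * x * (1 - x / F) + a2 * y * (1 - y / E) - a3 * z
    dy = -b1 * x - b2 * y - b3 * z
    dz = -c1 * x + c2 * y * (1 - y / H) + c3 * z * (x / P - 1)
    return np.array([dx, dy, dz])

def jacobian(f, state, eps=1e-6):
    """Numerical Jacobian of f at the given state (central differences)."""
    n = len(state)
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        J[:, j] = (f(state + step) - f(state - step)) / (2 * eps)
    return J

state = np.array([1.0, 1.0, 1.0])          # hypothetical reference state
eigenvalues = np.linalg.eigvals(jacobian(model_1, state))
print(eigenvalues)
# Local stability around that state requires all real parts to be negative.
print("locally stable:", bool(np.all(eigenvalues.real < 0)))
```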

The authors of ‘Evolutionary Analysis of a Four-dimensional Energy-Economy-Environment Dynamic System’ present a different model, which they introduce as an extension of that by Zhao, L., & Otoo, C. O. A. (2019). They introduce a 4th variable, namely energy reduction constraints, designated as w(t). There is not a single word about what it exactly means. The first derivatives over time of, respectively, x(t), y(t), z(t), and w(t) play out as in Model #2:

d(x)/d(t) = a1*x*[(y/M) – 1] – a2*y + a3*z + a4*w

d(y)/d(t) = -b1*x + b2*y*[1 – (y/F)] + b3*z*[1 – (z/E)] – b4*w

d(z)/d(t) = c1*x*[(x/N) – 1] – c2*y – c3*z – c4*w

d(w)/d(t) = d1*x – d2*y + d3*z*[1 – (z/H)] + d4*w*[(y/P) – 1]
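Before I go further, a minimal sketch of Model #2, again with purely hypothetical coefficients of my own, may help to see where the two unexplained parameters M and N actually sit in the equations; integrating the system yields the kind of evolution trajectory the authors study.

```python
import numpy as np
from scipy.integrate import solve_ivp

# All values below are hypothetical placeholders; the manuscript presets its
# own coefficients, and never explains what M and N stand for.
a = [0.2, 0.1, 0.1, 0.1]
b = [0.1, 0.3, 0.1, 0.1]
c = [0.1, 0.1, 0.2, 0.1]
d = [0.1, 0.1, 0.1, 0.2]
F, E, H, P, M, N = 5.0, 4.0, 3.0, 7.0, 6.0, 8.0

def model_2(t, state):
    """Right-hand side of Model #2 as written in the manuscript."""
    x, y, z, w = state
    dx = a[0] * x * (y / M - 1) - a[1] * y + a[2] * z + a[3] * w
    dy = -b[0] * x + b[1] * y * (1 - y / F) + b[2] * z * (1 - z / E) - b[3] * w
    dz = c[0] * x * (x / N - 1) - c[1] * y - c[2] * z - c[3] * w
    dw = d[0] * x - d[1] * y + d[2] * z * (1 - z / H) + d[3] * w * (y / P - 1)
    return [dx, dy, dz, dw]

# Evolution trajectory from a hypothetical initial state.
trajectory = solve_ivp(model_2, (0.0, 50.0), [1.0, 1.0, 1.0, 1.0])
print(trajectory.y[:, -1])   # state (x, y, z, w) at the end of the horizon
```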

Now, I have a problem. When the authors of ‘Evolutionary Analysis of a Four-dimensional Energy-Economy-Environment Dynamic System’ present their Model #2 as a simple extension of Model #1, this is simply not true. First of all, Model #2 contains two additional parameters, namely M and N, which are not explained at all in the paper. They are supposed to be plain numbers and that’s it. If I follow bona fide the logic of Model #1, M and N should be some kind of maxima, simple or compound. It is getting interesting. I have a model with two unknown maxima and 4 known ones, and I am supposed to understand the theory behind it. Cool. I like puzzles.

The parameter N is used in the expression [(x/N) – 1]. We divide the local volume of pollution x by N, thus we do x/N, and this is supposed to mean something. If we keep any logic at all, N should be a maximum of pollution, yet we already have two maxima of pollution, namely F and P. As for M, it is used in the expression [(y/M) – 1] and therefore I assume M is a maximum state of real economic output ‘y’. Once again, we already have one maximum in that category, namely ‘E’. Apparently, the authors of Model #2 assume that the volume of pollution x(t) can have three different, meaningful maxima, whilst real economic output y(t) has two of them. I will go back to those maxima further below, when I discuss the so-called ‘attractor expressions’ contained in Model #2.

Second of all, Model #2 follows a very different logic than Model #1. The arbitrarily assigned signs of the coefficients ai, bi, ci and di are different, i.e. minuses replace pluses and vice versa. The attractor expressions of the type [(a/b) – 1] or [1 – (a/b)] are different, too. I am going to dwell on these ones a bit longer, as it is important regarding the methodology of science in general. When I frame a hypothesis like y = a*x1 + b*x2, the coefficients ‘a’ and ‘b’ are neutral in the sense that if x1 > 0, then a*x1 > 0 as well etc. In other words, positive coefficients ‘a’ and ‘b’ do not presuppose any opposition between y, x1, and x2.

On the other hand, when I say y = -a*x1 + b*x2, it is different. Instead of having a coefficient ‘a’, I have a coefficient ‘-a’, thus opposite to ‘y’. If x1 > 0, then -a*x1 < 0 and vice versa. By assigning a negative coefficient to phenomenon x1, I assume it works as a contrary force to phenomenon y. A negative coefficient is already a strong assumption. As I go through all the arbitrarily negative coefficients in Model #2, I can see the following strong claims:

>>> Assumption 1: the rate of change in the volume of pollution d(x)/d(t) is inversely proportional to the real economic output y.

>>> Assumption 2: the rate of change in real economic output d(y)/d(t) is inversely proportional to the volume of pollution x

>>> Assumption 3: the rate of change in real economic output d(y)/d(t) is inversely proportional to energy reduction constraints w.

>>> Assumption 4: the rate of change in environmental quality d(z)/d(t) is inversely proportional to environmental quality z.

>>> Assumption 5: the rate of change in environmental quality d(z)/d(t) is inversely proportional to real economic output y.

>>> Assumption 6: the rate of change in environmental quality d(z)/d(t) is inversely proportional to the volume of pollution x.

>>> Assumption 7: the rate of change in energy reduction constraints d(w)/d(t) is inversely proportional to real economic output y.

These assumptions would greatly benefit from theoretical discussion, as some of them, e.g. Assumption 1 and 2, seem counterintuitive.

Empirical data presented in ‘Evolutionary Analysis of a Four-dimensional Energy-Economy-Environment Dynamic System’ is probably the true soft belly of the whole logical structure unfolding in the manuscript. The authors present that data in standardized form, as constant-base indexes, where values from 1999 are equal to 1. In Table 1 below, I present those standardized values:

Table 1

Year | X – volume of pollution | Y – real economic output | Z – environmental quality | W – energy reduction constraints
2000 | 1,9626 | 1,0455 | 1,1085 | 0,9837
2001 | 3,6786 | 1,1066 | 1,2228 | 0,9595
2002 | 2,2791 | 1,2064 | 1,3482 | 0,9347
2003 | 1,2699 | 1,402 | 1,5283 | 0,8747
2004 | 2,3033 | 1,6382 | 1,8062 | 0,7741
2005 | 1,9352 | 1,8594 | 2,0813 | 0,7242
2006 | 2,0778 | 2,038 | 2,4509 | 0,6403
2007 | 3,9437 | 2,2156 | 3,0307 | 0,5455
2008 | 6,5583 | 2,2808 | 3,5976 | 0,4752
2009 | 3,283 | 2,3912 | 3,8997 | 0,4061
2010 | 3,3307 | 2,5656 | 4,602 | 0,3693
2011 | 3,6871 | 2,7534 | 5,4243 | 0,3565
2012 | 4,013 | 2,8608 | 6,0326 | 0,321
2013 | 3,9274 | 2,9659 | 6,6068 | 0,2817
2014 | 4,2167 | 3,0292 | 7,2151 | 0,2575
2015 | 4,2893 | 3,0583 | 7,6813 | 0,2322
2016 | 4,5925 | 3,1004 | 8,2872 | 0,2159
2017 | 4,6121 | 3,1942 | 9,2297 | 0,2147

I found several problems with that data, and they sum up to one conclusion: it is practically impossible to check its veracity. The time series of real economic output seems to correspond to some kind of constant-price measurement of the aggregate GDP of China, yet it does not fit the corresponding time series published by the World Bank (https://data.worldbank.org/indicator/NY.GDP.MKTP.KD ). Metrics such as ‘environmental quality’ (z) or ‘energy reduction constraints’ (w) are completely cryptic. Probably, they are some sort of compound indices, and their structure in itself requires explanation.

There seems to be a logical loop between the theoretical model presented in the beginning of the manuscript, and the way that data is presented. The model presents an important weakness as regards functional relations inside arguments based on peak values, such as ‘y/M’ or ‘y/P’. The authors very freely put metric tons of pollution in fractional relation with units of real output etc. This is theoretical freestyle, which might be justified, yet requires thorough explanation and references to literature. Given the form the data is presented in, a suspicion arises, namely that standardization, i.e. having driven all data to the same denomination, opened the door to those strange, cross-functional arguments. It is worth remembering that even when standardized through a common denomination, distinct phenomena remain distinct. A mathematical trick is not the same as ontological identity.

Validation of the model with a Levenberg–Marquardt Backpropagation Network raises doubts as well. This specific network, prone to overfitting, is essentially a tool for quick optimization in a system which we otherwise thoroughly understand. It is the good old method of Ordinary Least Squares translated into a sequence of heuristic steps. The LM-BN network does not discover anything about the system at hand; it just optimizes it as quickly as possible.
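To illustrate what the Levenberg–Marquardt algorithm actually does, here is a toy sketch, not the authors’ procedure: classical LM via scipy, fitted to a pre-assumed logistic curve with synthetic, made-up data. The point is that LM merely tunes the parameters of a structure we have already decided on; it discovers nothing about that structure.

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic, made-up data generated from a logistic curve plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 50)

def logistic(params, t):
    K, r, t0 = params
    return K / (1 + np.exp(-r * (t - t0)))

true_params = (4.0, 1.2, 5.0)                       # hypothetical 'reality'
observed = logistic(true_params, t) + rng.normal(0, 0.05, t.size)

def residuals(params):
    return logistic(params, t) - observed

# Levenberg–Marquardt: quick minimization of residuals for a known structure.
fit = least_squares(residuals, x0=[1.0, 1.0, 1.0], method='lm')
print(fit.x)   # parameters recovered by the optimizer
```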

In a larger perspective, using a neural network to validate a model implies an important assumption, namely that consecutive states of the system form a Markov chain, i.e. each consecutive state is exclusively the outcome of the preceding state. It is worth remembering that neural networks in general are artificial intelligence, and intelligence, in short, means that we figure out what to do when we have no clue what to do, and we do it without any providential, external guidelines. The model presented by the authors clearly pegs the system on hypothetical peak values. These are exogenous to all but one possible state of the system, whence a logical contradiction between the model and the method of its validation.

Good. After some praising and some bitching, I can assess the manuscript by answering standard questions asked by the editor of the International Journal of Energy Sector Management (ISSN 1750-6220).

  1. Originality: Does the paper contain new and significant information adequate to justify publication?

The paper presents a methodological novelty, i.e. the use of evolution trajectory as a method to study complex social-environmental systems, and this novelty deserves to be put in the spotlight even more than it is in the present version of the paper. Still, the substantive conclusions of the paper do not seem original at all.

  2. Relationship to Literature: Does the paper demonstrate an adequate understanding of the relevant literature in the field and cite an appropriate range of literature sources? Is any significant work ignored?

The paper presents important weaknesses as regards bibliographical referencing. First of all, there are clear theoretical gaps as regards the macroeconomic aspect of the model presented, and as regards the nature and proper interpretation of the empirical data used for validating the model. More abundant references in these two fields would be welcome, if not necessary.

Second of all, the model presented by the authors is practically impossible to understand formally without reading another, referenced paper (Zhao & Otoo 2019), which is a case of excessive referencing. The paper should present its theory in a complete way.

  3. Methodology: Is the paper’s argument built on an appropriate base of theory, concepts, or other ideas? Has the research or equivalent intellectual work on which the paper is based been well designed? Are the methods employed appropriate?

The paper combines a very interesting methodological approach, i.e. the formulation of complex systems in a way that makes them treatable with the method of evolution trajectory, with clear methodological weaknesses. As for the latter, three main questions emerge. Firstly, it seems methodologically incorrect to construct the cross-functional attractor arguments, where distinct phenomena are denominated one over the other. Secondly, the use of the LM-BN network as a tool for validating the model is highly dubious. This specific network is made for quick optimization of something we understand, and not for discovery inside something we barely understand.

Thirdly, the use of a neural network of any kind implies assuming that consecutive states of the system form a Markov chain, which is logically impossible with exogenous peak-values preset in the model.

  4. Results: Are results presented clearly and analysed appropriately? Do the conclusions adequately tie together the other elements of the paper?

The results are clear, yet their meaning seems not to be fully understood. Coefficients calculated via a neural network represent one plausibly possible state of the system. When the authors conclude that the results so obtained, combined with the state of the system in 1980, demonstrate the stability of the real system, the inference seems really stretched in empirical terms.

  5. Implications for research, practice and/or society: Does the paper identify clearly any implications for research, practice and/or society? Does the paper bridge the gap between theory and practice? How can the research be used in practice (economic and commercial impact), in teaching, to influence public policy, in research (contributing to the body of knowledge)? What is the impact upon society (influencing public attitudes, affecting quality of life)? Are these implications consistent with the findings and conclusions of the paper?

The paper presents more implications for research than for society. As stated before, the substantive conclusions of the paper boil down to common-sense claims, i.e. that it is better to keep the system stable rather than unstable. On the other hand, some aspects of the method used, i.e. the application of evolution trajectory, seem very promising for the future. The paper seems to create abundant, interesting openings for future research rather than practical applications for now.

  6. Quality of Communication: Does the paper clearly express its case, measured against the technical language of the field and the expected knowledge of the journal’s readership? Has attention been paid to the clarity of expression and readability, such as sentence structure, jargon use, acronyms, etc.?

The quality of communication is a serious weakness in this case. The above-mentioned excessive reliance on Zhao, L., & Otoo, C. O. A. (2019). Stability and Complexity of a Novel Three-Dimensional Environmental Quality Dynamic Evolution System. Complexity, 2019, https://doi.org/10.1155/2019/3941920 is one point. The flow of logic is another. When, for example, the authors suddenly claim (page 8, top): ‘In this study we set …’, there is no explanation why and on what theoretical grounds they do so.

Clarity and correct phrasing are clearly lacking as regards the whole macroeconomic aspect of the paper. It is truly hard to understand what the authors mean by ‘economic growth’.

Finally, some sentences are clearly ungrammatical, e.g. (page 6, bottom): ‘By the system (1) can be launched energy intensity […]’.

Good. Now, you can see what a scientific review looks like. I hope it was useful. Discover Social Sciences is a scientific blog, which I, Krzysztof Wasniewski, individually write and manage. If you enjoy the content I create, you can choose to support my work, with a symbolic $1, or whatever other amount you please, via MY PAYPAL ACCOUNT.  What you will contribute to will be almost exactly what you can read now. I have been blogging since 2017, and I think I have a pretty clearly rounded style.

At the bottom of the sidebar on the main page, you can access the archives of that blog, all the way back to August 2017. You can get an idea of how I work, what I work on, and how my writing has evolved. If you like social sciences served in this specific sauce, I will be grateful for your support to my research and writing.

‘Discover Social Sciences’ is a continuous endeavour and is mostly made of my personal energy and work. There are minor expenses, to cover the current costs of maintaining the website, or to collect data, yet I want to be honest: by supporting ‘Discover Social Sciences’, you will be mostly supporting my continuous stream of writing and online publishing. As you read through the stream of my updates on https://discoversocialsciences.com , you can see that I usually write 1 – 3 updates a week, and this is the pace of writing that you can expect from me.

Besides the continuous stream of writing which I provide to my readers, there are some more durable takeaways. One of them is an e-book which I published in 2017, ‘Capitalism And Political Power’. Normally, it is available with the publisher, the Scholar publishing house (https://scholar.com.pl/en/economics/1703-capitalism-and-political-power.html?search_query=Wasniewski&results=2 ). Via https://discoversocialsciences.com , you can download that e-book for free.

Another takeaway you can be interested in is ‘The Business Planning Calculator’, an Excel-based, simple tool for financial calculations needed when building a business plan.

Both the e-book and the calculator are available via links in the top right corner of the main page on https://discoversocialsciences.com .



[1] Hoffman, D. D., Singh, M., & Prakash, C. (2015). The interface theory of perception. Psychonomic bulletin & review, 22(6), 1480-1506.

[2] Fields, C., Hoffman, D. D., Prakash, C., & Singh, M. (2018). Conscious agent networks: Formal analysis and application to cognition. Cognitive Systems Research, 47, 186-213. https://doi.org/10.1016/j.cogsys.2017.10.003

[3] Collyer, M. L., & Adams, D. C. (2013). Phenotypic trajectory analysis: comparison of shape change patterns in evolution and ecology. Hystrix, the Italian Journal of Mammalogy, 24(1). https://doi.org/10.4404/hystrix-24.1-6298

Social roles and pathogens: our average civilisation

MY EDITORIAL ON YOU TUBE

I am starting this update with a bit of a winddown on my previous excitement, expressed in Demographic anomalies – the puzzle of urban density. I was excited about the apparently mind-blowing, negative correlation of ranks between the relative density of urban population, on the one hand, and the consumption of energy per capita, on the other hand. Apparently, the lower the rank of the {[DU/DG] [Density of urban population / General density of population]} coefficient, the greater the consumption of energy per capita. All in all, it is not as mysterious as I thought. It is visible that the average value of the [DU/DG] coefficient decreases with the level of socio-economic development. In upper-middle income countries, and in the high-income ones, [DU/DG] stays consistently below 10, whilst in poor countries it can even flirt with values above 100. In other words, relatively greater national wealth is associated with a relatively smaller social difference between cities and the countryside. Still, that shrinking difference seems to have a ceiling around [DU/DG] = 2,00. In the realm of [DU/DG] < 2,00, we do not really encounter wealthy countries. In this category we have tropical island states, or entities such as West Bank and Gaza, which are demographic anomalies even against the background of cities in general being demographic anomalies. Among really wealthy countries, the lowest values of the [DU/DG] coefficient are to be found in Belgium (2,39) and the Netherlands (2,30).

I am taking it from the beginning, ‘it’ being the issue of cities and urbanisation. The beginning was my bewilderment when the COVID-19-related lockdowns started in my country, i.e. in Poland. I remember cycling through the post-apocalyptically empty streets of my hometown, Krakow. I was turning in my mind the news regarding the adverse economic outcomes of the lockdown, and strange questions were popping up in my consciousness. How many human footsteps per day does a city need to thrive? How many face-to-face interactions between people do we need, to keep that city working?

I had that sudden realization that city life is all about the intensity of human interaction. It reminded me of another realization, which I had experienced in November 2017. I was on a plane that had just taken off from the giant Frankfurt airport. It was a short flight, to Lyon, France – almost like a ballistic curve – and this is probably why the plane was gathering altitude very gently. I could see the land beneath, and I marvelled at the slightly pulsating, intricate streaks of light, down there, on the ground. It took me a few minutes to realize that the lights I was admiring were those of vehicles trapped in the gargantuan traffic jams, typical for the whole region of Frankfurt. A massively recurrent, utterly unpleasant, individual happening – being stuck in a traffic jam – was producing outstanding beauty, when contemplated from far above.

As I rummaged a bit through literature, cities seem to have been invented, back in the day, as social contrivances allowing, on the one hand, relatively peaceful coexistence of many different ethnic groups in fertile lowlands, and, on the other hand, a clear focusing of demographic growth in limited areas, whilst leaving the majority of arable land to the production of food. With time, the unusually high density of population in cities started generating secondary and tertiary effects. Greater density of population favours accelerated emergence of new social roles, which, in turn, stimulates technological change and the development of markets. Thus, initially, cities tend to differentiate sharply from the surrounding countryside. By doing so, they become a powerful force of creation as regards the aggregate income of the social group. When this income-generating force concurs, hopefully, with acceptably favourable natural conditions and with political stability, the whole place (i.e. country or region) starts getting posh, and, as it does so, the relative disparity between cities and the countryside starts to diminish, down to some kind of no-go-further threshold, where urban populations are a little over twice as dense as the general average of the country. In other words, cities are a demographic anomaly which alleviates social tensions, and allows social change through personal individuation and technological change, and this anomaly starts dissolving itself as soon as those secondary and tertiary outcomes really kick in.

In the presence of that multi-layer cognitive dissonance, I am doing what I frequently do, i.e. in a squid-like manner I produce a cloud of ink. Well, metaphorically: it is more of a digital ink. As I start making myself comfortable inside that cloud, axes of coordinates emerge. One of them is human coordination in cities, and a relatively young, interesting avenue of research labelled ‘social neuroscience’. As digital imaging of neural processes has been gaining ground as an empirical method of investigation, interesting openings emerge. I am undertaking a short review of literature in the field of social neuroscience, in order to understand better the link between us, humans, being socially dense, and us doing other interesting things, e.g. inventing quantum physics or publishing the ‘Vogue’ magazine.

I am comparing literature from 2010 with the most recent one, like 2018 and 2019. I snatched almost the entire volume 65 of the ‘Neuron’ journal from March 25, 2010, and I passed in review articles pertinent to social neuroscience. Pascal Belin and Marie-Helene Grosbras (2010[1]) discuss the implications of research on voice cognition in infants. Neurologically, the capacity to recognize voice, i.e. to identify people by their voices, emerges long before the capacity to process verbal communication. Apparently, the period stretching from the 3rd month of life through the 7th month is critical for the development of voice cognition in infants. During that time, babies learn to be sharper observers of voices than of other ambient sounds. Cerebral processing of voice seems to be largely subcortical and connected to our perception of time. In other words, when we joke that city people cannot distinguish the voices of birds but can overhear gossip in a social situation, it is fundamentally true. From the standpoint of my research it means that dense social interaction in cities has a deep neurological impact on people already in their infancy. I assume that the denser a population is, the more different human voices a baby is likely to hear, and learn to discriminate, during that 3rd ÷ 7th month phase of learning voice cognition. The greater the density of population, the greater the data input for the development of this specific function in our brain. The greater the difference between the city and the countryside, social-density-wise, the greater the developmental difference between infant brains as regards voice cognition.

From the specific I pass to the general, and to a review article by Ralph Adolphs (2010[2]). One of the most interesting takeaways from this article is a strongly corroborated thesis that social neurophysiology (i.e. the way that our brain works in different social contexts) goes two ways: our neuro-wiring predisposes us to some specific patterns of social behaviour, and yet specific social contexts can make us switch between neurophysiological patterns. That could mean that every mentally healthy human is neurologically wired for being both a city slicker and a rural being. Depending on the context we are in, the corresponding neurophysiological protocol kicks in. Long-lasting urbanization privileges social learning around ‘urban’ neurophysiological patterns, and therefore cities may have triggered a specific evolutionary direction in our species.

I found an interesting, slightly older paper on risk-taking behaviour in adolescents (Steinberg 2008[3]). It is interesting because it shows connections between developmental changes in the brain, and the appetite for risk. Risk-taking behaviour is like a fast lane of learning. We take risks when and to the extent that we can tolerate both high uncertainty and high emotional tension in a specific context. Adolescents take risks in order to boost their position in social hierarchy and that seems to be a truly adolescent behaviour from the neurophysiological point of view. Neurophysiological adults, thus, roughly speaking, people over the age of 25, seem to develop increasing preference for strategies of social advancement based on long-term, planned action with clearly delayed rewards. Apparently, there are two distinct, neurophysiological protocols – the adolescent one and the adult one – as regards the quest for individual social role, and the learning which that role requires.

Cities allow more interactions between adolescents than the countryside does. More interactions between adolescents mean stronger reinforcement for social-role-building strategies based on short-term reward acquired at the price of high risk. That might be the reason why in the modern society, which, for want of a better term, we call ‘consumer society’, there is such a push towards quick professional careers. The fascinating part is that in a social environment rich in adolescent social interaction, the adolescent pattern of social learning, based on risk taking for quick reward, finds itself prolonged deep into people’s 40s or even 50s.

We probably all know those situations, when we look for something valuable in a place where we can reasonably expect to find valuable things, yet the search is not really successful. Then, all of a sudden, just next door to that well-reputed location, we find true jewels of value. I experienced it with books, and with people as well. So is the case here, with social neuroscience. As long as I was typing ‘social neuroscience’ in the search interfaces of scientific repositories, more or less the same essential content kept coming to the surface. As my internal curious ape was getting bored, it started dropping side-keywords into the search, like ‘serotonin’ and ‘oxytocin’, thus the names of hormonal neurotransmitters in us, humans, which are reputed to be abundantly entangled with our social life. The keyword ‘Serotonin’ led me to a series of articles on the possibilities of treating and curing neurodevelopmental deficits in adults. Not obviously linked to cities and urban life? Look again, carefully. Cities allow the making of science. Science allows treating neurodevelopmental deficits in adults. Logically, developing the type of social structure called ‘cities’ allows our species to regulate our own neurophysiological development beyond the blueprint of our DNA, and the early engram of infant development (see, for example: Ehninger et al. 2008[4]; Bavelier at al. 2010[5]).

When I searched under ‘oxytocin’, I found a few papers focused on the fascinating subject of epigenetics. This is a novel trend in biology in general, based on the discovery that our DNA has many alternative ways of expressing itself, depending on environmental stimulation. In other words, the same genotype can produce many alternative phenotypes, through different expressions of coding genes, and the phenotype produced depends on environmental factors (see, e.g. Day & Sweatt 2011[6]; Sweatt 2013[7]). It is a fascinating question: to what extent urban environment can trigger a specific phenotypical expression of our human genotype?

A tentative synthesis regarding the social neuroscience of urban life leads me to develop on the following thread: we, humans, have a repertoire of alternative behavioural algorithms pre-programmed in our central nervous system, and, apparently, at some biologically very primal level, a repertoire of different phenotypical expressions of our genotype. Urban environments are likely to trigger some of those alternative patterns. Appetite for risk, combined with quick learning of social competences, in an adolescent-like mode, seems to be one such orientation, socially reinforced in cities.

All that neuroscience leads me to take, once again, a behavioural angle of approach to my hypothesis on the connection between the development of cities and technological change, all that dipped in the sauce of ‘What is going to happen due to COVID-19?’. Reminder for those readers who just start to follow this thread: I hypothesise that, as COVID-19 hits mostly in densely populated urban areas, we will probably change our way of life in cities. I want to understand how exactly it can possibly happen. When the pandemic became sort of official, I had a crazy idea: what if I represented all social change as a case of interacting epidemics? I noticed that SARS-Cov-2 gives a real boost to some technologies and behaviours, whilst others are being pushed aside. Certain types of medical equipment, ethylic alcohol (as disinfectant!), online communication, express delivery services – all that stuff just boomed. There were even local speculative bubbles in the stock market, around the stock of medical companies. In my own investment portfolio, I earned 190% in two weeks, on the stock of a few Polish biotechs, and it could have been 400%, had I played it better.

Another pattern of collective behaviour that SARS-Cov-2 has clearly developed is acceptance of authoritarian governance. Well, yes, folks. Those special ‘epidemic’ regimes most of us live under, right now, are totalitarian governance by instalments, in the presence of a pathogen, which, statistically, is less dangerous than driving one’s own car. There is quite convincing scientific evidence that prevalence of pathogens makes people much more favourable to authoritarian policies in their governments (see for example: Cashdan & Steele 2013[8]; Murray, Schaller & Suedfeld 2013[9]).    

On the other hand, there are social activities and technologies, which SARS-Cov-2 is adverse to: restaurants, hotels, air travel, everything connected to mass events and live performative arts. The retail industry is largely taken down by this pandemic, too: see the reports by IDC, PwC, and Deloitte. As for behavioural patterns, the adolescent-like pattern of quick social learning with a lot of risk taking, which I described a few paragraphs earlier, is likely to be severely limited in a pandemic-threatened environment.

Anyway, I am taking that crazy intellectual stance where everything that makes our civilisation is the outcome of epidemic spread in technologies and behavioural patterns, which can be disrupted by the epidemic spread of some real s**t, such as a virus. I had a look at what people smarter than me have written on the topic (Méndez, Campos & Horsthemke 2012[10]; Otunuga 2019[11]), and a mathematical model starts emerging.

I define a set SR = {sr1, sr2, …, srm} of ‘m’ social roles, defined as combinations of technologies and behavioural patterns. On the other hand, there is a set of ‘k’ pathogens PT = {pt1, pt2, …, ptk}. Social roles are essentially idiosyncratic and individual, yet they are prone to imperfect imitation from person to person, consistently with what I wrote in ‘City slickers, or the illusion of standardized social roles’. Types of social roles spread epidemically through civilization just as a pathogen would. Now, an important methodological note is due: epidemic spread means diffusion by contact. Anything spreads epidemically when some form of contact from human to human is necessary for that thing to jump. We are talking about a broad spectrum of interactions. We can pass a virus by touching each other or by using the same enclosed space. We can contaminate another person with a social role by hanging out with them or by sharing the same online platform.

Any epidemic spread – would it be a social role sri in the set SR or a pathogen ptj – happens in a population composed of three subsets of individuals: subset I of infected people, the S subset of people susceptible to infection, and subset R of the immune ones. In the initial phase of epidemic spread, at the moment t0, everyone is potentially susceptible to catch whatever there is to catch, i.e. subset S is equal to the overall headcount of population N, whilst I and R are completely or virtually non-existent. I write it mathematically as I(t0) = 0, R(t0) = 0, S(t0) = N(t0).

The processes of infection, recovery, and acquisition of immune resistance are characterized by 5 essential parameters: a) the rate β of transmission from person to person, b) the recruitment rate Λ from the general population N to the susceptible subset S, c) the rate μ of natural death, d) the rate γ of temporary recovery, and e) the rate ψ of manifestation of immune resistance. The rates γ and ψ can be correlated, although they don’t have to be. Immune resistance can be the outcome of recovery or can be attributable to exogenous factors.

Over a timeline made of z temporal checkpoints (periods), some people get infected, i.e. they contract the new virus in fashion, or they buy into being an online influencer. This is the flow from S to I. Some people manifest immunity to infection: they pass from S to R. Both immune resistance and infection can have various outcomes. Infected people can heal and develop immunity, they can die, or they can return to being susceptible. Changes in S, I, and R over time – thus, respectively, dS/dt, dI/dt, and dR/dt – can be described with the following equations:

Equation [I] [Development of susceptibility]: dS/dt = Λ – βSI – μS + γI

Equation [II] [Infection]: dI/dt = βSI – (μ + γ)I

Equation [III] [Development of immune resistance]: dR/dt = ψS(t0) = ψN
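Here is a minimal numerical sketch of equations [I] – [III], with purely hypothetical parameter values, integrated with a simple Euler scheme; the same code works whether the thing that spreads is a pathogen ptj or a social role sri. One initial ‘infection’ is seeded, otherwise nothing would spread.

```python
# Hypothetical parameters for equations [I] - [III]; none of these values
# comes from empirical data.
Lambda, beta, mu, gamma, psi = 0.02, 0.0005, 0.01, 0.05, 0.001
N0 = 1000.0                  # initial population, S(t0) = N
S, I, R = N0, 1.0, 0.0       # one infected individual seeds the process
dt, steps = 0.1, 2000

for _ in range(steps):
    dS = Lambda - beta * S * I - mu * S + gamma * I      # equation [I]
    dI = beta * S * I - (mu + gamma) * I                 # equation [II]
    dR = psi * N0                                        # equation [III]
    S, I, R = S + dS * dt, I + dI * dt, R + dR * dt

print(S, I, R)   # the three compartments at the end of the timeline
```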

We remember that equations [I], [II], and [III] can apply both to pathogens and new social roles. Therefore, we can have a social role sri spreading at dS(sri)/dt, dI(sri)/dt, and dR(sri)/dt, whilst some micro-beast ptj is minding its own business at dS(ptj)/dt, dI(ptj)/dt, and dR(ptj)/dt.

Any given civilization – ours, for example – experiments with the prevalence of different social roles sri in the presence of known pathogens ptj. Experimentation occurs in the form of producing many alternative, local instances of civilization, each based on a general structure. The general structure assumes that a given pace of infection with social roles dI(sri)/dt coexists with a given pace of infection with pathogens dI(ptj)/dt.

I further assume that ε stands for the relative prevalence of anything (i.e. the empirically observed frequency of happening), social role or pathogen. A desired outcome O is being collectively pursued, and e represents the gap between that desired outcome and reality. Our average civilization can be represented as:

Equation [IV] [things that happen]: h = {dI(sr1)/dt}*ε(sr1) + {dI(sr2)/dt}*ε(sr2) + … + {dI(srm)/dt}*ε(srm) + {dI(ptj)/dt}*ε(ptj)

Equation [V] [evaluation of the things that happen]: e = O – [(e^(2h) – 1)/(e^(2h) + 1)]*{1 – [(e^(2h) – 1)/(e^(2h) + 1)]}^2

In equation [V] I used a neural activation function, the hyperbolic tangent, which you can find discussed more in depth, in the context of collective intelligence, in my article on energy efficiency. Essentially, the more social roles there are in the game, in equation [IV], the broader the amplitude of error in equation [V] will be, when error is produced with the hyperbolic tangent. In other words, the more complex our civilization is, the more it can freak out in the presence of a new risk factor, such as a pathogen. It is possible, at least in theory, to reach a level of complexity where the introduction of a new pathogen, such as SARS-Cov-2, makes the error explode into so high a register that social learning either takes a crash trajectory and aims at revolution, or slows down dramatically.
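A quick numerical sketch of equations [IV] and [V], with made-up prevalences and infection rates, just to show the mechanics of the hyperbolic tangent in the evaluation step; O and all the input numbers below are hypothetical.

```python
import numpy as np

# Hypothetical rates of infection dI/dt for three social roles and one pathogen,
# and their empirically observed prevalences epsilon; all numbers are made up.
infection_rates = np.array([0.04, 0.10, 0.07, 0.25])
prevalence      = np.array([0.30, 0.50, 0.20, 0.15])
O = 0.5                                               # desired outcome

h = float(np.dot(infection_rates, prevalence))        # equation [IV]
tanh_h = (np.exp(2 * h) - 1) / (np.exp(2 * h) + 1)    # same as np.tanh(h)
e = O - tanh_h * (1 - tanh_h) ** 2                    # equation [V]
print(h, e)
```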

The basic idea of our civilization experimenting with itself is that each actual state of things according to equation [IV] produces some error in equation [V], and we can produce social change by utilizing this error and learning how to minimize it.


[1] Belin, P., & Grosbras, M. H. (2010). Before speech: cerebral voice processing in infants. Neuron, 65(6), 733-735. https://doi.org/10.1016/j.neuron.2010.03.018

[2] Adolphs, R. (2010). Conceptual challenges and directions for social neuroscience. Neuron, 65(6), 752-767. https://doi.org/10.1016/j.neuron.2010.03.006

[3] Steinberg, L. (2008). A social neuroscience perspective on adolescent risk-taking. Developmental review, 28(1), 78-106. https://dx.doi.org/10.1016%2Fj.dr.2007.08.002

[4] Ehninger, D., Li, W., Fox, K., Stryker, M. P., & Silva, A. J. (2008). Reversing neurodevelopmental disorders in adults. Neuron, 60(6), 950-960. https://doi.org/10.1016/j.neuron.2008.12.007

[5] Bavelier, D., Levi, D. M., Li, R. W., Dan, Y., & Hensch, T. K. (2010). Removing brakes on adult brain plasticity: from molecular to behavioral interventions. Journal of Neuroscience, 30(45), 14964-14971. https://www.jneurosci.org/content/jneuro/30/45/14964.full.pdf

[6] Day, J. J., & Sweatt, J. D. (2011). Epigenetic mechanisms in cognition. Neuron, 70(5), 813-829. https://doi.org/10.1016/j.neuron.2011.05.019

[7] Sweatt, J. D. (2013). The emerging field of neuroepigenetics. Neuron, 80(3), 624-632. https://doi.org/10.1016/j.neuron.2013.10.023

[8] Cashdan, E., & Steele, M. (2013). Pathogen prevalence, group bias, and collectivism in the standard cross-cultural sample. Human Nature, 24(1), 59-75. https://doi.org/10.1007/s12110-012-9159-3

[9] Murray, D. R., Schaller, M., & Suedfeld, P. (2013). Pathogens and politics: Further evidence that parasite prevalence predicts authoritarianism. PLoS ONE, 8(5), e62275. https://doi.org/10.1371/journal.pone.0062275

[10] Méndez, V., Campos, D., & Horsthemke, W. (2012). Stochastic fluctuations of the transmission rate in the susceptible-infected-susceptible epidemic model. Physical Review E, 86(1), 011919. http://dx.doi.org/10.1103/PhysRevE.86.011919

[11] Otunuga, O. M. (2019). Closed-form probability distribution of number of infections at a given time in a stochastic SIS epidemic model. Heliyon, 5(9), e02499. https://doi.org/10.1016/j.heliyon.2019.e02499

Demographic anomalies – the puzzle of urban density

MY EDITORIAL ON YOU TUBE

I am returning to one particular topic connected to my hypothesis, stating that the technological change that has been going on in our civilisation at least since 1960 is oriented on increasing urbanization of humanity, and more specifically on an effective, rigid partition between urban areas and rural ones. I am returning to a specific metric, namely to the DENSITY OF URBAN POPULATION, which I calculated myself on the basis of three component datasets from the World Bank, namely: i) the percentage of the general population living in cities, AKA the coefficient of urbanization, ii) the general headcount of population, and iii) the total urban land area. I multiply the coefficient of urbanization by the general headcount of population, and thus I get the total number of people living in cities. In the next step, I divide that headcount of urban population by the total urban land area, and I get the density of urban population, measured as people per 1 km2 of urban land.
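As a quick sketch of that arithmetic, with purely illustrative numbers standing in for the three World Bank series, the whole computation fits in a few lines; the last line also computes the ratio of urban to general density, which I come back to further below as the [DU/DG] coefficient.

```python
# Illustrative numbers only; in practice the inputs come from the World Bank
# series on urbanization, total population, and urban land area.
urbanization_rate = 0.60          # share of the population living in cities
total_population = 38_000_000     # general headcount of population
urban_land_km2 = 30_000           # total urban land area, km2
total_land_km2 = 312_679          # total surface of the country, km2

urban_population = urbanization_rate * total_population
density_urban = urban_population / urban_land_km2      # people per km2 of urban land
density_general = total_population / total_land_km2    # people per km2 overall
du_over_dg = density_urban / density_general           # the [DU/DG] coefficient
print(density_urban, density_general, du_over_dg)
```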

That whole calculation is a bit of a mindfuck, and here is why. According to the World Bank, the total area of urban land, i.e. the two-dimensional total size of cities in the world, has remained constant since 1990. Counter-intuitive? Hell, yes, especially as the same numerical standstill is officially recorded not only at the planetary level but also at the level of particular countries. It seems so impossible that my calculations regarding the density of urban population should be meaningless. Yet, the most interesting is to come. That DIY coefficient of mine, the density of urban population, is significantly, positively correlated, at least at the level of the whole world, with another one: the coefficient of patent applications per 1 million people, which represents the intensity of occurrence of marketable scientific inventions. The corresponding Pearson correlation is r = 0,93 for resident patent applications (i.e. filed in the same country where the corresponding inventions have been made), and r = 0,97 for non-resident patent applications (i.e. foreign science searching legal protection in a country). You can read the details of those calculations in ‘Correlated coupling between living in cities and developing science’.

Those strong Pearson correlations are almost uncanny. Should they hold up to deeper scrutiny, they would be among the strongest correlations I have ever seen in social sciences. Something that is suspected not to make sense (the assumption of constant urban surface on the planet since 1990) produces a coefficient correlated almost at the 1:1 basis with something that is commonly recognized to make sense. F**k! I love science!

I want to sniff around that one a bit. My first step is to split global data into individual countries. In my native Poland, the coefficient of density of urban population, such as I calculate it on the basis of World Bank data, was 759,48 people per 1 km2, against 124,21 people per 1 km2 of general population density. I confront that metric with official data, published by the Main Statistical Office of Poland (www.stat.gov.pl ), regarding three cities, in 2012: my native and beloved Krakow, with 5 481 people per 1 km2 of urban land, and Gdansk, not native at all but just as sentimentally attached to my past, yielding 4 761 people per 1 km2. Right, maybe I should try something smaller: Myslenice, near Krakow, sort of a satellite town. It is 3 756 people per 1 km2. If smaller does not quite work, it is worth trying bigger. Poland as a whole, according to the same source, has 2 424 people in its average square kilometre of urban space. All these numbers are one order of magnitude higher than my own calculation.

Now, I take a look at my own country from a different angle. The same site, www.stat.gov.pl says that the percentage of urban land in the total surface of the country has been gently growing, from 4,6% in 2003 to 5,4% in 2017. The total surface of Poland is 312 679 km2, and 5,4% makes 16 884,67 km2, against  30 501,34 km2 reported by the World Bank for 2010. All in all, data from the World Bank looks like an overly inflated projection of what urban land in Poland could possibly grow to in the distant future.

I try another European country: France. According to the French website Actu-Environnement, urban areas in France made 119 000 km2 in 2011, having apparently grown from a round 100 000 km2 in 1999. The World Bank reports 86 463,06 km2, thus much less in this case. A similar check for the United Kingdom: according to https://www.ons.gov.uk , urban land makes 1,77 million hectares, thus 17 700 km2, against 58 698,75 km2 reported by the World Bank. Once again, a puzzle: where does that discrepancy come from?

The data reported on https://data.worldbank.org/ , as regards the extent of urban land, apparently comes from one place: the Center for International Earth Science Information Network (CIESIN), at the Columbia University, and CIESIN declares to base their estimation on satellite photos. The French statistical institution, INSEE, reports a similar methodological take in their studies, in a paper available at: https://www.insee.fr/fr/statistiques/fichier/2571258/imet129-b-chapitre1.pdf . Apparently, urban land seen from the orbit of Earth is not exactly the same as urban land seen from the window of an office. The latter is strictly delineated by administrative borders of towns and cities, whilst the former has shades and tones, e.g. are 50 warehouse, office and hotel buildings, standing next to each other in an otherwise rural place, an urban space? That’s a tricky question. We return here to the deep thought by Fernand Braudel, in his ‘Civilisation and Capitalism’, Volume 1, Section 8: ‘Towns and Cities’: the town, an unusual concentration of people, of houses close together, often joined wall to wall, is a demographic anomaly.

Yes, that seems definitely the path to follow in order to understand those strange, counterintuitive results which I got, regarding the size and human density of urban spaces across the planet: the town is a demographic anomaly. The methodology used by CIESIN, and reproduced by the World Bank, looks for demographic anomalies of urban persuasion, observable on satellite photos. The total surface of those anomalies can be very different from the officially reported surface of administrative urban entities within particular countries, and it seems to remain constant for the moment.

Good. I can return to my hypothesis: the technological change that has been going on in our civilisation at least since 1960 is oriented on increasing urbanization of humanity, and more specifically on an effective, rigid partition between urban areas and rural ones. The discourse about defining what urban space actually is, and the assumption that it is a demographic anomaly, leads me into investigating how much of an anomaly it is across the planet. In other words: are urban structures anomalous in the same way everywhere, across all the countries on the planet? In order to discuss this specific question, I will be referring to a small database I made, out of data downloaded from the World Bank, and which you can view or download, in Excel format, from this link: Urban Density Database. In my calculations, I assumed that the demographic anomaly in urban settlements is observable quantitatively, among other things, as abnormal density of population. Official demographic databases yield average, national densities of population, whilst I calculate densities of urban populations, and I can denominate the latter in units of the former. For each country separately, I calculate the following coefficient: [Density of urban population] / [Density of general population]. Both densities are given in the same units, i.e. in people per 1 km2. With the same unit in both the numerator and the denominator of my coefficient, I can politely ask that unit to go and have a break, so as to leave me with what I like: bare numbers.

Those bare numbers, estimated for 2010, tell me a few interesting stories. First of all, there is a bunch of small states where my coefficient is below 1, i.e. the apparent density of urban populations in those places is lower than their general density. They are: San Marino (0,99), Guam (0,98), Puerto Rico (0,98), Tonga (0,93), Grenada (0,72), Mauritius (0,66), Micronesia Fed. Sts. (0,64), Aruba (0,45), Antigua and Barbuda (0,43), St. Kitts and Nevis (0,35), Barbados (0,33), St. Lucia (0,32). These places look like the dream of social distancing: in cities, the density of population is lower than what is observable in the countryside. Numbers in parentheses are precisely the fractions [Density of urban population / Density of general population]. If I keep assuming that urban settlements are a demographic anomaly, those cases yield an anomalous form of an anomaly. These are mostly small island states. The paradox in their case is that, officially, their populations are mostly urban: more than 90% of their respective populations are technically city dwellers.

I am going once again through the methodology, in order to understand the logic of those local anomalies in the distribution of a general anomaly. Administrative data yields the number of people living in cities. Satellite-based data from the Center for International Earth Science Information Network (CIESIN), at the Columbia University, yields the total surface of settlements qualifiable as urban. The exact method used for that qualification is described as follows: ‘The Global Rural-Urban Mapping Project, Version 1 (GRUMPv1) urban extent grid distinguishes urban and rural areas based on a combination of population counts (persons), settlement points, and the presence of Night-time Lights. Areas are defined as urban where contiguous lighted cells from the Night-time Lights or approximated urban extents based on buffered settlement points for which the total population is greater than 5,000 persons’.  

Night-time lights manifest a fairly high use of electricity, and this is supposed to combine with the presence of specific settlement points. I assume (it is not straightforwardly phrased out in the official methodology) that settlement points mean residential buildings. I guess that a given intensity of agglomeration in such structures allows guessing a potentially urban area. A working hypothesis is being phrased out: ‘This place is a city’. The next step consists in measuring the occurrence of Night-time Lights, and those researchers from CIESIN probably have some scale of that occurrence, with a threshold on it. When the given place, running up for being a city, passes that threshold, then it is definitely deemed a city.

Now, I am returning to those strange outliers with urban populations being apparently less dense than general populations. In my mind, I can see three maps of the same territory. The first map is that of actual human settlements, i.e. the map of humans staying in one place, over the whole territory of the country. The second map is that of official, administratively defined urban entities: towns and cities. Officially, those first two maps overlap in more than 90%: more than 90% of the population lives in places officially deemed as urban settlements. A third map comes to the fore, that of urban settlements defined according to the concentration of residential structures and Night-Time Lights. Apparently, that third map diverges a lot from the second one (administratively defined cities), and a large part of the population lives in places which administratively are urban, but, according to the CIESIN methodology, they are rural, not urban. 

Generally, the distribution of the coefficient [Density of urban population] / [Density of general population], which, for the sake of convenience, I will further designate as [DU/DG], is far from the normal bell curve. I have just discussed outliers to be found at the bottom of the scale, and yet there are outliers on its opposite end as well. The most striking is Greenland, with [DU/DG] = 10 385,81, which is not that weird if one thinks about their physical geography. Mauritania and Somalia come with [DU/DG] equal to, respectively, 622,32 and 618,50. Adverse natural conditions apparently make towns and cities a true demographic anomaly, with their populations being several hundred times denser than the general population of their host countries.

The more I study the geographical distribution of the [DU/DG] coefficient, the more I agree with the claim that towns are a demographic anomaly. The coefficient [DU/DG] looks like a measure of civilizational difference between the city and the countryside. Table 1, below, introduces the average values of that coefficient across categories of countries, defined according to income per capita. An interesting pattern emerges. The wealthier a given country is, the smaller the difference between the city and the countryside, in terms of population density. Most of the global population seems to be living in neighbourhoods where that difference is around 20, i.e. where city slickers live in populations twentyish times denser than the national average.

I have been focusing a lot on cities as cradles to hone new social roles for new people coming to active social life, and as solutions for peacefully managing the possible conflicts of interests, between social groups, as regards the exploitation of fertile lowland areas on the planet. The abnormally high density of urban population is both an opportunity for creating new social roles, and a possible threshold of marginal gains. The more people there are per 1 km2, the more social interactions between those people, and the greater the likelihood for some of those interactions turning into recurrent patterns, i.e. into social roles. On the other hand, abundant, richly textured social structure, with a big capacity to engender new social roles – in other words, the social structure of wealthy countries – seems to be developing on the back of an undertow of diminishing difference between the city and the countryside.          

Table 1 – Average values of coefficient [Density of urban population] / [Density of general population] across categories of countries regarding wealth and economic development

Category of countries | Density of urban population denominated over general density of population, 2010 | Population, 2010
Fragile and conflict affected situations | 91,98 | 618 029 522
Heavily indebted poor countries (HIPC) | 84,96 | 624 219 326
Low income | 74,24 | 577 274 011
Upper middle income | 26,42 | 2 499 410 493
Low & middle income | 22,88 | 5 765 121 055
Middle income | 20,87 | 5 187 847 044
Lower middle income | 15,39 | 2 688 436 551
High income | 15,81 | 1 157 826 206
Central Europe and the Baltics | 9,63 | 104 421 447
United States | 9,21 | 309 321 666
European Union | 5,65 | 441 532 412
Euro area | 5,16 | 336 151 479

Table 2 represents a different take on the implications of density in urban population. Something old and something new: the already known coefficient of patent applications per 1 million people, and a new one, of fundamental importance, namely the mean consumption of energy per capita, in kilograms of oil equivalent. One kilogram of oil equivalent stands for approximately 11,63 kilowatt hours.  Those two variables are averaged across sextiles (i.e. sets representing 1/6th of the total sample n = 221 countries), in 2010. Consumption of energy presents maybe the clearest pattern: its mean value decreases consistently across sextiles 1 ÷ 5, just to grow slightly in the sixth one. That sixth sextile groups countries with exceptionally tough natural conditions for human settlement, whence an understandable need for extra energy to consume. Save for those outliers, one of the most puzzling connections I have ever seen in social sciences emerges: the less difference between the city and the countryside, in terms of population density, the more energy is being consumed per capita. In other words: the less of a demographic anomaly cities are, in a given country (i.e. the less they diverge from rural neighbourhoods), the more energy people consume. I am trying to wrap my mind around it, just as I try to convey this partial observation graphically, in Graph 2, further below Table 2.

Table 2 – Mean coefficients of energy use per capita, and patent applications per 1 mln people, across sextiles of density in urban population, data for 2010        

Sextiles (density of urban population denominated over general density of population) | Mean [Energy use (kg of oil equivalent per capita)], 2010 | Mean [Patent applications total per 1 million people], 2010
50,94 ≤ [DU/DG] ≤ 10 385,81 | 2 070,54 | 68,35
23,50 ≤ [DU/DG] < 50,94 | 1 611,73 | 596,464
12,84 ≤ [DU/DG] < 23,50 | 2 184,039 | 218,857
6,00 ≤ [DU/DG] < 12,84 | 2 780,263 | 100,097
2,02 ≤ [DU/DG] < 6,00 | 3 288,468 | 284,685
0,00 ≤ [DU/DG] < 2,02 | 4 581,108 | 126,734
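
Just to show how such averages can be reproduced, here is a minimal pandas sketch; the file and column names are placeholders of my own invention, standing for the World Bank series referenced in the text, and the sextile boundaries it produces need not match Table 2 exactly.

```python
import pandas as pd

# Hypothetical file and columns: country, du_dg, energy_kgoe_pc, patents_per_1m
df = pd.read_csv("countries_2010.csv")

# sextile 1 = lowest [DU/DG], sextile 6 = highest
df["sextile"] = pd.qcut(df["du_dg"], q=6, labels=[1, 2, 3, 4, 5, 6])
table2 = df.groupby("sextile")[["energy_kgoe_pc", "patents_per_1m"]].mean()
print(table2)
```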

Final consumption of energy is usually represented as a triad of main uses: production of goods and services, transportation, and household use strictly speaking (heating, cooking, cooling, electronics etc.). Still, another possible representation comes to my mind: the more different technologies we have stacked up in our civilization, the more energy they all need. Let me explain. Technological change is frequently modelled as a process in which newer generations of technologies supplant the older ones. However, what happens when those generations overlap? If they do, quick technological change makes us stack up technologies, older ones together with the newer ones, and the faster new technologies get invented, the richer the basket of technologies we hold. We could, possibly, strip our stock of possessions down to just one generation of technologies – implicitly it would be the most recent one – and yet we don’t. We keep them all. I look around my house, and around my close neighbourhood. Houses like mine, built in 2001, with the construction technologies of the time, are neighbouring houses built just recently, much more energy-efficient when it comes to heating and thermal insulation. In a theoretically perfect world, when a new generation of technologies supplants the older one, my house should be demolished and replaced by a state-of-the-art structure. Yet, I don’t do it. I stick to the old house.

The same applies to cars. My car is an old Honda Civic from 2004. As compared to the really recent cars some of my neighbours bought, my Honda gently drifts in the direction of the nearest museum. Still, I keep it. Across the entire neighbourhood of some 500 households, we have cars stretching from the 1990s up to now. Many generations of technologies coexist. Once again, we technically could shave off the old stuff and stick just to the most recent, yet we don’t. All those technologies need to be powered with at least some energy. The more technologies we have stacked up, the more energy we need.  

I think about that technological explanation because of the second numerical column in Table 2, namely the one reporting patent applications per 1 million people. Patentable invention coincides with the creation of new social roles for new people coming with demographic growth. Data in Table 2 suggest that some classes of the coefficient [Density of urban population] / [Density of general population] are more prone to such creation than others, i.e. in those specific classes of [DU/DG] the creation of new social roles is particularly intense.

Good. Now comes the strangest mathematical proportion I have found in that data about density of urban population and energy. For the interval 1972 ÷ 2014, I could calculate a double-stack coefficient: {[Energy per capita] / [DU/DG]}. The logic behind this fraction is to smooth out the connection between energy per capita and the relative density of urban population, as observed in Table 2 on a discrete scale. As I denominate the density of urban population in units of density in the general population, I want to know how much energy per capita is consumed per each such unit. As it is, that fraction {[Energy per capita] / [DU/DG]} is a compound arithmetical construct covering six levels of simple numerical values. After simplification, {[Energy per capita] / [DU/DG]} = [Total energy consumed / Population in cities] * [Surface of urban land / Total surface of land]. The simplification goes further: as I look at the numerical values of that complex fraction, computed for the whole world since 1972 through 2014, it keeps oscillating very tightly around 100. More specifically, its average value for that period is AVG{[Energy per capita] / [DU/DG]} = 102,9, with a standard deviation of 3,5, which makes that standard deviation practically negligible. As I go across the many socio-economic indicators available from the World Bank, none of them follows so flat a trajectory over that many decades. It looks as if there was a global equilibrium between total energy consumed and density of population in cities. What adds pepper to that burrito is the fact that cross-sectionally, i.e. computed for the same year across many countries, the same coefficient {[Energy per capita] / [DU/DG]} swings wildly. There are no local patterns, but there is a very strong planetary pattern. WTF?
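
For clarity, here is the algebra behind that simplification, written out with symbols of my own choosing: E for total energy consumed, P for total population, P_U for urban population, S for total land surface, S_U for urban land surface.

```latex
\frac{E/P}{[DU/DG]}
\;=\; \frac{E/P}{\;\dfrac{P_U / S_U}{P / S}\;}
\;=\; \frac{E}{P}\cdot\frac{S_U}{P_U}\cdot\frac{P}{S}
\;=\; \frac{E}{P_U}\cdot\frac{S_U}{S}
```

In words: energy consumed per urban resident, multiplied by the share of urban land in total land.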

Discover Social Sciences is a scientific blog, which I, Krzysztof Wasniewski, individually write and manage. If you enjoy the content I create, you can choose to support my work, with a symbolic $1, or whatever other amount you please, via MY PAYPAL ACCOUNT.  What you will contribute to will be almost exactly what you can read now. I have been blogging since 2017, and I think I have a pretty clearly rounded style.

At the bottom of the sidebar on the main page, you can access the archives of that blog, all the way back to August 2017. You can get an idea of how I work, what I work on, and how my writing has evolved. If you like social sciences served in this specific sauce, I will be grateful for your support to my research and writing.

‘Discover Social Sciences’ is a continuous endeavour and is mostly made of my personal energy and work. There are minor expenses, to cover the current costs of maintaining the website, or to collect data, yet I want to be honest: by supporting ‘Discover Social Sciences’, you will be mostly supporting my continuous stream of writing and online publishing. As you read through the stream of my updates on https://discoversocialsciences.com , you can see that I usually write 1 – 3 updates a week, and this is the pace of writing that you can expect from me.

Besides the continuous stream of writing which I provide to my readers, there are some more durable takeaways. One of them is an e-book which I published in 2017, ‘Capitalism And Political Power’. Normally, it is available with the publisher, the Scholar publishing house (https://scholar.com.pl/en/economics/1703-capitalism-and-political-power.html?search_query=Wasniewski&results=2 ). Via https://discoversocialsciences.com , you can download that e-book for free.

Another takeaway you can be interested in is ‘The Business Planning Calculator’, an Excel-based, simple tool for financial calculations needed when building a business plan.

Both the e-book and the calculator are available via links in the top right corner of the main page on https://discoversocialsciences.com .

The collective archetype of striking good deals in exports

My editorial on You Tube

I keep philosophizing about the current situation, and I try to coin up a story in my mind, a story meaningful enough to carry me through the weeks and months to come. I try to figure out a strategy for future investment, and, in order to do that, I am doing that thing called ‘strategic assessment of the market’.

Now, seriously, I am profiting from that moment of forced reclusion (in Poland we have just had compulsory sheltering at home introduced, as law) to work a bit on my science, more specifically on the application of artificial neural networks to simulate collective intelligence in human societies. As I have been sending around draft papers on the topic, to various scientific journals (here you have a sample of what I wrote on the topic << click this link to retrieve a draft paper of mine), I have encountered something like a pretty uniform logic of constructive criticism. One of the main lines of reasoning in that logic goes like: ‘Man, it is interesting what you write. Yet, it would be equally interesting to explain what you mean exactly by collective intelligence. How does it or doesn’t it rhyme with individual intelligence? How does it connect with culture?’.

Good question, truly a good one. It is the question that I have been asking myself for months, since I discovered my fascination with the way that simple neural networks work. At the time, I observed intelligent behaviour in a set of four equations, put back to back in a looping sequence, and it was a ground-breaking experience for me. As I am trying to answer this question, my intuitive path is that of distinction between collective intelligence and the individual one. Once again (see The games we play with what has no brains at all ), I go back to William James’s ‘Essays in Radical Empiricism’, and to his take on the relation between reality and our mind. In Essay I, entitled ‘Does Consciousness Exist?’, he goes: “My thesis is that if we start with the supposition that there is only one primal stuff or material in the world, a stuff of which everything is composed, and if we call that stuff ‘pure experience,’ then knowing can easily be explained as a particular sort of relation towards one another into which portions of pure experience may enter. The relation itself is a part of pure experience; one of its ‘terms’ becomes the subject or bearer of the knowledge, the knower, the other becomes the object known. […] Just so, I maintain, does a given undivided portion of experience, taken in one context of associates, play the part of a knower, of a state of mind, of ‘consciousness’; while in a different context the same undivided bit of experience plays the part of a thing known, of an objective ‘content.’ In a word, in one group it figures as a thought, in another group as a thing. And, since it can figure in both groups simultaneously, we have every right to speak of it as subjective and objective both at once.”

Here it is, my distinction. Right, it is partly William James’s distinction. Anyway, individual intelligence is almost entirely mediated by conscious experience of reality, which is representation thereof, not reality as such. Individual intelligence is based on individual representation of reality. By opposition, my take on collective intelligence is based on the theory of adaptive walk in rugged landscape, a theory used both in evolutionary biology and in the programming of artificial intelligence. I define collective intelligence as the capacity to run constant experimentation across many social entities (persons, groups, cultures, technologies etc.), as regards the capacity of those entities to achieve a vector of desired social outcomes.

The expression ‘vector of desired social outcomes’ sounds like something invented by a philosopher and mathematician, together, after a strong intake of strong spirits. I am supposed to be simple in getting my ideas across, and thus I am translating that expression into something simpler. As individuals, we are after something. We have values that we pursue, and that pursuit helps us make it through each consecutive day. Now, there is a question: do we have collective values that we pursue as a society? Interesting question. Bernard Bosanquet, the British philosopher who wrote ‘The Philosophical Theory of The State’[1], claimed very sharply that individual desires and values hardly translate into collective, state-wide values and goals to pursue. He claimed that entire societies are fundamentally unable to want anything, they can just be objectively after something. The collective being after something is essentially non-emotional and non-intentional. It is something like a collective archetype, occurring at the individual level somewhere below the level of consciousness, in the collective unconscious, which mediates between conscious individual intelligence and the external stuff of reality, to use William James’ expression.

How to figure out what outcomes we are after, as a society? This is precisely, for the time being, the central axis of my research involving neural networks. I take a set of empirical observations about a society, e.g. a set of country-year observations of 30 countries across 40 quantitative variables. Those empirical observations are the closest I can get to the stuff of reality. I make a simple neural network supposed to simulate the way a society works. The simpler this network is, the better. Each additional component of complexity requires making ever stronger assumptions about the way societies work. I use that network as a simple robot. I tell the robot: ‘Take one variable from among those 40 in the source dataset. Make it your output variable, i.e. the desired outcome of collective existence. Treat the remaining 39 variables as input, instrumental to achieving that outcome’. I make 40 such robots, and each of them produces a set of numbers, which is like a mutation of the original empirical dataset, and I can assess the similarity between each such mutation and the source empirical stuff. I do it by calculating the Euclidean distance between the vectors of mean values, respectively, in each such clone and in the original data. Other methods can be used, e.g. kernel functions.
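
A minimal numerical sketch of that procedure, in Python, might look as follows; it is a bare-bones, one-layer perceptron standing in for the network actually used in the research, and the data file in the usage comment is hypothetical.

```python
import numpy as np

def perceptron_clones(X, epochs=200, lr=0.05, seed=42):
    """For each column of X (observations x variables), train a tiny one-layer
    perceptron that treats that column as the desired collective outcome and
    the remaining columns as input. The network's guesses replace the output
    column, which yields a 'clone' of the dataset; the clone is then scored by
    the Euclidean distance between its vector of column means and that of the
    source data. A sketch only, not the exact network used in the research."""
    rng = np.random.default_rng(seed)
    n_obs, n_var = X.shape
    # bring all variables to a common 0-1 scale
    X01 = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0) + 1e-12)
    scores = {}
    for out in range(n_var):
        inputs = [j for j in range(n_var) if j != out]
        w = rng.normal(0.0, 0.1, size=len(inputs))
        clone = X01.copy()
        for _ in range(epochs):
            for i in range(n_obs):
                x, target = X01[i, inputs], X01[i, out]
                y = 1.0 / (1.0 + np.exp(-np.dot(x, w)))      # sigmoid activation
                w += lr * (target - y) * y * (1.0 - y) * x   # delta rule
                clone[i, out] = y                            # the guess becomes the 'mutated' value
        scores[out] = np.linalg.norm(clone.mean(axis=0) - X01.mean(axis=0))
    return scores   # the smaller the distance, the closer the clone is to the source data

# hypothetical usage: X = np.loadtxt("penn_world_table_subset.csv", delimiter=",")
# ranking = sorted(perceptron_clones(X).items(), key=lambda kv: kv[1])
```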

I worked that method through with various empirical datasets, and my preferred one, for now, is Penn World Table 9.1 (Feenstra et al. 2015[2]), which is a pretty comprehensive overview of macroeconomic variables across the planetary board. The detailed results of my research vary, depending on the exact set of variables I take into account, and on the set of observations I select, still there is a tentative conclusion that emerges: as a set of national societies, living in separate countries on that crazy piece of rock, speeding through cosmic space with no roof whatsoever, just with air condition on, we are mostly after terms of trade, and after the way we work, prepare for work, and remunerate work. Numerical robots which I program to optimize variables such as the average price in exports, the share of labour compensation in Gross National Income, the average number of hours worked per year per person, or the number of years spent in education before starting professional activity: all these tend to win the race for similarity to the source empirical data. These seem to be the desired outcomes that our human collective intelligence is after.

Is it of any help regarding the present tough s**t we are waist deep in? If my intuitions are true, whatever we do regarding the COVID-19 pandemic will be based on an evolutionary, adaptive choice. Path #1 consists in collectively optimizing those outcomes whilst trying to deal with the pandemic, so that dealing with the pandemic becomes instrumental to, for example, the deals we strike in international trade, and to the average number of hours worked per person per year. An alternative Path #2 means reshuffling our priorities completely and reorganizing so as to pursue completely different goals. Which one are we going to take? Good question, very much about guessing rather than forecasting. Historical facts indicate that so far, as a civilization, we have been rather slow out of the gate: change in collectively pursued values has occurred slowly, progressively, at the pace of generations rather than press conferences.

In parallel to doing research on collective intelligence, I am working on a business plan for the project I named ‘Energy Ponds’ (see, for example: Bloody hard to make a strategy). I have done some market research down this specific avenue of my intellectual walk, and here below I am giving a raw account of progress therein.

The study of the market environment for the Energy Ponds project is pegged on one central characteristic of the technology which will eventually be developed: the amount of electricity possible to produce in a structure based on ram pumps and relatively small hydroelectric turbines. Will this amount be sufficient just to supply energy to a small neighbouring community, or will it be enough to be sold in wholesale amounts via auctions and deals with grid operators? In other words, is Energy Ponds a viable concept just for off-grid installations, or is it scalable up to facility size?

There are examples of small hydropower installations, which connect to big power grids in order to exploit incidental price opportunities (Kusakana 2019[3]).

With that basic question in mind, it is worth studying both the off-grid market for hydroelectricity, and the wholesale, on-grid market. Market research for Energy Ponds starts, in the first subsection below, with a general, global take on the geographical distribution of the main factors, both environmental and socio-economic. The next sections study characteristic types of markets.

Overview of environmental and socio-economic factors 

Quantitative investigation starts with the identification of countries, where hydrological conditions are favourable to implementation of Energy Ponds, namely where significant water stress is accompanied by relatively abundant precipitations. More specifically, this stage of analysis comprises two steps. In the first place, countries with significant water stress are identified[4], and then each of them is checked as for the amount of precipitations[5], hence the amount of rainwater possible to collect.

Two remarks are worth formulating at this point. Firstly, in the case of big countries, such as China or the United States, covering both swamps and deserts, the target locations for Energy Ponds would be regions rather than countries as a whole. Secondly, and maybe a bit counterintuitively, water stress is not a strict function of precipitations. When studied in 2014, with the above-referenced data from the World Bank, water stress is Pearson-correlated with precipitations just at r = -0,257817141.

Water stress and precipitations have very different distributions across the set of countries reported in the World Bank’s database. Water stress strongly varies across space, and displays a variability (i.e. quotient of its standard deviation divided by its mean value) of v = 3,36. Precipitations are distributed much more evenly, with a variability of v = 0,68. With that in mind, further categorization of countries as potential markets for the implementation of Energy Ponds has been conducted with the assumption that significant water stress is above the median value observed, thus above 14,306296%. As for precipitations, a cautious assumption, prone to subsequent revision, is that sufficient rainfall for sustaining a structure such as Energy Ponds is above the residual difference between mean rainfall observed and its standard deviation, thus above 366,38 mm per year.      
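
For the record, that two-step filter can be written down in a few lines of pandas; the file and column names are hypothetical placeholders for the World Bank series referenced above.

```python
import pandas as pd

# Hypothetical file and columns: country, water_stress_pct, precipitation_mm (data for 2014)
df = pd.read_csv("worldbank_2014.csv")

stress_threshold = df["water_stress_pct"].median()                               # ~14,31 %
rain_threshold = df["precipitation_mm"].mean() - df["precipitation_mm"].std()    # ~366 mm per year

candidates = df[(df["water_stress_pct"] > stress_threshold) &
                (df["precipitation_mm"] > rain_threshold)]
print(candidates["country"].tolist())   # should reproduce a list akin to the 40 countries named below
```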

That first selection led to focusing further analysis on 40 countries, namely: Kenya, Haiti, Maldives, Mauritania, Portugal, Thailand, Greece, Denmark, Netherlands, Puerto Rico, Estonia, United States, France, Czech Republic, Mexico, Zimbabwe, Philippines, Mauritius, Turkey, Japan, China, Singapore, Lebanon, Sri Lanka, Cyprus, Poland, Bulgaria, Germany, South Africa, Dominican Republic, Kyrgyz Republic, Malta, India, Italy, Spain, Azerbaijan, Belgium, Korea, Rep., Armenia, Tajikistan.

Further investigation focused on describing those 40 countries from the standpoint of the essential benefits inherent to the concept of Energy Ponds: prevention of droughts and floods on the one hand, with the production of electricity being the other positive outcome. The variable published by the World Bank under the heading of ‘Droughts, floods, extreme temperatures (% of population, average 1990-2009)’[6] has been taken both individually, and multiplied by the headcount of population. In the first case, the relative importance of extreme weather phenomena for local populations is measured. When recalculated into the national headcount of people touched by extreme weather, this metric highlights the geographical distribution of the aggregate benefits possibly derived from adaptive resilience vis-à-vis such events.

Below, both metrics, i.e. the percentage and the headcount of population, are shown as maps. The percentage of population touched by extreme weather conditions is much more evenly distributed than its absolute headcount. In general, Asian countries seem to absorb most of the adverse outcomes resulting from climate change. Outside Asia, and, of course, within the initially selected set of 40 countries, Kenya seems to be the most exposed.    


Another possible take on the socio-economic environment for developing Energy Ponds is the strictly business one. Prices of electricity, together with the sheer quantity of electricity consumed are the chief coordinates in this approach. Prices of electricity have been reported as retail prices for households, as Energy Ponds are very likely to be an off-grid local supplier. Sources of information used in this case are varied: EUROSTAT data has been used as regards prices in European countries[1] and they are generally relevant for 2019. For other countries sites such as STATISTA or www.globalpetrolprices.com have been used, and most of them are relevant for 2018. These prices are national averages across different types of contracts.

The size of electricity markets has been measured in two steps, starting with consumption of electricity per capita, as published by the World Bank[2], which has been multiplied by the headcount of population. The figures below give a graphical idea of the results. In general, there seems to be a trade-off between price and quantity, almost as in the classical demand function. The biggest markets for electricity, such as China or the United States, display relatively low prices. Markets with high prices are comparatively much smaller in terms of quantity. An interesting insight has been found when prices of electricity are compared with the percentage of population with access to electricity, as published by the World Bank[3]. Such a comparison, shown further below, reveals interesting outliers: Haiti, Kenya, India, and Zimbabwe. These are countries burdened with significant limitations as regards access to electricity. In these locations, projects such as Energy Ponds can possibly produce entirely new energy sources for local populations.
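
The same two-step sizing of markets can be sketched in pandas as well; again, the file and column names are placeholders of mine for the sources referenced in the footnotes.

```python
import pandas as pd

# Hypothetical columns: country, price_per_kwh, kwh_per_capita, population, access_pct
df = pd.read_csv("electricity_2018_2019.csv")

df["market_size_kwh"] = df["kwh_per_capita"] * df["population"]     # step two: total quantity consumed
print(df.sort_values("market_size_kwh", ascending=False).head())    # biggest markets, to set against prices
print(df.nsmallest(5, "access_pct"))                                # outliers such as Haiti, Kenya, India, Zimbabwe
```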

The possible implementation of Energy Ponds can take place in very different socio-economic environments. It is worth studying those environments as idiosyncratic types. Further below, the following types and cases are studied more in detail:

  1. Type ‘Large cheap market with a lot of environmental outcomes’: China, India >> low price of electricity, locally limited access to electricity, prevention of droughts and floods,
  2. Type ‘Small or medium-sized, developed European economy with high prices of electricity and a relatively small market’,
  3. Special case ‘Large, moderately priced market, with moderate environmental outcomes’: United States >> moderate price of electricity, possibility to go off grid with Energy Ponds, prevention of droughts and floods,
  4. Special case: Kenya >> quite low access to electricity (63%) and a moderately high retail price of electricity ($0,22 / kWh), big population affected by droughts and floods, Energy Ponds can increase access to electricity.

Table 1, further below, exemplifies the basic metrics of a hypothetical installation of Energy Ponds, in specific locations representative for the above-mentioned types and special cases. These metrics are:

  1. Discharge (of water) in m3 per second, in selected riverain locations. Each type among those above is illustrated with a few specific, actual geographical spots. The central assumption at this stage is that a local installation of Energy Ponds abstracts 20% of the flow per second in the river. Of course, should a given location be selected for more in-depth a study, specific hydrological conditions would have to be taken into account, and the 20% assumption might be revised upwards or downwards.
  2. Electric power to expect with the given abstraction of water. That power has been calculated with the assumption that an average ram pump can create an elevation, thus a hydraulic head, of about 20 metres. There are more powerful ram pumps (see for example: https://www.allspeeds.co.uk/hydraulic-ram-pump/ ), yet 20 metres is a safely achievable head to assume without precise knowledge of environmental conditions in the given location. Given that 20-metre head, the basic equation to calculate electric power in watts is:
[Flow per second, in m3, calculated as 20% of abstraction from the local river] × 20 [head in metres, obtained by ram pumping] × 9,81 [gravitational acceleration, in m/s2] × 75% [average efficiency of hydroelectric turbines]

  3. Financial results to expect from the sales of electricity. Those results are calculated on the basis of two empirical variables: the retail price of electricity, referenced as mentioned earlier in this chapter, and the LCOE (Levelized Cost Of Energy). The latter is sourced from a report by the International Renewable Energy Agency (IRENA 2019[1]), and provisionally pegged at $0,05 per kWh. This is a global average and, in this context, it plays the role of a simplifying assumption, which, in turn, allows direct comparison of various socio-economic contexts. Of course, each specific location for Energy Ponds bears a specific LCOE in the phase of implementation. With those two source variables, two financial metrics are calculated (a short numerical sketch of these calculations follows this list):
    • Revenues from the sales of electricity, as: [Electric power in kilowatts] x [8760 hours in a year] x [Local retail price for households per 1 kWh]
    • Margin generated over the LCOE, equal to: [Electric power in kilowatts] x [8760 hours in a year] x {[Retail price for households per 1 kWh] – $0,05}
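
To make those formulas concrete, here is a minimal Python sketch of the per-location arithmetic, following the formulas exactly as given above; the function name is mine, and the retail price in the usage example is back-calculated from the Texas row of Table 1, so it should be treated as an approximation.

```python
def energy_ponds_metrics(flow_m3s, retail_price_per_kwh, lcoe_per_kwh=0.05,
                         head_m=20.0, g=9.81, efficiency=0.75):
    # The formula above yields watts; dividing by 1000 gives the kW figures reported in Table 1.
    power_kw = flow_m3s * head_m * g * efficiency / 1000.0
    energy_kwh_year = power_kw * 8760                       # 8760 hours in a year
    revenue = energy_kwh_year * retail_price_per_kwh
    margin = energy_kwh_year * (retail_price_per_kwh - lcoe_per_kwh)
    return power_kw, energy_kwh_year, revenue, margin

# Example: the Texas location, 100 m3/s abstracted, retail price assumed at ~$0,1136 per kWh
print(energy_ponds_metrics(100, 0.1136))
# -> roughly (14,7 kW, 128 903 kWh, $14 643, $8 198), matching the corresponding row of Table 1
```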

Table 1

Country | Location (flow per second, with 20% abstraction from the river) | Electric power generated with 20% of abstraction from the river (energy for sale) | Annual revenue (annual margin over LCOE)
China | Near Xiamen, Jiulong River (26 636,23 m3/s) | 783,9 kW (6 867 006,38 kWh a year) | $549 360,51 ($206 010,19)
China | Near Changde, Yangtze River (2400 m3/s) | 353,16 kW (3 093 681,60 kWh a year) | $247 494,53 ($92 810,45)
India | North of Rajahmundry, Godavari River (701 m3/s) | 103,15 kW (903 612,83 kWh a year) | $54 216,77 ($9 036,13)
India | Ganges River near Patna (2400 m3/s) | 353,16 kW (3 093 681,60 kWh a year) | $185 620,90 ($30 936,82)
Portugal | Near Lisbon, Tagus river (100 m3/s) | 14,72 kW (128 903,40 kWh a year) | €27 765,79 (€22 029,59)
Germany | Elbe river between Magdeburg and Dresden (174 m3/s) | 25,6 kW (224 291,92 kWh a year) | €68 252,03 (€58 271,04)
Poland | Vistula between Krakow and Sandomierz (89,8 m3/s) | 13,21 kW (115 755,25 kWh a year) | €18 234,93 (€13 083,82)
France | Rhone river, south of Lyon (3400 m3/s) | 500,31 kW (4 382 715,60 kWh a year) | €773 549,30 (€582 901,17)
United States, California | San Joaquin River (28,8 m3/s) | 4,238 kW (37 124,18 kWh a year) | $7 387,71 ($5 531,50)
United States, Texas | Colorado River, near Barton Creek (100 m3/s) | 14,72 kW (128 903,40 kWh a year) | $14 643,43 ($8 198,26)
United States, South Carolina | Tennessee River, near Florence (399 m3/s) | 58,8 kW (515 097,99 kWh a year) | $66 499,15 ($40 744,25)
Kenya | Nile River, by the Lake Victoria (400 m3/s) | 58,86 kW (515 613,6 kWh a year) | $113 435 ($87 654,31)
Kenya | Tana River, near Kibusu (81 m3/s) | 11,92 kW (104 411,75 kWh a year) | $22 970,59 ($17 750)

China and India are grouped in the same category for two reasons. Firstly, because of the proportion between the size of their markets for electricity and the pricing thereof. These are huge markets in terms of quantity, yet very frugal in terms of price per 1 kWh. Secondly, these two countries seem to represent the bulk of the populations globally observed as touched by damage from droughts and floods. Should the implementation of Energy Ponds be successful in these countries, i.e. should water management significantly improve as a result, environmental benefits would play a significant socio-economic role.

With those similarities to keep in mind, China and India display significant differences as for both the environmental conditions, and the economic context. China hosts powerful rivers, with very high flow per second. This creates an opportunity, and a challenge. The amount of water possible to abstract from those rivers through ram pumping, and the corresponding electric power possible to generate are the opportunity. Yet, ram pumps, as they are manufactured now, are mostly small-scale equipment. Creating ram-pumping systems able to abstract significant amounts of water from Chinese rivers, in the Energy Ponds scheme, is a technological challenge in itself, which would require specific R&D work.

That said, China is already implementing a nation-wide programme of water management, called ‘Sponge Cities’, which shows some affinity to the Energy Ponds concept. Water management in relatively small, network-like structures, seems to have a favourable economic and political climate in China, and that climate translates into billions of dollars in investment capital.

India is different in these respects. Indian rivers, at least in floodplains, where Energy Ponds can be located, are relatively slow, in terms of flow per second, as compared to China. Whilst Energy Ponds are easier to implement technologically in such conditions, the corresponding amount of electricity is modest. India seems to be driven towards financing projects of water management as big dams, or as local preservation of wetlands. Nothing like the Chinese ‘Sponge Cities’ programme seems to be emerging, to the author’s best knowledge.

European countries form quite a homogenous class of possible locations for Energy Ponds. Retail prices of electricity for households are generally high, whilst the river system is dense and quite disparate in terms of flow per second. In the case of most European rivers, flow per second is low or moderate, still the biggest rivers, such as Rhine or Rhone, offer technological challenges similar to those in China, in terms of required volume in ram pumping.

As regards the Energy Ponds business concept, the United States seem to be a market in their own right. Local populations are exposed to a moderate (although growing) impact of droughts and floods, whilst they consume big amounts of electricity, both in aggregate and per capita. Retail prices of electricity for households are noticeably disparate from state to state, although generally lower than those practiced in Europe[2]. Prices range from less than $0,1 per 1 kWh in Louisiana, Arkansas or Washington, up to $0,21 in Connecticut. It is to note that, with respect to prices of electricity, the state of Hawaii stands out, with more than $0,3 per 1 kWh.

The United States offer quite a favourable environment for private investment in renewable sources of energy, still largely devoid of systematic public incentives. It is a market of multiple, different ecosystems, and all ranges of flow in local rivers.    


[1] IRENA (2019), Renewable Power Generation Costs in 2018, International Renewable Energy Agency, Abu Dhabi. ISBN 978-92-9260-126-3

[2] https://www.electricchoice.com/electricity-prices-by-state/ last access March 6th, 2020


[1] https://ec.europa.eu/eurostat/

[2] https://data.worldbank.org/indicator/EG.USE.ELEC.KH.PC

[3] https://data.worldbank.org/indicator/EG.ELC.ACCS.ZS


[1] Bosanquet, B. (1920). The philosophical theory of the state (Vol. 5). Macmillan and Company, limited.

[2] Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), “The Next Generation of the Penn World Table” American Economic Review, 105(10), 3150-3182, available for download at http://www.ggdc.net/pwt

[3] Kusakana, K. (2019). Optimal electricity cost minimization of a grid-interactive Pumped Hydro Storage using ground water in a dynamic electricity pricing environment. Energy Reports, 5, 159-169.

[4] Level of water stress: freshwater withdrawal as a proportion of available freshwater resources >> https://data.worldbank.org/indicator/ER.H2O.FWST.ZS

[5] Average precipitation in depth (mm per year) >> https://data.worldbank.org/indicator/AG.LND.PRCP.MM

[6] https://data.worldbank.org/indicator/EN.CLC.MDAT.ZS

Bloody hard to make a strategy

My editorial on You Tube

It is weekend, and it is time to sum up my investment decisions. It is time to set a strategy for investing the next rent collected. Besides being a wannabe financial investor, I am a teacher and a scientist, and thus I want to learn by schooling myself. As with any type of behavioural analysis, I start by asking “What the hell am I doing?”. Here comes a bit of financial theory. When I use money to buy corporate stock, I exchange one type of financial instrument (currency) against another type of financial instrument, i.e. equity-based securities. Why? What for? If I trade one thing against another one, there must be a difference that justifies the trade-off. The difference is certainly in the market pricing. Securities are much more volatile in their prices than money. Thus, when I invest money in securities, I go for higher a risk, and higher possible gains. I want to play a game.

Here comes another thing. When I say I want to play a game, the ‘want’ part is complex. I am determined to learn investment in the most practical terms, i.e. as my own investment. Still, something has changed in my emotions over the last month. I feel apprehensive after having taken my first losses into account. Whilst in the beginning, one month ago, I approached investment as a kid would approach picking a toy in a store, now I am much more cautious. Instead of being in a rush to invest in anything, I am even pushing off a bit the moment of investment decision. It is like sport training. Sooner or later, after the first outburst of enthusiasm, there comes the moment when it hurts. Not much, just a bit, but enough to make me feel uncomfortable. That’s the moment when I need to reassess my goals, and just push myself through that window of doubt. As I follow that ‘sport training’ logic, what works for me when I am down on optimism is consistency. I do measured pieces of work, which I can reliably link to predictable outcomes.

Interesting. Two sessions of investment decisions, some 4 weeks apart from each other, and I experience completely different emotions. This is a sure sign that I am really learning something new. I invest 2500 PLN, and in my investments, I mostly swing between positions denominated in PLN, those in EUR, and those in USD. At current exchange rates 2500 PLN = €582,75 = $629,72. Please, notice that when I consider investing Polish zlotys, the PLNs, into securities denominated in PLN, EUR or USD, I consider two, overlapping financial decisions: that of exchanging money (pretty fixed nominal value) against securities, and that of exchanging zlotys against other currencies.

Let’s focus for a moment, on the strictly speaking money game. If I swing between three currencies, it is a good move to choose one as reference. Here comes a practical rule, which I teach to my students: your reference currency is the one you earn the major part of your income in. My income comes from my salary, and from the rent, both in Polish zlotys, and thus the PLN is my denominator. A quick glance at the play between PLN, USD, and EUR brings the following results:

>> PLN to EUR: February 1st 2020, €1 = 4,3034 PLN ; February 23rd, 2020 €1 = 4,2831 PLN ; net change: (4,2831 – 4,3034) / 4,3034 =  -0,47%

>> PLN to USD: February 1st 2020, $1 = 3,8864 PLN ; February 23rd, 2020 $1 = 3,9623 PLN; net change: (3,9623 – 3,8864) / 3,8864 =  1,95%

For the moment, it seems that the euro is depreciating as compared to the US dollar, and I think it would be better to invest in dollars. Since my last update on this blog, I did something just opposite: I sold in USD, and bought in euro. That would be it as for consistency. February 21st – decided to sell Frequency Therapeutics, as I was losing money on it. I consistently apply the principle of cutting losses short. I had a look at short-term trend in the price of Frequency Therapeutics, and there is no indication of bouncing back up. Question: what to invest that money in? Canadian Solar? No, they are falling. SMA Solar Technology AG? Good fundamentals, rising price trend, equity €411,4 mln, market cap €1 251 mln, clearly overvalued, but maybe for a reason. Bought SMA Solar Technology, and it seems to have been a bad move. I have a slight loss on them, just as I have one on First Solar. I consider selling them both, still they both have interestingly strong fundamentals, yet both are experiencing a downwards trend in stock price. Hard to say why. Hence, what I have just done is to place continuous ‘sell’ orders with a price limit that covers my loss and gives me a profit. We will see how it works. For First Solar, I placed a ‘sell’ order at minimum 54$, and regarding SMA Solar Technology I did the same with the bottom limit at €37.

I found another interesting investment in the industry of renewable energies: SolarWinds Corporation. Good fundamentals, temporarily quite low in price, there is risk, but there is gain in view, too. I would like to explain the logic of investing in particular sectors of the economy. My take on the thing is that when I just spend my money, I spend it sort of evenly on the whole economy because my money is going to circulate. When I decide to invest my money in the equity of particular industries it is a focused decision.

Thus, I come to the issue of strategy. I am completely honest now: I have a hard time sketching any real strategy, i.e. a strategy which I am sure I will stick to. I see three basic directions. Firstly, I can keep the present portfolio, and just invest more in each position so as to keep a constant structure. Secondly, I can keep the present portfolio as it is and invest that new portion of money in additional positions. Thirdly, and finally, I can sell the present portfolio in its entirety and open a completely new set of positions. My long-term purpose is, of course, to earn money. Still, my short-term purpose is to learn how to earn money by financial investment. Thus, the first option, i.e. a constant structure of my portfolio, seems dumb. Firstly, it is not like I have nailed down something really workable. That last month has been a time of experimentation, summing up with a net loss. The third option sounds so crazy that it is tempting.

I think about investing the immediately upcoming chunk of money into ETF funds, or so-called trackers. I have just realized they give a nice turbo boost to my investments. The one I already have – Amundi Epra DRG – performs nicely. The only problem is that it is denominated in euros, and I want to move towards dollars, at least for now.

I want trackers sectorally adapted to my priorities. Trackers (ETFs) are a bit more expensive – they collect a transactional fee on top of the fee collected by the broker – yet my experience with Amundi Epra, a tracker focused on European real estate, is quite positive in terms of net returns. I think about Invesco QQQ Trust (QQQ), a tracker oriented on quick-growth stock. Another idea is Microsoft. OK, I think about Tesla, too, but it is more than $900 for one share. I would have to sell a lot of what I already have in order to buy one. Maybe if I sell some of the well-performing biotechs in my portfolio? Square Inc., the publicly listed payments company founded by Twitter’s Jack Dorsey, is another interesting one. This is IT, thus one of my preferred sectors. I am having a look at their fundamentals, and yes! They look as if they had finally learnt to make money.

I think I have made my choice. My next rent collected will go 50% into Invesco QQQ Trust (QQQ), and 50% into Square Inc..

My blog is supposed to be very much about investment, and my personal training therein, still I keep in mind the scientific edge. I am reworking, from the base, my concept of Energy Ponds, which I have already been developing for the last year or so (see, for example ‘The mind-blowing hydro’). The general background of ‘Energy Ponds’ consists in natural phenomena observable in Europe as the climate change progresses, namely: a) a long-term shift in the structure of precipitations, from snow to rain, b) the increasing occurrence of floods and droughts, c) the spontaneous reemergence of wetlands. All these phenomena have one common denominator: increasingly volatile flow per second in rivers. The essential idea of Energy Ponds is to ‘financialize’ that volatile flow, so to say, i.e. to capture its local surpluses, store them for later, and use the very mechanism of storage itself as a source of economic value.

When water flows downstream, in a river, its retention can be approached as the opportunity for the same water to loop many times over the same specific portion of the collecting basin (of the river). Once such a loop is created, we can extend the average time that a liter of water spends in the whereabouts. Ram pumps, connected to storage structures akin to swamps, can give such an opportunity. A ram pump uses the kinetic energy of flowing water in order to pump some of that flow up and away from its mainstream. Ram pumps allow forcing a process which we know as otherwise natural. Rivers, especially in geological plains, where they flow relatively slowly, tend to build, with time, multiple ramifications. Those branchings can be directly observable at the surface, as meanders, floodplains or seasonal lakes, but much of them lies underground, as pockets of groundwater. In this respect, it is useful to keep in mind that, mechanically, rivers are the drainpipes of rainwater from their respective basins. Another basic hydrological fact, useful to remember in the context of the Energy Ponds concept, is that strictly speaking retention of rainwater – i.e. a complete halt in its circulation through the collecting basin of the river – is rarely possible, and just as rarely is it a sensible idea to implement. Retention means rather a slowdown of the flow of rainwater through the collecting basin into the river.

One of the ways that water can be slowed down consists in making it loop many times over the same section of the river. Let’s imagine a simple looping sequence: water from the river is being ram-pumped up and away into retentive structures akin to swamps, i.e. moderately deep, spongy structures underground, with a high capacity for retention, covered with a superficial layer of shallow-rooted vegetation. With time, as the swamp fills with water, the surplus is evacuated back into the river, by a system of canals. Water stored in the swamp will be ultimately evacuated, too, minus evaporation; it will just happen much more slowly, by the intermediary of groundwaters. In order to illustrate the concept mathematically, let’s suppose that we have water in the river flowing at the pace of, e.g. 43 m3 per second. We make it loop once via ram pumps and retentive swamps, and, as a result of that looping, the speed of the flow is sliced by 3. In the long run we slow down the way that the river works as the local drainpipe: we slow it from 43 m3 per second down to [43/3 = 14,33…] m3 per second. As water from the river flows slower overall, it can yield more environmental services: each cubic metre of water has more time to ‘work’ in the ecosystem.

When I think of it, any human social structure, such as settlements, industries, infrastructures etc., needs to stay in balance with the natural environment. That balance is to be understood broadly, as the capacity to stay, for a satisfactorily long time, within a ‘safety zone’, where the ecosystem simply doesn’t kill us. That view has little to do with the moral concepts of environment-friendliness or sustainability. As a matter of fact, most known human social structures sooner or later fall out of balance with the ecosystem, and this is how civilizations collapse. Thus, here comes the first important assumption: any human social structure is, at some level, an environmental project. The incumbent social structures, possible to consider as relatively stable, are environmental projects which have simply held in place long enough to grow social institutions, and those institutions allow further seeking of environmental balance.

Some human structures can be deemed ‘sustainable’, but this looks rather like an exception than the rule. As a civilization, we are anything but frugal and energy-saving. Still, the practical question remains: how can we possibly enhance the creation of sustainable social structures (markets, cities, industries etc.), without relying on a hypothetical moral conversion from the alleged ‘greed’ and ‘wastefulness’ to a more or less utopian state of conscious sustainability? The model presented below argues that such enhancement can occur by creating economic ownership in local communities, as regards the assets invested in environmental projects. Economic ownership is to be distinguished from legal ownership strictly speaking. It can cover, of course, property rights as such, but it can stretch to many different types of enforceable claims on the proceeds from exploiting the economic utility derived from the environmental projects in question.

Any human social structure generates an aggregate amount of environmental outcomes EV, understood as reduction of environmental risks. Environmental risk means the probable, uncertain occurrence of adverse environmental effects. Part of those outcomes is captured as economic utility U(EV), and part comes as free-ride benefits F(EV). For any human social structure there is a threshold value U*(EV), above which the economic utility U(EV) is sufficient to generate social change supportive of the structure in question. Social change means the creation of institutions and markets, which, in turn, have the capacity to last. On the other hand, should U(EV) be lower than U*(EV), the structure in question cannot self-justify its interaction with the natural environment, and falls apart.
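
Restated compactly, with no symbols beyond those just introduced, the model says:

```latex
EV = U(EV) + F(EV), \qquad
\text{the structure persists} \iff U(EV) \ge U^{*}(EV).
```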

The derivation of U(EV) is a developmental process rather than an instantaneous phenomenon. It is long-term social change, which can be theoretically approached as an evolutionary adaptive walk in a rugged landscape. In that adaptive walk, the crucial moment is the formation of markets and/or institutions, where the exchange of utility occurs as stochastic change over time in an Ornstein–Uhlenbeck process with a jump component, akin to that observable in electricity prices (Borovkova & Schmeck 2017[1]). It means that human social structures become able to optimize their environmental impact when they form prices stable enough to be mean-reverted over time, whilst staying flexible enough to drift with jumps. Most technologies we invent serve to transform environmental outcomes into exchangeable goods endowed with economic utility. The set of technologies we use impacts our capacity to sustain social structures. Adaptive walk requires many similar instances of a social structure, similar enough to have common structural traits. Each such instance is a 1-mutation neighbour of at least one other instance. By the way, if you want to contact me directly, you can mail at: goodscience@discoversocialsciences.com
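
To give a feel for that kind of price dynamics, here is a minimal simulation of a mean-reverting (Ornstein–Uhlenbeck) path with a jump component; the parameter values are arbitrary placeholders of mine, not estimates from Borovkova & Schmeck (2017).

```python
import numpy as np

def ou_with_jumps(n_steps=1000, dt=1/365, x0=50.0, mu=50.0, theta=5.0,
                  sigma=8.0, jump_intensity=10.0, jump_scale=15.0, seed=1):
    """Simulate a mean-reverting price path with a compound-Poisson jump
    component, the kind of dynamics used for electricity prices. All parameter
    values are illustrative placeholders."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        drift = theta * (mu - x[t-1]) * dt                 # pull back towards the long-run mean mu
        diffusion = sigma * np.sqrt(dt) * rng.normal()     # ordinary Gaussian noise
        n_jumps = rng.poisson(jump_intensity * dt)         # occasional spikes
        jumps = rng.laplace(0.0, jump_scale, n_jumps).sum() if n_jumps else 0.0
        x[t] = x[t-1] + drift + diffusion + jumps
    return x

prices = ou_with_jumps()
print(prices.mean(), prices.std())   # hovers around mu, with fat-tailed excursions from the jumps
```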


[1] Borovkova, S., & Schmeck, M. D. (2017). Electricity price modeling with stochastic time change. Energy Economics, 63, 51-65. http://dx.doi.org/10.1016/j.eneco.2017.01.002

What are the practical outcomes of those hypotheses being true or false?

 

My editorial on You Tube

 

This is one of those moments when I need to reassess what the hell I am doing. Scientifically, I mean. Of course, it is good to reassess things existentially, too, every now and then, but for the moment I am limiting myself to science. Simpler and safer than life in general. Anyway, I have a financial scheme in mind, where local crowdfunding platforms serve to support the development of local suppliers in renewable energies. The scheme is based on the observable difference between prices of electricity for small users (higher), and those reserved to industrial-scale users (lower). I wonder if small consumers would be ready to pay the normal, relatively higher price in exchange for a package made of: a) electricity and b) shares in the equity of its suppliers.

I have a general, methodological hypothesis in mind, which I have been trying to develop over the last 2 years or so: collective intelligence. I hypothesise that collective behaviour observable in markets can be studied as a manifestation of collective intelligence. The purpose is to go beyond optimization and to define, with scientific rigour, what are the alternative, essentially equiprobable paths of change that a complex market can take. I think such an approach is useful when I am dealing with an economic model with a lot of internal correlation between variables, and that correlation can be so strong that those variables basically loop on each other. In such a situation, distinguishing independent variables from the dependent ones becomes bloody hard, and methodologically doubtful.

On the grounds of literature, and my own experimentation, I have defined three essential traits of such collective intelligence: a) distinction between structure and instance b) capacity to accumulate experience, and c) capacity to pass between different levels of freedom in social cohesion. I am using an artificial neural network, a multi-layer perceptron, in order to simulate such collectively intelligent behaviour.

The distinction between structure and instance means that we can devise something, make different instances of that something, each different by some small details, and experiment with those different instances in order to devise an even better something. When I make a mechanical clock, I am a clockmaker. When I am able to have a critical look at this clock, make many different versions of it – all based on the same structural connections between mechanical parts, but differing from each other by subtle details – and experiment with those multiple versions, I become a meta-clock-maker, i.e. someone who can advise clockmakers on how to make clocks. The capacity to distinguish between structures and their instances is one of the basic skills we need in life. Autistic people have a big problem in that department, as they are mostly on the instance side. To a severely autistic person, me in a blue jacket, and me in a brown jacket are two completely different people. Schizophrenic people are on the opposite end of the spectrum. To them, everything is one and the same structure, and they cannot cope with instances. Me in a blue jacket and me in a brown jacket are the same as my neighbour in a yellow jumper, and we all are instances of the same alien monster. I know you think I might be overstating, but my grandmother on the father’s side used to suffer from schizophrenia, and it was precisely that: to her, all strong smells were the manifestation of one and the same volatile poison sprayed in the air by THEM, and every person outside a circle of about 19 people closest to her was a member of THEM. Poor Jadwiga.

In economics, the distinction between structure and instance corresponds to the tension between markets and their underpinning institutions. Markets are fluid and changeable, they are like constant experimenting. Institutions give some gravitas and predictability to that experimenting. Institutions are structures, and markets are ritualized manners of multiplying and testing many alternative instances of those structures.

The capacity to accumulate experience means that as we experiment with different instances of different structures, we can store information we collect in the process, and use this information in some meaningful way. My great compatriot, Alfred Korzybski, in his general semantics, used to designate it as ‘the capacity to bind time’. The thing is not as obvious as one could think. A Nobel-prized economist and game theorist, Reinhard Selten, coined the concept of social games with imperfect recall (Harsanyi, Selten 1988[1]). He argued that as we, collective humans, accumulate and generalize experience about what the hell is going on, from time to time we shake off that big folder, and pick the pages endowed with the most meaning. All the remaining stuff, judged less useful at the moment, is somehow archived in culture, so that it basically stays there, but becomes much harder to access and utilise. The capacity to accumulate experience is largely about the way of accumulating it, and about that from-time-to-time archiving. We can observe this basic distinction in everyday life. There are things that we learn sort of incrementally. When I learn to play piano – which I wish I was learning right now, cool stuff – I practice, I practice, I practice and… I accumulate learning from all those practices, and one day I give a concert, in a pub. Still, other things, I learn them sort of haphazardly. Relationships are a good example. I am with someone, one day I am mad at her, the other day I see her as the love of my life, then, again, she really gets on my nerves, and then I think I couldn’t live without her etc. Bit of a bumpy road, isn’t it? Yes, there is some incremental learning, but you become aware of it after like 25 years of conjoint life. Earlier on, you just need to suck it up and keep going.

There is an interesting theory in economics, labelled « semi-martingale » (see for example: Malkiel, Fama 1970[2]). When we observe changes in stock prices, in a capital market, we tend to say they are random, but they are not. You can test it. If the price were really random, it would fan out according to the pattern of normal distribution. This is what we call a full martingale. Any real price you observe actually swings less broadly than normal distribution: this is a semi-martingale. Still, anyone with any experience in investment knows that prediction inside the semi-martingale is always burdened with a s**tload of error. When you observe stock prices over a long time, like 2 or 3 years, you can see a sequence of distinct semi-martingales. From September through December it swings inside one semi-martingale, then the Ghost of Past Christmases shakes it badly, people panic, and later it settles into another semi-martingale, slightly shifted from the preceding one, and there it goes, semi-martingaling for another dozen weeks etc.

The central theoretical question in this economic theory, and a couple of others, spells: do we learn something durable through local shocks? Does a sequence of economic shocks, of whatever type, make a learning path similar to the incremental learning of piano playing? There are strong arguments in favour of both possible answers. If you get your face punched, over and over again, you must be a really dumb asshole not to learn anything from that. Still, there is that phenomenon called systemic homeostasis: many systems, social structures included, tend to fight for stability when shaken, and they are frequently successful. The memory of shocks and revolutions is frequently erased, and they are assumed to have never existed.

The issue of different levels in social cohesion refers to the so-called swarm theory (Stradner et al. 2013[3]). This theory studies collective intelligence by reference to animals, which we know are intelligent just collectively. Bees, ants, hornets: all those beasts, when acting individually, are as dumb as f**k. Still, when they gang up, they develop amazingly complex patterns of action. That’s not all. Those complex patterns of theirs fall into three categories, applicable to human behaviour as well: static coupling, dynamic correlated coupling, and dynamic random coupling.

When we coordinate by static coupling, we always do things together in the same way. These are recurrent rituals, without much room for change. Many legal rules, and institutions they form the basis of, are examples of static coupling. You want to put some equity-based securities in circulation? Good, you do this, and this, and this. You haven’t done the third this? Sorry, man, but you cannot call it a day yet. When we need to change the structure of what we do, we should somehow loosen that static coupling and try something new. We should dissolve the existing business, which is static coupling, and look for creating something new. When we do so, we can sort of stay in touch with our customary business partners, and after some circling and asking around we form a new business structure, involving people we clearly coordinate with. This is dynamic correlated coupling. Finally, we can decide to sail completely uncharted waters, and take our business concept to China, or to New Zealand, and try to work with completely different people. What we do, in such a case, is emitting some sort of business signal into the environment, and waiting for any response from whoever is interested. This is dynamic random coupling. Attracting random followers to a new You Tube channel is very much an example of the same.

At the level of social cohesion, we can be intelligent in two distinct ways. On the one hand, we can keep the given pattern of collective behaviour at the same level, i.e. at one of the three I have just mentioned. We keep it ritualized and static, or somehow loose and dynamically correlated, or, finally, we take care not to ritualize too much and keep it deliberately at the level of random associations. On the other hand, we can shift between different levels of cohesion. We take some institutions, we start experimenting with making them more flexible, at some point we possibly make them as free as possible, and we gain experience, which, in turn, allows us to create new institutions.

When applying the issue of social cohesion in collective intelligence to economic phenomena, we can use a little trick, to be found, for example, in de Vincenzo et al. (2018[4]): we assume that quantitative economic variables, which we normally perceive as just numbers, are manifestations of distinct collective decisions. When I have the price of energy, let’s say €0,17 per kilowatt hour, I consider it as the outcome of collective decision-making. At this point, it is useful to remember the fundamentals of intelligence. We perceive our own, individual decisions as outcomes of our independent thinking. We associate them with the fact of wanting something, being apprehensive regarding something else, etc. Still, neurologically, those decisions are outcomes of some neurons firing in a certain sequence. Same for economic variables, i.e. mostly prices and quantities: they are the fruit of interactions between the members of a community. When I buy apples in the local marketplace, I just buy them for a certain price, and, if they look bad, I just don’t buy. This is not any form of purposeful influence upon the market. Still, when 10 000 people like me do the same, sort of ‘buy when the price is good, don’t when the apple is bruised’, a patterned process emerges. The resulting price of apples is the outcome of that process.

Social cohesion can be viewed as association between collective decisions, not just between individual actions. The resulting methodology is made, roughly speaking, of three steps. Step one: I put all the economic variables in my model over a common denominator (a common scale of measurement). Step two: I calculate the relative cohesion between them with the general concept of a fitness function, which I can express, for example, as the Euclidean distance between the local values of the variables in question. Step three: I calculate the average of those Euclidean distances, and I take its reciprocal, like « 1/x ». This reciprocal is the direct measure of cohesion between decisions, i.e. the higher the value of this precise « 1/x », the more cohesion between different processes of economic decision-making.
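Just to make those three steps tangible, here is a minimal sketch in Python; the variable names and the numbers are purely illustrative, and the min-max rescaling is just one possible common denominator:

```python
import numpy as np

# Hypothetical observations of three economic variables over the same periods
data = {
    "price_of_energy": np.array([0.17, 0.18, 0.21, 0.19]),
    "energy_per_capita": np.array([2.9, 3.0, 3.2, 3.1]),
    "capital_invested": np.array([120.0, 135.0, 150.0, 160.0]),
}

# Step one: common denominator, here a 0-1 rescaling of each variable
def rescale(x):
    return (x - x.min()) / (x.max() - x.min())

scaled = {name: rescale(series) for name, series in data.items()}

# Step two: fitness expressed as the Euclidean distance between the local values
# of each pair of variables
names = list(scaled)
distances = [
    np.linalg.norm(scaled[a] - scaled[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
]

# Step three: cohesion as the reciprocal of the average distance (higher value = more cohesion)
cohesion = 1.0 / np.mean(distances)
print(round(cohesion, 3))
```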

Now, those of you with a sharp scientific edge could say: “Wait a minute, doc. How do you know we are talking about different processes of decision-making? How do you know that variable X1 comes from a different process than variable X2?”. This is precisely my point. The swarm theory tells me that if I can observe a changing cohesion between those variables, I can reasonably hypothesise that their underlying decision-making processes are distinct. If, on the other hand, their mutual Euclidean distance stays the same, I hypothesise that they come from the same process.

Summing up, here is the general drift: I take an economic model and I formulate three hypotheses as for the occurrence of collective intelligence in that model. Hypothesis #1: different variables of the model come from different processes of collective decision-making.

Hypothesis #2: the economic system underlying the model has the capacity to learn as a collective intelligence, i.e. to durably increase or decrease the mutual cohesion between those processes. Hypothesis #3: collective learning in the presence of economic shocks is different from the instance of learning in the absence of such shocks.

They look nice, those hypotheses. Now, why the hell should anyone bother? I mean, what are the practical outcomes of those hypotheses being true or false? In my experimental perceptron, I express the presence of economic shocks by using the hyperbolic tangent as the neural function of activation, whilst the absence of shocks (or the presence of countercyclical policies) is expressed with a sigmoid function. Those two yield very different processes of learning. Long story short, the sigmoid learns more, i.e. it accumulates more local errors (thus more experimental material for learning), and it generates a steady trend towards a lower cohesion between variables (decisions). The hyperbolic tangent accumulates less experiential material (it learns less), and it is quite random in arriving at any tangible change in cohesion. The collective intelligence I mimicked with that perceptron looks like the kind of intelligence which, when going through shocks, learns only the skill of returning to the initial position after the shock: it does not create any lasting type of change. The latter happens only when my perceptron has a device to absorb and alleviate shocks, i.e. the sigmoid neural function.
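For the record, the two neural functions I keep talking about are the ordinary logistic (sigmoid) function and the hyperbolic tangent. A quick, illustrative comparison of their slopes shows why I use them to represent, respectively, shock-absorbing and shock-prone processing; this is just an illustration of the functions themselves, not of the whole perceptron:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def d_sigmoid(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # maximal slope 0.25, reached at z = 0

def d_tanh(z):
    return 1.0 - np.tanh(z) ** 2  # maximal slope 1.0, reached at z = 0

for z in (0.0, 0.5, 1.0):
    # around zero, the hyperbolic tangent reacts roughly four times more strongly
    # to the same impulse than the sigmoid does
    print(z, round(d_sigmoid(z), 3), round(d_tanh(z), 3))
```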

When I have my perceptron explicitly feeding back that cohesion between variables (i.e. feeding back the fitness function considered as a local error), it learns less and changes less, but does not necessarily go through fewer shocks. When the perceptron does not care about feeding back the observable distance between variables, there is more learning and more change, but not more shocks. The overall fitness function of my perceptron changes over time. The ‘over time’ depends on the kind of neural activation function I use. In the case of the hyperbolic tangent, it is brutal change over a short time, eventually coming back to virtually the same point it started from. In the hyperbolic tangent, the passage between various levels of association, according to the swarm theory, is super quick, but not really productive. In the sigmoid, it is definitely a steady trend of decreasing cohesion.

I want to know what the hell I am doing. I feel I have made a few steps towards that understanding, but getting to know what I am doing proves really hard.

I am consistently delivering good, almost new science to my readers, and I love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful if you suggest two things that Patreon asks me to ask you about. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?

[1] Harsanyi, J. C., & Selten, R. (1988). A general theory of equilibrium selection in games. MIT Press Books, 1.

[2] Malkiel, B. G., & Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. The Journal of Finance, 25(2), 383-417.

[3] Stradner, J., Thenius, R., Zahadat, P., Hamann, H., Crailsheim, K., & Schmickl, T. (2013). Algorithmic requirements for swarm intelligence in differently coupled collective systems. Chaos, Solitons & Fractals, 50, 100-114.

[4] De Vincenzo, I., Massari, G. F., Giannoccaro, I., Carbone, G., & Grigolini, P. (2018). Mimicking the collective intelligence of human groups as an optimization tool for complex problems. Chaos, Solitons & Fractals, 110, 259-266.

How can I possibly learn on that thing I have just become aware I do?

 

My editorial on YouTube

 

I keep working on the application of neural networks to simulate the workings of collective intelligence in humans. I am currently macheting my way through the model proposed by de Vincenzo et al in their article entitled ‘Mimicking the collective intelligence of human groups as an optimization tool for complex problems’ (2018[1]). In the spirit of my own research, I am trying to use optimization tools for a slightly different purpose, that is for simulating the way things are done. It usually means that I relax some assumptions which come along with said optimization tools, and I just watch what happens.

De Vincenzo et al. propose a model of artificial intelligence, which combines a classical perceptron, such as the one I have already discussed on this blog (see « More vigilant than sigmoid », for example), with a component of deep learning based on the observable divergences in decisions. In that model, social agents strive to minimize their divergences and to achieve relative consensus. Mathematically, it means that each decision is characterized by a fitness function, i.e. a function of mathematical distance from the other decisions made in the same population.

I take the tensors I have already been working with, namely the input tensor TI = {LCOER, LCOENR, KR, KNR, IR, INR, PA;R, PA;NR, PB;R, PB;NR} and the output tensor TO = {QR/N; QNR/N}. Once again, consult « More vigilant than sigmoid » as for the meaning of those variables. In the spirit of the model presented by De Vincenzo et al., I assume that each variable in my tensors is a decision. Thus, for example, PA;R, i.e. the basic price of energy from renewable sources, which small consumers are charged with, is the tangible outcome of a collective decision. Same for the levelized cost of electricity from renewable sources, the LCOER, etc. For each i-th variable xi in TI and TO, I calculate its relative fitness to the overall universe of decisions, as the average of itself and of its Euclidean distances to the other decisions. It looks like:

 

V(xi) = (1/N) * { xi + [(xi – xi;1)^2]^0,5 + [(xi – xi;2)^2]^0,5 + … + [(xi – xi;K)^2]^0,5 }

 

…where N is the total number of variables in my tensors, and K = N – 1.

 

In the next step, I can calculate the average of averages, i.e. sum up all the individual V(xi)’s and divide that total by N. That average V*(x) = (1/N) * [V(x1) + V(x2) + … + V(xN)] is the measure of aggregate divergence between individual variables considered as decisions.
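Here is the same thing spelled out in Python; I treat each variable as contributing one current value, and the starting vector below uses the rounded values I quote a few paragraphs further down:

```python
import numpy as np

def local_fitness(values):
    # V(xi) = (1/N) * (xi + sum of Euclidean distances between xi and every other variable)
    values = np.asarray(values, dtype=float)
    n = len(values)
    v = np.empty(n)
    for i in range(n):
        distances = np.abs(values[i] - np.delete(values, i))  # [(xi - xj)^2]^0,5 for j ≠ i
        v[i] = (values[i] + distances.sum()) / n
    return v

def aggregate_divergence(values):
    # V*(x) = (1/N) * [V(x1) + V(x2) + ... + V(xN)]
    return local_fitness(values).mean()

# Rounded starting values quoted further below, in the order of TI followed by TO
x_t0 = [0.26, 0.48, 0.01, 0.09, 0.46, 0.99, 0.71, 0.46, 0.20, 0.37, 0.01, 0.99]
print(local_fitness(x_t0).round(3))
print(round(aggregate_divergence(x_t0), 3))
```

With those rounded inputs, the results land close to, though not exactly equal to, the V(xi; t0)’s quoted below, most likely just a matter of rounding in the starting values.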

Now, I imagine two populations: one who actively learns from the observed divergence of decisions, and another one who doesn’t really. The former is represented with a perceptron that feeds back the observable V(xi)’s into consecutive experimental rounds. Still, it is just feeding that V(xi) back into the loop, without any a priori ideas about it. The latter is more or less what it already is: it just yields those V(xi)’s but does not do much about them.

I needed a bit of thinking as to how exactly that feeding back of the fitness function should look. In the algorithm I finally came up with, it looks different for the input variables on the one hand, and for the output ones on the other. You might remember, from the reading of « More vigilant than sigmoid », that my perceptron, in its basic version, learns by estimating the local errors observed in the last round of experimentation, and then adding those local errors to the values of input variables, just to make them roll once again through the neural activation function (sigmoid or hyperbolic tangent), and see what happens.

As I upgrade my perceptron with the estimation of fitness function V(xi), I ask: who estimates the fitness function? What kind of question is that? Well, a basic one. I have that neural network, right? It is supposed to be intelligent, right? I add a function of intelligence, namely that of estimating the fitness function. Who is doing the estimation: my supposedly intelligent network or some other intelligent entity? If it is an external intelligence, mine, for a start, it just estimates V(xi), sits on its couch, and watches the perceptron struggling through the meanders of attempts to be intelligent. In such a case, the fitness function is like sweat generated by a body. The body sweats but does not have any way of using the sweat produced.

Now, if the V(xi) is to be used for learning, the perceptron is precisely the incumbent intelligent structure supposed to use it. I see two basic ways for the perceptron to do that. First of all, the input neuron of my perceptron can capture the local fitness functions on input variables and add them, as additional information, to the previously used values of input variables. Second of all, the second hidden neuron can add the local fitness functions, observed on output variables, to the exponent of the neural activation function.

I explain. I am a perceptron. I start my adventure with two tensors: input TI = {LCOER, LCOENR, KR, KNR, IR, INR, PA;R, PA;NR, PB;R, PB;NR} and output TO = {QR/N; QNR/N}. The initial values I start with are slightly modified in comparison to what was being processed in « More vigilant than sigmoid ». I assume that the initial market of renewable energies – thus most variables of quantity with ‘R’ in subscript – is quasi-inexistent. More specifically, QR/N = 0,01 and QNR/N = 0,99 in output variables, whilst in the input tensor I have capital invested in capacity IR = 0,46 (thus a readiness to go and generate from renewables), and yet the crowdfunding flow K is KR = 0,01 for renewables and KNR = 0,09 for non-renewables. If you want, it is a sector of renewable energies which is sort of ready to fire off but hasn’t done anything yet in that department. All in all, I start with: LCOER = 0,26; LCOENR = 0,48; KR = 0,01; KNR = 0,09; IR = 0,46; INR = 0,99; PA;R = 0,71; PA;NR = 0,46; PB;R = 0,20; PB;NR = 0,37; QR/N = 0,01; and QNR/N = 0,99.

Being a pure perceptron, I am dumb as f**k. I can learn by pure experimentation. I have ambitions, though, to be smarter, thus to add some deep learning to my repertoire. I estimate the relative mutual fitness of my variables according to the V(xi) formula given earlier, as arithmetical average of each variable separately and its Euclidean distance to others. With the initial values as given, I observe: V(LCOER; t0) = 0,302691788; V(LCOENR; t0) = 0,310267104; V(KR; t0) = 0,410347388; V(KNR; t0) = 0,363680721; V(IR ; t0) = 0,300647174; V(INR ; t0) = 0,652537097; V(PA;R ; t0) = 0,441356844 ; V(PA;NR ; t0) = 0,300683099 ; V(PB;R ; t0) = 0,316248176 ; V(PB;NR ; t0) = 0,293252713 ; V(QR/N ; t0) = 0,410347388 ; and V(QNR/N ; t0) = 0,570485945. All that stuff put together into an overall fitness estimation is like average V*(x; t0) = 0,389378787.

I ask myself: what happens to that fitness function as I process information with my two alternative neural functions, the sigmoid or the hyperbolic tangent. I jump to experimental round 1500, thus to t1500, and I watch. With the sigmoid, I have V(LCOER; t1500) = 0,359529289; V(LCOENR; t1500) = 0,367104605; V(KR; t1500) = 0,467184889; V(KNR; t1500) = 0,420518222; V(IR; t1500) = 0,357484675; V(INR; t1500) = 0,709374598; V(PA;R; t1500) = 0,498194345; V(PA;NR; t1500) = 0,3575206; V(PB;R; t1500) = 0,373085677; V(PB;NR; t1500) = 0,350090214; V(QR/N; t1500) = 0,467184889; and V(QNR/N; t1500) = 0,570485945, with average V*(x; t1500) = 0,441479829.

Hmm, interesting. Working my way through intelligent cognition with a sigmoid, after 1500 rounds of experimentation, I have somehow decreased the mutual fitness of decisions I make through individual variables. Those V(xi)’s have changed. Now, let’s see what it gives when I do the same with the hyperbolic tangent: V(LCOER; t1500) =   0,347752478; V(LCOENR; t1500) =  0,317803169; V(KR; t1500) =   0,496752021; V(KNR; t1500) = 0,436752021; V(IR ; t1500) =  0,312040791; V(INR ; t1500) =  0,575690006; V(PA;R ; t1500) =  0,411438698; V(PA;NR ; t1500) =  0,312052766; V(PB;R ; t1500) = 0,370346458; V(PB;NR ; t1500) = 0,319435252; V(QR/N ; t1500) =  0,496752021; and V(QNR/N ; t1500) = 0,570485945, with average V*(x; t1500) =0,413941802.

Well, it is becoming more and more interesting. Being a dumb perceptron, I can, nevertheless, create two different states of mutual fitness between my decisions, depending on the kind of neural function I use. I want to have a bird’s eye view on the whole thing. How can a perceptron have a bird’s eye view of anything? Simple: it rents a drone. How can a perceptron rent a drone? Well, how smart do you have to be to rent a drone? Anyway, it gives something like the graph below:

 

Wow! So this is what I do, as a perceptron, and what I haven’t been aware of so far? Amazing. When I think in sigmoid, I sort of consistently increase the relative distance between my decisions, i.e. I decrease their mutual fitness. The sigmoid, that function which sort of calms down any local disturbance, makes the decision-making process less coherent, more prone to embracing a little chaos. The hyperbolic tangent thinking is different. It occasionally stretches across a broader spectrum of fitness in decisions, but as soon as it does so, it seems to be afraid of its own actions, and returns to the initial level of V*(x). Please note that, as a perceptron, I am almost alive, and I produce slightly different outcomes in each instance of myself. The point is that in the line corresponding to the hyperbolic tangent, the comb-like pattern of small oscillations can stretch and move from instance to instance. Still, it keeps the general form of a comb.

OK, so this is what I do, and now I ask myself: how can I possibly learn on that thing I have just become aware I do? As a perceptron, endowed with this precise logical structure, I can do one thing with information: I can arithmetically add it to my input. Still, having some ambitions for evolving, I attempt to change my logical structure, and I risk incorporating the observable V(xi) into my neural activation function. Thus, the first thing I do with that new learning is to top the values of input variables with the local fitness functions observed in the previous round of experimenting. I am doing it already with the local errors observed in outcome variables, so why not double the dose of learning? Anyway, it goes like: xi(t) = xi(t-1) + e(xi; t-1) + V(xi; t-1). It looks interesting, but I am still using just a fraction of information about myself, i.e. just that about input variables. Here is where I start being really ambitious. In the equation of the sigmoid function, I change s = 1 / [1 + exp(∑xi*wi)] into s = 1 / [1 + exp(∑xi*wi + V(TO))], where V(TO) stands for the local fitness functions observed in output variables. I do the same, by analogy, in my version based on the hyperbolic tangent. The th = [exp(2*∑xi*wi) - 1] / [exp(2*∑xi*wi) + 1] turns into th = {exp[2*∑xi*wi + V(TO)] - 1} / {exp[2*∑xi*wi + V(TO)] + 1}. I do what I know how to do, i.e. adding information from fresh observation, and I apply it to change the structure of my neural function.
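Here is a literal transcription of those formulas in Python, as a sketch rather than a definitive implementation: the exponent signs are kept exactly as in the expressions above, and I represent V(TO) as a single number (the summed local fitness of the output variables), which is my reading of how it enters the exponent.

```python
import numpy as np

def top_up_inputs(x_prev, e_prev, v_prev):
    # xi(t) = xi(t-1) + e(xi; t-1) + V(xi; t-1)
    return x_prev + e_prev + v_prev

def sigmoid_with_fitness(x, w, v_out):
    # s = 1 / [1 + exp(∑ xi*wi + V(TO))], exponent written as in the text above
    return 1.0 / (1.0 + np.exp(np.dot(x, w) + v_out))

def tanh_with_fitness(x, w, v_out):
    # th = {exp[2*∑ xi*wi + V(TO)] - 1} / {exp[2*∑ xi*wi + V(TO)] + 1}
    z = 2.0 * np.dot(x, w) + v_out
    return (np.exp(z) - 1.0) / (np.exp(z) + 1.0)
```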

All those ambitious changes in myself, put together, change my pattern of learning, as shown in the graph below:

When I think sigmoid, the fact of feeding back my own fitness function does not change much. It makes the learning curve a bit steeper in the early experimental rounds, and makes it asymptotic to a slightly lower threshold in the last rounds, as compared to learning without feedback on V(xi). Yet, it is the same old sigmoid, with just its sleeves ironed. On the other hand, the hyperbolic tangent thinking changes significantly. What used to look like a comb, without feedback, now looks much more aggressive, like a plough on steroids. There is something like a complex cycle of learning on the internal cohesion of decisions made. Generally, feeding back the observable V(xi) increases the finally achieved cohesion in decisions, and, at the same time, it reduces the cumulative error gathered by the perceptron. With that type of feedback, the cumulative error of the sigmoid, which normally hits around 2,2 in this case, falls to something like 0,8. With the hyperbolic tangent, cumulative errors which used to be 0,6 ÷ 0,8 without feedback fall to 0,1 ÷ 0,4 with feedback on V(xi).

 

The (provisional) piece of wisdom I can have as my takeaway is twofold. Firstly, whatever I do, a large chunk of perceptual learning leads to a bit less cohesion in my decisions. As I learn by experience, I allow myself more divergence in decisions. Secondly, looping on that divergence, and including it explicitly in my pattern of learning leads to relatively more cohesion at the end of the day. Still, more cohesion has a price – less learning.

 


[1] De Vincenzo, I., Massari, G. F., Giannoccaro, I., Carbone, G., & Grigolini, P. (2018). Mimicking the collective intelligence of human groups as an optimization tool for complex problems. Chaos, Solitons & Fractals, 110, 259-266.

More vigilant than sigmoid

My editorial on YouTube

 

I keep working on the application of neural networks as simulators of collective intelligence. The particular field of research I am diving into is the sector of energy, its shift towards renewable energies, and the financial scheme I invented some time ago, which I called EneFin. As for that last one, you can consult « The essential business concept seems to hold », in order to grasp the outline.

I continue developing the line of research I described in my last update in French: « De la misère, quoi ». There are observable differences in the prices of energy according to the size of the buyer. In many countries – practically in all the countries of Europe – there are two distinct price brackets. One, which I further designate as PB, is reserved for contracts with big consumers of energy (factories, office buildings etc.) and it is clearly lower. The other one, further called PA, is applied to small buyers, mainly households and really small businesses.

As an economist, I have that intuitive thought in the presence of price forks: that differential in prices is some kind of value. If it is value, why not give it some financial spin? I came up with the idea of the EneFin contract. People buy energy, in the amount Q, from a local supplier who sources it from renewables (water, wind etc.), and they pay the price PA, thus generating a financial flow equal to Q*PA. That flow buys two things: energy priced at PB, and participatory titles in the capital of their supplier, for the differential Q*(PA – PB). I imagine some kind of crowdfunding platform, which could channel the amount of capital K = Q*(PA – PB).
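With made-up numbers, the arithmetic of one such contract looks like this (the quantities and prices below are purely illustrative, not calibrated to any real market):

```python
# Illustrative figures only
Q = 2000.0     # kWh bought by a small consumer over the contract period
P_A = 0.25     # EUR per kWh, the small-consumer price bracket
P_B = 0.17     # EUR per kWh, the big-consumer price bracket

payment = Q * P_A              # what the small consumer actually pays: 500.0
energy_component = Q * P_B     # the energy itself, valued at the big-consumer price: 340.0
K = Q * (P_A - P_B)            # the crowdfunded, equity-like component: 160.0

print(payment, energy_component, K)
```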

That K remains in some sort of fluid relationship to I, or capital invested in the productive capacity of energy suppliers. Fluid relationship means that each of those capital balances can date other capital balances, no hard feelings held. As we talk (OK, I talk) about prices of energy and capital invested in capacity, it is worth referring to the LCOE, or Levelized Cost Of Electricity. The LCOE is essentially the levelized, per-unit cost of producing electricity, and a no-go-below limit for energy prices.

I want to simulate the possible process of introducing that general financial concept, namely K = Q*(PA – PB), into the market of energy, in order to promote the development of diversified networks, made of local suppliers in renewable energy.

Here comes my slightly obsessive methodological idea: use artificial intelligence in order to simulate the process. In the classical economic method, I make a model, I take empirical data, I regress some of it on another some of it, and I come up with coefficients of regression, and they tell me how the thing should work if we were living in a perfect world. Artificial intelligence opens a different perspective. I can assume that my model is a logical structure, which keeps experimenting with itself, and we don’t the hell know where exactly that experimentation leads. I want to use neural networks in order to represent the exact way that social structures can possibly experiment with that K = Q*(PA – PB) thing. Instead of optimizing, I want to see the way that possible optimization can occur.

I have that simple neural network, which I already referred to in « The point of doing manually what the loop is supposed to do » and which is basically quite dumb, as it does not do any abstraction. Still, it nicely experiments with logical structures. I am sketching its logical structure in the picture below. I distinguish four layers of neurons: input, hidden 1, hidden 2, and output. When I say ‘layers’, it is a bit of grand language. For the moment, I am working with one single neuron in each layer. It is more of a synaptic chain.

Anyway, the input neuron feeds data into the chain. In the first round of experimentation, it feeds the source data in. In consecutive rounds of learning through experimentation, that first neuron assesses and feeds back local errors, measured as discrepancies between the output of the output neuron, and the expected values of output variables. The input neuron is like the first step in a chain of perception, in a nervous system: it receives and notices the raw external information.

The hidden layers – or the hidden neurons in the chain – modify the input data. The first hidden neuron generates quasi-random weights, which the second hidden neuron attributes to the input variables. Just as in a nervous system, the input stimuli are assessed as for their relative importance. In the original algorithm of the perceptron, which I used to design this network, those two functions, i.e. generating the random weights and attributing them to input variables, were fused in one equation. Still, my fundamental intent is to use neural networks to simulate collective intelligence, and I intuitively guess those two functions are somehow distinct. Pondering the importance of things is one action, and using that ponderation for practical purposes is another. It is like scientists debating about the way to run a policy, and the government actually having the thing done. These are two separate paths of action.

Whatever. What the second hidden neuron produces is a compound piece of information: the summation of input variables multiplied by random weights. The output neuron transforms this compound data through a neural function. I prepared two versions of this network, with two distinct neural functions: the sigmoid, and the hyperbolic tangent. As I found out, the way they work is very different, just as the results they produce. Once the output neuron generates the transformed data – the neural output – the input neuron measures the discrepancy between the original, expected values of output variables, and the values generated by the output neuron. The exact way of computing that discrepancy is made of two operations: calculating the local derivative of the neural function, and multiplying that derivative by the residual difference ‘original expected output value minus output value generated by the output neuron’. The so calculated discrepancy is considered as a local error, and is being fed back into the input neuron as an addition to the value of each input variable.

Before I go into describing the application I made of that perceptron, as regards my idea for a financial scheme, I want to delve into the mechanism of learning triggered through repeated looping of that logical structure. The input neuron measures the arithmetical difference between the expected output and the output generated by the network in the preceding round of experimentation, and that difference is multiplied by the local derivative of said output. Derivative functions, in their deepest, Newtonian sense, are magnitudes of change in something else, i.e. in their base function. In the Newtonian perspective, everything that happens can be seen either as change (derivative) in something else, or as an integral (an aggregate that changes its shape) of still something else. When I multiply the local deviation from expected values by the local derivative of the estimated value, I assume this deviation is as important as the local magnitude of change in its estimation. The faster things happen, the more important they are, so to say. My perceptron learns by assessing the magnitude of local changes it induces in its own estimations of reality.

I took that general logical structure of the perceptron, and I applied it to my core problem, i.e. the possible adoption of the new financial scheme in the market of energy. Here comes sort of an originality in my approach. The basic way of using neural networks is to give them a substantial set of real data as learning material, make them learn on that data, and then make them optimize a hypothetical set of data. Here you have those 20 old cars, take them into pieces and try to put them back together, observe all the anomalies you have thus created, and then make me a new car on the grounds of that learning. I adopted a different approach. My focus is to study the process of learning in itself. I took just one set of actual input values, exogenous to my perceptron, something like an initial situation. I ran 5000 rounds of learning in the perceptron, on the basis of that initial set of values, and I observed how learning takes place.
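Here is how that learning loop could be sketched in Python. It is my reading of the mechanism described above, not the exact spreadsheet behind the tables further below: in particular, I assume the two output variables act as two parallel outputs of the single output neuron, and that their local errors are summed before being fed back, so the numbers it yields will not match Tables 2 – 5 digit for digit.

```python
import numpy as np

rng = np.random.default_rng()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def tanh_prime(z):
    return 1.0 - np.tanh(z) ** 2

def run_perceptron(x0, y_expected, activation="sigmoid", rounds=5000):
    """One instance of learning: 'rounds' consecutive experiments on a single
    starting vector of input variables and a fixed tensor of expected outputs."""
    f, f_prime = (sigmoid, sigmoid_prime) if activation == "sigmoid" else (np.tanh, tanh_prime)
    x = np.array(x0, dtype=float)
    y_expected = np.array(y_expected, dtype=float)
    cumulative_error = 0.0
    for _ in range(rounds):
        w = rng.uniform(0.0, 1.0, size=x.shape)               # hidden neuron 1: quasi-random weights
        h = np.dot(x, w)                                      # hidden neuron 2: weighted summation
        y = f(h)                                              # output neuron: neural activation
        local_error = np.sum(f_prime(h) * (y_expected - y))   # derivative times residual, summed over outputs
        x = x + local_error                                   # fed back as an addition to every input variable
        cumulative_error += local_error
    return x, cumulative_error

# Starting point of the 'virtually no crowdfunding' scenario from Table 1 below,
# in the order of the input tensor TI, with the fixed output tensor TO:
x0 = [0.26, 0.48, 0.01, 0.01, 0.46, 0.99, 0.71, 0.46, 0.20, 0.37]
learnt_x, cum_err = run_perceptron(x0, y_expected=[0.95, 0.48], activation="sigmoid")
```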

My initial set of data is made of two tensors: input TI and output TO.

The thing I am the most focused on is the relative abundance of energy supplied from renewable sources. I express the ‘abundance’ part mathematically as the coefficient of energy consumed per capita, or Q/N. The relative bend towards renewables, or towards the non-renewables is apprehended as the distinction between renewable energy QR/N consumed per capita, and the non-renewable one, the QNR/N, possibly consumed by some other capita. Hence, my output tensor is TO = {QR/N; QNR/N}.

I hypothesise that TO is being generated by input made of prices, costs, and capital outlays. I split my price fork PA – PB (the small-consumer price minus the big-consumer price) into renewables and non-renewables, namely into: PA;R, PA;NR, PB;R, and PB;NR. I mirror the distinction in prices with that in the cost of energy, and so I have LCOER and LCOENR. I want to create a financial scheme that generates a crowdfunded stream of capital K, to finance new productive capacities, and I want it to finance renewable energies, so I call it KR. Still, some other people, like my compatriots in Poland, might be so attached to fossils that they would be willing to crowdfund new installations based on non-renewables. Thus, I need to take into account a KNR in the game. When I say capital, and I say LCOE, I sort of feel compelled to say aggregate investment in productive capacity, in renewables and in non-renewables, and I call it, respectively, IR and INR. All in all, my input tensor spells TI = {LCOER, LCOENR, KR, KNR, IR, INR, PA;R, PA;NR, PB;R, PB;NR}.

The next step is scale and measurement. The neural functions I use in my perceptron like having their input standardized. Their tastes in standardization differ a little. The sigmoid likes it nicely spread between 0 and 1, whilst the hyperbolic tangent, the more reckless of the two, tolerates -1 ≤ x ≤ 1. I chose to standardize the input data between 0 and 1, so as to make it fit into both. My initial thought was to aim for an energy market with great abundance of renewable energy, and a relatively declining supply of non-renewables. I generally trust my intuition, only I like to leverage it with a bit of chaos every now and then, and so I ran some pseudo-random strings of values and I chose an output tensor made of TO = {QR/N = 0,95; QNR/N = 0,48}.

That state of output is supposed to be somehow logically connected to the state of input. I imagined a market where the relative abundance in the consumption of, respectively, renewable energies and non-renewable ones is mostly driven by growing demand for the former, and declining demand for the latter. Thus, I imagined a relatively high small-user price for renewable energy and a large fork between that PA;R and the PB;R. As for non-renewables, the fork in prices is more restrained (than in the market of renewables), and its top value is relatively lower. The non-renewable power installations are almost fed up with investment INR, whilst the renewables could still do with more capital IR in productive assets. The LCOENR of non-renewables is relatively high, although not very: yes, you need to pay for the fuel itself, but you have economies of scale. As for the LCOER of renewables, it is pretty low, which actually reflects the present situation in the market.

The last part of my input tensor regards the crowdfunded capital K. I assumed two different initial situations. Firstly, there is virtually no crowdfunding, thus a very low K. Secondly, some crowdfunding is already alive and kicking, and it sits slightly above half of what people expect in the industry.

Once again, I applied those qualitative assumptions to a set of pseudo-random values between 0 and 1. Here comes the result, in the table below.

 

Table 1 – The initial values for learning in the perceptron

| Tensor    | Variable | The market with virtually no crowdfunding | The market with significant crowdfunding |
| Input TI  | LCOER    | 0,26 | 0,26 |
| Input TI  | LCOENR   | 0,48 | 0,48 |
| Input TI  | KR       | 0,01 | 0,56 |
| Input TI  | KNR      | 0,01 | 0,52 |
| Input TI  | IR       | 0,46 | 0,46 |
| Input TI  | INR      | 0,99 | 0,99 |
| Input TI  | PA;R     | 0,71 | 0,71 |
| Input TI  | PA;NR    | 0,46 | 0,46 |
| Input TI  | PB;R     | 0,20 | 0,20 |
| Input TI  | PB;NR    | 0,37 | 0,37 |
| Output TO | QR/N     | 0,95 | 0,95 |
| Output TO | QNR/N    | 0,48 | 0,48 |

(The two scenarios differ only in the crowdfunding variables KR and KNR.)

 

The way the perceptron works means that it generates and feeds back local errors in each round of experimentation. Logically, over the 5000 rounds of experimentation, each input variable gathers those local errors, like a snowball rolling downhill. I take the values of input variables from the last, i.e. the 5000th round: they have the initial values, from the table above, and, on top of them, there is the cumulative error from the 5000 experiments. How to standardize them, so as to make them comparable with the initial ones? I observe: all those final values have the same cumulative error in them, across the whole TI input tensor. I choose a simple method for standardization. As the initial values were standardized over the interval between 0 and 1, I standardize the outcoming values over the interval 0 ≤ x ≤ (1 + cumulative error).
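In code, that re-standardization is just a division, under my assumption that every input variable carries exactly the same accumulated error on top of its initial value:

```python
def restandardize(x_learnt, cumulative_error):
    # maps values learnt on the 0 ... (1 + cumulative error) interval back onto a 0-1 scale,
    # so they stay comparable with the initial, 0-1 standardized inputs
    return x_learnt / (1.0 + cumulative_error)

# Example: an initial LCOER of 0,26 plus a cumulative error of 2,11 gives 2,37,
# which re-standardizes to roughly 0,76 - the order of magnitude visible in Table 2
print(round(restandardize(0.26 + 2.11, 2.11), 4))
```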

I observe the unfolding of cumulative error along the path of learning, made of 5000 steps. There is a peculiarity in each of the neural functions used: the sigmoid, and the hyperbolic tangent. The sigmoid learns in a slightly Hitchcockian way. Initially, local errors just rocket up. It is as if that sigmoid was initially yelling: ‘F******k! What a ride!’. Then, the value of errors drops very sharply, down to something akin to a vanishing tremor, and starts hovering lazily over some implicit asymptote. Hyperbolic tangent learns differently. It seems to do all it can to minimize local errors whenever it is possible. Obviously, it is not always possible. Every now and then, that hyperbolic tangent produces an explosively high value of local error, like a sudden earthquake, just to go back into forced calm right after. You can observe those two radically different ways of learning in the two graphs below.

Two ways of learning – the sigmoidal one and the hyper-tangential one – bring interestingly different results, and so do the two different initial assumptions about the crowdfunded capital K. Tables 2 – 5, further below, list the results I got. A bit of additional explanation will not hurt. For every version of learning, i.e. sigmoid vs hyperbolic tangent, and K = 0,01 vs K ≈ 0,5, I ran 5 instances of 5000 rounds of learning in my perceptron. This is the meaning of the word ‘Instance’ in those tables. One instance is like a tensor of learning: one happening of 5000 consecutive experiments. The values of output variables remain constant all the time: TO = {QR/N = 0,95; QNR/N = 0,48}. The perceptron sweats in order to come up with some interesting combination of input variables, given this precise tensor of output.

 

Table 2 – Outcomes of learning with the sigmoid, no initial crowdfunding

 

The learnt values of input variables after 5000 rounds of learning (sigmoid, no initial crowdfunding)

| Variable         | Instance 1 | Instance 2 | Instance 3 | Instance 4 | Instance 5 |
| cumulative error | 2,11   | 2,11   | 2,09   | 2,12   | 2,16   |
| LCOER            | 0,7617 | 0,7614 | 0,7678 | 0,7599 | 0,7515 |
| LCOENR           | 0,8340 | 0,8337 | 0,8406 | 0,8321 | 0,8228 |
| KR               | 0,6820 | 0,6817 | 0,6875 | 0,6804 | 0,6729 |
| KNR              | 0,6820 | 0,6817 | 0,6875 | 0,6804 | 0,6729 |
| IR               | 0,8266 | 0,8262 | 0,8332 | 0,8246 | 0,8155 |
| INR              | 0,9966 | 0,9962 | 1,0045 | 0,9943 | 0,9832 |
| PA;R             | 0,9062 | 0,9058 | 0,9134 | 0,9041 | 0,8940 |
| PA;NR            | 0,8266 | 0,8263 | 0,8332 | 0,8247 | 0,8155 |
| PB;R             | 0,7443 | 0,7440 | 0,7502 | 0,7425 | 0,7343 |
| PB;NR            | 0,7981 | 0,7977 | 0,8044 | 0,7962 | 0,7873 |

 

 

Table 3 – Outcomes of learning with the sigmoid, with substantial initial crowdfunding

 

The learnt values of input variables after 5000 rounds of learning (sigmoid, substantial initial crowdfunding)

| Variable         | Instance 1 | Instance 2 | Instance 3 | Instance 4 | Instance 5 |
| cumulative error | 1,98   | 2,01   | 2,07   | 2,03   | 1,96   |
| LCOER            | 0,7511 | 0,7536 | 0,7579 | 0,7554 | 0,7494 |
| LCOENR           | 0,8267 | 0,8284 | 0,8314 | 0,8296 | 0,8255 |
| KR               | 0,8514 | 0,8529 | 0,8555 | 0,8540 | 0,8504 |
| KNR              | 0,8380 | 0,8396 | 0,8424 | 0,8407 | 0,8369 |
| IR               | 0,8189 | 0,8207 | 0,8238 | 0,8220 | 0,8177 |
| INR              | 0,9965 | 0,9965 | 0,9966 | 0,9965 | 0,9965 |
| PA;R             | 0,9020 | 0,9030 | 0,9047 | 0,9037 | 0,9014 |
| PA;NR            | 0,8189 | 0,8208 | 0,8239 | 0,8220 | 0,8177 |
| PB;R             | 0,7329 | 0,7356 | 0,7402 | 0,7375 | 0,7311 |
| PB;NR            | 0,7891 | 0,7913 | 0,7949 | 0,7927 | 0,7877 |

 

 

 

 

 

Table 4 – Outcomes of learning with the hyperbolic tangent, no initial crowdfunding

 

The learnt values of input variables after 5000 rounds of learning (hyperbolic tangent, no initial crowdfunding)

| Variable         | Instance 1 | Instance 2 | Instance 3 | Instance 4 | Instance 5 |
| cumulative error | 1,1    | 1,27   | 0,69   | 0,77   | 0,88   |
| LCOER            | 0,6470 | 0,6735 | 0,5599 | 0,5805 | 0,6062 |
| LCOENR           | 0,7541 | 0,7726 | 0,6934 | 0,7078 | 0,7257 |
| KR               | 0,5290 | 0,5644 | 0,4127 | 0,4403 | 0,4746 |
| KNR              | 0,5290 | 0,5644 | 0,4127 | 0,4403 | 0,4746 |
| IR               | 0,7431 | 0,7624 | 0,6797 | 0,6947 | 0,7134 |
| INR              | 0,9950 | 0,9954 | 0,9938 | 0,9941 | 0,9944 |
| PA;R             | 0,8611 | 0,8715 | 0,8267 | 0,8349 | 0,8450 |
| PA;NR            | 0,7432 | 0,7625 | 0,6798 | 0,6948 | 0,7135 |
| PB;R             | 0,6212 | 0,6497 | 0,5277 | 0,5499 | 0,5774 |
| PB;NR            | 0,7009 | 0,7234 | 0,6271 | 0,6446 | 0,6663 |

 

 

Table 5 – Outcomes of learning with the hyperbolic tangent, substantial initial crowdfunding

 

The learnt values of input variables after 5000 rounds of learning (hyperbolic tangent, substantial initial crowdfunding; negative values shown with minus signs)

| Variable         | Instance 1 | Instance 2 | Instance 3 | Instance 4 | Instance 5 |
| cumulative error | -0,33   | 0,2    | -0,06  | 0,98   | -0,25   |
| LCOER            | -0,1089 | 0,3800 | 0,2100 | 0,6245 | 0,0110  |
| LCOENR           | 0,2276  | 0,5681 | 0,4497 | 0,7384 | 0,3111  |
| KR               | 0,3381  | 0,6299 | 0,5284 | 0,7758 | 0,4096  |
| KNR              | 0,2780  | 0,5963 | 0,4856 | 0,7555 | 0,3560  |
| IR               | 0,1930  | 0,5488 | 0,4251 | 0,7267 | 0,2802  |
| INR              | 0,9843  | 0,9912 | 0,9888 | 0,9947 | 0,9860  |
| PA;R             | 0,5635  | 0,7559 | 0,6890 | 0,8522 | 0,6107  |
| PA;NR            | 0,1933  | 0,5489 | 0,4252 | 0,7268 | 0,2804  |
| PB;R             | -0,1899 | 0,3347 | 0,1522 | 0,5971 | -0,0613 |
| PB;NR            | 0,0604  | 0,4747 | 0,3306 | 0,6818 | 0,1620  |

 

The cumulative error, the first numerical line in each table, is something like memory. It is a numerical expression of how much experience the perceptron has accumulated in the given instance of learning. Generally, the sigmoid neural function accumulates more memory, as compared to the hyper-tangential one. Interesting. The way of processing information affects the amount of experiential data stored in the process. If you use the links I gave earlier, you will see different logical structures in those two functions. The sigmoid generally smooths out anything it receives as input. It puts the incoming, compound data in the negative exponent of Euler’s constant e ≈ 2,72, and then it puts the resulting value in the denominator of 1. The sigmoid is like a bumper: it absorbs shocks. The hyperbolic tangent is different. It sort of exposes small discrepancies in input. In human terms, the hyper-tangential function is more vigilant than the sigmoid. As can be observed in this precise case, absorbing shocks leads to more accumulated experience than vigilantly reacting to observable change.

The difference in cumulative error, observable between the sigmoid-based perceptron and that based on the hyperbolic tangent, is particularly sharp in the case of a market with substantial initial crowdfunding K. In 3 instances out of 5, in that scenario, the hyper-tangential perceptron yields a negative cumulative error. It can be interpreted as the removal of some memory, implicitly contained in the initial values of input variables. When the initial K is assumed to be 0,01, the difference in accumulated memory, observable between the two neural functions, significantly shrinks. It looks as if K ≥ 0,5 were some kind of disturbance that the vigilant hyperbolic tangent attempts to eliminate. That impression of disturbance created by K ≥ 0,5 is even reinforced as I synthetically compare all four sets of outcomes, i.e. tables 2 – 5. The case of learning with the hyperbolic tangent, and with substantial initial crowdfunding, looks radically different from everything else. The discrepancy between alternative instances seems to be the greatest in this case, and the incidentally negative values in the input tensor suggest some kind of deep shake-off. Negative prices and/or negative costs mean that someone external is paying for the ride, probably the taxpayers, in the form of some fiscal stimulation.
