Representative for collective intelligence

I am generalizing from the article which I am currently revising, and I am taking a broader view on the many specific strands of research I am running, mostly in order to move forward with my hypothesis of collective intelligence in human social structures. I want to recapitulate my method – once more – in order to extract and understand its meaning.

I have recently realized a few things about my research. Firstly, I am using the logical structure of an artificial neural network as a simulator rather than an optimizer, as digital imagination rather than functional, goal-oriented intelligence, and that seems to be a way of using AI which hardly anyone else in social sciences practises. The big question which I am (re)asking myself is to what extent my simulations are representative of the collective intelligence of human societies.

I start gently, with variables, hence with my phenomenology. I mostly use the commonly accessible and published variables, such as those published by the World Bank, the International Monetary Fund, STATISTA etc. Sometimes, I make my own coefficients out of those commonly accepted metrics, e.g. the coefficient of resident patent applications per 1 million people, the proportion between the density of population in cities and the general one, or the coefficient of fixed capital assets per 1 patent application.

My take on any variables in social sciences is very strongly phenomenological, or even hermeneutic. I follow the line of logic which you can find, for example, in “Phenomenology of Perception” by Maurice Merleau-Ponty (reprint, revised, Routledge, 2013, ISBN 1135718601, 9781135718602). I assume that any of the metrics we have in social sciences is an entanglement of our collective cognition with the actual s**t going on. As the actual s**t going on encompasses our way of forming our collective cognition, any variable used in social sciences is very much like a person’s attempt to look at themselves from a distance. Yes! This is what we use mirrors for! Variables used in social sciences are mirrors. Still, they are mirrors made largely by trial and error, with a little bit of a shaky hand, and each of them shows actual social reality in a slightly distorted manner.

Empirical research in social sciences consists, very largely, in a group of people trying to guess something about themselves on the basis of repeated looks into a set of imperfect mirrors. Those mirrors are imperfect, and yet they serve some purpose. I pass to my second big phenomenological take on social reality, namely that our entangled observations thereof are far from being haphazard. The furtive looks we catch of the phenomenal soup, out there, are purposeful. We pay attention to things which pay off. We define specific variables in social sciences because we know by experience that paying attention to those aspects of social reality brings concrete rewards, whilst not paying attention thereto can hurt, like bad.

Let’s take inflation. Way back in the day, like 300 years ago, no one really used the term ‘inflation’, because the monetary system consisted of a multitude of currencies, mixing private and public deeds of various kinds. Entire provinces in European countries could rely on bills of exchange issued by influential merchants and bankers, just to switch to another type of bills 5 years later. Fluctuations in the rates of exchange between those multiple currencies very largely cancelled each other out. Each business of respectable size was like a local equivalent of today’s Forex exchange. Inflation was a metric which did not even make sense at the time, as any professional of finance would intuitively ask back: ‘Inflation? Like… inflation in which exactly of those 27 currencies I use every day?’.

Standardized monetary systems, which we call ‘fiat money’ today, steadied themselves only in the 19th century. Multiple currencies progressively fused into one homogenized monetary mass, and mass conveys energy. Inflation is a loss of monetary energy, like entropy of the monetary mass. People started paying attention to inflation when it started to matter.

We make our own social reality, which is fundamentally unobservable to us, and that makes sense, because it is hard to have an objective, external look at a box when we are inside the box. Living in that box, we have learnt, over time, how to pay attention to the temporarily important properties of the box. We have learnt how to use maths for fine tuning that selective perception of ours. We have learnt, for example, to replace the basic distinction between people doing business and people not doing business at all with finer shades of exactly how much business people are doing in a unit of time-space.

Therefore, a set of empirical variables, e.g. from the World Bank, is a collection of imperfect observations, which represent valuable social outcomes. A set of N socio-economic variables represents N collectively valuable social outcomes, which, in turn, correspond to N collective pursuits – it is a set of collective orientations. Now, my readers have the full right to protest: ‘Man, just chill. You are getting carried away by your own ideas. Quantitative variables about society and economy are numbers, right? They are the metrics of something. Measurement is objective and dispassionate. How can you say that objectively gauged metrics are collective orientations?’. Yes, these are all valid objections, and I made up that little imaginary voice of my readers on the basis of reviews that I had for some of my papers.

Once again, then. We measure the things we care about, and we go to great lengths in creating accurate scales and methods of measurement for the things we very much care about. Collective coordination is costly and hard to achieve. If we devote decades of collective work to nail down the right way of measuring, e.g. the professional activity of people, it probably matters. If it matters, we are collectively after optimizing it. A set of quantitative, socio-economic variables represents a set of collectively pursued orientations.

In the branch of philosophy called ethics, there is a stream of thought labelled ‘contextual ethics’, whose proponents claim that whatever normatively defined values we say we stick to, the real values we stick to are to be deconstructed from our behaviour. Things we are recurrently and systematically after are our contextual ethical values. Yes, the socio-economic variables we can get from your average statistical office are informative about the contextual values of our society.

When I deal with a variable like the % of electricity in the total consumption of energy, I deal with a superimposition of two cognitive perspectives. I observe something that happens in the social reality, and that phenomenon takes the form of a spatially differentiated, complex state of things, which changes over time, i.e. one complex state transitions into another complex state etc. On the other hand, I observe a collective pursuit to optimize that % of electricity in the total consumption of energy.

The process of optimizing a socio-economic metric makes me think once again about the measurement of social phenomena. We observe and measure things which are important to us because they give us some sort of payoff. We can have collective payoffs in three basic ways. We can max out, for one. Case: Gross Domestic Product, access to sanitation. We can keep something as low as possible, for two. Case: murder, tuberculosis. Finally, we can maintain some kind of healthy dynamic balance. Case: inflation, use of smartphones. Now, let’s notice that we don’t really do fine calculations about murder or tuberculosis. Someone is healthy or sick, still alive or already murdered. Transitional states are not really of much collective interest. When it comes to outcomes which pay off by the absence of something, we tend to count them digitally, like ‘is there or isn’t there’. On the other hand, those other outcomes, which we max out on or keep in equilibrium, well, that’s another story. We invent and perfect subtle scales of measurement for those phenomena.

That makes me think about a seminal paper titled ‘Selection by Consequences’, by the founding father of behaviourism, Burrhus Frederic Skinner. Skinner introduced the distinction between positive and negative reinforcements. He claimed that negative reinforcements are generally stronger in shaping human behaviour, whilst being clumsier as well. We just run away from a tiger; we don’t really try to calibrate the right distance and the right speed of evasion. On the other hand, we tend to calibrate quite finely our reactions to positive reinforcements. We dose our food, we measure exactly the buildings we make, we learn by small successes etc.

If a set of quantitative socio-economic variables is informative about a set of collective orientations (collectively pursued outcomes), one of the ways we can study that set consists in establishing the hierarchy of orientations. Are some of those collective values more important than others? What does ‘more important’ even mean in this context, and how can it be assessed? We can imagine that each among the many collective orientations is an individual pursuing their idiosyncratic path of payoffs from interactions with the external world. By the way, this metaphor is closer to reality than it could appear at first sight. Each human is, in fact, a distinct orientation. Each of us is action. This perspective has been very sharply articulated by Martin Heidegger, in his “Being and Time”.

Hence, each collective orientation can be equated to an individual force, pulling the society in a specific direction. In the presence of many socio-economic variables, I assume the actual social reality is a superimposition of those forces. They can diverge or concur, as they please, I do not make any assumptions about that. Which of those forces pulls the most powerfully?

Here comes my mathematical method, in the form of an artificial neural network. I proceed step by step. What does it mean that we collectively optimize a metric? Mostly, it means making it coherent with our other orientations. Human social structures are based on coordination, and coordination happens both between social entities (individuals, cities, states, political parties etc.), and between different collective pursuits. Optimizing a metric representative of a collectively valuable outcome means coordinating with other collectively valuable outcomes. In that perspective, a phenomenon represented (imperfectly) by a socio-economic metric is optimized when it remains in some kind of correlation with other phenomena, represented by other metrics. The way I define correlation in that statement is broad: correlation is any concurrence of events displaying a repetitive, functional pattern.

Thus, when I study the force of a given variable as a collective orientation in a society, I take this variable as the hypothetical output of the process of collective orientation, and I simulate that process as the output variable sort of dragging the remaining variables behind it, by the force of functional coherence. With a given set of empirical variables, I make as many mutations thereof as I have variables. Each mutated set represents a process with one variable as output, and the remaining ones as input. The process consists of as many experiments as there are observational rows in my database. Most socio-economic variables come in rows of the type “country A in year X”.
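That mutation procedure can be sketched in a few lines of Python; the variable names and the toy numbers below are mine, purely illustrative:

```python
def mutate_dataset(rows, var_names):
    """rows: observational rows, each a list with one value per variable.
    Returns one (inputs, output) problem per variable, with that variable
    playing the output role and the remaining ones the input role."""
    problems = {}
    for k, name in enumerate(var_names):
        inputs = [[v for j, v in enumerate(row) if j != k] for row in rows]
        output = [row[k] for row in rows]
        problems[name] = (inputs, output)
    return problems

# Four toy "country A in year X" rows over three made-up variables
rows = [
    [0.2, 1.4, 3.0],
    [0.3, 1.5, 2.9],
    [0.4, 1.3, 3.1],
    [0.5, 1.6, 3.2],
]
problems = mutate_dataset(rows, ["patents", "energy", "gdp"])
# three mutated sets; e.g. in problems["gdp"], "patents" and "energy" are inputs
```

One empirical table thus yields as many simulated processes as there are variables, each process chasing a different output.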

Here, I do a little bit of mathematical cavalry with two different models of swarm intelligence: particle swarm and ant colony (see: Gupta & Srivastava 2020[1]). The model of particle swarm comes from the observation of birds, which keeps me in a state of awe about human symbolic creativity, and it models the way that flocks of birds stay collectively coherent when they fly around in search of food. Each socio-economic variable is a collective orientation, and in practical terms it corresponds to a form of social behaviour. Each such form of social behaviour is a bird, which observes and controls its distance from other birds, i.e. from other forms of social behaviour. Societies experiment with different ways of maintaining internal coherence between different orientations. Each distinct collective orientation observes and controls its distance from other collective orientations. From the perspective of an ant colony, each form of social behaviour is a pheromonal trace which other forms of social behaviour can follow and reinforce, or not give a s**t about, to their pleasure and leisure. Societies experiment with different strengths attributed to particular forms of social behaviour, which mimics an ant colony experimenting with different pheromonal intensities attached to different paths toward food.

Please, notice that both models – particle swarm and ant colony – mention food. Food is the outcome to achieve. Output variables in mutated datasets – which I create out of the empirical one – are the food to acquire. Input variables are the moves and strategies which birds (particles) or ants can perform in order to get food. Experimentation the ants’ way involves weighting each local input (i.e. the input of each variable in each experimental round) with a random weight R, 0 < R < 1. When experimenting the birds’ way, I drop into my model the average Euclidean distance E from the local input to all the other local inputs.

I want to present it all rolled nicely into an equation, and, as noblesse oblige, I introduce symbols. The local input of an input variable xi in experimental round tj is represented with xi(tj), whilst the local value of the output variable xo is written as xo(tj). The compound experimental input which the society makes, both the ants’ way and the birds’ way, is written as h(tj), and it spells h(tj) = x1(tj)*R*E[x1(tj-1)] + x2(tj)*R*E[x2(tj-1)] + … + xn(tj)*R*E[xn(tj-1)].
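A minimal sketch of that compound input h(tj) in Python, under my own assumptions: the local inputs are one-dimensional, so the Euclidean distance collapses to an absolute difference, and a fresh random weight R is drawn for every term:

```python
import random

def mean_distance(values, i):
    """E[x_i(t_{j-1})]: average distance from values[i] to all the other values."""
    others = [abs(values[i] - v) for j, v in enumerate(values) if j != i]
    return sum(others) / len(others)

def compound_input(curr, prev, rng):
    """h(t_j) = sum over i of x_i(t_j) * R * E[x_i(t_{j-1})], with R drawn in [0, 1)."""
    return sum(
        x * rng.random() * mean_distance(prev, i)
        for i, x in enumerate(curr)
    )

rng = random.Random(42)        # seeded, so the sketch is reproducible
prev = [0.2, 1.4, 3.0]         # local inputs in round t_{j-1}
curr = [0.3, 1.5, 2.9]         # local inputs in round t_j
h = compound_input(curr, prev, rng)
```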

Up to that point, this is not really a neural network. It mixes things up, but it does not really adapt. I mean… maybe there is a little intelligence? After all, when my variables act like a flock of birds, they observe each other’s position in the previous experimental round, through the E[xi(tj-1)] Euclidean thing. However, I still have no connection, at this point, between the compound experimental input h(tj) and the pursued output xo(tj). I need a connection which would work like an observer, something like a cognitive meta-structure.

Here comes the very basic science of artificial neural networks. There is a function called the hyperbolic tangent, which spells tanh(x) = (e^(2x) – 1)/(e^(2x) + 1), where x can be whatever you want. This function happens to be one of those used in artificial neural networks, as neural activation, i.e. as a way to mediate between a compound input and an expected output. When I have that compound experimental input h(tj) = x1(tj)*R*E[x1(tj-1)] + x2(tj)*R*E[x2(tj-1)] + … + xn(tj)*R*E[xn(tj-1)], I can put it in the place of x in the hyperbolic tangent, and I get tanh[h(tj)] = (e^(2h) – 1)/(e^(2h) + 1). In a neural network, the error in optimization can be calculated, generally, as e = xo(tj) – tanh[h(tj)]. That error can be fed forward into the next experimental round, and then we are talking, ‘cause the compound experimental input morphs into:

>>  input h(tj) = x1(tj)*R*E[x1(tj-1)]*e(tj-1) + x2(tj)*R*E[x2(tj-1)]*e(tj-1) + … + xn(tj)*R*E[xn(tj-1)]*e(tj-1)

… and that means that each compound experimental input takes into account both the coherence of the input in question (E), and the results of previous attempts to optimize.
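The whole loop (compound input, tanh activation, error fed forward) can be sketched as follows; all names and toy data are my own assumptions, and I set the error factor to a neutral 1 before the first round, a detail the text leaves open:

```python
import math
import random

def mean_distance(values, i):
    """Average distance from values[i] to all the other local inputs."""
    others = [abs(values[i] - v) for j, v in enumerate(values) if j != i]
    return sum(others) / len(others)

def run_process(inputs, outputs, rng):
    """inputs: list of rounds, each a list of x_i(t_j); outputs: x_o(t_j) per round."""
    errors = []
    e_prev = 1.0            # neutral error scaling before the first round
    prev = inputs[0]        # first round: previous inputs default to the first row
    for curr, target in zip(inputs, outputs):
        h = sum(
            x * rng.random() * mean_distance(prev, i) * e_prev
            for i, x in enumerate(curr)
        )
        activation = math.tanh(h)     # tanh(h) = (e^(2h) - 1)/(e^(2h) + 1)
        e_prev = target - activation  # e(t_j) = x_o(t_j) - tanh[h(t_j)]
        errors.append(e_prev)
        prev = curr
    return errors

rng = random.Random(0)
inputs = [[0.2, 1.4], [0.3, 1.5], [0.4, 1.3], [0.5, 1.6]]
outputs = [0.9, 0.8, 0.85, 0.9]
errors = run_process(inputs, outputs, rng)
```

Each round's error scales the next round's compound input, so the sequence of errors is the trace of the simulated collective learning.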

Here, I am a bit stuck. I need to explain how exactly the fact of computing the error of optimization e = xo(tj) – tanh[h(tj)] is representative of collective intelligence.

[1] Gupta, A., & Srivastava, S. (2020). Comparative analysis of ant colony and particle swarm optimization algorithms for distance optimization. Procedia Computer Science, 173, 245-253.

Tax on Bronze

I am trying to combine the line of logic which I developed in the proof-of-concept for the idea I labelled ‘Energy Ponds’ AKA ‘Project Aqueduct’ with my research on collective intelligence in human societies. I am currently doing a serious review of literature as regards the theory of complex systems, as it looks like just next door to my own conceptual framework. The general idea is to use the theory of complex systems – within the general realm of which the theory of cellular automata looks the most promising, for the moment – to simulate the emergence and absorption of a new technology in the social structure.

I started to sketch the big lines of that picture in my last update in French, namely in ‘L’automate cellulaire respectable’. I assume that any new technology burgeons inside something like a social cell, i.e. a group of people connected by common goals and interests, together with some kind of institutional vehicle, e.g. a company, a foundation etc. It is interesting to notice that new technologies develop through the multiplication of such social cells rather than through the linear growth of just one cell. Up to a point it is just one cell growing, something like the lone wolf of Netflix in the streaming business, and then ideas start breeding and having babies with other people.

I found an interesting quote in the book which is my roadmap through the theory of complex systems, namely in ‘What Is a Complex System?’ by James Ladyman and Karoline Wiesner (Yale University Press 2020, ISBN 978-0-300-25110-4). On page 56 (Kindle Edition), Ladyman and Wiesner write something interesting about collective intelligence in colonies of ants: ‘What determines a colony’s survival is its ability to grow quickly, because individual workers need to bump into other workers often to be stimulated to carry out their tasks, and this will happen only if the colony is large. Army ants, for example, are known for their huge swarm raids in pursuit of prey. With up to 200 000 virtually blind foragers, they form trail systems that are up to 20 metres wide and 100 metres long (Franks et al. 1991). An army of this size harvests prey of 40 grams and more each day. But if a small group of a few hundred ants accidentally gets isolated, it will go round in a circle until the ants die from starvation […]’.

Interesting. Should nascent technologies have an ant-like edge to them, their survival should be linked to their reaching some sort of critical size, which allows the formation of social interactions in an amount which, in turn, can assure proper orientation in all the social cells involved. Well, it looks like nascent technologies really are akin to ant colonies, because this is exactly what happens. When we want to push a technology from its age of early infancy into the phase of development, a critical size of the social network is required. Customers, investors, creditors, business partners… all that lot is necessary, once again in a threshold amount, to give a new technology the salutary kick in the ass, sending it into the orbit of big business.

I like jumping quickly between ideas and readings, with conceptual coherence being an excuse just as frequently as it is a true guidance, and here comes an article on urban growth, by Yu et al. (2021[1]). The authors develop a model of urban growth, based on empirical data for two British cities: Oxford and Swindon. The general theoretical idea here is that strictly urban areas are surrounded by places which are sort of in two minds about whether they like being city or countryside. These places can be represented as spatial cells, and their local communities are cellular automata which move cautiously, step by step, into alternative states of being more urban or more rural. Each such i-th cellular automaton displays a transition potential Ni, which is a local balance between the benefits of urban agglomeration Ni(U), as opposed to the benefits Ni(N) of conserving scarce non-urban resources. The story wouldn’t be complete without the shit-happens component Ri of randomness, and the whole story can be summarized as: Ni = Ni(U) – Ni(N) + Ri.
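A toy reading of that transition potential Ni = Ni(U) – Ni(N) + Ri; the numbers and the range of the random component are my own assumptions, not the calibration of Yu et al.:

```python
import random

def transition_potential(benefit_urban, benefit_nonurban, rng):
    """N_i = N_i(U) - N_i(N) + R_i for one peri-urban cell."""
    r_i = rng.uniform(-0.1, 0.1)            # the shit-happens component R_i
    return benefit_urban - benefit_nonurban + r_i

rng = random.Random(7)
n_i = transition_potential(benefit_urban=0.6, benefit_nonurban=0.4, rng=rng)
becomes_urban = n_i > 0                     # the cell flips toward the urban state
```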

Yu et al. (2021 op. cit.) add an interesting edge to the basic theory of cellular automata, such as presented e.g. in Bandini, Mauri & Serra (2001[2]), namely the component of different spatial scales. A spatial cell in a peri-urban area can be attracted to many spatial aspects of being definitely urban. Its people may consider the possible benefits of sharing the same budget for local schools in a perimeter of 5 kilometres, as well as the possible benefits of connecting to a big hospital 20 km away. Starting from there, it looks a bit gravitational. Each urban cell has a power of attraction for non-urban cells, yet that power decays exponentially with physical distance.
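That gravitational reading can be sketched as follows; the masses and the decay parameter are my own assumptions, just to show how exponential decay trades off against size:

```python
import math

def attraction(urban_mass, distance_km, decay=0.1):
    """Pull of an urban cell on a peri-urban cell, decaying as e^(-decay * d)."""
    return urban_mass * math.exp(-decay * distance_km)

school_pull = attraction(urban_mass=100.0, distance_km=5.0)     # school budget nearby
hospital_pull = attraction(urban_mass=300.0, distance_km=20.0)  # big hospital far away
```

With these numbers, the nearby school wins over the three-times-bigger but four-times-farther hospital, which is the whole point of the distance-decay term.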

I generalize. There are many technologies spreading across the social space, and each of them is like a city. I mean, it does not necessarily have a mayor, but it has dense social interactions inside, and those interactions create something like a gravitational force for external social cells. When a new technology gains new adherents, like new investors, new engineers, new business entities, it becomes sort of seen and known. I see two phases in the development of a nascent technology. Before it gains enough traction in order to exert significant gravitational force on the temporarily non-affiliated social cells, a technology grows through random interactions of the initially involved social cells. If those random interactions exceed a critical threshold, thus if there are enough forager ants in the game, their simple interactions create an emergence, which starts coagulating them into a new industry.

I return to cities and their growth, for a moment. I return to the story which Yu et al. (2021[3]) are tellinging. In my own story on a similar topic, namely in my draft paper ‘The Puzzle of Urban Density And Energy Consumption’, I noticed an amazing fact: whilst individual cities grow, others decay or even disappear, and the overall surface of urban areas on Earth seems to be amazingly stationary over many decades. It looks as if the total mass, and hence the total gravitational attraction, of all the cities on Earth were a constant over at least one human generation (20 – 25 years). Is it the same with technologies? I mean, is there some sort of constant total mass that all technologies on Earth have, within the lifespan of one human generation, with just specific technologies getting sucked into that mass whilst others drop out and become moons (i.e. cold, dry places with not much to do and hardly any air to breathe)?

What if a new technology spreads like TikTok, i.e. like a wildfire? There is science for everything, and there is some science about fires in peri-urban areas as well. That science is based on the same theory of cellular automata. Jiang et al. (2021[4]) present a model where territories prone to wildfires are mapped into grids of square cells. Each cell presents a potential to catch fire, through its local properties: vegetation, landscape, local climate. The spread of a wildfire from a given cell R0 is always based on the properties of the cells surrounding the fire.
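A minimal cellular-automaton fire-spread sketch in the spirit of that model; the grid, the ignition probabilities and the 4-cell neighbourhood are my own toy assumptions, not the paper's calibration:

```python
import random

def step_fire(burning, flammability, rng):
    """One update: a cell ignites with probability flammability[r][c]
    if any of its 4 neighbours is already burning."""
    rows, cols = len(flammability), len(flammability[0])
    nxt = set(burning)
    for (r, c) in burning:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in nxt:
                if rng.random() < flammability[nr][nc]:
                    nxt.add((nr, nc))
    return nxt

flammability = [[0.9] * 5 for _ in range(5)]   # uniformly dry vegetation
burning = {(2, 2)}                             # the ignition cell R0
rng = random.Random(1)
for _ in range(3):                             # three update rounds
    burning = step_fire(burning, flammability, rng)
```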

Cirillo, Nardi & Spitoni (2021[5]) present an interesting mathematical study of what happens when, in a population of cellular automata, each local automaton updates itself into a state which is a function of the preceding state in the same cell, as well as of the preceding states in the two neighbouring cells. It means, among other things, that if we add the dimension of time to any finite space Zd where cellular automata dwell, the immediately future state of a cell is a component of the available neighbourhood for the present state of that cell. Cirillo, Nardi & Spitoni (2021) demonstrate, as well, that if we know the number and the characteristics of the possible states which one cellular automaton can take, like (-1, 0, 1), we can compute the total number of states that automaton can take in a finite number of moves. If we make many such cellular automata move in the same space Zd, a probabilistic chain of complex states emerges.
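A toy count in the spirit of that remark, under my own simplifying assumption that every state is reachable at every move: an automaton with the state set (-1, 0, 1) can realize at most 3^k distinct trajectories over k moves, and enumerating them confirms the count for small k:

```python
from itertools import product

states = (-1, 0, 1)       # the possible states of one cellular automaton
k = 4                     # number of moves
trajectories = list(product(states, repeat=k))
# len(trajectories) == 3 ** 4 == 81
```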

As I wrote in ‘L’automate cellulaire respectable’, I see a social cell built around a new technology, e.g. ‘Energy Ponds’, moving, in the first place, along two completely clear dimensions: the physical size of installations and the financial size of the balance sheet. Movements along these two axes are subject to influence happening along some foggy, unclear dimensions connected to preferences and behaviour: expected return on investment, expected future value of the firm, risk aversion as opposed to risk affinity etc. That makes me think, somehow, about a theory next door to that of cellular automata, namely the theory of swarms. This is a theory which explains complex changes in complex systems through changes in the strength of correlation between individual movements. According to the swarm theory, a complex set which behaves like a swarm can adapt to external stressors by making the moves of individual members more or less correlated with each other. A swarm in routine action has its members couple their individual behaviour rigidly, like marching in step. A swarm alerted by a new stressor can loosen that coupling a little, and allow individual members some play in their behaviour, like ‘If I do A, you do B or C or D, anyway one out of these three’. A swarm in mayhem loses it completely, and there is no behavioural coupling whatsoever between members.
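That sliding scale of behavioural coupling can be sketched as follows; the mixing rule and all parameters are my own assumptions, just to show the spectrum from lockstep to mayhem:

```python
import random

def member_moves(common_move, coupling, n_members, rng):
    """Each member's move mixes the swarm's common move with private noise.
    coupling = 1.0 -> marching in step; coupling = 0.0 -> no coupling at all."""
    return [
        coupling * common_move + (1.0 - coupling) * rng.uniform(-1.0, 1.0)
        for _ in range(n_members)
    ]

rng = random.Random(3)
routine = member_moves(common_move=0.5, coupling=1.0, n_members=5, rng=rng)
mayhem = member_moves(common_move=0.5, coupling=0.0, n_members=5, rng=rng)
routine_spread = max(routine) - min(routine)   # zero: rigid lockstep
mayhem_spread = max(mayhem) - min(mayhem)      # large: uncorrelated moves
```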

When it comes to the development and societal absorption of a new technology, the central idea behind the swarm-theoretic approach is that in order to do something new, the social swarm has to shake it off a bit. Social entities need to loosen their mutual behavioural coupling so as to allow some of them to do something else than just ritually respond to the behaviour of others. I found an article which I can use to transition nicely from the theory of cellular automata to the swarm theory: Puzicha & Buchholz (2021[6]). The paper is essentially about the behaviour of robots, namely a swarm of 60 distributed autonomous mobile robots which need to coordinate through a communication network with low reliability and restricted capacity. In other words, sometimes those robots can communicate with each other, and sometimes they can’t. When some robots out of the 60 are having a chat, they can jam the restricted capacity of the network and thus bar the remaining robots from communicating. Incidentally, this is how innovative industries work. When a few companies, let’s say of the calibre of unicorns, are developing a new technology, they absorb the attention of investors, governments, potential business partners and potential employees. They jam the restricted field of attention available in the markets of, respectively, labour and capital.

Another paper from the same symposium ‘Intelligent Systems’, namely Serov, Voronov & Kozlov (2021[7]), leads in a slightly different direction. Whilst directly derived from the functioning of communication systems, mostly the satellite-based ones, the paper suggests a path of learning in a network where the capacity for communication is restricted, and the baseline method of balancing the whole thing is so burdensome for the network that it jams communication even further. You can compare it to a group of people who are all so vocal about the best way to allow each other to speak that they have no time and energy left for speaking their minds and listening to others. I have found another paper, which comes closer to explaining the behaviour of those individual agents when they coordinate only loosely. It is Gupta & Srivastava (2020[8]), who compare two versions of swarm intelligence: particle swarm and ant colony. The former (particle swarm) generalises a problem applicable to birds. Simple, isn’t it? A group of birds will randomly search for food. Birds don’t know where exactly the food is, so they follow the bird which is nearest to the food. The latter emulates the use of pheromones in a colony of ants. Ants selectively spread pheromones as they move around, and they find the right way of moving by following earlier deposits of pheromones. As many ants walk a given path many times, the residual pheromones densify and become even more attractive. Ants find the optimal path by following maximum pheromone deposition.

Gupta & Srivastava (2020) demonstrate that the model of the ant colony, thus of systems endowed with a medium of communication which acts by simple concentration in space and time, is more efficient for quick optimization than the bird-particle model, based solely on observing each other’s moves. From my point of view, i.e. from that of new technologies, those results reach deeper than it could seem at first sight. Financial capital is like a pheromone. One investor-ant drops some financial deeds at a project, and it can hopefully attract further deposits of capital etc. Still, ant colonies need to reach a critical size in order for that whole pheromone business to work. There needs to be a sufficient number of ants per unit of available space, in order to create those pheromonal paths. Below the critical size, no path becomes salient enough to create coordination, and ants starve to death for lack of efficient communication. Incidentally, the same is true for capital markets. Some 11 years ago, right after the global financial crisis, a fashion came to create small, relatively informal stock markets, called ‘alternative capital markets’. Some of them were created by the operators of big stock markets (e.g. the AIM market organized by the London Stock Exchange), some others were completely independent ventures. Now, a decade after that fashion exploded, the conclusion is similar to ant colonies: for lack of reaching a critical size, those alternative capital markets just don’t work as smoothly as the big ones.
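The pheromone intuition can be sketched as a deterministic, mean-field toy model; the path lengths, the deposit rule and the evaporation rate are my own assumptions, not the setup of Gupta & Srivastava:

```python
def run_colony(lengths, n_rounds, ants_per_round=100, evaporation=0.05):
    """Two paths to food. Each round, ants split across paths in proportion
    to pheromone; deposits are inversely proportional to path length, so
    the shorter path accumulates pheromone faster and attracts more ants."""
    pheromone = [1.0, 1.0]
    for _ in range(n_rounds):
        total = pheromone[0] + pheromone[1]
        for path in (0, 1):
            share = pheromone[path] / total              # ants follow denser trails
            deposit = ants_per_round * share / lengths[path]
            pheromone[path] = pheromone[path] * (1.0 - evaporation) + deposit
    return pheromone

# path 0 is three times shorter than path 1
pheromone = run_colony(lengths=[1.0, 3.0], n_rounds=50)
# the short path ends up with the denser pheromone trail
```

The positive feedback (denser trail, more ants, bigger deposits) is the same loop as capital attracting further capital, and it only ignites when ants_per_round is large enough relative to evaporation.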

All that science I have quoted makes my mind wander, and it starts walking down the path of the hilarious and absurd. I return, just for a moment, to another book: ‘1177 B.C.: The Year Civilization Collapsed. Revised and Updated’ by Eric H. Cline (Turning Points in Ancient History, Princeton University Press, 2021, ISBN 9780691208022). The book gives an in-depth account of the painful, catastrophic end of a whole civilisation, namely that of the Late Bronze Age, in the Mediterranean and the Levant. The interesting thing is that we know that whole network of empires – Egypt, Hittites, Mycenae, Ugarit and whatnot – collapsed at approximately the same moment, around 1200 – 1150 B.C., we know they collapsed violently, and yet we don’t know exactly how they collapsed.

Alternative history comes to my mind. I imagine the transition from the Bronze Age to the Iron Age similarly to what we do presently. The pharaoh-queen VanhderLeyenh comes up with the idea of iron. Well, she doesn’t; someone she pays does. The idea is so seductive that she comes up, by herself this time, with another one, namely a tax on bronze. ‘C’mon, Mr Brurumph, don’t tell me you can’t transition to iron within the next year. How many appliances in bronze do you have? Five? A shovel, two swords, and two knives. Yes, we checked. What about your rights? We are going through a deep technological change, Mr Brurumph, this is not a moment to talk about rights. Anyway, this is not even the new era yet, and there is no such thing as individual rights. So, Mr Brurumph, a one-year notice for passing from bronze to iron is more than enough. Later, you pay the bronze tax on each bronze appliance we find. Still, there is a workaround. If you officially identify as a non-Bronze person, and you put the corresponding sign over your door, you get a century-long prolongation on that tax’.

Mr Brurumph gets pissed off. Others do too. They feel lost in a hostile social environment. They start figuring s**t out, starting from the first principles of their logic. They become cellular automata. They focus on nailing down the next immediate move to make. Errors are costly. Swarm behaviour forms. Fights break out. Cities get destroyed. Not being liable to the tax on bronze becomes a thing. It gets support and gravitational attraction. It becomes tempting to join the wandering hordes of ‘Tax Free People’ who just don’t care and go. The whole idea of iron gets postponed by something like three centuries.

[1] Yu, J., Hagen-Zanker, A., Santitissadeekorn, N., & Hughes, S. (2021). Calibration of cellular automata urban growth models from urban genesis onwards-a novel application of Markov chain Monte Carlo approximate Bayesian computation. Computers, environment and urban systems, 90, 101689.

[2] Bandini, S., Mauri, G., & Serra, R. (2001). Cellular automata: From a theoretical parallel computational model to its application to complex systems. Parallel Computing, 27(5), 539-553.

[3] Yu, J., Hagen-Zanker, A., Santitissadeekorn, N., & Hughes, S. (2021). Calibration of cellular automata urban growth models from urban genesis onwards-a novel application of Markov chain Monte Carlo approximate Bayesian computation. Computers, environment and urban systems, 90, 101689.

[4] Jiang, W., Wang, F., Fang, L., Zheng, X., Qiao, X., Li, Z., & Meng, Q. (2021). Modelling of wildland-urban interface fire spread with the heterogeneous cellular automata model. Environmental Modelling & Software, 135, 104895.

[5] Cirillo, E. N., Nardi, F. R., & Spitoni, C. (2021). Phase transitions in random mixtures of elementary cellular automata. Physica A: Statistical Mechanics and its Applications, 573, 125942.

[6] Puzicha, A., & Buchholz, P. (2021). Decentralized model predictive control for autonomous robot swarms with restricted communication skills in unknown environments. Procedia Computer Science, 186, 555-562.

[7] Serov, V. A., Voronov, E. M., & Kozlov, D. A. (2021). A neuro-evolutionary synthesis of coordinated stable-effective compromises in hierarchical systems under conflict and uncertainty. Procedia Computer Science, 186, 257-268.

[8] Gupta, A., & Srivastava, S. (2020). Comparative analysis of ant colony and particle swarm optimization algorithms for distance optimization. Procedia Computer Science, 173, 245-253.

The respectable cellular automaton

I am trying to develop a junction between two strands of my research: the feasibility study for my ‘Projet Aqueduc’ on the one hand, and my more theoretical research on the phenomenon of collective intelligence on the other. The question: how to predict and anticipate the absorption of a new technology into a social structure? In more concrete terms, how can I forecast the absorption of ‘Projet Aqueduc’ into its socio-economic environment? To make my life harder – which is always interesting – I am going to try to build the model of that absorption on a theoretical basis which is relatively new to me, namely the theory of cellular automata. In terms of literature, for the moment, I refer to two articles spaced 20 years apart: Bandini, Mauri & Serra (2001[1]) and Yu et al. (2021[2]).

Why this precise theory? Why not, in fact? Seriously, the theory of cellular automata attempts to explain very complex phenomena – which arise in structures that look genuinely intelligent – out of very weak assumptions about the individual behaviour of simple entities inside those structures. Moreover, that theory is already well translated into terms of artificial intelligence, and it therefore marries well with my general purpose of developing a method of simulating socio-economic change with neural networks.

So there is a group of people who organize themselves, one way or another, around a new technology. The economic resources and the institutional structure of that group can vary: it can be an incorporated company, a public-private project, a non-governmental organization etc. No matter: it starts as a social microstructure. Note: a technology exists only when, and to the extent that, such a structure exists, or possibly a bigger and more complex one. A technology exists only when there are people who take care of it.

So there is that group organized around a nascent technology. Everything we know about economic history and the history of technology tells us that if the idea proves fertile, other more or less similar groups will form. I repeat: other groups. When the technology of electric cars had finally taken a good bite of the market, it did not lead to the monopolistic expansion of Tesla. On the contrary: other entities started building independently upon Tesla’s experience. Today, each of the big automotive manufacturers is having a more or less advanced adventure with electric cars, and there is a whole wave of startups created in the same niche. In fact, the technology of the electric vehicle has given a second youth to the model of the small automotive company, a thing which seemed to have been sent to the dustbin of history.

The absorption of a new technology can thus be represented as the proliferation of cells built around that technology. What’s the point, you may ask. Why invent yet another theoretical model of the development of new technologies? After all, there are quite a few such models already. The theoretical challenge consists in simulating technological change in a way that spots possible Black Swans. The difference between a plain black swan and a Black Swan written with capitals is that the latter refers to the book by Nassim Nicholas Taleb, ‘The Black Swan. The Impact of the Highly Improbable’, Penguin, 2010. Yes, I know, there is more to it than that. A capitalized Black Swan may as well be Tchaikovsky’s Black Swan, thus a woman (Odile) as attractive as she is dangerous through her ability to introduce chaos into a man’s life. I also know that if I arranged a conversation between Tchaikovsky and Carl Gustav Jung, the two gentlemen would probably agree that Odile alias the Black Swan symbolizes chaos, in opposition to the fragile order in Siegfried’s life, thus to Odette. Anyway, I am not doing ballet here. I am blogging. That implies a different outfit, as well as a different kind of flexibility. I am also older than Siegfried, by a generation or so.

All in all, my own Black Swan is the one borrowed from Nassim Nicholas Taleb, and it is thus a phenomenon which, whilst being out of the ordinary and surprising for the people concerned, is nevertheless functionally and logically derived from a sequence of past phenomena. A Black Swan forms around phenomena which, for some time, keep occurring at the extremities of the Gaussian curve, thus at the fringe of probability. Black Swans convey danger and new opportunities, in doses as varied as the Black Swans themselves. The practical interest of spotting the Black Swans which can emerge out of the present situation is thus that of preventing catastrophic risks on the one hand, and of capturing exceptional opportunities very early on the other hand.

Thus, without quite meaning to, I have just enriched the functional description of my method of simulating the collective intelligence of human societies with artificial neural networks. The method can serve to identify in advance possible developments of the Black Swan type: significant, subjectively unexpected, and yet functionally rooted in present reality.

So there is that new technology, and there are socio-economic cells which form around it. There are distinct species of cells, and each species corresponds to a different technology. Each cell can be represented as a cellular automaton A = (Zd, S, n, Sn+1 -> S), whose explanation starts with Zd, i.e. the d-dimensional space where the cells do what they have to do. The cellular automaton knows nothing about that space, just as a trout is not exactly brilliant at describing a river. A cellular automaton takes S different states, and those states are composed of one-step-at-a-time movements into n adjacent cellular locations. The automaton selects those S different states out of a larger catalogue Sn+1 of all the possible states, and the function Sn+1 -> S, alias the local rule of the automaton A, describes in general terms the quotient of that selection, thus the capacity of the cellular automaton to explore all the possibilities of moving its (cellular) ass just one notch from its current position.

Why distinguish those four structural variables in the cellular automaton? Why don’t we assume that the number of possible moves ‘n’ is a constant function of the dimensions offered by the space Zd? Why not assume that the actual number of states S equals the possible total Sn+1? Well, because the theory of cellular automata has the ambition of being useful for something, and it strives to simulate reality. So there is a new technology encapsulated in a social cell A. The social space around A is vast, yet there can be locked doors. Oligopolistic markets, quicker and more enterprising competitors, legal obstacles, and even purely social obstacles. If a company you invite to cooperate in your innovative project fears exposure to 10 000 angry tweets from people who don’t like your technology, that particular door is closed, however theoretically accessible the dimension it stands in.

If I am a perfectly ordinary cellular automaton, and I have the possibility to move into n social locations adjacent to the one I am in now, I start by choosing just one move and seeing what happens. When everything goes satisfactorily, I observe my new immediate environment – I observe the new ‘n’ visible from the cell I have just moved into – I make another move into a location selected from that new ‘n’, and so on. In an immediate environment ‘n’, I, the average cellular automaton, explore more than one possible location out of n only when I have just suffered a failure in the previously chosen location and have decided that the best strategy is to go back to square one whilst reconsidering the options.

The social cell built around a technology will thus carve its way across the social space Zd, trying to make successful moves, thus selecting one option out of the ‘n’ possible ones. Yes, failures happen, and so the social cell will sometimes experiment with k > 1 immediate moves. Nevertheless, the situation where k = n is the one where the people working on a new technology have tried, in vain, all the possible options but one, and throw themselves headlong into that last one, which turns out to be a success. Such situations happen, I know. I believe Canal+ was an adventure of that type in its early days. Still, when something works in the launching of a new technology, one just keeps rolling without looking over one’s shoulder.

The actual number S of states which a cellular automaton takes is thus largely subject to hysteresis. Each successful move is one less immediate environment to exploit, namely the one left behind. At the same time, it is a new challenge: making the next successful move at the first try, without lingering in alternative locations. The cellular automaton is thus a traveller more than an explorer. In short, the formulation A = (Zd, S, n, Sn+1 -> S) of a cellular automaton expresses opportunities and constraints at the same time.

My social cell built around ‘Projet Aqueduc’ coexists with social cells built around other technologies. Like any respectable cellular automaton, I look around and I see obvious moves in terms of investment. I can move my social cell in terms of accumulated capital as well as the physical scale of the installations. I suppose that the other social cells, centred on other technologies, will do the same: look for capital and for opportunities of physical growth. Excellent! So I can see two dimensions of Zd: the financial scale and the physical scale. I wonder how to move along them, and I discover other, more behavioural and cognitive dimensions: the expected internal return on investment (profit) as well as the external one (growth in the value of the company), the general growth of the market for investment capital etc.

Finding new dimensions is a piece of cake, by the way. Much easier than science-fiction movies make it look. It is enough to ask oneself what might be hampering our moves, to take a good look around, to have a few conversations, and voilà! I can discover new dimensions even without access to a high-energy interdimensional teleporter. I remember having watched on YouTube a series of videos whose creators claimed to know for sure that the Large Hadron Collider (yes, the one in Geneva) had opened a tunnel to hell. I pass over the simplest questions, of the type: ‘How do you know it is a tunnel, thus a tube with an entrance and an exit? How do you know it leads to hell? Has anyone gone to the other side and asked the locals where it is they live?’. The really stunning thing is that there are always people who firmly believe that you need hundreds of thousands of dollars of investment and years of scientific research to discover a path to hell. That path, each of us has it within arm’s reach. It is enough to stop discovering new dimensions in our existence.

Right, so I am a respectable cellular automaton developing ‘Projet Aqueduc’ out of a cell of enthusiasts, in the presence of other cellular automata. We, the cellular automata, move along two pretty clear dimensions of scale – accumulated capital and the physical size of the installations – and we know that moving along those dimensions requires effort in other, less obvious dimensions, which intertwine around the general interest in our idea on the part of extra-cellular people. Our Zd is, in fact, quite some Zd. Having two clearly visible dimensions and a debatable number of fuzzier ones means that the number ‘n’ of possible moves is just as debatable, and we avoid exploring all its nuances. We jump at the first possible location out of ‘n’, which takes us into another ‘n’, and then again and again.

When all the cellular automata display roughly coherent local rules Sn+1 -> S, it becomes possible to derive an instantaneous description Zd -> S, also known as a configuration of A, or its global state. The number of possible states which my ‘Projet Aqueduc’ can take, in a space filled with cellular automata, will depend on the number of possible states of the other cellular automata. Those instantaneous descriptions Zd -> S are, as the name indicates, instantaneous, thus temporary and local. They can change. In particular, the number S of possible states of my ‘Projet Aqueduc’ changes as a function of the immediate environment ‘n’ accessible from the current position t. A sequence of positions thus corresponds to a sequence of configurations ct = Zd -> S (t), and that sequence is designated as the behaviour of the cellular automaton A, or its evolution.
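The formal skeleton A = (Zd, S, n, Sn+1 -> S), together with the notion of evolution as a sequence of configurations, can be made concrete with the textbook toy case of a one-dimensional, binary automaton on a ring. This is a generic illustration of the formalism, not a model of ‘Projet Aqueduc’:

```python
def step(config, rule=110):
    """One synchronous update of a 1-D binary cellular automaton on a
    ring: the lattice plays the role of Zd (with d = 1), S = {0, 1},
    each cell reads itself plus its two adjacent cells, and the local
    rule maps each of the 2**3 neighbourhood states to the next state."""
    n = len(config)
    rule_table = [(rule >> i) & 1 for i in range(8)]
    return [
        rule_table[(config[(i - 1) % n] << 2) | (config[i] << 1) | config[(i + 1) % n]]
        for i in range(n)
    ]

# The automaton's 'behaviour', or 'evolution', is the sequence of
# configurations c_t obtained by iterating the local rule.
config = [0] * 20
config[10] = 1
for t in range(8):
    print(''.join('#' if c else '.' for c in config))
    config = step(config)
```

Even this minimal rule set produces the kind of emergent, hard-to-foresee patterns which make the theory attractive as a simulator of social change.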

[1] Bandini, S., Mauri, G., & Serra, R. (2001). Cellular automata: From a theoretical parallel computational model to its application to complex systems. Parallel Computing, 27(5), 539-553.

[2] Yu, J., Hagen-Zanker, A., Santitissadeekorn, N., & Hughes, S. (2021). Calibration of cellular automata urban growth models from urban genesis onwards-a novel application of Markov chain Monte Carlo approximate Bayesian computation. Computers, environment and urban systems, 90, 101689.

The red-neck-cellular automata

I continue revising my work on collective intelligence, and I am linking it to the theory of complex systems. I return to the excellent book ‘What Is a Complex System?’ by James Ladyman and Karoline Wiesner (Yale University Press, 2020, ISBN 978-0-300-25110-4, Kindle Edition). I take and quote their summary list of characteristics that complex systems display, on pages 22 – 23: “ […] which features are necessary and sufficient for which kinds of complexity and complex system. The features are as follows:

1. Numerosity: complex systems involve many interactions among many components.

2. Disorder and diversity: the interactions in a complex system are not coordinated or controlled centrally, and the components may differ.

3. Feedback: the interactions in complex systems are iterated so that there is feedback from previous interactions on a time scale relevant to the system’s emergent dynamics.

4. Non-equilibrium: complex systems are open to the environment and are often driven by something external.

5. Spontaneous order and self-organisation: complex systems exhibit structure and order that arises out of the interactions among their parts.

6. Nonlinearity: complex systems exhibit nonlinear dependence on parameters or external drivers.

7. Robustness: the structure and function of complex systems is stable under relevant perturbations.

8. Nested structure and modularity: there may be multiple scales of structure, clustering and specialisation of function in complex systems.

9. History and memory: complex systems often require a very long history to exist and often store information about history.

10. Adaptive behaviour: complex systems are often able to modify their behaviour depending on the state of the environment and the predictions they make about it”.

As I look at the list, my method of simulating collective intelligence is coherent therewith. Still, there is one point which I think I need to dig a bit more into: that whole thing with simple entities inside the complex system. In most of my simulations, I work on interactions between cognitive categories, i.e. between quantitative variables. Interaction between real social entities is most frequently implied rather than empirically nailed down. Still, there is one piece of research which sticks out a bit in that respect, and which I did last year. It is devoted to cities and their role in human civilisation. I wrote quite a few blog updates on the topic, and I have one unpublished paper written thereon, titled ‘The Puzzle of Urban Density And Energy Consumption’. In this case, I made simulations of collective intelligence with my method, thus I studied interactions between variables. Yet, in the phenomenological background of the complexity emerging in variables, real people interact in cities: there are real social entities interacting in correlation with the connections between variables. I think the collective intelligence of cities is the piece of research where I have the surest empirical footing, as compared to the others.

There is another thing which I almost inevitably think about. Given the depth and breadth of complexity theory, such as I am starting to discover it with and through that ‘What Is a Complex System?’ book by James Ladyman and Karoline Wiesner, I ask myself: what kind of bacon can I bring to that table? Why should anyone bother about my research? What theoretical value added can I supply? A good way of testing it is talking about real problems. I have just signalled my research on cities. The most general hypothesis I am exploring is that cities are factories of new social roles in the same way that the countryside is a factory of food. In the presence of demographic growth, we need more food, and we need new social roles for the new humans coming around. In the absence of such new social roles, those new humans feel alienated, they identify as revolutionaries fighting for the greater good, they identify the incumbent humans as an oppressive patriarchy, and the next thing you know, there is systemic, centralized, government-backed terror. Pardon my French, that is a system of social justice. I did my bit of social justice, in communist Poland.

Anyway, cities make new social roles by making humans interact much more abundantly than they usually do on a farm. More abundant interaction means more data to process for each human brain, more s**t to figure out, and the next thing you know, you become a craftsman, a businessperson, an artist, or an assassin. Once again, being an assassin in the countryside would not make much sense. Jumping from one roof to another looks dashing only in an urban environment. Just try it on a farm.

Now, an intellectual challenge. How can humans, who essentially don’t know what to do collectively, interact so as to create emergent complexity which, in hindsight, looks as if they had known what to do? An interesting approach, which hopefully allows using some kind of neural network, is the paradigm of the maze. Each individual human is so lost in social reality that the latter appears as a maze whose layout one does not know. Before I go further, there is one linguistic thing to nail down. I feel stupid using impersonal forms such as ‘one’, or ‘an individual’. I like more concreteness. I am going to start with George the Hero. George the Hero lives in a maze, and I stress it: he lives there. Social reality is like a maze to George, and, logically, George does not even want to get out of that maze, ‘cause that would mean being lonely, with no one around to gauge George’s heroism. George the Hero needs to stay in the maze.

The first thing which George the Hero needs to figure out is the dimensionality of the maze. How many axes can George move along in that social complexity? Good question. George needs to experiment in order to discover that. He makes moves in different social directions. He looks around what different kinds of education he can possibly get. He assesses his occupational options, mostly jobs and business ventures. He asks himself how he can structure his relations with family and friends. Is being an asshole compatible with fulfilling emotional bonds with people around?  

Wherever George the Hero currently is in the maze, there are n neighbouring and available cells around him. In each given place of the social maze, George the Hero has n possible ways to move further, into those n accessible cells in the immediate vicinity, and that is associated with k dimensions of movement. What is k, exactly? Here, I can refer to the theory of cellular automata, which attempts to simulate interactions between really simple, cell-like entities (Bandini, Mauri & Serra 2001[1]; Yu et al. 2021[2]). There is something called the ‘von Neumann neighbourhood’. It corresponds to the assumption that if George the Hero has n neighbouring social cells which he can move into, he moves like ‘left-right-forward-back’. That, in turn, spells k = n/2. If George can move into 4 neighbouring cells, he moves in a 2-dimensional space. Should he be able to move into 6 adjacent cells of the social maze, he has 3 dimensions to move along etc. Trouble starts when George sees an odd number of places to move to, like 5 or 7, on account of those giving half-dimensions, like 5/2 = 2.5, 7/2 = 3.5 etc. Half a dimension means, in practical terms, that George the Hero faces social constraints. There might be cells around, mind you, which technically are there, yet there are walls between George and them, and thus, for all practical purposes, the Hero can afford not to give a f**k.
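The n-to-k arithmetic above can be written down directly; a tiny sketch (the function name is mine, purely for illustration):

```python
def movement_dimensions(n_accessible):
    """Under the von Neumann assumption (one 'forward' and one 'back'
    cell per axis), n accessible cells spell k = n / 2 dimensions of
    movement. A fractional k signals a constrained half-dimension:
    an axis where one of the two directions is walled off."""
    k = n_accessible / 2
    free_axes = n_accessible // 2
    constrained = n_accessible % 2   # 1 if there is a half-dimension
    return k, free_axes, constrained

for n in (4, 6, 5, 7):
    k, free, walled = movement_dimensions(n)
    print(f"n={n}: k={k}, {free} free axes, {walled} constrained half-dimension(s)")
```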

George the Hero does not like to move back. Hardly anyone does. Thus, when George has successfully moved from cell A to cell B, he will probably not like going back to A just in order to explore another cell adjacent thereto. People behave heuristically. People build on their previous gains. Once George the Hero has moved from A to B, B becomes his A for the next move. He will choose one among the cells adjacent to B (now A), move there etc. George is a Hero, not a scientist, and therefore he carves a path through the social maze rather than discovering the maze as such. Each cell in the maze contains some rewards and some threats. George can get food, and it means getting into a dangerously complex relation with that sabre-tooth tiger. George can earn money, and it means giving up some of his personal freedom. George can bond with other people and find existential meaning, and it means giving up even more of what he provisionally perceives as his personal freedom.

The social maze is truly a maze because there are many Georges around. Interestingly, many Georges in English give one Georges in French, and I feel this is the point where I should drop the metaphor of George the Hero. I need to get more precise, and thus I go to a formal concept in the theory of cellular automata, namely that of a d-dimensional cellular automaton, which can be mathematically expressed as A = (Zd, S, N, Sn+1 -> S). In that automaton A, Zd stands for the architecture of the maze, thus a lattice of d-tuples of integer numbers. In plain human, Zd is given by the number of dimensions, possibly constrained, which a human can move along in the social space. Many people carve their paths across the social maze, no one likes going back, and thus the more people are around, and the better they can communicate their respective experiences, the more exhaustive knowledge we have of the surrounding Zd.

There is a finite set S of states in that social space Zd, and that finitude is connected to the formally defined neighbourhood of the automaton A, namely the N. Formally, N is a finite ordered subset of Zd, and, besides the ‘left-right-forward-back’ neighbourhood of von Neumann, there is a more complex one, namely the Moore neighbourhood. In the latter, we can also move diagonally between cells, like to the left and forward, to the right and forward etc. Keeping in mind that neighbourhood means, in practical terms, the number n of cells which we can move into from the social cell we are currently in, the cellular automaton can be rephrased as A = (Zd, S, n, Sn+1 -> S). The transition Sn+1 -> S, called the local rule of A, makes more sense now. With me being in a given cell of the social maze, and there being n available cells immediately adjacent to mine, that makes n + 1 cells where I can possibly be, and I can technically visit all those cells along a finite number Sn+1 of combinatorial paths. The transition Sn+1 -> S expresses the way in which I carve my finite set S of states out of the generally available Sn+1.
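The difference between the two neighbourhoods is easy to check by brute force, by enumerating the radius-1 offsets in a d-dimensional lattice (a generic sketch, nothing specific to the social maze):

```python
from itertools import product

def neighbours(d, kind="von_neumann"):
    """Enumerate the offsets of the radius-1 neighbourhood of a cell in
    a d-dimensional lattice: von Neumann keeps only single-axis steps
    (n = 2d), while Moore allows diagonal steps too (n = 3**d - 1)."""
    offsets = [o for o in product((-1, 0, 1), repeat=d) if any(o)]
    if kind == "von_neumann":
        offsets = [o for o in offsets if sum(map(abs, o)) == 1]
    return offsets

print(len(neighbours(2)))            # 4: left-right-forward-back
print(len(neighbours(2, "moore")))   # 8: diagonals included
```

In the social reading, switching from von Neumann to Moore means allowing compound moves: changing two dimensions of one’s social position in a single step.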

If I assume that cities are factories of new social roles, the cellular automaton of an urban homo sapiens should be more complex than the red-neck-cellular automaton of farm folk. It might mean a greater n, thus more cells available for moving from where I am now. It might also mean a more efficient Sn+1 -> S local rule, i.e. a better way to explore all the possible states I can achieve starting from where I am. There is a separate formal concept for that efficiency of the local rule, and it is called the configuration of the cellular automaton, AKA its instantaneous description, AKA its global state, and it refers to the map Zd -> S. Hence, the configuration of my cellular automaton is the way in which the overall social space Zd maps into the set S of states actually available to me.

Right, if I have my cellular automaton with a configuration map Zd -> S, it is sheer fairness that you have yours too, and your cousin Eleonore has another one for herself, as well. There are many of us in the social space Zd. We are many x’s in the Zd. Each x of us has their own configuration map Zd -> S. If we want to get along with each other, our individual cellular automatons need to be mutually coherent enough to have a common, global function of cellular automata, and we know there is such a global function when we can collectively produce a sequence of configurations.

According to my own definition, a social structure is a collectively intelligent structure to the extent that it can experiment with many alternative versions of itself and select the fittest one, whilst staying structurally coherent. Structural coherence, in turn, is the capacity to relax and tighten, in a sequence, the behavioural coupling inside the society, so as to allow the emergence and grounding of new behavioural patterns. The theory of cellular automata provides me with some insights in that respect. Collective intelligence means the capacity to experiment with ourselves, right? That means experimenting with our global function Zd -> S, i.e. with the capacity to translate the technically available social space Zd into a catalogue S of possible states. If we take a random sample of individuals in a society and study their cellular automatons A, they will display local rules Sn+1 -> S, and those can be expressed as coefficients (S / Sn+1), 0 ≤ (S / Sn+1) ≤ 1. The latter express the capacity of individual cellular automatons to generate actual states S of being out of the generally available menu of Sn+1.

In a large population, we can observe the statistical distribution of individual (S / Sn+1) coefficients of freedom in making one’s cellular state. The properties of that statistical distribution, e.g. the average (S / Sn+1) across the board, are informative about how collectively intelligent the given society is. The greater the average (S / Sn+1), the more possible states the given society can generate within the incumbent social structure, and the more it can know about the fittest state possible. That looks like a cellular definition of functional freedom.
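As a purely hypothetical illustration (invented numbers, not empirical data), the statistical distribution of individual (S / Sn+1) coefficients could be sketched like this, reading Sn+1 loosely as the count of states reachable from the current position:

```python
import random
import statistics

def freedom_coefficients(population, seed=42):
    """Hypothetical illustration (invented numbers): for each individual,
    draw the number n of adjacent social cells (constraints differ
    across people), and the number S of states actually realized out of
    the n + 1 reachable ones; the coefficient S / (n + 1) lies in (0, 1]."""
    random.seed(seed)
    coeffs = []
    for _ in range(population):
        n = random.randint(2, 8)        # accessible adjacent cells
        s = random.randint(1, n + 1)    # states actually generated
        coeffs.append(s / (n + 1))
    return coeffs

coeffs = freedom_coefficients(10_000)
# The average coefficient is the aggregate measure of functional freedom.
print(statistics.mean(coeffs), statistics.stdev(coeffs))
```

In real research, the draws would be replaced by observed behavioural data, and it is the shape of the whole distribution, not only its mean, which would be informative.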

[1] Bandini, S., Mauri, G., & Serra, R. (2001). Cellular automata: From a theoretical parallel computational model to its application to complex systems. Parallel Computing, 27(5), 539-553.

[2] Yu, J., Hagen-Zanker, A., Santitissadeekorn, N., & Hughes, S. (2021). Calibration of cellular automata urban growth models from urban genesis onwards-a novel application of Markov chain Monte Carlo approximate Bayesian computation. Computers, environment and urban systems, 90, 101689.

The impression of breathing

I am moving forward with the revision of my research on the phenomenon of collective intelligence, which I have just documented in ‘The collective of individual humans being any good at being smart’. I am trying to build a junction between my own ideas, on the one hand, and two other strands of research: the theory of complex systems, and the psychological approach to collective intelligence. The former I am working through on the basis of the book ‘What Is a Complex System?’ written by James Ladyman and Karoline Wiesner, published in 2020 by Yale University Press (ISBN 978-0-300-25110-4, Kindle Edition). As for the psychological approach, my reference reading is, for the moment, the book ‘The Knowledge Illusion. Why We Never Think Alone’ written by Steven Sloman and Philip Fernbach, published in 2017 by RIVERHEAD BOOKS (originally by Penguin Random House LLC, Ebook ISBN: 9780399184345, Kindle Edition).

I have just pinned down the central idea of my approach to the phenomenon of collective intelligence, and it is the use of artificial neural networks – thus of Artificial Intelligence – as simulators of complex social phenomena. The original touch of my own which I want to add to this otherwise vast subject is the way of using very simple neural networks, possible to program in an Excel spreadsheet. My method thus goes somewhat against the stereotype of digital super-clouds carried by super-computers, themselves joined in a network, all of that in order to predict the next fashion trend or the next big deal on the stock exchange.

When I think about the structure of a book I could write on this topic, the conceptual skeleton which comes to my mind is the classical scientific one. It starts with a general, informal ‘Introduction’, the kind that shows why one should make all this fuss about the idea in question. A section on ‘Empirical material and method’ follows, where I discuss the type of empirical data to work with, as well as the method of processing it. The next step is to present ‘Theory and literature review’ in a separate chapter, and finally ‘Examples of application’, i.e. computations made on real data with the method in question.

The formal conceptual core of my approach is – for the moment – the function of adaptation. When I have a set of quantitative socio-economic variables, I can make more or less strong assumptions about their meaning and empirical relevance, yet I can quite solidly assume that each of these variables can represent an important functional outcome, the achievement of which we pursue as a society. In the presence of n variables, I can pose n hypotheses of the type: these people pursue the optimization of variable xi as their collective orientation. Such a hypothesis means that all the variables in the set X = (x1, x2, …, xn), observed in a sequence of m local occurrences (t1, t2, …, tm), form a chain of local functional states f{x1(t), x2(t), …, xn(t)}. The society studied compares each local functional state to an expected value of the outcome xi(t), and the function of adaptation produces the local error of adaptation e(t) = xi(t) – f{x1(t), x2(t), …, xn(t)}. The variable xi belongs to the set X = (x1, x2, …, xn). The chain of functional states f{x1(t), x2(t), …, xn(t)} is thus produced with the optimized variable xi itself as well as with the other variables. The logic of this is simple: most social phenomena we describe with quantitative variables, such as the Gross Domestic Product (my favourite example), display significant hysteresis. Today’s GDP serves to produce the GDP of the day after tomorrow, just as today’s number of patent applications contributes to creating that same future GDP.
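The adaptation function just described can be sketched numerically. This is a minimal illustration only: the dataset is synthetic, and the functional state f{x1(t), …, xn(t)} is modelled, by assumption, as a single sigmoid neuron, since the text leaves the exact form of f open.

```python
import numpy as np

# Toy dataset: m = 5 local occurrences of n = 3 normalized variables.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(5, 3))
weights = rng.uniform(-1.0, 1.0, size=3)

def f(row, w):
    """Local functional state: sigmoid of the weighted sum of all variables."""
    return 1.0 / (1.0 + np.exp(-np.dot(row, w)))

# Hypothesis: the society pursues the optimization of variable x1 (index 0).
i = 0
# Local errors of adaptation e(t) = xi(t) - f{x1(t), ..., xn(t)}.
errors = [X[t, i] - f(X[t], weights) for t in range(X.shape[0])]
print(errors)
```

Posing each of the n variables, in turn, as the hypothesized outcome xi amounts to re-running this computation with a different index i.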

I am trying to bridge the theory of complex systems with my own method. I refer in particular to ‘What Is a Complex System?’ (Ladyman & Wiesner 2020). The passage I find particularly interesting in view of my own method is the one on page 16, which I take the liberty of translating on the fly: ‘Coordinated behaviour does not require an overall controller […] It is surprising that the collective motion of a flock of birds, a school of fish or a swarm of insects can be reproduced by a set of robots programmed to obey just a few simple rules. Each individual must stay close to a handful of neighbours and must not collide with other individuals. As the individual moves, it regularly checks its distance from the others and adjusts it accordingly. As a result, a group movement forms spontaneously. The adaptive behaviour of the collective arises from repeated interactions, each of them relatively simple in itself […]’.

The interesting thing here is that I perform exactly the same logical operation in the neural networks I build and use in my research on collective intelligence. Within each empirical occurrence in my dataset (so, practically, within each row of my database), I compute and then propagate a meta-parameter of Euclidean distance between each variable and all the others. The Gross Domestic Product of Sweden in 2007 thus checks its Euclidean distance to inflation, to the employment rate etc., all that in Sweden in 2007. In my neural network, GDP behaves like a bird: it flies so as to control its distance from the other social phenomena.

Each row of the database is thus accompanied by a phantom vector of Euclidean distances, which the network then uses as information relevant to the attempt at adaptation in the next empirical occurrence, i.e. in the next row of the database. Initially, when I programmed that thing, I did not know what it would yield. I knew almost nothing about this particular aspect of complexity theory. I had just read a few articles on swarm theory in the programming of robots, and I wanted to see how it works in my own backyard (Wood & Thompson 2021[1]; Li et al. 2021[2]). I was just adapting, (probably) intelligently, to the flow of my own thoughts. It turns out that the propagation of those local Euclidean distances between variables impacts the network and its learning profoundly.
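The phantom vector can be sketched as follows. This is a hedged illustration on synthetic data: for each row, each variable gets the Euclidean distance between its own value and the values of the other variables in that same row, which is one plausible reading of the per-row distance meta-parameter described above.

```python
import numpy as np

# Synthetic dataset: 4 empirical occurrences of 3 normalized variables.
rng = np.random.default_rng(7)
X = rng.uniform(0.0, 1.0, size=(4, 3))

def phantom_distances(row):
    """For each variable j, the Euclidean distance to the other variables in the row."""
    return np.array([
        np.sqrt(np.sum((row[j] - np.delete(row, j)) ** 2))
        for j in range(row.size)
    ])

# One phantom vector per row; same shape as X, and it can be fed forward,
# together with X, as extra input to the next row's attempt at adaptation.
phantom = np.array([phantom_distances(row) for row in X])
print(phantom.shape)  # → (4, 3)
```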

Here, then, is one certain point of convergence between my method of using artificial neural networks to simulate collective intelligence and the theory of complex systems. When, for a set of quantitative socio-economic variables, I create an accompanying phantom set of local mathematical distances between those variables, and I propagate those distances across the network, the numbers learn at an accelerated pace.

A small explanation is in order regarding the notion of mathematical distance. I use the Euclidean distance between simple numbers. In Data Science, this is the equivalent of a stone-tipped tool. There are much more sophisticated measures, where a Euclidean distance is computed between entire matrices of numbers. I just like using the kind of artificial intelligence I understand.

I can thus summarize an important point of my method, while rooting it in the theory of complex systems. We can imagine human societies as swarms of phenomena which we observe imperfectly through quantitative variables. The swarm of phenomena self-organizes through the actions of human beings who control, imperfectly and yet coherently, the distance (mutual coherence) between distinct phenomena. The fact that every human culture strives to create and maintain internal coherence is thus the control mechanism which facilitates the emergence of complex systems.

My own intuition, when I was introducing those phantom measures of Euclidean distance between variables, was somewhat contrary, in fact. My thing, ever since my doctoral dissertation, has been innovation and technological change. After reading those articles on swarm theory, I told myself that innovation occurs when a society (collectively) says: ‘Damn! Enough of the monotony! Let’s shake things up a bit! Hey, you guys! Yes, you! We mean: yes, us! We are loosening the internal coherence! Yes, just for a few years, no worries! Yes, damn it, we promise you (ourselves) not to invent Facebook, well, we hope…’.

The society I represent with a neural network is thus capable of innovation because it can loosen its internal cultural coherence just enough to let new phenomena in. This is what I observe mathematically in my simulations with real socio-economic data: when I propagate the Euclidean distance between variables across the network, the network gives the impression of breathing. It inflates and deflates, in a rhythmic cadence.

[1] Wood, M. A., & Thompson, C. (2021). Crime prevention, swarm intelligence and stigmergy: Understanding the mechanisms of social media-facilitated community crime prevention. The British Journal of Criminology, 61(2), 414-433.

[2] Li, M., Porter, A. L., Suominen, A., Burmaoglu, S., & Carley, S. (2021). An exploratory perspective to measure the emergence degree for a specific technology based on the philosophy of swarm intelligence. Technological Forecasting and Social Change, 166, 120621.

The collective of individual humans being any good at being smart

I am working on two topics in parallel, which is sort of normal in my case. As I know myself, instead of asking “Isn’t two too much?”, I should rather say “Just two? Run out of ideas, obviously”. I keep working on a proof-of-concept article for the idea which I provisionally labelled “Energy Ponds” AKA “Project Aqueduct”, on the one hand. See my two latest updates, namely ‘I have proven myself wrong’ and ‘Plusieurs bouquins à la fois, comme d’habitude’, as regards the summary of what I have found out and written down so far. As in most research which I do, I have come to the conclusion that however wonderful the concept appears, the most important thing in my work is the method of checking the feasibility of that concept. I guess I should develop on the method more specifically.

On the other hand, I am returning to my research on collective intelligence. I have just been approached by a publisher, with a kind invitation to submit the proposal for a book on that topic. I am reviewing my research and the available literature. I am wondering what kind of central thread I should structure the entire book around. Two threads turn up in my mind, as a matter of fact. The first one is the assumption that whatever kind of story I am telling, I am actually telling the story of my own existence. I feel I need to go back to the roots of my interest in the phenomenon of collective intelligence, and those roots are in my meddling with artificial neural networks. At some point, I came to the conclusion that artificial neural networks can be good simulators of the way that human societies figure s**t out. I need to dig again into that idea.

My second thread is the theory of complex systems AKA the theory of complexity. The thing seems to be macheting its way through the jungle of social sciences, those last years, and it looks interestingly similar to what I labelled as collective intelligence. I came by the theory of complexity in three books which I am reading now (just three?). The first one is a history book: ‘1177 B.C. The Year Civilisation Collapsed. Revised and Updated’, published by Eric H. Cline with Princeton University Press in 2021[1]. The second book is just a few light years away from the first one. It regards mindfulness. It is ‘Aware. The Science and Practice of Presence. The Groundbreaking Meditation Practice’, published by Daniel J. Siegel with TarcherPerigee in 2018[2]. The third book is already some sort of a classic; it is ‘The Black Swan. The impact of the highly improbable’ by Nassim Nicholas Taleb with Penguin, in 2010.

I think it is Daniel J. Siegel who gives the best general take on the theory of complexity, and I allow myself to quote: ‘One of the fundamental emergent properties of complex systems in this reality of ours is called self-organization. That’s a term you might think someone in psychology or even business might have created—but it is a mathematical term. The form or shape of the unfolding of a complex system is determined by this emergent property of self-organization. This unfolding can be optimized, or it can be constrained. When it’s not optimizing, it moves toward chaos or toward rigidity. When it is optimizing, it moves toward harmony and is flexible, adaptive, coherent, energized, and stable’. (Siegel, Daniel J.. Aware (p. 9). Penguin Publishing Group. Kindle Edition).  

I am combining my scientific experience of using AI as a social simulator with the theory of complex systems. It means I need to UNDERSTAND, like really. I need to understand my own thinking, in the first place, and then I need to combine it with whatever I can understand from other people’s thinking. It started with a simple artificial neural network, which I used to write my article ‘Energy efficiency as manifestation of collective intelligence in human societies’ (Energy, 191, 116500). I had a collection of quantitative variables, which I had previously meddled with using classical regression. As regression did not really bring conclusive results, I had the idea of using an artificial neural network. Of course, today, neural networks are a whole technology and science. The one I used is the equivalent of a spear with a stone tip as compared to a battle drone. Therefore, the really important thing is the fundamental logic of neural networking as compared to regression, in analyzing quantitative data.

When I do regression, I come up with a function, like y = a1*x1 + a2*x2 + …+ b, I trace that function across the cloud of empirical data points I am working with, and I measure the average distance from those points to the line of my function. That average distance is the average (standard) error of estimation with that given function. I repeat the process as many times as necessary to find a function which both makes sense logically and yields the lowest standard error of estimation. The central thing is that I observe all my data at once, as if it was all happening at the same time and as if I was observing it from outside. Here is the thing: I observe it from outside, but when that empirical data was happening, i.e. when the social phenomena expressed in my quantitative variables were taking place, everybody (me included) was inside, not outside.
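The ‘view from outside’ can be sketched as a quick numerical illustration: a linear function y = a1*x1 + a2*x2 + b fitted on the whole cloud of points at once, with the standard error of estimation measured as the typical distance from the points to the fitted function. The data, coefficients and noise level here are all assumed for the sake of the example.

```python
import numpy as np

# Synthetic stand-in for a cloud of empirical data points.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.5 * X[:, 0] - 0.7 * X[:, 1] + 0.3 + rng.normal(scale=0.1, size=50)

# Fit y = a1*x1 + a2*x2 + b by least squares, observing all data at once.
A = np.column_stack([X, np.ones(50)])           # design matrix with intercept
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)  # a1, a2, b

# Standard error of estimation: average distance from points to the function.
residuals = y - A @ coeffs
std_error = np.sqrt(np.mean(residuals ** 2))
print(round(std_error, 3))
```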

How to express mathematically the fact of being inside the facts measured? One way is to take those empirical occurrences one by one, sort of Denmark in 2005, and then Denmark in 2006, and then Germany in 2005 etc. Being inside the events changes my perspective on what is the error of estimation, as compared to being outside. When I am outside, error means departure from the divine plan, i.e. from the regression function. When I am inside things that are happening, error happens as discrepancy between what I want and expect, on the one hand, and what I actually get, on the other hand. These are two different errors of estimation, measured as departures from two different functions. The regression function is the most accurate (or as accurate as you can get) mathematical explanation of the empirical data points. The function which we use when simulating the state of being inside the events is different: it is a function of adaptation.      

Intelligent adaptation means that we are after something: food, sex, power, a new Ferrari, social justice, 1000 000 followers on Instagram…whatever. There is something we are after, some kind of outcome we try to optimize. When I have a collection of quantitative variables which describe a society, such as energy efficiency, headcount of population, inflation rates, incidence of Ferraris per 1 million people etc., I can make a weak assumption that any of these can express a desired outcome. Here, a digression is due. In science and philosophy, weak assumptions are assumptions which assume very little, and therefore they are bloody hard to discard. On the other hand, strong assumptions assume a lot, and that makes them pretty good targets for criticism aimed at discarding them. In other words, in science and philosophy, weak assumptions are strong and strong assumptions are weak. Obvious, isn’t it? Anyway, I make that weak assumption that any phenomenon we observe and measure with a numerical scale can be a collectively desired outcome we pursue.

Another assumption I make, a weak one as well, is sort of hidden in the word ‘expresses’. Here, I relate to a whole line of philosophical and scientific heritage, going back to people like Plato, Kant, William James, Maurice Merleau-Ponty, or, quite recently, Michael Keane (1972[3]), as well as Berghout & Verbitskiy (2021[4]). Very nearly everyone who seriously thought (or keeps thinking, on the account of being still alive) about human cognition of reality agrees that we essentially don’t know s**t. We make cognitive constructs in our minds, so as to make at least a little bit of sense of the essentially chaotic reality outside our skin, and we call it empirical observation. Mind you, stuff inside our skin is not much less chaotic, but this is outside the scope of social sciences. As we focus on quantitative variables commonly used in social sciences, the notion of facts becomes really blurred. Have you ever shaken hands with energy efficiency, with Gross Domestic Product or with the mortality rate? Have you touched it? No? Neither have I. These are highly distilled cognitive structures which we use to denote something about the state of society.

Therefore, I assume that quantitative, socio-economic variables express something about the societies observed, and that something is probably important if we collectively keep record of it. If I have n empirical variables, each of them possibly represents collectively important outcomes. As these are distinct variables, I assume that, with all the imperfections and simplification of the corresponding phenomenology, each distinct variable possibly represents a distinct type of collectively important outcome. When I study a human society through the lens of many quantitative variables, I assume they are informative about a set of collectively important social outcomes in that society.

Whilst a regression function explains how many variables are connected when observed ex post and from outside, an adaptation function explains and expresses the way that a society addresses important collective outcomes in a series of trials and errors. Here come two fundamental differences between studying a society with a regression function, as opposed to using an adaptation function. Firstly, for any collection of variables, there is essentially one regression function of the type: y = a1*x1 + a2*x2 + …+ an*xn + b. On the other hand, with a collection of n quantitative variables at hand, there are at least as many functions of adaptation as there are variables. We can hypothesize that each individual variable x is the collective outcome to pursue and optimize, whilst the remaining n – 1 variables are instrumental to that purpose. One remark is important to make now: the variable informative about collective outcomes pursued, that specific x, can be and usually is instrumental to itself. We can make a desired Gross Domestic Product based on the Gross Domestic Product we have now. The same applies to inflation, energy efficiency, share of electric cars in the overall transportation system etc. Therefore, the entire set of n variables can be assumed instrumental to the optimization of one variable x from among them.

Mathematically, it starts with assuming a functional input f(x1, x2, …, xn) which gets pitched against one specific outcome xi. Subtraction comes as the most logical representation of that pitching, and thus we have the mathematical expression ‘xi – f(x1, x2, …, xn)’, which informs about how close the society observed has come to the desired outcome xi. It is technically possible that people just nail it, and xi = f(x1, x2, …, xn), whence xi – f(x1, x2, …, xn) = 0. This is a perfect world, which, however, can be dangerously perfect. We know those societies of apparently perfectly happy people, who live in harmony with nature, even if that harmony means hosting most intestinal parasites of the local ecosystem. One day other people come, with big excavators, monetary systems, structured legal norms, and the bubble bursts, and it hurts.

Thus, on the whole, it might be better to hit xi ≠ f(x1, x2, …, xn), whence xi – f(x1, x2, …, xn) ≠ 0. It helps learning new stuff. The ‘≠ 0’ part means there is an error in adaptation. The functional input f(x1, x2, …, xn) hits above or below the desired xi. As we want to learn, that error in adaptation AKA e = xi – f(x1, x2, …, xn) ≠ 0 makes any practical sense when we utilize it in subsequent rounds of collective trial and error. Sequence means order, and a timeline. We have a sequence {t0, t1, t2, …, tm} of m moments in time. Local adaptation turns into ‘xi(t) – ft(x1, x2, …, xn)’, and error of adaptation becomes the time-specific et = xi(t) – ft(x1, x2, …, xn) ≠ 0. The clever trick consists in taking e(t0) = xi(t0) – ft0(x1, x2, …, xn) ≠ 0 and combining it somehow with the next functional input ft1(x1, x2, …, xn). Mathematically, if we want to combine two values, we can add them up or multiply them. We keep in mind that division is a special case of multiplication, namely x * (1/z). When I add up two values, I assume they are essentially of the same kind and sort of independent from each other. When, on the other hand, I multiply them, they become entwined so that each of them reproduces the other one. Multiplication ‘x * z’ means that x gets reproduced z times and vice versa. When I have the error of adaptation et0 from the last experimental round and I want to combine it with the functional input of adaptation ft1(x1, x2, …, xn) in the next experimental round, that whole reproduction business looks like a strong assumption, with a lot of weak spots on it. I settle for the weak assumption then, and I assume that ft1(x1, x2, …, xn) becomes ft0(x1, x2, …, xn) + e(t0).

The expression ft0(x1, x2, …, xn) + e(t0) makes any functional sense only when and after we have e(t0) = xi(t0) – ft0(x1, x2, …, xn) ≠ 0. Consequently, the next error of adaptation, namely e(t1) = xi(t1) – ft1(x1, x2, …, xn) ≠ 0 can come into being only after its predecessor et0 has occurred. We have a chain of m states in the functional input of the society, i.e. {ft0(x1, x2, …, xn) => ft1(x1, x2, …, xn) => … => ftm(x1, x2, …, xn)}, associated with a chain of m desired outcomes {xi(t0) => xi(t1) => … => xi(tm)}, and with a chain of errors in adaptation {e(t0) => e(t1) => …=> e(tm)}. That triad – chain of functional inputs, chain of desired outcomes, and the chain of errors in adaptation – makes for me the closest I can get now to the mathematical expression of the adaptation function. As errors get fed along the chain of states (as I see it, they are being fed forward, but in the algorithmic version, you can backpropagate them), those errors are some sort of dynamic memory in that society, the memory from learning to adapt.
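The triad of chains can be sketched in a few lines of code. This is a deliberately literal reading of the weak assumption ft1 = ft0 + e(t0), on synthetic data; the starting functional state is arbitrary, and a real simulation would let the whole vector of n variables shape each functional input.

```python
import numpy as np

# Synthetic dataset: m = 6 occurrences of n = 4 normalized variables.
rng = np.random.default_rng(3)
m, n = 6, 4
X = rng.uniform(0.0, 1.0, size=(m, n))
i = 2  # hypothesis: the society optimizes variable x3 (index 2)

functional_states = []
errors = []
f_t = float(np.mean(X[0]))   # arbitrary starting functional state f_t0
for t in range(m):
    e_t = X[t, i] - f_t      # e(t) = xi(t) - ft(x1, ..., xn)
    functional_states.append(f_t)
    errors.append(e_t)
    f_t = f_t + e_t          # feed the error forward: f_t+1 = f_t + e(t)

# Two of the three concurrent chains; the third, the desired outcomes,
# is simply the column X[:, i].
print(len(functional_states), len(errors))
```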

Here we can see the epistemological difference between studying a society from outside, and explaining its workings with a regression function, on the one hand, and studying those mechanisms from inside, by simulation with an adaptation function, on the other hand. Adaptation function is the closest I can get, in mathematical form, to what I understand by collective intelligence. As I have been working with that general construct, I progressively zoomed in on another concept, namely that of intelligent structure, which I define as a structure which learns by experimenting with many alternative versions of itself whilst staying structurally coherent, i.e. by maintaining basic coupling between particular components.

I feel like comparing my approach to intelligent structures and their collective intelligence with the concept of complex systems, as discussed in the literature I have just referred to. I returned, therefore, to the book entitled ‘1177 B.C. The Year Civilisation Collapsed. Revised and Updated’, by Eric H. Cline, Princeton University Press, 2021. The theory of complex systems is brought forth in that otherwise very interesting piece in order to help formulating an answer to the following question: “Why did the great empires of the Late Bronze Age, such as Egypt, the Hittites, or the Myceneans, collapse all at approximately the same time, around 1200 – 1150 B.C.?”. The basic assertion which Eric Cline develops on and questions is that the entire patchwork of those empires in the Mediterranean, the Levant and the Middle East was one big complex system, which collapsed on the account of having overkilled it slightly in the complexity department.

I am trying to reconstruct the definition of systemic complexity such as Eric Cline uses it in his flow of logic. I start with the following quote: ‘Complexity science or theory is the study of a complex system or systems, with the goal of explaining the phenomena which emerge from a collection of interacting objects’. If we study a society as a complex system, we need to assume two things. There are many interacting objects in it, for one, and their mutual interaction leads to the emergence of some specific phenomena. Sounds cool. I move on, and a few pages later I find the following statement: ‘In one aspect of complexity theory, behavior of those objects is affected by their memories and “feedback” from what has happened in the past. They are able to adapt their strategies, partly on the basis of their knowledge of previous history’. Nice. We are getting closer. Entities inside a complex system accumulate memory, and they learn on that basis. This is sort of next door to the three sequential chains: states, desired outcomes, and errors in adaptation, which I coined.

Further, I find an assertion that a complex social system is typically “alive”, which means that it evolves in a complicated, nontrivial way, whilst being open to influences from the environment. All that leads the complex system to generate phenomena which can be considered as surprising and extreme. Good. This is the moment to move to the next book: ‘The Black Swan. The impact of the highly improbable’ by Nassim Nicholas Taleb, Penguin, 2010. Here comes a lengthy quote, which I bring here for the sheer pleasure of savouring one more time Nassim Taleb’s delicious style: “[…] say you attribute the success of the nineteenth-century novelist Honoré de Balzac to his superior “realism,” “insights,” “sensitivity,” “treatment of characters,” “ability to keep the reader riveted,” and so on. These may be deemed “superior” qualities that lead to superior performance if, and only if, those who lack what we call talent also lack these qualities. But what if there are dozens of comparable literary masterpieces that happened to perish? And, following my logic, if there are indeed many perished manuscripts with similar attributes, then, I regret to say, your idol Balzac was just the beneficiary of disproportionate luck compared to his peers. Furthermore, you may be committing an injustice to others by favouring him. My point, I will repeat, is not that Balzac is untalented, but that he is less uniquely talented than we think. Just consider the thousands of writers now completely vanished from consciousness: their record does not enter into analyses. We do not see the tons of rejected manuscripts because these writers have never been published. The New Yorker alone rejects close to a hundred manuscripts a day, so imagine the number of geniuses that we will never hear about. In a country like France, where more people write books while, sadly, fewer people read them, respectable literary publishers accept one in ten thousand manuscripts they receive from first-time authors”.

Many people write books, few people read them, and that creates something like a flow of highly risky experiments. That coincides with something like a bottleneck of success, with possibly great positive outcomes (fame, money, posthumous fame, posthumous money for other people etc.), and a low probability of occurrence. A few salient phenomena are produced – the Balzacs – whilst the whole build-up of other writing efforts, by less successful novelists, remains in the backstage of history. That, in turn, somehow rhymes with my intuition that intelligent structures need to produce big outliers, at least from time to time. On the one hand, those outliers can be viewed as big departures from the currently expected outcomes. They are big local errors. Big errors mean a lot of information to learn from. There is an even further-going, conceptual coincidence with the theory and practice of artificial neural networks. A network can be prone to overfitting, which means that it learns too fast, sort of by jumping prematurely to conclusions, before and without having worked through the required volume of local errors in adaptation.

Seen from that angle, the function of adaptation I have come up with has a new shade. The sequential chain of errors appears as necessary for the intelligent structure to be any good. Good. Let’s jump to the third book I quoted with respect to the theory of complex systems: ‘Aware. The Science and Practice of Presence. The Groundbreaking Meditation Practice’, by Daniel J. Siegel, TarcherPerigee, 2018. I return to the idea of self-organisation in complex systems, and the choice between three different states: a) the optimal state of flexibility, adaptability, coherence, energy and stability b) non-optimal rigidity and c) non-optimal chaos.

That conceptual thread concurs interestingly with my draft paper: ‘Behavioral absorption of Black Swans: simulation with an artificial neural network’. I found out that with the chain of functional input states {ft0(x1, x2, …, xn) => ft1(x1, x2, …, xn) => … => ftm(x1, x2, …, xn)} being organized in rigorously the same way, different types of desired outcomes lead to different patterns of learning, very similar to the triad which Daniel Siegel refers to. When my neural network does its best to optimize outcomes such as Gross Domestic Product, it quickly comes to rigidity. It makes some errors in the beginning of the learning process, but then it quickly drives the local error asymptotically to zero and is like ‘We nailed it. There is no need to experiment further’. There are other outcomes, such as the terms of trade (the residual fork between the average price of exports and that of imports), or the average number of hours worked per person per year, which yield a curve of local error in the form of a graceful sinusoid, cyclically oscillating between different magnitudes of error. This is the energetic, dynamic balance. Finally, some macroeconomic outcomes, such as the index of consumer prices, can make the same neural network go nuts, and generate an ever-growing curve of local error, as if the poor thing couldn’t learn anything sensible from looking at the prices of apparel and refrigerators. The (most) puzzling thing in all that is that differences in pursued outcomes are the source of discrepancy in the patterns of learning, not the way of learning as such. Some outcomes, when pursued, keep the neural network I made in a state of healthy adaptability, whilst other outcomes make it overfit or go haywire.
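The skeleton of that experiment can be sketched as follows: the same learning procedure is repeated with each variable, in turn, cast as the desired outcome, and the curve of local error is recorded for each hypothesis. The data here is synthetic and the learner is a naive delta-rule perceptron, both by assumption; the rigidity, oscillation and divergence patterns described above come from real socio-economic datasets, not from this toy.

```python
import numpy as np

# Synthetic dataset: m = 30 occurrences of n = 4 normalized variables.
rng = np.random.default_rng(11)
m, n = 30, 4
X = rng.uniform(0.0, 1.0, size=(m, n))

def error_curve(X, i):
    """Chain of local errors when variable i is the hypothesized outcome."""
    w = np.full(X.shape[1], 0.5)                  # naive starting weights
    errors = []
    for t in range(X.shape[0]):
        f_t = 1.0 / (1.0 + np.exp(-X[t] @ w))     # sigmoid functional state
        e_t = X[t, i] - f_t                       # local error of adaptation
        w = w + e_t * X[t]                        # simple delta-rule update
        errors.append(e_t)
    return errors

# One error curve per hypothesized outcome; their shapes are what gets
# compared across hypotheses.
curves = {i: error_curve(X, i) for i in range(n)}
print({i: round(curves[i][-1], 3) for i in curves})
```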

When I write about collective intelligence and complex system, it can come as a sensible idea to read (and quote) books which have those concepts explicitly named. Here comes ‘The Knowledge Illusion. Why we never think alone’ by Steven Sloman and Philip Fernbach, RIVERHEAD BOOKS (An imprint of Penguin Random House LLC, Ebook ISBN: 9780399184345, Kindle Edition). In the introduction, titled ‘Ignorance and the Community of Knowledge’, Sloman and Fernbach write: “The human mind is not like a desktop computer, designed to hold reams of information. The mind is a flexible problem solver that evolved to extract only the most useful information to guide decisions in new situations. As a consequence, individuals store very little detailed information about the world in their heads. In that sense, people are like bees and society a beehive: Our intelligence resides not in individual brains but in the collective mind. To function, individuals rely not only on knowledge stored within our skulls but also on knowledge stored elsewhere: in our bodies, in the environment, and especially in other people. When you put it all together, human thought is incredibly impressive. But it is a product of a community, not of any individual alone”. This is a strong statement, which I somehow distance myself from. I think that collective human intelligence can be really workable when individual humans are any good at being smart. Individuals need to have practical freedom of action, based on their capacity to figure s**t out in difficult situations, and the highly fluid ensemble of individual freedoms allows the society to make and experiment with many alternative versions of themselves.

Another book is more of a textbook. It is ‘What Is a Complex System?’ by James Ladyman and Karoline Wiesner, published with Yale University Press (ISBN 978-0-300-25110-4, Kindle Edition). In the introduction (p. 15), Ladyman and Wiesner claim: “One of the most fundamental ideas in complexity science is that the interactions of large numbers of entities may give rise to qualitatively new kinds of behaviour different from that displayed by small numbers of them, as Philip Anderson says in his hugely influential paper, ‘more is different’ (1972). When whole systems spontaneously display behaviour that their parts do not, this is called emergence”. In my world, those ‘entities’ are essentially the chained functional input states {ft0(x1, x2, …, xn) => ft1(x1, x2, …, xn) => … => ftm(x1, x2, …, xn)}. My entities are phenomenological: they are cognitive structures which, for want of a better word, we call ‘empirical variables’. If the neural networks I make and use for my research are any good at representing complex systems, emergence is a property of the data in the first place. Interactions between those entities are expressed through the function of adaptation, mostly through the chain {e(t0) => e(t1) => … => e(tm)} of local errors, concurrent with the chain of functional input states.

I think I know what the central point and thread of my book on collective intelligence is, should I (finally) write that book for good. Artificial neural networks can be used as simulators of collective social behaviour and social change. Still, they do not need to be super-performant networks. My point is that with the right intellectual method, even the simplest neural networks, those possible to program into an Excel spreadsheet, can be reliable cognitive tools for social simulation.

[1] Cline, Eric H. 1177 B.C. (Turning Points in Ancient History, 1). Princeton University Press, Kindle Edition. ISBN 9780691208015 (paperback), ISBN 9780691208022 (ebook).

[2] Siegel, Daniel J. Aware (p. viii). Penguin Publishing Group, Kindle Edition. ISBN 9780143111788, ISBN 9781101993040 (hardback).

[3] Keane, M. (1972). Strongly mixing measures. Inventiones Mathematicae, 16(4), 309-324.

[4] Berghout, S., & Verbitskiy, E. (2021). On regularity of functions of Markov chains. Stochastic Processes and their Applications, 134, 29-54.

DIY algorithms of our own

I return to that interesting interface of science and business, which I touched upon in my last-but-one update, titled ‘Investment, national security, and psychiatry’, and which means that I return to discussing two research projects I am becoming involved in: one in the domain of national security, another one in psychiatry, both connected by the idea of using artificial neural networks as analytical tools. What I intend to do now is to pass some literature in review, just to get the hang of the current state of science.

On top of that, I have been asked by my colleagues to take over, at short notice, the leadership of a big, multi-thread research project in management science. The multitude of threads has emerged as a circumstantial by-product, partly of the disruption caused by the pandemic, and partly of excessive partitioning in the funding of research. As regards the funding of research, Polish universities have, in a way, two financial streams. One consists of big projects, usually team-based, financed by specialized agencies, such as the National Science Centre or the National Centre for Research and Development. The other is based on relatively small grants, applied for by and granted to individual scientists by their respective universities, which, in turn, receive bulk subventions from the Ministry of Education and Science. Personally, I think the latter category, such as it is being allocated and used now, is a bit of a relic. It is some sort of pocket money for the most urgent and current expenses, relatively small in scale and importance, such as the costs of publishing books and articles, the costs of attending conferences etc. This is a financial paradox: we save and allocate money long in advance, in order to have money for essentially incidental expenses – which come at the very end of the scientific pipeline – and we have to make long-term plans for it. It is a case of fundamental mismatch between the intrinsic properties of a cash flow, on the one hand, and the instruments used for managing that cash flow, on the other hand.

Good. This is an introduction to detailed thinking. Once I have those semantic niceties checked out, I cut into the flesh of thinking, and the first piece I intend to cut out is the state of science as regards Territorial Defence Forces and their role amidst the COVID-19 pandemic. I found an interesting article by Tiutiunyk et al. (2018[1]). It is interesting because it gives a detailed methodology for assessing operational readiness in any military unit, territorial defence or other. That corresponds nicely to Hypothesis #2, which I outlined for that project in national security, namely: ‘the actual role played by the TDF during the pandemic was determined by the TDF’s actual capacity of reaction, i.e. speed and diligence in the mobilisation of human and material resources’. That article by Tiutiunyk et al. (2018) allows entering into details as regards that claim.

Those details start unfolding from the assumption that operational readiness is there when the entity studied possesses the required quantity of efficient technical and human resources. The underlying mathematical concept is quite simple. In the given situation, adequate response requires using m units of resources at k% of capacity during time te. The social entity studied can muster n units of the same resources at l% of capacity during the same time te. The most basic expression of operational readiness is, therefore, the coefficient OR = (n*l)/(m*k). I am trying to find out what specific resources are the key to that readiness. Tiutiunyk et al. (2018) offer a few interesting insights in that respect. They start by noticing the otherwise known fact that resources used in crisis situations are not exactly the same as those we use in the everyday course of life and business, and therefore we tend to hold them for longer than their effective lifecycle. We don’t amortize them properly, because we don’t really control for their physical and moral depreciation. One of the core concepts in territorial defence is to counter that negative phenomenon, and to maintain, through comprehensive training and internal control, a required level of capacity.
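That coefficient is simple enough to sketch directly. The numbers below are purely hypothetical, for illustration:

```python
def operational_readiness(n, l, m, k):
    """OR = (n*l) / (m*k): resources actually musterable at actual capacity,
    relative to resources required at required capacity."""
    return (n * l) / (m * k)

# Hypothetical example: adequate response requires 40 vehicles at 90%
# capacity, and the unit can actually muster 36 vehicles at 75% capacity
ratio = operational_readiness(n=36, l=0.75, m=40, k=0.90)
# 27 / 36 = 0.75, i.e. three quarters of the required readiness
```

A value of 1.0 or more means the entity can fully cover the required response; below 1.0, the gap itself points at which resources to investigate.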

As I continue going through literature, I come across an interesting study by I. Bet-El (2020), titled ‘COVID-19 and the future of security and defence’, published by the European Leadership Network. Bet-El introduces an important distinction between threats and risks, and, contiguously, the distinction between security and defence: ‘A threat is a patent, clear danger, while risk is the probability of a latent danger becoming patent; evaluating that probability requires judgement. Within this framework, defence is to be seen as the defeat or deterrence of a patent threat, primarily by military, while security involves taking measures to prevent latent threats from becoming patent and if the measures fail, to do so in such a way that there is time and space to mount an effective defence’. This is deep. I do a lot of research in risk management, especially as I invest in the stock market. When we face a risk factor, our basic behavioural response is hedging or insurance. We hedge by diversifying our exposures to risk, and we insure by sharing the risk with other people. Healthcare systems are a good example of insurance. We have a flow of capital that fuels a manned infrastructure (hospitals, ambulances etc.), and that infrastructure allows each single sick human to share his or her risks with other people. Social distancing is the epidemic equivalent of hedging. When we completely cut off or significantly throttle social interactions between households, each household becomes sort of separated from the epidemic risk in other households. When one node in a network is shielded from some of the risk occurring in other nodes, this is hedging.

The military is made for responding to threats rather than risks. Military action is a contingency plan, implemented when insurance and hedging have gone to hell. The pandemic has shown that we need more of such buffers, i.e. more social entities able to mobilise quickly into deterring directly an actual threat. Territorial Defence Forces seem to fit the bill. Another piece of literature, from my own Polish turf, by Gąsiorek & Marek (2020[2]), states straightforwardly that Territorial Defence Forces have proven to be a key actor during the COVID-19 pandemic precisely because they maintain a high degree of actual readiness in their crisis-oriented resources, as compared to other entities in the Polish public sector.

Good. I have a thread, from literature, for the project devoted to national security. The issue of operational readiness seems to be somehow in the centre, and it translates into the apparently fluid frontier between security and national defence. Speed of mobilisation in the available resources, as well as the actual reliability of those resources, once mobilized, look like the key to understanding the surprisingly significant role of Territorial Defence Forces during the COVID-19 pandemic. Looks like my initial Hypothesis #2, claiming that the actual role played by the TDF during the pandemic was determined by the TDF’s actual capacity of reaction, i.e. speed and diligence in the mobilisation of human and material resources, is some sort of theoretical core to that whole body of research.

In our team, we plan and have a provisional green light to run interviews with the soldiers of Territorial Defence Forces. That basic notion of actually mobilizable resources can help narrow down the methodology to apply in those interviews, by asking specific questions pertinent to that issue. Which specific resources proved to be the most valuable in the actual intervention of TDF during the pandemic? Which resources – if any – proved to be 100% mobilizable on the spot? Which of those resources proved to be much harder to mobilise than initially assumed? Can we rate and rank all the human and technical resources of TDF as regards their capacity to be mobilised?

Good. I gently close the door of that room in my head, filled with Territorial Defence Forces and the pandemic. I make sure I can open it whenever I want, and I open the door to that other room, where psychiatry dwells. The psychiatrists I am working with and I can study a sample of medical records of patients with psychosis. Verbal elocutions of those patients are an important part of that material, and I make two hypotheses along that tangent:

>> Hypothesis #1: the probability of occurrence of specific grammatical structures A, B, C in the general grammatical structure of a patient’s elocutions, both written and spoken, is informative about the patient’s mental state, including the likelihood of psychosis and its specific form.

>> Hypothesis #2: the action of written self-reporting, e.g. via email, on the part of a psychotic patient, allows post-clinical treatment of psychosis, with results observable as transition from mental state A to mental state B.

I start listening to what smarter people than me have to say on the matter. I start with Worthington et al. (2019[3]), and I learn there is a clinical category: clinical high risk for psychosis (CHR-P), thus a set of subtler (than psychotic) ‘changes in belief, perception, and thought that appear to represent attenuated forms of delusions, hallucinations, and formal thought disorder’. I like going backwards upstream, and I immediately ask myself whether that line of logic can be reversed. If there is clinical high risk for psychosis, the occurrence of those same symptoms in reverse order, from severe to light, could be a path of healing, couldn’t it?

Anyway, according to Worthington et al. (2019), some 25% of people with diagnosed CHR-P transition into fully scaled psychosis. Once again, from the perspective of risk management, 25% of actual occurrence in a risk category is a lot. It means that CHR-P is pretty solid as far as risk assessment goes. I further learn that CHR-P, when represented as a collection of variables (a vector for friends with a mathematical edge), entails an internal distinction into predictors and converters. Predictors are the earliest possible observables, something like a subtle smell of possible s**t, swirling here and there in the ambient air. Converters are pieces of information that bring progressive confirmation to predictors.

That paper by Worthington et al. (2019) is a review of literature in itself, and allows me to compare different approaches to CHR-P. The most solid ones, in terms of accurately predicting the onset of full-clip psychosis, always incorporate two components: assessment of the patient’s social role, and analysis of verbalized thought. Good. Looks promising. I think the initial hypotheses should be expanded into claims about socialization.

I continue with another paper, by Corcoran and Cecchi (2020[4]). Generally, patients with psychotic disorders display lower a semantic coherence than ordinary people do. The flow of meaning in their speech is impeded: they can express less meaning in the same volume of words, as compared to a mentally healthy person. Reduced capacity to deliver meaning manifests as apparent tangentiality in verbal expression. Psychotic patients seem to wander in their elocutions. Reduced complexity of speech, i.e. relatively low a capacity to swing between different levels of abstraction, with a tendency to exaggerate concreteness, is another observable which informs about psychosis. Two big families of diagnostic methods follow that twofold path. Latent Semantic Analysis (LSA) seems to be the name of the game as regards the study of semantic coherence. Its fundamental assumption is that words convey meaning by connecting to other words, which further unfolds into assuming that semantic similarity, or dissimilarity, can be measured with a more or less complex coefficient of joint occurrence, as opposed to disjoint occurrence, inside big corpuses of language.
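The joint-occurrence idea can be sketched minimally. Proper LSA applies singular value decomposition to a term-document matrix; the toy version below just builds co-occurrence vectors and scores an utterance by the mean cosine similarity of its consecutive words. The corpus and the sentences are invented for illustration, and this is in no way the actual tooling from the papers cited:

```python
import numpy as np

def cooccurrence_vectors(corpus, window=2):
    """Each word gets a vector of its co-occurrence counts with every
    other word within a +/- window, across the whole corpus."""
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    M[idx[w], idx[sent[j]]] += 1.0
    return idx, M

def coherence(utterance, idx, M):
    """Mean cosine similarity between consecutive words: a crude proxy
    for the semantic coherence of an utterance."""
    sims = []
    for a, b in zip(utterance, utterance[1:]):
        va, vb = M[idx[a]], M[idx[b]]
        denom = np.linalg.norm(va) * np.linalg.norm(vb)
        if denom > 0:
            sims.append(float(va @ vb / denom))
    return sum(sims) / len(sims) if sims else 0.0

# Invented mini-corpus of tokenized sentences
corpus = [["the", "doctor", "sees", "the", "patient"],
          ["the", "patient", "talks", "to", "the", "doctor"]]
idx, M = cooccurrence_vectors(corpus)
score = coherence(["the", "doctor", "sees", "the", "patient"], idx, M)
```

Lower scores would flag utterances whose consecutive words rarely keep company in the reference corpus, which is the intuition behind tangentiality measures.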

Corcoran and Cecchi (2020) name two main types of digital tools for Latent Semantic Analysis. One is Word2Vec, for which more technical, programmatic approaches are also available. Another one is GloVe, to which I found several interesting references.

As regards semantic complexity, two types of analytical tools seem to run the show. One is the part-of-speech (POS) algorithm, where we tag words according to their grammatical function in the sentence: noun, verb, determiner etc. There are already existing digital platforms for implementing that approach, such as the Natural Language Toolkit. Another angle is that of speech graphs, where words are nodes in the network of discourse, and their connections (e.g. joint occurrence) to other words are edges in that network. Now, the intriguing thing about that last thread is that it seems to have been burgeoning in the late 1990s, and then it sort of faded away. Anyway, I found two references for an algorithmic approach to speech graphs.
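A speech graph is simple enough to build by hand. The sketch below, a minimal assumption-laden illustration rather than any published algorithm, takes adjacent words as edges and reads off a few basic graph statistics:

```python
from collections import defaultdict

def speech_graph(words):
    """Words are nodes; each pair of adjacent, distinct words makes an edge."""
    edges = defaultdict(int)
    for a, b in zip(words, words[1:]):
        if a != b:
            edges[tuple(sorted((a, b)))] += 1
    return set(words), dict(edges)

def graph_stats(words):
    nodes, edges = speech_graph(words)
    # Fewer distinct nodes per token signals more repetitive, less
    # complex discourse, which is one of the observables discussed above
    return {"nodes": len(nodes), "edges": len(edges),
            "nodes_per_token": len(nodes) / len(words)}

stats = graph_stats("the cat sat on the mat".split())
# -> nodes: 5, edges: 5, nodes_per_token: ~0.83
```

Richer versions weight edges by co-occurrence counts and compute loops or connected components, but the node-and-edge core stays the same.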

That quick review of literature, as regards natural language as a predictor of psychosis, leads me to an interesting sidestep. Language is culture, right? Low coherence and low complexity in natural language are informative about psychosis, right? Now, I put that argument upside down. What if we, homo (mostly) sapiens, have a natural proclivity to psychosis, with that overblown cortex of ours? What if we had figured out, at some point of our evolutionary path, that language is a collectively intelligent tool which, with its unique coherence and the complexity required for efficient communication, keeps us in a state of acceptable sanity, until we go on Twitter, of course.

Returning to the intellectual discipline which I should demonstrate, as a respectable researcher, the above review of literature brings one piece of good news as regards the project in psychiatry. Initially, in this specific team, we assumed that we necessarily need an external partner, most likely a digital business with important digital resources in AI, in order to run research on natural language. Now I realize that we can assume two scenarios: one with the big, fat AI of that external partner, and another one with DIY algorithms of our own. That gives some freedom of movement. Cool.

[1] Tiutiunyk, V. V., Ivanets, H. V., Tolkunov, І. A., & Stetsyuk, E. I. (2018). System approach for readiness assessment units of civil defense to actions at emergency situations. Науковий вісник Національного гірничого університету, (1), 99-105. DOI: 10.29202/nvngu/2018-1/7

[2] Gąsiorek, K., & Marek, A. (2020). Działania wojsk obrony terytorialnej podczas pandemii COVID–19 jako przykład wojskowego wsparcia władz cywilnych i społeczeństwa [Actions of the Territorial Defence Forces during the COVID-19 pandemic as an example of military support for civil authorities and society]. Wiedza Obronna.

[3] Worthington, M. A., Cao, H., & Cannon, T. D. (2019). Discovery and validation of prediction algorithms for psychosis in youths at clinical high risk. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging.

[4] Corcoran, C. M., & Cecchi, G. (2020). Using language processing and speech analysis for the identification of psychosis and other disorders. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging.

Investment, national security, and psychiatry

I need to clear my mind a bit. For the last few weeks, I have been working a lot on revising an article of mine, and I feel I need a little bit of a shake-off. I know from experience that I need a structure to break free from another structure. Yes, I am one of those guys. I like structures. When I feel I lack one, I make one.

The structure which I want to dive into, in order to shake off the thinking about my article, is the thinking about my investment in the stock market. My general strategy in that department is to take the rent which I collect from an apartment in town, every month, and to invest it in the stock market. Economically, it is a complex process of converting the residential utility of a real asset (the apartment) into a flow of cash, thus into a financial asset with quite steady a market value (inflation is still quite low), and then of converting that low-risk financial asset into a differentiated portfolio of other financial assets endowed with higher a risk (stock). I progressively move capital from markets with low risk (residential real estate, money) into a high-risk-high-reward market.

I am playing a game. I make a move (monthly cash investment), and I wait for a change in the stock market. I am wrapping my mind around the observable change, and I make my next move the next month. With each move I make, I gather information. What is that information? Let’s have a look at my portfolio such as it is now. You can see it in the table below:

Stock | Value in EUR | Real return in EUR | Rate of return as of April 6th, 2021, in the morning
CASH & CASH FUND & FTX CASH (EUR) | € 25,82 | € – | € 25,82
ALLEGRO.EU SA | € 48,86 | € (2,82) | -5,78%
ALTIMMUNE INC. – COMM | € 1 147,22 | € 179,65 | 15,66%
APPLE INC. – COMMON ST | € 1 065,87 | € 8,21 | 0,77%
BIONTECH SE | € 1 712,88 | € (149,36) | -8,72%
CUREVAC N.V. | € 711,00 | € (98,05) | -13,79%
DEEPMATTER GROUP PLC | € 8,57 | € (1,99) | -23,26%
FEDEX CORPORATION COMM | € 238,38 | € 33,49 | 14,05%
FIRST SOLAR INC. – CO | € 140,74 | € (11,41) | -8,11%
GRITSTONE ONCOLOGY INC | € 513,55 | € (158,43) | -30,85%
INPOST | € 90,74 | € (17,56) | -19,35%
MODERNA INC. – COMMON | € 879,85 | € (45,75) | -5,20%
NOVAVAX INC. – COMMON STOCK | € 1 200,75 | € 398,53 | 33,19%
NVIDIA CORPORATION – C | € 947,35 | € 42,25 | 4,46%
ONCOLYTICS BIOTCH CM | € 243,50 | € (14,63) | -6,01%
SOLAREDGE TECHNOLOGIES | € 683,13 | € (83,96) | -12,29%
SOLIGENIX INC. COMMON | € 518,37 | € (169,40) | -32,68%
TESLA MOTORS INC. – C | € 4 680,34 | € 902,37 | 19,28%
VITALHUB CORP. | € 136,80 | € (3,50) | -2,56%
WHIRLPOOL CORPORATION | € 197,69 | € 33,11 | 16,75%
TOTAL | € 15 191,41 | € 840,74 | 5,53%

A few words of explanation are due. Whilst I have been actively investing for 13 months, I made this portfolio in November 2020, when I did some major reshuffling. My overall return on the cash invested, over the entire period of 13 months, is 30,64% as for now (April 6th, 2021), which makes 30,64% * (12/13) = 28,3% on the annual basis.

The 5,53% of return which I have on this specific portfolio makes roughly 1/6th of the total return I have on all the portfolios I have had over the past 13 months. It is the outcome of my latest experimental round, and this round is very illustrative of a mistake which I know I can make as an investor: panic.

In August and September 2020, I collected some information, I did some thinking, and I made a portfolio of biotech companies involved in the COVID-vaccine story: Pfizer, Biontech, Curevac, Moderna, Novavax, Soligenix. By mid-October 2020, I was literally swimming in ecstasy, as I had returns on these ones like +50%. Pure madness. Then, big financial sharks, commonly called ‘investment funds’, went hunting for those stocks, and they did what sharks do: they made their target bleed before eating it. They boxed and shorted those stocks in order to make their prices affordably low for long investment positions. At the time, I lost control of my emotions, and when I saw those prices plummet, I sold out everything I had. Almost as soon as I did it, I realized what an idiot I had been. Two weeks later, the same stocks started to rise again. Sharks had had their meal. In response, I did something I still wonder was wise or stupid: I bought back into those positions, only at prices higher than what I had sold them for.

Selling out was stupid, for sure. Was buying back in a wise move? I don’t know, like really. My intuition tells me that biotech companies in general have a bright future ahead, and not only in connection with vaccines. I am deeply convinced that the pandemic has already built up, and will keep building up, an interest in biotechnology and medical technologies, especially in highly innovative forms. This is even more probable as we have realized that modern biotechnology is very largely digital technology. This is what is called ‘platforms’ in the biotech lingo: digital clouds which combine empirical experimental data with artificial intelligence, and the latter is supposed to experiment virtually with that data. Modern biotechnology consists in creating as many alternative combinations of molecules and lifeforms as we can possibly make and study, and then picking those which offer the best combination of biological outcomes with the probability of achieving said outcomes.

My currently achieved rates of return, in the portfolio I have now, are very illustrative of an old principle in capital investment: I will fail most of the time. Most of my investment decisions will be failures, at least in the short and medium term, because I cannot possibly outsmart the incredibly intelligent collective structure of the stock market. My overall gain, those 5,53% in the case of this specific portfolio, is the outcome of 19 experiments: for now, I am failing in 12 of them, and I am more or less successful in the remaining 7.

The very concept of ‘beating the market’, which some wannabe investment gurus present, is ridiculous. The stock market is made of tens of thousands of human brains, operating in correlated coupling, and leveraged with increasingly powerful artificial neural networks. When I expect to beat that networked collective intelligence with that individual mind of mine, I am pumping smoke up my ass. On the other hand, what I can do is to run as many different experiments as I can possibly spread my capital between.

It is important to understand that any investment strategy, where I assume that from now on, I will not make any mistakes, is delusional. I made mistakes in the past, and I am likely to make mistakes in the future. What I can do is to make myself more predictable to myself. I can narrow down the type of mistakes I tend to make, and to create the corresponding compensatory moves in my own strategy.

Differentiation of risk is a big principle in my investment philosophy, and yet it is not the only one. Generally, with the exception of maybe 2 or 3 days in a year, I don’t really like quick, daily trade in the stock market. I am more of a financial farmer: I sow, and I wait to see plants growing out of those seeds. I invest in industries rather than individual companies. I look for some kind of strong economic undertow for my investments, and the kind of undertow I specifically look for is high potential for deep technological change. Accessorily, I look for industries which sort of logically follow human needs, e.g. the industry of express deliveries in the times of pandemic. I focus on three main fields of technology: biotech, digital, and energy.

Good. I needed to shake off, and I am. Thinking and writing about real business decisions helped me to take some perspective. Now, I am gently returning into the realm of science, without completely leaving the realm of business: I am navigating the somehow troubled and feebly charted waters of money for science. I am currently involved in launching and fundraising for two scientific projects, in two very different fields of science: national security and psychiatry. Yes, I know, they can conjunct in more points than we commonly think they can. Still, in canonical scientific terms, these two diverge.

How come I am involved, as researcher, in both national security and psychiatry? Here is the thing: my method of using a simple artificial neural network to simulate social interactions seems to be catching on. Honestly, I think it is catching on because other researchers, when they hear me talking about ‘you know, simulating alternative realities and assessing which one is the closest to the actual reality’ sense in me that peculiar mental state, close to the edge of insanity, but not quite over that edge, just enough to give some nerve and some fun to science.

In the field of national security, I teamed up with a scientist strongly involved in it, and we are taking on studying the way our Polish Territorial Defence Forces have been acting in and coping with the COVID-19 pandemic. First, the context. So far, the pandemic has worked as a magnifying glass for all the f**kery in public governance. We could all see a minister saying ‘A, B and C will happen because we said so’, and right after there was just A happening, with a lot of delay, and then a completely unexpected, phenomenal D appeared, whilst B and C were bitching and moaning that they didn’t have the right conditions for happening decently, and therefore would not happen at all. This is the first piece of the context. The second is the official mission and the reputation of our Territorial Defence Forces, AKA TDF. This is a branch of our Polish military, created in 2017 by our right-wing government. From the beginning, these guys had the reputation of being a right-wing militia dressed in uniforms and paid with taxpayers’ money. I honestly admit I used to share that view. TDF is something like the National Guard in the US. These are units made of soldiers who serve in the military, and have basic military training, but have normal civilian lives besides. They have civilian jobs, whilst training regularly and being at the ready should the nation call.

The initial idea of TDF emerged after the Russian invasion of the Crimea, when we became acutely aware that military troops in nondescript uniforms, apparently lost, and yet strangely connected to the Russian government, could massively start looking lost by our Eastern border. The initial idea behind TDF was to significantly increase the capacity of the Polish population for mobilising military resources. Switzerland and Finland largely served as models.

When the pandemic hit, our government could barely pretend they controlled the situation. Hospitals designated as COVID-specific frequently had no resources to carry out that mission. Our government had the idea of mobilising TDF to help with basic stuff: logistics, triage and support in hospitals etc. Once again, the initial reaction of the general public was to put the label of ‘militarisation’ on that decision, and, once again, I was initially thinking this way. Still, some friends of mine, strongly involved as social workers supporting healthcare professionals, started telling me that working with TDF, in local communities, was nothing short of amazing. TDF had the speed, the diligence, and the capacity to keep their s**t together which many public officials lacked. They were just doing their job and helping tremendously.

I started scratching the surface. I did some research, and I found out that TDF was of invaluable help for many local communities, especially outside of big cities. Recently, I accidentally had a conversation about it with M., the scientist whom I am working with on that project. He just confirmed my initial observations.

M. has strong connections with TDF, including their top command. Our common idea is to collect abundant, interview-based data from TDF soldiers mobilised during the pandemic, as regards the way they carried out their respective missions. The purely empirical edge we want to have here is oriented on defining successes and failures, as well as their context and contributing factors. The first layer of our study is supposed to provide the command of TDF with some sort of case-studies-based manual for future interventions. At the theoretical, more scientific level, we intend to check the following hypotheses:      

>> Hypothesis #1: during the pandemic, TDF has changed its role, under the pressure of external events, from the initially assumed, properly spoken territorial defence, to civil defence and assistance to the civilian sector.

>> Hypothesis #2: the actual role played by the TDF during the pandemic was determined by the TDF’s actual capacity of reaction, i.e. speed and diligence in the mobilisation of human and material resources.

>> Hypothesis #3: collectively intelligent human social structures form mechanisms of reaction to external stressors, and the chief orientation of those mechanisms is to assure proper behavioural coupling between the action of external stressors and the coordinated social reaction. Note: I define behavioural coupling in terms of game theory, i.e. as the objectively existing need for proper pacing in action and reaction.

The basic method of verifying those hypotheses consists, in the first place, in translating the primary empirical material into a matrix of probabilities. There is a finite catalogue of operational procedures that TDF can perform. Some of those procedures are associated with territorial military defence as such, whilst other procedures belong to the realm of civil defence. It is supposed to go like: ‘At the moment T, in the location A, a procedure of type Si had a P(T, A, Si) probability of happening’. In that general spirit, Hypothesis #1 can be translated straight into a matrix of probabilities, and phrased out as ‘during the pandemic, the probability of TDF units acting as civil defence was higher than that of seeing them operate as strict territorial defence’.
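Such a matrix of probabilities can be built as relative frequencies of coded observations. The events below are entirely hypothetical placeholders for the interview-based material:

```python
from collections import Counter

def procedure_probabilities(events):
    """events: list of (time, location, procedure) tuples coded from the
    empirical material. Returns P(T, A, Si) as the relative frequency of
    each procedure within its (time, location) cell."""
    cell_totals = Counter((t, a) for t, a, _ in events)
    counts = Counter(events)
    return {(t, a, s): c / cell_totals[(t, a)]
            for (t, a, s), c in counts.items()}

# Hypothetical coded observations: 'civil' vs 'territorial' procedures
events = [("2020-04", "Mazowieckie", "civil"),
          ("2020-04", "Mazowieckie", "civil"),
          ("2020-04", "Mazowieckie", "territorial")]
P = procedure_probabilities(events)
# P[("2020-04", "Mazowieckie", "civil")] == 2/3
```

Summing over locations, or over moments, then yields the aggregate and region-specific probabilities the hypotheses talk about.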

That general probability can be split into local ones, e.g. region-specific. On the other hand, I intuitively associate Hypotheses #2 and #3 with the method which I call ‘study of orientation’. I take the matrix of probabilities defined for the purposes of Hypothesis #1, and I put it back to back with a matrix of quantitative data relative to the speed and diligence in action, as regards TDF on the one hand, and other public services on the other hand. It is about the availability of vehicles, the capacity of mobilisation in people etc. In general, it is about the so-called ‘operational readiness’, about which you can read more in, for example, the publications of the RAND Corporation.

Thus, I take the matrix of variables relative to operational readiness observable in the TDF, and I use that matrix as input for a simple neural network, where the aggregate neural activation based on those metrics, e.g. through a hyperbolic tangent, is supposed to approximate a specific probability, namely that of TDF people endorsing, in their operational procedures, the role of civil defence, as opposed to that of military territorial defence. I hypothesise that operational readiness in TDF manifests a collective intelligence at work, doing its best to endorse specific roles and to apply specific operational procedures. I make as many such neural networks as there are operational procedures observed for the purposes of Hypothesis #1. Each of these networks is supposed to represent the collective intelligence of TDF attempting to optimize, through its operational readiness, the endorsement and fulfilment of a specific role. In other words, each network represents an orientation.
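A quick sketch of what one such ‘orientation’ boils down to, with made-up readiness metrics and roles: a single tanh neuron per role, one network per observed procedure. The real networks can obviously be richer; everything named here is an assumption for illustration only:

```python
import numpy as np

# Minimal sketch: readiness metrics are aggregated by one tanh neuron per role,
# and that aggregate activation is meant to approximate the probability of the
# TDF endorsing that role. Metrics, roles, and data are hypothetical.
rng = np.random.default_rng(0)
readiness = rng.random((200, 6))   # 200 observations, 6 readiness metrics

roles = ['civil_defence', 'logistics', 'territorial_defence']
networks = {role: rng.random(6) * 0.1 for role in roles}   # one weight vector each

# Aggregate neural activation per observation, for each orientation
activations = {role: np.tanh(readiness @ w) for role, w in networks.items()}
print({role: a.shape for role, a in activations.items()})
```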

Each such network transforms the input data it works with. This is what neural networks do: they experiment with many alternative versions of themselves. Each experimental round, in this case, consists in a vector of metrics informative about the operational readiness of TDF, and that vector locally tries to generate an aggregate outcome – its neural activation – as close as possible to the probability of effectively playing a specific role. This is always a failure: the neural activation of operational readiness always falls short of nailing down exactly the probability it attempts to optimize. There is always a local residual error to account for, and the way a neural network (well, my neural network) accounts for errors consists in measuring them and feeding them into the next experimental round. The point is that each such distinct neural network, oriented on optimizing the probability of Territorial Defence Forces endorsing and fulfilling a specific social role, is a transformation of the original, empirical dataset informative about the TDF’s operational readiness.
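Those experimental rounds can be sketched as a simple delta-rule loop, where the residual error measured in one round shapes the weights used in the next. Data, learning rate, and the number of rounds are all illustrative assumptions:

```python
import numpy as np

# Sketch of the experimental rounds described above: each round, the network
# produces its tanh activation, falls short of the target probability, measures
# the residual error, and feeds it back into the next round (simple delta rule).
rng = np.random.default_rng(1)
X = rng.random((200, 6))        # operational-readiness metrics (hypothetical)
p_role = rng.random(200)        # probability of fulfilling one specific role

w = rng.random(6) * 0.1         # initial synaptic weights
errors = []                     # history of the mean residual error, per round
for _ in range(300):            # experimental rounds
    out = np.tanh(X @ w)        # aggregate neural activation
    e = p_role - out            # local residual error
    # feed the error back; the derivative of tanh(h) is 1 - tanh(h)**2
    w += 0.01 * X.T @ (e * (1.0 - out**2)) / len(X)
    errors.append(float(np.mean(np.abs(e))))
print(errors[0], errors[-1])
```

The `errors` list is the by-product that matters later: its distribution over rounds is what I read as the pattern of learning.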

Thus, in this method, I create as many transformations (AKA alternative versions) of the actual operational readiness in TDF, as there are social roles to endorse and fulfil by TDF. In the next step, I estimate two mathematical attributes of each such transformation: its Euclidean distance from the original empirical dataset, and the distribution of its residual error. The former is informative about similarity between the actual reality of TDF’s operational readiness, on the one hand, and alternative realities, where TDF orient themselves on endorsing and fulfilling just one specific role. The latter shows the process of learning which happens in each such alternative reality.
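A minimal illustration of those two mathematical attributes, with two invented transformations of an invented dataset, one deliberately close to the original and one deliberately distant:

```python
import numpy as np

# Sketch: each transformed dataset (one per social role) is compared with the
# original empirical dataset through the Euclidean distance between their mean
# vectors, and through the distribution of residuals. Both transformations
# below are synthetic placeholders, not real TDF results.
rng = np.random.default_rng(2)
X = rng.random((200, 6))                                      # 'original' dataset
transformations = {
    'civil_defence': X + rng.normal(0, 0.05, X.shape),        # close alternative
    'territorial_defence': X + rng.normal(0, 0.5, X.shape),   # distant alternative
}

for role, S in transformations.items():
    d = float(np.linalg.norm(X.mean(axis=0) - S.mean(axis=0)))  # Euclidean distance
    residuals = (X - S).ravel()                                 # residual distribution
    print(role, round(d, 4), round(float(residuals.std()), 4))
```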

I make a few methodological hypotheses at this point. Firstly, I expect a few, like 1 to 3, transformations (alternative realities) to fall particularly close to the actual empirical reality, as compared to others. ‘Particularly close’ means their Euclidean distances from the original dataset will be at least one order of magnitude smaller than those observable in the remaining transformations. Secondly, I expect those transformations to display a specific pattern of learning, where the residual error swings in a predictable cycle, over a relatively wide amplitude, yet stays inside that amplitude. This is a cycle where the collective intelligence of Territorial Defence Forces goes like: ‘We optimize, we optimize, it goes well, we narrow down the error, f**k!, we failed, our error increased, and yet we keep trying, we optimize, we optimize, we narrow down the error once again…’ etc. Thirdly, I expect the remaining transformations, namely those much less similar to the actual reality in Euclidean terms, to display different patterns of learning, either completely dishevelled, with the residual error bouncing haphazardly all over the place, or exaggeratedly tight, with the error being narrowed down very quickly and staying small ever since.
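One crude way of telling those three patterns of learning apart is to look at the amplitude of the residual-error series together with its autocorrelation. The three series below are synthetic caricatures of the three cases, not real results, and the diagnostics are an assumption of mine, not a standard test:

```python
import numpy as np

# Three caricatures of error-over-rounds series: a cyclical one (wide amplitude,
# strong positive autocorrelation), a dishevelled one (wide amplitude, no
# structure), and a tight one (amplitude narrowed down very quickly).
rng = np.random.default_rng(3)
t = np.arange(300)
cyclical = 0.3 + 0.2 * np.sin(t / 15)     # swings predictably inside an amplitude
dishevelled = rng.random(300)             # bounces haphazardly all over the place
tight = 0.5 * np.exp(-t / 20)             # narrowed down very quickly, small ever since

def autocorr(series, lag):
    """Crude lag autocorrelation of a series."""
    s = series - series.mean()
    return float(np.dot(s[:-lag], s[lag:]) / np.dot(s, s))

for name, e in [('cyclical', cyclical), ('dishevelled', dishevelled), ('tight', tight)]:
    print(name, round(float(np.ptp(e)), 2), round(autocorr(e, 10), 2))
```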

That’s the outline of the research which I am engaging in, in the field of national security. My role in this project is that of a methodologist. I am supposed to design the system of interviews with TDF people, the way of formalizing the resulting data, binding it with other sources of information, and finally carrying out the quantitative analysis. I think I can use the experience I already have with artificial neural networks as simulators of social reality, mostly in defining said reality as a vector of probabilities attached to specific events and behavioural patterns.

As regards psychiatry, I have just started to work with a group of psychiatrists who have abundant professional experience in two specific applications of natural language in diagnosing and treating psychoses. The first one consists in interpreting patients’ elocutions as informative about their likelihood of being psychotic, of relapsing into psychosis after therapy, or of getting durably better after such therapy. In psychiatry, the durability of therapeutic outcomes is a big thing, as I have already learnt when preparing for this project. The second application is the analysis of patients’ emails. The psychiatrists I am starting to work with use a therapeutic method which engages the patient to maintain contact with the therapist by writing emails. Patients describe, quite freely and casually, their mental state together with their general existential context (job, family, relationships, hobbies etc.). They don’t necessarily discuss those emails in subsequent therapeutic sessions; sometimes they do, sometimes they don’t. The most important therapeutic outcome seems to be derived from the very fact of writing and emailing.

In terms of empirical research, the semantic material we are supposed to work with in that project consists of two big sets of written elocutions: patients’ emails, on the one hand, and transcripts of standardized 5-minute therapeutic interviews, on the other hand. Each elocution is a complex grammatical structure in itself. The semantic material is supposed to be cross-checked against neurological biomarkers in the same patients. The way I intend to use neural networks in this case is slightly different from that national security thing. I am thinking about defining categories, i.e. about networks which guess similarities and classifications out of crude empirical data. For now, I make two working hypotheses:

>> Hypothesis #1: the probability of occurrence of specific grammatical structures A, B, C in the general grammatical structure of a patient’s elocutions, both written and spoken, is informative about the patient’s mental state, including the likelihood of psychosis and its specific form.

>> Hypothesis #2: the action of written self-reporting, e.g. via email, on the part of a psychotic patient, allows post-clinical treatment of psychosis, with results observable as a transition from mental state A to mental state B.
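As a toy illustration of Hypothesis #1, the probability of occurrence of grammatical structures can be estimated as simple relative frequencies over a patient’s elocutions, once each elocution is coded as a sequence of structure tags. Tags and data below are invented:

```python
from collections import Counter

# Hypothetical sketch: each elocution is coded as a sequence of grammatical
# structure tags (A, B, C, ...), and the probability of occurrence of each
# structure is its relative frequency in the patient's material.
elocutions = [
    ['A', 'B', 'A', 'C', 'A'],     # e.g. one transcribed 5-minute interview
    ['B', 'B', 'A', 'C'],          # e.g. one email
]
counts = Counter(tag for elocution in elocutions for tag in elocution)
total = sum(counts.values())
probabilities = {tag: n / total for tag, n in counts.items()}
# e.g. probabilities['A'] == 4/9
print(probabilities)
```

A vector of such probabilities per patient is exactly the kind of crude empirical input a categorizing network could then cluster.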

The fine details of theory

I keep digging. I keep revising that manuscript of mine – ‘Climbing the right hill – an evolutionary approach to the European market of electricity’ – in order to resubmit it to Applied Energy. Some of my readers might become slightly fed up with that thread. C’mon, man! How long do you mean to work on that revision? It is just an article! Yes, it is just an article, and I have that thing in me, those three mental characters: the curious ape, the happy bulldog, and the austere monk. The ape is curious, and it almost instinctively reaches for interesting things. My internal bulldog just loves digging out tasty pieces and biting into bones. The austere monk in me observes the intellectual mess, which the ape and the bulldog make together, and then he takes that big Ockham’s razor, from the recesses of his robe, and starts cutting bullshit out. When the three of those start dancing around a topic, it is a long path to follow, believe me.

In this update, I intend to structure the theoretical background of my paper. First, I restate the essential point of my own research, which I need and want to position in relation to other people’s views and research. I claim that energy-related policies, including those with an environmental edge, should assume that whatever we do with energy, as a civilisation, is a by-product of actions purposefully oriented on other types of outcomes. Metaphorically, claiming that a society should take the shift towards renewable energies as its chief goal, and take everything else as instrumental, is like saying that the chief goal of an individual should be to keep their blood sugar firmly at 80, whatever happens. What’s the best way of achieving it? Putting yourself in a clinic, under permanent intravenous nutrition, and stopping all experimentation with those things people call ‘food’, ‘activity’, ‘important things to do’. Does anyone want to do it? Hardly anyone, I guess. The right level of blood sugar can be approximately achieved as the balanced outcome of a proper lifestyle, and can serve as a gauge of whether our actual lifestyle is healthy.

Coming back from my nutritional metaphor to energy-related policies, there is no historical evidence that any human society has ever achieved any important change regarding the production of energy or its consumption by explicitly stating: ‘From now on, we want better energy management’. The biggest known shifts in our energy base happened as by-products of changes oriented on something else. In Europe, my home continent, we had three big changes. First, way back in the day, starting from the 13th century, we harnessed the power of wind and that of water in, respectively, windmills and watermills. That served to provide kinetic energy to grind cereals into flour, which, in turn, served to feed a growing urban population. Windmills and watermills brought with them a silent revolution, which we are still wrapping our minds around. By the end of the 19th century, we started a massive shift towards fossil fuels. Why? Because we expected to drive Ferraris around, one day in the future? Not really. We had just gone terribly short on wood. People who claim that Europe should recreate its ‘ancestral’ forests deliberately ignore the fact that hardly anyone today knows what those ancestral forests should look like. Virtually all the forests we have today come from massive replantation which took place starting from the beginning of the 20th century. Yes, we have a bunch of 400-year-old oaks across the continent, but I dare remind you that one oak is not exactly a forest.

The civilisational change which I think is going on now, in our human civilisation, is the readjustment of social roles, and of the ways we create new social roles, in the presence of a radical demographic change: an unprecedentedly high headcount of population, accompanied by a just as unprecedentedly low rate of demographic growth. For hundreds of years, our civilisation has been evolving as two concurrent factories: the factory of food in the countryside, and the factory of new social roles in cities. Food comes best when the headcount of humans immediately around is a low constant, and new social roles burgeon best when humans interact abundantly, and therefore when they are tightly packed together in a limited space. The basic idea of our civilisation is to put most of the absolute demographic growth into cities and let ourselves invent new ways of being useful to each other, whilst keeping rural land as productive as possible.

That thing had worked for centuries. It had worked for a humanity that had been relatively small in relation to available space and had been growing quickly into that space. That idea of separating the production of food from the creation of social roles and institutions was adapted precisely to that demographic pattern, vestiges of which you can still find in some developing countries, as well as in emerging markets, with urban populations several dozen times denser than the rural ones, and cities that look like something effervescent. These cities grow bubbles out of themselves, and those bubbles burst just as quickly. My own trip to China showed me how cities can be truly alive, with layers and bubbles inside them. One is tempted to say these cities are something abnormal, as compared to the orderly, demographically balanced urban entities in developed countries. Still, historically, this is what cities are supposed to look like.

Now, something is changing. There are more of us on the planet than there have ever been but, at the same time, we experience an unprecedentedly low rate of demographic growth. Whilst we apparently still manage to keep total urban land on the planet at a constant level, we struggle with keeping the surface of agricultural land up to our needs. As in any system tilted out of balance, weird local phenomena start occurring, and the basic metrics pertinent to the production and consumption of energy show an interesting pattern. When I look at the percentage participation of renewable sources in the total consumption of energy, I see a bumpy cycle which looks like learning with experimentation. When I narrow down to the participation of renewables in the total consumption of electricity, what I see is a more pronounced trend upwards, with visible past experimentation. The use of nuclear power to generate electricity looks like a long-run experiment, which is now in its phase of winding down.

Now, two important trends come into my focus. Energy efficiency, defined as average real output per unit of energy use, shows quite an unequivocal trend upwards. Someone could say: ‘Cool, we purposefully make ourselves energy efficient’. Still, when we care to have a look at the coefficient of energy consumed per person per year, a strong trend upwards appears there as well, with some deep bumps in the past. When I put those two trends back to back, I conclude that what we really max out on is the real output of goods and services in our civilisation, and energy efficiency is just a means to that end.

It is a good moment to puncture an intellectual balloon. I can frequently see and hear people argue that maximizing real output, in any social entity or context, is a manifestation of stupid, baseless greed and blindness to the truly important stuff. Still, please consider the following line of logic. We, humans, interact with the natural environment, and interact with each other.  When we interact with each other a lot, in highly dense networks of social relations, we reinforce each other’s learning, and start spinning the wheel of innovation and technological change. Abundant interaction with each other gives us new ideas for interacting with the natural environment.

Cities have peculiar properties. Firstly, by creating new social roles through intense social interaction, they create new products and services, and therefore new markets, connected in chains of value added. This is how the real output of goods and services in a society becomes a complex, multi-layered network of technologies, and this is how social structures become self-propelling businesses. The more complexity in social roles is created, the more products and services emerge, which brings development in a greater number of markets. That, in turn, gives a greater real output and a greater income per person, which incentivizes the creation of new social roles etc. This is how social complexity creates the phenomenon called economic growth.

The phenomenon of economic growth, thus the quantitative growth in complex, networked technologies which emerge in relatively dense human settlements, has a few peculiar properties. You can’t see it, you can’t touch it, and yet you can immediately feel when its pace changes. Economic growth is among the most abstract concepts of social sciences, and yet living in a society with real economic growth at 5% per annum is like a different galaxy when compared to living in a place where real economic growth is actually a recession of -5%. The arithmetical difference is just 10 percentage points, on top of an underlying base of 1. Still, lives in those two contexts are completely different. At +5% in real economic growth, starting a new business is generally a sensible idea, provided you have it nailed down with a business plan. At -5% a year, i.e. in recession, the same business plan can be an elaborate way of committing economic and financial suicide. At +5%, political elections are usually won by people who just sell you the standard political bullshit, like ‘I will make your lives better’ claimed by a heavily indebted alcoholic with no real career of their own. At -5%, politics start being haunted by those sinister characters, who look and sound like evil spirits from our dreams and claim they ‘will restore order and social justice’.

The society which we consider today as normal is a society of positive real economic growth. All the institutions we are used to, such as healthcare systems, internal security, public administration, education – all that stuff works at least acceptably smoothly when the complex, networked technologies of our society have demonstrable capacity to increase their real economic output. That ‘normal’ state of society is closely connected to the factories of social roles which we commonly call ‘cities’. Real economic growth happens when the amount of new social roles – fabricated through intense interactions between densely packed humans – is enough for the new humans coming around. Being professionally active means having a social role solid enough to participate in the redistribution of value added created in complex technological networks. It is both formal science and a sort of accumulated wisdom in governance that we’d better have most of the adult, able-bodied people in that state of professional activity. A small fringe of professionally inactive people is a somewhat healthy margin of human energy free to be professionally activated, and when I say ‘small’, it is like no more than 5% of the adult population. Anything above that becomes both a burden and a disruption to social cohesion. Too big a percentage of people with no clear, working social roles makes it increasingly difficult to make social interactions sufficiently abundant and complex to create enough new social roles for new people. This is why governments of this world attach keen importance to the accurate measurement of the phenomenon quantified as ‘unemployment’.

Those complex networks of technologies in our societies, which have the capacity to create social roles and generate economic growth, do their work properly when we can transact about them, i.e. when we have working markets for the final economic goods produced with those technologies, and for the intermediate economic goods produced for them. In short, the whole thing works when we can buy and sell things. I was born in 1968, in a communist country, namely Poland, and I can tell you that in the absence of markets the whole mechanism just jams, progressively grinding to a halt. Yes, markets are messy and capricious, and transactional prices can easily get out of hand, creating inflation, and yet markets give those little local incentives needed to get the most out of human social roles. In the communist Poland, I remember people doing really strange things, like hoarding massive inventories of refrigerators or women’s underwear, just to create some speculative spin in an ad hoc, semi-legal or completely illegal market. It looks as if people needed to market and transact for real, amidst the theoretically perfectly planned society.

Anyway, economic growth is observable through big sets of transactions in product markets, and those transactions have two attributes: quantities and prices, AKA Q and P. It goes like Q*P = ∑qi*pi. When I have – well, when we have – that complex network of technologies functionally connected to a factory of social roles for new humans, that thing makes ∑qi*pi, thus a lot of local transactions with quantities qi, at prices pi. The economic growth I have been so vocal about in the last few paragraphs is the real growth, i.e. growth in the quantity Q = ∑qi. In the long run, what I am interested in, and my government is interested in, is to reasonably max out on ∆Q = ∆∑qi. Quantities change slowly and quite predictably, whilst prices tend to change quickly and, mostly in the short term, chaotically. Measuring real economic growth accurately involves kicking the ‘*pi’ component out of the equation and extracting just ∆Q = ∆∑qi. Question: why bother with the observation of Q*P = ∑qi*pi when the real thing we need is just ∆Q = ∆∑qi? Answer: because there is no other way. Complex networks of technologies produce economic growth by creating increasing diversity in social roles in concurrence with increasing diversity in products and their respective markets. No genius has come up, so far, with a method to add up, directly, the volume of visits to hairdressers’ salons with the volume of electric vehicles made, and all that with the volume of energy consumed.
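A toy numerical example of kicking the ‘*pi’ component out of the equation: one standard way is to revalue this year’s quantities at last year’s prices, a Laspeyres-type quantity index. Products and numbers below are invented for illustration:

```python
# Toy illustration: nominal growth mixes quantities and prices, so real growth
# is computed by revaluing year-1 quantities at year-0 prices. Invented data.
q0 = {'haircuts': 100, 'vehicles': 10}         # last year's quantities
p0 = {'haircuts': 20.0, 'vehicles': 30000.0}   # last year's prices
q1 = {'haircuts': 105, 'vehicles': 11}         # this year's quantities
p1 = {'haircuts': 25.0, 'vehicles': 31000.0}   # this year's prices

nominal_0 = sum(q0[k] * p0[k] for k in q0)     # sum of qi*pi, year 0
nominal_1 = sum(q1[k] * p1[k] for k in q1)     # sum of qi*pi, year 1
real_1 = sum(q1[k] * p0[k] for k in q1)        # year-1 quantities at year-0 prices

nominal_growth = nominal_1 / nominal_0 - 1
real_growth = real_1 / nominal_0 - 1           # the price component kicked out
print(round(nominal_growth, 3), round(real_growth, 3))  # → 0.138 0.1
```

The gap between the two numbers is exactly the price effect that accurate measurement of real growth needs to strip away.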

I have ventured far from the disciplined logic of revising my paper for resubmission. The logical flow required in this respect by Applied Energy is the following: introduction first, method and material next, theory follows, and calculations come after. The literature which I refer to in my writing needs to have two dimensions: longitudinal and lateral. Laterally, I divide other people’s publications into three basic groups: a) standpoints which I argue with, b) methods and assumptions which I agree with and use to support my own reasoning, and c) viewpoints which sort of go elsewhere, and can be interesting openings onto something quite different from what I discuss. Longitudinally, the literature I cite needs, in the first place, to open up on the main points of my paper. This is ‘Introduction’. Publications which I cite here need to point at the utility of developing the line of research which I develop. They need to convey strong, general claims which sort of set my landmarks.

The section titled ‘Theory’ is supposed to provide the fine referencing of my method, so as to both support the logic thereof, and to open up on the detailed calculations I develop in the following section. Literature which I bring forth here should contain specific developments, both factual and methodological, something like a conceptual cobweb. In other words, ‘Introduction’ should be provocative, whilst ‘Theory’ transforms provocation into a structure.

Among the recent literature I am passing in review, two papers come forth as provocative enough for me to discuss them in the introduction of my article: Andreoni 2020[1], and Koponen & Le Net 2021[2]. The first of the two, namely the paper by professor Valeria Andreoni, well in the mainstream of the MuSIASEM methodology (Multi-scale Integrated Analysis of Societal and Ecosystem Metabolism), sets an important line of theoretical debate, namely the arguable imperative to focus energy-related policies, and economic policies in general, on two outcomes: maximizing energy efficiency (i.e. maximizing the amount of real output per unit of energy consumption), and minimizing cross-sectional differences between countries as regards energy efficiency. Both postulates are based on the assumption that the energy efficiency of national economies corresponds to the metabolic efficiency of living organisms, and that maxing out on both is an objective evolutionary purpose in both cases. My method has the same general foundations as MuSIASEM: I claim that societies can be studied similarly to living organisms.

At that point, I diverge from the MuSIASEM framework: instead of focusing on the metabolism of such organically approached societies, I pay attention to their collective cognitive processes, their collective intelligence. I claim that human societies are collectively intelligent structures, which learn by experimenting with many alternative versions of themselves whilst staying structurally coherent. From that assumption, I derive two further claims. Firstly, if we reduce disparities between countries with respect to any important attribute of theirs, including energy efficiency, we kick out of the game a lot of opportunities for future learning: the ‘many alternative versions’ part of the process is no longer there. Secondly, I claim there is no such thing as an objective evolutionary purpose, be it maximizing energy efficiency or anything else. Evolution has no purpose; it just has the mechanism of selection by replication. Replication of humans is proven to happen the most favourably when we collectively learn fast and make a civilisation out of that learning.

Therefore, whilst having no objective evolutionary purpose, human societies have objective orientations: we collectively attempt to optimize some specific outcomes, which have the attribute to organize our collective learning the most efficiently, in a predictable cycle of, respectively, episodes marked with large errors in adjustment, and those displaying much smaller errors in that respect.

From that theoretical cleavage between my method and the postulates of the MuSIASEM framework, I derive two practical claims as regards economic policies, especially as regards environmentally friendly energy systems. Looking for homogeneity between countries is a road to nowhere, for one. Expecting that human societies will purposefully strive to maximize their overall energy efficiency is unrealistic a goal, and therefore it is a harmful assumption in the presence of serious challenges connected to climate change, for two. Public policies should explicitly aim for disparity of outcomes in technological race, and the race should be oriented on outcomes which are being objectively pursued by human societies.

Whilst disagreeing with professor Valeria Andreoni on principles, I find her empirical findings highly interesting. Rapid economic change, especially the kind of change associated with crises, seems to correlate with deepening disparities between countries in terms of energy efficiency. In other words, when large economic systems need to adjust hard and fast, they sort of play their individual games separately as regards energy efficiency. Rapid economic adjustment under constraint is conducive to creating a large discrepancy of alternative states in what energy efficiency can possibly be, in the context of other socio-economic outcomes, and, therefore, more material is there for learning collectively by experimenting with many alternative versions of ourselves.

Against that theoretical sketch, I place the second paper which I judge worth introducing with: Koponen, K., & Le Net, E. (2021): Towards robust renewable energy investment decisions at the territorial level. Applied Energy, 287, 116552. I chose this one because it shows a method very similar to mine: the authors build a simulative model in Excel, where they create m = 5000 alternative futures for a networked energy system aiming at optimizing 5 performance metrics. The model was based on actual empirical data as regards those variables, and the ‘alternative futures’ are, in other words, 5000 alternative states of the same system. Outcomes are gauged with so-called regret analysis, where the relative performance in a specific outcome is measured as the residual difference between its local value and, respectively, its general minimum or maximum, depending on whether the given metric is something we strive to maximize (e.g. capacity), or to minimize (e.g. GHG).
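The regret logic is simple enough to sketch in a few lines; the data below are random placeholders of my own, not the authors’ actual model in Excel:

```python
import numpy as np

# Sketch of regret analysis: for each simulated future, performance on a metric
# is its distance from the best value seen across all futures (the maximum for
# metrics to maximize, e.g. capacity; the minimum for metrics to minimize,
# e.g. GHG). Data and the equal-weight aggregation are illustrative assumptions.
rng = np.random.default_rng(4)
m = 5000                                      # alternative futures
capacity = rng.random(m)                      # metric to be maximized
ghg = rng.random(m)                           # metric to be minimized

regret_capacity = capacity.max() - capacity   # shortfall from the best future
regret_ghg = ghg - ghg.min()                  # excess over the best future

total_regret = regret_capacity + regret_ghg   # naive equal-weight aggregation
best_future = int(np.argmin(total_regret))
print(best_future, round(float(total_regret[best_future]), 3))
```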

I can generalize on the method presented by Koponen and Le Net, and assume that any given state of society can be studied as one among many alternative states of said society, and the future depends very largely on how this society will navigate through the largely uncharted waters of itself being in many alternative states. Navigators need a star in the sky, to find their North, and so do societies. Koponen and Le Net simulate civilizational navigation under the constraint of four stars, namely the cost of CO2, the cost of electricity, the cost of natural gas, and the cost of biomass. I generalize and say that experimentation with alternative versions of us being collectively intelligent can be oriented on optimizing many alternative Norths, and the kind of North we will most likely pursue is the kind which allows us to learn efficiently how to go from one alternative future to another.

Good. This is my ‘Introduction’. It sets the tone for the method I present in the subsequent section, and the method opens up on the fine details of theory.

[1] Andreoni, V. (2020). The energy metabolism of countries: Energy efficiency and use in the period that followed the global financial crisis. Energy Policy, 139, 111304.

[2] Koponen, K., & Le Net, E. (2021): Towards robust renewable energy investment decisions at the territorial level. Applied Energy, 287, 116552.

The traps of evolutionary metaphysics

I think I have moved forward in the process of revising my manuscript ‘Climbing the right hill – an evolutionary approach to the European market of electricity’, for resubmission to Applied Energy. A little digression: as I provide, each time, a link to the original form of that manuscript, my readers can compare the changes I develop in those updates with the initial flow of logic.

I like discussing important things in reverse order. I like starting from what apparently is the end and the bottom line of thinking. From there, I go forward by going back, sort of. In an article, the end is the conclusion, possibly summarized in 5 to 6 bullet points and optionally coming together with a graphical abstract. I conclude this specific piece of research by claiming that energy-oriented policies, e.g. those oriented on developing renewable sources, could gain in efficiency by being: a) national rather than continental or global, b) explicitly oriented on optimizing the country’s terms of trade in global supply chains, and c) just as explicitly oriented on the development of some specific types of jobs, whilst purposefully winding down other types thereof.

I give twofold a base for that claim. Firstly, I have that stylized general observation about energy-oriented policies: globally or continentally calibrated policies, such as, for example, the now famous Paris Climate Agreement, work so slowly and with so much friction that they become ineffective for any practical purpose, whilst country-level policies are much more efficient in the sense that one can see a real transition from point A to point B. Secondly, my own research – which I present in this article under revision – brings evidence that national social structures orient themselves on optimizing their terms of trade and their job markets in priority, whilst treating energy-related issues as instrumental. That specific collective orientation seems, in turn, to have its source in the capacity of human social structures to develop a strongly cyclical, predictable pattern of collective learning precisely in relation to the terms of trade and the job market, whilst collective learning oriented on other measurable variables, inclusive of those pertinent to energy management, is much less predictable.

That general conclusion is based on quantitative results of my empirical research, which brings forth 4 quantitative variables – price index in exports (PL_X), average hours worked per person per year (AVH), the share of labour compensation in Gross National Income (LABSH), and the coefficient of human capital (HC – average years of schooling per person) – out of a total scope of 49 observables, as somehow privileged collective outcomes marked with precisely that recurrent, predictable pattern of learning.

The privileged position of those specific variables, against the others, manifests theoretically as their capacity to produce simulated social realities much more similar to the empirically observable state thereof than simulated realities based on other variables, whilst producing a strongly cyclical series of local residual errors in approximating said empirically observable state.

The method which produced those results generates simulated social realities with the use of artificial neural networks. Two types of networks are used to generate two types of simulation. One is a neural network which optimizes a specific empirical variable as its output, whilst using the remaining empirical variables as instrumental input. I call that network the ‘procedure of learning by orientation’. The other type of network uses the same empirical variable as its optimizable output, yet replaces the vector of other empirical variables with a vector of hypothetical probabilities, corresponding to just as hypothetical social roles, in the presence of a random disturbance factor. I call that network the ‘procedure of learning by pacing’.

The procedure of learning by orientation produces as many alternative sets of numerical values as there are variables in the original empirical dataset X used in research. In this case, it was a set made of n = 49 variables, and thus 49 alternative sets Si are created. Each alternative set Si consists of values transformed by the corresponding neural network from the original empirical ones. Both the original dataset X and the n = 49 transformations Si thereof can be characterized, mathematically, with their respective vectors of mean expected values. Euclidean distances between those vectors are informative about the mathematical similarity between the corresponding sets.

Therefore, learning by orientation produces n = 49 simulations Si of the actual social reality represented in the set X, where each such simulation is biased towards optimizing one particular variable ‘i’ from among the n = 49 variables studied, and each such simulation displays a measurable Euclidean similarity to the actual social reality studied. My own experience in applying this specific procedure is that a few simulations Si, namely those pegged on optimizing four variables – price index in exports [Si(PL_X)], average hours worked per person per year [Si(AVH)], the share of labour compensation in Gross National Income [Si(LABSH)], and the coefficient of human capital [Si(HC) – average years of schooling per person] – display a much closer Euclidean distance to the actual reality X than any other simulation. Much closer means closer by orders of magnitude, by the way. The difference is visible.
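To materialize the procedure of learning by orientation, here is a toy sketch in Python. It is not the exact network I use in the article – the one-neuron architecture, the learning rate and the random dataset are all hypothetical, chosen for the sake of illustration – but it shows the mechanics: one variable becomes the optimized output, the rest become instrumental input, and the resulting simulation Si is compared to the original set X through the Euclidean distance between their vectors of mean expected values.

```python
import numpy as np

rng = np.random.default_rng(42)

def orientation_distance(X, i, epochs=100, lr=0.05):
    """Train a toy one-neuron network to optimize variable i from the
    remaining variables, then return the Euclidean distance between the
    mean vector of the simulation S_i and that of the original set X."""
    Z = X / X.max(axis=0)                       # standardize over maximums
    y = Z[:, i]                                 # the optimized output
    inputs = np.delete(Z, i, axis=1)            # instrumental input
    w = rng.uniform(size=inputs.shape[1])
    for _ in range(epochs):
        h = 1.0 / (1.0 + np.exp(-inputs @ w))   # sigmoid neural activation
        e = y - h                               # residual error
        w += lr * inputs.T @ (e * h * (1 - h))  # gradient step on the weights
    S = Z.copy()
    S[:, i] = h                                 # the transformed, 'oriented' variable
    return np.linalg.norm(S.mean(axis=0) - Z.mean(axis=0))

# hypothetical dataset: 30 observations of 5 variables (the article uses n = 49)
X = rng.uniform(1.0, 10.0, size=(30, 5))
dists = [orientation_distance(X, i) for i in range(X.shape[1])]
```

Ranking the values in `dists` from smallest to largest is, in essence, how the privileged variables such as PL_X or AVH reveal themselves in the full dataset.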

The procedure of learning by pacing produces n = 49 simulations as well, yet these simulations are not exactly transformations of the original dataset X. In this case, simulated realities are strictly simulated, i.e. they are hypothetical states from the very beginning, and individual variables from the set X serve as the basis for setting a trajectory of transformation for those hypothetical states. Each such hypothetical state is a matrix of probabilities, associated with two sets of social roles: active and dormant. Active social roles are being endorsed by individuals in that hypothetical society and their baseline, initial probabilities are random, non-null values. Dormant social roles are something like a formalized prospect for immediate social change, and their initial probabilities are null.

This specific neural network produces new hypothetical states in two concurrent ways: typical neural activation, and random disturbance. In the logical structure of the network, random disturbance occurs before neural activation, and thus I describe the former first. Random disturbance is a hypothetical variable, separate from the probabilities associated with social roles. It is a random value 0 < d < 1, associated with a threshold of significance d*. When d > d*, d becomes something like an additional error, fed forward into the network, i.e. impacting the next experimental round performed in the process of learning.

In the procedure of learning by pacing, neural activation is triggered by aggregating the partial probabilities associated with social roles, possibly pre-modified by the random disturbance, through a weighted average of the type ∑ fj(pi, X(i,j), dj, ej-1), where fj is the function of neural activation in the j-th experimental round of learning, pi is the probability associated with the i-th social role, X(i,j) is the random weight of pi in the j-th experimental round, dj stands for the random disturbance specific to that experimental round, and ej-1 is the residual error fed forward from the previous experimental round j-1.
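A toy sketch of one experimental round of pacing could look as follows in Python. The threshold d* = 0.8, the count of 10 active and 5 dormant roles, and the assumption that the residual error is computed against a target of 1 are all mine, for the sake of illustration, not parameters taken from the article.

```python
import numpy as np

rng = np.random.default_rng(7)

def pacing_round(p, e_prev, d_star=0.8):
    """One experimental round j: aggregate role probabilities p through a
    randomly weighted average, add the random disturbance d only when it
    crosses the significance threshold d*, feed the previous round's
    residual error forward, then apply sigmoid neural activation."""
    X_ij = rng.uniform(size=p.shape)            # random weights X(i,j)
    d = rng.uniform()                           # random disturbance 0 < d < 1
    shock = d if d > d_star else 0.0            # only significant shocks pass
    s = np.average(p, weights=X_ij) + shock + e_prev
    return 1.0 / (1.0 + np.exp(-s))             # activation f_j

# active roles start with random non-null probabilities, dormant roles with null ones
p = np.concatenate([rng.uniform(size=10), np.zeros(5)])
e = 0.0
outputs = []
for j in range(50):
    h = pacing_round(p, e)
    e = 1.0 - h                                 # residual error fed into round j+1
    outputs.append(h)
```

Plotting `outputs` against the round number j is where the cyclical, large-amplitude pattern of residual error, which I discuss below, would show up.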

Now, just to be clear: there is a mathematical difference, in that logical structure, between the random disturbance dj and the random weight X(i,j). The former is specific to a given experimental round, but general across all the component probabilities in that round. If you want, dj is like an earthquake, momentarily shaking the entire network, and is supposed to represent the fact that social reality just never behaves as planned. This is the grain of chaos in that mathematical order. On the other hand, X(i,j) is a strictly formal factor in the neural activation function, and its only job is to allow experimenting with data.

Wrapping it partially up, the whole method I use in this article revolves around the working hypothesis that a given set of empirical data, which I am working with, represents collectively intelligent learning, where human social structures collectively experiment with many alternative versions of themselves and select those versions which offer the most desirable states in a few specific variables. I call these variables ‘collective orientations’ and I further develop that hypothesis by claiming that collective orientations have that privileged position because they allow a specific type of collective learning, strongly cyclical, with large amplitude of residual error.

In both procedures of learning, i.e. in orientation, and in pacing, I introduce an additional component, namely that of self-observed internal coherence. The basic idea is that a social structure is a structure because the functional connections between categories of phenomena are partly independent from the exact local content of those categories. People remain in predictable functional connections to their Tesla cars, whatever exact person and exact car we are talking about. In my method, and, as a matter of fact, in any quantitative method, variables are phenomenological categories, whilst the local values of those variables inform about the local content to find in respective categories. My idea is that mathematical distance between values represents temporary coherence between the phenomenological categories behind the corresponding variables. I use the Euclidean distance of the type E = [(a – b)²]^0.5 as the most elementary measure of mathematical distance. The exact calculation I do is the average Euclidean distance that each i-th variable in the set of n variables keeps from each l-th variable among the remaining k = n – 1 variables, in the same experimental round j. Mathematically, it goes like: avgE = { ∑ [(xi – xl)²]^0.5 }/k. When I use avgE as internally generated input in a neural network, I use the information about internal coherence as meta-data in the process of learning.
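In code, that average Euclidean distance avgE, for one experimental round, is as simple as this (the four standardized values are made up for the example):

```python
def avg_euclidean(x, i):
    """avgE for the i-th variable: the average Euclidean distance
    { sum over l of [(x_i - x_l)^2]^0.5 } / k, with k = n - 1."""
    n = len(x)
    return sum(((x[i] - x[l]) ** 2) ** 0.5 for l in range(n) if l != i) / (n - 1)

# one experimental round: standardized values of n = 4 hypothetical variables
x = [0.9, 0.4, 0.7, 0.1]
coherence = [avg_euclidean(x, i) for i in range(len(x))]
```

The vector `coherence` is what gets fed back into the network as internally generated, meta-data input.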

Of course, someone could ask what the point is of measuring local Euclidean distance between, for example, annual consumption of energy per capita and the average number of hours worked annually per capita, thus between kilograms of oil equivalent and hours. Isn’t it measuring the distance between apples and oranges? Well, yes, it is, and when you run a grocery store, knowing the coherence between your apples and your oranges can come in handy, for one. In a neural network, variables are standardized, usually over their respective maximums, and therefore both apples and oranges are measured on the same scale, for two.

The method needs to be rooted in theory, which has two levels: general and subject-specific. At the general level, I need an acceptably solid theoretical basis for applying the working hypothesis, as phrased in the preceding paragraphs, to any given set of empirical, socio-economic data. Subject-specific theory is supposed to allow interpreting the results of empirical research conducted according to the above-discussed method.

General theory revolves around four core concepts, namely those of: intelligent structure, chain of states, collective orientation, and social roles as mirroring phenomena for quantitative socio-economic variables. Subject-specific theory, on the other hand, is pertinent to the general issue of energy-related policies, and to their currently most tangible manifestation, i.e., to environmentally friendly sources of energy.

The theoretical concept of intelligent structure, such as I use it in my research, is mostly based on another concept, known from evolutionary biology, namely that of adaptive walk in rugged landscape, combined with the phenomenon of tacit coordination. We, humans, do things together without being fully aware we are doing them together, or even whilst thinking we oppose each other (e.g. Kuroda & Kameda 2019[1]). I assume that this tacit coordination gives a society the capacity for social evolutionary tinkering (Jacob 1977[2]), such that the given society displays social change akin to an adaptive walk in rugged landscape (Kauffman & Levin 1987[3]; Kauffman 1993[4]; Nahum et al. 2015[5]).

Each distinct state of the given society (e.g. different countries in the same time or different moments in time as regards the same country) is interpreted as a vector of observable properties, and each empirical instance of that vector is a 1-mutation-neighbour to at least one other instance. All the instances form a space of social entities. In the presence of external stressor, each such mutation (each entity) displays a given fitness to achieve the optimal state, regarding the stressor in question, and therefore the whole set of social entities yields a complex vector of fitness to cope with the stressor.
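The notion of 1-mutation-neighbours can be illustrated with a few lines of Python: if we represent social states as vectors of, say, 4 binary traits, each state has exactly 4 neighbours reachable through a single mutation. Binary traits are, of course, a simplification of the real-valued observables I actually work with, introduced here just to make the combinatorics visible.

```python
from itertools import product

def is_one_mutation_neighbour(a, b):
    """Two states are 1-mutation-neighbours when they differ
    in exactly one observable property."""
    return sum(x != y for x, y in zip(a, b)) == 1

# the space of social entities: all 2^4 = 16 states of 4 binary traits
states = list(product([0, 1], repeat=4))
origin = (0, 0, 0, 0)
neighbours = [s for s in states if is_one_mutation_neighbour(origin, s)]
```

Attaching a fitness value to each element of `states` would turn this space into the rugged landscape over which the adaptive walk takes place.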

The assumption of collective intelligence means that each social entity is able to observe itself as well as other entities, so as to produce social adaptation for achieving optimal fitness. Social change is an adaptive walk, i.e. a set of local experiments, observable to each other and able to learn from each other’s observed fitness. The resulting path of social change is by definition uneven, whence the expression ‘adaptive walk in rugged landscape’. There is a strong argument that such adaptive walks occur at a pace proportional to the complexity of social entities involved. The greater the number of characteristics involved, the greater the number of epistatic interactions between them, and the more experiments it takes to have everything more or less aligned for coping with a stressor.

Somehow concurrently to the evolutionary theory, another angle of approach seems interesting, for solidifying theoretical grounds to my method: the swarm theory (e.g. Wood & Thompson 2021[6]; Li et al. 2021[7]). Swarm learning consists in shifting between different levels of behavioural coupling between individuals. When we know for sure we have everything nicely figured out, we coordinate, between individuals, by fixed rituals or by strongly correlated mutual reaction. As we have more and more doubts whether the reality which we think we are so well adapted to is the reality actually out there, we start loosening the bonds of behavioural coupling, passing through weakening correlation, and all the way up to random concurrence. That unbundling of social coordination allows incorporating new behavioural patterns into individual social roles, and then learning how to coordinate as regards that new stuff.   

As the concept of intelligent structure seems to have a decent theoretical base, the next question is: how the hell can I represent it mathematically? I guess that a structure is a set of connections inside a complex state, where complexity is a collection of different variables. I think that the best mathematical construct which fits that bill is that of imperfect Markov chains (e.g. Berghout & Verbitskiy 2021[8]): there is a state of reality Xn = {x1, x2, …, xn}, which we cannot observe directly, whilst there is a set of observables {Yn} such that Yn = π(Xn), the π being a coding map of Xn. We can observe Xn only through the lens of Yn. That quite contemporary theory by Berghout and Verbitskiy refers back to an older one, namely to the theory of g-measures (e.g. Keane 1972[9]), and all that falls into the even broader category of ergodic theory, which is the theory of what happens to complex systems when they are allowed to run for a long time. Yes, when we wonder what kind of adult our kids will grow up into, this is ergodic theory.
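A minimal numerical illustration of that imperfect observability, with entirely made-up numbers: a hidden state Xn hops between three states according to a Markov transition matrix, whilst the coding map π collapses two of those states into one observable category, so the lens Yn never shows the full picture.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical transition matrix of the hidden chain X_n over 3 states
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.6, 0.2],
              [0.0, 0.3, 0.7]])

# coding map pi: hidden states 1 and 2 are indistinguishable to the observer
pi_map = {0: "low", 1: "mid", 2: "mid"}

x = 0
observables = []                       # the sequence Y_n = pi(X_n)
for _ in range(1000):
    x = rng.choice(3, p=P[x])          # the hidden chain moves
    observables.append(pi_map[x])      # we only record its coded image
```

Socio-economic variables, in my working hypothesis, are precisely such coded images of a richer, directly unobservable social state.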

The adaptive walk of a human society in the rugged landscape of whatever challenges they face can be represented as a mathematical chain of complex states, and each such state is essentially a matrix: numbers in a structure. In the context of intelligent structures and their adaptive walks, it can be hypothesized that ergodic changes in the long-going, complex stuff about what humans do together happen with a pattern and are far from being random. There is a currently ongoing, conceptual carryover from biology to social sciences, under the general concept of evolutionary trajectory (Turchin et al. 2018[10]; Shafique et al. 2020[11]). That concept of evolutionary trajectory can be combined with the idea that our human culture pays particular attention to phenomena which make valuable outcomes, such as presented, for example, in the Interface Theory of Perception (Hoffman et al. 2015[12], Fields et al. 2018[13]). Those two theories taken together allow hypothesising that, as we collectively learn by experimenting with many alternative versions of our societies, we systematically privilege those particular experiments where specific social outcomes are being optimized. In other words, we can have objectively existing, collective ethical values and collective praxeological goals, without even knowing we pursue them.

The last field of general theory I need to ground in literature is the idea of representing the state of a society as a vector of probabilities associated with social roles. This is probably the wobbliest theoretical boat among all those which I want to have some footing in. Yes, social sciences have developed that strong intuition that humans in society form and endorse social roles, which allows productive coordination. As Max Weber wrote in his book ‘Economy and Society’: “But for the subjective interpretation of action in sociological work these collectivities must be treated as solely the resultants and modes of organization of the particular acts of individual persons, since these alone can be treated as agents in a course of subjectively understandable action”. The same intuition can be found in Talcott Parsons’ ‘Social System’, e.g. in Chapter VI, titled ‘The Learning of Social Role-Expectations and the Mechanisms of Socialization of Motivation’: “An established state of a social system is a process of complementary interaction of two or more individual actors in which each conforms with the expectations of the other(s) in such a way that alter’s reactions to ego’s actions are positive sanctions which serve to reinforce his given need-dispositions and thus to fulfill his given expectations. This stabilized or equilibrated interaction process is the fundamental point of reference for all dynamic motivational analysis of social process. […] Every society then has the mechanisms which have been called situational specifications of role-orientations and which operate through secondary identifications and imitation. Through them are learned the specific role-values and symbol-systems of that particular society or sub-system of it, the level of expectations which are to be concretely implemented in action in the actual role”.

Those theoretical foundations laid, the further we go, the more emotions awaken as the concept of social role gets included in scientific research. I have encountered views (e.g. Schneider & Bos 2019[14]) that social roles, whilst being real, are a mechanism of oppression rather than social development. On the other hand, it can be assumed that in the presence of demographic growth, when each consecutive generation brings a greater number of people than the previous one, we need new social roles. That, in turn, allows developing new technologies, instrumental to performing these roles (e.g. Gil-Hernández et al. 2017[15]).

Now, I pass to the subject-specific, theoretical background of my method. I think that the closest cousin to my method, which I can find in recently published literature, is the MuSIASEM framework, where the acronym, deliberately weird, I guess, stands for ‘Multi-scale Integrated Analysis of Societal and Ecosystem Metabolism’. This is a whole stream of research, where human societies are studied as giant organisms, and the ways we, humans, make and use energy, is studied as a metabolic function of those giant bodies. The central assumption of the MuSIASEM methodology is that metabolic systems survive and evolve by maxing out on energy efficiency. The best metabolism for an economic system is the most energy-efficient one, which means the greatest possible amount of real output per unit of energy consumption. In terms of practical metrics, we talk about GDP per kg of oil equivalent in energy, or, conversely, about the kilograms of oil equivalent needed to produce one unit (e.g. $1 bln) of GDP. You can consult Andreoni 2020[16], Al-Tamimi & Al-Ghamdi 2020[17] or Velasco-Fernández et al. 2020[18], as some of the most recent examples of MuSIASEM being applied in empirical research.

This approach is strongly evolutionary. It assumes that any given human society can be in many different, achievable states, each state displaying a different energy efficiency. The specific state which yields the most real output per unit of energy consumed is the most efficient metabolism available to that society at the moment, and, logically, should be the short-term evolutionary target. Here, I dare to disagree fundamentally. In nature, there is no such thing as an evolutionary target. Evolution happens by successful replication. The catalogue of living organisms which we have around, today, consists of those which temporarily are the best at replicating themselves, and not necessarily those endowed with the greatest metabolic efficiency. There are many examples of species which, whilst being wonders of nature in terms of biological efficiency, are either endemic or extinct. Feline predators, such as the jaguar or the mountain lion, are wonderfully efficient in biomechanical terms, which translates into their capacity to use energy efficiently. Yet, their capacity to take over available habitats is not really an evolutionary success.

In biological terms, metabolic processes are a balance of flows rather than intelligent strive for maximum efficiency. As Niebel et al. (2019[19]) explain it: ‘The principles governing cellular metabolic operation are poorly understood. Because diverse organisms show similar metabolic flux patterns, we hypothesized that a fundamental thermodynamic constraint might shape cellular metabolism. Here, we develop a constraint-based model for Saccharomyces cerevisiae with a comprehensive description of biochemical thermodynamics including a Gibbs energy balance. Non-linear regression analyses of quantitative metabolome and physiology data reveal the existence of an upper rate limit for cellular Gibbs energy dissipation. By applying this limit in flux balance analyses with growth maximization as the objective function, our model correctly predicts the physiology and intracellular metabolic fluxes for different glucose uptake rates as well as the maximal growth rate. We find that cells arrange their intracellular metabolic fluxes in such a way that, with increasing glucose uptake rates, they can accomplish optimal growth rates but stay below the critical rate limit on Gibbs energy dissipation. Once all possibilities for intracellular flux redistribution are exhausted, cells reach their maximal growth rate. This principle also holds for Escherichia coli and different carbon sources. Our work proposes that metabolic reaction stoichiometry, a limit on the cellular Gibbs energy dissipation rate, and the objective of growth maximization shape metabolism across organisms and conditions’.  

Therefore, if we translate the principles of biological metabolism into those of economics and energy management, the energy-efficiency of any given society is a temporary balance achieved under constraint. Whilst those states of society which clearly favour excessive dissipation of energy are not tolerable on the long run, energy efficiency is a by-product of the strive to survive and replicate, rather than an optimizable target state. Human societies are far from being optimally energy efficient for the simple reason that we have plenty of energy around, and, with the advent of renewable sources, we have even less constraint to optimize energy-efficiency.

We, humans, survive and thrive by doing things together. The kind of efficiency that allows maxing out on our own replication is efficiency in coordination. This is why we have all that stuff of social roles, markets, institutions, laws and whatnot. These are our evolutionary orientations, because we can see immediate results thereof in terms of new humans being around. A stable legal system, with a solid centre of political power in the middle of it, is a well-tested way of minimizing human losses due to haphazard violence. Once a society achieves that state, it can even move from place to place, as local resources get depleted.

I think I have just nailed down one of my core theoretical contentions. The originality of my method is that it allows studying social change as collectively intelligent learning, whilst remaining very open as for what this learning is exactly about. My method is essentially evolutionary, whilst avoiding the traps of evolutionary metaphysics, such as hypothetical evolutionary targets. I can present my method and my findings as a constructive theoretical polemic with the MuSIASEM framework.

[1] Kuroda, K., & Kameda, T. (2019). You watch my back, I’ll watch yours: Emergence of collective risk monitoring through tacit coordination in human social foraging. Evolution and Human Behavior, 40(5), 427-435.

[2] Jacob, F. (1977). Evolution and tinkering. Science, 196(4295), 1161-1166

[3] Kauffman, S., & Levin, S. (1987). Towards a general theory of adaptive walks on rugged landscapes. Journal of theoretical Biology, 128(1), 11-45

[4] Kauffman, S. A. (1993). The origins of order: Self-organization and selection in evolution. Oxford University Press, USA

[5] Nahum, J. R., Godfrey-Smith, P., Harding, B. N., Marcus, J. H., Carlson-Stevermer, J., & Kerr, B. (2015). A tortoise–hare pattern seen in adapting structured and unstructured populations suggests a rugged fitness landscape in bacteria. Proceedings of the National Academy of Sciences, 112(24), 7530-7535.

[6] Wood, M. A., & Thompson, C. (2021). Crime prevention, swarm intelligence and stigmergy: Understanding the mechanisms of social media-facilitated community crime prevention. The British Journal of Criminology, 61(2), 414-433.

[7] Li, M., Porter, A. L., Suominen, A., Burmaoglu, S., & Carley, S. (2021). An exploratory perspective to measure the emergence degree for a specific technology based on the philosophy of swarm intelligence. Technological Forecasting and Social Change, 166, 120621.

[8] Berghout, S., & Verbitskiy, E. (2021). On regularity of functions of Markov chains. Stochastic Processes and their Applications, 134, 29-54.

[9] Keane, M. (1972). Strongly mixing g-measures. Inventiones mathematicae, 16(4), 309-324.

[10] Turchin, P., Currie, T. E., Whitehouse, H., François, P., Feeney, K., Mullins, D., … & Spencer, C. (2018). Quantitative historical analysis uncovers a single dimension of complexity that structures global variation in human social organization. Proceedings of the National Academy of Sciences, 115(2), E144-E151.

[11] Shafique, L., Ihsan, A., & Liu, Q. (2020). Evolutionary trajectory for the emergence of novel coronavirus SARS-CoV-2. Pathogens, 9(3), 240.

[12] Hoffman, D. D., Singh, M., & Prakash, C. (2015). The interface theory of perception. Psychonomic bulletin & review, 22(6), 1480-1506.

[13] Fields, C., Hoffman, D. D., Prakash, C., & Singh, M. (2018). Conscious agent networks: Formal analysis and application to cognition. Cognitive Systems Research, 47, 186-213.

[14] Schneider, M. C., & Bos, A. L. (2019). The application of social role theory to the study of gender in politics. Political Psychology, 40, 173-213.

[15] Gil-Hernández, C. J., Marqués-Perales, I., & Fachelli, S. (2017). Intergenerational social mobility in Spain between 1956 and 2011: The role of educational expansion and economic modernisation in a late industrialised country. Research in social stratification and mobility, 51, 14-27.

[16] Andreoni, V. (2020). The energy metabolism of countries: Energy efficiency and use in the period that followed the global financial crisis. Energy Policy, 139, 111304.

[17] Al-Tamimi & Al-Ghamdi (2020). Multiscale integrated analysis of societal and ecosystem metabolism of Qatar. Energy Reports, 6, 521-527.

[18] Velasco-Fernández, R., Pérez-Sánchez, L., Chen, L., & Giampietro, M. (2020), A becoming China and the assisted maturity of the EU: Assessing the factors determining their energy metabolic patterns. Energy Strategy Reviews, 32, 100562.

[19] Niebel, B., Leupold, S., & Heinemann, M. (2019). An upper limit on Gibbs energy dissipation governs cellular metabolism. Nature Metabolism, 1, 125-132.