I return to the project which I started in the spring of this year (i.e. 2020), and which I had put aside to some extent: the book I want to write on the role and function of cities in our civilization, including the changes which we, city slickers, can expect in the foreseeable future. As I think about it now, I guess I had to digest intellectually both my essential method of research for that book, and the core empirical findings which I want to connect to. The method consists in studying human civilization as collective intelligence, thus as a collection of intelligent structures able to learn by experimenting with many alternative versions of themselves. Culture, laws and institutions, technologies: I consider all those anthropological categories as cognitive constructs, which we have developed over centuries to study our own collective intelligence, whilst being de facto parts thereof.
Collective intelligence, in that perspective, is an overarching conceptual frame, and as overarching frames frequently do, the concept risks becoming a cliché. The remedy I want and intend to use is mathematics. I want to write the book as a collection of conceptual developments and in-depth empirical insights into hypotheses previously formulated with the help of a mathematical model. This is, I think, a major original feature of my method. In social sciences, we tend to go the other way around: we formulate hypotheses by sort of freestyling intellectually, and then we check them with mathematical models. I start with just a little bit of intellectual freestyling, then I formulate my assumptions mathematically, and I use the mathematical model which results from those assumptions to formulate hypotheses for further research.
I adopt such a strongly mathematical method because we have a whole class of mathematical models which seem to fit the bill perfectly: artificial neural networks. Yes, I consider artificial neural networks as mathematical models in the first place, and only then as algorithms. The mathematical theory I associate artificial neural networks most closely with is that of state space, combined with the related theory of Markov chains. In other words, whatever happens, I attempt to represent it as a matrix of values, which is being transformed into another matrix of values. The artificial neural network I use for that representation reflects both the structure of the matrix in question, and the mechanism of transformation, which, by the way, is commonly called a σ-algebra. By ‘commonly’ I mean commonly in mathematics.
My deep intuition – ‘deep’ means that I understand that intuition only partly – is that artificial neural networks are the best mathematical representation of collective intelligence we can get for now. Therefore I use them as a mathematical model, and here comes a big difference between the way I use them and the way a typical programmer does. Programmers of artificial intelligence are, as far as I know (my son is a programmer, and, yes, sometimes we speak human lingo to each other), absolutely at home with considering artificial neural networks as black boxes, i.e. as something that does something, without our really needing to understand what exactly that thing is which neural networks do, as long as those networks are accurate and quick in whatever they do.
In my methodological world, I adopt a completely different stance. I care most of all about understanding very specifically what the neural network is doing, and I draw my conclusions from the way it does things. I don’t need the neural network I use to be super-fast or super-accurate: I need to understand how it does whatever it does.
I use two types of neural networks in that spirit, both 100% handmade. The first one serves me to identify the direction a social system (collective intelligence) follows in its collective learning. You can see an application in this draft paper of mine, titled ‘Climbing the right hill’. The fundamental logic of that network is to take an empirical dataset and use the neural network to produce as many alternative transformations of that dataset as there are variables in it. Each transformation takes a different variable from the empirical dataset as its desired output (i.e. it optimizes all the other variables as instrumental input). I measure the Euclidean similarity (Euclidean distance) between each individual transformation and the source dataset. I assume that the transformation which falls the closest to the source empirical data is the best representation of the collective intelligence captured in that data. Thus, at the end of the day, this specific type of neural network serves me to discover what we are really after, as a society.
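Just to fix ideas, here is a minimal Python sketch of that logic. It is a sketch under my own assumptions rather than a copy of the actual implementation behind ‘Climbing the right hill’: the function names (transform_dataset, closest_clone), the hyperbolic-tangent activation, and the way the transformed values are recorded are my shorthand for illustration only.

```python
import numpy as np

def transform_dataset(X, target_col, rounds=3000, seed=None):
    """One 'clone' of the dataset: the network takes the variable in column
    target_col as its desired output and treats all the other variables as
    instrumental input. A simplified stand-in for the procedure described
    above, not the exact original implementation."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    inputs = np.delete(X, target_col, axis=1)     # instrumental variables
    target = X[:, target_col]                     # desired output
    transformed = X.copy()
    error = 0.0
    for j in range(rounds):
        i = j % n_rows                            # loop over observations
        eps = rng.uniform(0, 1, inputs.shape[1])  # random experimentation
        h = np.sum(eps * (inputs[i] + error))     # aggregation, past error fed forward
        na = np.tanh(h)                           # neural activation
        error = target[i] - na                    # error to feed forward
        transformed[i, target_col] = na           # record the network's guess of the output
    return transformed

def closest_clone(X):
    """Euclidean distance between each transformation and the source data;
    the smallest distance marks the variable the data seems to be 'after'."""
    distances = [np.linalg.norm(transform_dataset(X, col, seed=col) - X)
                 for col in range(X.shape[1])]
    return int(np.argmin(distances)), distances

# toy usage: 100 observations of 5 standardized socio-economic variables
X = np.random.default_rng(0).uniform(0, 1, (100, 5))
best, distances = closest_clone(X)
print(best, [round(d, 3) for d in distances])
```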
The second type of network is built as a matrix of probabilities, modified by a quasi-random factor of disturbance. I am tempted to say that this network attempts to emulate coincidence and quasi-randomness of events. I made it and I keep using it as pure simulation: there is no empirical data which the network learns on. It starts with a first, controlled vector of probabilities, and then it transforms that vector in a finite number of experimental iterations (usually I make that network perform 3000 experimental rounds). In the first application I made of that network, probabilities correspond to social roles, and more specifically to the likelihood that a random person in the society studied endorses the given social role (see ‘The perfectly dumb, smart social structure’). At a deeper and, at the same time, more general level, I assume that probability as such is a structural variable of observable reality. A network which simulates changes in a vector of probabilities simulates change in the structure of events.
Long story short, I have two neural networks for making precise hypotheses: one uncovers orientations and pursued values in sets of socio-economic data, whilst the other simulates structural change in compound probabilities attached to specific phenomena. When I put that lot to real computational work, two essential conclusions emerge, sort of across the board, whatever empirical problem I am currently treating. Firstly, all big sets of empirical socio-economic data are after something specific. I mean, when I take the first of those two networks, the one that clones an empirical dataset into as many transformations as there are variables, a few of those transformations, like one-third of them, are much closer to the original, in Euclidean terms, than all the rest. When I say closer, it is several times closer. Secondly, vectors of probabilities are tenacious and resilient. When I take the second of those networks, the one which prods vectors of probabilities with quasi-random disturbances, those probabilities tend to resist. Even if, in some 100 experimental rounds, some of those probabilities get kicked out of the system, i.e. their values descend to 0, they reappear a few hundred experimental rounds later, as if by magic. Those probabilities can be progressively driven down if the factor of disturbance, which I include in the network, consists in quasi-randomly dropping new events into the game. The phenomenological structure of reality seems to be something very stable, once set in place, however simple I make that reality a priori. It yields to increasing complexity (new phenomena, with their probabilities, coming into the game) rather than to arbitrary reduction of the pre-set phenomena.
I generalize those observations. A collective intelligence, i.e. an intelligent social structure able to learn by experimenting with many alternative versions of itself, can stay coherent in that experimentation, and it seems to stay coherent precisely because it pursues very clear collective outcomes. I am even tempted to reframe it as a condition: a human social structure can evolve as a collectively intelligent structure under the condition of having very clear collectively pursued values. If it doesn’t, it is doomed to disintegrate and to be replaced by another collectively intelligent social structure, which, in turn, is sufficiently oriented to stay internally coherent whilst experimenting with itself. As I descend to the level of human behaviour, observed as the probability of an average individual endorsing specific patterns of behaviour, those behavioural patterns are resilient to exogenous destruction, and, at the same time, quite malleable when new patterns emerge and start to compete with the old ones. When a culture starts from a point A, defined as a set of social roles and behavioural patterns with assorted probabilities of happening, that point A needs a bloody long time, or, in other words, a bloody big lot of collectively intelligent experimentation, to vanish completely.
Now, I want to narrow down the scope of hypotheses I intend to formulate, by specifying the basic empirical findings which I have made so far, and which make the foundations of my research on cities. The first empirical finding does not come from me, but from the CIESIN centre at Columbia University, and it is both simple and mind-blowing: however the formal boundaries of urban areas are being redefined by local governments, the total surface of urban areas, defined as abnormally dense agglomerations of man-made structures and night-time lights, seems to have been constant over the last 30 years, maybe even longer. In other words, whilst we have a commonly shared impression that cities grow, they seem to be growing only at the expense of other cities. You can check those numbers via the stats available with the World Bank (https://data.worldbank.org/indicator/AG.LND.TOTL.UR.K2 ). As you will be surfing with the World Bank, you can also look up another metric, the total surface of agricultural land on the planet (https://data.worldbank.org/indicator/AG.LND.AGRI.K2 ), and you will see that it has been growing, by hiccups, since 1960, i.e. since that statistic has been collected.
To complete the picture, you can check the percentage of urban population in the total human population on the planet (https://data.worldbank.org/indicator/SP.URB.TOTL.IN.ZS ) and you will see that we have been becoming more and more urban, and right now, we are prevalently urban. Long story short, there are more and more urban humans, who apparently live in a constant urban space, and feed themselves out of a growing area of agricultural land. At the end of the day, cities seem to become increasingly different from the countryside, as regards the density of population: urban populations on Earth are becoming systematically more dense than rural ones.
I am cross-breeding my general observations about what my two neural networks tend to do with those main empirical findings about cities, and I am trying to formulate precise hypotheses for further research. Hypothesis #1: cities are purposeful demographic anomalies, with a clear orientation on optimizing specific social outcomes. Hypothesis #2: if and to the extent that the purpose of cities is to create new social roles, through intense social interaction in a limited physical space, the creation of new social roles involves their long coexistence with older social roles, and, therefore, the resulting growth in social complexity is exponential. Hypothesis #3: the COVID-19 pandemic, as an exogenous factor of disturbance, is likely to impact us in three possible ways: a) it can temporarily make some social roles disappear, b) in the long run, it is likely to increase social complexity, i.e. to make us create a whole new set of social roles, and c) it can change the fundamental orientation (i.e. the pursued collective values) of cities as demographic anomalies.
In your spare time you can also watch this video I made a few weeks ago: ‘Urban Economics and City Management #1 Lockdowns in pandemic and the role of cities’: https://youtu.be/fYIz_6JVVZk . It recounts and restates my starting point in this path of research. I browse through the main threads of connection between the COVID-19 pandemic and the civilisational role of cities. The virus, which just loves densely populated places, makes us question the patterns of urban life, and makes us ask questions about the future of cities.
I am returning to strictly written blogging, after a long break, which I devoted to preparing educational material for the upcoming winter semester 2020/2021. At the same time, I am outlining a line of research which I can build my teaching around. Something looms, and that something is my old obsession: the collective intelligence of our human societies and its connection to artificial intelligence. Well, when I say ‘old’, it means ‘slightly seasoned’. I mean, I have been nurturing that obsession for a total of like 4 years, and it has been walking around and talking for the last 18 months or so. It is not truly old, even if ideas aged like red wine. Anyway, the current shade I paint into that obsession of mine is that human societies have a built-in mechanism of creating new social roles for new humans coming in, in the presence of demographic growth. Cities are very largely factories of social roles, in my view. Close, intense social interactions in a limited space are a mechanism of accelerated collective learning, whence accelerated formation of new skillsets, and all those new skillsets need is an opportunity to earn a living, and they turn into social roles.
I have a deep feeling that digital platforms, ranging from the early-hominid-style things like Twitter, all the way up to working and studying via MS Teams or Zoom, have developed as another accelerator of social roles. This accelerator works differently. It is essentially spaceless, although, on the large scale, it is very energy consuming at the level of server power. Still, early cities used to shape new social roles through the skilled labour they required to be built and expanded. A substantial part of whatever we think we know about mathematics and physics comes from geometry, which, in turn, comes from architecture and early machine-building. Similarly, digital platforms make new social roles by stimulating the formation of new skillsets required to develop those platforms, and then to keep them running.
Crazy thoughts come to my mind. What if we, humans, are truly able to think ahead, like really ahead, many generations ahead? What if by the mid-20th century we collectively told ourselves: ‘Look, guys. We mean, us. Cities are great, but there is more and more of us around, all that lot needs food, and food needs agricultural land to be grown and bred on. We need to keep the surface of agricultural land intact at the least, or slightly growing at best, whence the necessity to keep the total surface of urban land under control. Still, we need that space of intense social interactions to make new social roles. Tough nut to crack, this one. Cool, so here is the deal: we start by shrinking transistors to a size below the perceptual capacity of human sight, which is going to open up a whole range of electronic technologies, which, in turn, will make it worthwhile to create a whole new family of languages just for giving orders to those electronics. Hopefully, after 2 or 3 human generations, that is going to create a new plane of social interactions, sort of merging with cities and yet sort of supplanting them’.
And so I follow that trail of collective human intelligence configuring itself with a view to making enough social roles for new humans coming. I am looking for parallels with the human brain. I know, I know, this is a bit far-fetched as a parallel, still it is better than nothing. Anyway, in the brain, there is the cortex, i.e. the fancy intellectual, then we have the limbic system, i.e. the romantic Lord Byron, and finally there is the hypothalamus, i.e. the primitive stuff in charge of vegetative impulses. Do we have such distinct functional realms in our collective intelligence? I mean, do we have a subsystem that generates elementary energies (i.e. capacities to perform basic types of action), another one which finds complex cognitive bearings in the world, and something in between, which mediates between objective data and fundamental drives, forming something like preferences, proclivities, values etc.?
Cool. Enough philosophy. Let’s get into science. As I am writing about digital platforms, I can do something useful just as well, i.e. I can do some review of literature and use it both in my own science and in my teaching. Here comes an interesting paper by Beeres et al. (2020[1]) regarding the correlation between the use of social media and the prevalence of mental health problems among adolescents in Sweden. The results are strangely similar to the correlation between unemployment and criminality, something I know well from my baseline field of science, i.e. economics. It is a strong correlation across space and a weak, if not non-existent, one over time. The intensity of using social media by Swedish adolescents seems to be correlated positively with the incidence of mental disorders, i.e. adolescents with a higher probability of such disorders tend to use social media more heavily than their mentally more robust peers. Still, when an adolescent person increases their starting-point intensity of using social media, that change is not correlated longitudinally with an increased incidence of mental disorders. In other words, whoever is solid in the beginning, stays this way, and whoever is f**ked up, stays that way, too.
The method of research presented in that paper looks robust. The sample is made of 3959 willing participants, fished out from among an initial sample of 12 512 people. This is respectable, as social science goes. The gauge of mental health was the Strengths and Difficulties Questionnaire (SDQ), which is practically 100% standardized (Goodman & Goodman 2009[2]) and allows distinguishing between internalized, emotional and peer problems on the one hand, and the externalized ones, connected to conduct and hyperactivity, on the other. If you are interested in the exact way this questionnaire looks, you can go and consult: https://www.sdqinfo.org/a0.html . The use of social media was self-reported, as an answer to a question on the number of hours spent on social media, writing or reading blogs, and chatting online, separately for weekdays and weekends. That answer was standardized, on a scale ranging from 30 minutes a day up to 7 hours a day. Average daily time spent on social media was calculated on the basis of the answers given.
The results reported by Beeres et al. (2020) are interesting in a few different ways. Firstly, they seem to largely discard the common claim that increased use of social media contributes to an increased prevalence of mental disorders in adolescents. Intensive use of social media is rather symptomatic of such disorders. That would reverse the whole discourse about this specific phenomenon. Instead of saying ‘Social media make kids go insane’, we should rather be saying ‘Social media facilitate the detection of mental disorders’. Still, one problem remains: if the most intense use of social media among adolescents is observable in those most prone to mental disorders, we have a possible scenario where either the whole culture forming on and through social media, or some specific manifestations thereof, is specifically adapted to people with mental disorders.
Secondly, we have a general case of a digital technology serving a specific social function, i.e. that of mediating social relations of a specific social group (adolescents in developed countries) in a specific context (propensity to mental disorders). Digital technologies are used as a surrogate for other social interactions, by people who most likely have a hard time going through such interactions.
Another paper, still warm, straight from the bakery, by Lin et al. (2020[3]), is entitled ‘Investigating mediated effects of fear of COVID-19 and COVID-19 misunderstanding in the association between problematic social media use, psychological distress, and insomnia’. The first significant phenomenon it is informative about is the difficulty of making a simple, catchy title for a scientific paper. Secondly, the authors start from the same hypothesis which Beeres et al. (2020) seem to have discarded, namely that social media use (especially problematic social media use) may give rise to psychological distress. Moreover, Lin et al. (2020) come to the conclusion that it is true. Same science, same hypothesis, different results. I f**king love science. You just need to look into the small print.
The small print here starts with the broad social context. Empirical research by Lin et al. (2020) was conducted in Iran, on participants over 18 years old, whose participation was acquired via Google Forms. The sample consisted of 1506 persons, with an average age of 26 years, and a visible prevalence of women, who made up over 58% of the sample. The tool used for detecting mental disorders was the Hospital Anxiety and Depression Scale (HADS). The follow-up period was two weeks, against two years in the case of the research by Beeres et al. (2020). Another thing is that whilst Beeres et al. (2020) explicitly separate the longitudinal within-person variance from the lateral inter-person one, Lin et al. (2020) compute their results without such a distinction. Consequently, they come to the conclusion that problematic use of social media is significantly correlated with mental disorders.
I try to connect those two papers to my concept of collective intelligence, and to the use of artificial intelligence. We have an intelligent structure, i.e. humans hanging around together. How do we know we are collectively intelligent? Well, we can make many alternative versions of us being together, each version being like a one-mutation neighbour to the others, and we can learn new ways of doing things by choosing the best-fitting version among those alternatives. On top of that, we can do the whole stunt whilst staying acceptably cohesive as a society. Among many alternative versions of us being together there is a subset grouping different manners of using social media. Social media are based on artificial intelligence. Each platform runs an algorithm which adapts the content you see to your previously observed online behaviour: the number of times you click on an ad, the number of times you share and repost somebody else’s posts, the number of times you publish your own content etc. At the bottom line, the AI in action here adapts so that you max out on the time spent on the platform, and on the clicks you make whilst hanging around there.
The papers I have just quoted suggest that the artificial intelligence at work in social media is somehow accommodative of people with mental disorders. This is truly interesting, because the great majority of social institutions we have had so far, i.e. since whenever we started out as intelligent hominids, have actually been the opposite. One of the main ways to detect serious mental problems in a person consists in observing their social relations. If they have even a mild issue with mental health, they are bound to have something seriously off either with their emotional bonds to the immediate social environment (family and friends, mostly) or with their social role in the broader environment (work, school etc.). I made an educational video out of that quick review of literature, and I placed it on YouTube as: ‘Behavioural modelling and content marketing #3 Social media and mental health’.
[1] Beeres, D. T., Andersson, F., Vossen, H. G., & Galanti, M. R. (2020). Social media and mental health among early adolescents in Sweden: a longitudinal study with 2-year follow-up (KUPOL Study). Journal of Adolescent Health, https://doi.org/10.1016/j.jadohealth.2020.07.042
[2] Goodman, A., Goodman, R. (2009). Strengths and Difficulties Questionnaire as a Dimensional Measure of Child Mental Health. Journal of the American Academy of Child & Adolescent Psychiatry, Volume 48, Issue 4.
[3] Lin, C. Y., Broström, A., Griffiths, M. D., & Pakpour, A. H. (2020). Investigating mediated effects of fear of COVID-19 and COVID-19 misunderstanding in the association between problematic social media use, psychological distress, and insomnia. Internet interventions, 21, 100345, https://doi.org/10.1016/j.invent.2020.100345
I am developing directly on the mathematical model I started to sketch in my last update, i.e. in ‘Social roles and pathogens: our average civilisation’. This is an extension of my earlier research regarding the application of artificial neural networks to simulate collective intelligence in human societies. I am digging into one particular rabbit-hole, namely the interaction between the prevalence of social roles, and that of disturbances to the social structure, such as epidemics, natural disasters, long-term changes in the natural environment, radically new technologies etc.
Here comes to my mind, and thence to my writing, a mathematical model that generalizes some of the intuitions, which I already, tentatively, phrased out in my last update. The general idea is that society can be represented as a body of phenomena able to evolve endogenously (i.e. by itself, in plain human lingo), plus an external disturbance. Disturbance is anything that knocks society out of balance: a sudden, massive change in technology, a pandemic, climate change, full legalization of all drugs worldwide, Justin Bieber becoming the next president of the United States etc.
Thus, we have the social structure and a likely disturbance to it. The social structure is a set SR = {sr1, sr2, …, srm} of ‘m’ social roles, defined as combinations of technologies and behavioural patterns. The set SR can be stable or unstable. Some of the social roles can drop out of the game. Just checking: does anybody among my readers know what the craft of a town crier consisted in, back in the day? That guy was a local media industry, basically. You paid him for shouting your message in one or more public places in the town. Some social roles can emerge. Twenty years ago, the social role of an online influencer was associated mostly with black public relations, and today it is a regular occupation.
Disappearance or emergence of social roles is one plane of social change, and mutual cohesion between social roles is another one. In any relatively stable social structure, the existing social roles are culturally linked to each other. The behaviour of a political journalist is somehow coherent with the behaviour of the politicians he or she interviews. The behaviour of a technician at a fibre-optic connections company is somehow coherent with the behaviour of the end users of those connections. Yet, social change can loosen the ties between social roles. I remember the early 1990s, in Poland, just after the transition from communism. It was an odd moment, when, for example, many public officers, e.g. mayors or ministers, were constantly experimenting with their respective roles. That very loose coupling of social roles is frequently observable in start-up businesses, on the other hand. In many innovative start-ups, when you start a new job, you’d better be prepared for its exact essence and form taking shape as you work.
All in all, I hypothesise four basic types of social change in an established structure, under the impact of an exogenous disturbance. Scenario A assumes the loosening of cohesion between social roles, under the impact of an exogenous disturbance, with a constant catalogue of social roles in place. Scenario B implies that the external stressor makes some social roles disappear, whilst scenarios C and D represent the emergence of new social roles, in two different perspectives. In Scenario C, new social roles are not coherent with the established ones, whilst Scenario D assumes such a cohesion.
Mathematically, I represent the whole thing in the form of a simple neural network, a multi-layer perceptron. I have written a lot about using neural networks as a representation of collective intelligence, and now I feel like generalising my theoretical stance and explaining two important points, namely what exactly I mean by a neural network, and why I apply a neural network instead of a stochastic model, such as an Ito drift.
A neural network is a sequence of equations which can be executed in a loop, over a finite sequence ER = {er1, er2, …, ern} of ‘n’ experimental rounds, and that recurrent sequence of equations has a scalable capacity to learn. In other words, equation A takes input data, transforms it, feeds the result into equation B, which feeds into equation C etc., and, at some point, the result yielded by the last equation in the sequence gets fed into equation A once again, and the whole sequence runs another round A > B > C > … > A etc. In each consecutive experimental round erj, equation A taps into raw empirical data, and into the result of the previous experimental round erj-1. Another way of defining a neural network is to say that it is a general, logical structure able to learn by producing many specific instances of itself and observing their specific properties. Both definitions meet in the concepts of logical structure and learning. It is quite an old observation in our culture that some logical structures, such as sequences of words, have the property of creating much more meaning than others. When I utter a sequence ‘Noun + Verb + Noun’, e.g. ‘I eat breakfast’, it has the capacity to produce more meaning than a sequence of the type ‘Verb + Verb + Verb’, e.g. ‘Eat read walk’. The latter sequence leaves more ambiguity, and the amount of that ambiguity makes that sequence of words virtually useless in daily life, save for online memes.
There are certain peg structures in the sequence of equations that make a neural network, i.e. some equations and sequences thereof which just need to be there, and without which the network cannot produce meaningful results. I am going to present the peg structure of a neural network, and then I will explain its parts one by one.
Thus, the essential structure is the following: [Equation of random experimentation ε*xi(er1)] => [Equation of aggregation h = ∑ ε*xi(er1)] => [Equation of neural activation NA = (a*e^(b*h) ± 1) / (a*e^(b*h) ± 1)] => {Equation of error assessment e(er1) = [O(er1) – NA(er1)]*c} => {[Equation of backpropagation] <=> [Equation of random experimentation + acknowledgement of error from the previous experimental round] <=> [ε*xi(erj) + e(erj-1)]} => {Equation of aggregation h = ∑ [ε*xi(erj) + e(erj-1)]} etc.
In that short sequential description, I combined mathematical expressions with formal logic. Brackets of different types – round (), square [] and curly {} – serve to delineate distinct logical categories. The arrowed symbols stand for logical connections, with ‘<=>’ being an equivalence, and ‘=>’ an implication. That being explained, I can start explaining those equations and their sequence. The equation of random experimentation expresses what an infant’s brain does: it learns by trial and error, i.e. by mixing stimuli in various hierarchies and seeing which hierarchy of importance, attached to individual pieces of sensory data, works better. In an artificial neural network, random experimentation means that each separate piece of data is being associated with a random number ε between 0 and 1, e.g. 0,2 or 0,87 etc. A number between 0 and 1 can be interpreted in two ways: as a probability, or as the fraction of a whole. In the associated pair ε*xi(erj), the random weight 0 < ε < 1 can be seen as the hypothetical probability that the given piece xi of raw data really matters in the experimental round erj. From another angle, we can interpret the same pair ε*xi(erj) as an experiment: what happens when we take only a fraction ε of the piece of data xi, i.e. a slice cut out of that piece of data.
Random experimentation in the first experimental round er1 is different from what happens in consecutive rounds erj. In the first round, the equation of random experimentation just takes the data xi. In any following round, the same equation must account for the error of adjustment incurred in previous rounds. The logic is still the same: what happens if we assume a probability of 32% that error from past experiments really matters vs. the probability of 86%?
The equation of aggregation corresponds to the most elementary phase of what we could call making sense of reality, or to language. A live intelligent brain collects separate pieces of data into large semantic chunks, such as ‘the colour red’, ‘the neighbour next door’, ‘that splendid vintage Porsche Carrera’ etc. The summation h = ∑ ε* xi (erj) is such a semantic chunk, i.e. h could be equivalent to ‘the neighbour next door’.
Neural activation is the next step in the neural network making sense of reality. It is the reaction to the neighbour next door. The mathematical expression NA = (a*e^(b*h) ± 1) / (a*e^(b*h) ± 1) is my own generalisation of two commonly used activation functions: the sigmoid and the hyperbolic tangent. The ‘e’ symbol is the mathematical constant e, and ‘h’ in the expression e^(b*h) is the ‘h’ chunk of pre-processed data from the equation of aggregation. The ‘b’ coefficient is usually a small integer, e.g. b = 2 in the hyperbolic tangent, and b = -1 in the basic version of the sigmoid function.
The logic of neural activation consists in combining a constant component with a variable one, just as a live nervous system has some baseline neural activity, e.g. the residual muscular tonus, which ramps up in the presence of stimulation. In the equation of the hyperbolic tangent, namely NA = tanh = (e^(2h) – 1) / (e^(2h) + 1), the constant part is (e^2 – 1) / (e^2 + 1) = 0.761594156. Should my neural activation be the sigmoid, it goes like NA = sig = 1 / (1 + e^(-h)), with the constant root of 1 / (1 + e^(-1)) = 0.731058579.
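Those two constants are easy to check numerically; the two lines of Python below just evaluate the activation functions at h = 1, nothing more.

```python
import numpy as np

tanh = lambda h: (np.exp(2 * h) - 1) / (np.exp(2 * h) + 1)   # NA = tanh(h)
sig = lambda h: 1 / (1 + np.exp(-h))                          # NA = sig(h)

print(round(tanh(1.0), 9), round(sig(1.0), 9))   # 0.761594156 0.731058579
```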
Now, let’s suppose that the activating neuron NA gets excited about a stream of sensory experience represented by input data: x1 = 0.19, x2 = 0.86, x3 = 0.36, x4 = 0.18, x5 = 0.93. At the starting point, the artificial mind has no idea how important particular pieces of data are, so it experiments by assigning them a first set of aleatory coefficients – ε1 = 0.85, ε2 = 0.70, ε3 = 0.08, ε4 = 0.71, ε5 = 0.20 – which means that we experiment with what happens if x3 was totally unimportant, x5 hardly more significant, whilst x1, x2 and x4 are really important. Aggregation yields h = 0.19*0.85 + 0.86*0.70 + 0.36*0.08 + 0.18*0.71 + 0.93*0.20 ≈ 1.10.
An activating neuron based on the hyperbolic tangent gets into a state of NA = tanh = (e^(2*1.10) – 1) / (e^(2*1.10) + 1) = 0.801620, and another activating neuron working with the sigmoid function thinks NA = sig = 1 / (1 + e^(-1.10)) = 0.7508457. Another experiment with the same data consists in changing the aleatory coefficients of importance and seeing what happens, thus in saying ε1 = 0.48, ε2 = 0.44, ε3 = 0.24, ε4 = 0.27, ε5 = 0.80 and aggregating h = 0.19*0.48 + 0.86*0.44 + 0.36*0.24 + 0.18*0.27 + 0.93*0.80 ≈ 1.35. In response to the same raw data aggregated in a different way, the hyperbolic tangent says NA = tanh = (e^(2*1.35) – 1) / (e^(2*1.35) + 1) = 0.873571 and the activating neuron which sees reality as a sigmoid retorts: ‘No sir, absolutely not. I say NA = sig = 1 / (1 + e^(-1.35)) = 0.7937956’. What do you want: equations are like people, they are ready to argue even about 0.25 of difference in aggregate input from reality.
Those two neural reactions bear a difference, visible as gradients of response, or elasticities of response to a change in the aggregate input. The activating neuron based on the hyperbolic tangent yields a susceptibility of (0.873571 – 0.801620) / (1.35 – 1.10) = 0.293880075, which the sigmoid sees as an overreaction, with its well-pondered (0.7937956 – 0.7508457) / (1.35 – 1.10) = 0.175427218. That’s an important thing to know about neural networks: they can be more or less touchy in their reactions. The hyperbolic tangent produces more stir, and the sigmoid is more like ‘calm down’ in its ways.
Whatever the neural activation NA produces gets compared with a pre-set outcome O, or output variable. Error is assessed as e(erj) = [O(erj) – NA(erj)]*c, where ‘c’ is an additional factor, sometimes the local derivative of NA. The point of putting c there is that it can amplify (c > 1) or downplay (c < 1) the importance of local errors, and therefore make the neural network more or less sensitive to making errors.
Before I pass to discussing the practical application of that whole logical structure to the general problem at hand, i.e. the way that a social structure reacts to exogenous disturbances, one more explanation is due, namely the issue of backpropagation of error, where said error is being fed forward. One could legitimately ask how the hell it is possible to backpropagate something whilst feeding it forward. Let’s have a look at real life. When I learn to play the piano, for example, I make mistakes in my play, and I utilise them to learn. I learn by repeating over and over again the same sequence of musical notes. Repetition is an instance of feeding forward. Each consecutive time I play the same sequence, I move forward one more round. However, if I want that move forward to be really productive as regards learning, I need to review, each time, my entire technique. I need to go back to my first equation and run the whole sequence of equations again. I need to backpropagate my mistakes over the whole sequence of behaviour. Backpropagating errors and feeding them forward are two different aspects of the same action. I backpropagate errors across the logical structure of the neural network, and I feed them forward over consecutive rounds of experimentation.
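To show how those pieces click together, here is a minimal Python sketch of the whole peg structure, run over consecutive experimental rounds. It is my reading of the sequence, not a transcription of the Excel files behind it: in particular, I assume that the error from the previous round is added to each piece of raw data before the random weighing, and I set c = 1.

```python
import numpy as np

def run_network(x, O=0.5, c=1.0, rounds=3000, seed=1):
    """Random experimentation -> aggregation -> neural activation ->
    error assessment, with the error backpropagated over the structure
    by feeding it forward into the next experimental round."""
    rng = np.random.default_rng(seed)
    error = 0.0
    history = []
    for _ in range(rounds):
        eps = rng.uniform(0, 1, x.size)   # equation of random experimentation
        h = np.sum(eps * (x + error))     # equation of aggregation, past error acknowledged
        na = np.tanh(h)                   # equation of neural activation (hyperbolic tangent)
        error = (O - na) * c              # equation of error assessment
        history.append((h, na, error))
    return history

x = np.array([0.19, 0.86, 0.36, 0.18, 0.93])   # the raw sensory data from the example above
history = run_network(x)
print(history[0])      # first experimental round
print(history[-1])     # after 3000 rounds of feeding the error forward
```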
Now, it is time to explain how I simulate the whole issue of a disturbed social structure, and the four scenarios A, B, C, and D, which I described a few paragraphs earlier. The trick I used consists in creating a baseline neural network, one which sort of does something but not much really, and then making mutants out of it, and comparing the outcomes yielded by the mutants with those produced by their baseline ancestor. For the baseline version, I have been looking for a neural network which learns lightning-fast in the short run but remains profoundly stupid in the long run. I wanted quick immediate reaction and no capacity whatsoever to narrow down the error and adjust to it.
The input layer of the baseline neural network is made of the set SR = {sr1, sr2, …, srm} of ‘m’ social roles, and one additional variable representing the hypothetical disturbance. Each social role sri corresponds to a single neuron, which can take values between 0 and 1. Those values represent the probability of occurrence of the social role sri. If, for example, in the experimental round e = 100, the input value of the social role sri is sri(e100) = 0.23, it means that 23% of people manifest the distinctive signs of that social role. Of course, consistently with what I perceive as the conceptual acquis of social sciences, I assume that an individual can have multiple, overlapping social roles.
The factor of disturbance RB is an additional variable in the input layer of the network and comes with a similar scale and notation. It takes values between 0 and 1, which represent the probability of the disturbance occurring in the social structure. Once again, RB can be anything, disturbing positively, negatively, or in a way we have no idea what it is going to bring about.
Those of you who are familiar with the architecture of neural networks might wonder how I am going to represent the emergence of new social roles without modifying the structure of the network. Here comes a mathematical trick, which, fortunately enough, is well grounded in social sciences. The mathematical part of the trick consists in incorporating dormant social roles in the initial set SR = {sr1, sr2, …, srm}, i.e. social roles assigned an arbitrary value of 0, thus zero probability of occurrence. On the historically short run, i.e. at the scale of like one generation, new social roles are largely predictable. As we are now, we can reasonably predict the need for new computer programmers, whilst being able to safely assume a shortage of jobs for cosmic janitors, collecting metal scrap from the terrestrial orbit. In 20 years from now, that perspective can change – and it’d better change, as we have megatons of metal crap in orbit – yet, for now, it looks pretty robust.
Thus, in the set SR = {sr1, sr2, …, srm}, I reserve k neurons for active social roles, and l neurons for dormant ones, with, of course, k + l = m. All in all, in the actual network I programmed in Excel, I had k = 20 active social roles, l = 19 dormant social roles, and one neuron corresponding to the disturbance factor RB.
Now, the issue of social cohesion. In this case, we are talking about cohesion inside the set SR = {sr1, sr2, …, srm}. Mathematically, cohesion inside a set of numerical values can be represented as the average numerical distance between them. Therefore, the input layer of 20k + 19l + RB = 40 neurons is coupled with a layer of meta-input, i.e. with a layer of 40 other neurons whose sole function is to inform about the Euclidean distance between the current value of each input neuron and the values of the other 39 input neurons.
Euclidean distance plays the role of fitness function (see Hamann et al. 2010[1]). Each social role in the set SR = {sr1, sr2, …, srm}, with its specific probability of occurrence, displays a Euclidean distance from the probability of occurrence in other social roles. The general idea behind this specific mathematical turn is that in a stable structure, the Euclidean distance between phenomena stays more or less the same. When, as a society, we take care of being collectively cohesive, we use the observation of cohesion as data, and the very fact of minding our cohesion helps us to maintain cohesion. When, on the other hand, we don’t care about social cohesion, then we stop using (feeding forward) this specific observation, and social cohesion dissolves.
For the purposes of my own scientific writing, I commonly label that Euclidean distance as V, i.e. V(sri; ej) stands for the average Euclidean distance between social role sri, and all the other m – 1 social roles in the set SR = {sr1, sr2, …, srm}, in the experimental round ej. When input variables are being denominated on a scale from 0 to 1, thus typically standardized for a neural network, and the network uses (i.e. feeds forward) the meta input on cohesion between variables, the typical Euclidean distance you can expect is like 0,1 ≤ V(sri; ej) ≤ 0,3. When the social structure loses it, Euclidean distance between phenomena starts swinging, and that interval tends to go into 0,05 ≤ V(sri; ej) ≤ 0,8. This is how the general idea of social cohesion is translated into a mathematical model.
Thus, my neural network uses, as primary data, basic input about the probability of specific social roles being played by a randomly chosen individual, and metadata about cohesion between those probabilities. I start by assuming that all the active k = 20 social roles occur with the same probability of 0,5. In other words, at the starting point, each individual in the society displays a 50% probability of endorsing any of the k = 20 social roles active in this specific society. Reminder: l = 19 dormant social roles stay at 0, i.e. each of them has 0% of happening, and the RB disturbance stays at 0% probability as well. All is calm. This is my experimental round 1, or e1. In the equation of random experimentation, each social role sri gets experimentally weighed with a random coefficient, and with its local Euclidean distance from other social roles. Of course, as all k = 20 social roles have the same probability of 50%, their distance from each other is uniform and always makes V = 0,256097561. All is calm.
As I want my baseline AI to be quick on the uptake and dumb as f**k on the long-haul flight of learning, I use neural activation through the hyperbolic tangent. As you could have seen earlier, this function is sort of prone to short-term excitement. In order to assess the error, I use both logic and one more mathematical trick. In the input, I made each of the k = 20 social roles equiprobable in its happening, i.e. 0,50. I assume that the output of neural activation should also be 0,50. Fifty percent of being anybody’s social role should yield fifty percent: simplistic, but practical. I go e(erj) = O(erj) – NA(erj) = 0,5 – tanh = 0,5 – [(e^(2h) – 1) / (e^(2h) + 1)], and I feed that error forward from round 1 to the next experimental round. This is an important trait of this particular neural network: in each experimental round, it experiments with the sum of the probability from the previous experimental round and the error made in that same previous round, under the assumption that the expected value of the output should be a probability of 50%.
That whole mathematical strategy yields interesting results. Firstly, in each experimental round, each active social role displays rigorously the same probability of happening, and yet that uniformly distributed probability changes from one experimental round to another. We have here a peculiar set of phenomena, which all have the same probability of taking place, which, in turn, makes all those local probabilities equal to the average probability in the given experimental round, i.e. to the expected value. Consequently, the same happens to the internal cohesion of each experimental round: all Euclidean distances between input probabilities are equal to each other, and to their average expected distance. Technically, after having discovered that homogeneity, I could have dropped the whole idea of many social roles sri in the database and reduced the input data to just three variables (columns): one active social role, one dormant one, and the disturbance factor RB. Still, I know by experience that even simple neural networks tend to yield surprising results. Thus, I kept the architecture ’20k + 19l + RB’ just for the sake of experimentation.
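For readers who prefer code to Excel, here is how I would sketch that baseline structure in Python. It is a sketch under my own assumptions, not a line-by-line copy of the spreadsheet: I assume that the meta-input V enters the aggregation as a multiplicative weight next to the random coefficient, and that the error is simply added to the probabilities of the active social roles between rounds. The numbers it produces therefore approximate, rather than replicate, Table 1 below.

```python
import numpy as np

K_ACTIVE, L_DORMANT = 20, 19            # active and dormant social roles
N = K_ACTIVE + L_DORMANT + 1            # plus one neuron for the disturbance RB

def cohesion(values):
    """Meta-input: average Euclidean distance of each neuron's value
    from the values of all neurons in the input layer."""
    diff = np.abs(values[:, None] - values[None, :])
    return diff.mean(axis=1)

def baseline_network(rounds=3000, seed=0):
    p = np.array([0.5] * K_ACTIVE + [0.0] * (L_DORMANT + 1))   # round 1: all calm
    rng = np.random.default_rng(seed)
    error = 0.0
    avg_probability = []
    for _ in range(rounds):
        avg_probability.append(p[:K_ACTIVE].mean())   # average probability of input in this round
        V = cohesion(p)                  # meta-input on internal cohesion
        eps = rng.uniform(0, 1, N)       # random experimentation
        h = np.sum(eps * p * V)          # aggregation of weighted probabilities
        na = np.tanh(h)                  # quick-tempered neural activation
        error = 0.5 - na                 # expected output: a probability of 50%
        p[:K_ACTIVE] += error            # feed the error forward into the active roles
    return np.array(avg_probability)

avg = baseline_network()
print(avg[:6].round(4))                  # the contract-relax cycle of the first rounds
print(round(avg.mean(), 4))              # overall average over 3000 rounds
```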
That whole baseline neural network, in the form of an Excel file, is available under THIS LINK. In Table 1, below, I summarize the essential property of this mathematical structure: short cyclicality. The average probability of happening of each social role swings regularly, yielding, at the end of the day, an overall average probability of 0,33. Interesting. The way this neural network behaves, it represents a recurrent sequence of two very different states of society. In odd experimental rounds (i.e. 1, 3, 5, … etc.) each social role has 50% or more probability of manifesting itself in an individual, and the relative cohesion inside the set of social roles is quite high. On the other hand, in even experimental rounds (i.e. 2, 4, 6, … etc.), social roles become disparate in their probability of happening in a given time and place of society, and the internal cohesion of the network is low. The sequence of those two states looks like the work of a muscle: contract, relax, contract, relax etc.
Table 1 – Characteristics of the baseline neural network
| Experimental round | Average probability of input | Cohesion – Average Euclidean distance V in input | Aggregate input ‘h’ | Error to backpropagate |
|---|---|---|---|---|
| 1 | 0,5000 | 0,2501 | 1,62771505 | -0,4257355 |
| 2 | 0,0743 | 0,0372 | 0,02990319 | 0,47010572 |
| 3 | 0,5444 | 0,2723 | 1,79626958 | -0,4464183 |
| 4 | 0,0980 | 0,0490 | 0,05191633 | 0,44813027 |
| 5 | 0,5461 | 0,2732 | 1,60393868 | -0,4222593 |
| 6 | 0,1238 | 0,0619 | 0,09320145 | 0,40706748 |
| 7 | 0,5309 | 0,2656 | 1,59030006 | -0,4201953 |
| 8 | 0,1107 | 0,0554 | 0,07157025 | 0,4285517 |
| 9 | 0,5392 | 0,2698 | 1,49009281 | -0,4033418 |
| 10 | 0,1359 | 0,0680 | 0,11301796 | 0,38746079 |
| 11 | 0,5234 | 0,2618 | 1,51642329 | -0,4080723 |
| 12 | 0,1153 | 0,0577 | 0,06208368 | 0,43799596 |
| 13 | 0,5533 | 0,2768 | 1,92399208 | -0,458245 |
| 14 | 0,0950 | 0,0476 | 0,03616495 | 0,46385081 |
| 15 | 0,5589 | 0,2796 | 1,51645936 | -0,4080786 |
| 16 | 0,1508 | 0,0755 | 0,13860251 | 0,36227827 |
| 17 | 0,5131 | 0,2567 | 1,29611259 | -0,3607191 |
| 18 | 0,1524 | 0,0762 | 0,12281062 | 0,37780311 |
| 19 | 0,5302 | 0,2652 | 1,55382594 | -0,4144146 |
| 20 | 0,1158 | 0,0579 | 0,06391662 | 0,43617027 |
| … | … | … | … | … |
| Average over 3000 rounds | 0,3316 | 0,1659 | 0,8113 | 0,0000041 |
| Variance | 0,0408 | 0,0102 | 0,5345 | 0,162 |
| Variability* | 0,6092 | 0,6092 | 0,9012 | 97 439,507 |
*Variability is calculated as standard deviation, i.e. square root of variance, divided by the average.
Now, I go into scenario A of social change. The factor of disturbance RB gets activated and provokes a loosening of social cohesion. Mathematically, it involves a few modifications to the baseline network. Activation of the disturbance RB involves two steps. Firstly, the numerical values of this specific variable in the network need to take non-null values: the disturbance is there. I do it by generating random numbers in the RB column of the database. Secondly, there must be a reaction to disturbance, and the reaction consists in disconnecting the layer of neurons which I labelled meta-data, i.e. the one containing Euclidean distances between the raw data points.
Here comes the overarching issue of sensitivity to disturbance, which goes across all the four scenarios (i.e. A, B, C, and D). As a representation of what’s going on in the social structure, it is about collective and individual alertness. When a new technology comes out into the market, I don’t necessarily change my job, but when that technology spreads over a certain threshold of popularity, I might be strongly pushed to reconsider my decision. When COVID-19 started hitting the global population, all levels of reaction (i.e. governments, media etc.) were somehow delayed in relation to the actual epidemic spread. This is how social change happens in reaction to a stressor: there is a threshold of sensitivity.
When I throw a handful of random values into the database, as values of disturbance RB, they are likely to be distributed under a bell curve. I translate mathematically the social concept of a sensitivity threshold as a value under that curve, past which the network reacts by cutting the ties between errors, input as raw data from previous experimental rounds, and the measurement of Euclidean distance between them. Question: how to set this value so that it fits with the general logic of that neural network? I decided to set the threshold at the absolute value of the error recorded in the previous experimental round. Thus, for example, when the error generated in round 120 is e120 = -0.08, the threshold of activation for triggering the response to disturbance is ABS(-0.08) = 0.08. The logic behind this condition is that social disturbance becomes significant when it is more prevalent than the normal discrepancy between social goals and the actual outcomes.
I come back to scenario A, thus to the hypothetical situation when the factor of disturbance cuts the ties of cohesion between existing, active social roles. I use the threshold condition ‘if RB(erj) > e(erj-1), then don’t feed forward V(erj-1)’, and this is what happens. First of all, the values of probability assigned to all active social roles remain just as uniform, in every experimental round, as they are in the baseline neural network I described earlier. I know, now, that the neural network, such as I designed it, is not able to discriminate between inputs. It just generates a uniform distribution thereof. That being said, the uniform probability of happening of social roles sri follows, in scenario A, a clearly different trajectory than the monotonous oscillation in the baseline network. The first 134 experimental rounds yield a progressive decrease in probability down to 0. Somewhere in rounds 134 ÷ 136 the network reaches a paradoxical situation, when no active social role in the k = 20 subset has any chance of manifesting itself. It is a society without social roles, and all that because the network stops feeding forward meta-data on its own internal cohesion when the disturbance RB goes over the triggering point. Past that zero point, a strange cycle of learning starts, in irregular leaps: the uniform probability attached to social roles rises up to an upper threshold, and then descends again back to zero. The upper limit of those successive leaps oscillates and then, at an experimental round somewhere between er400 and er1000, probability jumps just below 0,7 and stays this way until the end of the 3000 experimental rounds I ran this neural network through. At this very point, the error recorded by the network gets very close to zero and stays there as well: the network has learnt whatever it was supposed to learn.
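Below, a sketch of how I would translate scenario A into code, building on the baseline sketch I gave before Table 1 (same helper names, same assumptions). One extra detail I have to guess is what ‘not feeding forward V’ means operationally; in the sketch, the cohesion weights are simply replaced with 1 whenever the disturbance RB exceeds the absolute error from the previous round, which is only one of several possible readings, so the exact trajectory will differ from the one described above.

```python
def scenario_A(rounds=3000, seed=0):
    """Scenario A: the disturbance RB becomes active and, whenever it exceeds
    the sensitivity threshold (the absolute error from the previous round),
    the meta-input on cohesion is not fed forward."""
    p = np.array([0.5] * K_ACTIVE + [0.0] * (L_DORMANT + 1))
    rng = np.random.default_rng(seed)
    error = 0.0
    avg_probability = []
    for _ in range(rounds):
        avg_probability.append(p[:K_ACTIVE].mean())
        p[-1] = rng.uniform(0, 1)                                # RB is now non-null
        V = np.ones(N) if p[-1] > abs(error) else cohesion(p)    # threshold condition
        eps = rng.uniform(0, 1, N)
        h = np.sum(eps * p * V)
        na = np.tanh(h)
        error = 0.5 - na
        p[:K_ACTIVE] += error                                    # active roles absorb the error
    return np.array(avg_probability)

print(scenario_A()[:10].round(4))
```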
Of course, the exact number of experimental rounds in that cycle of learning is irrelevant society-wise. It is not 400 days or 400 weeks; it is the shape of the cycle that really matters. That shape suggests that, when an external disturbance switches off internal cohesion between social roles in a social structure, the so-stimulated society changes in two phases. At first, there are successive, hardly predictable episodes of virtual disappearance of distinct social roles. Professions disappear, family ties distort etc. It is interesting. Social roles get suppressed simply because there is no need for them to stay coherent with other social roles. Then, a hyper-response emerges. Each social role becomes even more prevalent than before the disturbance started happening. It means a growing probability that one and the same individual plays many social roles in parallel.
I pass to scenario B of social change, i.e. the hypothetical situation when the exogenous disturbance straightforwardly triggers the suppression of social roles, and the network keeps feeding forward meta-data on internal cohesion between social roles. Interestingly, the suppression of social roles under this logical structure is very short-lived, i.e. 1 – 5 experimental rounds, and then the network yields an error which forces the suppressed social roles to reappear.
One important observation is worth noting as regards scenarios B, C, and D of social change in general. Such as the neural network is designed, with the threshold of social disturbance calibrated on the error from the previous experimental round, the error keeps oscillating within an apparently constant amplitude over all the 3000 experimental rounds. In other words, there is no visible reduction in the magnitude of error. Some sort of social change is occurring in scenarios B, C, and D, still it looks like a dynamic equilibrium rather than a definitive change of state. That general remark kept in mind, the way that the neural network behaves in scenario B is coherent with the observation made regarding the side effects of its functioning in scenario A: when the factor of disturbance triggers the disappearance of some social roles, they re-emerge spontaneously, shortly after. To the extent that the neural network I use here can be deemed representative of real social change, widely prevalent social roles seem to be a robust part of the social structure.
Now, it is time to screen comparatively the results yielded by the neural network when it is supposed to represent scenarios C and D of social change: I study situations when a factor of social disturbance, calibrated in its significance on the error made by the neural network in previous experimental rounds, triggers the emergence of new social roles. The difference between those two scenarios lies in the role of social cohesion. Mathematically, I did it by activating the dormant l = 19 social roles in the network, with a random component. When the random value generated in the column of social disturbance RB is greater than the error observed in the previous experimental round, thus when RB(erj) > e(erj-1), each of the l = 19 dormant social roles gets a random positive value between 0 and 1. That random positive value gets processed in two alternative ways. In scenario C, it goes directly into aggregation and neural activation, i.e. there is no meta-data on the Euclidean distance between any of those newly emerging social roles and the other social roles. Each new social role is considered as a monad, which develops free from the constraints of social cohesion. Scenario D establishes such a constraint, thus the randomly triggered probability of a woken-up, previously dormant social role is aggregated and fed into neural activation together with meta-data on its Euclidean distance from other social roles.
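Again building on the same baseline sketch, this is how I would code the difference between scenarios C and D. As before, the exact way the woken-up roles and their cohesion meta-input enter aggregation is my assumption, so the sketch illustrates the logical structure rather than reproducing the exact numbers reported below.

```python
def scenario_C_D(cohesion_on_new_roles, rounds=3000, seed=0):
    """Scenarios C and D: when RB exceeds the absolute error from the previous
    round, the l = 19 dormant social roles wake up with random probabilities.
    Scenario C aggregates them as monads (their cohesion weight forced to 1);
    scenario D feeds their cohesion meta-input forward like for any other role."""
    p = np.array([0.5] * K_ACTIVE + [0.0] * (L_DORMANT + 1))
    rng = np.random.default_rng(seed)
    error = 0.0
    old_track, new_track = [], []
    for _ in range(rounds):
        old_track.append(p[:K_ACTIVE].mean())
        new_track.append(p[K_ACTIVE:-1].mean())
        p[-1] = rng.uniform(0, 1)                          # disturbance RB
        if p[-1] > abs(error):
            p[K_ACTIVE:-1] = rng.uniform(0, 1, L_DORMANT)  # dormant roles wake up
        V = cohesion(p)
        if not cohesion_on_new_roles:
            V[K_ACTIVE:-1] = 1.0                           # scenario C: new roles as monads
        eps = rng.uniform(0, 1, N)
        h = np.sum(eps * p * V)
        na = np.tanh(h)
        error = 0.5 - na
        p[:K_ACTIVE] += error                              # incumbent roles absorb the error
    return round(np.mean(old_track), 3), round(np.mean(new_track), 3)

print(scenario_C_D(cohesion_on_new_roles=False))   # scenario C: (incumbent avg, new roles avg)
print(scenario_C_D(cohesion_on_new_roles=True))    # scenario D
```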
Scenarios C and D share one important characteristic: heterogeneity in new social roles. The k = 20 social roles active from the very beginning, thus social roles ‘inherited’ from the baseline network, share a uniform probability of happening in each experimental round. Still, as the probabilities of new social roles, triggered by the factor of disturbance, are random by default, these probabilities are distributed aleatorily. Therefore, scenarios C and D represent a general case of a new, heterogeneous social structure emerging in the presence of an incumbent, rigid social structure. Given that specific trait, I introduce a new method of comparing those two sets of social roles, namely by the average probability attached to social roles, calculated over the 3000 experimental rounds. I calculate the average probability of active social roles across all the 3000 experimental rounds, and I compare it with the individual average probabilities obtained for each of the new social roles (or woken-up, previously dormant social roles) over the same 3000 experimental rounds. The idea behind this method is that in big sets of observations, the arithmetical average represents the expected value, or the expected state of the given variable.
The process of social change observed, respectively, in scenarios C and D, is different. In scenario C, the uniform probability attached to the incumbent k = 20 social roles follows a very calm trend, oscillating slightly between 0,2 and 0,5, whilst the heterogenous probabilities of the newly triggered l = 19 social roles swing quickly and broadly between 0 and 1. In scenario D, when the network starts feeding forward meta-data on the Euclidean distance between each new social role and the others, it creates additional oscillation in the uniform probability of incumbent social roles. The latter gets systematically and cyclically pushed into negative values. A negative probability is logically impossible and represents no real phenomenon. Well, I mean… It is possible to assume that the negative probability of one phenomenon represents the probability of the opposite phenomenon taking place, but this is really far-fetched and doesn’t really find grounding in the logical structure of this specific neural network. Still, a cycle of change where the probability of something incumbent and previously existing gets crushed down to zero (and below) represents a state of society where a new phenomenon aggressively pushes the incumbent phenomena out of the system.
Let’s see how those two processes of social change, observed in scenarios C and D, translate into expected states of social roles, i.e. into average probabilities. The first step in this analysis is to see how heterogeneous those average expected states are across the new social roles, triggered out of dormancy by the intrusion of the disturbance RB. In scenario C, new social roles display average probabilities between 0,32 and 0,35. The average probabilities corresponding to individual new social roles differ from each other by no more than 0,03, thus by a phenomenological fringe to be found in the tails of the normal distribution. By comparison, the average uniform probability attached to the incumbent social roles is 0,31. Thus, in the absence of a constraint of social cohesion between new social roles and the incumbent ones, the expected average probability in both categories is very similar.
In scenario D, the average probabilities of new social roles oscillate between 0,45 and 0,49, with just as little disparity as in scenario C, but, at the same time, they push the incumbent social roles out of the nest, so to say. The average uniform probability of the latter, after 3000 experimental rounds, is 0,01, which is most of all a result of the ‘positive probability – negative probability’ cycle during experimentation.
It is time to sum up my observations from the entire experiment conducted through and with a neural network. The initial intention was to understand better the mechanism which underlies one of my most fundamental claims regarding the civilizational role of cities, namely that cities, as a social contrivance, serve to accommodate a growing population in the framework of an increasingly complex network of social roles.
I am focusing on the ‘increasingly complex’ part of that claim. I want to understand patterns of change in the network of social roles, i.e. how the complexity of that network can evolve over time. The kind of artificial behaviour I induced in a neural network allows identifying a few recurrent patterns, which I can transform into hypotheses for further research. There is a connection between social cohesion and the emergence/disappearance of new social roles, for one. Social cohesion drags me back into the realm of swarm theory. As a society, we seem to be evolving by a cycle of loosening and tightening in the way that social roles are coupled with each other.
Discover Social Sciences is a scientific blog, which I, Krzysztof Wasniewski, individually write and manage. If you enjoy the content I create, you can choose to support my work, with a symbolic $1, or whatever other amount you please, via MY PAYPAL ACCOUNT. What you will contribute to will be almost exactly what you can read now. I have been blogging since 2017, and I think I have a pretty clearly rounded style.
At the bottom of the sidebar on the main page, you can access the archives of this blog, all the way back to August 2017. You can get an idea of how I work, what I work on, and how my writing has evolved. If you like social sciences served in this specific sauce, I will be grateful for your support of my research and writing.
‘Discover Social Sciences’ is a continuous endeavour and is mostly made of my personal energy and work. There are minor expenses, to cover the current costs of maintaining the website, or to collect data, yet I want to be honest: by supporting ‘Discover Social Sciences’, you will be mostly supporting my continuous stream of writing and online publishing. As you read through the stream of my updates on https://discoversocialsciences.com , you can see that I usually write 1 – 3 updates a week, and this is the pace of writing that you can expect from me.
Another takeaway you might be interested in is ‘The Business Planning Calculator’, an Excel-based, simple tool for financial calculations needed when building a business plan.
Both the e-book and the calculator are available via links in the top right corner of the main page on https://discoversocialsciences.com .
[1] Hamann, H., Stradner, J., Schmickl, T., & Crailsheim, K. (2010). Artificial hormone reaction networks: Towards higher evolvability in evolutionary multi-modular robotics. arXiv preprint arXiv:1011.3912.
[1] Xie, X. F., Zhang, W. J., & Yang, Z. L. (2002, May). Dissipative particle swarm optimization. In Proceedings of the 2002 Congress on Evolutionary Computation. CEC’02 (Cat. No. 02TH8600) (Vol. 2, pp. 1456-1461). IEEE.
[2] Poli, R., Kennedy, J., & Blackwell, T. (2007). Particle swarm optimization. Swarm intelligence, 1(1), 33-57.
[3] Torres, S. (2012). Swarm theory applied to air traffic flow management. Procedia Computer Science, 12, 463-470.
[4] Stradner, J., Thenius, R., Zahadat, P., Hamann, H., Crailsheim, K., & Schmickl, T. (2013). Algorithmic requirements for swarm intelligence in differently coupled collective systems. Chaos, Solitons & Fractals, 50, 100-114.
I am getting into the groove of a new form of expression: the rubber duck. I explained more specifically the theory of the rubber duck in the update entitled A test pitch of my ‘Energy Ponds’ business concept. I use mostly videos, where, as I am talking to an imaginary audience, I sharpen (hopefully) my own ideas and the way of getting them across. In this update, I am cutting out some slack from my thinking about the phenomenon of collective intelligence, and my use of neural networks to simulate the way that human, collective intelligence works (yes, it works).
The structure of my updates on this blog changes as my form is changing. Instead of placing the link to my video in, like, the first subheading of the update, I place it further below, sort of in conclusion. I prepare my updates with an extensive use of PowerPoint, both in order to practice a different way of formulating my ideas, and in order to have slides for my video presentation. Together with the link to YouTube, you will find another one, to the PowerPoint document.
Ad rem, i.e. get the hell to the point, man. I am trying to understand better my own thinking about collective intelligence and the bridging towards artificial intelligence. As I meditate about it, I find an essential phenomenological notion: the droplet of information. With the development of digital technologies, we communicate more and more with some sort of pre-packaged, whoever-is-interested-can-pick-it-up information. Videos on YouTube, blogging updates, books, articles are excellent examples thereof. When I talk to the camera of my computer, I am both creating a logical structure for myself, and a droplet of information for other people.
Communication by droplets is fundamentally different from other forms, like meetings, conversations, letters etc. Until recently, and by ‘recently’ I mean like 1990, most organized human structures worked with precisely addressed information. Since we started to grow the online part of our civilization, we have been coordinating more and more with droplet information. It is as if information were working like a hormone. As YouTube swells, we have more and more of that logical hormone accumulated in our civilization.
That’s precisely my point of connection with artificial intelligence. When I observe the way a neural network works (yes, I observe them working step by step, iteration by iteration, as strange as it might seem), I see a structure which uses error as food for learning. Residual local error is to a neural network what, once again, a hormone is to a living organism.
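To make that ‘error as food’ remark a bit more tangible, here is a generic, minimal learning loop for a single artificial neuron, my own toy illustration rather than any of the specific networks I use in my research: the residual error is the only thing that feeds the weight updates.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random((200, 3))                     # hypothetical input data
w_true = np.array([0.2, 0.5, 0.3])
y = 1.0 / (1.0 + np.exp(-x @ w_true))        # the pattern the neuron should learn

w = rng.random(3)                            # random starting weights
for epoch in range(2000):
    out = 1.0 / (1.0 + np.exp(-x @ w))       # feed forward
    error = y - out                          # residual local error: the 'hormone'
    w += 0.1 * x.T @ (error * out * (1 - out))   # the error feeds the weight update

print(np.round(w, 2))                        # w drifts towards w_true as the error shrinks
```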
I noticed it has been a month since I posted anything on my blog. Well, been doing things, you know. Been writing, and thinking by the same occasion. I am forming a BIG question in my mind, a question I want to answer: how are we going to respond to climate change? Among all the possible scenarios of such a response, which are we the most likely to follow? When I have a look, every now and then, at Greta Thunberg’s astonishingly quick social ascent, I wonder why we are so divided about something apparently so simple. I am very clear: this is not a rhetorical question on my part. Maybe I should claim something like: ‘We just need to get all together, hold our hands and do X, Y, Z…’. Yes, in a perfect world we would do that. Still, in the world we actually live in, we don’t. Does it mean we are collectively stupid, like baseline, and just some enlightened individuals can sometimes see the truly rational path of moving ahead? Might be. Yet, another view is possible. We might be doing apparently dumb things locally, and those apparent local flops could sum up to something quite sensible at the aggregate scale.
There
is some science behind that intuition, and some very provisional observations. I
finally (and hopefully) nailed down the revision of the
article on energy efficiency. I have already started
developing on this one in my last update, entitled ‘Knowledge
and Skills’, and now, it is done. I have just revised the
article, quite deeply, and by the same occasion, I hatched a methodological
paper, which I submitted to MethodsX.
As I want to develop a broader discussion on these two papers, without
repeating their contents, I invite my readers to get acquainted with their PDF,
via the archives of my blog. Thus, by clicking the title Energy
Efficiency as Manifestation of Collective Intelligence in Human Societies,
you can access the subject matter paper on energy efficiency, and clicking on Neural
Networks As Representation of Collective Intelligence
will take you to the methodological article.
I think I know how to represent, plausibly, collective intelligence with artificial intelligence. I am showing the essential concept in the picture below. Thus, I start with a set of empirical data, describing a society. Well in line with what I have been writing on this blog since early spring this year, I take the quantitative variables in my dataset, e.g. GDP per capita, schooling indicators, the probability for an average person to become a mad scientist etc., and I ask: what is the meaning of those variables? Most of all, they exist and change together. Banal, but true. In other words, all that stuff represents the cumulative outcome of past, collective action and decision-making.
I
decided to use the intellectual momentum, and I used the same method with a
different dataset, and a different set of social phenomena. I took Penn Tables
9.1 (Feenstra et al. 2015[1]), thus a well-known base
of macroeconomic data, and I followed the path sketched in the picture below.
Long
story short, I have two big surprises. When I look upon energy efficiency and
its determinants, turns out energy efficiency is not really the chief outcome
pursued by the 59 societies studied: they care much more about the local, temporary
proportions between capital immobilised in fixed assets, and the number of
resident patent applications. More specifically, they seem to be principally
optimizing the coefficient of fixed assets per 1 patent application. That is
quite surprising. It sends me back to my peregrinations through the land of
evolutionary theory (see for example: My
most fundamental piece of theory).
When I take a look at the collective intelligence (possibly) embodied in Penn Tables 9.1, I can see this particular collective wit aiming at optimizing the share of labour in the proceeds from selling real output in the first place. Then, almost immediately after, comes the average number of hours worked per person per year. You can click on this link and read the full manuscript I have just submitted to the Quarterly Journal of Economics.
Wrapping it (provisionally) up: I did some social science with the assumption of collective intelligence in human societies taken at the level of methodology, and I got truly surprising results. That thing about energy efficiency – i.e. the fact that when in presence of some capital in fixed assets, and some R&D embodied in patentable inventions, we seem to care about energy efficiency only secondarily – is really mind-blowing. I had already done some research on energy as a factor of social change, and, whilst I have never been really optimistic about our collective capacity to save energy, I assumed that we orient ourselves, collectively, on some kind of energy balance. Apparently, we do so only when we have nothing else to pay attention to. On the other hand, the collective focus on macroeconomic variables pertinent to labour, rather than prices and quantities, is just as gob-smacking. All economic education, when you start with Adam Smith and take it from there, assumes that economic equilibriums, i.e. those special states of society when we are sort of in balance among many forces at work, are built around prices and quantities. Still, in the research I have just completed, the only kind of price my neural network can build a plausibly acceptable learning around is the average price level in international trade, i.e. in exports and in imports. All the prices which I have been taught, and which I have taught, are the cornerstones of economic equilibrium, like prices in consumption or prices in investment – when I peg them as output variables of my perceptron, the incriminated perceptron goes dumb like hell and yields negative economic aggregates. Yes, babe: when I make my neural network pay attention to the price level in investment goods, it comes to the conclusion that the best idea is to have negative national income, and negative population.
Returning to the issue of climate change and our collective response to it, I am trying to connect my essential dots. I have just served some well-cooked science, and now it is time to bite into some raw one. I am biting into facts which I cannot explain yet, like not at all. Did you know, for example, that since 2014 more and more adult people have been dying in high-income countries, per 1000? You can consult the data available from the World Bank as regards the mortality of men and that of women. Infant mortality is generally falling, just as adult mortality in low- and middle-income countries. It is just about adult people in wealthy societies categorized as ‘high income’: there are more and more of them dying per 1000. Well, I should maybe say ‘more of us’, as I am 51, and relatively well-off, thank you. Anyway, all the way up through 2014, adult mortality in high-income countries had been consistently subsiding, reaching its minimum in 2014 at 57,5 per 1000 in women, and 103,8 in men. In 2016, it went up to 60,5 per 1000 in women, and 107,5 in men. It seems counter-intuitive. High-income countries are the place where adults are technically exposed to the fewest fatal hazards. We have virtually no wars around high income, we have food in abundance, we enjoy reasonably good healthcare systems, so WTF? As regards low-income countries, we could claim that the adults who die are relatively the least fit for survival, but what do you want to be fit for in high-income places? Driving a Mercedes around? Why did it start to revert in 2014?
Intriguingly, high-income countries are also those where the difference in adult mortality between men and women is the most pronounced: in men it is almost double what is observable in women. Once again, it is something counter-intuitive. In low-income countries, men are more exposed to death in battle, or to extreme conditions, like work in mines. Still, in high-income countries, such hazards are remote. Once again, WTF? Someone could say: it is about natural selection, about eliminating the weak genetics. Could be, and yet not quite. Elimination of weak genetics takes place mostly through infant mortality. Once we make it like through the first 5 years of our existence, the riskiest part is over. Adult mortality is mostly about recycling used organic material (i.e. our bodies). Are human societies in high-income countries increasing the pace of that recycling? Why since 2015? Is it more urgent to recycle used men than used women?
There is one thing about 2015, precisely connected to climate change. As I browsed some literature about droughts in Europe and their possible impact on agriculture (see for example All hope is not lost: the countryside is still exposed), it turned out that 2015 was precisely the year when we started to sort of officially admit that we have a problem with agricultural droughts on our continent. Even more interestingly, 2014 and 2015 seem to have been the turning point when aggregate damages from floods, in Europe, started to curve down after something like two decades of progressive increase. We swapped one calamity for another one, and starting from then, we started to recycle used adults at more rapid a pace. Of course, most of Europe belongs to the category of high-income countries.
See? That’s what I call raw science about collective intelligence. Observation with a lot of questions and a very remote idea as for the method of answering them. Something is apparently happening, maybe we are collectively intelligent in the process, and yet we don’t know how exactly (are we collectively intelligent). It is possible that we are not. Warmer climate is associated with greater prevalence of infectious diseases in adults (Amuakwa-Mensah et al. 2017[1]), for example, and yet it does not explain why greater adult mortality is happening in high-income countries. Intuitively, infections attack where people are poorly shielded against them, thus in countries with frequent incidence of malnutrition and poor sanitation, thus in the low-income ones.
I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. You can communicate with me directly, via the mailbox of this blog: goodscience@discoversocialsciences.com. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful if you suggest two things that Patreon asks me to ask you about. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?
[1] Amuakwa-Mensah, F., Marbuah,
G., & Mubanga, M. (2017). Climate variability and infectious diseases
nexus: Evidence from Sweden. Infectious Disease Modelling, 2(2),
203-217.
[1] Feenstra, Robert C., Robert Inklaar
and Marcel P. Timmer (2015), “The Next Generation of the Penn World Table”
American Economic Review, 105(10), 3150-3182, available for download at www.ggdc.net/pwt
Once
again, I break my rhythm. Mind you, it happens a lot this year. Since January,
it is all about breaking whatever rhythm I have had so far in my life. I am
getting used to unusual, and I think it is a good thing. Now, I am breaking the
usual rhythm of my blogging. Normally, I have been alternating updates in
English with those in French, like one to one, with a pinchful of writing in my
mother tongue, Polish, every now and then. Right now, two urgent tasks require
my attention: I need to prepare new syllabuses for English-taught courses in the upcoming academic year, and to revise my draft article on the energy
efficiency of national economies.
Before
I attend to those tasks, however, a little bit of extended reflection on goals
and priorities in my life, somehow in the lines of my last update, « It might be a sign of narcissism
». I have just gotten back from Nice, France, where my son has just started his
semester of Erasmus + exchange, with the Sophia Antipolis University. In my
youth, I spent a few years in France, I went many times to France since, and
man, this time, I just felt the same, very special and very French kind of human
energy, which I remember from the 1980s. Over the last 20 years or so, the French seemed to have been sort of sleeping inside their comfort zone, but now I
can see people who have just woken up and are wondering what the hell they had
wasted so much time on, and they are taking double strides to gather speed in
terms of social change. This is the innovative, brilliant, positively cocky
France I love. There is sort of a social pattern in France: when the French get
vocal, and possibly violent, in the streets, they are up to something as a
nation. The French Revolution in 1789 was an expression of popular discontent,
yet what followed was not popular satisfaction: it was a century-long expansion on virtually all fronts: political, military, economic, scientific
etc. Right now, France is just over the top of the Yellow Vests protest, which
one of my French students devoted an essay to (see « Carl Lagerfeld and some guest
blogging from Emilien Chalancon, my student »). I
wonder who will be the Napoleon Bonaparte of our times.
When
entire nations are up to something, it is interesting. Dangerous, too, and yet
interesting. Human societies are, as a rule, the most up to something as
regards their food and energy base, and so I come to that revision of my
article. Here, below, you will find the letter of review I received from the
journal “Energy” after I submitted the initial manuscript, referenced
as Ms. Ref. No.: EGY-D-19-00258. The link to my manuscript is to be found in the
first paragraph of this update. For those of you who are making their first
steps in science, it can be an illustration of what ‘scientific dialogue’
means. Further below, you will find a first sketch of my revision, accounting
for the remarks from reviewers.
Thus,
here comes the LETTER OF REVIEW (in italic):
Ms. Ref. No.: EGY-D-19-00258
Title: Apprehending energy efficiency: what is the cognitive value of hypothetical shocks?
Journal: Energy
Dear
Dr. Wasniewski,
The
review of your paper is now complete, the Reviewers’ reports are below. As you
can see, the Reviewers present important points of criticism and a series of
recommendations. We kindly ask you to consider all comments and revise the
paper accordingly in order to respond fully and in detail to the Reviewers’
recommendations. If this process is completed thoroughly, the paper will be
acceptable for a second review.
If
you choose to revise your manuscript it will be due into the Editorial Office
by the Jun 23, 2019
Once
you have revised the paper accordingly, please submit it together with a
detailed description of your response to these comments. Please, also include a
separate copy of the revised paper in which you have marked the revisions made.
Please
note if a reviewer suggests you to cite specific literature, you should only do
so if you feel the literature is relevant and will improve your paper.
Otherwise please ignore such suggestions and indicate this fact to the handling
editor in your rebuttal.
When
submitting your revised paper, we ask that you include the following items:
Manuscript
and Figure Source Files (mandatory):
We
cannot accommodate PDF manuscript files for production purposes. We also ask
that when submitting your revision you follow the journal formatting
guidelines. Figures and tables may be embedded within the source file for the
submission as long as they are of sufficient resolution for Production. For any
figure that cannot be embedded within the source file (such as *.PSD Photoshop
files), the original figure needs to be uploaded separately. Refer to the Guide
for Authors for additional information. http://www.elsevier.com/journals/energy/0360-5442/guide-for-authors
Highlights
(mandatory):
Highlights
consist of a short collection of bullet points that convey the core findings of
the article and should be submitted in a separate file in the online submission
system. Please use ‘Highlights’ in the file name and include 3 to 5 bullet
points (maximum 85 characters, including spaces, per bullet point). See the
following website for more information
We
invite you to convert your supplementary data (or a part of it) into a Data in
Brief article. Data in Brief articles are descriptions of the data and
associated metadata which are normally buried in supplementary material. They
are actively reviewed, curated, formatted, indexed, given a DOI and freely
available to all upon publication. Data in Brief should be uploaded with your
revised manuscript directly to Energy. If your Energy research article is
accepted, your Data in Brief article will automatically be transferred over to
our new, fully Open Access journal, Data in Brief, where it will be editorially
reviewed and published as a separate data article upon acceptance. The Open
Access fee for Data in Brief is $500.
Then,
place all Data in Brief files (whichever supplementary files you would like to
include as well as your completed Data in Brief template) into a .zip file and
upload this as a Data in Brief item alongside your Energy revised manuscript.
Note that only this Data in Brief item will be transferred over to Data in
Brief, so ensure all of your relevant Data in Brief documents are zipped into a
single file. Also, make sure you change references to supplementary material in
your Energy manuscript to reference the Data in Brief article where
appropriate.
If
you have questions, please contact the Data in Brief publisher, Paige Shaklee
at dib@elsevier.com
In
order to give our readers a sense of continuity and since editorial procedure
often takes time, we encourage you to update your reference list by conducting
an up-to-date literature search as part of your revision.
On
your Main Menu page, you will find a folder entitled “Submissions Needing
Revision”. Your submission record will be presented here.
MethodsX
file (optional)
If
you have customized (a) research method(s) for the project presented in your
Energy article, you are invited to submit this part of your work as MethodsX
article alongside your revised research article. MethodsX is an independent
journal that publishes the work you have done to develop research methods to
your specific needs or setting. This is an opportunity to get full credit for
the time and money you may have spent on developing research methods, and to
increase the visibility and impact of your work.
2)
Place all MethodsX files (including graphical abstract, figures and other
relevant files) into a .zip file and
upload
this as a ‘Method Details (MethodsX) ‘ item alongside your revised Energy
manuscript. Please ensure all of your relevant MethodsX documents are zipped
into a single file.
3)
If your Energy research article is accepted, your MethodsX article will
automatically be transferred to MethodsX, where it will be reviewed and
published as a separate article upon acceptance. MethodsX is a fully Open
Access journal, the publication fee is only 520 US$.
Include
interactive data visualizations in your publication and let your readers
interact and engage more closely with your research. Follow the instructions
here: https://www.elsevier.com/authors/author-services/data-visualization to
find out about available data visualization options and how to include them
with your article.
MethodsX
file (optional)
We
invite you to submit a method article alongside your research article. This is
an opportunity to get full credit for the time and money you have spent on
developing research methods, and to increase the visibility and impact of your
work. If your research article is accepted, your method article will be
automatically transferred over to the open access journal, MethodsX, where it
will be editorially reviewed and published as a separate method article upon
acceptance. Both articles will be linked on ScienceDirect. Please use the
MethodsX template available here when preparing your article:
https://www.elsevier.com/MethodsX-template. Open access fees apply.
Reviewers’
comments:
Reviewer
#1: The paper is, at least according to the title of the paper, and attempt to
‘comprehend energy efficiency’ at a macro-level and perhaps in relation to
social structures. This is a potentially a topic of interest to the journal
community. However and as presented, the paper is not ready for publication for
the following reasons:
1.
A long introduction details relationship and ‘depth of emotional entanglement
between energy and social structures’ and concomitant stereotypes, the issue
addressed by numerous authors. What the Introduction does not show is the
summary of the problem which comes out of the review and which is consequently
addressed by the paper: this has to be presented in a clear and articulated way
and strongly linked with the rest of the paper. In simplest approach, the paper
does demonstrate why are stereotypes problematic. In the same context, it
appears that proposed methodology heavily relays on MuSIASEM methodology which
the journal community is not necessarily familiar with and hence has to be
explained, at least to the level used in this paper and to make the paper
sufficiently standalone;
2.
Assumptions used in formulating the model have to be justified in terms what
and how they affect understanding of link/interaction between social structures
and function of energy (generation/use) and also why are assumptions formulated
in the first place. Also, it is important here to explicitly articulate what is
aimed to achieve with the proposed model: as presented this somewhat comes
clear only towards the end of the paper. More fundamental question is what is
the difference between model presented here and in other publications by the
author: these have to be clearly explained.
3.
The presented empirical tests and concomitant results are again detached from
reality for i) the problem is not explicitly formulated, and ii) real-life
interpretation of results are not clear.
On
the practical side, the paper needs:
1.
To conform to style of writing adopted by the journal, including referencing;
2.
All figures have to have captions and to be referred to by it;
3.
English needs improvement.
Reviewer
#2: Please find the attached file.
Reviewer
#3: The article has a cognitive value. The author has made a deep analysis of
literature. Methodologically, the article does not raise any objections.
However, getting acquainted with its content, I wonder why the analysis does
not take into account changes in legal provisions. In the countries of the
European Union, energy efficiency is one of the pillars of shaping energy
policy. Does this variable have no impact on improving energy efficiency?
When
reading an article, one gets the impression that the author has prepared it for
editing in another journal. Editing it is incorrect! Line 13, page 10, error –
unwanted semicolon.
Now,
A FIRST SKETCH OF MY REVISION.
There
are the general, structural suggestions from the editors, notably to outline my
method of research, and to discuss my data, in separate papers. After that come
the critical remarks properly spoken, with a focus on explaining clearly – more
clearly than I did it in the manuscript – the assumptions of my model, as well
as its connections with the MUSIASEM model. I start with my method, and it is
an interesting exercise in introspection. I did the empirical research quite a
few months ago, and now I need to look at it from a distance, objectively. Doing
well at this exercise amounts, by the way, to phrasing accurately my
assumptions. I start with my fundamental variable, i.e. the so-called energy
efficiency, measured as the value of real output (i.e. the value of goods and
services produced) per unit of energy consumed, measured in kilograms of oil
equivalent. It is like: energy
efficiency = GDP / energy consumed.
In
my mind, that coefficient is actually a coefficient of coefficients, more
specifically: GDP / energy consumed = [GDP per capita] / [consumption of
energy per capita ] = [GDP / population] / [energy consumed / population ].
Why so? Well, I assume that when any of us, humans, wants to have a meal, we
generally don’t put our fingers in the nearest electric socket. We consume
energy indirectly, via the local combination of technologies. The same local
combination of technologies makes our GDP. Energy efficiency measures two ends
of the same technological toolbox: its intake of energy, and its outcomes in terms
of goods and services. Changes over time in energy efficiency, as well as its
disparity across space depend on the unfolding of two distinct phenomena: the
exact composition of that local basket of technologies, like the overall heap of
technologies we have stacked up in our daily life, for one, and the efficiency
of individual technologies in the stack, for two. Here, I remember a model I
got to know in management science, precisely about how the efficiency changes
with new technologies supplanting the older ones. Apparently, a freshly implemented,
new technology is always less productive than the one it is kicking out of
business. Only after some time, when people learn how to use that new thing
properly, does it start yielding net gains in productivity. At the end of the day,
when we change our technologies frequently, there could very well not be any
gain in productivity at all, as we are constantly going through consecutive
phases of learning. Anyway, I see the coefficient of energy efficiency at
any given time in a given place as the cumulative outcome of past collective
decisions as for the repertoire of technologies we use.
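Just to make that ‘coefficient of coefficients’ reading tangible, here is a two-line check in Python with purely hypothetical numbers (no empirical data involved):

```python
# Purely hypothetical, illustrative numbers - not empirical data
gdp = 5.0e11          # GDP in constant USD
energy = 9.0e10       # energy consumed, in kg of oil equivalent
population = 3.8e7

energy_efficiency = gdp / energy                     # GDP per kgoe
gdp_per_capita = gdp / population
energy_per_capita = energy / population

# both routes to the coefficient give the same number
assert abs(energy_efficiency - gdp_per_capita / energy_per_capita) < 1e-9
print(round(energy_efficiency, 2))                   # 5.56
```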
That
is the first big assumption I make, and the second one comes from the
factorisation: GDP / energy consumed = [GDP per capita] / [consumption of
energy per capita ] = [GDP / population] / [energy consumed / population ].
I noticed a semi-intuitive, although not really robust correlation between the
two component coefficients. GDP per capita tends to be higher in countries with
better developed institutions, which, in turn, tend to be better developed in
the presence of relatively high a consumption of energy per capita. Mind you,
it is quite visible cross-sectionally, when comparing countries, whilst not
happening that obviously over time. If people in country A consume twice as
much energy per capita as people in country B, those in A are very likely to
have better developed institutions than folks in B. Still, if in any of the two places the consumption of energy per capita grows or falls by 10%, it does not automatically mean a corresponding increase or decrease in institutional development.
Wrapping
partially up the above, I can see at least one main assumption in my method:
energy efficiency, measured as GDP per kg of oil equivalent in energy
consumed is, in itself, a pretty foggy metric, arguably devoid of intrinsic meaning; it becomes meaningful as an equilibrium of two component coefficients, namely GDP per capita, for one, and energy consumption per capita, for two. Therefore,
the very name ‘energy efficiency’ is problematic. If the vector [GDP; energy
consumption] is really a local equilibrium, as I intuitively see it, then we
need to keep in mind an old assumption of economic sciences: all equilibriums
are efficient, this is basically why they are equilibriums. Further down this
avenue of thinking, the coefficient of GDP per kg of oil equivalent shouldn’t
even be called ‘energy efficiency’, or, just in order not to fall into
pointless semantic bickering, we should put the ‘efficiency’ part in some
sort of intellectual parentheses.
Now,
I move to my analytical method. I accept as pretty obvious the fact that, at a
given moment in time, different national economies display different
coefficients of GDP per kg of oil equivalent consumed. This is coherent with
the above-phrased claim that energy efficiency is a local equilibrium rather
than a measure of efficiency strictly speaking. What gains in importance, with
that intellectual stance, is the study of change over time. In the manuscript
paper, I tested a very intuitive analytical method, based on a classical move,
namely on using natural logarithms of empirical values rather than empirical
values themselves. Natural logarithms eliminate a lot of non-stationarity and
noise in empirical data. A short reminder of what natural logarithms are is due at this point. Any number can be represented as a power of another number, like y = x^z, where ‘x’ is the base (equivalently, the z-th root of ‘y’), and ‘z’ is the exponent. Some bases are special. One of them is the so-called Euler’s number, e = 2,718281828459…, the base of the natural logarithm. When we treat e ≈ 2,72 as the base, the corresponding exponent z in y = e^z has interesting properties: it can be further decomposed as z = t*a, where t is the ordinal number of a moment in time, and a is basically a parameter. In a moment, I will explain why I said ‘basically’. The function y = e^(t*a) is called the ‘exponential function’ and proves useful in studying processes marked by important hysteresis, i.e. when each consecutive step in the process depends very strongly on the cumulative outcome of previous steps, like y(t) depends on y(t – k). Compound interest is a classic example: when you save money for years, with annual compounding of interest, each consecutive year builds upon the interest accumulated in preceding years. If we represent the interest rate, classically, as ‘r’, the function y = e^(t*r) gives a good approximation of how much your savings grow, with annually compounded ‘r’, over ‘t’ years, since e^(t*r) ≈ (1 + r)^t for small values of ‘r’.
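Just to check how close that exponential shortcut is to literal annual compounding, a quick illustration with hypothetical figures:

```python
import math

r, t, principal = 0.05, 10, 1000.0          # hypothetical rate, years, starting amount
annual = principal * (1 + r) ** t           # literal annual compounding
exponential = principal * math.exp(r * t)   # the e^(t*r) shortcut
print(round(annual, 2), round(exponential, 2))   # 1628.89 vs 1648.72
```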
Slightly
different an approach to the exponential function can be formulated, and this
is what I did in the manuscript paper I am revising
now, in front of your very eyes. The natural logarithm of
energy efficiency measured as GDP per kg of oil equivalent can be considered as
a local occurrence of change with strong a component of hysteresis. The
equilibrium of today depends on the cumulative outcomes of past equilibriums.
In a classic exponential function, I would approach that hysteresis as y(t) = e^(t*a), with a being a constant parameter of the function. Yet, I can assume that ‘a’ is local instead of being general. In other words, what I did was y(t) = e^(t*a(t)), with a(t) being obviously t-specific, i.e. local. I assume that the process of change in energy efficiency is characterized by local magnitudes of change, the a(t)’s. That a(t), in y(t) = e^(t*a(t)), is slightly akin to the local first derivative, i.e. y’(t). The difference between the local a(t) and y’(t) is that the former is supposed to capture somehow more accurately the hysteretic side of the process under scrutiny.
In
typical econometric tests, the usual strategy is to start with the empirical
values of my variables, transform them into their natural logarithms or some
sort of standardized values (e.g. standardized over their respective means, or
their standard deviations), and then run linear regression on those transformed
values. Another path of analysis consists in exponential regression, only there
is a problem with this one: it is hard to establish a reliable method of
transformation in empirical data. Running exponential regression on natural
logarithms looks stupid, as natural logarithms are precisely the exponents of
the exponential function, whence my intuitive willingness to invent a method
sort of in between linear regression, and the exponential one.
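For the record, that classic move looks like this in a toy Python version; the series is hypothetical and serves only to show the mechanics of ‘logs first, linear regression second’:

```python
import numpy as np

# hypothetical toy series: energy efficiency vs GDP per capita
eff = np.array([4.8, 5.1, 5.5, 6.0, 6.4, 7.1])          # GDP per kgoe
gdp_pc = np.array([21000, 23500, 26000, 29000, 31500, 35000])

# take natural logarithms, then run an ordinary linear regression on them
slope, intercept = np.polyfit(np.log(gdp_pc), np.log(eff), deg=1)
print(round(slope, 3), round(intercept, 3))             # elasticity-style coefficient
```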
Once
I assume that the local exponential coefficients a(t) in the exponential progression y(t) = e^(t*a(t)) have intrinsic meaning of their own, as local magnitudes of exponential change, an interesting analytical avenue opens up. For each set of empirical values y(t), I can construct a set of transformed values a(t) = ln[y(t)]/t.
Now, when you think about it, the actual a(t) depends on how you
calculate ‘t’, or, in other words, what calendar you apply. When I start
counting time 100 years before the starting year of my empirical data, my a(t)
will go like: a(t1) = ln[y(t1)]/101, a(t2)
= ln[y(t2)]/102 etc. The denominator ‘t’ will
change incrementally slowly. On the other hand, if I assume that the first year
of whatever is happening is one year before my empirical time series start, it
is a different ball game. My a(t1) = ln[y(t1)]/1, and
my a(t2) = ln[y(t2)]/2 etc.; incremental
change in denominator is much greater in this case. When I set my t0
at 100 years earlier than the first year of my actual data, thus t0 = t1 –
100, the resulting set of a(t) values transformed from
the initial y(t) data simulates a secular, slow trend of change. On
the other hand, setting t0 at t0 = t1-1 makes the resulting set
of a(t) values reflect quick change, and the t0 = t1 – 1
moment is like a hypothetical shock, occurring just before the actual empirical
data starts to tell its story.
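Here is how that double-calendar transformation can be sketched in Python, on a short, hypothetical series of energy efficiency; the point is the contrast between the two resulting sets of a(t), not the numbers themselves:

```python
import numpy as np

# hypothetical yearly values of energy efficiency (GDP per kgoe)
y = np.array([5.2, 5.3, 5.1, 5.6, 5.8, 6.0])
n = len(y)

# secular calendar: t0 set 100 years before the first observation, so t = 101, 102, ...
t_slow = np.arange(101, 101 + n)
a_slow = np.log(y) / t_slow       # slowly drifting local magnitudes of change

# shock calendar: t0 set one year before the first observation, so t = 1, 2, ...
t_fast = np.arange(1, 1 + n)
a_fast = np.log(y) / t_fast       # quickly decaying ones, as after a hypothetical shock

print(a_slow.round(4))
print(a_fast.round(4))
```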
Provisionally wrapping it up, my assumptions, and thus my method, consist in studying changes in energy efficiency as a sequence of equilibriums between relative wealth (GDP per capita), on the one hand, and consumption of energy per capita, on the other hand. The passage between equilibriums is a complex phenomenon, combining long-term trends and short-term ones.
I
am introducing a novel angle of approach to the otherwise classic concept of economics,
namely that of economic equilibrium. I claim that equilibriums are
manifestations of collective intelligence in their host societies. In order to
form an economic equilibrium, be it more local and Marshallian, or more general
and Walrasian, a society needs institutions that assure collective learning
through experimentation. They need some kind of financial market, enforceable
contracts, and institutions of collective bargaining. Small changes in energy
efficiency come out of consistent, collective learning through those
institutions. Big leaps in energy efficiency appear when the institutions of
collective learning undergo substantial structural changes.
I
am thinking about enriching the empirical part of my paper by introducing
an additional demonstration of collective intelligence: a neural network, working with
the same empirical data, with or without the so-called fitness function. I
have that intuitive thought – although I don’t know yet how to get it across
coherently – that neural networks endowed with a fitness function are good at representing
collective intelligence in structured societies with relatively well-developed institutions.
I
go towards my syllabuses for the coming academic year. Incidentally, at least
one of the curriculums I am going to teach this fall fits nicely into the line
of research I am pursuing now: collective intelligence and the use of
artificial intelligence. I am developing the thing as an update on my blog, and
I write it directly in English. The course is labelled “Behavioural
Modelling and Content Marketing”. My principal goal is to teach students the
mechanics of behavioural interaction between human beings and digital technologies,
especially in social media, online marketing and content streaming. At my
university, i.e. the Andrzej Frycz-Modrzewski Krakow University (Krakow, Poland),
we have a general drill of splitting the general goal of each course into three
layers of expected didactic outcomes: knowledge, course-specific skills, and
general social skills. The longer I do science and the longer I teach, the less
I believe in the point of distinguishing knowledge from skills. Knowledge
devoid of any skills attached to it is virtually impossible to check, and
virtually useless.
As
I think about it, I imagine many different teachers and many students. Each
teacher follows some didactic goals. How do they match each other? They are
bound to. I mean, the community of teachers, in a university, is a local social
structure. We, teachers, we have different angles of approach to teaching, and,
of course, we teach different subjects. Yet, we all come from more or less the
same cultural background. Here comes a quick glimpse of literature I will be
referring to when lecturing ‘Behavioural Modelling and Content Marketing’:
the article by Molleman and Gachter (2018[1]), entitled
‘Societal background influences social learning in cooperative decision
making’, and another one, by Smaldino (2019[2]), under
the title ‘Social identity and cooperation in cultural evolution’. Molleman and
Gachter start from the well-known assumption that we, humans, largely owe our
evolutionary success to our capacity of social learning and cooperation. They
give the account of an experiment, where Chinese people, assumed to be
collectivist in their ways, are being compared to British people, allegedly
individualist as hell, in a social game based on dilemma and cooperation. Turns
out the cultural background matters: success-based learning is associated with
selfish behaviour and majority-based learning can help foster cooperation.
Smaldino goes down more theoretical a path, arguing that the structure of society
shapes the repertoire of social identities available to homo sapiens in a given
place at a given moment, whence the puzzle of emergent, ephemeral groups as a
major factor in human cultural evolution. When I decide to form, on Facebook, a
group of people Not-Yet-Abducted-By-Aliens, is it a factor of cultural change,
or rather an outcome thereof?
When
I teach anything, what do I really want to achieve, and what does the conscious
formulation of those goals have in common with the real outcomes I reach? When
I use a scientific repository, like ScienceDirect,
that thing learns from me. When I download a bunch of articles on energy, it suggests further readings to me along the same lines. It learns from the keywords I
use in my searches, and from the journals I browse. You can even have a look at
my recent history of downloads from ScienceDirect and form your own opinion about
what I am interested in. Just CLICK HERE,
it opens an Excel spreadsheet.
How
can I know I taught anybody anything useful? If a student asks me: ‘Pardon
me, sir, but why the hell should I learn all that stuff you teach? What’s the
point? Why should I bother?’. Right you are, sir or miss, whatever gender
you think you are. The point of learning that stuff… You can think of some
impressive human creation, like the Notre Dame cathedral, the Eiffel Tower, or
that Da Vinci’s painting, Lady with an Ermine. Have you ever wondered how much
work had been put in those things? However big and impressive a cathedral is,
it had been built brick by f***ing brick. Whatever depth of colour we can see
in a painting, it came out of dozens of hours spent on sketching, mixing
paints, trying, cursing, and tearing down the canvas. This course and its
contents are a small brick in the edifice of your existence. One more small
story that makes your individual depth as a person.
There
is that thing, at the very heart of behavioural modelling, and social sciences
in general. For lack of a better expression, I call it the Bignetti model. See, for example, Bignetti 2014[3], Bignetti et al. 2017[4], or Bignetti 2018[5] for more reading. Long story short, what professor Bignetti claims is that whatever happens in observable human behaviour, individual or collective, has already happened neurologically beforehand. Whatever we tweet or whatever we read, it is rooted in that wiring we have between the ears. The
thing is that actually observing how that wiring works is still a bit
burdensome. You need a lot of technology, and a controlled environment.
Strangely enough, opening one’s skull and trying to observe the contents at
work doesn’t really work. Reverse-engineered, the Bignetti model suggests
behavioural observation, and behavioural modelling, could be a good method to guess
how our individual brains work together, i.e. how we are intelligent
collectively.
I
go back to the formal structure of the course, more specifically to goals and
expected outcomes. I split: knowledge, skills, social competences. The knowledge,
for one. I expect the students to develop the understanding of the
following concepts: a) behavioural pattern b) social life as a collection of
behavioural patterns observable in human beings c) behavioural patterns
occurring as interactions of humans with digital technologies, especially with
online content and online marketing d) modification of human behaviour as a
response to online content e) the basics of artificial intelligence, like the weak law of large numbers or the logical structure of a neural network. As for the course-specific skills, I expect my students to sharpen their edge in observing
behavioural patterns, and changes thereof in connection with online content.
When it comes to general social competences, I would like my students to
make a few steps forward on two paths: a) handling projects and b) doing
research. It logically implies that assessment in this course should and will
be project-based. Students will be graded on the grounds of complex projects,
covering the definition, observation, and modification of their own behavioural
patterns occurring as interaction with online content.
The
structure of an individual project will cover three main parts:
a) description of the behavioural sequence in question b) description of online
content that allegedly impacts that sequence, and c) the study of behavioural
changes occurring under the influence of online content. The scale of students’
grades is based on two component marks: the completeness of a student’s work,
regarding (a) – (c), and the depth of research the given student has brought up
to support his observations and claims. In Poland, in the academia, we
typically use a grading scale from 2 (fail) all the way up to 5 (very good),
passing through 3, 3+, 4, and 4+. As I see it, each student – or each team of
students, as there will be a possibility to prepare the thing in a team of up
to 5 people – will receive two component grades, like e.g. 3+ for completeness
and 4 for depth of research, and that will give (3,5 + 4)/2 = 3,75 ≈ 4,0.
Such
a project is typical research, whence the necessity to introduce students into
the basic techniques of science. That comes as a bit of a paradox, as those
students’ major is Film and Television Production, thus a thoroughly practical
one. Still, science serves in practical issues: this is something I deeply
believe and which I would like to teach my students. As I look upon those
goals, and the method of assessment, a structure emerges as regards the plan of
in-class teaching. At my university, the bulk of in-class interaction with
students is normally spread over 15 lectures of 1,5 clock hour each, thus 30
hours in total. In some curriculums it is accompanied by the so-called
‘workshops’ in smaller groups, with each such smaller group attending 7 – 8
sessions of 1,5 hour each. In this case, i.e. in the course of ‘Behavioural
Modelling and Content Marketing’, I have just lectures in my schedule.
Still, as I see it, I will need to do practical stuff with my youngsters. This
is a good moment to demonstrate a managerial technique I teach in other
classes, called ‘regressive planning’, which consists in taking the final goal I want to achieve, assuming it is the outcome of a sequence of actions, and then reverse engineering that sequence. Sort of ‘what do I need to do if I want to achieve X at the end of the day?’.
If
I want to have my students hand me good quality projects by the end of the
semester, the last few classes out of the standard 15 should be devoted to
discussing collectively the draft projects. Those drafts should be based on
prior teaching of basic skills and knowledge, whence the necessity to give
those students a toolbox, and provoke in them curiosity to rummage inside. All
in all, it gives me the following, provisional structure of lecturing:
{input = 15 classes} => {output = good quality projects by my students}
{input = 15 classes} ⇔ {input = [10 classes of preparation >> 5 classes of draft presentations and discussion thereof]}
{input = 15 classes} ⇔ {input = [5*(1 class of mindfuck to provoke curiosity + 1 class of systematic presentation) + 5*(presentation + questioning and discussion)]}
As
I see from what I have just written, I need to divide the theory accompanying
this curriculum into 5 big chunks. The first of those 5 blocks needs to
address the general frame of the course, i.e. the phenomenon of recurrent interaction
between humans and online content. I think the most important fact to highlight
is that algorithms of online marketing behave like sales people crossed with
very attentive servants, who try to guess one’s whims and wants. It is a huge
social change: it is, I think, the first time in human history when virtually every human with access to the Internet interacts with a form of intelligence that behaves like a butler, guessing the user’s preferences. It is transformational for human behaviour, and in that first block I want to show my students how that transformation can work. The opening, mindfucking class will consist in a behavioural experiment in the lines of good, old role playing in psychology. I will
demonstrate to my students how a human would behave if they wanted to emulate
the behaviour of neural networks in online marketing. I will ask them questions
about what they usually do, and about what they liked during the last few days,
and I will guess their preferences on the grounds of their described behaviour.
I will tell my students to observe that butler-like behaviour of mine and to pattern
me. In a next step, I will ask students to play the same role, just for them to
get the hang of how a piece of AI works in online marketing. The point of this
first class is to define an expected outcome, like a variable, which neural
networks attempt to achieve, in terms of human behaviour observable through clicking.
The second, theoretical class of that first block will, logically, consist in
explaining the fundamentals of how neural networks work, especially in online
interactions with human users of online content.
I
think in the second two-class block I will address the issue of behavioural
patterns as such, i.e. what they are, and how we can observe them. I want the mindfuck
class in this block to be provocative intellectually, and I think I will use
role playing once again. I will ask my students to play roles of their choice,
and I will discuss their performance under a specific angle: how do you know
that your play is representative for this type of behaviour or person? What
specific pieces of behaviour are, in your opinion, informative about the social
identity of that role? Do other students agree that the type of behaviour played
is representative for this specific type of person? The theoretical class in
this block will be devoted to systematic lecture on the basics of behaviourism.
I guess I will serve my students some Skinner, and some Timberlake, namely Skinner’s
‘Selection by Consequences’ (1981[6]), and Timberlake’s ‘Behaviour
Systems and Reinforcement’ (1993[7]).
In
the third two-class block I will return to interactions with online
content. In the mindfuck class, I will make my students meddle with YouTube,
and see how the list of suggested videos changes after we search for or click
on specific content, e.g. how it will change after clicking on 5 videos of
documentaries about wildlife, or after searching for videos on race cars. In
this class, I want my students to pattern the behaviour of YouTube. The theoretical
class of this block will be devoted to the ways those algorithms work. I think
I will focus on a hardcore concept of AI, namely the Gaussian mixture. I will
explain how crude observations on our clicking and viewing allow an algorithm
to categorize us.
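For that theoretical class, a toy example like the one below could support the lecture. It is a minimal sketch with scikit-learn, on entirely made-up viewing data; the two features and the three 'audience types' are my own assumptions for illustration, not anything YouTube discloses.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Made-up behavioural data: each row is one user,
# [share of wildlife documentaries in recent views, share of race-car videos in recent views]
rng = np.random.default_rng(42)
wildlife_fans = rng.normal(loc=[0.7, 0.1], scale=0.08, size=(50, 2))
car_fans      = rng.normal(loc=[0.1, 0.8], scale=0.08, size=(50, 2))
mixed_viewers = rng.normal(loc=[0.4, 0.4], scale=0.10, size=(50, 2))
X = np.vstack([wildlife_fans, car_fans, mixed_viewers])

# Fit a mixture of 3 Gaussian 'audience types' to the crude clicking and viewing record
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

new_user = np.array([[0.65, 0.15]])              # mostly wildlife clicks
print(gmm.predict(new_user))                     # which audience type does the model put them in?
print(np.round(gmm.predict_proba(new_user), 3))  # and how confident is that categorization?
```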
As
we will pass to the fourth two-class block, I will switch to the concept
of collective intelligence, i.e. to how whole societies interact with various
forms of online, interactive neural networks. The class devoted to intellectual
provocation will be discursive. I will make students debate the following claim:
‘Internet and online content allow our society to learn faster and more
efficiently’. There is, of course, a catch, and it is the definition of
learning fast and efficiently. How do we know we are quick and efficient in our
collective learning? What would slow and inefficient learning look like? How
can we check the role of Internet and online content in our collective
learning? Can we apply John Stuart Mill's logical canons to that situation? The
theoretical class in this block will be devoted to the phenomenon of collective
intelligence in itself. I would like to work through two research papers devoted
to online marketing, e.g. Fink
et al. (2018[8])
and Takeuchi
et al. (2018[9]),
in order to show how online marketing unfolds into phenomena of collective
intelligence and collective learning.
Good,
so I come to the fifth two-class block, the last one before the
scheduled draft presentations by my students. It is the last teaching block
before they present their projects, and I think it should bring them back to
the root idea of those projects, i.e. to the idea of observing one's own behaviour when
interacting with online content. The first class of the block, the one supposed
to stir curiosity, could consist of two steps of brainstorming and discussion.
Students take on the role of online marketers. In the first step, they define
one or two typical interactions between human behaviour and the online content
they communicate. We use the previously learnt theory to make both the description
of behavioural patterns, and that of online marketing, coherent and state-of-the-art.
In the next step, students discuss under what conditions they would behave according
to those pre-defined patterns, and what conditions would make them diverge and
follow different patterns. In the theoretical class of this block, I would
like to discuss two articles which incite my own curiosity: 'A place
for emotions in behavior systems research' by Gordon M. Burghardt (2019[10]), and 'Disequilibrium
in behavior analysis: A disequilibrium theory redux' by Jacobs et al. (2019[11]).
I
am consistently delivering good, almost new science to my readers, and love
doing it, and I am working on crowdfunding this activity of mine. You can
communicate with me directly, via the mailbox of this blog: goodscience@discoversocialsciences.com.
As we talk business plans, I remind you that you can download, from the library
of my blog, the business plan I prepared for my
semi-scientific project Befund (and you can access the French version
as well). You can also get a free e-copy of my book 'Capitalism and Political Power'.
You can support my research by donating directly, any amount you consider
appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming
my patron. If you decide so, I will be grateful for your suggestions on two things
that Patreon suggests I ask you about. Firstly, what kind of reward would you
expect in exchange for supporting me? Secondly, what kind of phases would you
like to see in the development of my research, and of the corresponding
educational tools?
[1] Molleman, L., & Gächter, S. (2018). Societal background influences social learning in cooperative decision making. Evolution and Human Behavior, 39(5), 547-555.
[2] Smaldino, P. E. (2019). Social identity and cooperation in cultural evolution. Behavioural Processes, 161, 108-116.
[3] Bignetti, E. (2014). The functional role of free-will illusion in cognition: "The Bignetti Model". Cognitive Systems Research, 31, 45-60.
[4] Bignetti, E., Martuzzi, F., & Tartabini, A. (2017). A Psychophysical Approach to Test "The Bignetti Model". Psychol Cogn Sci Open J, 3(1), 24-35.
[5] Bignetti, E. (2018). New Insights into "The Bignetti Model" from Classic and Quantum Mechanics Perspectives. Perspective, 4(1), 24.
[6] Skinner, B. F. (1981). Selection by consequences. Science, 213(4507), 501-504.
[7] Timberlake, W. (1993). Behavior systems and reinforcement: An integrative approach. Journal of the Experimental Analysis of Behavior, 60(1), 105-128.
[8] Fink, M., Koller, M., Gartner, J., Floh, A., & Harms, R. (2018). Effective entrepreneurial marketing on Facebook – A longitudinal study. Journal of Business Research.
[9] Takeuchi, H., Masuda, S., Miyamoto, K., & Akihara, S. (2018). Obtaining Exhaustive Answer Set for Q&A-based Inquiry System using Customer Behavior and Service Function Modeling. Procedia Computer Science, 126, 986-995.
[10] Burghardt, G. M. (2019). A place for emotions in behavior systems research. Behavioural Processes.
[11] Jacobs, K. W., Morford, Z. H., & King, J. E. (2019). Disequilibrium in behavior analysis: A disequilibrium theory redux. Behavioural Processes.
I
am recapitulating once again. Two things are going on in my mind: science
strictly speaking, and a technological project. As for science, I am digging
around the hypothesis that we, humans, purposefully create institutions for
experimenting with new technologies and that the essential purpose of those institutions
is to maximize the absorption of energy from the environment. I am obstinately
turning around the possible use of artificial intelligence as a tool for
simulating collective intelligence in human societies. As for technology, I am
working on my concept of « Energy Ponds ». See my update entitled « The
mind-blowing hydro » for the relatively freshest developments on that
point. So far, I have come to the conclusion that figuring out a viable financial
scheme, which would allow local communities to own local projects and adapt
them flexibly to local conditions, is just as important as working out the
technological side. Oh, yes, and there is teaching, the third thing to occupy
my mind. The new academic year starts on October 1st and I am
already thinking about the stuff I will be teaching.
I
think it is good to be honest about myself, and so I am trying to be: I have a
limited capacity for multi-tasking. Even if I do a few different things at the
same time, I need those things to be kind of convergent and similar. This is
one of those moments when a written recapitulation of what I do serves me to
put some order in what I intend to do. Actually, why not use one of the
methods I teach my students in management classes? I mean, why not use some
scholarly techniques of planning and goal setting?
Good,
so I start. What do I want? I want a monograph on the application of
artificial intelligence to study collective intelligence, with an edge towards
practical use in management. I call it 'Monograph AI in CI – Management'.
I want the manuscript to be ready by the end of October 2019. I want a
monograph on the broader topic of technological change being part of human
evolution, with the hypothesis mentioned in the preceding paragraph. This
monograph, I give it a working title: 'Monograph Technological Change and
Human Evolution'. I have no clear deadline for the manuscript. I want 2 – 3
articles on renewable energies and their application, with the same deadline as that
first monograph: end of October 2019. I want to promote and develop my idea of
"Energy Ponds" and that of local financial schemes for that type of project. I
want to present this idea in at least one article, and in at least one public
speech. I want to prepare syllabuses for my teaching, centred, precisely, on the
concept of collective intelligence, i.e. of social structures and institutions
made for experimentation and learning. In practically each of the curriculums I
teach, I want to go into the topic of collective learning.
How
will I know I have what I want? This is a control question, forcing me to give
precise form to my goals. As for monographs and articles, it is all about
preparing manuscripts on time. Each monograph should be at least 400 pages, whilst
articles should be some 30 pages long each, in manuscript form. That makes 460
– 490 pages to write (meaningfully, of course!) until the end of October, and
at least 400 more pages to write subsequently. Of course, it is not just about
hatching manuscripts: I need to have a publisher. As for teaching, I can assume
that I am somehow prepared to deliver a given line of logic when I have a
syllabus nailed down nicely. Thus, I need to rewrite my syllabuses no later
than September 25th. I can evaluate progress in the promotion of my
"Energy Ponds" concept by the feedback I get from the people whom I have informed,
or will have informed, about it.
Right,
the above is what I want technically and precisely, like in a nice schedule of
work. Now, what do I really want? I am 51; with good health and common sense,
I have some 24 – 25 productive years ahead. This is roughly the time that has
passed since my son's birth. The boy is not a boy anymore, he is walking his
own path, and what looms ahead of me is like my last big journey in life. What
do I want to do with those years? I want to feel useful, very certainly. Yes, I
think this is one clear thing about what I want: I want to feel useful. How
will I know I am useful? Weeell, that's harder to tell. As I am patiently
following the train of my thoughts, I think that I feel useful today, when I
can see that people around need me. On top of that, I want to be
financially important and independent. Wealthy? Yes, but not for comfort as
such. Right now, I am employed, and my salary is my main source of income. I
perceive myself as dependent on my employer. I want to change it so as to have
substantial income (i.e. income greater than my current spending and thus
allowing accumulation) from sources other than a salary. Logically, I need
capital to generate that stream of non-wage income. I have some – an apartment
for rent – but as I look at it critically, I would need at least 7 times more
in order to have the rent-based income I want.
Looks
like my initial, spontaneous thought of being useful means, after having
scratched the surface, being sufficiently high in the social hierarchy to be
financially independent, and able to influence other people. Anyway, as I am
having a look at my short-term goals, I ask myself how they bridge into my
long-term goals. The answer is: they don't really connect, my short-term goals
and the long-term ones. There are a lot of missing pieces. I mean, how does the
fact of writing a scientific monograph translate into multiplying by seven my
current equity invested in income-generating assets?
Now,
I want to think a bit deeper about what I do now, and I want to discover two
types of behavioural patterns. Firstly, there is probably something in what I
do, which manifests some kind of underlying, long-term ambitions or cravings in
my personality. Exploring what I do might be informative as for what I want to
achieve in that last big lap of my life. Secondly, in my current activities, I
probably have some behavioural patterns, which, when exploited properly, can
help me in achieving my long-term goals.
What
do I like doing? I like writing and reading about science. I like speaking in
public, whether it is a classroom or a conference. Yes, it might be a sign of
narcissism, still it can be used to a good purpose. I like travelling in
moderate doses. Looks like I am made for being a science writer and a science
speaker. It looks like some sort of intermediate goal, bridging from my short-term,
scheduled achievements into the long-term, unscheduled ones. I do write
regularly, especially on my blog. I speak regularly in classrooms, as my basic
job is that of an academic teacher. What I do haphazardly, and what could bring
me closer to achieving my long-term goals, would be to speak in other public
contexts more frequently and sort of regularly, and, of course, make money on
it. By the way, as far as science writing and science speaking are concerned, I have a
crazy idea: scientific stand-up. I am deeply fascinated with the art of some
stand-up comedians: Bill Burr, Gabriel Iglesias, Joe Rogan, Kevin Hart or Dave
Chappelle. Getting across deep, philosophical content about the human condition in
the form of jokes, and making people laugh when thinking about those things, is
an art I admire, and I would like to translate it somehow into the world of
science. The problem is that I don't know how. I have never done any acting in
my life, and have never written nor participated in writing any jokes for stand-up
comedy. As skillsets go, this is a complete terra incognita to me.
Now,
I jump to the timeline. I assume having those 24 years or so ahead of me. What
then, I mean, when I hopefully reach 75 years of age? Now, I can shock some of
my readers, but provisionally I label that moment, 24 years from now, as "the
decision whether I should die”. Those last years, I have been asking myself how
I would like to die. The question might seem stupid: nobody likes dying. Still,
I have been asking myself this question. I am going into deep existential
ranting, but I think what I think: when I compare my life with some accounts in
historical books, there is one striking difference. When I read letters and
memoirs of people from the 17th or 18th century, even
from the beginnings of the 20th century, those ancestors of ours
tended to ask themselves how worthy their life should be and how worthy their
death should come. We tend to ask, most of all, how long we will live. When I
think about it, that old attitude makes more sense. Over a horizon of
decades, planning for maxing out on existential value is much more rational
than trying to max out on life expectancy as such. I guess we can have much
more control over the values we pursue than the duration of our life. I know
that what I am going to say might sound horribly pretentious, but I think I
would like to die like a Viking. I mean, not necessarily trying to kill
somebody, just dying by choice, whilst still having the strength to do
something important, and doing those important things. What I am really afraid
of is slow death by instalments, when my flame dies out progressively, leaving
me just weaker and weaker every month, whilst burdening other people with taking
care of me.
I
fix that provisional checkpoint at the age of 75, 24 years from now. An
important note before I go further: I have not decided I will die at the age of
75. I suppose that would be as presumptuous as assuming to live forever. I just
give myself a rationally grounded span of 24 years to live with enough energy
to achieve something worthy. If I have more, I will just have more. Anyway, how
much can I do in 24 years? In order to plan for that, I need to recapitulate
how much I have been able to do so far, during an average year. A nicely
productive year means 2 – 3 acceptable articles, accompanied by 2 – 3 equally
acceptable conference presentations. On top of that, a monograph is
conceivable in one year. As for teaching, I can realistically do 600 – 700 hours
of public speech in one year. With that, I think I can nail down some 20
valuable meetings in business and science. In 24 years, I can write 24*550 =
13 200 pages, I can deliver 15 600 hours of public speech, and I can
negotiate something in 480 meetings or so.
Now,
as I talk about value, I can see there is something more far reaching than what
I have just named as my long-term goals. There are values which I want to
pursue. I mean, saying that I want to die like a Viking and, at the same time,
stating my long-term goals in life in terms of income and capital base: that
sounds ridiculous. I know, I know, dying like a Viking, in the times of Vikings,
meant very largely to pillage until the last breath. Still, I need values. I
think the shortcut to my values is via my dreams. What are they, my dreams? Now,
I make a sharp difference between dreams and fantasies. A fantasy is: a)
essentially unrealistic, such as riding a flying unicorn b) involving just a
small, relatively childish part of my personality. On the other hand, a dream –
such as contributing to making my home country, Poland, go 100% off fossil
fuels – is something that might look impossible to achieve, yet its achievement
is a logical extension of my present existence.
What
are they, my dreams? Well, I have just named one, i.e. playing a role in
changing the energy base of my country. What else do I value? Family,
certainly. I want my son to have a good life. I want to feel useful to other
people (that was already in my long-term goals, and so I am moving it to the
category of dreams and values). Another thing comes to my mind: I want to tell
the story of my parents. Apparently banal – lots of people do it or at least
attempt to – and yet nagging as hell. My father died in February, and around
the time of the funeral, as I was talking to family and friends, I discovered
things about my dad which I had not the faintest idea of. I started going
through old photographs and old letters in a personal album I didn’t even know
he still had. Me and my father, we were not very close. There was a lot of bad
blood between us. Still, it was my call to take care of him during the last 17
years of his life, and it was my call to care for what we call in Poland 'his
last walk', namely the one from the funeral chapel to the tomb itself. I
suddenly had a flash glimpse of the personal history, the rich, textured
biography I had in front of my eyes, visible through old images and old words, all
that in the background of the vanishing spark of life I could see in my
father’s eyes during his last days.
How
will I know those dreams and values are fulfilled in my life? I can measure
progress in my work on and around projects connected to new sources of energy. I
can measure it by observing the outcomes. When things I work on get done, this
is sort of tangible. As for being useful to other people, I go once again down
the same avenue: to me, being useful means having an unequivocally positive
impact on other people. Impact is important, and thus, in order to have that
impact, I need some kind of leadership position. Looking at my personal life
and at my dream to see my son having a good life, it comes as the hardest thing
to gauge. This seems to be the (apparently) irreducible uncertainty in my
perfect plan. Telling my parents’ story: how will I prove to myself I will have
told it? A published book? Maybe…
I
sum it up, at least partially. I can reasonably expect to deliver a certain
amount of work over the 24 years to come: approximately 13 200 pages of
written content, 15 600 hours of public speech, and 450 – 500 meetings,
until my next big checkpoint in life, at the age of 75. I would like to focus
that work on building a position of leadership, in view of bringing some change
to my own country, Poland, mostly in the field of energy. As the first stage is
to build a good reputation as a science communicator, the leadership in question
is likely to be a rather soft one. In that plan, two things remain highly
uncertain. Firstly, how should I behave in order to be as good a human being as
I possibly can? Secondly, what is the real importance of that
telling-my-parents’-story thing in the whole plan? How important is it for my
understanding of how to live well those 24 years to come? What fraction of
those 13 200 written pages (or so), should refer to that story?
Now,
I move towards collective intelligence, and to possible applications of
artificial intelligence to study the collective one. Yes, I am a scientist, and
yes, I can use myself as an experimental unit. I can extrapolate my personal
experience as the incidence of something in a larger population. The exact path
of that incidence can shape the future actions and structures of that
population. Good, so now, there is someone – anyone, in fact – who comes and
says to my face: 'Look, man, you're bullshitting yourself and people around
you! Your plans look stupid, and if attitudes like yours spread, our
civilisation will fall into pieces!'. Fair enough, that could be a valid
point. Let's check. According to the data published by the Central Statistical
Office of the Republic of Poland, in 2019, there are n = 453 390 people in
Poland aged 51, like me, 230 370 of them being men, and 232 020
women. I assume that attitudes such as my own, expressed in the preceding
paragraphs, are one type among many occurring in that population of 51-year-old
Polish people. People have different views on life and other things, so to say.
Now,
I hypothesise in two opposite directions. In Hypothesis A, I state that
just some among those different attitudes make any sense, and there is a
hypothetical distribution of those attitudes in the general population, which
yields the best social outcomes whilst eliminating, early on, all the nonsensical attitudes
from the social landscape. In other words, some worldviews are so dysfunctional
that they'd better disappear quickly and be supplanted by the more sensible
ones. Going even deeper, it means that quantitative distributions of attitudes
in the general population fall into two classes: completely haphazard, existential
accidents without much ground for staying in existence, on the one hand, and
sensible, functional attitudes, which can be sustained with benefit to all,
on the other hand. In hypothesis ~A,
i.e. the opposite of A, I speculate that the observed diversity in attitudes is
a phenomenon in itself and does not really reduce to any hypothetically better
one. It is the old argument in favour of diversity. Old as it is, it has old
mathematical foundations, and, interestingly, it is one of the cornerstones of what we
today call Artificial Intelligence.
In
Vapnik and Chervonenkis (1971[1]), a paper
reputed to be kind of seminal for today's AI, I found a reference to the
classical Bernoulli theorem, known also as the weak law of
large numbers: the relative frequency of an event A in a
sequence of independent trials converges (in probability) to the probability of
that event. Please note that roughly the same can be found in the
so-called Borel law of large numbers, named after Émile Borel.
It is deep maths: each phenomenon bears a given probability of happening, and
this probability is sort of sewn into the fabric of reality. The empirically
observable frequency of occurrence is always an approximation of this
quasi-metaphysical probability. That goes a bit against the way probability is
taught at school: it is usually about a coin – or a die – being tossed many
times etc. It implies that probability exists at all only as long as there are
things actually happening. No happening, no probability. Still, if you think
about it, there is a reason why those empirically observable frequencies tend to
be recurrent, and the reason is precisely that underlying capacity of the given
phenomenon to take place.
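Stated a bit more formally, in my own notation rather than Vapnik and Chervonenkis's, the theorem referenced above says:

```latex
% X_1, X_2, \dots, X_n are independent trials; X_i = 1 if event A occurs in trial i, 0 otherwise.
\hat{p}_n \;=\; \frac{1}{n}\sum_{i=1}^{n} X_i ,
\qquad
\forall \varepsilon > 0: \quad
\lim_{n \to \infty} P\!\left( \left| \hat{p}_n - P(A) \right| > \varepsilon \right) \;=\; 0 .
```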
Basic
neural networks, the perceptron-type ones, experiment with weights attributed
to input variables, in order to find a combination of weights which allows the
perceptron to get as close as possible to a target value. You can find descriptions
of that procedure in « Thinking
Poisson, or 'WTF are the other folks doing?' », for example. Now, we can shift
our perspective a little bit and assume that what we call 'weights' of input
variables are probabilities that a phenomenon, denoted by the given variable,
happens at all. A vector of weights attributed to input variables is a collection
of probabilities. Walking down this avenue of thinking leads me precisely to
the Hypothesis ~A, presented a few paragraphs ago. Attitudes congruous
with that very personal confession of mine, developed even more paragraphs ago,
have an inherent probability of happening, and the more we experiment, the
closer we can get to that probability. If someone says to my face that I'm an
idiot, I can reply that: a) any worldview has an idiotic side, no worries, and b) my
particular idiocy is representative of a class of idiocies for which, in turn,
civilisation needs to figure out something clever over the next few centuries.
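A minimal sketch of the procedure I am describing, in Python; the data, the target value and the sigmoid squeeze that keeps the weights inside [0, 1] are all illustrative assumptions of mine, not the exact network from « Thinking Poisson… ».

```python
import numpy as np

# Illustrative only: 4 input variables observed over 8 'experimental rounds',
# and a target value the perceptron is supposed to approach.
rng = np.random.default_rng(7)
X = rng.uniform(size=(8, 4))
target = 0.75

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
w = rng.normal(size=4) * 0.1             # raw weights, experimented with round after round

for _ in range(5000):
    p = sigmoid(w)                       # read the weights as probabilities in [0, 1]
    output = X @ p                       # aggregate response of the structure
    error = output - target
    # nudge the raw weights so that the outputs drift towards the target
    w -= 0.1 * (X.T @ error / len(X)) * p * (1.0 - p)

print(np.round(sigmoid(w), 3))           # the learned vector of 'probabilities'
print(np.round(X @ sigmoid(w), 3))       # outputs pulled towards the 0.75 target
```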
[1] Vapnik, V. N., & Chervonenkis, A. Y. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and Its Applications, 16(2), 264-280.
I am thinking about a few things, as usual, and, as usual, it is a laborious process. The first one is a big one: what the hell am I doing what I am doing for? I mean, what is the purpose and the point of applying artificial intelligence to simulating collective intelligence? There is one particular issue that I am entertaining in this regard: the experimental check. A neural network can help me in formulating very precise hypotheses as to how a given social structure can behave. Yet, these are just hypotheses. How can I have them checked?
Here
is an example. Together with a friend, I am doing some research on the
socio-economic development of big cities in Poland, in the perspective of
seeing them turn into so-called 'smart cities'. We came to an interesting
set of hypotheses generated by a neural network, but we have a tiny little
problem: we propose, in the article, a financial scheme for cities but we don’t
quite understand why we propose this exact scheme. I know it sounds idiotic,
but well: it is what it is. We have an idea, and we don’t know exactly where
that idea came from.
I
have already discussed the idea in itself on my blog, in « Locally
smart. Case study in finance.» : a local investment fund,
created by the local government, to finance local startup businesses. Business
means investment, especially at the aggregate scale and in the long run. This
is how business works: I invest, and I have (hopefully) a return on my investment.
If there is more and more private business popping up in those big Polish
cities, and, at the same time, local governments are backing off from
investment in fixed assets, let's make those business people channel capital
towards the same type of investment that local governments are withdrawing
from. What we need is an institutional scheme where local governments
financially fuel local startup businesses, and those businesses implement
investment projects.
I
am going to try and deconstruct the concept, sort of backwards. I am sketching
the landscape, i.e. the piece of empirical research that brought us to
formulating the whole idea of an investment fund paired with crowdfunding. Big Polish cities show an interesting pattern
of change: local populations, whilst largely stagnating demographically, are
becoming more and more entrepreneurial, which is observable as an increasing
number of startup businesses per 10 000 inhabitants. On the other hand, local
governments (city councils) are spending a consistently decreasing share of
their budgets on infrastructural investment. There is more and more business
going on per capita, and, at the same time, local councils seem to be slowly
backing off from investment in infrastructure. The cities we studied for
this phenomenon are: Wroclaw, Lodz, Krakow, Gdansk, Kielce, Poznan, and Warsaw.
More
specifically, the concept tested through the neural network consists in
selecting, each year, the 5% most promising local startups, and funding each
of them with €80 000. The logic behind this concept is that when a
phenomenon becomes more and more frequent – and this is the case of startups in
big Polish cities – an interesting strategy is to fish out, consistently, the
'crème de la crème' from among those frequent occurrences. It is as if we were
soccer promoters in a country where more and more young people start playing
at a competitive level. A viable strategy consists, in such a case, in
selecting, over and over again, the most promising players from the top of the
heap and promoting them further.
Thus,
in that hypothetical scheme, the local investment fund selects and supports the
most promising from amongst the local startups. Mind you, that 5% rate of
selection is just an idea. It could be 7% or 3% just as well. A number had to
be picked, in order to simulate the whole thing with a neural network, which I
present further on. The 5% rate can be seen as an intuitive transfer from the
t-Student significance test in statistics. When you test a correlation for its
significance with the t-Student test, you commonly assume that at least 95% of
all the observations under scrutiny are covered by that correlation, and you can
tolerate a 5% fringe of outliers. I suppose this is why we picked,
intuitively, that 5% rate of selection among the local startups: 5% sounds just
about right to delineate the subset of most original ideas.
Anyway,
the basic idea consists in creating a local investment fund controlled by the
local government, and this fund would provide a standard capital injection of
€80 000 to the 5% most promising local startups. The absolute number STF
(i.e. the number of financed startups) that those 5% translate into can be calculated as
STF = 5% * (N / 10 000) * ST₁₀₀₀₀, where N is the
population of the given city, and ST₁₀₀₀₀ is the
coefficient of startup businesses per 10 000 inhabitants. Just to give you
an idea of what it looks like empirically, I am presenting data for Krakow (KR, my
hometown) and Warsaw (WA, the Polish capital), in 2008 and 2017, designated,
respectively, as STF(city_acronym; 2008) and STF(city_acronym; 2017).
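Just to exercise the formula itself, a quick sketch in Python; the population and the per-10 000 coefficient below are round, hypothetical numbers, not the actual figures for Krakow or Warsaw.

```python
def financed_startups(population, startups_per_10k, selection_rate=0.05):
    """STF = selection_rate * (N / 10 000) * ST_10000."""
    return selection_rate * (population / 10_000) * startups_per_10k

# Hypothetical inputs, for illustration only:
stf = financed_startups(population=750_000, startups_per_10k=220)
print(stf)               # 825 startups selected in a given year
print(stf * 80_000)      # the corresponding yearly capital injection, in EUR
```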
That glimpse of empirics allows guessing why we
applied a neural network to that whole thing: the two core variables, namely
population and the coefficient of startups per 10 000 people, can change
with a lot of autonomy vis-à-vis each other. In the whole sample that we used
for basic stochastic analysis, thus 7 cities observed from 2008 through 2017, which equals 70
observations, those two variables are Pearson-correlated at r = 0,6267. There
is some significant correlation, and yet some 61% of the observable variance in
each of those variables (i.e. 1 – r²) doesn't give a f**k about the variance of the other
variable. The covariance of these two seems to be dominated by the variability
in population rather than by uncertainty as for the average number of startups
per 10 000 people.
What we have is quite predictable a trend of growing
propensity to entrepreneurship, combined with a bit of randomness in
demographics. Those two can come in various duos, and their duos tend to be
actually trios, ‘cause we have that other thing, which I already mentioned:
investment outlays of local governments and the share of those outlays in the
overall local budgets. Our (my friend’s and mine) intuitive take on that
picture was that it is really interesting to know the different ways those
Polish cities can go in the future, rather than setting one central model. I
mean, the central stochastic model is interesting too. It says, for example,
that the natural logarithm of the number of startups per 10 000
inhabitants, whilst negatively correlated with the share of investment
outlays in the local government's budget, is positively correlated with the
absolute amount of those outlays. The more a local government spends on fixed
assets, the more startups it can expect per 10 000 inhabitants. That latter
variable is subject to some kind of scale effects from the part of the former.
Interesting. I like scale effects. They are intriguing. They show phenomena
which change in a way akin to what happens when I heat up a pot full of water:
the more heat I supply to the water, the more different kinds of stuff can
happen. We call it an increase in the number of degrees of freedom.
The stochastically approached degrees of freedom in
the coefficient of startups per 10 000 inhabitants, you can see them in
Table 1, below. The ‘Ln’ prefix means, of course, natural logarithms. Further
below, I return to the topic of collective intelligence in this specific
context, and to using artificial intelligence to simulate the thing.
Table 1
Explained variable: Ln(number of startups per 10 000 inhabitants); R² = 0,608; N = 70

Explanatory variable                            | Coefficient of regression | Standard error | Significance level
Ln(investment outlays of the local government)  | -0,093                    | 0,048          | p = 0,054
Ln(total budget of the local government)        | 0,565                     | 0,083          | p < 0,001
Ln(population)                                  | -0,328                    | 0,090          | p < 0,001
Constant                                        | -0,741                    | 0,631          | p = 0,245
I take the correlations from Table 1, thus the
coefficients of regression from the first numerical column, and I check their
credentials with the significance level from the last numerical column. As I
want to understand them as real, actual things that happen in the cities
studied, I recreate the real values. We are talking about coefficients of
startups per 10 000 people comprised somewhere between the observable minimum ST₁₀₀₀₀
= 140 and the maximum ST₁₀₀₀₀ = 345, with the mean at ST₁₀₀₀₀ = 223. In terms
of natural logarithms, that world folds into something between ln(140) =
4,941642423 and ln(345) = 5,843544417, with the expected
mean at ln(223) = 5,407171771. The standard deviation Ω
around that mean can be reconstructed from the standard error, which is
calculated as s = Ω/√N, and, consequently, Ω = s*√N.
In this case, with N = 70, the standard deviation is Ω = 0,631*√70
= 5,279324767.
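For the record, a quick check of that arithmetic in Python (decimal commas swapped for dots, as the language requires):

```python
import math

# Re-doing the arithmetic from the paragraph above.
st_min, st_max, st_mean = 140, 345, 223
print(round(math.log(st_min), 9))   # 4.941642423
print(round(math.log(st_max), 9))   # 5.843544417
print(round(math.log(st_mean), 9))  # 5.407171771

# Standard deviation reconstructed from the standard error: s = Ω/√N  =>  Ω = s*√N
s, N = 0.631, 70
omega = s * math.sqrt(N)
print(round(omega, 9))              # ≈ 5.279324767
```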
That regression is interesting to the extent that it leads
to an absurd prediction. If the population of a city shrinks asymptotically
down to zero and, at the same time, the budget of the local government
swells up to infinity, the occurrence of entrepreneurial behaviour (number of
startups per 10 000 inhabitants) will tend towards infinity as well. There
is that nagging question: how the hell can the budget of a local government
expand when its tax base – the population – is collapsing? I am an economist,
and I am supposed to answer questions like that.
Before being an economist, I am a scientist. I ask
embarrassing questions and then I have to invent a way to give an answer. Those
stochastic results I have just presented make me think of somehow haphazard a
set of correlations. Such correlations can be called dynamic, and this, in
turn, makes me think about the swarm theory and collective intelligence (see Yang et al. 2013[1] or What
are the practical outcomes of those hypotheses being true or false?).
A social structure, for example that of a city, can be seen as a community of
agents reactive to some systemic factors, similarly to ants or bees being
reactive to pheromones they produce and dump into their social space. Ants and
bees are amazingly intelligent collectively, whilst, let’s face it, they are
bloody stupid singlehandedly. Ever seen a bee trying to figure things out in
the presence of a window? Well, not only can a swarm of bees get that s**t down
easily, but also, they can invent a way of nesting in and exploiting the
whereabouts of the window. The thing is that a bee has its nervous system
programmed to behave smartly mostly in social interactions with other bees.
I have already developed on the topic of money and
capital being a systemic factor akin to a pheromone (see Technological
change as monetary a phenomenon). Now, I am walking down this
avenue again. What if city dwellers react, through entrepreneurial behaviour –
or the lack thereof – to a certain concentration of budgetary spending from the
local government? What if the budgetary money has two chemical hooks on it – one
hook observable as ‘current spending’ and the other signalling ‘investment’ –
and what if the reaction of inhabitants depends on the kind of hook switched
on, in the given million of euros (or rather Polish zlotys, or PLN, as we are
talking about Polish cities)?
I am returning, for a moment, to the negative
correlation between the headcount of population, on the one hand, and the
occurrence of new businesses per 10 000 inhabitants. Cities – at least
those 7 Polish cities that me and my friend did our research on – are finite
spaces. Fewer people in the city means fewer people per 1 km² and vice
versa. Hence, the occurrence of entrepreneurial behaviour is negatively
correlated with the density of population. A behavioural pattern emerges. The
residents of big cities in Poland develop entrepreneurial behaviour in response
to greater a concentration of current budgetary spending by local governments,
and to lower a density of population. On the other hand, greater a density of
population or less money spent as current payments from the local budget act as
inhibitors of entrepreneurship. Mind you, greater a density of population means
greater a need for infrastructure – yes, those humans tend to crap and charge
their smartphones all over the place – whence greater a pressure on the local
governments to spend money in the form of investment in fixed assets, whence
the secondary, weaker negative correlation between entrepreneurial
behaviour and investment outlays from local budgets.
This is a general, behavioural hypothesis. Now, the
cognitive challenge consists in translating the general idea into as precise
empirical hypotheses as possible. What precise states of nature can happen in
those cities? This is when artificial intelligence – a neural network – can
serve, and this is when I finally understand where that idea of an investment fund
had come from. A neural network is good at producing plausible combinations of
values in a pre-defined set of variables, and this is what we need if we want
to formulate precise hypotheses. Still, a neural network is made for learning.
If I want the thing to make those hypotheses for me, I need to give it a
purpose, i.e. a variable to optimize, and let it learn as it optimizes.
In social sciences, entrepreneurial behaviour is
assumed to be a good thing. When people recurrently start new businesses, they
are in a generally go-getting frame of mind, and this carries over into social
activism, into the formation of institutions etc. In an initial outburst of
neophyte enthusiasm, I might program my neural network so as to optimize the
coefficient of startups per 10 000 inhabitants. There is a catch, though.
When I tell a neural network to optimize a variable, it takes the most likely
value of that variable, thus, stochastically, its arithmetical average, and it
keeps recombining all the other variables so as to have this one nailed down,
as close to that most likely value as possible. Therefore, if I want a neural
network to imagine relatively high occurrences of entrepreneurial behaviour, I
shouldn’t set said behaviour as the outcome variable. I should mix it with
others, as an input variable. It is very human, by the way. You brace for
achieving a goal, you struggle the s**t out of yourself, and you discover, with
negative amazement, that instead of moving forward, you are actually repeating
the same existential pattern over and over again. You can set your personal
compass, though, on just doing a good job and having fun with it, and then,
something strange happens. Things get done, sort of, and you have not even noticed
when and how. Goals get nailed down even without being phrased explicitly as
goals. And you are having fun with the whole thing, i.e. with life.
Same for artificial intelligence, as it is, as a
matter of fact, an
artful expression of our own, human intelligence: it
produces the most interesting combinations of variables as a by-product of
optimizing something boring. Thus, I want my neural network to optimize on
something not-necessarily-fascinating and see what it can do in terms of people
and their behaviour. Here comes the idea of an investment fund. As I have been
racking my brains in search of the place where that idea had come from, I
finally understood: an investment fund is both an institutional scheme and a
metaphor. As a metaphor, it allows decomposing an aggregate stream of
investment into a set of more or less autonomous projects, and decisions
attached thereto. An investment fund is a set of decisions coordinated in a dynamically
correlated manner: yes, there are ways and patterns to those decisions, but
there is a lot of autonomous figuring-out-the-thing in each individual case.
Thus, if I want to put functionally together those two
social phenomena – investment channelled by local governments and
entrepreneurial behaviour in local population – an investment fund is a good
institutional vessel to that purpose. Local government invests in some assets,
and local homo sapiens do the same in the form of startups. What if we mix them
together? What if the institutional scheme known as public-private partnership
becomes something practiced serially, as a local market for ideas and projects?
When we were designing that financial scheme for local
governments, my friend and I had the idea of dropping a bit of crowdfunding
into the cooking pot, and, as strange as it could seem, we are a bit confused as
to where this idea came from. Why did we think about crowdfunding? If I want
to understand how a piece of artificial intelligence simulates collective
intelligence in a social structure, I need to understand what kind of logical
connections I had projected into the neural network. Crowdfunding is sort of
spontaneous. When I am having a look at the typical conditions proposed by
businesses crowdfunded at Kickstarter
or at StartEngine,
these are shitty contracts, with all due respect. Having a Master's in law,
when I look at the contracts offered to investors in those schemes, I wouldn’t
sign such a contract if I had any room for negotiation. I wouldn’t even sign a
contract the way I am supposed to sign it via a crowdfunding platform.
There
is quite a strong body of legal and business research claiming that
crowdfunding contracts are a serious disruption to the established contractual
patterns (Savelyev 2017[2]).
Crowdfunding largely rests on so-called smart contracts, i.e. agreements
written and signed as software on Blockchain-based platforms. Those contracts
are unusually flexible, as each amendment, be it general or specific, can
be hash-coded into the history of the individual contractual relation. That
puts a large part of legal science on its head. The basic intuition of any
trained lawyer is that we negotiate the s**t out of ourselves before the signature
of the contract, thus before the formulation of general principles, and anything
that happens later is just secondary. With smart contracts, we are pretty
relaxed when it comes to setting the basic skeleton of the contract. We just
put the big bones in, and expect we are gonna make up the more sophisticated stuff
as we go along.
With
the abundant usage of smart contracts, crowdfunding platforms have peculiar
legal flexibility. Today you sign up for a discount of 10% on one Flower Turbine, in
exchange for £400 in capital crowdfunded via a smart contract. Next week, you
learn that you can turn your 10% discount on one turbine into 7% on two
turbines if you drop just £100 more into that piggy bank. Already the first step
(£400 against a discount of 10%) would be a bit hard to squeeze into
classical contractual arrangements for investing in the equity of a
business, let alone the subsequent amendment (Armour,
Enriques 2018[3]).
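I am not reproducing any real platform's code here; the sketch below is just my own toy illustration of the general mechanism, i.e. hash-coding each successive amendment into the history of one contractual relation, using the Flower Turbine numbers from the paragraph above.

```python
import hashlib, json

def add_amendment(history, amendment):
    """Append an amendment, chaining it to the hash of everything agreed so far."""
    previous_hash = history[-1]["hash"] if history else "genesis"
    record = {"amendment": amendment, "previous_hash": previous_hash}
    record["hash"] = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    history.append(record)
    return history

contract = []
add_amendment(contract, {"capital_gbp": 400, "reward": "10% discount on one turbine"})
add_amendment(contract, {"capital_gbp": 500, "reward": "7% discount on two turbines"})

for step in contract:
    print(step["hash"][:16], step["amendment"])
```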
Yet,
with a smart contract on a crowdfunding platform, anything is just a few clicks
away, and, as astonishing as it could seem, the whole thing works. The
click-based smart contracts are actually enforced and respected. People do sign
those contracts, and moreover, when I mentally step out of my academic lawyer’s
shoes, I admit being tempted to sign such a contract too. There is a specific
behavioural pattern attached to crowdfunding, something like the Russian
‘Davaj, riebiata!’ (‘Давай, ребята!’ in the original spelling). ‘Let’s do it
together! Now!’, that sort of thing. It is almost as I were giving someone the
power of attorney to be entrepreneurial on my behalf. If people in big Polish
cities found more and more startups, per 10 000 residents, it is a more
and more recurrent manifestation of entrepreneurial behaviour, and crowdfunding
touches the very heart of entrepreneurial behaviour (Agrawal
et al. 2014[4]). It is
entrepreneurship broken into small, tradable units. The whole concept we
invented is generally placed in the European context, and in Europe
crowdfunding is way below the popularity it has reached in North America (Rupeika-Apoga,
Danovi 2015[5]).
As a matter of fact, European entrepreneurs seem to consider crowdfunding as
really a secondary source of financing.
Time
to sum up a bit all those loose thoughts. Using a neural network to simulate
collective behaviour of human societies involves a few deep principles, and a
few tricks. When I study a social structure with classical stochastic tools and
I encounter strange, apparently paradoxical correlations between phenomena,
artificial intelligence may serve. My intuitive guess is that a neural network
can help in clarifying what is sometimes called ‘background correlations’ or
‘transitive correlations’: variable A is correlated with variable C through the
intermediary of variable B, i.e. A is significantly correlated with B, and B is
significantly correlated with C, but the correlation between A and C remains
insignificant.
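A minimal numerical illustration of what I mean by such transitive correlations, with synthetic data (purely illustrative, not our city dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 70                                   # same order of magnitude as our sample
B = rng.normal(size=n)                   # the intermediary variable
A = 0.6 * B + rng.normal(size=n)         # A depends on B, plus independent noise
C = 0.6 * B + rng.normal(size=n)         # C depends on B, plus independent noise

corr = np.corrcoef(np.vstack([A, B, C]))
print(np.round(corr, 2))
# A–B and B–C come out as the stronger correlations; A–C is markedly weaker
# and, depending on the sample, may well fail a significance test, even though
# the three variables are linked through B.
```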
When
I started to use a neural network in my research, I realized how important it
is to formulate very precise and complex hypotheses rather than definitive
answers. Artificial intelligence allows me to sketch alternative states of
nature quickly, by the gazillion. For a moment, I am leaving the topic of those financial
solutions for cities, and I return to my research on energy, more specifically
on energy efficiency. In a draft article I wrote last autumn, I started to
study the relative impact of the velocity of money, as well as that of the
speed of technological change, upon the energy efficiency of national
economies. Initially, I approached the thing in a
nicely, classically stochastic way. I came up
with conclusions of the type: ‘variance in the supply of money makes 7% of the
observable variance in energy efficiency, and the correlation is robust’. Good,
this is a step forward. Still, in practical terms, what does it give? Does it
mean that we need to add money to the system in order to have greater an energy
efficiency? Might well be the case, only you don’t add money to the system just
like that, ‘cause most of said money is account money on current bank accounts,
and the current balances of those accounts reflect the settlement of
obligations resulting from complex private contracts. There is no government
that could possibly add more complex contracts to the system.
Thus,
stochastic results, whilst looking and sounding serious and scientific, have only
a remote connection to practical applications. On the other hand, if I take the
same empirical data and feed it into a neural network, I get alternative states
of nature, and those states are bloody interesting. Artificial intelligence can
show me, for example, what happens to energy efficiency if a social system is
more or less conservative in its experimenting with itself. In short,
artificial intelligence allows super-fast simulation of social experiments, and
that simulation is theoretically robust.
[1] Yang, X. S., Cui, Z., Xiao, R., Gandomi, A. H., & Karamanoglu, M. (2013). Swarm intelligence and bio-inspired computation: Theory and applications.
[2] Savelyev, A. (2017). Contract law 2.0: 'Smart' contracts as the beginning of the end of classic contract law. Information & Communications Technology Law, 26(2), 116-134.
[3] Armour, J., & Enriques, L. (2018). The promise and perils of crowdfunding: Between corporate finance and consumer contracts. The Modern Law Review, 81(1), 51-84.
[4] Agrawal, A., Catalini, C., & Goldfarb, A. (2014). Some simple economics of crowdfunding. Innovation Policy and the Economy, 14(1), 63-97.
[5] Rupeika-Apoga, R., & Danovi, A. (2015). Availability of alternative financial resources for SMEs as a critical part of the entrepreneurial eco-system: Latvia and Italy. Procedia Economics and Finance, 33, 200-210.
Our artificial intelligence: the
working title of my research, for now. Volume 1: Energy and technological
change. I am doing a little bit of rummaging in available data, just to make
sure I keep contact with reality. Here comes a metric: access to electricity in the world, measured as the % of total human
population[1].
The trend line looks proudly ascending. In 2016, 87,38% of mankind had at least
one electric socket in their place. Ten years earlier, by the end of 2006, it
was 81,2%. Optimistic. It looks like something growing almost linearly. Another
one: « Electric power transmission and
distribution losses
»[2].
This one looks different: instead of a clear trend, I observe something shaking
and oscillating, with the width of variance narrowing gently down, as time
passes. By the end of 2014 (last data point in this dataset), we were globally
at 8,25% of electricity lost in transmission. The lowest coefficient of loss
occurred in 1998: 7,13%.
I move from distribution to
production of electricity, and to its percentage
supplied from nuclear power plants[3]. Still another shape, that of a
steep bell with surprisingly lean edges. Initially, it was around 2% of global
electricity supplied by the nuclear. At the peak of fascination, it was 17,6%,
and at the end of 2014, we went down to 10,6%. The thing seems to be
temporarily stable at this level. As I move to water, and to the percentage of electricity derived
from the hydro[4], I see another type of change: a
deeply serrated, generally descending trend. In 1971, we had 20,2% of our total
global electricity from the hydro, and by the end of 2014, we were at 16,24%.
In the meantime, it looked like a rollercoaster. Yet, as I am having a look at other renewables (i.e. other than
hydroelectricity) and their share in the total supply of electricity[5], the shape of the corresponding
curve looks like a snake, trying to figure something out about a vertical wall.
Between 1971 and 1988, the share of those other renewables in the total
electricity supplied moved from 0,25% to 0,6%. Starting from 1989, it is an
almost perfectly exponential growth, to reach 6,77% in 2015.
Just to have a complete picture, I
shift slightly, from electricity to energy consumption as a whole, and I check the global share of renewables
therein[6]. Surprise! This curve does not
behave at all as it is expected to behave, after having seen the previously
cited share of renewables in electricity. Instead of a snake sniffing a wall,
we can see a snake viewed from above, or something like a meandering river. This
seems to be a cycle over some 25 years (could it be Kondratiev's?), with a peak around 18% of renewables in
the total consumption of energy, and a trough somewhere around 16,9%. Right now, we
seem to be close to the peak.
I am having a look at the big, ugly
brother of hydro: the oil, gas and coal sources of
electricity and
their share in the total amount of electricity produced[7].
Here, I observe a different shape of change. Between 1971 and 1986, the fossils
dropped their share from 62% to 51,47%. Then, it rocketed back up to 62% in 1990.
Later, a slowly ascending trend started, just to reach a peak and oscillate for
a while around some 65 ÷ 67% between 2007 and 2011. Since then, the fossils have been
dropping again: the short-term trend is descending.
Finally, one of the basic metrics I
have been using frequently in my research on energy: the final consumption thereof, per
capita, measured in kilograms of oil equivalent[8]. Here, we are back in the world of
relatively clear trends. This one is ascending, with some bumps on the way,
though. In 1971, we were at 1336,2 koe per person per year. In 2014, it was
1920,655 koe.
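For anyone who wants to rummage in the same data, here is a sketch of how one of those series can be pulled from the World Bank API with plain Python; the indicator code is my best guess at the series on access to electricity quoted above, so please treat it as an assumption to verify.

```python
import requests

# Assumed indicator code for 'Access to electricity (% of population)', world aggregate (WLD).
url = ("https://api.worldbank.org/v2/country/WLD/indicator/"
       "EG.ELC.ACCS.ZS?format=json&per_page=500")
meta, rows = requests.get(url, timeout=30).json()
series = {row["date"]: row["value"] for row in rows if row["value"] is not None}
print(sorted(series.items())[-5:])   # the last few years available in the series
```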
Thus, what are all those curves
telling me? I can see three clearly different patterns. The first is the
ascending trend, observable in the access to electricity, in the
consumption of energy per capita, and, since the late 1980s, in the share of
electricity derived from renewable sources. The second is a cyclical
variation: share of renewables in the overall consumption of energy, to
some extent the relative importance of hydroelectricity, as well as that of the
nuclear. Finally, I can observe a descending trend in the relative
importance of the nuclear since 1988, as well as in some episodes from the life
of hydroelectricity, coal and oil.
On top of that, I can
distinguish different patterns in, respectively, the production of energy, on
the one hand, and its consumption, on the other hand. The former seems to
change along relatively predictable, long-term paths. The latter looks like a
set of parallel, and partly independent experiments with different sources of
energy. We are collectively intelligent: I deeply believe that. I mean, I hope.
If bees and ants can be collectively smarter than singlehandedly, there is some
potential in us as well.
Thus, I am progressively designing a
collective intelligence, which experiments with various sources of energy, just
to produce those two, relatively lean, climbing trends: more energy per capita
and ever growing a percentage of capitae with access to electricity. Which
combinations of variables can produce a rationally desired energy efficiency?
How is the supply of money changing as we reach different levels of energy
efficiency? Can artificial intelligence make energy policies? Empirical check:
take a real energy policy and build a neural network which reflects the logical
structure of that policy. Then add a method of learning and see what it
produces as a hypothetical outcome.
What is the cognitive value of
hypotheses made with a neural network? The answer to this question starts with
another question: how do hypotheses made with a neural network differ from any
other set of hypotheses? The hypothetical states of nature produced by a neural
network reflect the outcomes of logically structured learning. The process of
learning should represent real social change and real collective intelligence. The
four most important distinctions I have observed so far in this
respect are: a) awareness of internal cohesion, b) internal competition, c) relative
resistance to new information, and d) perceptual selection (different ways of
standardizing input data).
The awareness of internal cohesion, in a neural network, is a function
that feeds the information on relative cohesion (Euclidean distance) between variables
into the consecutive experimental rounds of learning. We assume that
each variable used in the neural network reflects a sequence of collective
decisions in the corresponding social structure. Cohesion between variables
represents the functional connection between sequences of collective decisions.
Awareness of internal cohesion, as a logical attribute of a neural network,
corresponds to situations when societies are aware of how mutually coherent
their different collective decisions are. The lack of such logical feedback
represents situations in which societies do not have that internal
awareness.
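For what it is worth, a bare-bones version of that cohesion feedback can be written down in a few lines of Python. The way the mean Euclidean distance gets injected back into the next round of learning is my own simplification, not necessarily the exact formula used in the research discussed here; the point is only to show the loop: decide, measure cohesion, feed it back, decide again.

```python
# Minimal sketch of "awareness of internal cohesion": after each learning round,
# the mean Euclidean distance between the dataset's variables (columns) is
# computed and fed back as an extra input in the next round.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
X = rng.uniform(0.1, 0.9, size=(30, 3))   # three standardized variables
target = X[:, 0].copy()

def mean_euclidean_distance(data):
    cols = [data[:, j] for j in range(data.shape[1])]
    return np.mean([np.linalg.norm(a - b) for a, b in combinations(cols, 2)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

weights = rng.uniform(-0.1, 0.1, size=4)   # 3 variables + 1 cohesion input
cohesion = mean_euclidean_distance(X)

for _ in range(2000):
    inputs = np.column_stack([X, np.full(len(X), cohesion)])
    h = sigmoid(inputs @ weights)
    error = target - h
    weights += inputs.T @ (error * h * (1 - h)) * 0.05
    # measure the cohesion of the data plus the network's own output,
    # and carry that observation into the next round
    cohesion = mean_euclidean_distance(np.column_stack([X, h]))

print("final cohesion signal:", round(float(cohesion), 4))
```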
As I metaphorically look around, I
ask myself what awareness I have about important collective decisions in my
local society. I can observe people’s behaviour and look for patterns in it, for one. Next
thing: I can read (very literally) the formalized, official information
regarding legal issues. On top of that, I can study (read, mostly)
quantitatively formalized information on measurable attributes of the society,
such as GDP per capita, supply of money, or emissions of CO2.
Finally, I can get semi-formalized information from what we call “media”,
whatever prefix they come with: mainstream media, social media, rebel media,
the-only-true-media etc.
As I look back upon my own life and
the changes which I have observed on those four levels of social awareness, the
fourth one, namely the media, has been, and still is, the biggest game changer. I
remember the cultural earthquake of 1990 and later, when, after decades of
state-controlled media in communist Poland, we suddenly had a free press and
complete freedom of publishing. Man! It was like one of those moments when you
step out of a calm, dark alleyway right into the middle of heavy traffic in the
street. Information just whizzed past.
There is something about media, both
those called ‘mainstream’ and the modern platforms like Twitter or YouTube:
they adapt to their audience, and the pace of that adaptation is accelerating.
With Twitter, it is obvious: when I log into my account, I can see Tweets
only from the people and organizations whom I have specifically subscribed to observe.
With YouTube, on my starting page, I can see the channels I subscribe to, for one,
and a ton of videos suggested by artificial intelligence on the grounds of what
I watched in the past. Still, the mainstream media go down the same avenue.
When I go to bbc.com, the news presented is very largely what the
editorial team hopes will max out on clicks per hour, which, in turn, is based
on the types of news that totalled the most clicks in the past. The same was
true for printed newspapers 20 years ago: the stuff that made the headlines was
the kind of stuff that made sales.
Thus, when I simulate the collective
intelligence of a society with a neural network, the function allowing the
network to observe its own internal cohesion seems akin to the presence of
media platforms. Actually, I have already observed, many times, that adding
this specific function to a multi-layer perceptron (a type of neural network)
makes that perceptron less cohesive. It looks like a paradox: observing the
relative cohesion between its own decisions makes a piece of AI less cohesive.
Still, real life confirms that observation. Social media favour the phenomenon
known as « echo chamber »: if I want, I can expose myself only to the
information that minimizes my cognitive dissonance and cut myself off from anything
that pumps my adrenaline up. On a large scale, this behavioural pattern
produces a galaxy of relatively small groups encapsulated in highly distilled,
mutually incoherent worldviews. Have you ever wondered what it would be like to use
GPS navigation to find your way in the company of a hardcore flat-Earther?
When I run my perceptron over
samples of data regarding the energy efficiency of national economies,
including the function of feedback on the so-called fitness function is largely equivalent to simulating a society
with abundant media activity. The absence of such feedback is, on the other
hand, like a society without much of a media sector.
Internal competition, in a neural network, is the deep
underlying principle for structuring a multi-layer perceptron into separate layers, and
manipulating the number of neurons in each layer. Let’s suppose I have two
neural layers in a perceptron: A and B, in this exact order. If I put three
neurons in layer A, and one neuron in layer B, the one in B will be able
to choose between the 3 signals sent from layer A. Seen from A’s
perspective, each neuron in A has to compete against the other two for the attention
of the single neuron in B. Choice on one end of a synapse equals competition on
the other end.
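The 3-versus-1 example can be written down explicitly. The softmax over the downstream neuron’s synaptic weights is my own way of making the ‘competition for attention’ visible, not a formula taken from the text above.

```python
# Minimal sketch of the 3-versus-1 example: three neurons in layer A send their
# signals to a single neuron in layer B, which weighs them against each other.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0.1, 0.9, size=3)          # one observation, three variables

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W_A = rng.uniform(-1, 1, size=(3, 3))      # layer A: three neurons
w_B = rng.uniform(-1, 1, size=3)           # layer B: one neuron

a = sigmoid(W_A @ x)                       # the three competing signals
b = sigmoid(w_B @ a)                       # B's single, "chosen" output

# Which A-neuron B listens to most (softmax over B's synaptic weights).
attention = np.exp(w_B) / np.exp(w_B).sum()
print("signals from A:", np.round(a, 3))
print("B's relative attention to each A-neuron:", np.round(attention, 3))
```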
When I want to introduce choice in a
neural network, I need to introduce internal competition as well. If any neuron
is to have a choice between processing input A and its rival, input B, there
must be at least two distinct neurons – A and B – in a functionally distinct,
preceding neural layer. In a collective intelligence, choice requires
competition, and there seems to be no way around it. In a real brain, neurons form synaptic
sequences, which means that the great majority of our neurons fire because
other neurons have fired beforehand. We very largely think because we think,
not because something really happens out there. Neurons in charge of
the early-stage collection of sensory data compete for the attention of our brain
stem, which, in turn, proposes its pre-selected information to the limbic
system, and the emotional exultation of the latter incites the cortical areas to
think about the whole thing. From there, further cortical activity happens just
because other cortical activity has been happening so far.
Let me propose a quick self-check:
think about what you are thinking right now, and ask yourself, how much of what
you are thinking about is really connected to what is happening around you. Are
you thinking a lot about the gradient of temperature close to your skin? No,
not really? Really? Are you giving a lot of conscious attention to the chemical
composition of the surface you are touching right now with your fingertips? Not
really a lot of conscious thinking about this one either? Now, how much conscious
attention are you devoting to what [fill in the blank] said about [fill
in the blank], yesterday? Quite a lot of attention, isn’t it?
The point is that some ideas die
out, in us, quickly and sort of silently, whilst others are tough survivors and
keep popping up to the surface of our awareness. Why? How does it happen? What
if there is some kind of competition between synaptic paths? Thoughts, or
components thereof, that win one stage of the competition pass to the next,
where they compete again.
Internal competition requires
complexity. There needs to be something to compete for, a next step in the
chain of thinking. A neural network with internal competition reflects a
collective intelligence with internal hierarchies that offer rewards. Interestingly,
there is research showing that greater complexity gives more optimizing
accuracy to a neural network, but only as long as we are talking about really
low complexity, like three layers of neurons instead of two. As complexity increases
further, accuracy decreases noticeably. Complexity is not the best solution
for optimization: see Olawoyin
and Chen (2018[9]).
Relative resistance to new
information corresponds to the way that an intelligent structure deals with
cognitive dissonance. In order to have any cognitive dissonance whatsoever, we
need at least two pieces of information: one that we have already appropriated
as our knowledge, and the new stuff, which could possibly disturb the placid
self-satisfaction of the I-already-know-how-things-work. Cognitive dissonance
is a potent factor of stress in human beings as individuals, and in whole
societies. Galileo would have a few words to say about it. Question: how to
represent in a mathematical form the stress connected to cognitive dissonance?
My provisional answer is: by division. Cognitive dissonance means that I
consider my acquired knowledge as more valuable than new information. If I want
to decrease the importance of B in relation to A, I divide B by a factor
greater than 1, whilst leaving A as it is. The denominator of new information
is supposed to grow over time: I am more resistant to the really new stuff than
I am to the already slightly processed information, which was new yesterday. In
a more elaborate form, I can use an exponential progression (see « The really textbook-textbook
exponential growth »).
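A minimal illustration of that division: new information gets divided by a denominator that grows exponentially with the age of the learning process. The rate k below is an arbitrary assumption, chosen only to show the shape of the discount.

```python
# Minimal sketch of "resistance to new information": incoming values are divided
# by a denominator that grows exponentially over time.
import numpy as np

def dampen(new_info, t, k=0.05):
    """Divide new information by e^(k*t): the further the learning process has
    advanced, the more it discounts genuinely new input."""
    return new_info / np.exp(k * t)

for t in [0, 10, 50, 100]:
    print(t, round(float(dampen(1.0, t)), 4))
# 0 -> 1.0, 10 -> ~0.6065, 50 -> ~0.0821, 100 -> ~0.0067
```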
I noticed an interesting property of
the neural network I use for studying energy efficiency. When I introduce
choice, internal competition and hierarchy between neurons, the perceptron gets
sort of wild: it produces increasing error instead of decreasing error, so it
basically learns how to swing more between possible states, rather than how to
narrow its own trial and error down to one recurrent state. When I add a
pinch of resistance to new information, i.e. when I purposefully create
stress in the presence of cognitive dissonance, the perceptron calms down a
bit, and can produce a decreasing error.
Selection of information can occur
already at the level of primary perception. I elaborated on this one in « Thinking Poisson, or ‘WTF are the other
folks doing?’ ».
Let’s suppose that new science arrives regarding how to use particular sources of
energy. We can imagine two scenarios of reaction to that new science. On the
one hand, the society can react in a perfectly flexible way, i.e. each new
piece of scientific research gets evaluated for its real utility in energy
management, and gets smoothly included into the existing body of technologies.
On the other hand, the same society (well, not quite the same, an alternative
one) can sharply sort those new pieces of science into ‘useful stuff’
and ‘crap’, with little nuance in between.
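Those two scenarios can be phrased as two ways of standardizing the same input. The utility scores and the 0.5 cut-off below are arbitrary illustrations, not empirical values.

```python
# Minimal sketch of the two reactions to new science, as two standardizations of
# the same input: a smooth rescaling that keeps every nuance, versus a hard
# threshold that sorts everything into 'useful stuff' (1) and 'crap' (0).
import numpy as np

new_science = np.array([0.12, 0.37, 0.49, 0.51, 0.88])   # hypothetical utility scores

flexible = (new_science - new_science.min()) / (new_science.max() - new_science.min())
binary = (new_science > 0.5).astype(float)

print("flexible society sees:", np.round(flexible, 2))
print("sharp-edged society sees:", binary)
```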
What do we know about collective
learning and collective intelligence? Three essential traits come to my mind. Firstly, we make
social structures, i.e. recurrent combinations of social relations, and those
structures tend to be quite stable. We like having stable social structures. We
almost instinctively create rituals, rules of conduct, enforceable contracts
etc., thus we make stuff that is supposed to make the existing stuff last. An
unstable social structure is prone to wars, coups etc. Our collective
intelligence values stability. Still, stability is not the same as perfect
conservatism: our societies have imperfect recall. This is the second important
trait. Over (long periods of) time, we collectively shake off old
rules of social games and replace them with new ones, and we do it without disturbing the
fundamental social structure. In other words: stable as they are, our social
structures have mechanisms of adaptation to new conditions, and yet those
mechanisms require forgetting something about our past. OK, not just
something: we collectively forget a shitload of it. Thirdly, there have
been many local human civilisations, and each of them eventually collapsed,
i.e. their fundamental social structures disintegrated. The civilisations
we have made so far had a limited capacity to learn. Sooner or later, they
would bump against a challenge which they were unable to adapt to. The
mechanism of collective forgetting and shaking off had, in every historically
documented case, a limited efficiency.
I intuitively guess that simulating
collective intelligence with artificial intelligence is likely to be the most
fruitful when we simulate various capacities to learn. I think we can model
something like a perfectly adaptable collective intelligence, i.e. the one
which has no cognitive dissonance and processes information uniformly over
time, whilst having a broad range of choice and internal competition. Such a
neural network behaves in the opposite way to what we tend to associate with
AI: instead of optimizing and narrowing down the margin of error, it creates
new alternative states, possibly in a broadening range. This is a collective
intelligence with lots of capacity to learn, but little capacity to steady
itself as a social structure. From there, I can muzzle the collective
intelligence with various types of stabilizing devices, making it progressively
more structure-making and less flexible. Down that avenue lies the
solver type of artificial intelligence, i.e. a neural network that just solves
a problem with one, temporarily optimal solution.
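If I were to sum up that spectrum as configuration switches for one and the same perceptron, it could look like the sketch below. The switch names and the two presets are my own shorthand for the mechanisms discussed above, not code actually used in the research described here.

```python
# Minimal sketch of the spectrum between a perfectly adaptable collective
# intelligence and a muzzled, solver-type one, as configuration switches.
from dataclasses import dataclass

@dataclass
class CollectiveIntelligenceConfig:
    cohesion_feedback: bool      # does the structure observe its own coherence (media)?
    hidden_layers: int           # internal competition and hierarchy
    resistance_rate: float       # 0.0 = no resistance to new information at all
    hard_perception: bool        # binary standardization of new input

# Perfectly adaptable: broad choice and competition, no resistance -- it keeps
# generating alternative states instead of narrowing the error down.
adaptable = CollectiveIntelligenceConfig(cohesion_feedback=True, hidden_layers=3,
                                         resistance_rate=0.0, hard_perception=False)

# Muzzled, solver-type: stabilizing devices switched on -- it converges on one,
# temporarily optimal solution.
solver = CollectiveIntelligenceConfig(cohesion_feedback=False, hidden_layers=1,
                                      resistance_rate=0.1, hard_perception=True)

print(adaptable)
print(solver)
```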