The possible Black Swans

I am re-digesting, like a cow, some of the intellectual food I figured out recently. I return to the specific strand of my research to be found in the unpublished manuscript ‘The Puzzle of Urban Density And Energy Consumption’, and I want to rummage a bit inside one specific issue, namely the meaning which I can attach to the neural activation function in the quantitative method I use.

Just to give a quick sketch of the landscape, I work through a general hypothesis that our human civilization is based on two factories: the factory of food in the countryside, and the factory of new social roles in cities. The latter produces new social roles by creating demographic anomalies, i.e. by packing humans tightly together, in abnormally high density. Being dense together makes us interact more with each other, which, whilst not always pleasant, stimulates our social brains and makes us figure out new interesting s**t, i.e. new social roles.

I made a metric of density in population, which is a coefficient derived from the available data of the World Bank. I took the coefficient of urbanization (World Bank 1[1]), and I multiplied it by the headcount of population (World Bank 4[2]). This is how I got the number of people living in cities. I divided it by the surface of urban land (World Bank 2[3]), and I got the density of population in cities, which I further label as ‘DU’. Further, I gather that the social difference between cities and the countryside, hence the relative impact of cities as breeding ground for new social roles, is determined by the difference in the depth of demographic anomalies created by the urban density of population. Therefore, I took the just-calculated coefficient DU and I divided it by the general density of population, or ‘DG’ (World Bank 5[4]). This is how I ended up with the coefficient ‘DU/DG’, which, mathematically, denominates the density of urban population in units of general density in population.
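The arithmetic behind DU/DG can be sketched in a few lines of Python. This is just an illustration of the steps described above, not the spreadsheet I actually used: the function name, the argument names and the numbers in the example are made up, and I assume the coefficient of urbanization comes as a percentage.

```python
def density_ratio(urban_share_pct, population, urban_land_km2, general_density):
    """Return (DU, DU/DG) from four World Bank style inputs.

    urban_share_pct  - urban population as % of total (assumed to be a percentage)
    population       - total headcount
    urban_land_km2   - surface of urban land, in km2
    general_density  - people per km2 of land in general (DG)
    """
    urban_population = urban_share_pct / 100.0 * population
    du = urban_population / urban_land_km2   # density of population in cities
    return du, du / general_density          # DU and DU/DG

# Made-up numbers, for illustration only (not taken from the dataset)
du, ratio = density_ratio(
    urban_share_pct=60.0,
    population=10_000_000,
    urban_land_km2=3_000.0,
    general_density=50.0,
)
print(du, ratio)
```

With those invented inputs, cities come out roughly 40 times denser than the general landscape, which happens to be the order of magnitude of the empirical mean DU/DG.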

I simulate an artificial reality, where we, humans, optimize the coefficient ‘DU/DG’ as our chief collective orientation. We just want to get it right. Enough human density in cities to be creative, and yet enough space for each human being able to practice mindfulness when taking a #2 in the toilet. We optimize ourselves being dense together in cities on the basis of 7 input characteristics of ours, namely:

Population – this is a typical scale variable. The intuition behind it is that size matters, and that’s why in most socio-economic research, when we really mean business in quantitative terms, we add such variables, pertinent to the size of the social entity studied. Urbanization occurring in a small country, like Belgium (with all my due respect for Belgians), is likely to occur differently from urbanization in India or in the U.S. In this specific case, I assume that a big population, like hundreds of millions of people, has to move more resources around to accommodate people in cities, as compared to a population counted in dozens of millions.  
Urban population absolute – same tune, a scale variable, more specifically pertinent to the headcount of urban populations.   
Gross Domestic Product (GDP, constant 2010 US$) – scale variable, once again, but this time it is about the real output of the economy. In my approach, the GDP is not exactly a measure of the wealth produced, but more of an appraisal of total productive activity in the humans living around. This is why I use constant prices. That shaves off the price-and-relative-wealth component, and leaves GDP as a metric pertinent to how much tradable surpluses do humans create in a given place and time.  
Broad money (% of GDP) – this is essentially the opposite to the velocity of money, and it corresponds to another strand in my research. I discovered and I keep studying the fact that in the presence of quick technological change, human societies stuff themselves up with abnormally high amounts of cash (or cash equivalents, for that matter). It holds for entire countries as well as for individual businesses. You can find more on that in my article ‘Technological change as a monetary phenomenon’. I guess that when humans make more new social roles in cities, technologies change faster.            
Energy use (kg of oil equivalent per capita) – this is one of the fundamental variables I frequently work with. I guess I included it in this particular piece of research just in case, in order to be able to connect with my research on the market of energy.  
Agricultural land (km2) – the surface of agricultural land available is a logical correlate of urban population. A given number of people in cities need a given amount of food, which, in turn, can be provided by a given surface of agricultural land.            
Cereal yield (kg per hectare) – logically complementary to the surface of agricultural land. Yield per hectare in France is different from what an average hectare can contribute in Nigeria, and that is likely to be correlated with urbanization.  

You can get the raw data I used UNDER THIS LINK. It covers Australia, Brazil, Canada, China, Colombia, France, Gabon, Germany, Ghana, India, Malaysia, Mexico, Mozambique, Namibia, New Zealand, Nigeria, Norway, Poland, Russian Federation, United Kingdom, and the United States. All that lot observed over the window in time stretching from 1961 all the way to 2015.

I make that data into a neural network, which means that I make h(tj) = x1(tj)*R*E[x1(tj-1)] + x2(tj)*R*E[x2(tj-1)] + … + xn(tj)*R*E[xn(tj-1)], as explained in my update titled ‘Representative for collective intelligence’, with x1, x2,…, x7 input variables described above, grouped in 21 social entities (countries), and spread over 2015 – 1961 = 54 years. After the curation of data for empty cells, I have m = 896 experimental rounds in the (alleged) collective intelligence, whose presence I guess behind the numbers. I made that lot learn how to squeeze the partly randomized input, controlled for internal coherence, into the mould of the desired output of the coefficient xo = DU/DG. I ran the procedure of learning with 4 different methods of estimating the error of optimization. Firstly, I computed that error the way we do it in basic statistics, namely e1 = xo – h(tj). The mixed-up input is simply subtracted from expected output. In the background, I assume that the local output xo is an expected value in statistical terms, i.e. it is the mean value of some hypothetical Gaussian distribution, local and specific to that concrete observation. With that approach to error, there is no neural activation as such. It is an autistic neural network, which does not discriminate input as for its strength. It just reacts.

As I want my collective intelligence to be smarter than your average leech, I make three more estimations of errors, with the input h(tj) passing through a neural activation function. I start with the ReLU rectifier, AKA max[0, h(tj)], and, correspondingly, with e2 = xo – ReLU[h(tj)]. Then I warm up, and I use neural activation via hyperbolic tangent tanh(h) = (e^(2h) – 1) / (e^(2h) + 1), and I compute e3 = xo – tanh[h(tj)]. The hyperbolic tangent is a transcendental function, which squeezes any input into the bounded interval between -1 and 1. Once the input gets large in absolute value, the output saturates and stops following the input in any linear manner. Neural activation with hyperbolic tangent thus projects input into a separate space of states, only loosely correlated with that input, like cultural transformation of cognitive input into symbols, ideologies and whatnot. Fourthly and finally, I use the sigmoid function (AKA logistic function) sig(h) = 1 / (1 + e^(-h)), which can be read as smoothed likelihood that something happens, i.e. that input h(tj) has full power. The corresponding error is e4 = xo – sig[h(tj)].
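The four ways of computing the error can be put side by side in Python. This is a minimal sketch, assuming that h(tj) is already computed and handed over as a NumPy array. The function names and the sample values of h and xo are mine, made up for illustration.

```python
import numpy as np

def relu(h):
    # ReLU rectifier: max[0, h]
    return np.maximum(0.0, h)

def sigmoid(h):
    # Logistic function: 1 / (1 + e^(-h))
    return 1.0 / (1.0 + np.exp(-h))

def errors(x_o, h):
    """Return the four errors e1..e4 for expected output x_o and input h."""
    return {
        "e1": x_o - h,            # no neural activation
        "e2": x_o - relu(h),      # ReLU
        "e3": x_o - np.tanh(h),   # hyperbolic tangent
        "e4": x_o - sigmoid(h),   # sigmoid
    }

# Made-up values of the aggregate input h(tj) and of the expected output xo
h = np.array([-2.0, 0.0, 0.5, 3.0])
x_o = 0.8
for name, e in errors(x_o, h).items():
    print(name, np.round(e, 3))
```

Running it shows the basic property of each activation: ReLU cuts off negative input, whilst tanh and sigmoid keep their output bounded, whatever the size of h.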

From there, I go my normal way. I create 4 artificial realities out of my source dataset. Each of these realities assumes that humans strive to nail down the right social difference between cities and the countryside, as measured with the DU/DG coefficient. Each of these realities is generated with a different way of appraising how far we are from the desired DU/DG, this with four different ways of computing the error: e1, e2, e3, and e4.  The expected states of both the source empirical dataset, and sets representative for those 4 alternative realities, are given by their respective vectors of mean values, i.e. mean DU/DG, mean population etc. Those vectors of means are provided in Table 1 below. The source dataset shows a mean DU/DG = 41,14, which means that cities in this dataset display, on average across countries, 41 times greater a density of population than the general density of population. Mean empirical population is 149,6 million people, with mean urban population being 67,34 million people. Yes, we have China and India in the lot, and they really pump those scale numbers up.

Table 1 – Vectors of mean values in the source empirical set and in the perceptrons simulating alternative realities, optimizing the coefficient DU/DG

Columns 3 to 6: perceptrons pegged on DU/DG. Values in parentheses are negative.

| Variable | Source dataset | error = xo – h | error = xo – ReLU(h) | error = xo – tanh(h) | error = xo – sigmoid(h) |
|---|---|---|---|---|---|
| DU/DG | 41,14 | 36,38 | 4,91 | 61,56 | 324,29 |
| Population | 149 625 587,07 | 125 596 355,00 | (33 435 417,00) | 252 800 741,00 | 1 580 356 431,00 |
| GDP (constant 2010 US$) | 1 320 025 624 972,08 | 1 025 700 000 000,00 | (922 220 000 000,00) | 2 583 780 000 000,00 | 18 844 500 000 000,00 |
| Broad money (% of GDP) | 57,50 | 54,13 | 31,80 | 71,99 | 258,38 |
| Urban population absolute | 67 349 480,42 | 54 311 459,20 | (31 977 590,00) | 123 331 287,00 | 843 649 729,00 |
| Energy use (kg of oil equivalent per capita) | 2 918,69 | 2 769,76 | 1 784,11 | 3 558,16 | 11 786,15 |
| Agricultural land (km2) | 1 227 301,86 | 1 135 064,25 | 524 611,51 | 1 623 345,71 | 6 719 245,69 |
| Cereal yield (kg per hectare) | 3 153,31 | 3 010,54 | 2 065,68 | 3 766,31 | 11 653,77 |

One of the first things which jumps to the eye in Table 1 – at least to my eye – is that one of the alternative realities, namely that based on the ReLU activation function, is an impossible reality. There are negative populations in this one, and this is not a livable state of things. I don’t know about you, my readers, but I would feel horrible knowing that I am a minus. People can’t be negative by default. By the way, in this specific dataset, the error computed with ReLU looks almost identical to the basic difference e1 = xo – h(tj). Yet the alternative reality made with no neural transformation of quasi-randomized input, thus with e1 = xo – h(tj), comes out pretty close to the original empirics, whilst the one made with ReLU does not.

Another alternative reality which looks sort of sketchy is the one based on neural activation via the sigmoid function. This one transforms the initial mean expected values into their several-times-multiples. Looks like the sigmoid is equivalent, in this case, to powering the collective intelligence of societies studied with substantial doses of interesting chemicals. That particular reality is sort of a wild dream, like what it would be like to produce almost 4 times more cereal yield per hectare, having more than 4 times more agricultural land, and over 10 times more people in cities. The surface of available land being finite as it is, 4 times more agricultural land and 10 times more people in cities would mean cities tiny in terms of land surface, probably all in height, both under and above ground, with those cities being 324 times denser with humans than the general landscape. Sounds familiar, a bit like sci fi movies.  

Four different ways of pitching input variables against the expected output of optimal DU/DG coefficient produce four very different alternative realities. Out of these four, one is impossible, one is hilarious, and we stay with two acceptable ones, namely that based on no proper neural activation at all, and the other one using the hyperbolic tangent for assessing the salience of things. Interestingly, errors estimated as e1 = xo – h(tj) are essentially correlated with the input variables, whilst those assessed as e3 = xo – tanh[h(tj)] are essentially uncorrelated. It means that in the former case one can more or less predict how satisfied the neural network will be with the local input, and that prediction can be reliably made a priori. In the latter case, with the hyperbolic tangent, there is no way to know in advance. In this case, neural activation is a true transformation of reality.

Table 2 below provides the formal calculation of standardized Euclidean distance between all the 4 alternative realities and the real world of tears we live in. By standardized Euclidean I mean: E = {[(meanX – meanS)^2]^0,5} / meanX. The ‘/ meanX’ part means that I divide the basic Euclidean distance by the mean value which serves me as benchmark, i.e. the empirical one. That facilitates subsequent averaging of those variable-specific Euclidean distances into one metric of mathematical similarity between entire vectors of values.
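For those who prefer code to formulas, here is a short Python sketch of that standardized distance. It takes three of the rounded means from Table 1 (DU/DG, broad money and energy use), so the results match my full-precision calculation only approximately. The function name is mine.

```python
import numpy as np

def standardized_distance(mean_x, mean_s):
    """Variable-by-variable distance sqrt((meanX - meanS)^2) / meanX.

    The square root of a square is just the absolute difference, here
    divided by the empirical mean, which serves as benchmark.
    """
    mean_x = np.asarray(mean_x, dtype=float)
    mean_s = np.asarray(mean_s, dtype=float)
    return np.sqrt((mean_x - mean_s) ** 2) / mean_x

# Rounded means from Table 1: DU/DG, broad money (% of GDP), energy use
source = [41.14, 57.50, 2918.69]
no_activation = [36.38, 54.13, 2769.76]   # perceptron with error = xo - h

d = standardized_distance(source, no_activation)
print(np.round(d, 4), round(float(d.mean()), 4))
```

Averaging the variable-specific distances, as in the last line, is what gives one number of similarity per alternative reality.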

Table 2 – Vectors of standardized Euclidean distances between the source set X and the perceptrons simulating alternative realities, optimizing the coefficient DU/DG

| Variable | error = xo – h | error = xo – ReLU(h) | error = xo – tanh(h) | error = xo – sigmoid(h) |
|---|---|---|---|---|
| DU/DG | 0,115597874 | 0,88065496 | 0,496346621 | 6,882843342 |
| Population | 0,160595741 | 1,223460557 | 0,68955555 | 9,562073386 |
| GDP (constant 2010 US$) | 0,22296963 | 1,698637953 | 0,957371093 | 13,27585923 |
| Broad money (% of GDP) | 0,058672324 | 0,446981172 | 0,251923403 | 3,493424228 |
| Urban population absolute | 0,193587555 | 1,474800842 | 0,831213637 | 11,52644748 |
| Energy use (kg of oil equivalent per capita) | 0,051026181 | 0,388730845 | 0,219092892 | 3,038163202 |
| Agricultural land (km2) | 0,075154787 | 0,572548914 | 0,322694736 | 4,474810971 |
| Cereal yield (kg per hectare) | 0,045275 | 0,344916834 | 0,194398841 | 2,695730596 |
| Average | 0,115359886 | 0,87884151 | 0,495324596 | 6,868669054 |

Interestingly, whilst the alternative reality based on neural activation through the ReLU function creates impossibly negative populations, its overall Euclidean distance from the source dataset is not as big as it could seem. The impossible alternative is specific just to some variables.

Now, what does it all have to do with anything? How is that estimation of error representative for collective intelligence in human societies? Good question. I am doing my best to give some kind of answer to it. Quantitative socio-economic variables represent valuable collective outcomes, and thus are informative about alternative orientations in collective action. The process of learning how to nail those valuable outcomes down consumes said orientation in action. Assuming that figuring out the right proportion of demographic anomaly in cities, as measured with DU/DG, is a valuable collective outcome, four collective orientations thereon have been simulated. One goes a bit haywire (negative populations), and yet it shows a possible state of society which attempts to sort of smooth out the social difference between cities and the countryside, with DU/DG being ten times lower than reality. Another one goes fantasque, with huge numbers and a slightly sci-fi-ish shade. The remaining two look like realistic alternatives, one essentially predictable with e1 = xo – h(tj), and another one essentially unpredictable, with e3 = xo – tanh[h(tj)].

I want my method to serve as a predictive tool for sketching the possible scenarios of technological change, in particular as regards the emergence and absorption of radically new technologies. On the other hand, I want my method to be of help when it comes to identifying the possible Black Swans, i.e. the rather unlikely and yet profoundly disturbing states of nature. As I look at those 4 alternative realities my perceptron has just made up (it’s not me, it’s him! Well, it…), I can see two Black Swans. The one made with the sigmoid activation function shows a possible direction which, for example, African countries could follow, should they experience rapid demographic growth. This particular Black Swan is a hypothetical situation, when population grows like hell. This automatically puts enormous pressure on agriculture. More people need more food. More agriculture requires more space and there is less left for cities. Still, more people around need more social roles, and we need to ramp up the production thereof in very densely packed urban populations, where the sheer density of human interaction makes our social brains just race for novelty. This particular Black Swan could be actually a historical reconstruction. It could be representative for the type of social change which we know as civilisational revival: passage from the nomad life to the sedentary one, like a dozen thousand years ago, reconstruction of social tissue after the fall of the Western Roman Empire in Europe, that sort of stuff.

Another Black Swan is made with the ReLU activation function and simulates a society, where cities lose their function as factories of new social roles. It is the society in downsizing. It is actually a historical reconstruction, too. This is what must have happened when the Western Roman Empire was collapsing, and before the European civilization bounced back.

Well, well, well, that s**t makes sense… Amazing.


[1] World Bank 1: https://data.worldbank.org/indicator/SP.URB.TOTL.IN.ZS

[2] World Bank 4: https://data.worldbank.org/indicator/SP.POP.TOTL

[3] World Bank 2: https://data.worldbank.org/indicator/AG.LND.TOTL.UR.K2

[4] World Bank 5: https://data.worldbank.org/indicator/EN.POP.DNST

We haven’t nailed down all our equations yet

As I keep digging into the topic of collective intelligence, and my research thereon with the use of artificial neural networks, I am making a list of key empirical findings that pave my way down this particular rabbit hole. I am reinterpreting them with the new understandings I have from translating my mathematical model of artificial neural network into an algorithm. I am learning to program in Python, which comes sort of handy given I want to use AI. How could I have made and used artificial neural networks without programming, just using Excel? You see, that’s Laplace and his hypothesis that mathematics represent the structure of reality (https://discoversocialsciences.com/wp-content/uploads/2020/10/Laplace-A-Philosophical-Essay-on-Probabilities.pdf ).

An artificial neural network is a sequence of equations which interact, in a loop, with a domain of data. Just as any of us, humans, essentially. We just haven’t nailed down all of our own equations yet. What I can do and have done with Excel was to understand the structure of those equations and their order. This is a logical structure, and as long as I don’t give it any domain of data to feed on, it stays put.

When I feed data into that structure, it starts working. Now, with any set of empirical socio-economic variables I have worked with, so far, there are always 1 – 2 among them which are different from others as output. Generally, my neural network works differently according to the output variable I make it optimize. Yes, it is the output variable, supposedly being the desired outcome to optimize, and not the input variables treated as instrumental in that view, which makes the greatest difference in the results produced by the network.

That seems counterintuitive, and yet this is like the most fundamental common denominator of everything I have found out so far: the way that a simple neural network simulates the collective intelligence of human societies seems to be conditioned most of all by the variables pre-set as the output of the adaptation process, not by the input ones. Is it a sensible conclusion regarding collective intelligence in real life, or is it just a property of the data? In other words, is it social science or data science? This is precisely one of the questions which I want to answer by learning programming.

If it is a pattern of collective human intelligence, that would mean we are driven by the orientations pursued much more than by the actual perception of reality. What we are after would be more important a differentiating factor of our actions than what we perceive and experience as reality. Strangely congruent with the Interface Theory of Perception (Hoffman et al. 2015[1], Fields et al. 2018[2]).

As it is some kind of habit in me, in the second part of this update I put the account of my learning how to program and do Data Science in Python. This time, I wanted to work with hard cases of CSV import, like trouble files. I want to practice data cleansing. I have downloaded the ‘World Economic Outlook October 2020’ database from the website https://www.imf.org/en/Publications/WEO/weo-database/2020/October/download-entire-database . Already when downloading, I could notice that the announced format is ‘TAB delimited’, not ‘Comma Separated’. It downloads as Excel.

To start with, I used the https://anyconv.com/tab-to-csv-converter/ website to do the conversion. In parallel, I tested two other ways:

  1. opening in Excel, and then saving as CSV
  2. opening with Excel, converting to *.TXT, importing into Wizard for MacOS (statistical package), and then exporting as CSV.

What I can see like right off the bat are different sizes in the same data, technically saved in the same format. The AnyConv-generated CSV is 12,3 MB, the one converted through Excel is 9,6 MB, and the last one, filtered through Excel to TXT, then to Wizard and to CSV makes 10,1 MB. Intriguing.

I open JupyterLab online, and I create a Python 3-based Notebook titled ‘Practice 27_11_2020_part2’.

I prepare the Notebook by importing Numpy, Pandas, Matplotlib and OS. I do:

>> import numpy as np
>> import pandas as pd
>> import matplotlib.pyplot as plt
>> import os

I upload the AnyConv version of the CSV. I make sure to have the name of the file right by doing:

>> os.listdir()


…and I do:

>> WEO1=pd.DataFrame(pd.read_csv('AnyConv__WEOOct2020all.csv'))

Result:

/srv/conda/envs/notebook/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3072: DtypeWarning: Columns (83,85,87,89,91,93,95,98,99,102,103,106,107,110,111,114,115,118,119,122,123,126,127,130,131,134,135,138,139,142,143,146,147,150,151,154,155,158) have mixed types. Specify dtype option on import or set low_memory=False.

  interactivity=interactivity, compiler=compiler, result=result)

As I have been told, I add the “low_memory=False” option to the command, and I retype:

>> WEO1=pd.DataFrame(pd.read_csv('AnyConv__WEOOct2020all.csv', low_memory=False))

Result: the file is apparently imported successfully. I investigate the structure.

>> WEO1.describe()

Result: I can see 8 rows (there should be many more, over 200), and 32 columns. Something is wrong.
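As for the DtypeWarning about mixed types reported earlier, it says that some columns hold both numbers and text. One alternative to ‘low_memory=False’ could be to read everything as text first, and then convert the numerical columns explicitly. Below is a minimal sketch on a made-up miniature file, not on the actual WEO data. I assume here that the offending text is a placeholder like ‘--’, which is just my guess.

```python
import io
import pandas as pd

# Hypothetical miniature of a mixed-type file; NOT the actual WEO data
sample = io.StringIO(
    "Country,2019,2020\n"
    "A,1.5,--\n"
    "B,2.5,3.5\n"
)

# Step 1: read every column as text, so pandas has nothing to guess
raw = pd.read_csv(sample, dtype=str)

# Step 2: convert the year columns to numbers; whatever cannot be
# converted (here the '--' placeholder) becomes NaN
year_cols = ["2019", "2020"]
clean = raw.copy()
clean[year_cols] = raw[year_cols].apply(pd.to_numeric, errors="coerce")

print(clean.dtypes)
print(clean)
```

The point of doing it in two steps is that I decide which columns are numerical, instead of letting the parser decide for me chunk by chunk.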

I upload the Excel-converted CSV.

>> WEO2=pd.DataFrame(pd.read_csv('WEOOct2020all_Excel.csv'))

Result: Parser error

I retry, with parameter sep=‘;’ (usually works with Excel)

>> WEO2=pd.DataFrame(pd.read_csv('WEOOct2020all_Excel.csv',sep=';'))

Result: import successful. Let’s check the shape of the data

>> WEO2.describe()

Result: Pandas can see just the last column. I make sure.

>> WEO2.columns

Result:

Index(['WEO Country Code', 'ISO', 'WEO Subject Code', 'Country',
       'Subject Descriptor', 'Subject Notes', 'Units', 'Scale',
       'Country/Series-specific Notes', '1980', '1981', '1982', '1983', '1984',
       '1985', '1986', '1987', '1988', '1989', '1990', '1991', '1992', '1993',
       '1994', '1995', '1996', '1997', '1998', '1999', '2000', '2001', '2002',
       '2003', '2004', '2005', '2006', '2007', '2008', '2009', '2010', '2011',
       '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019', '2020',
       '2021', '2022', '2023', '2024', '2025', 'Estimates Start After'],
      dtype='object')

I will try to import the same file with a different ‘sep’ parameter, this time as sep=‘\t’

>> WEO3=pd.DataFrame(pd.read_csv('WEOOct2020all_Excel.csv',sep='\t'))

Result: import apparently successful. I check the shape of my data.

>> WEO3.describe()

Result: apparently, this time, no column is distinguished.

When I type:

>> WEO3.columns

…I get

Index(['WEO Country Code;ISO;WEO Subject Code;Country;Subject Descriptor;Subject Notes;Units;Scale;Country/Series-specific Notes;1980;1981;1982;1983;1984;1985;1986;1987;1988;1989;1990;1991;1992;1993;1994;1995;1996;1997;1998;1999;2000;2001;2002;2003;2004;2005;2006;2007;2008;2009;2010;2011;2012;2013;2014;2015;2016;2017;2018;2019;2020;2021;2022;2023;2024;2025;Estimates Start After'], dtype='object')

Now, I test with the 3rd file, the one converted through Wizard.

>> WEO4=pd.DataFrame(pd.read_csv('WEOOct2020all_Wizard.csv'))

Result: import successful.

I check the shape.

>> WEO4.describe()

Result: still just 8 rows. Something is wrong.

I do another experiment. I take the original*.XLS from imf.org, and I save it as regular Excel *.XLSX, and then I save this one as CSV.

>> WEO5=pd.DataFrame(pd.read_csv('WEOOct2020all_XLSX.csv'))

Result: parser error

I will retry with two options as for the separator: sep=‘;’ and sep=‘\t’. Ledzeee…

>> WEO5=pd.DataFrame(pd.read_csv('WEOOct2020all_XLSX.csv',sep=';'))

Import successful. “WEO5.describe()” yields just one column.

>> WEO6=pd.DataFrame(pd.read_csv('WEOOct2020all_XLSX.csv',sep='\t'))

yields successful import, yet all the data is just one long row, without separation into columns.

I check WEO5 and WEO6 with “*.index”, and “*.shape”. 

“WEO5.index” yields “RangeIndex(start=0, stop=8777, step=1)”

“WEO6.index” yields “RangeIndex(start=0, stop=8777, step=1)”

“WEO5.shape” gives “(8777, 56)”

“WEO6.shape” gives “(8777, 1)”

Depending on the separator given as parameter in the “pd.read_csv” command, I get 56 columns or just 1 column, yet the “*.describe()” command cannot make sense of them.
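A note for my future self, with a small sketch to back it: ‘.describe()’ is not really a tool for checking the shape of data. By default it summarizes only the numerical columns, and it does so in 8 rows of statistics (count, mean, standard deviation, minimum, three quartiles, maximum), whatever the number of rows in the Data Frame. The toy file below is made up, just to show the difference between ‘.shape’ and ‘.describe()’ under a right and a wrong separator.

```python
import io
import pandas as pd

# Made-up, semicolon-separated toy file with 3 rows of data
text = (
    "Country;Units;2019;2020\n"
    "A;Percent;1.0;2.0\n"
    "B;Percent;3.0;4.0\n"
    "C;Percent;5.0;6.0\n"
)

right = pd.read_csv(io.StringIO(text), sep=";")
print(right.shape)             # actual size of the data: rows, columns
print(right.describe().shape)  # 8 summary statistics x numerical columns only

# Same text, wrong separator: every line lands in one single text column
wrong = pd.read_csv(io.StringIO(text), sep="\t")
print(wrong.shape)
```

Passing ‘include="all"’ to ‘.describe()’ makes it summarize the text columns as well, which could be a quicker way to see whether the separator did its job.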

I try the “*.describe” command, thus more specific than the “*.describe()” one.

I can see that structures are clearly different.

I try another trick, namely to assume separator ‘;’ and TAB delimiter.

>> WEO7=pd.DataFrame(pd.read_csv('WEOOct2020all_XLSX.csv',sep=';',delimiter='\t'))

Result: WEO7.shape yields 8777 rows in just one column.

Maybe ‘header=0’? Same thing.

The provisional moral of the fairy tale is that ‘Data cleansing’ means very largely making sense of the exact shape and syntax of CSV files. Depending on the parametrisation of separators and delimiters, different Data Frames are obtained.
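The trial and error with separators, which I did by hand above, can be automated. The sketch below is one possible way, not a recommended standard: it tries a few candidate separators and keeps the one which yields the most columns. The function name, the list of candidates and the sample text are my own assumptions.

```python
import io
import pandas as pd

def read_best_separator(text, candidates=(",", ";", "\t")):
    """Try each candidate separator; keep the one giving the most columns."""
    best = None
    for sep in candidates:
        try:
            frame = pd.read_csv(io.StringIO(text), sep=sep)
        except pd.errors.ParserError:
            continue  # this separator cannot parse the text at all
        if best is None or frame.shape[1] > best[1].shape[1]:
            best = (sep, frame)
    if best is None:
        raise ValueError("none of the candidate separators worked")
    return best

# Made-up sample, semicolon-separated like an Excel export
sample_text = "Country;Units;2019\nA;Percent;1.5\nB;Percent;2.5\n"
sep, frame = read_best_separator(sample_text)
print(repr(sep), frame.shape)
```

It is a crude heuristic: a text column with many commas inside could fool it, so I would still look at ‘.columns’ and ‘.shape’ afterwards.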


[1] Hoffman, D. D., Singh, M., & Prakash, C. (2015). The interface theory of perception. Psychonomic bulletin & review, 22(6), 1480-1506.

[2] Fields, C., Hoffman, D. D., Prakash, C., & Singh, M. (2018). Conscious agent networks: Formal analysis and application to cognition. Cognitive Systems Research, 47, 186-213. https://doi.org/10.1016/j.cogsys.2017.10.003

I re-run my executable script

I am thinking (again) about the phenomenon of collective intelligence, this time in terms of behavioural reinforcement that we give to each other, and the role that cities and intelligent digital clouds can play in delivering such reinforcement. As it is usually the case with science, there is a basic question to ask: ‘What’s the point of all the fuss with that nice theory of yours, Mr Wasniewski? Any good for anything?’.

Good question. My tentative answer is that studying human societies as collectively intelligent structures is a phenomenology, which allows some major methodological developments, which, I think, are missing from other methodologies in social sciences. First of all, it allows a completely clean slate at the starting point of research, as regards ethics and moral orientations, whilst it almost inevitably leads to defining ethical values through empirical research. This was my first big ‘Oh, f**k!’ with that method: I realized that ethical values can be reliably studied as objectively pursued outcomes at the collective level, and that study can be robustly backed with maths and empirics.

I have that thing with my science, and, as a matter of fact, with other people’s science too: I am an empiricist. I like prodding my assumptions and making them lose some fat, so that they become lighter. I like having as much of a clean slate at the starting point of my research as possible. I believe that one single assumption, namely that human social structures are collectively intelligent structures, almost automatically transforms all the other assumptions into hypotheses to investigate. Still, I need to go, very carefully, through that one single Mother Of All Assumptions, i.e. about us, humans as a society, being collectively intelligent a structure, in order to nail down, and possibly kick out any logical shortcut.

Intelligent structures learn by producing many alternative versions of themselves and testing those versions for fitness in coping with a vector of constraints. There are three claims hidden in this single claim: learning, production of different versions, and testing for fitness. Do human social structures learn, like at all? Well, we have that thing called culture, and culture changes. There is observable change in lifestyles, aesthetic tastes, fashions, institutions and technologies. This is learning. Cool. One down, two still standing.

Do human social structures produce many different versions of themselves? Here, we enter the subtleties of distinction between different versions of a structure, on the one hand, and different structures, on the other hand. A structure remains the same, and just makes different versions of itself, as long as it stays structurally coherent. When it loses structural coherence, it turns into a different structure. How can I know that a structure keeps its s**t together, i.e. it stays internally coherent? That’s a tough question, and I know by experience that in the presence of tough questions, it is essential to keep it simple. One of the simplest facts about any structure is that it is made of parts. As long as all the initial parts are still there, I can assume they hold together somehow. In other words, as long as whatever I observe about social reality can be represented as the same complex set, with the same components inside, I can assume this is one and the same structure just making copies of itself. Still, this question remains a tough one, especially that any intelligent structure should be smart enough to morph into another intelligent structure when the time is right.      

The time is right when the old structure is no longer able to cope with the vector of constraints, and so I arrive at the third component question: how can I know there is adaptation to constraints? How can I know there are constraints for assessing fitness? In a very broad sense, I can see constraints when I see error, and correction thereof, in someone’s behaviour. In other words, when I can see someone sort of making two steps forward and one step back, correcting their course etc., this is a sign of adaptation to constraints. Unconstrained change is linear or exponential, whilst constrained change always shows signs of bumping against some kind of wall. Here comes a caveat as regards using artificial neural networks as simulators of collective human intelligence: they are any good only when they have constraints, and, consequently, when they make errors. An artificial neural network is no good at simulating unconstrained change. When I explore the possibility of simulating collective human intelligence with artificial neural networks, it has marks of a pleonasm. I can use AI as simulator only when the simulation involves constrained adaptation.

F**k! I have gone philosophical in those paragraphs. I can feel a part of my mind gently disconnecting from real life, and it is time to do something in order to stay close to said real life. Here is a topic, which I can treat as teaching material for my students, and, at the same time, make those general concepts bounce a bit around, inside my head, just to see what happens. I make the following claim: ‘Markets are manifestations of collective intelligence in human societies’. In science, this is a working hypothesis. It is called ‘working’ because it is not proven yet, and thus it has to earn its own living, so to say. This is why it has to work.

I pass in review the same bullet points: learning, for one, production of many alternative versions in a structure as opposed to creating new structures, for two, and the presence of constraints as the third component. Do markets manifest collective learning? Ledzzeee… Markets display fashions and trends. Markets adapt to lifestyles, and vice versa. Markets are very largely connected to technological change and facilitate the occurrence thereof. Yes, they learn.

How can I say whether a market stays the same structure and just experiments with many alternative versions thereof, or, conversely, whether it turns into another structure? It is time to go back to the fundamental concepts of microeconomics, and assess (once more), what makes a market structure. A market structure is the mechanism of setting transactional prices. When I don’t know s**t about said mechanism, I just observe prices and I can see two alternative pictures. Picture one is that of very similar prices, sort of clustered in the same, narrow interval. This is a market with equilibrium price, which translates into a local market equilibrium. Picture two shows noticeably disparate prices in what I initially perceived as the same category of goods. There is no equilibrium price in that case, and speaking more broadly, there is no local equilibrium in that market.
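The two pictures can be told apart with a rough measure of dispersion. Below is a minimal sketch, assuming that the coefficient of variation (standard deviation divided by the mean) is an acceptable proxy for ‘clustered in the same, narrow interval’. The 5% threshold and both price lists are arbitrary assumptions made for illustration.

```python
import statistics

def price_dispersion(prices):
    """Coefficient of variation: population standard deviation over the mean."""
    return statistics.pstdev(prices) / statistics.mean(prices)

def looks_like_local_equilibrium(prices, threshold=0.05):
    # The threshold is an arbitrary, illustrative cut-off, not an established value
    return price_dispersion(prices) <= threshold

# Invented prices for what is perceived as one category of goods
clustered = [9.9, 10.0, 10.1, 10.0, 9.95]   # picture one
disparate = [4.0, 10.0, 17.5, 8.0, 25.0]    # picture two

print(looks_like_local_equilibrium(clustered))   # True
print(looks_like_local_equilibrium(disparate))   # False
```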

Markets with local equilibriums are assumed to be perfectly competitive or very close thereto. They are supposed to serve for transacting in goods so similar that customers perceive them as identical, and technologies used for producing those goods don’t differ sufficiently to create any kind of competitive advantage (homogeneity of supply), for one. Markets with local equilibriums require the customers to be so similar to each other in their tastes and purchasing patterns that, on the whole, they can be assumed identical (homogeneity of demand), for two. Customers are supposed to be perfectly informed about all the deals available in the market (perfect information). Oh, yes, the last one: no barriers to entry or exit. A perfectly competitive market is supposed to offer virtually no minimum investment required for suppliers to enter the game, and no sunk costs in the case of exit.  

Here is that thing: many markets present the alignment of prices typical for a state of local equilibrium, and yet their institutional characteristics – such as technologies, the diversity of goods offered, capital requirements and whatnot – do not match the textbook description of a perfectly competitive market. In other words, many markets form local equilibriums, thus they display equilibrium prices, without having the required institutional characteristics for that, at least in theory. In still other words, they manifest the alignment of prices typical for one type of market structure, whilst all the other characteristics are typical for another type of market structure.

Therefore, the completely justified ‘What the hell…?’ question arises. What is a market structure, at the end of the day? What is a structure, in general?

I go down another avenue now. Some time ago, I signalled on my blog that I am learning programming in Python, or, as I should rather say, I make one more attempt at nailing it down. Programming teaches me a lot about the basic logic of what I do, including that whole theory of collective intelligence. Anyway, I started to keep a programming log, and here below, I paste the current entry, from November 27th, 2020.

 Tasks to practice:

  1. reading well structured CSV,
  2. plotting
  3. saving and retrieving a Jupyter Notebook in JupyterLab

I am practicing with Penn World Tables 9.1. I take the version without empty cells, and I transform it into CSV.

I create a new notebook on JupyterLab. I name it ‘Practice November 27th 2020’.

  • Path: demo/Practice November 27th 2020.ipynb

I upload the CSV version of Penn Tables 9.1 with no empty cells.

Shareable link: https://hub.gke2.mybinder.org/user/jupyterlab-jupyterlab-demo-zbo0hr9b/lab/tree/demo/PWT%209_1%20no%20empty%20cells.csv

Path: demo/PWT 9_1 no empty cells.csv

Download path: https://hub.gke2.mybinder.org/user/jupyterlab-jupyterlab-demo-zbo0hr9b/files/demo/PWT%209_1%20no%20empty%20cells.csv?_xsrf=2%7C2ce78815%7C547592bc83c83fd951870ab01113e7eb%7C1605464585

I code libraries:

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

import os

I check my directory:

>> os.getcwd()

result: ‘/home/jovyan/demo’

>> os.listdir()

result:

[‘jupyterlab.md’,

 ‘TCGA_Data’,

 ‘Lorenz.ipynb’,

 ‘lorenz.py’,

 ‘notebooks’,

 ‘data’,

 ‘jupyterlab-slides.pdf’,

 ‘markdown_python.md’,

 ‘big.csv’,

 ‘Practice November 27th 2020.ipynb’,

 ‘.ipynb_checkpoints’,

 ‘Untitled.ipynb’,

 ‘PWT 9_1 no empty cells.csv’]

>> PWT9_1=pd.DataFrame(pd.read_csv(‘PWT 9_1 no empty cells.csv’,header=0))

Result:

  File “<ipython-input-5-32375ff59964>”, line 1

    PWT9_1=pd.DataFrame(pd.read_csv(‘PWT 9_1 no empty cells.csv’,header=0))

                                       ^

SyntaxError: invalid character in identifier

>> I rename the file on Jupyter, into ‘PWT 9w1 no empty cells.csv’.

>> os.listdir()

Result:

[‘jupyterlab.md’,

 ‘TCGA_Data’,

 ‘Lorenz.ipynb’,

 ‘lorenz.py’,

 ‘notebooks’,

 ‘data’,

 ‘jupyterlab-slides.pdf’,

 ‘markdown_python.md’,

 ‘big.csv’,

 ‘Practice November 27th 2020.ipynb’,

 ‘.ipynb_checkpoints’,

 ‘Untitled.ipynb’,

 ‘PWT 9w1 no empty cells.csv’]

>> PWT9w1=pd.DataFrame(pd.read_csv('PWT 9w1 no empty cells.csv',header=0))

Result: imported successfully
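Two hedged remarks on the import step. First, pd.read_csv already returns a DataFrame, so the outer pd.DataFrame(...) wrapper should not be needed. Second, ‘invalid character in identifier’ is what older Python 3 versions typically report when typographic (curly) quotes are typed or pasted instead of straight ones, so the quotes, rather than the underscore in the file name, may have been the culprit. That is a guess from the position of the caret, not something the log confirms. The sketch below uses a tiny made-up CSV held in memory as a stand-in for the real file; country names and numbers are invented.

```python
import io
import pandas as pd

# Made-up stand-in for 'PWT 9w1 no empty cells.csv'; the numbers are invented
sample_csv = io.StringIO(
    "country,year,rnna,avh\n"
    "Aland,2000,100.0,1900.0\n"
    "Aland,2001,110.0,1890.0\n"
    "Borduria,2000,50.0,2100.0\n"
)

# read_csv returns a DataFrame directly; header=0 is the default.
# Note the straight quotes around every string.
PWT9w1 = pd.read_csv(sample_csv)

print(type(PWT9w1).__name__)
print(PWT9w1.shape)
```

With the real file, the call would be pd.read_csv('PWT 9w1 no empty cells.csv'), run from the directory that os.getcwd() reports.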

>> PWT9w1.describe()

Result: descriptive statistics

# I want to list columns (variables) in my file

>> PWT9w1.columns

Result:

Index([‘country’, ‘year’, ‘rgdpe’, ‘rgdpo’, ‘pop’, ‘emp’, ‘emp / pop’, ‘avh’,

       ‘hc’, ‘ccon’, ‘cda’, ‘cgdpe’, ‘cgdpo’, ‘cn’, ‘ck’, ‘ctfp’, ‘cwtfp’,

       ‘rgdpna’, ‘rconna’, ‘rdana’, ‘rnna’, ‘rkna’, ‘rtfpna’, ‘rwtfpna’,

       ‘labsh’, ‘irr’, ‘delta’, ‘xr’, ‘pl_con’, ‘pl_da’, ‘pl_gdpo’, ‘csh_c’,

       ‘csh_i’, ‘csh_g’, ‘csh_x’, ‘csh_m’, ‘csh_r’, ‘pl_c’, ‘pl_i’, ‘pl_g’,

       ‘pl_x’, ‘pl_m’, ‘pl_n’, ‘pl_k’],

      dtype=‘object’)

>> PWT9w1.columns()

Result:

TypeError                                 Traceback (most recent call last)

<ipython-input-11-38dfd3da71de> in <module>

----> 1 PWT9w1.columns()

TypeError: ‘Index’ object is not callable
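The lesson from that TypeError: columns is an attribute holding an Index object, not a method, so it is read without parentheses. A minimal sketch on a made-up one-row DataFrame (the values are invented):

```python
import pandas as pd

# Made-up miniature frame, for illustration only
df_demo = pd.DataFrame({"country": ["Aland"], "year": [2000], "rnna": [100.0]})

# Attribute access, no parentheses; tolist() turns the Index into a plain list
column_names = df_demo.columns.tolist()
print(column_names)

# Calling the attribute reproduces the error from the log
try:
    df_demo.columns()
except TypeError as err:
    print("TypeError:", err)
```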

# I try plotting

>> plt.plot(df.index, df['rnna'])

Result:

I get a long list of rows like: ‘<matplotlib.lines.Line2D at 0x7fc59d899c10>’, and a plot which is visibly not OK (looks like a fan).

# I want to separate one column from PWT9w1 as a separate series, and then plot it. Maybe it is going to work.

>> RNNA=pd.DataFrame(PWT9w1['rnna'])

Result: apparently successful.

# I try to plot RNNA

>> RNNA.plot()

Result:

<matplotlib.axes._subplots.AxesSubplot at 0x7fc55e7b9e10> + a basic graph. Good.

# I try to extract a few single series from PWT9w1 and to plot them. Let’s go for AVH, PL_I and CWTFP.

>> AVH=pd.DataFrame(PWT9w1['avh'])

>> PL_I=pd.DataFrame(PWT9w1['pl_i'])

>> CWTFP=pd.DataFrame(PWT9w1['cwtfp'])

>> AVH.plot()

>> PL_I.plot()

>> CWTFP.plot()

Result:

It worked. I have basic plots.
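The plots above run over the row index, which strings all countries and years together into one line. Since Penn World Tables is a country-year panel, a possible next step is one line per country, plotted over ‘year’. The sketch below assumes the ‘country’ and ‘year’ columns listed earlier and uses invented numbers; the ‘Agg’ backend is there only so that the sketch runs without a display, and can be dropped inside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; not needed inside a notebook
import pandas as pd

# Made-up miniature panel, for illustration only
panel = pd.DataFrame({
    "country": ["Aland", "Aland", "Aland", "Borduria", "Borduria", "Borduria"],
    "year": [2000, 2001, 2002, 2000, 2001, 2002],
    "rnna": [100.0, 110.0, 121.0, 50.0, 52.0, 55.0],
})

# Reshape: one row per year, one column per country
wide = panel.pivot(index="year", columns="country", values="rnna")

# DataFrame.plot draws one line per column, here one line per country
ax = wide.plot(title="rnna by country (invented numbers)")
ax.set_xlabel("year")
```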

# It is 8:20 a.m. I go to make myself a coffee. I will quit JupyterLab for a moment. I saved my today’s notebook on server, and I will see how I can open it. Just in case, I make a PDF copy, and a Python copy on my disk.

I cannot do saving into PDF. An error occurs. I will have to sort it out. I made an *.ipynb copy on my disk.

demo/Practice November 27th 2020.ipynb

# It is 8:40 a.m. I am logging back into JupyterLab. I am trying to open my today’s notebook from path. Does not seem to work. I am uploading my *.ipynb copy. This worked. I know now: I upload the *.ipynb script from my own location and then just double click on it. I needed to re-upload my CSV file ‘PWT 9w1 no empty cells.csv’.

# I check if my re-uploaded CSV file is fully accessible. I discover that I need to re-create the whole algorithm. In other words: when I upload on JupyterLab a *.ipynb script from my disk, I need to re-run all the operations. My first idea is to re-run each executable cell in the uploaded script. That worked. Question: how to automatise it? Probably by making a Python script all in one piece, uploading my CSV data source first, and then run the whole script.
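As a first stab at the ‘script all in one piece’, here is a minimal sketch, assuming the steps of today’s log are: read the CSV, list the columns, describe, and extract a few single series. The function name `run_practice` and the in-memory demo data are invented; with the real file, the path to the uploaded CSV would be passed instead.

```python
import io
import pandas as pd

def run_practice(csv_source, columns=("rnna", "avh", "pl_i", "cwtfp")):
    """Re-run the steps of the log in one go and return the results."""
    data = pd.read_csv(csv_source)
    # Keep only the requested columns that actually exist in the file
    present = [c for c in columns if c in data.columns]
    return {
        "columns": list(data.columns),
        "summary": data.describe(),
        "series": {c: data[c] for c in present},
    }

# Made-up demo data; replace with the path of the uploaded CSV file
demo = io.StringIO(
    "country,year,rnna,avh\n"
    "Aland,2000,100.0,1900.0\n"
    "Aland,2001,110.0,1890.0\n"
)
results = run_practice(demo)
print(results["columns"])
print(sorted(results["series"]))
```

Keeping everything inside one function means that, after re-uploading the notebook and the CSV, a single call rebuilds the whole state.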

I like being a mad scientist

I like being a mad scientist. Am I a mad scientist? A tiny bit, yes, ‘cause I do research on things just because I feel like it. Mind you, me being that mad scientist I like being happens to be practical. Those rabbit holes I dive into prove to have interesting outcomes in real life.

I feel like writing, and therefore thinking in an articulate way, about two things I do in parallel: science and investment. I have just realized these two realms of activity tend to merge and overlap in me. When I do science, I tend to think like an investor, or a gardener. I invest my personal energy in ideas which I think have potential for growth. On the other hand, I invest in the stock market with a strong dose of curiosity. Those companies, and the investment positions I can open therein, are like animals which I observe, try to figure out how not to get killed by them, or by predators that hunt them, and I try to domesticate those beasts.

The scientific thing I am working on is the application of artificial intelligence to studying collective intelligence in human societies. The thing I am working on sort of at the crest between science and investment is fundraising for scientific projects (my new job at the university).

The project aims at defining theoretical and empirical fundamentals for using intelligent digital clouds, i.e. large datasets combined with artificial neural networks, in the field of remote digital diagnostics and remote digital care, in medical sciences and medical engineering. That general purpose translates into science strictly speaking, and into the prospective development of medical technologies.

There is observable growth in the percentage of population using various forms of digital remote diagnostics and healthcare. Yet, that growth is very uneven across different social groups, which suggests an early, pre-popular stage of development in those technologies (Mahajan et al. 2020[i]). Other research confirms that supposition, judging by the very disparate results obtained with those technologies in terms of diagnostic and therapeutic effectiveness (Cheng et al. 2020[ii]; Wong et al. 2020[iii]). There are known solutions where an intelligent digital cloud allows transforming the patient’s place of stay (home, apartment) into the local substitute of a hospital bed, which opens interesting possibilities as regards medical care for patients with significantly reduced mobility, e.g. geriatric patients (Ben Hassen et al. 2020[iv]). Already around 2015, creative applications of medical imagery appeared, where the camera of a person’s smartphone served for early detection of skin cancer (Bliznuks et al. 2017[v]). Connecting distance diagnostics with the acquisition and processing of images comes as one of the most interesting and challenging innovations to make in the here-discussed field of technology (Marwan et al. 2018[vi]). The experience of the COVID-19 pandemic has already shown the potential of digital intelligent clouds in assisting national healthcare systems, especially in optimising and providing flexibility to the use of resources, both material and human (Alashhab et al. 2020[vii]). Yet, the same pandemic experience has shown the depth of social disparities as regards actual access to digital technologies supported by intelligent clouds (Whitelaw et al. 2020[viii]). Intelligent digital clouds enter into learning-generative interactions with the professionals of healthcare. There is observable behavioural modification, for example, in students of healthcare who train with such technologies from the very beginning of their education (Brown Wilson et al. 2020[ix]). That phenomenon of behavioural change requires rethinking from scratch, with the development of each individual technology, the ethical and legal issues relative to interactions between users, on the one hand, and system operators, on the other hand (Gooding 2019[x]).

Against that general background, the present project focuses on studying the phenomenon of tacit coordination among the users of digital technologies in remote medical diagnostics and remote medical care. Tacit coordination is essential as regards the well-founded application of intelligent digital cloud to support and enhance these technologies. Intelligent digital clouds are intelligent structures, i.e. they learn by producing many alternative versions of themselves and testing those versions for fitness in coping with a vector of external constraints. It is important to explore the extent and way that populations of users behave similarly, i.e. as collectively intelligent structures. The deep theoretical meaning of that exploration is the extent to which the intelligent structure of a digital cloud really maps and represents the collectively intelligent structure of the users’ population.

The scientific method used in the project explores the main working hypothesis that populations of actual and/or prospective patients, in their own health-related behaviour, and in their relations with the healthcare systems, are collectively intelligent structures, with tacit coordination. In practical terms, that hypothesis means that any intelligent digital cloud in the domain of remote medical care should assume collectively intelligent, thus more than just individual, behavioural change on the part of users. Collectively intelligent behavioural change in a population, marked by tacit coordination, is a long-term, evolutionary process of adaptive walk in rugged landscape (Kauffman & Levin 1987[xi]; Nahum et al. 2015[xii]). Therefore, it is something deeper and more durable than fashions and styles. It is the deep, underlying mechanism of social change accompanying the use of digital intelligent clouds in medical engineering.

The scientific method used in this project aims at exploring and checking the above-stated working hypothesis by creating a large and differentiated dataset of health-related data, and processing that dataset in an intelligent digital cloud, in two distinct phases. The first phase consists in processing a first sample of data with a relatively simple artificial neural network, in order to discover its underlying orientations and its mechanisms of collective learning. The second phase allows an intelligent digital cloud to respond adaptively to users’ behaviour, i.e. to produce intelligent interaction with them. The first phase serves to understand the process of adaptation observable in the second phase. Both phases are explained in more detail below.

The tests of, respectively, orientation and mode of learning, in the first phase of empirical research aim at defining the vector of collectively pursued social outcomes in the population studied. The initially collected empirical dataset is transformed, with the use of an artificial neural network, into as many representations as there are variables in the set, with each representation being oriented on a different variable as its output (with the remaining ones considered as instrumental input). Each such transformation of the initial set can be tested for its mathematical similarity therewith (e.g. for Euclidean distance between the vectors of expected mean values). Transformations displaying relatively the greatest similarity to the source dataset are assumed to be the most representative for the collectively intelligent structure in the population studied, and, consequently, their output variables can be assumed to represent collectively pursued social outcomes in that collective intelligence (see, for example: Wasniewski 2020[xiii]). Modes of learning in that dataset can be discovered by creating a shadow vector of probabilities (representing, for example, a finite set of social roles endorsed with given probabilities by members of the population), and a shadow process that introduces random disturbance, akin to the theory of Black Swans (Taleb 2007[xiv]; Taleb & Blyth 2011[xv]). The so-created shadow structure is subsequently transformed with an artificial neural network in as many alternative versions as there are variables in the source empirical dataset, each version taking a different variable from the set as its pre-set output. Three different modes of learning can be observed, and assigned to particular variables: a) cyclical adjustment without clear end-state b) finite optimisation with defined end-state and c) structural disintegration with growing amplitude of oscillation around central states.
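The similarity test mentioned above can be sketched in a few lines. This is a minimal illustration, assuming that each transformation is a table with the same variables as the source dataset, and that similarity is measured as the Euclidean distance between the vectors of mean values. The function names, the labels of transformations and all the numbers are invented; in real data, variables on very different scales would probably need standardisation first, which is left out here.

```python
import numpy as np

def mean_vector(dataset):
    """Vector of mean values, one per variable (column)."""
    return np.asarray(dataset, dtype=float).mean(axis=0)

def distance_to_source(source, transformed):
    """Euclidean distance between the two vectors of mean values."""
    return float(np.linalg.norm(mean_vector(source) - mean_vector(transformed)))

# Invented toy data: rows are observations, columns are variables
source = np.array([[1.0, 2.0], [3.0, 4.0]])                  # means (2, 3)
transformations = {
    "oriented_on_var_a": np.array([[1.0, 2.0], [3.0, 6.0]]), # means (2, 4)
    "oriented_on_var_b": np.array([[4.0, 6.0], [6.0, 8.0]]), # means (5, 7)
}

distances = {name: distance_to_source(source, t)
             for name, t in transformations.items()}
closest = min(distances, key=distances.get)

print(distances)   # {'oriented_on_var_a': 1.0, 'oriented_on_var_b': 5.0}
print(closest)     # oriented_on_var_a
```

The transformation with the smallest distance would be the candidate for representing a collectively pursued outcome.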

The above-summarised first phase of research involves the use of two basic digital tools, i.e. an online functionality to collect empirical data from and about patients, and an artificial neural network to process it. There comes an important aspect of that first phase in research, i.e. the actual collectability and capacity to process the corresponding data. It can be assumed that comprehensive medical care involves the collection of both strictly health-related data (e.g. blood pressure, blood sugar etc.), and peripheral data of various kinds (environmental, behavioural). The complexity of data collected in that phase can be additionally enhanced by including imagery such as pictures taken with smartphones (e.g. skin, facial symmetry etc.). In that respect, the first phase of research aims at testing the actual possibility and reliability of collection in various types of data. Phenomena such as outliers or fake data can be detected then.

Once the first phase is finished and expressed in the form of theoretical conclusions, the second phase of research is triggered. An intelligent digital cloud is created, with the capacity of intelligent adaptation to users’ behaviour. A very basic example of such adaptation is behavioural reinforcement. The cloud can generate simple messages of praise for health-functional behaviour (positive reinforcements), or, conversely, warning messages in the case of health-dysfunctional behaviour (negative reinforcements). More elaborate forms of intelligent adaptation are possible to implement, e.g. a Twitter-like reinforcement to create trending information, or a TikTok-like reinforcement to stay in the loop of communication in the cloud. This phase aims specifically at defining the actually workable scope and strength of possible behavioural reinforcements which a digital functionality in the domain of healthcare could use vis-à-vis its end users. Legal and ethical implications thereof are studied as one of the theoretical outcomes of that second phase.

I feel like generalizing my last few updates a bit, and developing the general hypothesis of collectively intelligent human social structures. In order to consider any social structure as a manifestation of collective intelligence, I need to place intelligence in a specific empirical context. I need an otherwise exogenous environment, which the social structure has to adapt to. Empirical study of collective intelligence, such as I have been doing it, and, as a matter of fact, the only one I know how to do, consists in studying adaptive effort in human social structures.


[i] Shiwani Mahajan, Yuan Lu, Erica S. Spatz, Khurram Nasir, Harlan M. Krumholz, Trends and Predictors of Use of Digital Health Technology in the United States, The American Journal of Medicine, 2020, ISSN 0002-9343, https://doi.org/10.1016/j.amjmed.2020.06.033 (http://www.sciencedirect.com/science/article/pii/S0002934320306173  )

[ii] Lei Cheng, Mingxia Duan, Xiaorong Mao, Youhong Ge, Yanqing Wang, Haiying Huang, The effect of digital health technologies on managing symptoms across pediatric cancer continuum: A systematic review, International Journal of Nursing Sciences, 2020, ISSN 2352-0132, https://doi.org/10.1016/j.ijnss.2020.10.002 , (http://www.sciencedirect.com/science/article/pii/S2352013220301630 )

[iii] Charlene A. Wong, Farrah Madanay, Elizabeth M. Ozer, Sion K. Harris, Megan Moore, Samuel O. Master, Megan Moreno, Elissa R. Weitzman, Digital Health Technology to Enhance Adolescent and Young Adult Clinical Preventive Services: Affordances and Challenges, Journal of Adolescent Health, Volume 67, Issue 2, Supplement, 2020, Pages S24-S33, ISSN 1054-139X, https://doi.org/10.1016/j.jadohealth.2019.10.018 , (http://www.sciencedirect.com/science/article/pii/S1054139X19308675 )

[iv] Hassen, H. B., Ayari, N., & Hamdi, B. (2020). A home hospitalization system based on the Internet of things, Fog computing and cloud computing. Informatics in Medicine Unlocked, 100368, https://doi.org/10.1016/j.imu.2020.100368

[v] Bliznuks, D., Bolocko, K., Sisojevs, A., & Ayub, K. (2017). Towards the Scalable Cloud Platform for Non-Invasive Skin Cancer Diagnostics. Procedia Computer Science, 104, 468-476

[vi] Marwan, M., Kartit, A., & Ouahmane, H. (2018). Security enhancement in healthcare cloud using machine learning. Procedia Computer Science, 127, 388-397.

[vii] Alashhab, Z. R., Anbar, M., Singh, M. M., Leau, Y. B., Al-Sai, Z. A., & Alhayja’a, S. A. (2020). Impact of Coronavirus Pandemic Crisis on Technologies and Cloud Computing Applications. Journal of Electronic Science and Technology, 100059. https://doi.org/10.1016/j.jnlest.2020.100059

[viii] Whitelaw, S., Mamas, M. A., Topol, E., & Van Spall, H. G. (2020). Applications of digital technology in COVID-19 pandemic planning and response. The Lancet Digital Health. https://doi.org/10.1016/S2589-7500(20)30142-4

[ix] Christine Brown Wilson, Christine Slade, Wai Yee Amy Wong, Ann Peacock, Health care students experience of using digital technology in patient care: A scoping review of the literature, Nurse Education Today, Volume 95, 2020, 104580, ISSN 0260-6917, https://doi.org/10.1016/j.nedt.2020.104580 ,(http://www.sciencedirect.com/science/article/pii/S0260691720314301 )

[x] Piers Gooding, Mapping the rise of digital mental health technologies: Emerging issues for law and society, International Journal of Law and Psychiatry, Volume 67, 2019, 101498, ISSN 0160-2527, https://doi.org/10.1016/j.ijlp.2019.101498 , (http://www.sciencedirect.com/science/article/pii/S0160252719300950 )

[xi] Kauffman, S., & Levin, S. (1987). Towards a general theory of adaptive walks on rugged landscapes. Journal of theoretical Biology, 128(1), 11-45

[xii] Nahum, J. R., Godfrey-Smith, P., Harding, B. N., Marcus, J. H., Carlson-Stevermer, J., & Kerr, B. (2015). A tortoise–hare pattern seen in adapting structured and unstructured populations suggests a rugged fitness landscape in bacteria. Proceedings of the National Academy of Sciences, 112(24), 7530-7535, www.pnas.org/cgi/doi/10.1073/pnas.1410631112 

[xiii] Wasniewski, K. (2020). Energy efficiency as manifestation of collective intelligence in human societies. Energy, 191, 116500. https://doi.org/10.1016/j.energy.2019.116500

[xiv] Taleb, N. N. (2007). The black swan: The impact of the highly improbable (Vol. 2). Random house

[xv] Taleb, N. N., & Blyth, M. (2011). The black swan of Cairo: How suppressing volatility makes the world less predictable and more dangerous. Foreign Affairs, 33-39

Checkpoint for business

I am changing the path of my writing, ‘cause real life knocks at my door, and it goes ‘Hey, scientist, you economist, right? Good, ‘cause there is some good stuff, I mean, ideas for business. That’s economics, right? Just sort of real stuff, OK?’. Sure. I can go with real things, but first, I explain. At my university, I have recently taken on the job of coordinating research projects and finding some financing for them. One of the first things I did, right after November 1st, was to send around a reminder that we had 12 days left to apply, with the Ministry of Science and Higher Education, for relatively small grants, in a call titled ‘Students make innovation’. Honestly, I was expecting to have 1 – 2 applications max, in response. Yet, life can make surprises. There are 7 innovative ideas in terms of feedback, and 5 of them look like good material for business concepts and for serious development. I am taking on giving them a first prod, in terms of business planning. Interestingly, those ideas are all related to medical technologies, thus something I have been both investing a lot in, during 2020, and thinking a lot about, as a possible path of substantial technological change.

I am progressively wrapping my mind around ideas and projects formulated by those students, and, walking down the same intellectual avenue, I am making sense of making money on and around science. I am fully appreciating the value of real-life experience. I have been doing research and writing about technological change for years. Until recently, I had that strange sort of complex logical oxymoron in my mind, where I had the impression of both understanding technological change, and missing a fundamental aspect of it. Now, I think I start to understand that missing part: it is the microeconomic mechanism of innovation.

I have collected those 5 ideas from ambitious students at Faculty of Medicine, in my university:

>> Idea 1: An AI-based app, with a chatbot, which facilitates early diagnosis of cardio-vascular diseases

>> Idea 2: Similar thing, i.e. a mobile app, but oriented on early diagnosis and monitoring of urinary incontinence in women.

>> Idea 3: Technology for early diagnosis of Parkinson’s disease, through the observation of speech and motor disturbance.

>> Idea 4: Intelligent cloud to store, study and possibly find something smart about two types of data: basic health data (blood-work etc.), and environmental factors (pollution, climate etc.).

>> Idea 5: Something similar to Idea 4, i.e. an intelligent cloud with medical edge, but oriented on storing and studying data from large cohorts of patients infected with Sars-Cov-2. 

As I look at those 5 ideas, a surprisingly simple and basic association of ideas comes to my mind: hierarchy of interest and the role of overarching technologies. It is something I have never thought seriously about: when we face many alternative ideas for new technologies, almost intuitively we hierarchize them. Some of them seem more interesting, some others less so. I am trying to dig out of my own mind the criteria I use, and here they are: I hierarchize with the expected lifecycle of technology, and the breadth of the technological platform involved. In other words, I like big, solid, durable stuff. I am intuitively looking for innovations which offer a relatively long lifecycle in the corresponding technology, and where the technology involved is sort of two-level, with a broad base and many specific applicational developments built upon that base.

Why do I take this specific approach? One step further down into my mind, I discover the willingness to have some sort of broad base of business and scientific points of attachment when I start business planning. I want some kind of horizon to choose my exact target on. The common technological base among those 5 ideas is some kind of intelligent digital cloud, with artificial intelligence that learns from the data that flows in. The common scientific base is the collection of health-related data, including behavioural aspects (e.g. sleep, diet, exercise, stress management).

The financial context which I am operating in is complex. It is made of public financial grants for strictly speaking scientific research, other public financing for projects more oriented on research and development in consortiums made of universities and business entities, still a different stream of financing for business entities alone, and finally private capital to look for once the technology is ripe enough for being marketed.

I am operating from an academic position. Intuitively, I guess that the more valuable science academic people bring to their common table with businesspeople and government people, the better position those academics will have in any future joint ventures. Hence, we should max out on useful, functional science to back those ideas. I am trying to understand what that science should consist in. An intelligent digital cloud can yield mind-blowing findings. I know that for a fact from my own research. Yet, what I know too is that I need very fundamental science, something at the frontier of logic, philosophy, mathematics, and of the phenomenology pertinent to the scientific research at hand, in order to understand and use meaningfully whatever the intelligent digital cloud spits back out, after being fed with data. I have already gone once through that process of understanding, as I have been working on the application of artificial neural networks to the simulation of collective intelligence in human societies. I had to coin up a theory of intelligent structure, applicable to the problem at hand. I believe that any application of intelligent digital cloud requires assuming that whatever we investigate with that cloud is an intelligent structure, i.e. a structure which learns by producing many alternative versions of itself, and testing them for their fitness to optimize a given desired outcome.  

With those medical ideas, I (we?) need to figure out what the intelligent structure in action is, how it can possibly produce many alternative versions of itself, and how those alternative thingies can be tested for fitness. What we have in a medically edged digital cloud is data about a population of people. The desired outcome we look for is health, quite simply. I said ‘simply’? No, it was a mistake. It is health, in all complexity. Those apps our students want to develop are supposed to pull someone out of the crowd, someone with early symptoms which they do not identify as relevant. In a next step, some kind of dialogue is proposed to such a person, sort of let’s dig a bit more into those symptoms, let’s try something simple to treat them etc. The vector of health in that population is made, roughly speaking, of three sub-vectors: preventive health (e.g. exercise, sleep, stop eating crap food), effectiveness of early medical intervention (e.g. c’mon men, if you are 30 and can’t have erection, you are bound to concoct some cardio-vascular s**t), and finally effectiveness of advanced medicine, applied when the former two haven’t worked.

I can see at least one salient, scientific hurdle to jump over: that outcome vector of health. In my own research, I found out that artificial neural networks can give empirical evidence as for what outcomes we are really actually after, as a collectively intelligent structure. That’s my first big idea as regards those digital medical solutions: we collect medical and behavioural data in the cloud, we assume that data represents experimental learning of a collectively intelligent social structure, and we make the cloud discover the phenomena (variables) which the structure actually optimizes.

My own experience with that method is that societies which I studied optimize outcomes which look almost too simplistic in the fancy realm of social sciences, such as the average number of hours worked per person per year, the average amount of human capital per person, measured as years of education before entering the job market, or price index in exports, thus the average price which countries sell their exports at. In general, societies which I studied tend to optimize structural proportions, measurable as coefficients in the lines of ‘amount of thingy one divided by the amount of thingy two’.

Checkpoint for business. Supposing that our research team, at the Andrzej Frycz – Modrzewski Krakow University, comes up with robust empirical results of that type, i.e. when we take a million random humans and their broadly spoken health, and we assume they are collectively intelligent (I mean, beyond Facebook), then their collectively shared experimental learning of the stuff called ‘life’ makes them optimize health-related behavioural patterns A, B, and C. How can those findings be used in the form of marketable digital technologies? If I know the behavioural patterns someone tries to optimize, I can break those patterns down into small components and figure out a way to influence behaviour. It is a common technique in marketing. If I know someone’s lifestyle, and the values that come with it, I can artfully include into that pattern the technology I am marketing. In this specific case, it could be done ethically and for a good purpose, for a change. In that context, my mind keeps returning to that barely marked trend of rising mortality in adult males in high-income countries, since 2016 (https://data.worldbank.org/indicator/SP.DYN.AMRT.MA). WTF? We’ll live, we’ll see.

The understanding of how collective human intelligence goes after health could be, therefore, the kind of scientific bacon our university could bring to the table when starting serious consortial projects with business partners, for the development of intelligent digital technologies in healthcare. Let’s move one step forward. As I have been using artificial neural networks in my research on what I call, and maybe overstate as, collective human intelligence, I have been running those experiments where I take a handful of behavioural patterns, I assign them probabilities of happening (sort of how many folks out of 10 000 will endorse those patterns), and I treat those probabilities as instrumental input in the optimization of pre-defined social outcomes. I almost forgot: I add random disturbance to that form of learning, in the lines of the Black Swan theory (Taleb 2007[1]; Taleb & Blyth 2011[2]).

I nailed down three patterns of collective learning in the presence of randomly happening s**t: recurrent, optimizing, and panic mode. The recurrent pattern of collective learning, which I tentatively expect to be the most powerful, is essentially a cycle with recurrent amplitude of error. We face a challenge, we go astray, we run around like headless chickens for a while, and then we figure s**t out, we progressively settle for solutions, and then the cycle repeats. It is like everlasting learning, without any clear endgame. The optimizing pattern is something I observed when making my collective intelligence optimize something like the headcount of population, or the GDP. There is a clear phase of ‘WTF!’ (error in optimization goes haywire), which, passing through a somewhat milder ‘WTH?’, ends up in a calm phase of ‘what works?’, with very little residual error.

The panic mode is different from the other two. There is no visible learning in the strict sense of the term, i.e. no visible narrowing down of error in what the network estimates as its desired outcome. On the contrary, that type of network consistently goes into the headless chicken mode, and it is becoming more and more headless with each consecutive hundred of experimental rounds, so to say. It happens when I make my network go after some very specific socio-economic outcomes, like price index in capital goods (i.e. fixed assets) or Total Factor Productivity.

Checkpoint for business, once again. That particular thing, about Black Swans randomly disturbing people in their endorsing of behavioural patterns, what business value does it have in a digital cloud? I suppose there are fields of applied medical sciences, for example epidemiology, or the management of healthcare systems, where it pays to know in advance which aspects of our health-related behaviour are the most prone to deep destabilization in the presence of exogenous stressors (e.g. epidemic, or the president of our country trending on TikTok). It could also pay off to know which collectively pursued outcomes act as stabilizers. If another pandemic breaks out, for example, which social activities and social roles should keep going, at all costs, on the one hand, and which ones can be safely shut down, as they will go haywire anyway?


[1] Taleb, N. N. (2007). The black swan: The impact of the highly improbable (Vol. 2). Random House.

[2] Taleb, N. N., & Blyth, M. (2011). The black swan of Cairo: How suppressing volatility makes the world less predictable and more dangerous. Foreign Affairs, 33-39.

Lost in the topic. It sucks. Exactly what I needed.

I keep working on the collective intelligence of humans – which, inevitably, involves working on my intelligent cooperation with other people – in the context of the COVID-19 pandemic. I am focusing on one particular survival strategy which we, Europeans, developed over centuries (I can’t speak for them differently continental folks): the habit of hanging out in relatively closed social circles of knowingly healthy people.

The social logic is quite simple. If I can observe someone for many weeks and months, in a row, I sort of have an eye for them. After some time, I know whom that person hangs out with, I can tell when they look healthy, and, conversely, when they look like s**t, hence suspicious. If I concentrate my social contacts in a circle made of such people, then, even in the absence of specific testing for pathogens, I increase my own safety, and, as I do so, others increase their safety by hanging out with me. Of course, epidemic risk is still there. Pathogens are sneaky, and SARS-CoV-2 is next level in terms of sneakiness. Still, patient, consistent observation of my social contacts, and just as consistent making of a highly controlled network thereof, is a reasonable way to reduce that risk.

That pattern of closed social circles has abundant historical roots. Back in the day, even as recently as in the first half of the 20th century, European societies were very clearly divided in two distinct social orders: that of closed social circles which required introduction, prior to letting anyone in, on the one hand, and the rest of the society, much less compartmentalised. The incidence of infectious diseases, such as tuberculosis or typhoid, was much lower in the former of those social orders. As far as I know, many developing countries, plagued by high incidence of epidemic outbreaks, display such a social model even today.

As I think of it, the distinction between immediate social environment, and the distant one, common in social sciences, might have its roots in that pattern of living in closed social circles made of people whom we can observe on a regular basis. In textbooks of sociology, one can find the statement that the immediate social environment of a person usually consists of 20 to 25 people. That might be a historically induced threshold of mutual observability in a closed social circle.

I remember my impressions during a trip to China, when I was visiting the imperial palace in Beijing, and then several Buddhist temples. Each time, the guide was explaining a lot of architectural solutions in those structures as defences against evil spirits. I perceive Chinese people as normal, in the sense they don’t exactly run around amidst paranoid visions. Those evil spirits must have had a natural counterpart. What kind of evil spirit can you shield against by making people pass, before reaching your room, through many consecutive anterooms, separated by high doorsteps and multi-layered, silk curtains? I guess it is about the kind of evil spirit we are dealing with now: respiratory infections.

I am focusing on the contemporary application of just those two types of anti-epidemic contrivances, namely that of living in closed social circles, and that of staying in buildings structurally adapted to shielding against respiratory infections. Both are strongly related to socio-economic status. Being able to control the structure of your social circle requires social influence, which, in turn, and quite crudely, means having the luxury to wait for people who gladly comply with the rules in force inside the circle. I guess that in terms of frequency, our social relations are mostly work-related. The capacity to wait for the sufficiently safe social interactions, in a work environment, means either a job which I can do remotely, like home office, or a professional position of power, when I can truly choose whom I hang out with. If I want to live in an architectural structure with a lot of anterooms and curtains, to filter people and their pathogens, it means a lot of indoor space used just as a filter, not as habitat in the strict sense. Who pays for that extra space? At the end of the day, sadly enough, I do. The more money I have, the more of that filtering architectural space I can afford.

Generally, epidemic protection is costly, and, when used on a regular basis across society, that protection is likely to exacerbate the secondary outcomes of economic inequalities. By the way, as I think about it, the relative epidemic safety we have been experiencing in Europe, roughly since the 1950s, could be a major factor of another subjective, collective experience, namely that of economic equality. Recently, in many spots of the social space, voices have been rising and saying that equality is not equal enough. Strangely enough, since 2016, we have a rise in mortality among adult males in high-income countries (https://data.worldbank.org/indicator/SP.DYN.AMRT.MA). Correlated? Maybe.

Anyway, I have an idea. Yes, another one. I have an idea to use science and technology as parents to a whole bunch of technological babies. Science is the father, as it is supposed to give packaged genetic information, and that information is the stream of scientific articles on the topic of epidemic safety. Yes, a scientific article can be equated to a spermatozoid. It is relatively small a parcel of important information. It should travel fast but usually it does not travel fast enough, as there are plenty of external censors who cite moral principles and therefore prevent it from spreading freely. The author thinks it is magnificent, and yet, in reality, it is just a building block of something much bigger: life.

Technology is the mother, and, as it is wisely stated in the Old Testament, you’d better know who your mother is. The specific maternal technology here is Artificial Intelligence. I imagine a motherly AI which absorbs the stream of scientific articles on COVID and related subjects, and, generation after generation, connects those findings to specific technologies for enhanced epidemic safety. It is an artificial neural network which creates and updates semantic maps of innovation. I am trying to give the general idea in the picture below.

An artificial neural network is a sequence of equations, at the end of the day, and that sequence is supposed to optimize a vector of inputs so as to match with an output. The output can be defined a priori, or the network can optimize this one too. All that optimization occurs as the network produces many alternative versions of itself and tests them for fitness. What could be those different versions in this case? I suppose each such version would consist in a logical alignment of the type ‘scientific findings <> assessment of risk <> technology to mitigate risk’.

Example: an article describing the way that SARS-CoV-2 dupes the human immune system is associated with the risk generated once a person has been infected, and that risk can be mitigated by proper stimulation of our immune system before the infection (vaccine), or by pharmaceuticals administered after the appearance of symptoms (treatment). Findings reported in the article can: a) advance completely new hypotheses, b) corroborate existing hypotheses, or c) contradict them. Hypotheses can have a strong or a weak counterpart in existing technologies.

The basic challenge I see for that neural network, hence a major criterion of fitness, is the capacity to process scientific discovery as it keeps streaming. It is a quantitative challenge. I will give you an example, with the scientific repository Science Direct (www.sciencedirect.com ), run by the Elsevier publishing group. I typed the ‘COVID’ keyword, and ran a search there. It turns out 28 680 peer-reviewed articles have been published this year, just in the journals that belong to the Elsevier group. It has been 28 680 articles over 313 days since the beginning of the year (I am writing those words on November 10th, 2020), which gives 91,63 articles per day.

On another scientific platform, namely that of the Wiley-Blackwell publishing group (https://onlinelibrary.wiley.com/), 14 677 articles and 47 books have been published on the same topic, i.e. The Virus, which makes 14 677/313 = 46,9 articles per day and a new book every 313/47 = 6,66 days.

Cool. This is only peer-reviewed stuff, sort of the House of Lords in science. We have preprints, too. At the bioRχiv platform (https://connect.biorxiv.org/relate/content/181 ), there have been 10 412 preprints of articles on COVID-19, which gives 10 412/313 = 33,3 articles per day.

Science Direct, Wiley-Blackwell, and bioRχiv taken together give 171,8 articles per day. Each article contains an abstract of no more than 150 words. The neural network I am thinking about should have those 150-word abstracts as its basic food. Here is the deal. I take like one month of articles, thus 30*171,8*150 = 773 100 words in abstracts. Among those words, there are two groups: common language and medical language. If I connect that set of 773 100 words to a digital dictionary, such as the Thesaurus used in Microsoft Word, I can kick out the common words. I stay with medical terminology, and I want to connect it to another database of knowledge, namely that of technologies.
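The arithmetic above can be retraced in a few lines of Python. This is only a sketch: the counts are the ones quoted in the text, whilst the names `daily_rate`, `COMMON_WORDS` and `keep_specialised` are made up for illustration, and the tiny hand-made set of common words is a stand-in for a real digital dictionary.

```python
# Counts quoted in the text, over 313 days of 2020.
DAYS = 313
articles = {
    "Science Direct": 28_680,
    "Wiley-Blackwell": 14_677,
    "bioRxiv": 10_412,
}


def daily_rate(count, days=DAYS):
    """Average number of articles per day."""
    return count / days


rates = {source: daily_rate(n) for source, n in articles.items()}
total_per_day = sum(rates.values())

# One month of abstracts, assuming 150 words per abstract.
WORDS_PER_ABSTRACT = 150
words_per_month = 30 * total_per_day * WORDS_PER_ABSTRACT

# Hypothetical stand-in for a dictionary of common language.
COMMON_WORDS = {"the", "of", "a", "in", "and", "is", "by", "with"}


def keep_specialised(abstract):
    """Drop common words; what remains is candidate terminology."""
    words = abstract.lower().replace(",", " ").replace(".", " ").split()
    return [w for w in words if w not in COMMON_WORDS]


for source, rate in rates.items():
    print(f"{source}: {rate:.2f} articles per day")
print(f"Total: {total_per_day:.1f} articles per day")
print(f"Words in one month of abstracts: {words_per_month:,.0f}")
print(keep_specialised("The spike protein of the virus is neutralised by antibodies."))
```

Computed without intermediate rounding, the monthly word count comes out a little under 773 100, because the text rounds the daily rate to 171,8 before multiplying.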

You know what? I need to take on something which I should have taken on already some time ago, but I was too lazy to do it. I need to learn programming, at least in one language suitable for building neural networks. Python is a good candidate. Back in the day, two years ago, I had a go at Python but, idiot of me, I quit quickly. Well, maybe I wasn’t as much of an idiot as I thought? Maybe having done, over the last two years, the walkabout of logical structures which I want to program has been a necessary prelude to learning how to program them? This is that weird thing about languages, programming or spoken. You never know exactly what you want to phrase out until you learn the lingo to phrase it out.

Now, I know that I need programming skills. However strongly I cling to Excel, it is too slow and too clumsy for really serious work with data. Good. Time to go. If I want to learn Python, I need an interpreter, i.e. a piece of software which allows me to write an algorithm, test it for coherence, and run it. In Python, that interpreter is commonly called ‘Shell’, and the mothership of Python, https://www.python.org/ , runs a shell at https://www.python.org/shell/ . There are others, mind you: https://www.programiz.com/python-programming/online-compiler/ , https://repl.it/languages/python3 , or https://www.onlinegdb.com/online_python_interpreter .

I am breaking down my research with neural networks into partial functions, which, as it turns out, sum up my theoretical assumptions as regards the connection between artificial intelligence and the collective intelligence of human societies. First things first, perception. I use two types of neural networks, one with real data taken from external databases and standardized over respective maxima for individual variables, another one with probabilities assigned to arbitrarily defined phenomena. The first lesson I need to take – or rather retake – in Python is about the structures of data this language uses.
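Those two types of perception can be sketched as follows. The variable names and numbers are invented for illustration, and I assume that all values are positive, so that dividing by the maximum lands in the interval from 0 to 1.

```python
import random

# Perception type 1: real data, standardised over the maximum of each variable.
# Hypothetical raw observations, one list per variable.
raw = {
    "hours_worked": [1800.0, 1750.0, 1900.0],
    "energy_use": [2500.0, 3000.0, 2000.0],
}


def standardise_over_max(values):
    """Divide each observation by the maximum of its variable."""
    peak = max(values)
    if peak == 0:
        raise ValueError("cannot standardise a variable whose maximum is zero")
    return [v / peak for v in values]


standardised = {name: standardise_over_max(vals) for name, vals in raw.items()}

# Perception type 2: probabilities assigned to arbitrarily defined phenomena.
random.seed(0)  # fixed seed, only to make the sketch repeatable
phenomena = ["pattern A", "pattern B", "pattern C"]
probabilities = {name: random.random() for name in phenomena}

print(standardised)
print(probabilities)
```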

The simplest data structure in Python is a list, i.e. a sequence of items, separated with commas, and placed inside square brackets, e.g. my_list = [1, 2, 3]. My intuitive association with lists is that of categorization. In the logical structures I use, a list specifies phenomenological categories: variables, aggregates (e.g. countries), periods of time etc. In this sense, I mostly use fixed, pre-determined lists. Either I make the list of categories by myself, or I take an existing database and I want to extract headers from it, as category labels. Here comes another data structure in Python: a tuple. A tuple is an ordered collection of items which is immutable, i.e. it cannot be changed once created, and which can be unpacked or indexed. As I understand, and I hope I understand it correctly, that makes a tuple a natural container for external raw data, which the algorithm at hand should read without altering it.

Somewhere between a tuple (collection of whatever) and a list (collection of categories), Python distinguishes sets, i.e. unordered collections with no duplicate elements. When I transform a tuple or a list into a set, Python kicks out redundant components.
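Here is a minimal sketch of those three structures side by side. The names and values (`variables`, `observation`, `labels`) are made up for illustration; only the behaviour of lists, tuples and sets is standard Python.

```python
# A list: ordered and mutable, fine for category labels I may want to extend.
variables = ["population", "GDP", "energy use"]
variables.append("hours worked")

# A tuple: ordered and immutable; it can be indexed or unpacked.
observation = ("Poland", 2019, 0.61)  # hypothetical record
country, year, value = observation    # unpacking
first = observation[0]                # indexing

# Trying to change an item of a tuple raises TypeError.
tuple_is_immutable = False
try:
    observation[2] = 0.7
except TypeError:
    tuple_is_immutable = True

# A set: unordered, with no duplicate elements.
labels = ["Poland", "France", "Poland", "Spain", "France"]
unique_labels = set(labels)

print(variables)
print(country, year, value, first)
print(tuple_is_immutable)
print(sorted(unique_labels))  # sorted only to get a stable display order
```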

Wrapping it partially up, I can build two types of perception in Python. Firstly, I can try and extract data from a pre-existing database, grouping it into categories, and then making the algorithm read observations inside each category. For now, the fastest way I found to create and use databases in Python is the sqlite3 module (https://www.tutorialspoint.com/sqlite/sqlite_python.htm ). I need to work on it.
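A first attempt at that kind of perception could look like the sketch below. It uses an in-memory SQLite database, so nothing is written to disk. The table name `observations`, its columns and all the numbers are hypothetical, made up just to show the mechanics of grouping data into categories and reading observations inside each category.

```python
import sqlite3

# In-memory database: it disappears when the connection is closed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations "
    "(country TEXT, variable TEXT, year INTEGER, value REAL)"
)

# Hypothetical observations.
rows = [
    ("Poland", "hours_worked", 2018, 1792.0),
    ("France", "hours_worked", 2018, 1520.0),
    ("Poland", "energy_use", 2018, 2490.0),
    ("France", "energy_use", 2018, 3690.0),
]
conn.executemany("INSERT INTO observations VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Column headers, usable as category labels.
cursor = conn.execute("SELECT * FROM observations")
headers = [column[0] for column in cursor.description]

# Categories found in the data, then observations inside each category.
categories = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT variable FROM observations ORDER BY variable"
    )
]
by_category = {}
for variable in categories:
    query = "SELECT value FROM observations WHERE variable = ? ORDER BY country"
    by_category[variable] = [value for (value,) in conn.execute(query, (variable,))]

conn.close()

print(headers)
print(by_category)
```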

I can see something like a path of learning. I mean, I feel lost in the topic. I feel it sucks. I love it. Exactly the kind of intellectual challenge I needed.

When a best-friend’s-brother-in-law’s-cousin has a specific technology to market

I am connecting two strands of my work with artificial neural networks as a tool for simulating collective intelligence. One of them consists in studying orientations and values in human societies by testing different socio-economic variables as outcomes of a neural network and checking which of them makes that network the most similar to the original dataset. The second strand consists in taking any variable as the desired output of the network, setting an initially random vector of local probabilities as input, adding a random disturbance factor, and seeing how the network is learning in those conditions.

So far, I have three recurrent observations from my experiments with those two types of neural networks. Firstly, in any collection of real, empirical, socio-economic variables, there are 1 – 2 of them which, when pegged as the desired outcome of the neural network, produce a clone of actual empirical reality and that clone is remarkably closer to said reality than any other version of the same network, with other variables as its output. In other words, social reality represented with aggregate variables, such as average number of hours worked per person per year, or energy consumption per person per year, is an oriented reality. It is more like a crystal than like a snowball.

Secondly, in the presence of a randomly occurring disturbance, neural networks can learn in three essential ways, clearly distinct from each other. They can be nice and dutiful, and narrow down their residual error of estimation, down to a negligible level. Those networks just nail it down. The second pattern is that of cyclical learning. The network narrows down its residual error, and then, when I think all is said and done, whoosh!: the error starts swinging again, with a broadening amplitude, and then it decreases again, and the cycle repeats, over and over again. Finally, a neural network prodded with a random disturbance can go haywire. The chart of its residual error looks like the cardiac rhythm of a person who takes on an increasing effort: it swings in an ever-broadening amplitude. This is growing chaos. The funny thing, and the connection to my first finding (you know, that about orientations), is that the way a network learns depends on the real socio-economic variable I set as its desired outcome. My network nails it down, like a pro, when it is supposed to optimize something related to the absolute size of a society: population, GDP, capital stock. Cyclical learning occurs when I make my network optimize something like a structural proportion: average number of hours worked per person per year, density of population per 1 km² etc. Just a few variables put my network in the panic mode, i.e. the one with increasing amplitude of error. Price index in capital goods is one, Total Factor Productivity is another one. Interestingly, price index in consumer goods doesn’t create much of a panic in my network.
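Telling those three patterns apart by eye works for a few charts; for many of them, a crude automatic rule helps. The sketch below is not the procedure I actually used. It is a hypothetical heuristic which compares the amplitude of residual error in the first third of experimental rounds with that in the last third. The function name `classify_learning` and the thresholds `shrink` and `grow` are arbitrary assumptions, and the three error series are synthetic, not outputs of a real network.

```python
import math


def amplitude(segment):
    """Spread between the highest and the lowest error in a segment."""
    return max(segment) - min(segment)


def classify_learning(errors, shrink=0.2, grow=1.5):
    """Label a series of residual errors as 'optimizing', 'recurrent' or 'panic'.

    Compares the amplitude in the last third of the series with the
    amplitude in the first third. Thresholds are arbitrary assumptions.
    """
    third = len(errors) // 3
    if third < 2:
        raise ValueError("need at least six experimental rounds")
    early = amplitude(errors[:third])
    late = amplitude(errors[-third:])
    if early == 0:
        return "optimizing" if late == 0 else "panic"
    ratio = late / early
    if ratio <= shrink:
        return "optimizing"
    if ratio >= grow:
        return "panic"
    return "recurrent"


# Synthetic error series over 300 rounds, for illustration only.
rounds = range(300)
optimizing = [math.exp(-t / 30) * math.sin(t / 5) for t in rounds]
recurrent = [math.sin(t / 5) * (0.6 + 0.4 * math.sin(t / 50)) for t in rounds]
panic = [math.exp(t / 100) * math.sin(t / 5) for t in rounds]

for name, series in [("optimizing", optimizing), ("recurrent", recurrent), ("panic", panic)]:
    print(name, "->", classify_learning(series))
```

A rule like that only looks at two windows, so a cycle much longer than the whole series of rounds would be misread as either optimizing or panic.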

There is a connection between those two big observations. The socio-economic variables which come out as the most likely orientations of human societies are those which seem to be optimized in that cyclical, sort of circular learning, neither with visible growth in precision, nor with visible panic mode. Our human societies seem to orient themselves on those structural proportions, which they learn and relearn over and over again.

The third big observation I made is that each kind of learning, i.e. whichever of the three signalled above, makes my neural network loosen its internal coherence. I measure that coherence with the local Euclidean distance between variables: $\frac{1}{k}\sum_{j=1}^{k}\sqrt{(x_i - x_j)^2}$. That distance tends to swing cyclically, as if the network needed to loosen its internal connections in order to absorb a parcel of chaos, and then it tightens back, when chaos is being transformed into order.

I am connecting those essential outcomes of me meddling with artificial neural networks to the research interests I developed earlier this year: the research on cities and their role in our civilisation. One more time, I am bringing that strange thought which came to my mind as I was cycling through the empty streets of my hometown, Krakow, Poland, in the first days of the epidemic lockdown, in March 2020: ‘This city looks dead without people in the streets. I have never seen it as dead as now, even in the times of communism, back in the 1970s. I just wonder, how many human footsteps a day this city needs in order to be truly alive?’. After I had that thought, I started digging and I found quite interesting facts about cities and urban space. Yet, another strand of thinking was growing in my head, the one about the impact of sudden, catastrophic events, such as epidemic outbreaks, on our civilisation. I kept thinking about Black Swans.   

I have been reading some history, I have been rummaging in empirical data, I have been experimenting with neural networks, and I have progressively outlined an essential hypothesis, to dig even further into: our social structures absorb shocks, and we do it artfully. Collectively, we don’t just receive s**t from Mother Nature: we absorb it, i.e. we learn how to deal with it. As a matter of fact, we have an amazing capacity to absorb shocks and to create the impression, on the long run, that nothing bad really happened, and that we just keep progressing gloriously. If we think about all the most interesting s**t in our culture, it all comes from one place: shock, suffering, and the need to get over it.

In 2014, I visited an exposition of Romanesque art (in Barcelona, in the local Museum of Catalonia). Please, do not confuse Romanesque with Ancient Roman. Romanesque art is the early medieval one, roughly until and through the 12th century (historians might disagree with me as regards this periodization, but c’mon guys, this is a blog, I can say crazy things here). Romanesque art covers everything that happened between the collapse of the Western Roman Empire and the first big outbreak of plague in Europe, sort of. And so I walk along the aisles, in that exposition of Romanesque art, and I see replicas of frescoes, originally located in Romanesque churches across Europe. All of them sport Jesus Christ, and in all of them Jesus looks like an archetypical Scottish sailor: big, bulky, with a plump, smiling face, curly hair, short beard, and happy as f**k. On all those frescoes Jesus is happy. Can you imagine The Last Supper where Jesus dances on the table, visibly having the time of his life? Well, it is there, on the wall of a small church in Germany.

I will put it in perspective. If you look across the Christian iconography today, Jesus is, recurrently, that emaciated guy, essentially mangled by life, hanging sadly from his cross, and the apostles look just the same (no cross, however), and there is all that memento mori stuff sort of hanging around, in the air. Still, this comes from the times after the first big outbreak of plague in Europe. Earlier on, on the same European continent, for roughly 800 years between the fall of the Western Roman Empire and the first big epidemic hit, Jesus and all his iconography had been in the lines of Popeye The Sailor, completely different from what we intuitively associate Christianity with today.

It is worth keeping in mind that epidemic diseases have always been around. Traditions such as shaking hands to express trust and familiarity, or spitting in those hands before shaking them to close a business deal, all come from those times when any stranger, i.e. someone coming from further than 50 miles away, was, technically, an epidemic threat. For hundreds of years, we had sort of been accepting those pathogens at face value, as the necessary s**t which takes nothing off our joy of life, and then ‘Bang!’, 1347 comes, and we really see how hard an epidemic can hit when that pathogen really means business, and our culture changes deeply.

That’s the truly fundamental question which I want to dig into and discuss: can I at all, and, if so, how can I mathematically model the way our civilisation learns, as a collectively intelligent structure, through and from the experience of COVID-19 pandemic?

Collectively intelligent structures, such as I see them, learn by producing many alternative versions of themselves – each of those versions being like one-mutation neighbour to others –   and then testing each such version as for its fitness to optimize a vector of desired outcomes. I wonder how it can happen now, in this specific situation we are in, i.e. the pandemic? How can a society produce alternative versions of itself? We test various versions of epidemic restrictions. We test various ways of organizing healthcare. We probably, semi-consciously test various patterns of daily social interactions, on the top of official regulations on social mobility. How many such mutations can we observe? What is our desired outcome?

I start from the end. My experiments with neural networks applied as simulators of collective human intelligence suggest that we optimize, most of all, structural proportions of our socio-economic system. The average number of hours worked per person per year, and the amount of human capital accumulated in an average person, in terms of schooling years, come to the fore, by far. Energy consumption per person per year is another important metric.

Why labour? Because labour, at the end of the day, is social interaction combined with expenditure of energy, which, in turn, we have from our food base. Optimizing the amount of work per person, together with the amount of education we need in order to perform that work, is a complex adaptive mechanism, where social structures arrange themselves so that their members find some kind of balance with the grub they can grab from their environment. Stands to reason.

Now, one more thing as for the transformative impact of COVID-19 on our civilization. I am participating in a call for R&D tenders, with the Polish government, more specifically with the National Centre for Research and Development (https://www.ncbr.gov.pl/en/ ). They have announced a special edition of the so-called Fast Track call, titled ‘Fast Track – Coronaviruses’. First of all, please pay attention to the plural form of coronaviruses. Second of all, that specific track of R&D goes as broadly as calling for architectural designs supposed to protect against contagion. Yes, if that call is not a total fake (which happens sometimes, when a best-friend’s-brother-in-law’s-cousin has a specific technology to market, for taxpayers’ money), the Polish government has data indicating that pandemic is going to be the new normal.

Both needed and impossible

I return to and focus on the issue of behavioural change under the impact of an external stressor, and the use of an artificial neural network to simulate it. I somehow connect to what I wrote in ‘Cross breeding my general observations’, and I want to explore the outcomes to expect from the kind of s**t which is happening right now: climate change, pandemic, rapid technological change with the rise of digital technologies, urbanisation, social unrest…name it. I want to observe Black Swans and study the way they make their way into our normal (see Black Swans happen all the time). I intend to dissect situations when exogenous stressors trigger the so-far dormant patterns of behaviour, whilst randomly pushing the incumbent ones out of the system, and, in the process, those initially exogenous stressors become absorbed (AKA endogenized) by the society.

Back to plain human lingo, I assume that we, humans, do stuff. We can do only what we have learnt to do, therefore anything we do is a recurrent pattern of behaviour, which changes constantly in the process of learning. We differ in our individual patterns, and social life can be represented as the projection of a population into a finite set of behavioural patterns, which, further in this development, I will label as ‘social roles’. You probably know what a pool table looks like. Imagine a pretty continuous stream of pool balls, i.e. humans, spilling over an immense pool table with lots of holes in it. Each hole is a social role, and each human ball finally ends up in one of the holes, i.e. endorsing one among a finite number of social roles. Probabilistically, each social role can be described with the probability that the average homo sapiens being around endorses that role.

Thus, I study a human population projecting itself into a finite set SR = {sr1, sr2, …, srm} of m social roles, coupled with the set PSR = {p(sr1), p(sr2), …, p(srm)} of probabilities that each given social role is being endorsed. Those two coupled sets, i.e. SR and PSR, make a collectively intelligent social structure, able to learn by experimenting with many alternative versions of itself. This, in turn, implies two processes, namely production of and selection from among those alternative versions. Structural intelligence manifests as the capacity to produce and select alternative versions whilst staying coherent laterally and longitudinally. Lateral coherence is observable as functional connection between social roles in the set SR, whilst the longitudinal one is continuity in the structure of the set SR. Out of those two coherences, the lateral one is self-explanatory and assumed a priori: no social role can exist in total abstraction from other social roles, i.e. without any functional connection whatsoever. On the other hand, I assume that longitudinal coherence can be broken, in the sense that under some conditions the set SR can turn into a new set SR’, which will contain very different a repertoire of social roles.
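The pool-table picture can be played out as a toy simulation. The four social roles, their weights and the headcount below are invented for illustration; `random.choices` drops each human ball into one hole, and the shares of the population found in each hole serve as the set PSR.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed, only to make the sketch repeatable

# Hypothetical set SR of m = 4 social roles, with made-up chances
# that a human ball ends up in each hole.
SR = ["farmer", "teacher", "nurse", "coder"]
hole_weights = [0.1, 0.3, 0.2, 0.4]

# The stream of pool balls: each human ends up in exactly one role.
population_size = 10_000
endorsed = random.choices(SR, weights=hole_weights, k=population_size)

# PSR: observed probability that an average human endorses each role.
counts = Counter(endorsed)
PSR = {role: counts[role] / population_size for role in SR}

print(PSR)
```

Since each ball ends up in exactly one hole, the probabilities in this toy PSR add up to 1.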

I go into maths. Each social role sri, besides being associated with its probability of endorsement p(sri), is associated with a meta-parameter, i.e. its lateral coherence LC(sri) with m – 1 other social roles in the set SR, and that coherence is defined as the average Euclidean distance between p(sri) and the probabilities p(srj) of other social roles, as in Equation (1) below.

Equation (1)
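Written out from the verbal definition above, and assuming that the average is taken over the m – 1 other social roles (the choice of that denominator is my assumption), Equation (1) could read:

```latex
LC(sr_i) = \frac{1}{m-1} \sum_{\substack{j=1 \\ j \neq i}}^{m} \sqrt{\left[ p(sr_i) - p(sr_j) \right]^{2}}
```

With a single probability per social role, the square root of the squared difference boils down to the absolute difference |p(sr_i) – p(sr_j)|.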

Logically, we have one more component of the collectively intelligent social structure, namely the set LCSR = {LC(sr1), LC(sr2), …, LC(srm)} of lateral coherences between social roles.
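To make the set LCSR tangible, here is a short sketch which computes lateral coherence for each role as the average Euclidean distance between its probability and the probabilities of the other roles. The probabilities are hypothetical, and averaging over the m – 1 other roles is my reading of the verbal definition, not a settled detail.

```python
import math

# Hypothetical probabilities of endorsement for m = 4 social roles.
PSR = {"farmer": 0.1, "teacher": 0.3, "nurse": 0.2, "coder": 0.4}


def lateral_coherence(psr, role):
    """Average Euclidean distance between p(role) and the other probabilities."""
    p_i = psr[role]
    others = [p for name, p in psr.items() if name != role]
    return sum(math.sqrt((p_i - p_j) ** 2) for p_j in others) / len(others)


LCSR = {role: lateral_coherence(PSR, role) for role in PSR}

for role, lc in LCSR.items():
    print(f"LC({role}) = {lc:.4f}")
```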

The collectively intelligent social structure, manifest as three mutually coupled sets, i.e. SR, PSR, and LCSR, optimizes a vector of social outcomes. In order to keep some sort of methodological purity, I will further designate that vector as a set, namely the set O = {o1, o2, …, ok} of k social outcomes. Still, we keep in mind that in mathematics, the transition from set to vector and back is pretty simple and common-sense-based. A set has the same variables in it as the vector made out of that set, only we can cherry-pick variables from a set, whilst we cannot really do it out of a vector, on the account of them variables being bloody entangled in the vector. When a set turns into a vector, its variables go mousquetaire, Dumas style, and they are one for all and all for one, sort of.

With the above assumptions, a collectively intelligent social structure can be represented as the coupling of four sets: social roles SR, probabilities of endorsement as regards those social roles PSR, lateral coherences between social roles LCSR, and social outcomes O. Further, the compound notation {SR, PSR, LCSR, O} is used to designate such a structure.

Experimental instances happen one by one, and therefore they can be interpreted as consecutive experiments, possible to designate mathematically as units t of time. For the sake of clarity, the current experimental instance of the structure {SR, PSR, LCSR, O} is designated with ‘t’, past instances are referred to as t – l, where ‘l’ stands for temporal lag, and the hypothetical first state of that structure is t0. Any current instance of {SR, PSR, LCSR, O} is notated as {SR(t), PSR(t), LCSR(t),O(t)}.

Consistently with the Interface Theory of Perception (Hoffman et al. 2015[1], Fields et al. 2018[2]), as well as the theory of Black Swans (Taleb 2007[3]; Taleb & Blyth 2011[4]), it is assumed that the structure {SR, PSR, LCSR, O} internalizes exogenous stressors, both positive and negative, transforming them into endogenous constraints, therefore creating an expected vector E(O) of outcomes. Each consecutive instance {SR(t), PSR(t), LCSR(t),O(t)} of the structure {SR, PSR, LCSR, O} learns by pitching its real local outcomes O(t) against their expected local state E[O(t)].

Internalization of exogenous stressors allows studying the whole sequence of l states, i.e. from instance {SR(t0), PSR(t0), LCSR(t0), O(t0)} to {SR(t), PSR(t), LCSR(t), O(t)}, as a Markov chain of states, which transform into each other through a σ-algebra. The current state {SR(t), PSR(t), LCSR(t), O(t)} and its expected outcomes E[O(t)] contain all the information from past learning, and therefore the local error in adaptation, i.e. e(t) = {E[O(t)] – O(t)}*dO(t), where dO(t) stands for the local derivative (local first moment) of O(t), conveys all that information from past learning. That factorisation of error in adaptation into a residual difference and a first moment is based on the intuition that collective intelligence is always on the move, and any current instance {SR(t), PSR(t), LCSR(t), O(t)} is just a snapshot of an otherwise constantly changing social structure.
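A minimal sketch of the local error e(t) = {E[O(t)] – O(t)}*dO(t) may help to see how the two factors combine. It rests on one loud assumption: the local derivative dO(t) is approximated by the finite difference O(t) – O(t-1), which the text does not specify. The function name and the numbers are hypothetical.

```python
import numpy as np

def local_error(expected, actual, previous_actual):
    """e(t) = (E[O(t)] - O(t)) * dO(t), computed outcome by outcome."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    previous_actual = np.asarray(previous_actual, dtype=float)
    # Assumption: dO(t) is taken as the one-step difference O(t) - O(t-1)
    d_o = actual - previous_actual
    return (expected - actual) * d_o

# Two hypothetical social outcomes
e = local_error(expected=[1.0, 2.0], actual=[0.8, 2.5], previous_actual=[0.7, 2.6])
print(e)
```

In this example the first outcome falls short of expectations whilst growing, and the second overshoots whilst declining; both combinations give a positive product.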

With the assumptions above, {SR(t), PSR(t), LCSR(t),O(t)} = {SR(t-1) + e(t-1), PSR(t-1) + e(t-1), LCSR(t-1) + e(t-1), O(t-1) + e(t-1)} and E[O(t)] = E[O(t-1)] + e(t-1). The logic behind adding the immediately past error to the present state {SR(t), PSR(t), LCSR(t),O(t)} is that collective learning is essentially incremental, and not revolutionary. Each consecutive state {SR(t), PSR(t), LCSR(t),O(t)} is a one-mutation neighbour of the immediately preceding state {SR(t-1), PSR(t-1), LCSR(t-1),O(t-1)} rather than its structural modification. Hence, we are talking about arithmetical addition rather than multiplication or division. Of course, it is to keep in mind that subtraction is a special case of addition, when one component of addition has a negative sign.
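The incremental, one-mutation-neighbour logic above can be sketched as a single additive step. This is only an illustration under stated assumptions: the error is treated as one scalar added to every numerical component, the set SR itself (a list of roles, not numbers) is left out, and the names `next_state`, `state` and `expected` are hypothetical.

```python
def next_state(state, expected, error):
    """One incremental step: every numerical component shifts by the past error."""
    # Arithmetical addition only; a negative error amounts to subtraction
    new_state = {name: [v + error for v in values] for name, values in state.items()}
    new_expected = [v + error for v in expected]
    return new_state, new_expected

# Hypothetical state at t-1, with E[O(t-1)] = 1.2 and e(t-1) = 0.01
state = {"PSR": [0.5, 0.3], "LCSR": [0.2, 0.2], "O": [1.0]}
new_state, new_expected = next_state(state, expected=[1.2], error=0.01)
print(new_state, new_expected)
```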

Exogenous stressors act upon human behaviour at two levels: recurrent and incidental. Recurrent exogenous stressors make people reconsider, systematically, their decisions to endorse a given social role, in the sense that those decisions, besides taking into account the past state of the structure {SR(t-1), PSR(t-1), LCSR(t-1), O(t-1)}, incorporate randomly distributed, current exogenous information X(t). That random exogenous parcel of information affects all the people likely to endorse the given social role sri which, in turn, means arithmetical multiplication rather than addition, i.e. PSR(t) = X(t)*[PSR(t-1) + e(t-1)].
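The recurrent stressor can be sketched as a multiplicative step, PSR(t) = X(t)*[PSR(t-1) + e(t-1)]. The text only says that X(t) is randomly distributed, so drawing it from a uniform distribution on [0, 1) is an assumption made here, as are the function name and the example values.

```python
import numpy as np

def recurrent_update(psr_prev, e_prev, x_t):
    """PSR(t) = X(t) * [PSR(t-1) + e(t-1)], applied to every social role."""
    return x_t * (np.asarray(psr_prev, dtype=float) + e_prev)

rng = np.random.default_rng(seed=0)
x_t = rng.uniform(0.0, 1.0)  # assumption: X(t) is one uniform draw shared by all roles
psr_t = recurrent_update([0.5, 0.3, 0.2], e_prev=-0.05, x_t=x_t)
print(x_t, psr_t)
```

Because X(t) multiplies the whole bracket, a single draw rescales all the probabilities at once, which is what distinguishes it from the additive error term.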

Incidental exogenous stress, in this specific development, is very similar to Black Swans (Taleb 2007 op. cit.; Taleb & Blyth 2011 op. cit.), i.e. it consists of short-term, violently disturbing events, likely to drive some social roles extinct or, conversely, to trigger new social roles into existence. Extinction of a social role means that its probability becomes null: p(sri) = 0. The birth of a new social role is more complex. Social roles are based on pre-formed skillsets and socially tested strategies of gaining payoffs from those skillsets. A new social role appears in two phases. In the first phase, skills necessary to endorse that role progressively form in the members of a given society, yet those skills have not played out sufficiently to be endorsed as the social identity of an individual. Just to give an example, the recent and present development of cloud computing as a distinct digital business encourages the formation of skills in trading, at the business level, large datasets, such as those collected via cookie algorithms. Trade in datasets is real, and the skills required are just as real, yet there is no officially labelled profession of data trader yet. Data trader is something like a dormant social role: the skills are there, in the humans involved, and still there is nothing to endorse officially. A more seasoned social role, which followed a similar trajectory, is that of an electricity broker. As power grids have been evolving towards increasing digitalisation and liquidity in the transmission of power, it became possible to do daily trade in power capacity, at first, and then a distinct profession, that of a power broker, emerged together with institutionalized power exchanges.

That first phase of emergence of a new social role creates dormant social roles, i.e. ready-to-use skillsets which need just a small encouragement, in the form of socially recognized economic incentives, to kick into existence. Mathematically, it means that the set SR of social roles entails two subsets: active and dormant. Active social roles display p(sri;t) > 0, and, under the impact of a local, Black-Swan type event, they can fall to p(sri;t) = 0. Dormant social roles are at p(sri;t) = 0 for now, and can rise to p(sri;t) > 0 in the presence of a Black Swan.
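The split of SR into active and dormant subsets, and the way a Black Swan moves roles between them, can be sketched as follows. The role names echo the examples given earlier; the probabilities and the helper names `split_roles` and `black_swan` are purely illustrative assumptions.

```python
def split_roles(p_by_role):
    """Partition social roles into active (p > 0) and dormant (p == 0)."""
    active = {role for role, p in p_by_role.items() if p > 0}
    dormant = {role for role, p in p_by_role.items() if p == 0}
    return active, dormant

def black_swan(p_by_role, extinct=(), awakened=None):
    """Return a copy where `extinct` roles drop to 0 and `awakened` roles get p > 0."""
    shocked = dict(p_by_role)
    for role in extinct:
        shocked[role] = 0.0
    for role, p in (awakened or {}).items():
        shocked[role] = p
    return shocked

# Hypothetical probabilities of endorsement before the disturbance
before = {"electricity broker": 0.02, "data trader": 0.0}
after = black_swan(before, extinct=["electricity broker"], awakened={"data trader": 0.01})
print(split_roles(before))
print(split_roles(after))
```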

In the presence of active recurrent stress upon the structure {SR, PSR, LCSR, O}, thus if we assume X(t) > 0, I can present a succinct mathematical example of Black-Swan-type exogenous disturbance, with just two social roles, sr1 and sr2. Before the disturbance, sr1 is active and sr2 is dormant. In other words, P(sr1; t-1)*X(t-1) > 0 whilst P(sr2; t-1)*X(t-1) = 0. With the component of learning by incremental error in a Markov chain of states, it means [P(sr1; t-2) + e(t-2)]*X(t-1) > 0 and [P(sr2; t-2) + e(t-2)]*X(t-1) = 0, which logically equates to P(sr1; t-2) > – e(t-2) and P(sr2; t-2) = – e(t-2).

After the disturbance, the situation changes dialectically, namely P(sr1; t-1)*X(t-1) = 0 and P(sr2; t-1)*X(t-1) > 0, implying that P(sr1; t-2) = – e(t-2) and P(sr2; t-2) > – e(t-2). As you can probably recall from math classes in high school, there is no way a probability can be negative, and therefore, if I want the expression ‘– e(t-2)’ to make any sense at all in this context, I need e(t-2) ≤ 0. As e(t) = {E[O(t)] – O(t)}*dO(t), e(t) ≤ 0 occurs when E[O(t)] ≤ O(t) or dO(t) ≤ 0.

Therefore, the whole construct of Black-Swan-type exogenous stressors such as presented above seems to hold logically when:

>> the structure {SR, PSR, LCSR, O} yields local real outcomes O(t) greater than or equal to expected outcomes E[O(t)]; in other words, that structure should either yield no error at all (i.e. a perfect match between actual outcomes and expected ones), thus show a perfect adaptation, or overshoot expectations with its actual outcomes…

…or…

>> …the first moment of local real outcomes is perfectly still (i.e. equal to zero) or negative
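To check the two conditions listed above, it can help to tabulate the sign of e(t) = {E[O(t)] – O(t)}*dO(t) for every combination of signs of its two factors. The sketch below does just that; `gap` and `error_sign` are names introduced here for illustration only. One caveat is worth keeping in mind when reading the list: taken strictly, the product is non-positive when the two factors have opposite signs or when either of them is zero, and it turns positive when both factors are negative, just as when both are positive.

```python
from itertools import product

def error_sign(gap, d_o):
    """Sign of e(t) = gap * dO(t), where gap = E[O(t)] - O(t)."""
    e = gap * d_o
    return (e > 0) - (e < 0)  # +1, 0 or -1

# All combinations of negative, zero and positive factors
cases = {(gap, d_o): error_sign(gap, d_o) for gap, d_o in product((-1, 0, 1), repeat=2)}
for (gap, d_o), sign in sorted(cases.items()):
    print(f"gap={gap:+d}, dO={d_o:+d} -> sign of e(t): {sign:+d}")
```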

Of course, there is an open possibility of such instances, in the structure {SR, PSR, LCSR, O}, in which outcomes fall short of expectations whilst still growing, thus E[O(t)] > O(t) with dO(t) > 0, which makes e(t) > 0. In these instances, according to the above-deployed logic of collective intelligence, the next experimental round t+1 can yield negative probabilities p(sri) of endorsing specific social roles, thus an impossible state. Can collective intelligence of a human society go into those impossible states? I admit I have no clear answer to that question, and therefore I asked around. I mean, I went to Google Scholar. I found three articles, all of them, interestingly, in the field of physics. In an article by Feynman, R. P., published in 1987 and titled ‘Negative probability’, in ‘Quantum implications: essays in honour of David Bohm’ (pages: 235-248, https://cds.cern.ch/record/154856/files/pre-27827.pdf), I read: ‘[…] conditional probabilities and probabilities of imagined intermediary states may be negative in a calculation of probabilities of physical events or states. If a physical theory for calculating probabilities yields a negative probability for a given situation under certain assumed conditions, we need not conclude the theory is incorrect. Two other possibilities of interpretation exist. One is that the conditions (for example, initial conditions) may not be capable of being realized in the physical world. The other possibility is that the situation for which the probability appears to be negative is not one that can be verified directly. A combination of these two, limitation of verifiability and freedom of initial conditions, may also be a solution to the apparent difficulty’.

This sends me back to my economics and to the concept of economic equilibrium, which assumes that societies can be in a state of economic equilibrium or in a lack thereof. In the former case, they can sort of steady themselves, and in the latter… Well, when you have no balance, man, you need to move so as to gain some. If a collectively intelligent social structure yields negative probability attached to the occurrence of a given social role, it can indicate truly impossible a state, yet impossibility is understood along the lines of quantum physics. It is a state which our society should get the hell out of, ‘cause it is not gonna last, on the account of being impossible. An impossible state is not a state that cannot happen: it is a state which cannot stay in place.

Well, I am having real fun with that thing. I started from an innocent model of collective intelligence, I found myself cornered with negative probabilities, and I guess I found my way out by referring to quantum physics. The provisional moral I draw from this fairy tale is that a collectively intelligent social structure, whose learning and adaptation can be represented as a Markov chain of states, can have two types of states: the possible AKA stable ones, on the one hand, and the impossible AKA transitory ones, on the other hand.

The structure {SR, PSR, LCSR, O} is in a stable, and therefore possible, state when it yields local real outcomes O(t) greater than or equal to expected outcomes E[O(t)]: it is perfectly fit to fight for survival or it overshoots expectations. Another possible state is that of real outcomes O(t) being perfectly still or negative in their first moment. On the other hand, when the structure {SR, PSR, LCSR, O} yields real outcomes O(t) smaller than expected outcomes E[O(t)], in the presence of a positive local gradient of change in those real outcomes, it is in an impossible, unstable state. That thing from quantum physics fits surprisingly well with a classical economic theory, namely the theory of innovation by Joseph Schumpeter: economic systems transition from one neighbourhood of equilibrium to another one, and they transition through states of disequilibrium, which are both needed for social change, and impossible to hold for a long time.

When the structure {SR, PSR, LCSR, O} hits an impossible state, where some social roles happen with negative probabilities, that state is an engine which powers accelerated social change.     


[1] Hoffman, D. D., Singh, M., & Prakash, C. (2015). The interface theory of perception. Psychonomic Bulletin & Review, 22(6), 1480-1506.

[2] Fields, C., Hoffman, D. D., Prakash, C., & Singh, M. (2018). Conscious agent networks: Formal analysis and application to cognition. Cognitive Systems Research, 47, 186-213. https://doi.org/10.1016/j.cogsys.2017.10.003

[3] Taleb, N. N. (2007). The black swan: The impact of the highly improbable (Vol. 2). Random House.

[4] Taleb, N. N., & Blyth, M. (2011). The black swan of Cairo: How suppressing volatility makes the world less predictable and more dangerous. Foreign Affairs, 33-39.

Time for a revolution

I am rethinking the economics of technological change, especially in the context of cloud computing and its spectacular rise, as, essentially, a new and distinct segment of digital business. As I am teaching microeconomics, this semester, I am connecting mostly to that level of technological change. I want to dive a bit more into the business level of cloud computing, and thus I pass in review the annual reports of heavyweights in the IT industry: Alphabet, Microsoft and IBM.

First of all, a didactic reminder is due. When I want to study a business which is publicly listed in a stock market, I approach that business from its investor-relations side, and more specifically through its investor-relations site. Each company listed in the stock market runs such a site, dedicated to showing, with some reluctance to full transparency, mind you, the way the business works. Thus, in my review, I call by, respectively: https://abc.xyz/investor/ for Alphabet (you know, the mothership of Google), https://www.microsoft.com/en-us/investor as regards Microsoft, and https://www.ibm.com/investor as for them Ibemians.

I start with the Mother of All Clouds, i.e. with Google and its mother company, namely Alphabet. Keep in mind: the GDP of Poland, my home country, is roughly $590 billion, and the gross margin which Alphabet generated in 2019 was $89 857 million, thus 15% of the Polish GDP. That’s the size of business we are talking about and I am talking about that business precisely for that reason. There is a school in economic sciences, called new institutionalism. Roughly speaking, those guys study the question why big corporate structures exist at all. The answer is that corporations are a social contrivance which allows internalizing a market inside an organization. You can understand the general drift of that scientific school if you study a foundational paper by O.D. Hart (Hart 1988[1]). Long story short, when a corporate structure grows as big as Alphabet, I can assume its internal structure is somehow representative of the digital industry as a whole. You could say: but them Google people, they don’t make hardware. No, they don’t, and yet they massively invest in hardware, mostly in servers. Their activity translates into a lot of IT hardware.
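The comparison of scale above is a one-line piece of arithmetic, which can be checked as follows. Both figures are the ones quoted in the text, expressed in millions of US dollars; the variable names are just for illustration.

```python
# Figures as quoted in the text, in millions of US dollars
gdp_poland_musd = 590_000            # roughly $590 billion
alphabet_gross_margin_musd = 89_857  # Alphabet, 2019

share = alphabet_gross_margin_musd / gdp_poland_musd
print(f"Alphabet's gross margin as a share of Polish GDP: {share:.1%}")
```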

Anyway, I assume that the business structure of Alphabet is informative about the general structure and the drift of the digital business globally. In the two tables below, I show the structure of their revenues. For the non-economic people: revenue is the value of sales, or, in analytical terms, Price multiplied by Quantity.     


Semi-annual revenue of Alphabet Inc.

The next step is to understand specifically the meaning of categories defined as ‘Segments’, and the general business drift. The latter is strongly rooted in what the Google tribe cherishes as ‘Moonshots’, and which means technological change seen as revolution rather than evolution. Their business develops by technological leaps, smoothed by exogenous economic conditions. Those exogenous conditions translate into Alphabet’s business mostly as advertising. In the subsection titled ‘How we make money’, you can read it explicitly. By the way, under the mysterious categories of ‘Google other’ and ‘Other Bets revenues’, Alphabet understands, respectively:

>> Google other: Google Play, including sales of apps and in-app purchases, as well as digital content sold in the Google Play store; hardware, including Google Nest home products, Pixelbooks, Pixel phones and other devices; YouTube non-advertising, including YouTube Premium and YouTube TV subscriptions and other services;

>> Other Bets revenues: Other Bets are, in the Google corporate jargon, young and risky businesses, slightly off the main Googly track; right now, their revenues cover the sales of Access internet, TV services, Verily licensing, and R&D services.

Against that background, Google Cloud, which most of us are not really familiar with, as it is a business-to-business functionality, shows interesting growth. Still, it is worth keeping in mind that Google is cloud: ‘Google was a company built in the cloud. We continue to invest in infrastructure, security, data management, analytics and AI’ (page 7 of the 10-K annual report for 2019). YouTube ads, which show a breath-taking ascent in the company’s revenue, base their efficiency and attractiveness on artificial intelligence operating in a huge cloud of data regarding the viewers’ activity on YouTube.

Now, I want to have a look at Alphabet from other financial angles. Their balance sheet, i.e. their capital account, comes next in line. In the two tables below, I present that balance sheet one side at a time, and I start with the active side, i.e. with assets. I use the principle that if I know what kind of assets a company invests money in, I can guess a lot about the way their business works. When I look at Alphabet’s assets, the biggest single category is that of ‘Marketable securities’, closely followed by ‘Property and Equipment’. They are like a big factory with a big portfolio of financial securities, and the portfolio is noticeably bigger than the factory. This is a pattern which I have recently observed in a lot of tech companies. They hold huge reserves of liquid financial assets, probably in order to max out on their flexibility. You never know when exactly you will face both the opportunity and the necessity to invest in the next technological moonshot. Accounts receivable and goodwill come in the second place, as regards the value in distinct groups of assets. A bit of explanation is due as for that latter category. Goodwill might suggest someone had good intentions. Weeell, sort of. When you are a big company and you buy a smaller company, and you obviously overpay for the control over that company, over the market price of that stock, the surplus you have overpaid you call ‘Goodwill’. It means that this really expensive purchase is, at the same time, very promising, and there is likely to be plenty of future profits. When? In the future, stands to reason.

Now, I call by the passive side of Alphabet’s balance sheet, i.e. by their liabilities and equity, which is shown schematically in the next table below. The biggest single category here, i.e. the biggest distinct stream of financial capital fuelling this specific corporate machine, is made of ‘Retained Earnings’, and stock equity comes in the second place. Those two categories taken together made up 73% of Alphabet’s total capital base by the end of 2019. Still, by the end of 2018, that share was 77%. Whilst Alphabet retains a lot of its net profit, something like 50%, there is a subtle shift in their financing. They seem to be moving from an equity-based model of financing towards a more liability-based one. It happens by baby steps, yet it happens. Some accrued compensations and benefits (i.e. money which Alphabet should pay to their employees, yet they don’t, because…), some accrued revenue share… all those little movements indicate a change in their way of accumulating and using capital.

The next two tables below give a bird’s eye view of Alphabet in terms of trends in their financials. They have a steady profitability (i.e. capacity to make money out of current business), their capacity to bring return on equity and assets steadily grows, and they shift gently from equity-based finance towards more complex a capital base, with more long-term liabilities. My general conclusion is that Alphabet is up to something, like really. They claim they constantly do revolution, but my gut feeling is that they are poising themselves for a really big revolution, business-wise, coming shortly. Those reserves of liquid financial assets, that accumulation of liabilities… All that stuff is typical in businesses coiling for a big leap. There is another thing, closely correlated with this one. In their annual report, Alphabet claims that they mostly make money on advertising. In a narrow, operational sense, it might be true. Yet, when I have a look at their cash-flow, it looks different. What they have cash from, first and most of all, are maturities and sales of financial securities, and this one comes as by far the dominant single source of cash, hands down. They make money on financial operations in the stock market, in somehow plainer a human lingo. Then, in the second place, come two operational inflows of cash: amortization of fixed assets, and tax benefits resulting from the payment of stock-based compensations. Alphabet makes real money on financial operations and tax benefits. They might be a cloud in their operations, but in their cash-flows they are a good, old-fashioned financial scheme.

Now, I compare with Microsoft (https://www.microsoft.com/en-us/Investor/sec-filings.aspx). In a recent update, titled ‘#howcouldtheyhavedoneittome’, I discussed the emerging position of cloud computing in the overall business of Microsoft. Now, I focus on their general financials, with a special focus on their balance sheet and their cash-flow. I show a detailed view of both in the two tables that follow. Capital-wise, Microsoft follows slightly different a pattern as compared to Alphabet, although some common denominators appear. On the active side, i.e. as regards the ways of employing capital, Microsoft seems to be even more oriented on liquid financial assets than Alphabet. Cash, its equivalents, and short-term investments are, by far, the biggest single category of assets in Microsoft. The capital they have in property and equipment is far lower, and, interestingly, almost equal to goodwill. In other words, when Microsoft acquires productive assets, it seems to be like 50/50 their own ones, on the one hand, and those located in acquired companies, on the other hand. As for the sources of capital, Microsoft is clearly more debt-based, especially long-term debt, than Alphabet, whilst retaining comparatively lower a proportion of their net income. It looks as if Alphabet was only discovering, by now, the charms of a capital structure which Microsoft seems to have discovered quite a while ago. As for cash-flows, both giants are very similar. In Microsoft, as in Alphabet, the main single source of cash is the monetization of financial securities, through maturity or by sales, with operational tax write-offs coming in the second place. Both giants seem to be financially bored, so to say. Operations run their way, people are interested in the company’s stock, from time to time a smaller company gets swallowed, and it goes repeatedly, year by year. Boring. Time for a revolution.

Edit: as I was ruminating my thoughts after having written this update, I recorded a quick video (https://youtu.be/ra2ztH3k0M0) on the economics of technological change, where I connect my observations about Alphabet and Microsoft with a classic, namely with the theory of innovation by Joseph Schumpeter.

[1] Hart, O. D. (1988). Incomplete Contracts and the Theory of the Firm. Journal of Law, Economics, & Organization, 4(1), 119-139.

 



Cross breeding my general observations

I return to the project which I started in Spring this year (i.e. 2020), and which I had put aside to some extent: the book I want to write on the role and function of cities in our civilization, including the changes, which we, city slickers, can expect in the foreseeable future. As I think about it now, I guess I had to digest intellectually both my essential method of research for that book, and the core empirical findings which I want to connect to. The method consists in studying human civilization as collective intelligence, thus a collection of intelligent structures, able to learn by experimenting with many alternative versions of themselves. Culture, laws and institutions, technologies: I consider all those anthropological categories as cognitive constructs, which we developed over centuries to study our own collective intelligence and being de facto parts thereof.

Collective intelligence, in that perspective, is an overarching conceptual frame, and as overarching frames frequently do, the concept risks becoming a cliché. The remedy I want and intend to use is mathematics. I want to write the book as a collection of conceptual developments and in-depth empirical insights into hypotheses previously formulated with the help of a mathematical model. This is, I think, a major originality of my method. In social sciences, we tend to go the other way around: we formulate hypotheses by sort of freestyling intellectually, and then we check them with mathematical models. I start with just a little bit of intellectual freestyling, then I formulate my assumptions mathematically, and I use the mathematical model which results from those assumptions to formulate hypotheses for further research.

I adopt such a strongly mathematical method because we have a whole class of mathematical models which seem to fit the bill perfectly: artificial neural networks. Yes, I consider artificial neural networks as mathematical models in the first place, and only then as algorithms. The mathematical theory with which I most closely associate artificial neural networks is that of state space, combined with the otherwise related theory of Markov chains. In other words, whatever happens, I attempt to represent it as a matrix of values, which is being transformed into another matrix of values. The artificial neural network I use for that representation reflects both the structure of the matrix in question, and the mechanism of transformation, which, by the way, is commonly called σ-algebra. By ‘commonly’ I mean commonly in mathematics.

My deep intuition – ‘deep’ means that I understand that intuition just partly – is that artificial neural networks are the best mathematical representation of collective intelligence we can get for now. Therefore I use them as a mathematical model, and here comes a big difference between the way I use them and the way a typical programmer does. Programmers of artificial intelligence are, as far as I know (my son is a programmer, and, yes, sometimes we speak human lingo to each other), absolutely at home with considering artificial neural networks as black boxes, i.e. as something that does something, yet we don’t really need to understand what exactly that thing is, which neural networks do, and we essentially care about those networks being accurate and quick in whatever they do.

I, in my methodological world, adopt completely different a stance. I care most of all about understanding very specifically what the neural network is doing, and I draw my conclusions from the way it does things. I don’t need the neural network I use to be super-fast nor super-accurate: I need to understand how it does whatever it does.

I use two types of neural networks in that spirit, both 100% hand made. The first one serves me to identify the direction a social system (collective intelligence) follows in its collective learning. You can see an application in this draft paper of mine, titled ‘Climbing the right hill’. The fundamental logic of that network is to take an empirical dataset and use the neural network to produce as many alternative transformations of that dataset as there are variables in it. Each transformation takes a different variable from the empirical dataset as its desired output (i.e. it optimizes all the other variables as instrumental input). I measure the Euclidean similarity (Euclidean distance) between each individual transformation and the source dataset. I assume that the transformation which falls relatively the closest to source empirical data is the best representation of the collective intelligence represented in that data. Thus, at the end of the day, this specific type of neural network serves me to discover what we are really after, as a society.
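The selection logic described above (one transformation per variable taken as the desired output, then a ranking by Euclidean distance from the source dataset) can be sketched as follows. A loud caveat: the function `transform` is a deliberately simple stand-in, a single tanh neuron with random weights whose residual error is fed back into the input columns, and it is not the network from ‘Climbing the right hill’. The learning rate, the number of rounds and the random dataset are all assumptions made for illustration.

```python
import numpy as np

def transform(data, target, rounds=50, rate=0.01, seed=0):
    """Stand-in network: one tanh neuron optimizing column `target`."""
    rng = np.random.default_rng(seed)
    x = np.array(data, dtype=float)  # working copy of the dataset
    inputs = [j for j in range(x.shape[1]) if j != target]
    for _ in range(rounds):
        w = rng.uniform(0.0, 1.0, size=len(inputs))  # random weights each round
        out = np.tanh(x[:, inputs] @ w)              # neural activation
        err = x[:, target] - out                     # residual on the desired output
        x[:, inputs] += rate * err[:, None]          # error fed back into the inputs
    return x

def rank_orientations(data):
    """Rank candidate output variables by distance of their transformation from the source."""
    data = np.asarray(data, dtype=float)
    dists = [(t, float(np.linalg.norm(transform(data, t) - data)))
             for t in range(data.shape[1])]
    return sorted(dists, key=lambda pair: pair[1])

# Hypothetical standardized dataset: 20 observations of 4 variables
dataset = np.random.default_rng(42).uniform(0.0, 1.0, size=(20, 4))
ranking = rank_orientations(dataset)
print(ranking)
```

The first entry of `ranking` is the variable whose transformation stays closest to the source data, i.e. the candidate for what the dataset is ‘after’ under this toy stand-in.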

The second type of network is built as a matrix of probabilities, modified by a quasi-random factor of disturbance. I am tempted to say that this network attempts to emulate coincidence and quasi-randomness of events. I made it and I keep using it as pure simulation: there is no empirical data which the network learns on. It starts with a first, controlled vector of probabilities, and then it transforms that vector in a finite number of experimental iterations (usually I make that network perform 3000 experimental rounds). In the first application I made of that network, probabilities correspond to social roles, and more specifically to the likelihood that a random person in the society studied endorses the given social role (see ‘The perfectly dumb, smart social structure’). At a deeper, and, at the same time, more general a level, I assume that probability as such is a structural variable of observable reality. A network which simulates changes in a vector of probabilities simulates change in the structure of events.
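A pure simulation of that kind can be sketched in a few lines. The sketch below is not the network from ‘The perfectly dumb, smart social structure’: it assumes an additive, uniformly distributed disturbance, clips negative values to zero, and renormalizes the vector so that it sums to 1 (which presumes mutually exclusive social roles). All of those choices, as well as the function name and the starting vector, are assumptions made for illustration.

```python
import numpy as np

def simulate(p0, rounds=3000, noise=0.05, seed=0):
    """Prod a vector of probabilities with quasi-random disturbances."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p0, dtype=float)
    history = [p.copy()]
    for _ in range(rounds):
        # Assumption: additive disturbance, uniform on [-noise, +noise]
        shock = rng.uniform(-noise, noise, size=p.size)
        p = np.clip(p + shock, 0.0, None)  # probabilities cannot go below zero
        p = p / p.sum()                    # assumption: roles are mutually exclusive
        history.append(p.copy())
    return np.array(history)

# Controlled starting vector for three hypothetical social roles
history = simulate([0.5, 0.3, 0.2])
print(history[-1])
print("rounds with at least one probability at zero:", int((history == 0).any(axis=1).sum()))
```

The array `history` keeps one row per experimental round, so it can be inspected for roles whose probability drops to zero and for the rounds in which they come back.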

Long story short, I have two neural networks for making precise hypotheses: one uncovers orientations and pursued values in sets of socio-economic data, whilst the other simulates structural change in compound probabilities attached to specific phenomena. When I put that lot to real computational work, two essential conclusions emerge, sort of across the board, whatever empirical problem I am currently treating. Firstly, all big sets of empirical socio-economic data are after something specific. I mean, when I take the first of those two networks, the one that clones an empirical dataset into as many transformations as there are variables, a few of those transformations, like 1 to 3 of them, are much closer to the original, in Euclidean terms, than all the rest. When I say closer, it is several times closer. Secondly, vectors of probabilities are tenacious and resilient. When I take the second of those networks, the one which prods vectors of probabilities with quasi-random disturbances, those probabilities tend to resist. Even if, in some 100 experimental rounds, some of those probabilities get kicked out of the system, i.e. their values descend to 0, they reappear a few hundred experimental rounds later, as if by magic. Those probabilities can be progressively driven down if the factor of disturbance, which I include in the network, consists in quasi-randomly dropping new events into the game. The phenomenological structure of reality seems to be something very stable, once set in place, however simple I make that reality a priori. It yields to increasing complexity (new phenomena, with their probabilities coming to the game) rather than to arbitrary reduction of the pre-set phenomena.

I generalize those observations. A collective intelligence, i.e. an intelligent social structure, able to learn by experimenting with many alternative versions of itself, can stay coherent in that experimentation and seems to stay coherent because it pursues very clear collective outcomes. I am even tempted to reframe it as a condition: a human social structure can evolve as a collectively intelligent structure under the condition of having very clear collectively pursued values. If it doesn’t, it is doomed to disintegrate and to be replaced by another collectively intelligent social structure, which, in turn, is sufficiently oriented to stay internally coherent whilst experimenting with itself. As I descend to the level of human behaviour, observed as the probability of an average individual endorsing specific patterns of behaviour, those behavioural patterns are resilient to exogenous destruction, and, at the same time, quite malleable when new patterns emerge and start to compete with the old ones. When a culture starts from a point A, defined as a set of social roles and behavioural patterns with assorted probabilities of happening, that point A needs a bloody long time, or, in other words, a bloody big lot of collectively intelligent experimentation, to vanish completely.

Now, I want to narrow down the scope of hypotheses I intend to formulate, by specifying the basic empirical findings which I have made so far, and which make the foundations of my research on cities. The first empirical finding does not come from me, but from the CIESIN centre at Columbia University, and it is both simple and mind-blowing: however the formal boundaries of urban areas are being redefined by local governments, the total surface of urban areas, defined as abnormally dense agglomerations of man-made structures and night-time lights, seems to have been constant over the last 30 years, maybe even more. In other words, whilst we have a commonly shared impression that cities grow, they seem to be growing only at the expense of other cities. You can check those numbers via the stats available with the World Bank (https://data.worldbank.org/indicator/AG.LND.TOTL.UR.K2 ). As you will be surfing with the World Bank, you can also call by another metric, the total surface of agricultural land on the planet (https://data.worldbank.org/indicator/AG.LND.AGRI.K2 ) and you will see that it has been growing, by hiccups, since 1960, i.e. since that stat started being collected.

To complete the picture, you can check the percentage of urban population in the total human population on the planet (https://data.worldbank.org/indicator/SP.URB.TOTL.IN.ZS ) and you will see that we have been becoming more and more urban, and right now, we are prevalently urban. Long story short, there are more and more urban humans, who apparently live in a constant urban space, and feed themselves out of a growing area of agricultural land. At the end of the day, cities seem to become increasingly different from the countryside, as regards the density of population: urban populations on Earth are becoming systematically more dense than rural ones.
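
The arithmetic behind that last claim can be sketched as follows. It is a toy illustration, assuming a constant urban surface and a constant total land surface; the function names are hypothetical and all the numbers are invented for the example, not taken from the World Bank.

```python
def urban_density(urban_share, population, urban_area_km2):
    """People living in cities, per km2 of urban land (DU)."""
    return urban_share * population / urban_area_km2


def general_density(population, total_area_km2):
    """People per km2 of total land (DG)."""
    return population / total_area_km2


def density_ratio(urban_share, population, urban_area_km2, total_area_km2):
    """Urban density denominated in units of general density (DU/DG)."""
    du = urban_density(urban_share, population, urban_area_km2)
    dg = general_density(population, total_area_km2)
    return du / dg


# Invented numbers: both land surfaces stay the same across the two periods,
# whilst total population and the urban share both go up.
URBAN_AREA_KM2 = 10.0
TOTAL_AREA_KM2 = 100.0

ratio_earlier = density_ratio(0.4, 1000.0, URBAN_AREA_KM2, TOTAL_AREA_KM2)
ratio_later = density_ratio(0.5, 1200.0, URBAN_AREA_KM2, TOTAL_AREA_KM2)

print(ratio_earlier, ratio_later)
```

With both surfaces held constant, the ratio moves only with the share of urban population, so a more urban humanity in a fixed urban space mechanically means cities drifting away from the general density.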

I am cross-breeding my general observations from what my two neural networks tend to do, with those main empirical findings about cities, and I am trying to formulate precise hypotheses for further research. Hypothesis #1: cities are purposeful demographic anomalies, with a clear orientation on optimizing specific social outcomes. Hypothesis #2: if and to the extent that the purpose of cities is to create new social roles, through intense social interaction in a limited physical space, the creation of new social roles involves their long coexistence with older social roles, and, therefore, the resulting growth in social complexity is exponential. Hypothesis #3: the COVID-19 pandemic, as an exogenous factor of disturbance, is likely to impact us in three possible ways: a) it can temporarily make some social roles disappear, b) in the long run, it is likely to increase social complexity, i.e. to make us create a whole new set of social roles, and c) it can change the fundamental orientation (i.e. the pursued collective values) of cities as demographic anomalies.

In your spare time you can also watch this video I made a few weeks ago: ‘Urban Economics and City Management #1 Lockdowns in pandemic and the role of cities’ : https://youtu.be/fYIz_6JVVZk . It recounts and restates my starting point in this path of research. I browse through the main threads of connection between the pandemic of COVID-19 and the civilisational role of cities. The virus, which just loves densely populated places, makes us question the patterns of urban life, and makes us ask questions as to the future of cities.