The right side of the disruption

I am swivelling my intellectual crosshairs around, as there is a lot going on in the world. Well, there is usually a lot going on in the world; I think it is just the focus of my personal attention that changes its scope. Sometimes I pay attention just to the stuff immediately in front of me, whilst at other times I go wide and broad in my perspective.

My research on collective intelligence, and on the application of artificial neural networks as simulators thereof, has recently brought me to studying outlier cases. I am an economist, and I do business in the stock market, and so it comes as sort of logical that I am interested in business outliers. I hold some stock of the two so-far winners of the vaccine race: Moderna ( ) and BioNTech ( ). I am interested in the otherwise classical, Schumpeterian questions: to what extent are their respective business models predictors of their so-far success in the vaccine contest, and, seen from the opposite perspective, to what extent is that whole technological race of vaccines predictive of the business models which its contenders adopt?

I like approaching business models with the attitude of a mean detective. I assume that people usually lie, and it starts with lying to themselves, and that, consequently, those nicely rounded statements in annual reports about ‘efficient strategies’ and ‘ambitious goals’ are always bullshit to some extent. In the same spirit, I assume that I am prone to lying to myself. All in all, I like falling back onto hard numbers in the first place. When I want to figure out someone’s business model with a minimum of preconceived ideas, I start with their balance sheet, to see their capital base and the way they finance it, and then I continue with their cash flow. The latter helps me understand how they make money, at the end of the day, or how they fail to make any.

I take two points in time: the end of 2019, thus the starting blocks of the vaccine race, and then the latest reported period, namely the 3rd quarter of 2020. Landscape #1: end of 2019. BioNTech sports $885 388 000 in total assets, whilst Moderna has $1 589 422 000. Here, a pretty amazing detail pops up. I do a routine check of the proportion between fixed assets and total assets. The point is to see what percentage of the company’s capital base is immobilized, and thus supposed to bring steady capital returns, as opposed to the current assets: fluid, quick to exchange, and made for greasing the current working of the business. When I measure that coefficient ‘fixed assets divided by total assets’, it comes out as 29,8% for BioNTech, and 29% for Moderna. Coincidence? There is a lot of coincidence in those two companies. When I switch to Landscape #2: end of September 2020, it is pretty much the same. You can see it in the two tables below:
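That routine check is a one-liner; here is a sketch in Python. The total-asset figures are the ones quoted above, but the fixed-asset amounts are hypothetical placeholders chosen only to reproduce the ~29-30% proportion, not the values from the actual reports:

```python
# Immobilization coefficient: fixed assets / total assets.
# Total-asset figures come from the 2019 balance sheets quoted above;
# the fixed-asset figures are illustrative placeholders, NOT reported values.

def immobilization(fixed_assets: float, total_assets: float) -> float:
    """Share of the capital base locked in fixed assets."""
    return fixed_assets / total_assets

biontech_total = 885_388_000
moderna_total = 1_589_422_000

# Placeholder fixed-asset figures (illustrative only):
ratio_biontech = immobilization(263_845_624, biontech_total)
ratio_moderna = immobilization(460_932_380, moderna_total)

print(f"BioNTech: {ratio_biontech:.1%}")  # 29.8%
print(f"Moderna:  {ratio_moderna:.1%}")   # 29.0%
```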

As you look at those numbers, they sort of collide with the common image of biotech companies in sci fi movies. In movies, we can see huge labs, like 10 storeys underground, with caged animals inside etc. In real life, biotech is cash, most of all. Biotech companies are like big wallets, camped next to some useful science. Direct investment in biotech means very largely depositing one’s cash on the bank account run by the biotech company.

After studying the active side of those two balance sheets, i.e. in BioNTech and in Moderna, I shift my focus to the passive side. I want to know how exactly people put cash in those businesses. I can see that most of it comes in the form of additional paid-in equity, which is an interesting thing for publicly listed companies. In the case of Moderna, the bulk of that addition to equity comes through a mechanism called ‘vesting of restricted common stock’. Although their financial report does not specify how exactly that vesting takes place, the generic category corresponds to operations where people close to the company, employees or close collaborators, anyway in a closed private circle, buy stock of the company in a restricted issuance. With BioNTech, it is slightly different. Most of the proceeds from the public issuance of common stock are considered as reserve capital, distinct from share capital, and on top of that they seem to be running, similarly to Moderna, transactions of vesting restricted stock. Another important source of financing in both companies is short-term liabilities, mostly deferred transactional payments. Still, I have an intuitive impression of being surrounded by maybies (you know: ‘maybe I am correct, unless I am wrong’), and thus I decided to broaden my view. I take all the 7 biotech companies I currently have in my investment portfolio, which are, besides BioNTech and Moderna, five others, including Soligenix ( ), Altimmune ( ), Novavax ( ) and VBI Vaccines ( ). In the two tables below, I am trying to summarize my essential observations about those seven business models.

Despite significant differences in the size of their respective capital bases, all seven businesses hold most of their capital in highly liquid financial form: cash or tradable financial securities. Their main source of financing is definitely the additional paid-in equity. Now, some readers could ask: how the hell is it possible for the additional paid-in equity to amount to more than the value of assets, like 193%? When a business accumulates a lot of operational losses, they have to be subtracted from the incumbent equity. Additions to equity serve as compensation for those losses. It seems to be a routine business practice in biotech.
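A toy balance sheet shows the mechanics. All figures below are purely illustrative, not taken from any report:

```python
# How can additional paid-in capital exceed total assets? Accumulated
# operational losses are subtracted from equity, and new paid-in capital
# back-fills the hole. Figures below are purely illustrative.

def paid_in_share(total_assets, liabilities, paid_in, accumulated_deficit):
    """Check the balance-sheet identity, return paid-in capital / assets."""
    equity = paid_in - accumulated_deficit
    assert abs(total_assets - (liabilities + equity)) < 1e-9  # assets = liabilities + equity
    return paid_in / total_assets

# Illustrative company, figures in $m: 100 of assets, 10 of liabilities,
# 193 paid in over the years, 103 of cumulative losses.
share = paid_in_share(100.0, 10.0, 193.0, 103.0)
print(f"paid-in equity = {share:.0%} of total assets")  # 193%
```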

Now, I am going to go slightly conspiracy-theoretical. Not much, just an inch. When I see businesses such as Soligenix, where cumulative losses, and the resulting additions to equity, amount to ten times the value of assets, I am suspicious. I believe in the power of science, but I also believe that, facing a choice between using my equity to compensate so big a loss, on the one hand, and using it to invest in something less catastrophic financially, on the other hand, I will choose the latter. My point is that cases such as Soligenix smell of scam. There must be some non-reported financial interests in that business. Something is going on behind the stage, there.

In my previous update, titled ‘An odd vector in a comfortably Apple world’, I studied the cases of Tesla and Apple in order to understand better the phenomenon of outlier events in technological change. The short glance I had at those COVID-vaccine-involved biotechs gives me some more insight. Biotech companies are heavily scientific. This is scientific research shaped into a business structure. Most of the biotech business looks like an everlasting debut, long before breaking even. In textbooks of microeconomics and management, we can read that being able to run a business at a profit is a basic condition of calling it a business. In biotech, it is different. Biotechs are the true outliers, nascent at the very juncture of cutting-edge science and business strictly speaking. This is how outliers emerge: there is some cool science. I mean, really cool, the kind likely to change the face of the world. Those mRNA biotechnologies are likely to do so. The COVID vaccine is the first big attempt to transform those mRNA therapies from experimental ones into massively distributed and highly standardized medicine. If this stuff works on a big scale, it opens a new perspective. It allows fixing people, literally, instead of just curing diseases.

Anyway, there is that cool science, and it somehow attracts large amounts of cash. Here, a little digression from the theory of finance is due. Money and other liquid financial instruments can be seen as risk-absorbing bumpers. People accumulate large monetary balances in times and places when and where they perceive a lot of manageable risk, i.e. where they perceive something likely to disrupt the incumbent business, and they want to be on the right side of the disruption.

An odd vector in a comfortably Apple world

Work pays. Writing about my work helps me learn new things. I am piecing together, step by step, the logical structure of my book on collective intelligence. Those last days, I realized the point of using an artificial neural network as a simulator of collective behaviour. There is a difference between studying the properties of a social structure, on the one hand, and simulating its collective behaviour, on the other hand. When I study the partial properties of something, I make samples and put them under a microscope. This is what most quantitative methods in social sciences do: they sample and zoom in. This is cool, don’t get me wrong. That method has made the body of science we have today, and, historically, this is a hell of a body of science. Yet, there is a difference between, for example, a study of clusters in a society, and a simulation of the way those clusters form. There is a difference between identifying auto-regressive cycles in the time series of a variable, and understanding how those cycles happen in real life, with respect to collective human behaviour (see ‘An overhead of individuals’). Autoregression translated into human behaviour means that what we actually accomplish today is somehow derived from and functionally connected to the outcomes of our actions some time ago. Please, notice: not to the outcomes of the actions which immediately preceded the current one, but to the outcomes generated with a lag in the past. Go figure how we, humans, can pick a specific, lagged occurrence in the past and make it a factor in what we do today. Intriguing, isn’t it?
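That lagged dependence is easy to simulate. The sketch below, with arbitrary illustrative coefficients, generates a series where today’s value derives from the value four periods ago, and then checks that the lag-4 autocorrelation dwarfs the lag-1 one:

```python
import random

# A minimal sketch of lagged autoregression: today's outcome depends on
# the outcome from `lag` periods ago, not on the immediately preceding one.
# Coefficients are arbitrary illustrative choices.

def simulate_ar_lagged(n=200, lag=4, phi=0.8, noise=0.1, seed=42):
    random.seed(seed)
    series = [random.gauss(0.0, 1.0) for _ in range(lag)]  # initial history
    for t in range(lag, n):
        series.append(phi * series[t - lag] + random.gauss(0.0, noise))
    return series

def autocorr(xs, k):
    """Sample autocorrelation of xs at lag k."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[t] - mean) * (xs[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in xs)
    return num / den

x = simulate_ar_lagged()
# The lag-4 autocorrelation is strong; adjacent observations are nearly unrelated.
print(autocorr(x, 4), autocorr(x, 1))
```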

The sample-and-put-under-the-microscope method is essentially based on classical statistics, thus on the assumption that the mean expected value of a variable, or of a vector, is the expected state of the corresponding phenomenon. Here we enter the tricky, and yet interesting, realm of questions such as ‘What do you mean by expected state? Expected by whom?’. First and most of all, we have no idea what other people expect. We can, at best, nail down our own expectations to the point of making them intelligible to ourselves and communicable to other people, and we do our best to understand what other people say they expect. Yet, all hope is not lost. Whilst we can hardly have any clue as to what other people expect, we can learn how they learn. The process of learning is much more objectively observable than expectations are.

Here comes the subtle and yet fundamental distinction between expectations and judgments. Both regard the same domain – the reality we live in – but they are different in nature. Judgment is functional. I make myself an opinion about reality because I need it to navigate through said reality. Judgment is an approximation of truth. Emotions play their role in my judgments, certainly, but they are easy to check. When my judgment is correct, i.e. when my emotions give it the functionally right shade, I make the right decisions and I am satisfied with the outcomes. When my judgment is too emotional, or emotional the wrong way, I simply screw it, at the end of the day, and I am among the first people to know it.

On the other hand, when I expect something, it is much more imbued with emotions. I expect things which are highly satisfactory, or, conversely, which raise my apprehension, ranging from disgust to fear. Expectations are so emotional that we even have a coping mechanism of ex-post rationalization. Something clearly unexpected happens to us and we reduce our cognitive dissonance by persuading ourselves post factum that ‘I expected it to happen, really. I just wasn’t sure’.

I think there is a fundamental difference between applying a given quantitative method to the way that society works, on the one hand, and attributing the logic of this method to the collective behaviour of people in that society, on the other hand. I will try to make my point more explicit by taking on one single business case: Tesla ( ). Why Tesla? For two reasons. I invested significant money of mine in their stock, for one, and when I have my money invested in something, I like updating my understanding of how the thing works. Tesla seems to me something like a unique phenomenon, an industry in itself. This is a perfect Black Swan, along the lines of Nassim Nicholas Taleb’s ‘The black swan. The impact of the highly improbable’ (2010, Penguin Books, ISBN 9780812973815). Ten years ago, Elon Musk, the founder of Tesla, was considered just a harmless freak. Two years ago, many people could see him on the verge of tears, publicly explaining to shareholders why Tesla kept losing cash. Still, if I had invested in the stock of Tesla 4 years ago, today I would have like seven times the money. Tesla is an outlier which turned into a game changer. Today, they are one of the rare business entities who managed to increase their cash flow over the year 2020. No analyst would have predicted that. As a matter of fact, even I considered Tesla an extremely risky investment. Risky means burdened with a lot of uncertainty, which, in turn, hinges on a lot of money engaged.

When a business thrives amidst a general crisis, just as Tesla has been thriving amidst the COVID-19 pandemic, I assume there is a special adaptive mechanism at work, and I want to understand that mechanism. My first, intuitive association of ideas goes to the classic book by Joseph Schumpeter, namely ‘Business Cycles’. Tesla is everything Schumpeter mentioned (almost 100 years ago!) as attributes of breakthrough innovation: new type of plant, new type of entrepreneur, processes rethought starting from first principles, complete outlier as compared to the industrial sector of origin.

What does Tesla have to do with collective intelligence? Saying that Tesla’s success is a pleasant surprise to its shareholders, and that it is essentially sheer luck, would be too easy and simplistic. At the end of September 2020, Tesla had $45,7 billion in total assets. Interestingly, only 0,44% of that is the so-called ‘Goodwill’, which is the financial chemtrail left after a big company acquires smaller ones, and which has nothing to do with good intentions. To my best knowledge, those assets of $45,7 billion have been accumulated mostly through good, old-fashioned organic growth, i.e. the growth of the capital base in correlation with the growth of operations. That type of growth requires the concurrence of many factors: the development of a product market, paired with the development of a technological base, and all that associated with a stream of financial capital.

This is more than luck or accident. The organic growth of Tesla has been concurring with a mounting tide of start-up businesses in the domain of electric vehicles. It coincides closely with a significant acceleration in the launching of electric vehicles by the big, established companies of the automotive sector, such as the VW Group, PSA or Renault. When a Black Swan deeply modifies an entire industry, it means that an outlier has provoked adaptive change in the social structure. With that in mind, an interesting question surfaces: why is there only one Tesla business, as for now? Why aren’t there more business entities like them? Why does this specific outlier remain an outlier? I know there are many start-ups in the industry of electric vehicles, but none of them even remotely approaches the kind and the size of business structure that Tesla displays. How does an apparently unique social phenomenon remain unique, whilst having proven to be a successful experiment?

I am intuitively comparing Tesla to its grandfather in uniqueness, namely to Apple Inc. ( ). Apple used to be the outlier which Tesla is today, and, over time, it has become sort of banalized, business-wise. How can I say that? Let’s have a look at the financials of both companies. Since 2015, Tesla has been growing like hell, in terms of assets and revenues. However, they started to make real money just recently, in 2020, amidst the pandemic. Their cash flow is record-high for the nine months of 2020. Apple is the opposite case. If you look at their financials over the last few years, they seem to be shrinking assets-wise, and sort of floating at the same level in terms of revenues. Tesla makes cash mostly through tax write-offs, amortization, and stock-based compensation for their employees. Apple makes cash in the good, old-fashioned way, by generating net income after tax. At the bottom line of all cash flows, Tesla accumulates cash in their balance sheet, whilst Apple apparently gives it away and seems to prefer the accumulation of marketable financial securities. Tesla seems to be wild and up to something, Apple not really. Apple is respectable.

Uniqueness manifests in two attributes: distance from the mean expected state of the social system, on the one hand, and probability of happening, on the other hand. Outliers display low probability and noticeable distance from the mean expected state of social stuff. Importance of outlier phenomena can be apprehended similarly to risk factors: probability times magnitude of change or difference. With low probability, outliers are important by their magnitude.

Mathematically, I can express the emergence of any new phenomenon in two ways: as the activation of a dormant phenomenon, or as recombination of the already active ones. I can go p(x; t0) = 0 and p(x; t1) > 0, where p(x; t) is the probability of the phenomenon x. It used to be zero before, it is greater than zero now, although not much greater, we are talking about outliers, noblesse oblige. That’s the activation of something dormant. I have a phenomenon nicely defined, as ‘x’, and I sort of know the contour of it, it just has not been happening recently at all. Suddenly, it starts happening. I remember having read a theory of innovation, many years ago, which stated that each blueprint of a technology sort of hides and conveys in itself a set of potential improvements and modifications, like a cloud of dormant changes attached to it and logically inferable from the blueprint itself.

When, on the other hand, new things emerge as the result of recombination in something already existing, I can express it as p(x) = p(y)^a * p(z)^b. Phenomena ‘y’ and ‘z’ are the big fat incumbents of the neighbourhood. When they do something specific together (yes, they do what you think they do), phenomenon ‘x’ comes into existence, and its probability of happening p(x) is a combination of powers ascribed to the respective probabilities. I can go fancier, and use one of them neural activation functions, such as the hyperbolic tangent. Many existing phenomena – sort of Z = {p(z1), p(z2), …, p(zn)} – combine in a more or less haphazard way (frequently, it is the best way of all ways), meaning that the vector Z of their probabilities has a date with a structurally identical vector of random significances W = {s1, s2, …, sn}, 0 < si < 1. They date and they produce a weighted sum h = ∑ p(zi)*si, and that weighted sum gets sucked into the vortex of reality via tanh(h) = (e^(2h) – 1)/(e^(2h) + 1). Why via tanh? First of all, why not? Second of all, tanh is a structure in itself. For h = 1, it is essentially (e^2 – 1)/(e^2 + 1) = 6,389056099 / 8,389056099 = 0,761594156, which has the courtesy of accepting h in, and producing something new.
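Here is that dating ritual as a minimal sketch. The probabilities in Z are arbitrary illustrative values:

```python
import math
import random

# Sketch of the recombination idea: a vector Z of incumbent probabilities
# meets a random vector W of significances, they produce a weighted sum h,
# and tanh sucks h in and spits out something new. Numbers are illustrative.

def tanh(h: float) -> float:
    """Hyperbolic tangent, written out as (e^(2h) - 1)/(e^(2h) + 1)."""
    return (math.exp(2 * h) - 1) / (math.exp(2 * h) + 1)

random.seed(7)
Z = [0.2, 0.5, 0.9, 0.1]              # probabilities of incumbent phenomena
W = [random.random() for _ in Z]      # random significances, 0 < s_i < 1
h = sum(p * s for p, s in zip(Z, W))  # weighted sum h = sum of p(z_i)*s_i
print(tanh(h))                        # the aggregated "new" phenomenon

# Sanity check against the number quoted in the text, tanh(1):
print(tanh(1.0))  # 0.761594155...
```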

Progress in the development of artificial neural networks leads to the discovery of those peculiar structures – the activation functions – which have the capacity to suck in a number of individual phenomena with their respective probabilities, and produce something new, like a by-phenomenon. In the article available at , I read about a newly discovered activation function called ‘Swish’. With that weighted sum h = ∑ p(zi)*si, Swish(h) = h/(1 + e^(-h)). We have a pre-existing structure 1/(1 + e) = 0,268941421, which, combined with h in a complex way (as a direct factor of multiplication and as exponent in the denominator), produces something surprisingly meaningful.
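A minimal sketch of Swish, matching the formula above:

```python
import math

# The Swish activation: Swish(h) = h / (1 + e^(-h)),
# i.e. h multiplied by the logistic sigmoid of h.

def swish(h: float) -> float:
    return h / (1.0 + math.exp(-h))

print(swish(0.0))   # 0.0
print(swish(-1.0))  # -1/(1 + e) = -0.268941421..., the structure quoted above
print(swish(5.0))   # close to 5: for large positive h, Swish approaches identity
```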

Writing those words, I have suddenly noticed a meaning of activation functions which I have been leaving aside, so far. Values produced by activation functions are aggregations of the input which we feed into the neural network. Yet, under a different angle, activation functions produce a single new probability, usually very close to 1, which can be understood as something new happening right now, almost for sure, and deriving its existence from many individual phenomena happening now as well. I need to wrap my mind around it. It is interesting.    

Now, I study the mathematical take on the magnitude of the outlier ‘x’, which makes its impact on everything around, and makes it into a Black Swan. I guess x has some properties. I mean not real estate, just attributes. It has a vector of attributes R = {r1, r2, …, rm}, and, if I want to present ‘x’ as an outlier in mathematical terms, those attributes should be the same for all the comparable phenomena in the same domain. That R = {r1, r2, …, rm} is a manifold, in which every observable phenomenon is mapped into m coordinates. If I take any two phenomena, like z and y, each has its vector of attributes, i.e. R(z) and R(y). Each such pair can estimate their mutual closeness by going Euclidean[R(z), R(y)] = {∑[ri(z) – ri(y)]^2}^0,5 / m. We remember that m is the number of attributes in that universe.
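The distance itself is a few lines of code. The attribute vectors below are arbitrary illustrative values:

```python
import math

# Sketch of the per-attribute-normalized Euclidean distance above:
# Euclidean[R(z), R(y)] = {sum_i (r_i(z) - r_i(y))^2}^0.5 / m

def distance(r_z, r_y):
    m = len(r_z)  # number of attributes in the manifold
    assert m == len(r_y)
    sq = sum((a - b) ** 2 for a, b in zip(r_z, r_y))
    return math.sqrt(sq) / m

# Two ordinary phenomena sit close; the outlier x shows a big distance.
R_y = [1.0, 2.0, 3.0]
R_z = [1.1, 2.1, 2.9]
R_x = [9.0, -5.0, 14.0]    # the odd vector
print(distance(R_z, R_y))  # small
print(distance(R_x, R_y))  # bloody big, as the text says
```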

Phenomena tend to sit at a comfortably predictable Euclidean distance from each other, and, all of a sudden, x pops out, and shows a bloody big Euclidean distance from any other phenomenon. Its vector R(x) is odd. This is how a Tesla turns up, as an odd vector in a comfortably Apple world.

An overhead of individuals

I think I have found out, when writing my last update (‘Cultural classes’) another piece of the puzzle which I need to assemble in order to finish writing my book on collective intelligence. I think I have nailed down the general scientific interest of the book, i.e. the reason why my fellow scientists should even bother to have a look at it. That reason is the possibility to have deep insight into various quantitative models used in social sciences, with a particular emphasis on the predictive power of those models in the presence of exogenous stressors, and, digging further, the representativeness of those models as simulators of social reality.

Let’s have a look at one quantitative model, just one picked at random (well, almost at random): autoregressive conditional heteroscedasticity, AKA ARCH ( ). It goes as follows. I have a process, i.e. a time series of a quantitative variable. I compute the mean expected value of that time series, which, in plain human, means the arithmetical average of all the observations in that series. In even plainer human, the one we speak after having watched a lot of YouTube, it means that we sum up the values of all the consecutive observations in that time series and we divide the so-obtained total by the number of observations.

Mean expected values have that timid charm of not existing, i.e. when I compute the mean expected value in my time series, none of the observations will be exactly equal to it. Each observation t will return a residual error εt. The ARCH approach assumes that εt is the product of two factors, namely of the time-dependent standard deviation σt, and a factor of white noise zt. Long story short, we have εt = σt*zt.

The time-dependent standard deviation shares the common characteristics of all standard deviations, namely it is the square root of the time-dependent variance: σt = [(σt)^2]^0,5. In the classical ARCH(q) specification, that time-dependent variance is computed as: (σt)^2 = α0 + α1*(εt-1)^2 + … + αq*(εt-q)^2, with α0 > 0 and the remaining αi ≥ 0.

Against that general methodological background, many variations arise, especially as regards the mean expected value which everything else is wrapped around. It can be a constant value, i.e. computed for the entire time-series once and for all. We can allow the time series to extend, and then each extension leads to the recalculation of the mean expected value, including the new observation(s). We can make the mean expected value a moving average over a specific window in time.
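For the record, here is a minimal ARCH(1) simulation along the lines sketched above, with εt = σt*zt and (σt)^2 = α0 + α1*(εt-1)^2. The coefficients are illustrative placeholders:

```python
import math
import random

# Minimal ARCH(1) simulation: eps_t = sigma_t * z_t, with the
# time-dependent variance sigma_t^2 = a0 + a1 * eps_{t-1}^2.
# Coefficients a0, a1 are illustrative placeholders.

def simulate_arch1(n=500, a0=0.1, a1=0.6, seed=1):
    random.seed(seed)
    eps = [0.0]
    for _ in range(1, n):
        sigma_t = math.sqrt(a0 + a1 * eps[-1] ** 2)  # time-dependent std deviation
        z_t = random.gauss(0.0, 1.0)                 # white-noise factor
        eps.append(sigma_t * z_t)
    return eps

series = simulate_arch1()
# Volatility clusters: large residuals tend to be followed by large residuals.
print(max(abs(e) for e in series))
```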

Before I dig further into the underlying assumptions of ARCH, one reminder is due: I am talking about social sciences, and about the application of ARCH to all kinds of crazy stuff that we, humans, do collectively. All the equations and conditions phrased out above apply to collective human behaviour. The next step in understanding ARCH, in the specific context of social sciences, is that ARCH only has a point when the measurable attributes of our collective human behaviour really oscillate and change. When I have, for example, a trend in the price of something, and that trend is essentially smooth, without much jaggedness jumping to the eye, ARCH is pretty much pointless. On the other hand, that analytical approach – where each observation in the real measurable process which I observe is par excellence a deviation from the expected state – gains in cognitive value as the process in question becomes increasingly dented and bumpy.

A brief commentary on the very name of the method might be interesting. The term ‘heteroskedasticity’ means that the variance of the process is not constant: the dispersion of real observations around the mean expected value changes over time. Over time, that shifting dispersion can translate into a drift. Let’s simulate the way it happens. Before I even start going down this rabbit hole, another assumption is worth deconstructing. If I deem a phenomenon to be describable as white noise, AKA zt, I assume there is no pattern in the occurrence thereof: knowing its past states tells me nothing about its next one. It is the ‘Who knows?’ state of reality in its purest form.

White noise is at the very basis of the way we experience reality. This is pure chaos. We make distinctions in this chaos; we group phenomena, and we assess the probability of each newly observed phenomenon falling into one of the groups. Our essential cognition of reality assumes that in any given pound of chaos, there are a few ounces of order, and a few residual ounces of chaos. Then we have the ‘Wait a minute!’ moment and we further decompose the residual ounces of chaos into some order and even more residual a chaos. From there, we can go ad infinitum, sequestrating streams of regularity and order out of the essentially chaotic flow of reality. I would argue that the book of Genesis in the Old Testament is a poetic, metaphorical account of the way that human mind cuts layers of intelligible order out of the primordial chaos.

Seen from a slightly different angle, it means that white noise zt can be interpreted as an error in itself, because it is essentially a departure from the nicely predictable process εt = σt, i.e. where the residual departure from the mean expected value is equal to the mean expected departure from the mean expected value. Being a residual error, zt can be factorized into zt = σ’t*z’t, and, once again, that factorization can go all the way down to the limits of observability as regards the phenomena studied.

At this point, I am going to put the whole reasoning on its head, as regards white noise. It is because I know, and use a lot, the same concept under a different name, namely that of mean-reverted value. I use mean-reversion a lot in my investment decisions in the stock market, with a very simple logic: when I am deciding to buy or sell a given stock, my purely technical concern is to know how far away the current price is from its moving average. When I do this calculation for many different stocks, priced differently, I need a common denominator, and I use the standard deviation in price for that purpose. In other words, I compute as follows: mean-reverted price = (current price – mean expected price) / standard deviation in price.

If you have a closer look at this coefficient of mean-reverted price, its numerator is an error term, because it is the deviation from the mean expected value. I divide that error by the standard deviation, and, logically, what I get is error divided by standard deviation, therefore the white noise component zt of the equation εt = σt*zt. This is perfectly fine mathematically, only my experience with that coefficient tells me it is anything but white noise. When I want to grasp very sharply and accurately the way the price of a given stock reacts to its economic environment, I use precisely the mean-reverted coefficient of price. As soon as I recalculate the time series of a price into its mean-reverted form, patterns emerge, sharp and distinct. In other words, the allegedly white-noise-based factor in the stock price is much more patterned than the original price used for its calculation.

The same procedure which I call ‘mean-reversion’ is, by the way, a valid procedure to standardize empirical data. You take each empirical observation, you subtract from it the mean expected value of the corresponding variable, you divide the residual difference by its standard deviation, and Bob’s your uncle. You have your data standardized.
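The recipe in code, on some made-up prices:

```python
# Mean-reversion as data standardization:
# (observation - mean expected value) / standard deviation.

def standardize(xs):
    n = len(xs)
    mean = sum(xs) / n
    std = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    return [(x - mean) / std for x in xs]

prices = [10.0, 12.0, 11.0, 15.0, 9.0, 13.0]  # made-up price series
z = standardize(prices)
print(z)
# The standardized series has mean 0 and standard deviation 1 - and Bob's your uncle.
```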

Summing up that little rant of mine, I understand the spirit of the ARCH method. If I want to extract some kind of autoregression from a time series, I can test the hypothesis that standard deviation is time-dependent. Do I need, for that purpose, to assume the existence of strong white noise in the time series? I would say, cautiously: maybe, although I do not see the immediate necessity for it. Is the equation εt = σt*zt the right way to grasp the distinction between the stochastic component and the random one in the time series? Honestly: I don’t think so. Where is the catch? I think it is in the definition and utilization of error, which, further, leads to the definition and utilization of the expected state.

In order to make my point clearer, I am going to quote two short passages from pages xxviii-xxix of Nassim Nicholas Taleb’s book ‘The Black Swan’. Here it goes. ‘There are two possible ways to approach phenomena. The first is to rule out the extraordinary and focus on the “normal.” The examiner leaves aside “outliers” and studies ordinary cases. The second approach is to consider that in order to understand a phenomenon, one needs first to consider the extremes—particularly if, like the Black Swan, they carry an extraordinary cumulative effect. […] Almost everything in social life is produced by rare but consequential shocks and jumps; all the while almost everything studied about social life focuses on the “normal,” particularly with “bell curve” methods of inference that tell you close to nothing’.

When I use mean-reversion to study stock prices, for my investment decisions, I go very much in the spirit of Nassim Taleb. I am most of all interested in the outlying values of the metric (current price – mean expected price) / standard deviation in price, which, once again, the proponents of the ARCH method interpret as white noise. When that metric spikes up, it is a good moment to sell, whilst when it is in a deep trough, it might be the right moment to buy. I have one more interesting observation about those mean-reverted prices of stock: when they change their direction from ascending to descending and vice versa, it is always a sharp change, like a spike, never a gentle recurving. Outliers always produce sharp change. Exactly as Nassim Taleb claims. In order to understand better what I am talking about, you can have a look at one of the analytical graphs I use for my investment decisions, precisely with mean-reverted prices and transactional volumes, as regards Ethereum: .

In a manuscript that I wrote, and which I am still struggling to polish enough to make it publishable ( ), I have identified three different modes of collective learning. In most of the cases I studied empirically, societies learn cyclically: first they produce big errors in adjustment, then they narrow their error down, which means they figure s**t out, and in the next phase the error increases again, just to decrease once more in the next cycle of learning. This is cyclical adjustment. In some cases, societies (national economies, to be exact) adjust in a pretty continuous process of diminishing error. They make big errors initially, and they reduce their error of adjustment in a visible trend of nailing down workable patterns. Finally, in some cases, national economies can go haywire and increase their error continuously, instead of decreasing it or cycling on it.

I am reconnecting to my method of quantitative analysis, based on simulating with a simple neural network. As I did that little excursion into the realm of autoregressive conditional heteroscedasticity, I realized that most of the quantitative methods used today start from studying one single variable, and then increase the scope of analysis by including many variables in the dataset, whilst each variable keeps being the essential monad of observation. For me, the complex local state of the society studied is that monad of observation and empirical study. By default, I group all the variables together, as distinct, and yet fundamentally correlated manifestations of the same existential stuff happening here and now. What I study is a chain of here-and-now states of reality rather than a bundle of different variables.    

I realize that whilst it is almost axiomatic, in typical quantitative analysis, to phrase out the null hypothesis as the absence of correlation between variables, I don’t even think about it. For me, all the empirical variables which we, humans, measure and report in our statistical data, are mutually correlated one way or another, because they all talk about us doing things together. In phenomenological terms, is it reasonable to assume that what we do in order to produce real output, i.e. our Gross Domestic Product, is uncorrelated with what we do with the prices of productive assets? Probably not.

There is a fundamental difference between discovering and studying individual properties of a social system, such as heteroskedastic autoregression in a variable, on the one hand, and studying the way this social system changes and learns as a collective, on the other. It means two different definitions of the expected state. In most quantitative methods, the expected state is the mean value of one single variable. In my approach, it is always a vector of expected values.
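The difference fits in a few lines of Python, with made-up numbers:

```python
import numpy as np

# 3 observations of 2 variables: each row is one complex state of the system
data = np.array([[2.0, 10.0],
                 [4.0, 14.0],
                 [6.0, 18.0]])

# Typical quantitative approach: the expected state of ONE variable is its mean
mean_of_first_variable = data[:, 0].mean()   # 4.0

# My approach: the expected state is a vector of expected values
expected_state = data.mean(axis=0)           # array([ 4., 14.])
```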

I think I start nailing down, at last, the core scientific idea I want to convey in my book about collective intelligence. Studying human societies as instances of collective intelligence, or, if you want, as collectively intelligent structure, means studying chains of complex states. The Markov chain of states, and the concept of state space, are the key mathematical notions here.
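A toy sketch of what I mean, with an arbitrary adjustment rate and made-up variables; the point is only that each state is a full vector, and that the next state depends on the current one alone:

```python
import numpy as np

rng = np.random.default_rng(42)

def next_state(state, adjustment_rate=0.1):
    # The Markov property: the next complex state depends only on the
    # current one, here nudged by a small random collective adjustment
    return state + adjustment_rate * rng.standard_normal(state.shape)

state = np.array([1.0, 2.0, 3.0])  # e.g. energy use, real output, urban land
chain = [state]
for _ in range(50):
    chain.append(next_state(chain[-1]))

# 51 states in the chain, each of them a vector, not a single variable
```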

I have used that method, so far, to study four distinct fields of empirical research: a) the way we collectively approach energy management in our societies b) the orientation of national economies on the optimization of specific macroeconomic variables c) the way we collectively manage the balance between urban land, urban density of population, and agricultural production, and d) the way we collectively learn in the presence of random disturbances. The main findings I can phrase out start with the general observation that in a chain of complex social states, we collectively tend to lean towards some specific aspects of our social reality. For want of a better word, I equate those aspects to the quantitative variables I find them represented by, although it is something to dig into. We tend to optimize the way we work, in the first place, and the way we sell our work. Concerns such as return on investment or real output come as secondary. That makes sense. At the large scale, the way we work is important for the way we use energy, and collectively learn. Surprisingly, variables commonly associated with energy management, such as energy efficiency, or the exact composition of energy sources, are secondary.

The second big finding is the one I have already mentioned: in the manuscript I am still struggling to polish enough to make it publishable ( ), I identified three different modes of collective learning. Most of the societies I studied empirically learn cyclically: they first produce big errors in adjustment, then narrow their error down, and in a next phase the error increases again, just to decrease once more in the next cycle. Some societies (national economies, to be exact) adjust in a pretty continuous process of diminishing error, nailing down workable patterns in a visible trend. Finally, some national economies can go haywire and increase their error continuously instead of decreasing it or cycling on it.

The third big finding is about the fundamental logic of social change, or so I perceive it. We seem to be balancing, over decades, the proportions between urban land and agricultural land so as to balance the production of food with the production of new social roles for new humans. The countryside is the factory of food, and cities are factories of new social roles. I think I can make a strong, counterintuitive claim that social unrest, such as what is currently going on in the United States, for example, erupts when the capacity to produce food in the countryside grows much faster than the capacity to produce new social roles in the cities. When our food systems can sustain more people than our collective learning can provide social roles for, we have an overhead of individuals whose most essential physical subsistence is provided for, and yet who have nothing sensible to do, in the collectively intelligent structure of society.

Cultural classes

Some of my readers asked me to explain how to get in control of one’s own emotions when starting their adventure as small investors in the stock market. The purely psychological side of self-control is something I leave to people smarter than me in that respect. What I do to have more control is the Wim Hof method ( ) and it works. You are welcome to try. I described my experience in that matter in the update titled ‘Something even more basic’. Still, there is another thing, namely, to start with a strategy of investment clever enough to allow emotional self-control. The strongest emotion I have been experiencing on my otherwise quite successful path of investment is the fear of loss. Yes, there are occasional bubbles of greed, but they are more like childish expectations to get the biggest toy in the neighbourhood. They are bubbles, which burst quickly and inconsequentially. The fear of loss is there to stay, on the other hand.    

Here is what I advise doing. I mean, this is what I didn’t do at the very beginning, and for lack of doing it I made some big mistakes in my decisions. Only after some time (around 2 months), I figured out the mental framework I am going to present. Start by picking a market. I started with a dual portfolio, like 50% in the Polish stock market, and 50% in the big foreign ones, such as US, Germany, France etc. Define the industries you want to invest in, like biotech, IT, renewable energies. Whatever: pick something. Study the stock prices in those industries. Pay particular attention to the observed losses, i.e. the observed magnitude of depreciation in those stocks. Figure out the average possible loss, and the maximum one. Now, you have an idea of how much you can lose in percentage. Quantitative techniques such as mean-reversion or extrapolation of the past changes can help. You can consult my update titled ‘What is my take on these four: Bitcoin, Ethereum, Steem, and Golem?’ to see the general drift.
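To show what I mean by figuring out the average and the maximum possible loss, here is a minimal sketch with made-up prices; real numbers would come from the stocks of the industries you picked:

```python
import numpy as np

def drawdowns(prices):
    # percentage drop of each price below the highest price seen so far
    prices = np.asarray(prices, dtype=float)
    running_peak = np.maximum.accumulate(prices)
    return (prices - running_peak) / running_peak

prices = [100, 110, 95, 105, 80, 90, 120]   # purely illustrative
dd = drawdowns(prices)
print(round(dd.min() * 100, 1))    # -27.3 -> the maximum observed loss, in %
print(round(dd.mean() * 100, 1))   # -9.1  -> the average depth below the peak
```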

The next step is to accept the occurrence of losses. You need to acknowledge very openly the following: you will lose money on some of your investment positions, inevitably. This is why you build a portfolio of many investment positions. All investors lose money on parts of their portfolio. The trick is to balance losses with even greater gains. You will be experimenting, and some of those experiments will be successful, whilst others will be failures. When you learn investment, you fail a lot. The losses you incur when learning, are the cost of your learning.

My price of learning was around €600, and then I bounced back and compensated it with a large surplus. If I take those €600 and compare it to the cost of taking an investment course online, e.g. with Coursera, I think I made a good deal.

Never invest all your money in the stock market. My method is to take some 30% of my monthly income and invest it, month after month, patiently and rhythmically, by instalments. For you, it can be 10% or 50%, which depends on what exactly your personal budget looks like. Invest just the amount you feel you can afford exposing to losses. Nail down this amount honestly. My experience is that big gains in the stock market are always the outcome of many consecutive steps, with experimentation and the cumulative learning derived therefrom.

General remark: you are much calmer when you know what you’re doing. Look at the fundamental trends and factors. Look beyond stock prices. Try to understand what is happening in the real business you are buying and selling the stock of. That gives perspective and allows more rational decisions.  

That would be it, as regards investment. You are welcome to ask questions. Now, I shift my topic radically. I return to the painful and laborious process of writing my book about collective intelligence. I feel like shaking things off a bit. I feel I need a kick in the ass. With the pandemic around and social contacts scarce, I need to be the one who kicks my own ass.

I am running myself through a series of typical questions asked by a publisher. Those questions fall in two broad categories: interest for me, as compared to interest for readers. I start with the external point of view: why should anyone bother to read what I am going to write? I guess that I will have two groups of readers: social scientists on the one hand, and plain folks on the other hand. The latter might very well have a deeper insight than the former, only the former like being addressed with reverence. I know something about it: I am a scientist.

Now comes the harsh truth: I don’t know why other people should bother about my writing. Honestly. I don’t know. I have been sort of carried away and in the stream of my own blogging and research, and that question comes as alien to the line of logic I have been developing for months. I need to look at my own writing and thinking from outside, so as to adopt something like a fake observer’s perspective. I have to ask myself what is really interesting in my writing.

I think it is going to be a case of assembling a coherent whole out of sparse pieces. I guess I can enumerate, once again, the main points of interest I find in my research on collective intelligence and investigate whether at all and under what conditions the same points are likely to be interesting for other people.

Here I go. There are two, sort of primary and foundational points. For one, I started my whole research on collective intelligence when I experienced the neophyte’s fascination with Artificial Intelligence, i.e. when I discovered that some specific sequences of equations can really figure stuff out just by experimenting with themselves. I did both some review of literature, and some empirical testing of my own, and I discovered that artificial neural networks can be and are used as more advanced counterparts to classical quantitative models. In social sciences, quantitative models are about the things that human societies do. If an artificial form of intelligence can be representative for what happens in societies, I can hypothesise that said societies are forms of intelligence, too, just collective forms.

I am trying to remember what triggered in me that ‘Aha!’ moment, when I started seriously hypothesising about collective intelligence. I think it was when I was casually listening to an online lecture on AI, streamed from the Massachusetts Institute of Technology. It was about programming AI in robots, in order to make them able to learn. I remember one ‘Aha!’ sentence: ‘With a given set of empirical data supplied for training, robots become more proficient at completing some specific tasks rather than others’. At the time, I was working on an article for the journal ‘Energy’. I was struggling. I had an empirical dataset on energy efficiency in selected countries (i.e. on the average amount of real output per unit of energy consumption), combined with some other variables. After weeks and weeks of data mining, I had a gut feeling that some important meaning is hidden in that data, only I wasn’t able to put my finger precisely on it.

That MIT-coined sentence on robots triggered that crazy question in me. What if I return to the old and apparently obsolete claim of the utilitarian school in social sciences, and assume that all those societies I have empirical data about are something like one big organism, with different variables being just different measurable manifestations of its activity?

Why was that question crazy? Utilitarianism is always contentious, as it is frequently used to claim that small local injustice can be justified by bringing a greater common good for the whole society. Many scholars have advocated for that claim, and probably even more of them have advocated against. I am essentially against. Injustice is injustice, whatever greater good you bring about to justify it. Besides, being born and raised in a communist country, I am viscerally vigilant to people who wield the argument of ‘greater good’.

Yet, the fundamental assumptions of utilitarianism can be used under a different angle. Social systems are essentially collective, and energy systems in a society are just as collective. There is any point at all in talking about the energy efficiency of a society only when we are talking about the entire, intricate system of using energy. About 30% of the energy that we use is used in transport, and transport happens between people, from one person to another. Stands to reason, doesn’t it?

Studying my dataset as a complex manifestation of activity in a big complex organism begs for the basic question: what do organisms do, like in their daily life? They adapt, I thought. They constantly adjust to their environment. I mean, they do if they want to survive. If I settle for studying my dataset as informative about a complex social organism, what does this organism adapt to? It could be adapting to a gazillion of factors, including some invisible cosmic radiation (the visible one is called ‘sunlight’). Still, keeping in mind that sentence about robots, adaptation can be considered as actual optimization of some specific traits. In my dataset, I have a range of variables. Each variable can be hypothetically considered as informative about a task, which the collective social robot strives to excel at.

From there, it was relatively simple. At the time (some 16 months ago), I was already familiar with the logical structure of a perceptron, i.e. a very basic form of artificial neural network. I didn’t know – and I still don’t – how to program effectively the algorithm of a perceptron, but I knew how to make a perceptron in Excel. In a perceptron, I take one variable from my dataset as output, the remaining ones are instrumental as input, and I make my perceptron minimize the error on estimating the output. With that simple strategy in mind, I can make as many alternative perceptrons out of my dataset as I have variables in the latter, and it was exactly what I did with my data on energy efficiency. Out of sheer curiosity, I wanted to check how similar were the datasets transformed by the perceptron to the source empirical data. I computed Euclidean distances between the vectors of expected mean values, in all the datasets I had. I expected something foggy and pretty random, and once again, life went against my expectations. What I found was a clear pattern. The perceptron pegged on optimizing the coefficient of fixed capital assets per one domestic patent application was much more similar to the source dataset than any other transformation.
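The procedure can be sketched in Python. This is not my original Excel perceptron, just a minimal reconstruction of the logic, with random numbers standing in for the empirical dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

def perceptron_transform(data, output_col, epochs=100, learning_rate=0.05):
    # One variable as output, the remaining ones as input; the perceptron
    # iteratively reduces its error on estimating the output
    X = np.delete(data, output_col, axis=1)
    y = data[:, output_col]
    weights = rng.standard_normal(X.shape[1])
    for _ in range(epochs):
        prediction = 1 / (1 + np.exp(-X @ weights))   # sigmoid activation
        error = y - prediction
        weights += learning_rate * X.T @ error / len(y)
    transformed = data.copy()
    transformed[:, output_col] = 1 / (1 + np.exp(-X @ weights))
    return transformed

data = rng.random((40, 4))   # 40 observations of 4 standardized variables

# One alternative perceptron per variable, then the Euclidean distance
# between the mean vector of the source data and of each transformed dataset
distances = [np.linalg.norm(data.mean(axis=0)
                            - perceptron_transform(data, k).mean(axis=0))
             for k in range(data.shape[1])]
closest = int(np.argmin(distances))   # the variable whose optimization keeps
                                      # the dataset closest to the source data
```

With real data, that `closest` variable is the one I interpret as the collective orientation of the society studied; with random numbers, of course, it means nothing.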

In other words, I created an intelligent computation, and I made it optimize different variables in my dataset, and it turned out that, when optimizing that specific variable, i.e. the coefficient of fixed capital assets per one domestic patent application, that computation was the most faithful representation of the real empirical data.

This is when I started wrapping my mind around the idea that artificial neural networks can be more than just tools for optimizing quantitative models; they can be simulators of social reality. If that intuition of mine is true, societies can be studied as forms of intelligence, and, as they are, precisely, societies, we are talking about collective intelligence.

Much to my surprise, I am discovering a similar perspective in Steven Pinker’s book ‘How The Mind Works’ (W. W. Norton & Company, New York London, Copyright 1997 by Steven Pinker, ISBN 0-393-04535-8). Professor Steven Pinker uses a perceptron as a representation of the human mind, and it seems to be a bloody accurate representation.

That makes me come back to the interest that readers could have in my book about collective intelligence, and I cannot help referring to still another book of another author: Nassim Nicholas Taleb’s ‘The black swan. The impact of the highly improbable’ (2010, Penguin Books, ISBN 9780812973815). Speaking from an abundant experience of quantitative assessment of risk, Nassim Taleb criticizes most quantitative models used in finance and economics as pretty much useless in making reliable predictions. Those quantitative models are good solvers, and they are good at capturing correlations, but they suck at predicting things, based on those correlations, he says.

My experience of investment in the stock market tells me that those mid-term waves of stock prices, which I so much like riding, are the product of dissonance rather than correlation. When a specific industry or a specific company suddenly starts behaving in an unexpected way, e.g. in the context of the pandemic, investors really pay attention. Correlations are boring. In the stock market, you make good money when you spot a Black Swan, not another white one. Here comes a nuance. I think that black swans happen unexpectedly from the point of view of quantitative predictions, yet they don’t come out of nowhere. There is always a process that leads to the emergence of a Black Swan. The trick is to spot it in time.

F**k, I need to focus. The interest of my book for the readers. Right. I think I can use the concept of collective intelligence as a pretext to discuss the logic of using quantitative models in social sciences in general. More specifically, I want to study the relation between correlations and orientations. I am going to use an example in order to make my point a bit more explicit, hopefully. In my preceding update, titled ‘Cool discovery’, I did my best, using my neophytic and modest skills in programming, to translate the method of negotiation proposed in Chris Voss’s book ‘Never Split the Difference’ into a Python algorithm. Surprisingly for myself, I found two alternative ways of doing it: as a loop, on the one hand, and as a class, on the other hand. They differ greatly.

Now, I simulate a situation when all social life is a collection of negotiations between people who try to settle, over and over again, contentious issues arising from us being human and together. I assume that we are a collective intelligence of people who learn by negotiated interactions, i.e. by civilized management of conflictual issues. We form social games, and each game involves negotiations. It can be represented as a lot of these >>

… and a lot of those >>

In other words, we collectively negotiate by creating cultural classes – logical structures connecting names to facts – and inside those classes we ritualise looping behaviours.
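A toy sketch of that claim: a cultural class, i.e. a logical structure connecting a name to facts, with a ritualized loop running inside it. All the names here are my arbitrary choices:

```python
class CulturalClass:
    # a logical structure connecting a name to facts
    def __init__(self, name):
        self.name = name
        self.facts = []

    def negotiate(self, fact):
        # one ritualized move: another fact gets attached to the name
        self.facts.append(fact)

talks = CulturalClass('price of work')
for round_number in range(3):   # the looping behaviour inside the class
    talks.negotiate(f'claim made in round {round_number}')

print(len(talks.facts))  # 3
```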

Cool discovery

Writing about me learning something helps me to control emotions involved into the very process of learning. It is like learning on the top of learning. I want to practice programming, in Python, the learning process of an intelligent structure on the basis of negotiation techniques presented in Chris Voss’s book ‘Never Split the Difference’. It could be hard to translate a book into an algorithm, I know. I like hard stuff, and I am having a go at something even harder: translating two different books into one algorithm. A summary, and an explanation, are due. Chris Voss develops, in the last chapter of his book, a strategy of negotiation based on the concept of Black Swan, as defined by Nassim Nicholas Taleb in his book ‘The black swan. The impact of the highly improbable’ (I am talking about the revised edition from 2010, published with Penguin Books, ISBN 9780812973815).

Generally, Chris Voss takes a very practical drift in his method of negotiation. By ‘practical’, I mean that he presents techniques which he developed and tested in hostage negotiations with the FBI, where he used to be the chief international hostage negotiator. He seems to attach particular importance to all the techniques which allow unearthing the non-obvious in negotiations: hidden emotions, ethical values, and contextual factors with strong impact on the actual negotiation. His method is an unusual mix of a rigorous cognitive approach with a very emotion-targeting thread. His reference to Black Swans, thus to what we don’t know we don’t know, is an extreme version of that approach. It consists in using literally all our cognitive tools to uncover events and factors in the game which we didn’t even initially know were in the game.

Translating a book into an algorithm, especially for a newbie of programming such as I am, is hard. Still, in the case of ‘Never Split the Difference’, it is a bit easier because of the very game-theoretic nature of the method presented. Chris Voss attaches a lot of importance to taking our time in negotiations, and to making our counterpart make a move rather than overwhelming them with our moves. All that is close to my own perspective and makes the method easier to translate into a functional sequence where each consecutive phase depends on the preceding phase.

Anyway, I assume that a negotiation is an intelligent structure, i.e. it is an otherwise coherent and relatively durable structure which learns by experimenting with many alternative versions of itself. That implies a lot. Firstly, it implies that the interaction between negotiating parties is far from being casual and accidental: it is a structure, it has coherence, and it is supposed to last by recurrence. Secondly, negotiations are supposed to be learning much more than bargaining and confrontation. Yes, it is a confrontation of interests and viewpoints, nevertheless the endgame is learning. Thirdly, an intelligent structure experiments with many alternative versions of itself and learns by assessing the fitness of those versions in coping with a vector of external stressors. Therefore, negotiating in an intelligent structure means that, consciously or unconsciously, we, mutual counterparts in negotiation, experiment together with many alternative ways of settling our differences, and we are essentially constructive in that process.

Do those assumptions hold? I guess I can somehow verify them by making first steps into programming a negotiation.  I already know two ways of representing an intelligent structure as an algorithm: in the form of a loop (primitive, tried it, does not fully work, yet has some interesting basic properties), or in the form of a class, i.e. a complex logical structure which connects names to numbers.

When represented as a loop, a negotiation is a range of recurrent steps, where the same action is performed a given number of times. Looping means that a negotiation can be divided into a finite number of essentially identical steps, and the endgame is the cumulative output of those steps. With that in mind, I can see that a loop is not truly intelligent a structure. Intelligent learning requires more than just repetition: we need consistent assessment and dissemination of new knowledge. Mind you, many negotiations can play out as ritualized loops, and this is when they are the least productive. Under the condition of unearthing Black Swans hidden in the contentious context of the negotiation, the whole thing can play out as an intelligent structure. Still, many loop-like negotiations which recurrently happen in a social structure, can together form an intelligent structure. Looks like intelligent structures are fractal: there are intelligent structures inside intelligent structures etc. Intelligent social structures can contain chains of ritualized, looped negotiations, which are intelligent structures in themselves.   

Whatever. I program. When I try to sift the essential phenomenological categories out of Chris Voss’s book ‘Never Split the Difference’, I get the following list of techniques he recommends:

>> Mirroring – I build emotional rapport by just repeating the last three words of each big claim phrased out by my counterpart.

 >> Labelling – I further build emotional rapport by carefully and impersonally naming emotions and aspirations in my counterpart.

>> Open-ended questions – I clarify claims and disarm emotional bottlenecks by asking calibrated open questions such as ‘How can we do X,Y, Z?’ or ‘What do we mean by…?’ etc.

>> Claims – I state either what I want or what I want my counterpart to think I want.

Those four techniques can be used in various shades and combinations to accomplish typical partial outcomes in negotiation, namely: a) opportunities for your counterpart to say openly ‘No’ b) agreement in principle c) guarantee of implementation d) Black Swans, i.e. unexpected attributes of the situation which turn the negotiation in a completely different, favourable direction.
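Those four techniques can be sketched as methods of a Python class; the method names and the rapport counter are my own, purely illustrative choices, not anything taken from the book:

```python
class NegotiationToolkit:
    def __init__(self):
        self.rapport = 0   # a crude stand-in for accumulated emotional rapport

    def mirror(self, claim):
        # repeat the last three words of the counterpart's claim
        self.rapport += 1
        return ' '.join(claim.split()[-3:]) + '?'

    def label(self, emotion):
        # impersonally name the emotion observed in the counterpart
        self.rapport += 1
        return f'It seems like {emotion}.'

    def open_question(self, topic):
        # a calibrated, open-ended question
        return f'How can we {topic}?'

    def claim(self, want):
        # state what I want, or what I want my counterpart to think I want
        return f'I want {want}.'

kit = NegotiationToolkit()
print(kit.mirror('we really need a better price'))  # 'a better price?'
```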

I practice phrasing it out as a class in Python. Here is what I came up with and which my JupyterLab interpreter swallows nicely without yielding any errors:

Mind you, I don’t know how exactly it works, algorithmically. I am a complete newbie to programming classes in Python, and my first goal is to have the grammar right, and thus not to have to deal with those annoying, salmon-pink-shaded messages of error.

Before I go further into programming negotiation as a class, I feel like I need to go back to my primitive skills, i.e. to programming loops, in order to understand the mechanics of the class I have just created. Each ‘self’ in the class is a category able to have many experimental versions of itself. I try the following structure:
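Here is a reconstruction of the kind of loop I tried; the dataset name `Dataset_of_claims` is one I had never defined anywhere, which is exactly the point:

```python
Mirrors = []   # experimental versions of the 'mirror' move
try:
    for claim in Dataset_of_claims:   # this name was never defined
        Mirrors.append(claim)
except NameError as error:
    print(error)   # name 'Dataset_of_claims' is not defined
```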

As you can see, I received a NameError, i.e. an error of non-definition: I have not defined the dataset which I want to use for appending my lists. Such a dataset would contain linguistic strings, essentially. Thus, the type of datasets I am operating with, here, are sets of linguistic strings, thus sets of objects. An intelligent structure representative for negotiation is an algorithm for processing natural language. Cool discovery.

I got it all wrong

I like doing research on and writing about collective intelligence in human societies. I am even learning to program in Python in order to know how to code collective intelligence in the form of an artificial neural network. I decided to take on my own intelligence as an interesting diversion from the main course. I hope I can assume I am an intelligent structure. Whilst staying essentially coherent, i.e. whilst remembering who I am, I can experiment a bit with many different versions of myself. Of course, a substantial part of the existential Me works like a shuttle train, going back and forth on the rails of my habits. Still, I can learn heuristically on my own experience. Heuristic learning means that as I learn something, I gain new knowledge about how much more I can learn about and along the lines of the same something.

I want to put into a Python code the experience of heuristic, existential learning which I exemplified in the update titled ‘Something even more basic’. It starts with experience which happens through intentional action on my part. I define a vector of action, i.e. a vector of behavioural patterns, associated with the percentage of my total time they take. That percentage can be understood, alternatively, as the probability that any given minute in the day is devoted to that specific action. Some of those patterns are active, and some are dormant, with the possibility of being triggered into existence. Anyway, it is something like A = {a1, a2, …, an}. Now, in terms of coding in Python, is that vector of action a NumPy array, or is it a Pandas data frame? In terms of pure algorithmic technique, it is a trade-off between computational speed, with a NumPy array, and programmatic versatility in the case of a Pandas data frame. Here are a few points of view expressed, as regards this specific matter, by people smarter than me:




In terms of algorithmic theory, these are two different, cognitive phenomena. A NumPy array is a structured collection of numbers, whilst a Pandas data frame is a structure combining many types of data, e.g. string objects with numbers. How does it translate into my own experience? I think that, essentially, my action is a data frame. I take purposeful action to learn something when I have a logical frame to put it in, i.e. when I have words to label what I do. That leads me to starting at even more elementary a level, namely that of a dictionary as regards my actions.
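The difference, put in the simplest possible code; the action labels and the percentages are made up for the sake of illustration:

```python
import numpy as np
import pandas as pd

# A NumPy array: numbers only - the shares of my day going to each action
shares_of_time = np.array([0.30, 0.25, 0.20, 0.15, 0.10])

# A Pandas data frame: the same numbers combined with string labels,
# i.e. a logical frame with words put on what I do
actions = pd.DataFrame({
    'action': ['work', 'learn', 'train', 'write', 'rest'],
    'share_of_time': shares_of_time,
})
```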

Anyway, I create a notebook with JupyterLab, and I start like a hamster, with stuffing my cheeks with libraries:

>> import numpy as np

>> import pandas as pd

>> import os

>> import math     

Then, I make a first dictionary:

>> Types_of_action=['Action 1','Action 2','Action 3','Action 4','Action 5']

A part of my brain says, at this point: ‘Wait a minute, bro. Before you put labels on the things that you do, you need to be doing things. Humans label stuff that happens, essentially. Yes, of course, later on, we can make metaphors and abstract concepts out of it but, fundamentally, descriptive language comes after experience’. Well, dear part of my brain, this is a valid argument. Things unfold into a paradox, just as I like it. I need raw experience, primal to any logical structuring. How to express it in Python? I can go like:

>> Raw_experience=np.random.rand(np.random.randint(1)) # This is meant to be a NumPy array of random decimal values, with a random number of values in it. Mind you, np.random.randint(1) draws an integer from the half-open range [0, 1), thus it always returns 0, and the array always comes out empty.

I check. I type ‘Raw_experience’ and run it. Python answers:

>> array([], dtype=float64) # I have just made a paradox: a totally empty array of numbers, i.e. with no numbers in it, and yet those nonexistent numbers have a type, namely that of ‘float64’.

I try something less raw and more cooked, like:

>> Raw_experience_50=np.random.rand(50) # I assume a priori there are 50 distinct units of raw experience

>> Raw_experience_50 # yields…

>> array([0.73209089, 0.94390333, 0.44267215, 0.92111994, 0.4098961 ,

       0.22435079, 0.61447481, 0.21183481, 0.10223352, 0.04481922,

       0.01418667, 0.65747087, 0.22180559, 0.6158434 , 0.82275393,

       0.22446375, 0.31331992, 0.64459349, 0.90762324, 0.65626915,

       0.41462473, 0.35278516, 0.13978946, 0.79563848, 0.41794509,

       0.12931173, 0.37012284, 0.37117378, 0.30989358, 0.26912215,

       0.7404481 , 0.61690128, 0.41023962, 0.9405769 , 0.86930885,

       0.84279381, 0.91174751, 0.04715724, 0.35663278, 0.75116884,

       0.78188546, 0.30712707, 0.00615981, 0.93404037, 0.82483854,

       0.99342718, 0.74814767, 0.49888401, 0.93164796, 0.87413073])

This is a short lesson of empiricism. When I try to code raw, completely unstructured experience, I obtain an empty expanse. I return to that interesting conversation with a part of my brain. Dear part of my brain, you were right to point out that experience comes before language, and yet, without language, i.e. without any logical structuring of reality, I don’t know s**t about experience, and I cannot intelligibly describe it. I need to go for a compromise. I make that experience as raw as possible by making it happen at random, and, at the same time, I need to give it some frame, like the number of times those random things are supposed to happen to me.

I define a list, Types_of_action, with 5 types of action in it. Thus, I define a random path of happening as an array made of 5 categories (columns), and 50 rows of experience: Raw_experience_for_action = np.random.rand(50, 5).

I acknowledge the cognitive loop I am in, made of raw experience and requiring some language to put order in all that stuff. I make a data frame:

>> Frame_of_action = pd.DataFrame(Raw_experience_for_action, columns = [Types_of_action]) # One remark is due, just in case. In Python code, spaces around arguments are optional. I put spaces, somehow in phase with punctuation, just to make some commands more readable.

I check with ‘Frame_of_action.info()’ and I get:

>> <class 'pandas.core.frame.DataFrame'>
RangeIndex: 50 entries, 0 to 49
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   (Action 1,)  50 non-null     float64
 1   (Action 2,)  50 non-null     float64
 2   (Action 3,)  50 non-null     float64
 3   (Action 4,)  50 non-null     float64
 4   (Action 5,)  50 non-null     float64
dtypes: float64(5)
memory usage: 2.1 KB
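Put together, the steps above can be reproduced as follows; the action labels ‘Action 1’ through ‘Action 5’ are my assumption, inferred from the column names in the printout:

```python
import numpy as np
import pandas as pd

# Five types of action, matching the columns shown in the .info() printout
Types_of_action = ['Action 1', 'Action 2', 'Action 3', 'Action 4', 'Action 5']

# 50 rows of raw experience across 5 categories of action
Raw_experience_for_action = np.random.rand(50, 5)

# Note: passing columns=[Types_of_action] (a list inside a list) is what
# produces the tuple-like column names '(Action 1,)' in the printout;
# passing columns=Types_of_action gives plain column names.
Frame_of_action = pd.DataFrame(Raw_experience_for_action, columns=Types_of_action)
```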

Once I have that basic frame of action, what is my next step? I need to learn from that experience. The frame of action is supposed to give me knowledge. What is knowledge coming from action? That type of knowledge is called ‘outcomes’. My action brings an outcome, and I evaluate it. Now, in a baseline algorithm of artificial neural network, evaluation of outcomes happens by pitching them against a predefined benchmark, something like expected outcome. As I am doing my best to be an intelligent structure, there is that aspect too, of course. Yet, there is something else, which I want to deconstruct, understand, and reconstruct as Python code. There is discovery and exploration, thus something that I perceive as entirely new a piece of experience. I don’t have any benchmark I can consciously pitch that experience against.

I can perceive my fresh experiential knowledge in two different ways: as a new piece of data, or as an error, i.e. as deviation from the expected state of reality. Both mathematically and algorithmically, it is a difference. Mathematically, any number, thus any piece of data, is the result of an operation. If I note down, in the algorithm of my heuristic learning, my new knowledge as literally new, it still needs to come from some kind of mathematical operation: addition, subtraction, multiplication, or division.
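In code, that ‘new knowledge as difference’ can be as simple as the sketch below; the benchmark value of 0.5 is an arbitrary assumption of mine:

```python
import numpy as np

expected_outcome = 0.5                        # arbitrary, assumed benchmark
outcomes = np.random.rand(50)                 # 50 observed outcomes of action
new_knowledge = outcomes - expected_outcome   # knowledge as deviation, i.e. error
```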

As I think about myself learning new stuff, there is a phase, in the beginning, when I have some outcomes, and yet I don’t have any precise idea what those outcomes are, exactly. This is something that happens in coincidence (I don’t even know, yet, if this is a functional correlation) with the actions I do.

As I think about all that stuff, I try to make a loop of learning between action and outcomes, and as I am doing it, I realize I got it all wrong. For the last few weeks, I have been assuming that an intelligent structure can and should be coded as a loop (see, for example, ‘Two loops, one inside the other’). Still, as I am trying to code the process of my own heuristic learning, I realize that an algorithmic loop has fundamental flaws in that respect. Essentially, each experimental round – where I pitch the outcomes of my actions against a pre-defined expected outcome – is a separate loop, as I have to feed forward the resulting error. With many experimental rounds, like thousands, making a separate loop for each of them is algorithmically clumsy. I know it even at my neophytic stage of advancement in Python.

When I don’t know what to do, I can ask around. I can ask people smarter than me. And so I ask:    



After rummaging a bit in the content available under those links, I realize that intelligent structures can be represented algorithmically as classes ( ), and it is more functional a way than representing them as loops. From the second of the above-mentioned links, I took an example of algorithm, which I allow myself to reproduce below. Discussing this algorithm will help me wrap my own mind around it and develop new understandings.
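The example in question is the widely circulated ‘neural network from scratch’ algorithm; below is my sketch of a version consistent with the discussion that follows, i.e. three functions inside the class, with ‘__init__’ taking self, x, and y:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(s):
    # derivative of the sigmoid, expressed in terms of the sigmoid's output s
    return s * (1.0 - s)

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)  # hidden layer of 4 neurons
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(self.y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        # chain rule applied to the loss sum((y - output)^2)
        d_weights2 = np.dot(self.layer1.T,
                            2 * (self.y - self.output) * sigmoid_derivative(self.output))
        d_weights1 = np.dot(self.input.T,
                            np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output),
                                   self.weights2.T) * sigmoid_derivative(self.layer1))
        self.weights1 += d_weights1
        self.weights2 += d_weights2
```

Each call to feedforward() followed by backprop() is one experimental round: the error against the benchmark y is computed and fed back into the weights, with no outer loop needed inside the class itself.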


A neural network is a class, i.e. a type of object, which allows creating many different instances of itself. Inside the class, types of instances are defined, using selves: ‘self.input’, ‘self.output’ etc. Selves are composed into distinct functions, introduced with the command ‘def’. Among the three functions defined inside the class ‘NeuralNetwork’, one is particularly interesting, namely ‘__init__’. As I rummage through online resources, it turns out that ‘__init__’ serves to create objects inside a class, and then to make selves of those objects.

I am trying to dissect the use of ‘__init__’ in this specific algorithm. It is introduced with three attributes: self, x, and y. I don’t quite get the corresponding logic. I am trying with something simpler: an algorithm I found at :
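That simpler algorithm is the classic ‘Rectangle’ class example; my paraphrase of it, with the three dimensions mentioned below (length, breadth, unit cost), goes like this:

```python
class Rectangle:
    def __init__(self, length, breadth, unit_cost=0):
        self.length = length
        self.breadth = breadth
        self.unit_cost = unit_cost

    def get_area(self):
        return self.length * self.breadth

    def calculate_cost(self):
        # total cost of the rectangle's surface at the given unit cost
        return self.get_area() * self.unit_cost

# e.g. a 160 x 120 rectangle at a unit cost of 2000
r = Rectangle(160, 120, 2000)
```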

I think I start to understand. Inside the ‘__init__’ function, I need to signal there are different instances – selves – of the class I create. Then, I add the variables I intend to use. In other words, each specific self of the class ‘Rectangle’ has three dimensions: length, breadth, and unit cost.

I am trying to apply this logic to my initial problem, i.e. my own heuristic learning, with the bold assumption that I am an intelligent structure. I go:

>> class MyLearning:
       def __init__(self, action, outcomes, new_knowledge = 0):
           self.action = action
           self.outcomes = outcomes
           self.new_knowledge = new_knowledge
       def learning(self):
           return self.action - self.outcomes

When I run this code, there is no error message from Python, which is encouraging for a newbie such as I am. Mind you, I have truly vague an idea of what I have just coded. I know it is grammatically correct in Python.
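For the record, a self-contained restatement of that class with a hypothetical usage example; note that Python requires double underscores around ‘init’, and the attributes inside learning() need the ‘self.’ prefix:

```python
import numpy as np

class MyLearning:
    def __init__(self, action, outcomes, new_knowledge=0):
        self.action = action
        self.outcomes = outcomes
        self.new_knowledge = new_knowledge

    def learning(self):
        # new knowledge as the difference between action and its outcomes
        return self.action - self.outcomes

# hypothetical usage: 50 units of action and 50 corresponding outcomes
me = MyLearning(action=np.random.rand(50), outcomes=np.random.rand(50))
fresh = me.learning()
```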

Something even more basic

It had to happen. I have been seeing it coming, essentially, only I didn’t want to face it. The nature of truth. Whatever kind of intellectual adventure we engage in, i.e. whatever kind of game we start playing, with coherent understanding of reality in the centre of it, that s**t just has to come our way. We ask ourselves: ‘What is true?’.

My present take on truth is largely based on something even more basic than intellectual pursuit as such. Since December 2016, I have been practicing the Wim Hof method ( ). In short, this is a combination of breathing exercises with purposeful exposure to cold. You can check the original recipe on that website I have just provided, and I want to give an account of my own experience, such as I practice it now. Can I say that I practice an experience? Yes, in this case this is definitely the right way to call it. My practice of the Wim Hof method consists, precisely, in me exposing myself, over and over again, consistently, to a special experience.

Anyway, every evening, I practice breathing exercises, followed by a cold shower. I do the breathing in two phases. In the first phase, I do the classical Wim Hof pattern: 30 – 40 deep, powerful breaths, when I work with my respiratory muscles as energetically as I can, inhale through the nose, exhale through pinched mouth (so as to feel a light resistance in my lips and cheeks, as if my mouth were a bellows with limited flow capacity), and then, after I exhale for the 30th – 40th time, I pass into apnoea, i.e. I stop breathing. For how long? Here comes the first component of experiential truth: I have no idea how long. I don’t measure my phase of apnoea with a timer, I measure it with my proprioception.

In the beginning, that is 3 – 4 years ago, I used to follow religiously the recommendation of Wim Hof himself, namely to stay at the limit of my comfort zone. In other words, beginners should stay in apnoea just long enough to start feeling uncomfortable, and then they should inhale. When I feel something like muscular panic inside, e.g. my throat muscles going into a sort of spasm, I inhale, deeply, and I hold for like 10 seconds. Then, I repeat the whole cycle as many times as I feel like. Now, after 4 years of practice, I know that my comfort zone can stretch. Now, when I pass into apnoea, the spasm of my throat is one of the first proprioceptive feelings I experience, not the last. I stop breathing, my throat muscles contract in something that feels like habitual panic, and then something in me says: ‘Wait a minute. Like really, wait. That feeling of muscular panic in the throat muscles, it is interesting. Wait, explore it, discover. Please’. Thus, I discover. I discover that my throat muscles spasm in a cycle of contraction and relaxation. I discover that once I set for discovering that feeling, I start discovering layers of calm. Each short cycle of spasmatic panic in my throat induces in me something like a deep reach into my body, with emotions such as fear fading away a bit. I experience something like spots of pleasurable tingling and warmth, across my body, mostly in big muscular groups, like the legs, the abs, or the back. There comes a moment when the next spasm in my respiratory muscles drives me to inhaling. This specific moment moves in time as I practice. My brain seems to be doing, at every daily practice of mine, something like accounting work: ‘How much of my subjectively experienced safety from suffocation am I willing to give away, when rewarded with still another piece of that strange experience, when I feel as if I were suffocating, and, as strange as it seems, I sort of like the feeling?’.       

I repeat the cycle ’30 – 40 powerful breaths, apnoea, then inhale and hold for a few seconds’ a few times, usually 3 to 5 times, and this is my phase one. After, I pass into my phase two, which consists in doing as many power breaths as I can until I start feeling fatigue in my respiratory muscles. Usually, it is something like 250 breaths, sometimes I go beyond 300. After the last breath, I pass into apnoea, and, whilst staying in that state, I do 20 push-ups. After the last push-up, I breathe in, and hold for a long moment.

I repeat the whole big cycle of phase one followed by phase two, two times. Why two times? Why those two phases inside each cycle? See? That’s the big question. I don’t know. Something in me calibrates my practice into that specific protocol. That something in me is semi-conscious. I know I am following some sort of discovery into my sensory experience. When I am in phase one, I am playing with my fear of suffocation. Playing with that fear is bloody interesting. In phase two, let’s face it, I get high as f**k on oxygen. Really. Being oxygen-high, my brain starts behaving differently. It starts racing. Thoughts turn up in my consciousness like fireworks. Many of those thoughts are flashback memories. For a short moment, my brain dives head-first into a moment of my past. I suddenly re-experience a situation, inclusive of emotions and sensory feelings. This is a transporting experience. I realize, over and over again, that what I remember is full of ambiguity. There are many possible versions of memories about the same occurrence in the past. I feel great with that. When I started to go through this specific experience, when high on oxygen, I realized that my own memorized past is a mine of information. Among other things, it opens me up to the realization of how subjective my view is of events that happened to me. I realized how many very different viewpoints I can hold as regards the same occurrence.

Here comes another interesting thing which I experience when being high on oxygen amidst those hyper-long sequences of power breathing. Just as I rework my memories, I rework my intellectual take on present events. Ideas suddenly recombine into something new and interesting. Heavily emotional judgments about recent or ongoing tense situations suddenly get nuanced, and I joyfully indulge in accepting what a d**k I have been and how many alternative options I have.

After the full practice of two cycles, each composed of two phases in breathing exercises, I go under a cold shower. Nothing big, something like 30 seconds. Enough to experience another interesting distinction. When I pour cold water on my skin, the first, spontaneous label I can put on that experience is ‘cold’. Once again, something in me says ‘Wait a minute! There is no way in biology you can go into hypothermia in 30 seconds under a cold shower. What you experience is not cold, it is something else. It is the fear of cold. It is a flood of sensory warnings’. There is a follow-up experience which essentially confirms that intuition. Sometimes, when I feel like going really hard on that cold shower, like 2 minutes, I experience a delayed feeling of cold. One or two hours later, when I am warm and comfy, suddenly I start shivering and experiencing once again the fear of cold. By that point, there is no sensory stimulation whatsoever. It is something like a vegetative memory of the cold which I have experienced shortly before. My brain seems to be so overwhelmed with that information that it needs to rework through it. It is something like a vegetative night-dream. Funny.

All of that happens during each daily practice of the Wim Hof method. Those internal experiences vanish shortly after I stop the session. There is another layer, namely that of change which I observe in me as I practice over a long time. I have realized, and I keep realizing the immense diversity of ways I can experience my life. I keep taking and re-taking a step back from my current emotions and viewpoints. I sort of ride the wave crest of all the emotional and cognitive stuff that goes on inside of me. Hilarious and liberating an experience. Over the time spent practicing the Wim Hof method, I have learnt to empty my consciousness. It is real. For years, I have been dutifully listening to psychologists who claim that no one can really clear their conscious mind of any thought whatsoever. Here comes the deep revelation of existential truth through experience. My dear psychologists, you are wrong. I can and do experience, frequently and wilfully, a frame of mind when my consciousness is really like calm water. Nothing, like really. Not a single conscious thought. On the other hand, I know that I can very productively combine that state of non-thinking with a wild, no-holds-barred ride on thinking, when I just let it go internally and let thoughts flow through my consciousness. I know by experience that when I go in that sort of meditation, alternating the limpid state of consciousness with the crazy rollercoaster of running thoughts, my subconscious gets a kick, and I just get problem-solving done.

The nature of truth that I want to be the provisional bottom line, under my personal account of practicing the Wim Hof method, is existential. I know, people have already said it. Jean-Paul Sartre, Martin Heidegger. Existentialists. They claimed that truth is existential, that it comes essentially and primordially from experience. I know. Still, reading about the depth of existential truth is one thing, and experiencing it by myself is a completely different ball game. Existential truth has limits, and those limits are set, precisely, by the scope of experience we have lived. Here comes a painful, and yet quite enlightening, experience of mine, as regards the limits of existential truth. Yesterday, i.e. on January 1st, 2021, I was listening to one of my favourite podcasts, namely the ‘Full Auto Friday’ by Andy Stumpf ( ). A fan of the podcast asked for discussing his case, namely that of a young guy, age 18, whose father is alcoholic, elaborately destroys himself and people around him, and makes his young son face a deep dilemma: ‘To what extent should I sacrifice myself in order to help my father?’.

I know the story, in a slightly different shade. My relations with my father (he died in 2019) had never been exemplary, for many reasons. His f**k-ups and my f**k-ups summed up and elevated each other to a power, and long story short, my dad fell into alcoholism around the age I am now, i.e. in his fifties, in our common 1990s. He would drink himself into death and destruction, literally. Destroyed his liver, went through long-lasting portal hypertension ( ), destroyed his professional life, destroyed his relationships, inclusive, very largely, of his relationship with me. At the time, I was very largely watching the show from outside. Me and my father lived in different cities, and did not really share much of our existence. Just to prevent your questions: the role and position of my late mum, in all that, is a different story. She was living her life in France.

My position of bystander came brutally to an end in 2002, when my dad found himself at the bottom of the abyss. He stopped drinking, because in Polish hospitals we have something like semi-legal forced detox, and this is one of those rare times when something just semi-legal comes as deliverance. Yet, him having stopped drinking was about the only bright spot in his existence. The moment he was discharged from hospital, he was homeless. The state he was in, it meant death within weeks. I was his only hope. At age 34, I had to decide whether I would take care of a freshly detoxified alcoholic, who had not really been paying much attention to me and who happened to be my biological father. I decided I would take care of him, and so I did. Long story short, the next 17 years were rough. Both the existential status of my father and his health, inclusive of his personality, were irremediably damaged by alcoholism. Yes, this is the sad truth: even fully rehabbed alcoholics have just a chance to regain normal life, and that chance is truly far from certainty. Those 17 years, until my dad’s death, were like a very long agony, pointless and painful.

And here I was, yesterday, listening to Andy Stumpf’s podcast, and him instructing his young listener that he should ‘draw a line in the sand, and hold that line, and not allow his alcoholic father to destroy another life’. I was listening to that, I fully acknowledged the wisdom of those words, and I realized that I did exactly the opposite. No line in the sand, just a crazy dive, head-first, into what, so far, had been the darkest experience of my life. Did I make the right choice? I don’t know, really. No clue. Even when I run that situation through the meditative technique which I described a few paragraphs earlier, and I did it many times, I have still no clue. These are my personal limits of existential truth. I was facing a loop. To any young man, his father is the main role model for ethical behaviour. My father’s behaviour had been anything but a moral compass. Trying to figure out the right thing to do, in that situation, was like being locked in a box and trying to open it with a key lying outside the box.

I guess any person in a similar situation faces the same limits. This is one of those situations, when I really need cold, scientific, objectivized truth. Some life choices are so complex that existential truth is not enough. Yet, with many other things, existential truth just works. What is my existential truth, though, sort of generally and across the board? I think it is a very functional type of truth, namely that of gradual, step-by-step achievement. In my life, existential truth serves me to calibrate myself to achieve what I want to achieve.

Money being just money for the sake of it

I have been doing that research on the role of cities in our human civilization, and I remember the moment of first inspiration to go down this particular rabbit hole. It was the beginning of March, 2020, when the first epidemic lockdown was imposed in my home country, Poland. I was cycling through the streets of Krakow, my city, from home to the campus of my university. I remember being floored at how dead – yes, literally dead – the city looked. That was the moment when I started perceiving cities as something almost alive. I started wondering how the pandemic would affect the mechanics of those quasi-living, urban organisms.

Here is one aspect I want to discuss: restaurants. Most restaurants in Krakow have turned into takeouts. In the past, each restaurant had a catering part of the business, but it was mostly for special events, like conferences, weddings and whatnot. Catering was sort of a wholesale segment of the restaurant business, and the retail was, well, the table, the napkin, the waiter, that type of story. That retail part was supposed to be the main one. Catering was an addition to that basic business model, which entailed a few characteristic traits. When your essential business process takes place in a restaurant room with tables and guests sitting at them, the place is just as essential. The location, the size, the look, the relative accessibility: it all played a fundamental role. The rent for the place was among the most important fixed costs of a restaurant. When setting up business, one of the most important questions – and risk factors – was: “Will I be able to attract enough customers to this place, and to ramp prices up high enough to pay the rent and still make a satisfactory profit?”. It was like a functional loop: a better place (location, look) meant a more select clientele and higher prices, which in turn required paying a high rent, etc.

As I was travelling to other countries, and across my own country, I noticed many times that the attributes of the restaurant as a physical place were partly a substitute for the quality of food. I know a lot of places where the customers used to pretend that the food was excellent just because said food was so strange that it just didn’t do to say it was crappy in taste. Those people pretended they enjoyed the food because the place was awesome. The awesomeness of the place, in turn, was largely based on the fact that many people enjoyed coming there; it was trendy, stylish, and it was a good thing to show up there from time to time, just to show one has something to show to others. That was another loop in the business model of restaurants: the peculiar, idiosyncratic, gravitational field between places and customers.

In that business model, quite substantial expenses, i.e.  the rent, and the money spent on decorating and equipping the space for customers were essentially sunk costs. The most important financial outlays you made to make the place competitive did not translate into any capital value in your assets. The only way to do such translation was to buy the place instead of renting it. Advantageous, long-term lease was another option. In some cities, e.g. the big French ones, such as Paris, Lyon or Marseille, the market of places suitable for running restaurants, both legally and physically, used to be a special segment in the market of real estate, with its own special contracts, barriers to entry etc.   

As restaurants turn into takeouts, amidst epidemic restrictions, their business model changes. Food counts in the first place, and the place counts only to the extent of accessibility for takeout. Even if I order food from a very fancy restaurant, I pay for food, not for fanciness. When consumed at home, with the glittering reputation of the restaurant taken away from it, food suddenly tastes differently. I consume it much more with my palate and much less with my ideas of what is trendy. Preparation and delivery of food becomes the essential business process. I think it facilitates new entries into the market of gastronomy. Yes, I know, restaurants are going bankrupt, and my take on it is that places are going bankrupt, but people stay. Chefs and cooks are still there. Human capital, until recently being 50/50 important – together with the real estate aspect of the business – becomes definitely the greatest asset of the restaurants’ sector as they focus on takeout. The broadly spoken cooking skills, including the ability to purchase ingredients of good quality, become primordial. Equipping a business-scale kitchen is not really rocket science, and, what is just as important, there is a market for second-hand equipment of that kind. The equipment of a kitchen, in a takeout-oriented restaurant, is much more of an asset than the decoration of a dining room. The rent you pay, or the market price of the whole place in the real-estate market are much lower, too, as compared to classical restaurants.

What restaurant owners face amidst the pandemic is the necessity to switch quickly, on a very short notice of 1 – 2 weeks, between their classical business model based on a classy place to receive customers, and the takeout business model, focused on the quality of food and the promptness of delivery. It is a zone of uncertainty more than a durable change, and this zone is associated with different cash flows and different assets. That, in turn, means measurable risk. Risk in big amounts is essentially an amount, much more than a likelihood. We talk about risk, in economics and in finance, when we are actually sure that some adverse events will happen, and we even know the total amount of adversity to deal with; we just don’t know where exactly that adversity will hit and who exactly will have to deal with it.
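That ‘amount rather than likelihood’ idea can be made concrete with a toy simulation; all the numbers here are illustrative assumptions of mine, not data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions: 10 000 restaurants, each facing a 10% chance
# of a 100 000 loss in a given period.
n, p, loss = 10_000, 0.10, 100_000

hits = rng.random((500, n)) < p          # 500 simulated periods
total_losses = hits.sum(axis=1) * loss   # total adversity per period

# The total adversity stays close to a fixed amount, n * p * loss = 100 million;
# what remains unknown is who, exactly, takes the hit in any given period.
expected_total = n * p * loss
```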

There are two basic ways of responding to measurable risk: hedging and insurance. I can face risk by having some aces up my sleeve, i.e. by having some alternative assets, sort of fall-back ones, which assure me slightly softer a landing, should the s**t which I hedge against really happen. When I am at risk in my in-situ restaurant business, I can hedge towards my takeout business. With time, I can discover that I am so good at the logistics of delivery that it pays off to hedge towards a marketing platform for takeouts rather than one takeout business. There is an old saying that you shouldn’t put all your eggs in the same basket, and hedging is the perfect illustration thereof. I hedge in business by putting my resources in many different baskets.

On the other hand, I can face risk by sharing it with other people. I can make a business partnership with a few other folks. When I don’t really care who exactly those folks are, I can make a joint-stock company with tradable shares of participation in equity. I can issue derivative financial instruments pegged on the value of the assets which I perceive as risky. When I lend money to a business perceived as risky, I can demand it to be secured with tradable notes AKA bills of exchange. All that is insurance, i.e. a scheme where I give away part of my cash flow in exchange for the guarantee that other people will share with me the burden of damage, if I come to consume my risks. The type of contract designated expressis verbis as ‘insurance’ is one among many forms of insurance: I pay an insurance premium in exchange for the insurer’s guarantee to cover my damages. Restaurant owners can insure their epidemic-based risk by sharing it with someone else. With whom and against what kind of premium on risk? Good question. I can see like a shade of that. During the pandemic, marketing platforms for gastronomy, such as Uber Eats, swell like balloons. These might be the insurers of the restaurant business. They capitalize on the customer base for takeout. As a matter of fact, they can almost own that customer base.

A group of my students, all from France, as if by accident, had an interesting business concept: a platform for ordering food from specific chefs. A list of well-credentialed chefs is available on the website. Each of them recommends a few flagship recipes of theirs. The customer picks the specific chef and their specific culinary chef d’oeuvre. One more click, and the customer has that chef d’oeuvre delivered on their doorstep. Interesting development. Pas si con que ça – not as dumb as it sounds – as the French say.

Businesspeople have been using both hedging and insurance for centuries, to face various risks. When used systematically, those two schemes create two characteristic types of capitalistic institutions: financial markets and pooled funds. Spreading my capitalistic eggs across many baskets means that, over time, we need a way to switch quickly among baskets. Tradable financial instruments serve that purpose, and money is probably the most liquid and versatile among them. Yet, it is the least profitable one: flexibility and adaptability are the only gains that one can derive from holding large monetary balances. No interest rate, no participation in profits of any kind, no speculative gain on the market value. Just adaptability. Sometimes, that adaptability alone is worth foregoing the other gains. In the presence of a significant need for hedging risks, businesses hold abnormally large amounts of cash money.

When people insure a lot – and we keep in mind the general meaning of insurance as described above – they tend to create large pooled funds of liquid financial assets, which stand at the ready to repair any breach in the hull of the market. Once again, we return to money and financial markets. Whilst abundant use of hedging as strategy for facing risk leads to hoarding money at the individual level, systematic application of insurance-type contracts favours pooling funds in joint ventures. Hedging and insurance sort of balance each other.

Those pieces of the puzzle sort of fall together into a pattern. As I have been doing my investment in the stock market, all over 2020, financial markets seem to be puffy with liquid capital, and that capital seems avid for some productive application. It is as if money itself was saying: ‘C’mon, guys. I know I’m liquid, and I can compensate risk, but I am more than that. Me being liquid and versatile makes me easily convertible into productive assets, so please, start converting. I’m bored with being just me, I mean with money being just money for the sake of it’.

Boots on the ground

I continue the fundamental cleaning in my head, as the year 2020 comes to its end. What do I want? Firstly, I want to exploit and develop on my hypothesis of collective intelligence in human societies, and I want to develop my programming skills in Python. Secondly, I want to develop my skills and my position as a facilitator and manager of research projects at the frontier of the academic world and that of business. How will I know I have what I want? If I actually program a workable (and working) intelligent structure, able to uncover and reconstruct the collective intelligence of a social structure out of available empirical data – namely to uncover and reconstruct the chief collective outcomes that structure is after, and its patterns of reaction to random exogenous disturbances – that would be an almost tangible outcome for me, telling me I have made a significant step. When I see that I have repetitive, predictable patterns of facilitating the start of joint research projects in consortiums of scientific entities and business ones, then I know I have nailed down something in terms of project management. If I can start something like an investment fund for innovative technologies, then I definitely know I am on the right track.

As I want to program an intelligent structure, it is essentially an artificial neural network, possibly instrumented with additional functions, such as data collection, data cleansing etc. I know I want to understand very specifically what my neural network does. I want to understand every step it takes. To that purpose, I need to figure out a workable algorithm of my own, where I understand every line of code. It can be sub-optimally slow and limited in its real computational power, yet I need it. On the other hand, the Internet is more and more equipped with platforms and libraries in the form of digital clouds, such as IBM Watson, or TensorFlow, which provide optimized processes to build complex pieces of AI. I already know that being truly proficient in Data Science entails skills pertinent to using those cloud-based tools. My bottom line is that if I want to program an intelligent structure communicable and appealing to other people, I need to program it at two levels: as my own prototypic code, and as a procedure of using cloud-based platforms to replicate it.

At the juncture of those two how-will-I-know pieces of evidence, an idea emerges, a crazy one. What if I can program an intelligent structure which uncovers and reconstructs one or more alternative business models out of the available empirical data? Interesting. The empirical data I work the most with, as regards business models, is the data provided in the annual reports of publicly listed companies. Secondarily, data about financial markets sort of connects. My own experience as a small investor supplies me with an existential basis to back this external data, and that experience suggests defining a business model as a portfolio of assets combined with, broadly speaking, behavioural patterns both in people active inside the business model, thus running it and employed in it, and in people connected to that model from outside, as customers, suppliers, investors etc.

How will other people know I have what I want? The intelligent structure I will have programmed has to work across different individual environments, which is an elegant way of saying it should work on different computers. Logically, I can say I have clearly demonstrated to other people that I achieved what I wanted with that thing of collective intelligence when said other people are willing to try my algorithm, and successful at it. Here comes the point of willingness in other people. I think it is something like an existential thing across the board. When we want other people to try and do something, and they don’t, we are pissed. When other people want us to try and do something, and we don’t, we are pissed, and they are pissed. As regards my hypothesis of collective intelligence, I have already experienced that sort of intellectual barrier, when my articles get reviewed. Reviewers write that my hypothesis is interesting, yet not articulate and not grounded enough. Honestly, I can’t blame them. My feeling is that it is even hard to say that I have that hypothesis of collective intelligence. It is rather as if that hypothesis had me as its voice and speech. Crazy, I know, only this is how I feel about the thing, and I know by experience that good science (and good things, in general) turns up when I am honest with myself.

My point is that I feel I need to write a book about that concept of collective intelligence, in order to give a full explanation of my hypothesis. My observations about cities and their role in the human civilization make, for the moment, one of the most tangible topics I can attach the theoretical hypothesis to. Writing that book about cities, together with programming an intelligent structure, takes a different shade, now. It becomes a complex account of how we can deconstruct something – our own collective intelligence – which we know is there and yet, as we are inside that thing, we have a hard time describing it.

That book about cities, abundantly referring to my hypothesis of collective intelligence, could be one of the ways to convince other people to at least try what I propose. Thus, once again, I restate what I understand by intelligent structure. It is a structure which learns new patterns by experimenting with many alternative versions of itself, whilst staying internally coherent. I return to my ‘DU_DG’ database about cities (see ‘It is important to re-assume the meaning’) and I am re-assuming the concept of alternative versions, in an intelligent structure.

I have a dataset structured into n variables and m empirical observations. In my DU_DG database, as in many other economic datasets, distinct observations are defined as the state of a country in a given year. As I look at the dataset (metaphorically, it has content and meaning, but it does not have any physical shape save for the one my software supplies it with), and as I look at my thoughts (metaphorically, once again), I realize I have been subconsciously distinguishing two levels of defining an intelligent structure in that dataset, and, correspondingly, two ways of defining alternative versions thereof. At the first level, the entire dataset is supposed to be an intelligent structure and alternative versions thereof consist in alternative dichotomies of the type ‘one variable as output, i.e. as the desired outcome to optimize, and the remaining ones as instrumental input’. At this level of structural intelligence – by which term I understand the way of being in an intelligent structure – alternative versions are alternative orientations, and there are as many of them as there are variables.

The distinction into variables is largely, although not entirely, epistemic, and not ontological. The headcount of urban population is not a fundamentally different phenomenon from the surface of agricultural land. Yes, the units of measurement are different, i.e. people vs. square kilometres, but, ontologically, it is largely the same existential stuff, possible to describe as people living somewhere in large numbers and being successful at it. Historically, social scientists and governments alike have come to the conclusion, though, that these two metrics have a different meaning, and thus it comes in handy to distinguish them as semantic vessels to collect and convey information. The distinction of alternative orientations in an intelligent structure, supposedly represented in a dataset, is arbitrary and cognitive more than ontological. It depends on the number of variables we have. If I add variables to the dataset, e.g. by computing coefficients between the incumbent variables, I can create new orientations for the intelligent structure, i.e. new alternative versions to experiment with.
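That first level of defining alternative versions can be sketched directly in code. Below is a minimal sketch of my own (the function name, the toy columns and the figures are invented for illustration, not taken from any library): each quantitative variable in turn plays the output, the remaining ones play the input.

```python
import pandas as pd

def orientations(df, exclude=('Country', 'Year')):
    """List the alternative orientations of a dataset: each quantitative
    variable in turn as the desired outcome, the rest as instrumental input."""
    variables = [c for c in df.columns if c not in exclude]
    return [{'output': v, 'input': [w for w in variables if w != v]}
            for v in variables]

# a toy dataset with three quantitative variables yields three orientations
toy = pd.DataFrame({'Country': ['A'], 'Year': [2020],
                    'DU_DG': [3.1], 'Population': [5.0], 'GDP': [2.2]})
for o in orientations(toy):
    print(o['output'], '<-', o['input'])
```

Adding a derived variable to the frame automatically adds one more orientation, which is exactly the point made above: the number of alternative versions grows with the observational apparatus.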

The point which comes to my mind is that the complexity of an intelligent structure, at that first level, depends on the complexity of my observational apparatus. The more different variables I can distinguish, and measure as regards a given society, the more complexity I can give to the allegedly existing, collectively intelligent structure of that society.

Whichever combination ‘output variable vs. input variables’ I am experimenting with, there comes the second level of defining intelligent structures, i.e. that of defining them as separate countries. They are sort of local intelligent structures, and, at the same time, they are alternative experimental versions of the overarching intelligent structure to be found in the vector of variables. Each such local intelligent structure, with a flag, a national anthem, and a government, produces many alternative versions of itself in consecutive years covered by the window of observation I have in my dataset.

I can see a subtle distinction here. A country produces alternative versions of itself, in different years of its existence, sort of objectively and without giving a f**k about my epistemological distinctions. It just exists and tries to be good at it. Experimenting comes as natural in the flow of time. This is unguided learning. On the other hand, I produce different orientations of the entire dataset. This is guided learning. Now, I understand the importance of the degree of supervision in artificial neural networks.

I can see an important lesson for me, here. If I want to program intelligent structures ‘able to uncover and reconstruct the collective intelligence of a social structure out of available empirical data – namely to uncover and reconstruct the chief collective outcomes that structure is after, and its patterns of reaction to random exogenous disturbances’, I need to distinguish those two levels of learning in the first place, namely the unguided flow of existential states from the guided structuring into variables and orientations. When I have an empirical dataset and I want to program an intelligent structure able to deconstruct the collective intelligence represented in that dataset, I need to define accurately the basic ontological units, i.e. the fundamentally existing things, then I define alternative states of those things, and finally I define alternative orientations.

Now, I am contrasting. I pass from those abstract thoughts on intelligent structures to a quick review of my so-far learning to program those structures in Python. Below, I present that review as a quick list of separate files I created in JupyterLab, together with a quick characteristic of problems I am trying to solve in each of those files, as well as of the solutions found and not found.

>> Practice Dec 11 2020.ipynb.

In this file, I work with the IMF database WEOOct2020 ( ). I practiced reading complex datasets, with an artificially flattened structure. It is a table, in which index columns are used to add dimensions to an otherwise two-dimensional format. I practiced the ‘read_excel’ and ‘read_csv’ commands. On the whole, it seems that converting an Excel file to CSV and then reading the CSV in Python is a better method than reading the Excel file directly. Problems solved: a) cleansing the dataset of not-a-number components and successful conversion of initially ‘object’ columns into the desired ‘float64’ format b) setting descriptive indexes to the data frame c) listing unique labels from a descriptive index d) inserting new columns into the data frame e) adding (compounding) the contents of two existing, descriptive index columns into a third index column. Failures: i) reading data from an XML file ii) reading data from the SDMX format iii) transposing my data frame so as to put index values of economic variables as column names and years as index values in a column.
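Most of the problems solved above boil down to a handful of pandas idioms. A minimal sketch on a mock, flattened extract (column names and values are invented for illustration, not taken from the actual WEO file):

```python
import pandas as pd

# a tiny mock of a flattened layout: index columns plus a year column
df = pd.DataFrame({
    'Country': ['A', 'A', 'B'],
    'Subject': ['GDP', 'Broad money', 'GDP'],
    '2019': ['1.2', 'n/a', '3.4'],
})

# a) cleanse not-a-number markers and convert 'object' into 'float64'
df['2019'] = pd.to_numeric(df['2019'], errors='coerce')

# e) compound two descriptive columns into a third, single index column
df['Key'] = df['Country'] + ' | ' + df['Subject']

# b) set the compound column as a descriptive index
df = df.set_index('Key')

# c) list unique labels from a descriptive column
print(df['Country'].unique().tolist())

# d) insert a new column at a chosen position
df.insert(1, 'Source', 'WEOOct2020')
```

The ‘errors="coerce"’ argument is what turns the stray text markers into NaN instead of raising an exception, which is the whole trick of the object-to-float conversion.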

>> Practice Dec 8 2020.ipynb.

In this file, I worked with a favourite dataset of mine, the Penn Tables 9.1. ( ). I described my work with it in two earlier updates, namely ‘Two loops, one inside the other’, and ‘Mathematical distance’. I succeeded in creating an intelligent structure from that dataset. I failed at properly formatting the output of that structure and thus at comparing the cognitive value of different orientations I made it simulate.   

>> Practice with Mortality.ipynb.

I created this file as a first practice before working with the above-mentioned WEOOct2020 database. I took one dataset from the website of the World Bank, namely that pertinent to the coefficient of adult male mortality ( ). I practiced reading data from CSV files, and I unsuccessfully tried to stack the dataset, i.e. to transform columns corresponding to different years of observation into rows indexed with labels corresponding to years.   
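For the record, the stacking I failed at is, as far as I can tell, what pandas calls melting. A minimal sketch on a mock two-country extract (country labels and figures invented for illustration):

```python
import pandas as pd

# World Bank extracts come wide: one column per year of observation
wide = pd.DataFrame({'Country': ['A', 'B'],
                     '2018': [210.0, 180.0],
                     '2019': [205.0, 175.0]})

# melt() stacks the year columns into rows, indexed with year labels
long = wide.melt(id_vars='Country', var_name='Year', value_name='Mortality')
print(long)
```

Each ‘country <> year’ pair becomes one row, which is precisely the shape of observation the rest of my datasets use.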

>> Practice DU_DG.ipynb.

In this file, I am practicing with my own dataset pertinent to the density of urban population and its correlates. The dataset is already structured in Excel. I start practicing the coding of the same intelligent structure I made with Penn Tables, supposed to study the orientation of the societies studied. Same problems and same failures as with Penn Tables 9.1: for the moment, I cannot nail down the way to get output data in structures that allow full comparability. My columns tend to wander across the output data frames. In other words, the vectors of mean expected values produced by the code I made have a slightly different structure (just slightly, and sufficiently to be annoying) from the original dataset. I don’t know why, yet, and I don’t know how to fix it.
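One way to stop columns from wandering, I suppose, is to force every output frame into the column order of a reference frame with ‘reindex()’. A minimal sketch (frame contents invented for illustration):

```python
import pandas as pd

# column order of the original dataset, taken as the reference
original_columns = ['Country', 'Year', 'DU_DG', 'Population']

# a simulated output frame whose columns wandered out of order
output = pd.DataFrame({'Population': [2.2], 'Country': ['A'],
                       'DU_DG': [1.1], 'Year': [2020]})

# reindex() forces the output back into the reference column order,
# which makes frames from different runs fully comparable
aligned = output.reindex(columns=original_columns)
print(aligned.columns.tolist())
```

Any column absent from the reference list is dropped, and any missing one comes back filled with NaN, so the output structure is pinned down regardless of what the simulation produced.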

On the other hand, in that same file, I have been messing around a bit with algorithms based on the ‘scikit-learn’ library for Python. Nice graphs, and functions which I still need to understand.

>> Practice SEC Financials.ipynb.

Here, I work with data published by the US Securities and Exchange Commission, regarding the financials of individual companies listed in the US stock market ( ). The challenge here consists in translating data originally supplied in *.TXT files into numerical data frames in Python. The problem which I managed to solve, so far (this is the most recent piece of my programming), is the most elementary translation of TXT data into a Pandas data frame, using the ‘open()’ command, and the ‘f.readlines()’ one. Another small victory here is to read data from a sub-directory inside the working directory of JupyterLab, i.e. inside the root directory of my user profile. I used two methods of reading TXT data. Both worked, sort of. First, I used the following sequence:

>> with open('2020q3/num.txt') as f:

>>     numbers=f.readlines()

>> Numbers=pd.DataFrame(numbers)

… which, when checked with the ‘.info()’ command, yields:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2351641 entries, 0 to 2351640
Data columns (total 1 columns):
 #   Column  Dtype
---  ------  -----
 0   0       object
dtypes: object(1)
memory usage: 17.9+ MB

In other words, that sequence did not split the string of column names into separate columns, and the ‘Numbers’ data frame contains one column, in which every row is a long string structured with the ‘\’ separators. I tried to be smart with it. I did:

>> Numbers.to_csv('Num2') # I converted the Pandas data frame into a CSV file

>> Num3=pd.DataFrame(pd.read_csv('Num2', sep=';')) # …and I tried to read back from CSV, experimenting with different separators. None of it worked. With the ‘sep=’ argument in the command, I kept getting a parsing error, in the lines of ‘ParserError: Error tokenizing data. C error: Expected 1 fields in line 3952, saw 10’. When I didn’t use the ‘sep=’ argument, the command did not yield an error, yet it yielded the same long column of structured strings instead of many data columns.
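To my knowledge, the SEC financial statement data sets are tab-delimited, which would explain both failures: the separator to pass is ‘\t’, not ‘;’. A minimal sketch on a mock, abridged two-row sample of the num.txt layout (the call on the real downloaded file is shown in the comment; figures invented for illustration):

```python
import io
import pandas as pd

# a mock, abridged sample of the tab-delimited num.txt layout
raw = ("adsh\ttag\tvalue\n"
       "0001-20\tAssets\t885388000\n"
       "0001-20\tRevenues\t12000000\n")

# sep='\t' splits the columns directly; on the real file it would be e.g.
# numbers = pd.read_csv('2020q3/num.txt', sep='\t', low_memory=False)
numbers = pd.read_csv(io.StringIO(raw), sep='\t')
print(numbers.shape)
```

This spares both the manual ‘readlines()’ detour and the round trip through Excel.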

Thus, I gave up a bit, and I used Excel to open the TXT file, and to save a copy of it in the CSV format. Then, I just created a data frame from the CSV dataset, through the ‘NUM_from_CSV=pd.DataFrame(pd.read_csv('SEC_NUM.csv', sep=';'))’ command, which, checked with the ‘.info()’ command, yields:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1048575 entries, 0 to 1048574
Data columns (total 9 columns):
 #   Column    Non-Null Count    Dtype
---  ------    --------------    -----
 0   adsh      1048575 non-null  object
 1   tag       1048575 non-null  object
 2   version   1048575 non-null  object
 3   coreg     30131 non-null    object
 4   ddate     1048575 non-null  int64
 5   qtrs      1048575 non-null  int64
 6   uom       1048575 non-null  object
 7   value     1034174 non-null  float64
 8   footnote  1564 non-null     object
dtypes: float64(1), int64(2), object(6)
memory usage: 72.0+ MB

The ‘tag’ column in this data frame contains the names of financial variables ascribed to companies identified with their ‘adsh’ codes. I experience the same challenge, and, so far, the same failure as with the WEOOct2020 database from the IMF, namely translating the different values of a descriptive index into a dictionary, and then, in the next step, flipping the database so as to make those different index categories into separate columns (variables).
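That flipping is, as far as I can tell, what pandas calls pivoting. A minimal sketch on mock ‘adsh’ / ‘tag’ / ‘value’ triplets (codes and figures invented for illustration):

```python
import pandas as pd

# long format: one row per (company, financial variable) pair
num = pd.DataFrame({
    'adsh': ['0001-20', '0001-20', '0002-20', '0002-20'],
    'tag': ['Assets', 'Revenues', 'Assets', 'Revenues'],
    'value': [885388000.0, 108000000.0, 1589422000.0, 60000000.0],
})

# flip the 'tag' categories into columns: one row per company,
# one column per financial variable
wide = num.pivot_table(index='adsh', columns='tag', values='value',
                       aggfunc='first')
print(wide)
```

The ‘aggfunc="first"’ argument settles what to do when a company reports the same tag more than once, which is a real possibility in filings data.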

As I have passed in review that programming of mine, I have become aware that reading and properly structuring different formats of data is the sensory apparatus of the intelligent structure I want to program. Operations of data cleansing and data formatting are the fundamental skills I need to develop in programming. Contrary to what I expected a few weeks ago, when I was taking on programming in Python, elaborate mathematical constructs are simpler to code than I thought they would be. What might be harder, mind you, is to program them so as to optimize computational efficiency with large datasets. Still, the very basic, boots-on-the-ground structuring of data seems to be the name of the game for programming intelligent structures.

It is important to re-assume the meaning

It is Christmas 2020, late in the morning. I am thinking, sort of deeply. It is a dysfunctional tradition to make, by the end of the year, resolutions for the coming year. Resolutions which we obviously don’t stick to long enough to see them bring anything substantial. Yet, it is a good thing to pass in review the whole passing year, distinguish my own f**k-ups from my valuable actions, and use it as learning material for the incoming year.

What I have been doing consistently for the past year is learning new stuff: investment in the stock market, distance teaching amidst epidemic restrictions, doing research on collective intelligence in human societies, managing research projects, programming, and training consistently while fasting. Finally, and sort of overarchingly, I have learnt the power of learning by solving specific problems and writing about myself mixing successes and failures as I am learning.

Yes, it is precisely the kind of journal you can expect in what we tend to label as girls’ readings, sort of ‘My dear journal, here is what happened today…’. I keep my dear journal focused mostly on my, broadly speaking, professional development. Professional development combines with personal development, for me, though. I discovered that when I want to achieve some kind of professional success, would it be academic or business, I need to add a few new arrows to my personal quiver.

Investing in the stock market and training while fasting are, I think, what I have had the most complete cycle of learning with. Strange combination? Indeed, a strange one, with a surprising common denominator: the capacity to control my emotions, to recognize my cognitive limitations, and to acknowledge the payoff from both. Financial decisions should be cold and calculated. Yes, they should, and sometimes they are, but here comes a big discovery of mine: when I start putting my own money into investment positions in the stock market, emotions flare in me so strongly that I experience something like tunnel vision. What looked like perfectly rational inference from numbers, just minutes ago, now suddenly looks like a jungle, with both game and tigers in it. The strongest emotion of all, at least in my case, is the fear of loss, and not the greed for gain. Yes, it goes against a common stereotype, and yet it is true. Moreover, I discovered that properly acknowledged and controlled, the fear of loss is a great emotional driver for good investment decisions, and, as a matter of fact, it is much better an emotional driver than avidity for gain. I know that I am well off when I keep the latter sort of weak and shy, expecting gains rather than longing for them, if you catch my drift.

Here comes the concept of good investment decisions. As this year 2020 comes to an end, my return on cash invested over the course of the year is 30% with a little change. Not bad at all, compared to a bank deposit (+1,5%) or to sovereign bonds (+4,5% max). I am wrapping my mind around the second most fundamental question about my investment decisions this year – after, of course, the question about return on investment – and that second question is ontological: what have my investment decisions actually been? What has been their substance? The most general answer is tolerable complexity with intuitive hedging and a pinch of greed. Complexity means that I have progressively passed from the otherwise naïve expectation of one perfect hit to a portfolio of investment positions. Thinking intuitively in terms of a portfolio has taught me a just as intuitive approach to hedging my risks. Now, when I open one investment position, I already think about another possible one, either to reinforce my glide on the wave crest I intend to ride, or to compensate the risks contingent to seeing my ass gliding off and down from said wave crest.

That portfolio thinking of mine happens in layers, sort of. I have a portfolio of industries, and that seems to be the basic structuring layer of my decisions. I think I can call myself a mid-term investor. I have learnt to spot and utilise mid-term trends of interest that investors in the stock market attach to particular industries. I noticed there are cyclical fashion seasons in the stock market, in that respect. There is a cyclically recurrent biotech season, due to the pandemic. There is just as cyclical a fashion for digital tech, and another one for renewable energies (photovoltaic, in particular). Inside the digital tech, there are smaller waves of popularity as regards the gaming business, others connected to FinTech etc.

Cyclicality means that prices of stock in those industries grow for some time, ranging, by my experience, from 2 to 13 weeks. Riding those waves means jumping on and off at the right moment. The right moment for jumping on is as early as possible after the trend starts to ascend, and the right moment for jumping off is just as early as possible after it shows signs of durable descent.

The ‘durable’ part is tricky, mind you. I saw many episodes, and during some of them I shamefully yielded to short-termist panic, when the trend curves down just for a few days before rocketing up again. Those episodes show well what it means in practical terms to face ‘technical factors’. The stock market is like an ocean. There are spots of particular fertility, and big predators tend to flock just there. In the stock market, just as in the ocean, you have bloody big sharks swimming around, and you’d better hold on when they start feeding, ‘cause they feed just as real sharks do: they hit quickly, cause abundant bleeding, and then just wait until their prey bleeds out enough to be defenceless.

When I see, for example, a company like the German BionTech ( ) suddenly losing value in the stock market, whilst the very vaccine they ganged up with Pfizer to make is being distributed across the world, I am like: ‘Wait a minute! Why would the stock price of a super-successful, highly innovative business fall just at the moment when they are starting to consume the economic fruit of their innovation?’. The only explanation is that sharks are hunting. Your typical stock market shark hunts in a disgusting way, by eating, vomiting and then eating their vomit back with a surplus. It bites a big chunk of a given stock, chews it for a moment, spits it out quickly – which pushes the price down a bit – then eats back its own vomit of stock, with a tiny surplus acquired at the previously down-driven price, and then it repeats. Why wouldn’t it repeat, as long as the thing works?

My personal philosophy, which, unfortunately, sometimes I deviate from when my emotions prevail, is just to sit and wait until those big sharks end their feeding cycle. This is another useful thing to know about big predators in the stock market: they hunt similarly to big predators in nature. They have a feeding cycle. When they have killed and consumed a big prey, they rest, as they are both replete with eating and down on energy. They need to rebuild their capital base.      

My reading of the stock market is that those waves of financial interest in particular industries are based on expectations as for real business cycles going on out there. Of course, in the stock market, there is always the phenomenon of subsidiary interest: I invest in companies which I expect other investors to invest in, as well, and, consequently, whose stock price I expect to grow. Still, investors in the stock market are much more oriented on fundamental business cycles than non-financial people think. When I invest in the stock of a company, and I know for a fact that many other investors think the same, I expect that company to do something constructive with my trust. I want to see those CEOs take bold decisions as for real investment in technological assets. When they really do so, I stay with them, i.e. I hold that stock. This is why I keep holding the stock of Tesla even amidst episodes of wild swings in its price. I simply know Elon Musk will always come up with something which, for him, are business concepts, and for the common of mortals are science-fiction. If, on the other hand, I see those CEOs just sitting and gleaning benefits from trading their preferential shares, I leave.

Here I connect to another thing I started to learn during 2020: managing research projects. At my university, I have been assigned this specific job, and I discovered something which I did not expect: there is more money than ideas, out there. There is, actually, plenty of capital available from different sources, to finance innovative science. The tricky part is to translate innovative ideas into an intelligible, communicable form, and then into projects able to convince people with money. The ‘translating’ part is surprisingly complex. I can see many sparse, sort of semi-autonomous ideas in different people, and I still struggle with putting those people together, into some sort of team, or, failing a team, into a network, and with making them mix their respective ideas into one, big, articulate concept. I have been reading for years about managing R&D in corporate structures, about how complex and artful it is to manage R&D efficiently, and now, I am experiencing it in real life. An interesting aspect of that is the writing of preliminary contracts, the so-called ‘Non-Disclosure Agreements’ AKA NDAs, the signature of which is sort of a trigger for starting serious networking between different agents of an R&D project.

As I am wrapping my mind around those questions, I meditate over the words written by Joseph Schumpeter, in his Business Cycles: “Whenever a new production function has been set up successfully and the trade beholds the new thing done and its major problems solved, it becomes much easier for other people to do the same thing and even to improve upon it. In fact, they are driven to copying it if they can, and some people will do so forthwith. It should be observed that it becomes easier not only to do the same thing, but also to do similar things in similar lines—either subsidiary or competitive ones—while certain innovations, such as the steam engine, directly affect a wide variety of industries. This seems to offer perfectly simple and realistic interpretations of two outstanding facts of observation : First, that innovations do not remain isolated events, and are not evenly distributed in time, but that on the contrary they tend to cluster, to come about in bunches, simply because first some, and then most, firms follow in the wake of successful innovation ; second, that innovations are not at any time distributed over the whole economic system at random, but tend to concentrate in certain sectors and their surroundings”. (Business Cycles, Chapter III HOW THE ECONOMIC SYSTEM GENERATES EVOLUTION, The Theory of Innovation). In the Spring, when the pandemic was deploying its wings for the first time, I had a strong feeling that medicine and biotechnology will be the name of the game in technological change for at least a few years to come. Now, as strange as it seems, I have a vivid confirmation of that in my work at the university. Conceptual balls which I receive and which I do my best to play out further in the field come almost exclusively from the faculty of medical sciences. Coincidence? Go figure…

I am developing along two other avenues: my research on cities and my learning of programming in Python. I have been doing research on cities as manifestations of collective intelligence, and I have been doing it for a while. See, for example, ‘Demographic anomalies – the puzzle of urban density’ or ‘The knowingly healthy people’. As I have been digging down this rabbit hole, I have created a database, which, for working purposes, I call ‘DU_DG’. DU_DG is a coefficient of relative density in population, which I came up with some day and which keeps puzzling me. Just to announce the colour, as we say in Poland when playing cards, ‘DU’ stands for the density of urban population, and ‘DG’ is the general density of population. The ‘DU_DG’ coefficient is a ratio of the two, namely DU/DG, or, in other words, it is the density of urban population denominated in the units of general density in population. In still other words, if we take the density of population as a fundamental metric of human social structures, the DU_DG coefficient tells how much denser urban population is, as compared to the mean density, rural settlements included.
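In code, the coefficient is a simple ratio of ratios. A minimal sketch with hypothetical column names and invented figures (my actual dataset names its columns differently):

```python
import pandas as pd

# hypothetical column names and figures, for illustration only
df = pd.DataFrame({
    'urban population': [5.0e6],
    'urban land km2': [2000.0],
    'total population': [10.0e6],
    'total land km2': [300000.0],
})

DU = df['urban population'] / df['urban land km2']    # urban density
DG = df['total population'] / df['total land km2']    # general density
df['DU_DG'] = DU / DG  # how much denser urban settlement is than average
print(df['DU_DG'].iloc[0])
```

With these invented figures, urban dwellers live 75 times denser than the country-wide average, which is the kind of demographic anomaly the coefficient is meant to capture.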

I want to rework through my DU_DG database in order both to practice my programming skills, and to reassess the main axes of research on the collective intelligence of cities. I open JupyterLab from my Anaconda panel, and I create a new Notebook with Python 3 as its kernel. I prepare my dataset. Just in case, I make two versions: one in Excel, another one in CSV. I replace decimal commas with decimal points; I know by experience that Python has issues with commas. In human lingo, a comma is a short pause for taking like half a breath before we continue uttering the rest of the sentence. From there, we took the comma into maths, as a decimal separator. In Python, as in finance, we treat the decimal point as such, i.e. as a point. The comma is a separator.
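As a side note, pandas can parse decimal commas directly, through the ‘decimal’ argument of ‘read_csv’, which would spare the manual replacement. A minimal sketch on an invented two-row sample:

```python
import io
import pandas as pd

# a mock semicolon-separated extract with decimal commas, Polish style
raw = "Country;DU_DG\nPoland;3,14\nFrance;2,72\n"

# decimal=',' makes pandas read '3,14' as the float 3.14
df = pd.read_csv(io.StringIO(raw), sep=';', decimal=',')
print(df['DU_DG'].dtype)
```

On a real file the same call would just take the file path instead of the ‘io.StringIO’ wrapper.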

Anyway, I have that notebook in JupyterLab, and I start by piling up what I think I will need in terms of libraries:

>> import numpy as np

>> import pandas as pd

>> import os

>> import math

I place my database in the root directory of my user profile, which is, by default, the working directory of Anaconda, and I check if my database is visible for Python:

>> os.listdir()

It is there, in both versions, Excel and CSV. I start with reading from Excel:

>> DU_DG_Excel=pd.DataFrame(pd.read_excel('Dataset For Perceptron.xlsx', header=0))

I check with ‘.info()’. I get:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1155 entries, 0 to 1154
Data columns (total 10 columns):
 #   Column                                        Non-Null Count  Dtype
---  ------                                        --------------  -----
 0   Country                                       1155 non-null   object
 1   Year                                          1155 non-null   int64
 2   DU_DG                                         1155 non-null   float64
 3   Population                                    1155 non-null   int64
 4   GDP (constant 2010 US$)                       1042 non-null   float64
 5   Broad money (% of GDP)                        1006 non-null   float64
 6   urban population absolute                     1155 non-null   float64
 7   Energy use (kg of oil equivalent per capita)  985 non-null    float64
 8   agricultural land km2                         1124 non-null   float64
 9   Cereal yield (kg per hectare)                 1124 non-null   float64
dtypes: float64(7), int64(2), object(1)
memory usage: 90.4+ KB

Cool. Exactly what I wanted. Now, if I want to use this database as a simulator of collective intelligence in human societies, I need to assume that each separate ‘country <> year’ observation is a distinct local instance of an overarching intelligent structure. My so-far experience with programming opens up on a range of actions that structure is supposed to perform. It is supposed to differentiate itself into the desired outcomes, on the one hand, and, on the other hand, the instrumental epistatic traits manipulated and adjusted in order to achieve those outcomes.

As I pass in review my past research on the topic, a few big manifestations of collective intelligence in cities come to my mind. Creation and development of cities as purposeful demographic anomalies is the first manifestation. This is an otherwise old problem in economics. Basically, people and resources they use should be disposed evenly over the territory those people occupy, and yet they aren’t. Even with a correction taken for physical conditions, such as mountains or deserts, we tend to like forming demographic anomalies on the landmass of Earth. Those anomalies have one obvious outcome, i.e. the delicate balance between urban land and agricultural land, which is a balance between dense agglomerations generating new social roles due to abundant social interactions, on the one hand, and the local food base for people endorsing those roles. The actual difference between cities and the surrounding countryside, in terms of social density, is very idiosyncratic across the globe and seems to be another aspect of intelligent collective adaptation.

Mankind is becoming more and more urbanized, i.e. a consistently growing percentage of people live in cities (World Bank 1[1]). In 2007 – 2008, the coefficient of urbanization topped 50%, and it has kept progressing since. As there are more and more of us, humans, on the planet, we concentrate more and more in urban areas. That process defies preconceived ideas about land use. A commonly used narrative is that cities keep growing out into their once-non-urban surroundings, which is frequently confirmed by anecdotal, local evidence of particular cities effectively sprawling into the neighbouring rural land. Still, when data based on satellite imagery is brought up, and total urban land area on Earth is measured as the total surface of peculiar agglomerations of man-made structures and night-time lights, that total area seems to be stationary, or, at least, to have been stationary for the last 30 years (World Bank 2[2]). The geographical distribution of urban land over the entire landmass of Earth does change, yet the total seems to be pretty constant. In parallel, the total surface of agricultural land on Earth has been growing, although at a pace far from steady and predictable (World Bank 3[3]).

There is a theory implied in the above-cited methodology of measuring urban land based on satellite imagery. Cities can be seen as demographic anomalies with a social purpose, just as Fernand Braudel used to state it (Braudel 1985[4]): ‘Towns are like electric transformers. They increase tension, accelerate the rhythm of exchange and constantly recharge human life. […]. Towns, cities, are turning-points, watersheds of human history. […]. The town […] is a demographic anomaly’. The basic theoretical thread of this article consists in viewing cities as complex technologies, for one, and in studying their transformations as a case of technological change. Logically, this is a case of technological change occurring by agglomeration and recombination. Cities can be studied as demographic anomalies with the specific purpose to accommodate a growing population with an equally expanding catalogue of new social roles, possible to structure into non-violent hierarchies. That path of thinking is present, for example, in the now-classical work by Arnold Toynbee (Toynbee 1946[5]), and in the even more classical take by Adam Smith (Smith 1763[6]). Cities can literally work as factories of new social roles due to intense social interactions. The greater the density of population, the greater the likelihood of both new agglomerations of technologies being built, and new, adjacent social roles emerging. A good example of that special urban function is the interaction inside age groups. Historically, cities have allowed much more abundant interactions among young people (under the age of 25) than rural environments have. That, in turn, favours the emergence of social roles based on the typically adolescent, high appetite for risk and immediate rewards (see for example: Steinberg 2008[7]).
Recent developments in neuroscience, on the other hand, suggest that abundant social interactions in the urban environment have a deep impact on the neuroplastic change in our brains, and even on the phenotypical expression of human DNA (Ehninger et al. 2008[8]; Bavelier et al. 2010[9]; Day & Sweatt 2011[10]; Sweatt 2013[11]).

At the bottom line of all those theoretical perspectives, cities are quantitatively different from the countryside by their abnormal density of population. Throughout this article, the symbol [DU/DG] designates the density of urban population denominated in units of (i.e. divided by) the general density of population. It is computed by combining the above-cited coefficient of urbanization (World Bank 1) with the headcount of population (World Bank 4[12]) and with the surface of urban land (World Bank 2). The general density of population is taken straight from official statistics (World Bank 5[13]).
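As a sanity check of that definition, the [DU/DG] coefficient for a single country-year can be sketched as follows. All figures below are made up, standing in for the World Bank series cited above:

```python
# A back-of-the-envelope sketch of the [DU/DG] coefficient, with made-up
# figures standing in for the World Bank series cited in the text.
population = 60_000_000       # World Bank 4: total headcount
urbanization = 0.55           # World Bank 1: share of people living in cities
urban_land_km2 = 9_000.0      # World Bank 2: satellite-based urban area
total_land_km2 = 300_000.0    # land area underlying general density (World Bank 5)

density_urban = (urbanization * population) / urban_land_km2  # people per km2 of urban land
density_general = population / total_land_km2                 # people per km2 overall

DU_DG = density_urban / density_general
print(round(DU_DG, 2))
```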

The [DU/DG] coefficient sits within the theoretical perspective of cities as demographic anomalies with a purpose, and it can be considered a measure of social difference between cities and the countryside. It displays intriguing quantitative properties. Whilst growing steadily over time at the globally aggregate level, from 11,9 in 1961 to 19,3 in 2018, it displays significant disparity across space. Such countries as Mauritania or Somalia display a [DU/DG] > 600, whilst the United Kingdom or Switzerland are barely above [DU/DG] = 3. In the 13 smallest national entities in the world, such as Tonga, Puerto Rico or Grenada, [DU/DG] falls below 1. In other words, in those ultra-small national structures, the method of assessing urban space by satellite-imagery-based agglomeration of night-time lights fails utterly. These communities display a peculiar, categorially idiosyncratic spatial pattern of settlement. The cross-sectional variability of [DU/DG] (i.e. its standard deviation across space divided by its cross-sectional mean value) reaches 8,62, and yet some 70% of mankind lives in countries ranging across the 12,84 ≤ [DU/DG] ≤ 23,5 interval.
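The cross-sectional variability used here is simply a coefficient of variation. A minimal numpy sketch, with illustrative values in place of the actual country-level [DU/DG] data:

```python
import numpy as np

# Cross-sectional variability of [DU/DG] as defined in the text: standard
# deviation across countries divided by the cross-sectional mean.
# The values below are illustrative, not the actual country data.
du_dg = np.array([3.1, 3.4, 12.8, 18.0, 23.5, 610.0, 0.8])

cv = du_dg.std() / du_dg.mean()  # coefficient of variation
print(round(cv, 2))
```

A single outlier country (the 610,0 above) is enough to blow the coefficient of variation up, which is exactly the kind of cross-sectional disparity the text describes.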

Correlations which the [DU/DG] coefficient displays at the globally aggregate level (i.e. at the scale of the whole planet) are even more puzzling. When benchmarked against the global real output in constant units of value (World Bank 6[14]), the time series of the aggregate, global [DU/DG] displays a Pearson correlation of r = 0,9967. On the other hand, the same type of Pearson correlation with the relative supply of money to the global economy (World Bank 7[15]) yields r = 0,9761. As the [DU/DG] coefficient is supposed to represent the relative social difference between cities and the countryside, a look at the latter is instructive. The [DU/DG] Pearson-correlates with the global area of agricultural land (World Bank 8[16]) at r = 0,9271, and with the average global yield of cereals, in kg per hectare (World Bank 9[17]), at r = 0,9858. Those strong correlations of the [DU/DG] coefficient with metrics pertinent to the global food base match its correlation with the energy base. When Pearson-correlated with the global average consumption of energy per capita (World Bank 10[18]), [DU/DG] proves significantly covariant, at r = 0,9585. All that kept in mind, it is probably not much of a surprise to see the global aggregate [DU/DG] Pearson-correlated with the global headcount of population (World Bank 11[19]) at r = 0,9954.
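Pearson correlations of that kind can be computed with numpy’s `corrcoef`. The two annual series below are toy stand-ins, not the actual World Bank data; the point is only the mechanics of the calculation:

```python
import numpy as np

# Two toy annual time series: a [DU/DG]-like aggregate and an output-like
# aggregate. Both are made up; both trend upward, as in the text.
du_dg = np.array([11.9, 13.0, 14.2, 15.5, 16.8, 18.1, 19.3])
real_output = np.array([20.0, 24.0, 29.0, 35.0, 41.0, 48.0, 55.0])

# Pearson correlation coefficient between the two series.
r = np.corrcoef(du_dg, real_output)[0, 1]
print(round(r, 4))
```

Two series that both grow steadily over time will almost always Pearson-correlate strongly, which is one reason to treat the near-unity coefficients above with some caution.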

It is important to restate the meaning of the [DU/DG] coefficient. It is essentially a metric of density in population, and density has abundant ramifications. The more people live per 1 km2, the more social interactions occur on that same square kilometre. Social interactions mean a lot. They mean learning by civilized rivalry. They mean transactions and markets as well. The greater the density of population, the greater the probability of new skills emerging, which possibly translates into new social roles, new types of business and new technologies. When two types of human settlements coexist, displaying very different densities of population, i.e. type A being many times denser than type B, type A works like a factory of patterns (new social roles and new markets), whilst type B is the supplier of raw resources. The progressively growing global average [DU/DG] means that, at the scale of the human civilization, that polarity of social functions becomes more accentuated.

The [DU/DG] coefficient bears strong marks of a statistical stunt. It is based on a truly risky assumption, advanced implicitly through the World Bank’s data, that the total surface of urban land on Earth has remained constant, at least over the last three decades. Moreover, denominating the density of urban population in units of general density of population was purely intuitive on the author’s part, and, as a matter of fact, other meaningful denominators can easily come to one’s mind. Still, with all that wobbly theoretical foundation, the [DU/DG] coefficient seems to inform about a significant, structural aspect of human societies. The Pearson correlations which the global aggregate of that coefficient yields with the fundamental metrics of the global economy are of an almost uncanny strength for social sciences, especially given the strong cross-sectional disparity in the [DU/DG].

The relative social difference between cities and the countryside, measurable with the gauge of the [DU/DG] coefficient, seems to be a strongly idiosyncratic adaptive mechanism in human societies, and this mechanism seems to be correlated with quantitative growth in population, real output, production of food, and the consumption of energy. That could be a manifestation of tacit coordination, where a growing human population triggers an increasing pace of emergence of new social roles by stimulating urban density. As regards energy, the global correlation between the increasing [DU/DG] coefficient and the average consumption of energy per capita interestingly connects with a stream of research which postulates intelligent collective adaptation of human societies to the existing energy base, including intelligent spatial re-allocation of energy production and consumption (Leonard, Robertson 1997[20]; Robson, Wood 2008[21]; Russon 2010[22]; Wasniewski 2017[23], 2020[24]; Andreoni 2017[25]; Heun et al. 2018[26]; Velasco-Fernández et al. 2018[27]).

It is interesting to investigate how smart human societies are in shaping their idiosyncratic social difference between cities and the countryside. This specific path of research is pursued, further in this article, through the verification and exploration of the following working hypothesis: ‘The collective intelligence of human societies optimizes social interactions in the view of maximizing the absorption of energy from the environment’.

[1] World Bank 1:

[2] World Bank 2:

[3] World Bank 3:

[4] Braudel, F. (1985). Civilisation and Capitalism 15th and 18th Century–Vol. I: The Structures of Everyday Life, Translated by S. Reynolds, Collins, London, pp. 479 – 482

[5] Royal Institute of International Affairs, Somervell, D. C., & Toynbee, A. (1946). A Study of History. By Arnold J. Toynbee… Abridgement of Volumes I-VI (VII-X.) by DC Somervell. Oxford University Press., Section 3: The Growths of Civilizations, Chapter X.

[6] Smith, A. (1763-1896). Lectures on justice, police, revenue and arms. Delivered in the University of Glasgow in 1763, published by Clarendon Press in 1896, pp. 9 – 20

[7] Steinberg, L. (2008). A social neuroscience perspective on adolescent risk-taking. Developmental review, 28(1), 78-106.

[8] Ehninger, D., Li, W., Fox, K., Stryker, M. P., & Silva, A. J. (2008). Reversing neurodevelopmental disorders in adults. Neuron, 60(6), 950-960.

[9] Bavelier, D., Levi, D. M., Li, R. W., Dan, Y., & Hensch, T. K. (2010). Removing brakes on adult brain plasticity: from molecular to behavioral interventions. Journal of Neuroscience, 30(45), 14964-14971.

[10] Day, J. J., & Sweatt, J. D. (2011). Epigenetic mechanisms in cognition. Neuron, 70(5), 813-829.

[11] Sweatt, J. D. (2013). The emerging field of neuroepigenetics. Neuron, 80(3), 624-632.

[12] World Bank 4:

[13] World Bank 5:

[14] World Bank 6:

[15] World Bank 7:

[16] World Bank 8:

[17] World Bank 9:

[18] World Bank 10:

[19] World Bank 11:

[20] Leonard, W.R., and Robertson, M.L. (1997). Comparative primate energetics and hominoid evolution. Am. J. Phys. Anthropol. 102, 265–281.

[21] Robson, S.L., and Wood, B. (2008). Hominin life history: reconstruction and evolution. J. Anat. 212, 394–425

[22] Russon, A. E. (2010). Life history: the energy-efficient orangutan. Current Biology, 20(22), pp. 981-983.

[23] Waśniewski, K. (2017). Technological change as intelligent, energy-maximizing adaptation. Energy-Maximizing Adaptation (August 30, 2017).

[24] Wasniewski, K. (2020). Energy efficiency as manifestation of collective intelligence in human societies. Energy, 191, 116500.

[25] Andreoni, V. (2017). Energy Metabolism of 28 World Countries: A Multi-scale Integrated Analysis. Ecological Economics, 142, 56-69

[26] Heun, M. K., Owen, A., & Brockway, P. E. (2018). A physical supply-use table framework for energy analysis on the energy conversion chain. Applied Energy, 226, 1134-1162

[27] Velasco-Fernández, R., Giampietro, M., & Bukkens, S. G. (2018). Analyzing the energy performance of manufacturing across levels using the end-use matrix. Energy, 161, 559-572