I think I have found, while writing my last update (‘Cultural classes’), another piece of the puzzle which I need to assemble in order to finish writing my book on collective intelligence. I think I have nailed down the general scientific interest of the book, i.e. the reason why my fellow scientists should even bother to have a look at it. That reason is the possibility of gaining deep insight into various quantitative models used in social sciences, with a particular emphasis on the predictive power of those models in the presence of exogenous stressors, and, digging further, the representativeness of those models as simulators of social reality.

Let’s have a look at one quantitative model, just one picked at random (well, almost at random): autoregressive conditional heteroskedasticity AKA ARCH (https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity). It goes as follows. I have a process, i.e. a time series of a quantitative variable. I compute the mean expected value in that time series, which, in plain human, means the arithmetical average of all the observations in that series. In even plainer human, the one we speak after having watched a lot of YouTube, it means that we sum up the values of all the consecutive observations in that time series and we divide the so-obtained total by the number of observations.

Mean expected values have that timid charm of not existing, i.e. when I compute the mean expected value in my time series, none of the observations will be exactly equal to it. Each observation **t** will return a residual error **ε_{t}**. The **ARCH** approach assumes that **ε_{t}** is the product of two factors, namely of the time-dependent standard deviation **σ_{t}** and a factor of white noise **z_{t}**. Long story short, we have **ε_{t} = σ_{t}z_{t}**.

The time-dependent standard deviation shares the common characteristic of all standard deviations, namely it is the square root of time-dependent variance: **σ_{t} = [(σ_{t})^{2}]^{1/2}**. In the standard ARCH(q) specification, that time-dependent variance is computed from the q most recent squared errors: **σ_{t}^{2} = α_{0} + α_{1}ε_{t-1}^{2} + … + α_{q}ε_{t-q}^{2}**, with α_{0} > 0 and all the other α coefficients non-negative.
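To make the mechanics tangible, here is a minimal sketch of the simplest case, ARCH(1), where the variance at time t depends only on the previous squared error: σ_{t}^{2} = α_{0} + α_{1}ε_{t-1}^{2}. The function name `simulate_arch1`, the coefficient values, the Gaussian white noise and the starting error of zero are all my illustrative assumptions, not anything estimated from real data.

```python
import math
import random


def simulate_arch1(n, alpha0=0.2, alpha1=0.5, seed=42):
    """Simulate n residual errors eps_t = sigma_t * z_t with an ARCH(1) variance.

    alpha0, alpha1 and the seed are arbitrary illustrative values.
    """
    rng = random.Random(seed)
    eps_prev = 0.0                # assumed starting error, for illustration only
    errors, sigmas = [], []
    for _ in range(n):
        variance = alpha0 + alpha1 * eps_prev ** 2   # sigma_t^2 driven by the last error
        sigma = math.sqrt(variance)                  # time-dependent standard deviation
        z = rng.gauss(0.0, 1.0)                      # white-noise factor z_t (assumed Gaussian)
        eps = sigma * z                              # eps_t = sigma_t * z_t
        errors.append(eps)
        sigmas.append(sigma)
        eps_prev = eps
    return errors, sigmas


errors, sigmas = simulate_arch1(1000)
# The spread of sigma_t shows how much the volatility itself moves around.
print(min(sigmas), max(sigmas))
```

A big error at time t-1 inflates the standard deviation at time t, which is what produces the clusters of calm and turbulence that ARCH is meant to capture.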

Against that general methodological background, many variations arise, especially as regards the mean expected value which everything else is wrapped around. It can be a constant value, i.e. computed for the entire time-series once and for all. We can allow the time series to extend, and then each extension leads to the recalculation of the mean expected value, including the new observation(s). We can make the mean expected value a moving average over a specific window in time.
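The three variations on the mean expected value can be sketched as follows. The function names and the sample `prices` are hypothetical, and the moving average here simply uses a shorter window at the start of the series, which is one possible convention among several.

```python
from statistics import fmean


def constant_mean(series):
    """One mean for the entire series, computed once and for all."""
    m = fmean(series)
    return [m] * len(series)


def expanding_mean(series):
    """Mean recalculated each time the series is extended by one observation."""
    out, total = [], 0.0
    for count, value in enumerate(series, start=1):
        total += value
        out.append(total / count)
    return out


def moving_mean(series, window):
    """Mean over a trailing window; early values use whatever history exists."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(fmean(chunk))
    return out


prices = [10.0, 12.0, 11.0, 15.0, 14.0]   # made-up numbers
print(constant_mean(prices))
print(expanding_mean(prices))
print(moving_mean(prices, window=3))
```

Each choice yields a different series of residual errors from the same observations, so the choice of the mean is already a modelling decision.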

Before I dig further into the underlying assumptions of ARCH, one reminder is in order: I am talking about social sciences, and about the application of ARCH to all kinds of crazy stuff that we, humans, do collectively. All the equations and conditions phrased out above apply to collective human behaviour. The next step in understanding ARCH, in the specific context of social sciences, is that ARCH makes sense only when the measurable attributes of our collective human behaviour really oscillate and change. When I have, for example, a trend in the price of something, and that trend is essentially smooth, without much jaggedness jumping to the eye, ARCH is pretty much pointless. On the other hand, that analytical approach – where each observation in the real measurable process which I observe is par excellence a deviation from the expected state – gains in cognitive value as the process in question becomes increasingly jagged and bumpy.

A brief commentary on the very name of the method might be interesting. Strictly speaking, the term ‘heteroskedasticity’ means that the dispersion of real observations around the mean expected value is not constant: in some stretches of the series they stay close to it, in others they scatter far away. In ARCH, that changing dispersion depends on the size of recent errors, hence ‘autoregressive conditional’. Before I go down this rabbit hole, another assumption is worth deconstructing. If I deem a phenomenon to be describable as white noise, AKA **z_{t}**, I assume there is no pattern in the occurrence thereof: no value of it can be predicted from the previous ones. It is the ‘Who knows?’ state of reality in its purest form.

White noise is at the very basis of the way we experience reality. This is pure chaos. We make distinctions in this chaos; we group phenomena, and we assess the probability of each newly observed phenomenon falling into one of the groups. Our essential cognition of reality assumes that in any given pound of chaos, there are a few ounces of order, and a few residual ounces of chaos. Then we have the ‘Wait a minute!’ moment and we further decompose the residual ounces of chaos into some order and an even more residual chaos. From there, we can go *ad infinitum*, sequestrating streams of regularity and order out of the essentially chaotic flow of reality. I would argue that the book of Genesis in the Old Testament is a poetic, metaphorical account of the way that the human mind cuts layers of intelligible order out of the primordial chaos.

Seen from a slightly different angle, it means that white noise **z_{t}** can be interpreted as an error in itself, because it is essentially a departure from the nicely predictable process **ε_{t} = σ_{t}**, i.e. where residual departure from the mean expected value is equal to the mean expected departure from the mean expected value. Being a residual error, **z_{t}** can be factorized into **z_{t} = σ’_{t}z’_{t}**, and, once again, that factorization can go all the way down to the limits of observability as regards the phenomena studied.

At this point, I am going to turn the whole reasoning on its head, as regards white noise. It is because I know and use a lot the same concept, just under a different name, namely that of **mean-reverted value**. I use mean-reversion a lot in my investment decisions in the stock market, with a very simple logic: when I am deciding to buy or sell a given stock, my purely technical concern is to know how far the current price is from its moving average. When I do this calculation for many different stocks, priced differently, I need a common denominator, and I use standard deviation in price for that purpose. In other words, I compute as follows: **mean-reverted price = (current price – mean expected price) / standard deviation in price**.

If you have a closer look at this coefficient of mean-reverted price, its numerator is error, because it is the deviation from mean expected value. I divide that error by standard deviation, and, logically, what I get is error divided by standard deviation, therefore the white noise component **z_{t}** of the equation **ε_{t} = σ_{t}z_{t}**. This is perfectly fine mathematically, only my experience with that coefficient tells me it is anything but white noise. When I want to grasp very sharply and accurately the way in which the price of a given stock reacts to its economic environment, I use precisely the mean-reverted coefficient of price. As soon as I recalculate the time series of a price into its mean-reverted form, patterns emerge, sharp and distinct. In other words, the allegedly white-noise-based factor in the stock price is much more patterned than the original price used for its calculation.

The same procedure which I call ‘mean-reversion’ is, by the way, a valid procedure to standardize empirical data. You take each empirical observation, you subtract from it the mean expected value of the corresponding variable, you divide the residual difference by its standard deviation, and Bob’s your uncle. You have your data standardized.
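For the record, the mean-reverted price can be sketched in a few lines. This is a rough illustration: the function name `mean_reverted`, the trailing window, and the use of the population standard deviation are my assumptions made for the sake of the example, and the prices are made up.

```python
from statistics import fmean, pstdev


def mean_reverted(prices, window):
    """(current price - moving average) / standard deviation, over a trailing window.

    Returns None until a full window of history is available.
    """
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)                 # not enough history yet
            continue
        chunk = prices[i - window + 1: i + 1]
        mean = fmean(chunk)                  # mean expected price over the window
        sd = pstdev(chunk)                   # standard deviation in price (population form)
        # A perfectly flat window has no dispersion; report zero deviation then.
        out.append((prices[i] - mean) / sd if sd > 0 else 0.0)
    return out


prices = [10.0, 10.0, 10.0, 10.0, 20.0]      # made-up prices with one jump
print(mean_reverted(prices, window=5))
```

With a window covering the whole series, the same function standardizes the last observation in the usual sense; sliding the window is what turns standardization into a running measure of how far the price has strayed.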

Summing up that little rant of mine, I understand the spirit of the ARCH method. If I want to extract some kind of autoregression in time-series, I can test the hypothesis that standard deviation is time-dependent. Do I need, for that purpose, to assume the existence of strong white noise in the time series? I would say cautiously: maybe, although I do not see the immediate necessity for it. Is the equation ** ε_{t }= σ_{t}z_{t}** the right way to grasp the distinction into the stochastic component and the random one, in the time series? Honestly: I don’t think so. Where is the catch? I think it is in the definition and utilization of error, which, further, leads to the definition and utilization of the expected state.

In order to make my point clearer, I am going to quote two short passages from pages xxviii-xxix in Nassim Nicholas Taleb’s book ‘The Black Swan’. Here it goes. ‘*There are two possible ways to approach phenomena. The first is to rule out the extraordinary and focus on the “normal.” The examiner leaves aside “outliers” and studies ordinary cases. The second approach is to consider that in order to understand a phenomenon, one needs first to consider the extremes – particularly if, like the Black Swan, they carry an extraordinary cumulative effect. […] Almost everything in social life is produced by rare but consequential shocks and jumps; all the while almost everything studied about social life focuses on the “normal,” particularly with “bell curve” methods of inference that tell you close to nothing*’.

When I use mean-reversion to study stock prices, for my investment decisions, I go very much in the spirit of Nassim Taleb. I am most of all interested in the outlying values of the metric **(current price – mean expected price) / standard deviation in price**, which, once again, the proponents of the ARCH method interpret as white noise. When that metric spikes up, it is a good moment to sell, whilst when it is in a deep trough, it might be the right moment to buy. I have one more interesting observation about those mean-reverted prices of stock: when they change their direction from ascending to descending and vice versa, it is always a sharp change, like a spike, never a gentle recurving. Outliers always produce sharp change, exactly as Nassim Taleb claims. In order to understand better what I am talking about, you can have a look at one of the analytical graphs I used for my investment decisions, precisely with mean-reverted prices and transactional volumes, as regards Ethereum: https://discoversocialsciences.com/wp-content/uploads/2020/04/Slide5-Ethereum-MR.png .

In a manuscript that I wrote and which I am still struggling to polish enough to make it publishable (https://discoversocialsciences.com/wp-content/uploads/2021/01/Black-Swans-article.pdf ), I have identified three different modes of collective learning. In most of the cases I studied empirically, societies learn cyclically, i.e. first they produce big errors in adjustment, then they narrow their error down, which means they figure s**t out, and in the next phase the error increases again, just to decrease once again in the next cycle of learning. This is cyclical adjustment. In some cases, societies (national economies, to be exact) adjust in a pretty continuous process of diminishing error. They make big errors initially, and they reduce their error of adjustment in a visible trend of nailing down workable patterns. Finally, in some cases, national economies can go haywire and increase their error continuously instead of decreasing it or cycling on it.

I am reconnecting to my method of quantitative analysis, based on simulating with a simple neural network. As I did that little excursion into the realm of autoregressive conditional heteroscedasticity, I realized that most of the quantitative methods used today start from studying one single variable, and then increase the scope of analysis by including many variables in the dataset, whilst each variable keeps being the essential monad of observation. For me, the complex local state of the society studied is that monad of observation and empirical study. By default, I group all the variables together, as distinct, and yet fundamentally correlated manifestations of the same existential stuff happening here and now. What I study is a chain of here-and-now states of reality rather than a bundle of different variables.

I realize that whilst it is almost axiomatic, in typical quantitative analysis, to phrase out the null hypothesis as the absence of correlation between variables, I don’t even think about it. For me, all the empirical variables which we, humans, measure and report in our statistical data, are mutually correlated one way or another, because they all talk about us doing things together. In phenomenological terms, is it reasonable to assume that what we do in order to produce real output, i.e. our Gross Domestic Product, is uncorrelated with what we do with the prices of productive assets? Probably not.

There is a fundamental difference between discovering and studying individual properties of a social system, such as heteroskedastic autoregression in a variable, on the one hand, and studying the way this social system changes and learns as a collective, on the other hand. It means two different definitions of expected state. In most quantitative methods, the expected state is the mean value of one single variable. In my approach, it is always a vector of expected values.
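The difference between the two definitions of expected state can be illustrated with a toy chain of states. This is only a sketch: the two-variable states are invented numbers, and measuring the departure of each state from the expected one with Euclidean distance is an assumption I make for the example, not a description of the full method.

```python
import math
from statistics import fmean


def expected_state(chain):
    """Vector of expected values: the component-wise mean over a chain of states."""
    return [fmean(column) for column in zip(*chain)]


def departure(state, reference):
    """Euclidean distance between one state and a reference state (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(state, reference)))


# Each row is one here-and-now state of a hypothetical society,
# described by two made-up variables.
chain = [
    [1.0, 10.0],
    [2.0, 14.0],
    [3.0, 12.0],
]

centre = expected_state(chain)
errors = [departure(state, centre) for state in chain]
print(centre)
print(errors)
```

Here the error of each period is a single number describing how far the whole state sits from the expected state, instead of one residual per variable.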

I think I am starting to nail down, at last, the core scientific idea I want to convey in my book about collective intelligence. Studying human societies as instances of collective intelligence, or, if you want, as collectively intelligent structures, means studying chains of complex states. The Markov chain of states, and the concept of state space, are the key mathematical notions here.

I have used that method, so far, to study four distinct fields of empirical research: a) the way we collectively approach energy management in our societies, b) the orientation of national economies on the optimization of specific macroeconomic variables, c) the way we collectively manage the balance between urban land, urban density of population, and agricultural production, and d) the way we collectively learn in the presence of random disturbances. The main findings I can phrase out start with the general observation that in a chain of complex social states, we collectively tend to lean towards some specific aspects of our social reality. For lack of a better word, I equate those aspects to the quantitative variables I find them represented by, although it is something to dig into. We tend to optimize the way we work, in the first place, and the way we sell our work. Concerns such as return on investment or real output come as secondary. That makes sense. At the large scale, the way we work is important for the way we use energy, and collectively learn. Surprisingly, variables commonly associated with energy management, such as energy efficiency, or the exact composition of energy sources, are secondary.

The second big finding comes from the manuscript mentioned earlier (https://discoversocialsciences.com/wp-content/uploads/2021/01/Black-Swans-article.pdf ): societies display three different modes of collective learning. Most of the cases I studied empirically learn cyclically, with the error of adjustment narrowing down and widening again from one cycle of learning to the next. Some national economies adjust in a pretty continuous trend of diminishing error. Finally, some go haywire and increase their error continuously instead of decreasing it or cycling on it.

The third big finding is about the fundamental logic of social change, or so I perceive it. We seem to be balancing, over decades, the proportions between urban land and agricultural land so as to balance the production of food with the production of new social roles for new humans. The countryside is the factory of food, and cities are factories of new social roles. I think I can make a strong, counterintuitive claim that social unrest, such as what is currently going on in the United States, for example, erupts when the capacity to produce food in the countryside grows much faster than the capacity to produce new social roles in the cities. When our food systems can sustain more people than our collective learning can provide social roles for, we have an overhead of individuals whose most essential physical subsistence is provided for, and yet they have nothing sensible to do in the collectively intelligent structure of the society.
