Inside a vector

My editorial

I am returning to the issue of collective memory, and to collective memory recognizable in numbers, i.e. in the time series of variables pertinent to the state of a society (see ‘Back to blogging, trying to define what I remember’). And so I take my general formula xi(t) = f1[xi(t – b)] + f2[xi(t – STOCH)] + Res[xi(t)], which means that at any given moment ‘t’, current information xi(t) about the social system consists in some sort of constant-loop remembering xi(t – b), with ‘b’ standing for a fixed temporal window (in an average human it seems to be something like 3 weeks), coming along with a more irregular, stochastic pick of past information, like xi(t – STOCH), and on top of all that is the residual Res[xi(t)] of current information, hardly attributable to any remembering of the past; for lack of a better expression, it can be grasped as the strictly spoken present.
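If you like seeing formulas move, here is a minimal numerical sketch of that formula. All the concrete choices in it – the weights standing in for f1 and f2, the noise scale of the residual, the way STOCH draws its random lag – are my assumptions for illustration, not anything estimated from real data:

```python
import numpy as np

rng = np.random.default_rng(42)

# Sketch of x(t) = f1[x(t - b)] + f2[x(t - STOCH)] + Res[x(t)].
# The coefficients 0.6 and 0.3, the noise scale, and the uniform draw
# of the stochastic lag are illustrative assumptions.
b = 21          # fixed "remembering" window, e.g. ~3 weeks in days
T = 200         # number of simulated moments

x = np.zeros(T)
x[:b] = rng.normal(loc=1.0, scale=0.1, size=b)   # seed the first b moments

for t in range(b, T):
    fixed_loop = 0.6 * x[t - b]           # f1: constant-loop recall, b periods back
    stoch_lag = rng.integers(1, t + 1)    # STOCH: a random look-back between 1 and t
    stochastic = 0.3 * x[t - stoch_lag]   # f2: irregular recall of some past moment
    residual = rng.normal(scale=0.05)     # Res: the strictly spoken present
    x[t] = fixed_loop + stochastic + residual

print(x[-5:])
```

Nothing deep happens in those twenty lines; the point is only that the three ingredients – fixed-window memory, stochastic memory, and residual present – can coexist in one recursive process.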

I am reviewing the available mathematical tools for modelling such a process with hypothetical memory. I start with something that I could label ‘perfect remembering and only remembering’, or the Gaussian process. It represents a system which essentially does not learn much, and is predictable on the grounds of its mean and covariance. When I do linear regression, of which you could have seen a lot in my writings on this blog, I more or less consciously follow the logic of a Gaussian process. That logic is simple: if I can draw a straight line that matches the empirical distribution of my real-life variable, and if I prolong this line into the future, it will make a good predictor of the future values of my variable. It doesn’t even have to be one variable. I can deal with a vector made of many variables as well. As a matter of fact, the mathematical notation used in the Gaussian process basically refers to vectors of variables.

It might be the right moment for explaining what the hell a vector is in quantitative analysis. Well, I am a vector, and you, my reader, you are a vector, and my cousin is a vector as well, and his dog is a vector. My phone is a vector, and any other phone the same. Anything we encounter in life is complex. There are no simple phenomena, even in the middle of summer holidays, on some remote tropical beach. Anything we can think of has many characteristics. To the extent that those characteristics can be represented as numbers, the state of nature at a given moment is a set of numbers. These numbers can be considered as coordinates in many criss-crossing manifolds. I have an age in the manifold of ages, a height in the manifold of heights, a numerically expressible hair colour in the manifold of hair colours, etc. Many coordinates make a vector; stands to reason.

And so I have that vector X*, made of n variables, stretched over m periods of time. Each point in that vector is characterized by its appurtenance to a precise variable i out of those n variables, as well as by its observability at a given moment j out of the total duration. It can look more or less like that: X* = {Xt1,1, Xt2,2, …, Xtj,i, …, Xtm,n}, or, in the more straightforward form of a matrix, it is something like:

                 Moments in time (or any indexed value you want)

                   t1        t2        …         tj        tm

           I     Xt1,I     Xt2,I      …        Xtj,I     Xtm,I
Variables  II    Xt1,II    Xt2,II     …        Xtj,II    Xtm,II
   X* =    …
           n     Xt1,n     Xt2,n     …        Xtj,n     Xtm,n
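In code, that X* matrix is nothing exotic: a two-dimensional array with one row per variable and one column per moment. A toy version, with random numbers standing in for real observations:

```python
import numpy as np

# A toy version of the vector X*: n variables observed over m moments.
# Rows are variables, columns are moments in time, as in the matrix above.
n, m = 3, 5
rng = np.random.default_rng(0)
X_star = rng.normal(size=(n, m))   # placeholder data, purely illustrative

# X_star[i, j] is variable i observed at moment t_j.
print(X_star.shape)    # (3, 5)
print(X_star[1, 3])    # variable II at moment t4
```

The whole point of keeping the data in this shape is that means, variances, and covariances across variables or across moments become one-line operations on rows and columns.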


Right, I got myself a bit lost in that vector thing, and I kind of stepped off the path of wisdom regarding the Gaussian process. In order to understand the logic of the Gaussian process, you’d better revise the Gaussian distribution or, in other words, the normal distribution. If any set of observable data follows the normal distribution, the values you encounter the most frequently in it are those infinitely close to the arithmetical average of the set. As you probably remember from your maths class at high school, one of the funny things about the arithmetical average, so frequently used in all kinds of calculations (even the pretty intuitive ones), is that it doesn’t exist. If you take any set of data and compute its arithmetical average, none of your empirical observations will be exactly equal to that average. Still, and this is really funny, you have things – especially those occurring in large amounts, like foot size in a human population – which most frequently take the numerical values relatively closest to their arithmetical average, i.e. closest to a value that doesn’t exist and yet is somehow expected. These things follow the Gaussian (normal) distribution, and we usually assume that their expected value (i.e. the value we can rightfully expect to meet the most frequently in those things) is their arithmetical average.
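A quick illustration of that funny non-existence of the average, with five made-up foot lengths:

```python
# The arithmetical average of a set is typically not a member of the set,
# yet it is the expected value. The foot lengths (in cm) are invented.
foot_sizes = [24.5, 25.0, 26.0, 27.5, 28.0]
mean = sum(foot_sizes) / len(foot_sizes)

print(mean)                 # 26.2
print(mean in foot_sizes)   # False: nobody in the set has the average foot
```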

Inside the set of all those Gaussian things, there is a smaller subset of things for which time matters. These phenomena unfold in time. Foot size is a good example. Instead of asking yourself what foot size you are the most likely to fit with the shoes you make for the existing population, you can ask about the expected foot size of any human being to be born in the future. What you can do is measure the average foot size in the population year after year, like over one century. That would be a lot of foot measuring, I agree, but science requires some effort. Anyway, if you measure average foot sizes, year after year over one century, you can discover that those averages follow a normal distribution over time, i.e. the greatest number of annual averages tends to be infinitely close to the general, century-long average. If this is the case, we can say that the average foot size changes over time in a Gaussian process, and this is the first characteristic of this specific process: the mean is always the expected value.
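Here is that century of foot measuring, simulated rather than performed. The population mean, the spread, and the sample sizes are assumptions pulled out of thin air; the pattern they produce is the point:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical century of foot measuring: each year we average the foot
# sizes of a fresh sample of people. All parameters are assumptions.
years = 100
people_per_year = 500
pop_mean, pop_sd = 26.0, 1.5   # assumed population foot size (cm)

annual_means = np.array([
    rng.normal(pop_mean, pop_sd, size=people_per_year).mean()
    for _ in range(years)
])

# The annual averages pile up tightly around the century-long average:
century_mean = annual_means.mean()
print(round(century_mean, 2))
print(annual_means.std())   # far smaller than pop_sd: the means cluster
```

That clustering of yearly means around the grand mean is exactly the ‘the mean is always the expected value’ property in action.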

If I apply this elementary assumption to the concept of collective intelligence, it implies a special aspect of intelligence, i.e. generalisation. My eyes transmit to my brain the image of one set of colourful points, and then the image of another set of points, kind of just next to the previous one. My brain connects those dots and labels them ‘woman’, ‘red’, ‘bag’ etc. In a sense, ‘woman’, ‘red’, and ‘bag’ are averages, because they are the incidences I most probably expect to find in the presence of those precise kinds of colourful points. Thus, collective intelligence endowed with a memory that works according to a Gaussian process is the kind of intelligence we use for establishing our basic distinctions. In our collective intelligence, Gaussian processes (if they happen at all) can represent, for example, the formation of cultural constructs such as law, justice, scientific laws, and, by the way, concepts like the Gaussian process itself.

Now, we go one step further and, in order to do it, we need to go one step back, namely back to the concept of vector. If my process in time is made of vectors, instead of single points, and each vector is like a snapshot of reality at a given moment, I am interested in something called the covariance of the variables inside the vector. If one variable deviates from its own mean, and I raise that deviation to the power of 2 in order to get rid of the possibly embarrassing minus sign, I have variance. If I have two variables, and I take their respective, local deviations from their means, and I multiply those deviations by each other, I have covariance. As we are talking vectors, we have a whole matrix of covariance, between each pair of variables in the vector. Any process unfolding in time and involving many variables has to answer the existential question about its own matrix of covariance. Some processes have the peculiar property of keeping a pretty repetitive matrix of covariance over time. The component, simple variables of those processes change in some sort of constant-pace contredans. If variable X1 changes by one inch, variable X2 will change by three quarters of a gallon, and so it will go on reproducing for a long time. This is the second basic characteristic of a Gaussian process: future covariance is predictable on the grounds of the covariance observed so far.
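That constant-pace contredans is easy to simulate. Below, X2 follows X1 at a fixed ratio plus some noise (the ratio and the noise level are my assumptions), and the covariance matrix computed on the first half of the process pretty much reappears in the second half:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two variables in a "constant-pace contredans": X2 is roughly
# 0.75 * X1 plus noise, so their covariance structure stays put.
T = 10_000
x1 = rng.normal(size=T)
x2 = 0.75 * x1 + rng.normal(scale=0.2, size=T)
X = np.vstack([x1, x2])   # rows = variables, columns = moments, like X*

# Covariance matrix of the first half vs. the second half of the process:
cov_early = np.cov(X[:, : T // 2])
cov_late = np.cov(X[:, T // 2 :])
print(np.round(cov_early, 2))
print(np.round(cov_late, 2))
```

The two printed matrices come out nearly identical, which is precisely what ‘future covariance is predictable on the grounds of the covariance observed so far’ looks like in numbers.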

As I transplant that concept of very recurrent covariance onto my idea of collective intelligence with memory, Gaussian collective intelligence would be the kind that establishes recurrent functional connections between the things of a society. We call those things institutions. Language, as a matter of fact, is an institution as well. As we have institutions in every society, and societies that do not form institutions tend to have a pretty short life expectancy, we can assume that collective intelligence follows, at least to some extent, the pattern of a Gaussian process.