Lean, climbing trends

My editorial on You Tube

Our artificial intelligence: the working title of my research, for now. Volume 1: Energy and technological change. I am doing a little bit of rummaging in available data, just to make sure I keep contact with reality. Here comes a metric: access to electricity in the world, measured as the % of total human population[1]. The trend line looks proudly ascending. In 2016, 87,38% of mankind had at least one electric socket in their place. Ten years earlier, by the end of 2006, they were 81,2%. Optimistic. Looks like something growing almost linearly. Another one: « Electric power transmission and distribution losses »[2]. This one looks different: instead of a clear trend, I observe something shaking and oscillating, with the width of variance narrowing gently down, as time passes. By the end of 2014 (last data point in this dataset), we were globally at 8,25% of electricity lost in transmission. The lowest coefficient of loss occurred in 1998: 7,13%.

I move from distribution to production of electricity, and to its percentage supplied from nuclear power plants[3]. Still another shape, that of a steep bell with surprisingly lean edges. Initially, it was around 2% of global electricity supplied by the nuclear. At the peak of fascination, it was 17,6%, and at the end of 2014, we went down to 10,6%. The thing seems to be temporarily stable at this level. As I move to water, and to the percentage of electricity derived from the hydro[4], I see another type of change: a deeply serrated, generally descending trend. In 1971, we had 20,2% of our total global electricity from the hydro, and by the end of 2014, we were at 16,24%. In the meantime, it looked like a rollercoaster. Yet, as I am having a look at other renewables (i.e. other than hydroelectricity) and their share in the total supply of electricity[5], the shape of the corresponding curve looks like a snake, trying to figure something out about a vertical wall. Between 1971 and 1988, the share of those other renewables in the total electricity supplied moved from 0,25% to 0,6%. Starting from 1989, it is an almost perfectly exponential growth, to reach 6,77% in 2015. 
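
Just to put a number on that "almost perfectly exponential" impression, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that the share of other renewables grew at one constant compound rate between the two endpoints quoted above (0,6% in 1988 and 6,77% in 2015). The function name is mine, and the constant rate is an assumption, not something read off the World Bank series year by year.

```python
def implied_annual_growth(start_value, end_value, years):
    """Constant compound growth rate that links two endpoints."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Endpoints quoted in the text: 0.6% in 1988, 6.77% in 2015
rate = implied_annual_growth(0.6, 6.77, 2015 - 1988)
print(f"Implied compound growth: {rate:.1%} per year")
```

The same function can be pointed at any other pair of endpoints in this update, e.g. the energy use per capita in 1971 and 2014.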

Just to have a complete picture, I shift slightly, from electricity to energy consumption as a whole, and I check the global share of renewables therein[6]. Surprise! This curve does not behave at all as one would expect after having seen the previously cited share of renewables in electricity. Instead of a snake sniffing a wall, we can see a snake seen from above, or something like a meandering river. This seems to be a cycle over some 25 years (could it be Kondratiev’s?), with a peak around 18% of renewables in the total consumption of energy, and a trough somewhere near 16,9%. Right now, we seem to be close to the peak.

I am having a look at the big, ugly brother of hydro: the oil, gas and coal sources of electricity and their share in the total amount of electricity produced[7]. Here, I observe a different shape of change. Between 1971 and 1986, the fossils dropped their share from 62% to 51,47%. Then, it rockets back up to 62% in 1990. Later, a slowly ascending trend starts, just to reach a peak, and oscillate for a while around some 65 to 67% between 2007 and 2011. Since then, the fossils have been dropping again: the short-term trend is descending.

Finally, one of the basic metrics I have been using frequently in my research on energy: the final consumption thereof, per capita, measured in kilograms of oil equivalent[8]. Here, we are back in the world of relatively clear trends. This one is ascending, with some bumps on the way, though. In 1971, we were at 1336,2 koe per person per year. In 2014, it was 1920,655 koe.

Thus, what are all those curves telling me? I can see three clearly different patterns. The first is the ascending trend, observable in the access to electricity, in the consumption of energy per capita, and, since the late 1980s, in the share of electricity derived from renewable sources. The second is a cyclical variation: share of renewables in the overall consumption of energy, to some extent the relative importance of hydroelectricity, as well as that of the nuclear. Finally, I can observe a descending trend in the relative importance of the nuclear since 1988, as well as in some episodes from the life of hydroelectricity, coal and oil.

On top of that, I can distinguish different patterns in, respectively, the production of energy, on the one hand, and its consumption, on the other hand. The former seems to change along relatively predictable, long-term paths. The latter looks like a set of parallel, and partly independent experiments with different sources of energy. We are collectively intelligent: I deeply believe that. I mean, I hope. If bees and ants can be smarter collectively than individually, there is some potential in us as well.

Thus, I am progressively designing a collective intelligence, which experiments with various sources of energy, just to produce those two, relatively lean, climbing trends: more energy per capita and an ever-growing percentage of capitae with access to electricity. Which combinations of variables can produce a rationally desired energy efficiency? How is the supply of money changing as we reach different levels of energy efficiency? Can artificial intelligence make energy policies? Empirical check: take a real energy policy and build a neural network which reflects the logical structure of that policy. Then add a method of learning and see what it produces as a hypothetical outcome.

What is the cognitive value of hypotheses made with a neural network? The answer to this question starts with another question: how do hypotheses made with a neural network differ from any other set of hypotheses? The hypothetical states of nature produced by a neural network reflect the outcomes of logically structured learning. The process of learning should represent real social change and real collective intelligence. There are four most important distinctions I have observed so far, in this respect: a) awareness of internal cohesion b) internal competition c) relative resistance to new information and d) perceptual selection (different ways of standardizing input data).

The awareness of internal cohesion, in a neural network, is a function that feeds into the consecutive experimental rounds of learning the information on relative cohesion (Euclidean distance) between variables. We assume that each variable used in the neural network reflects a sequence of collective decisions in the corresponding social structure. Cohesion between variables represents the functional connection between sequences of collective decisions. Awareness of internal cohesion, as a logical attribute of a neural network, corresponds to situations when societies are aware of how mutually coherent their different collective decisions are. The lack of logical feedback on internal cohesion represents situations when societies do not have that internal awareness.
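
To make that feedback on cohesion a bit more tangible, here is a minimal sketch of how it could be computed. It assumes that cohesion is measured as the mean pairwise Euclidean distance between already standardized variables, and that "awareness" simply means appending that number to the inputs of the next experimental round. The function names and the sample values are hypothetical, made up for illustration; this is one possible reading, not necessarily the exact routine in my perceptron.

```python
import math
from itertools import combinations

def euclidean_distance(a, b):
    """Euclidean distance between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_pairwise_distance(variables):
    """Average distance over all pairs of variables (lower = more cohesive)."""
    pairs = list(combinations(variables, 2))
    return sum(euclidean_distance(a, b) for a, b in pairs) / len(pairs)

def inputs_with_awareness(raw_inputs, cohesion):
    """Assumption: awareness = cohesion fed in as one extra input."""
    return raw_inputs + [cohesion]

# Three hypothetical, already standardized variables over four rounds
variables = [
    [0.10, 0.20, 0.30, 0.40],
    [0.15, 0.25, 0.35, 0.45],
    [0.90, 0.80, 0.70, 0.60],
]
cohesion = mean_pairwise_distance(variables)
aware_inputs = inputs_with_awareness([0.40, 0.45, 0.60], cohesion)
print(aware_inputs)
```

A network without awareness of its own cohesion would just skip the last step and keep working on the raw inputs.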

I metaphorically look around and ask myself what awareness I have of important collective decisions in my local society. I can observe and pattern people’s behaviour, for one. Next thing: I can read (very literally) the formalized, official information regarding legal issues. On top of that, I can study (read, mostly) quantitatively formalized information on measurable attributes of the society, such as GDP per capita, supply of money, or emissions of CO2. Finally, I can have that semi-formalized information from what we call “media”, whatever prefix they come with: mainstream media, social media, rebel media, the-only-true-media etc.

As I look back upon my own life and the changes which I have observed on those four levels of social awareness, the fourth one, namely the media, has been, and still is, the biggest game changer. I remember the cultural earthquake in 1990 and later, when, after decades of state-controlled media in communist Poland, we suddenly had free press and complete freedom of publishing. Man! It was like one of those moments when you step out of a calm, dark alleyway right into the middle of heavy traffic in the street. Information, it just whizzed past.

There is something about media, both those called ‘mainstream’, and the modern platforms like Twitter or You Tube: they adapt to their audience, and the pace of that adaptation is accelerating. With Twitter, it is obvious: when I log into my account, I can see the Tweets only from people and organizations whom I specifically subscribed to observe. With You Tube, on my starting page, I can see the subscribed channels, for one, and a ton of videos suggested by artificial intelligence on the grounds of what I watched in the past. Still, the mainstream media go down the same avenue. When I go to bbc.com, the types of news presented are very largely what the editorial team hopes will max out on clicks per hour, which, in turn, is based on the types of news that totalled the most clicks in the past. The same was true for printed newspapers, 20 years ago: the stuff that got to headlines was the kind of stuff that made sales.

Thus, when I simulate collective intelligence of a society with a neural network, the function allowing the network to observe its own, internal cohesion seems to be akin to the presence of media platforms. Actually, I have already observed, many times, that adding this specific function to a multi-layer perceptron (type of neural network) makes that perceptron less cohesive. Looks like a paradox: observing the relative cohesion between its own decisions makes a piece of AI less cohesive. Still, real life confirms that observation. Social media favour the phenomenon known as « echo chamber »: if I want, I can expose myself only to the information that minimizes my cognitive dissonance and cut myself off from anything that pumps my adrenaline up. On a large scale, this behavioural pattern produces a galaxy of relatively small groups encapsulated in highly distilled, mutually incoherent worldviews. Have you ever wondered what it would be like to use GPS navigation to find your way, in the company of a hardcore flat-Earther?

When I run my perceptron over samples of data regarding the energy efficiency of national economies, including the function that feeds back on the so-called fitness function is largely equivalent to simulating a society with abundant media activity. The absence of such feedback is, on the other hand, like a society without much of a media sector.

Internal competition, in a neural network, is the deep underlying principle for structuring a multi-layer perceptron into separate layers, and manipulating the number of neurons in each layer. Let’s suppose I have two neural layers in a perceptron: A, and B, in this exact order. If I put three neurons in the layer A, and one neuron in the layer B, the one in B will be able to choose between the 3 signals sent from the layer A. Seen from the A perspective, each neuron in A has to compete against the two others for the attention of the single neuron in B. Choice on one end of a synapse equals competition on the other end.
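
The A-and-B example above can be sketched in a few lines. I am assuming here, for the sake of illustration, that each neuron in A is a hyperbolic tangent of its own weighted sum, and that the single neuron in B chooses the signal closest to some reference value. That selection rule, the weights and the target are all hypothetical; other rules of choice are perfectly possible.

```python
import math

def layer_a(inputs, weight_sets):
    """Each neuron in A computes tanh of its own weighted sum of inputs."""
    return [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)))
        for weights in weight_sets
    ]

def neuron_b(signals, target):
    """Assumed rule: B picks the signal from A closest to a reference value."""
    winner = min(range(len(signals)), key=lambda i: abs(signals[i] - target))
    return winner, signals[winner]

inputs = [0.5, 0.2]
# Three competing neurons in layer A, each with its own (made-up) weights
weight_sets = [[0.1, 0.9], [0.8, -0.3], [1.5, 1.0]]

signals = layer_a(inputs, weight_sets)
winner, chosen = neuron_b(signals, target=0.7)
print(f"signals from A: {signals}")
print(f"B listens to neuron {winner} in A")
```

Seen from B, this is choice; seen from A, it is two neurons out of three losing the round.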

When I want to introduce choice in a neural network, I need to introduce internal competition as well. If any neuron is to have a choice between processing input A and its rival, input B, there must be at least two distinct neurons – A and B – in a functionally distinct, preceding neural layer. In a collective intelligence, choice requires competition, and there seems to be no way around it. In a real brain, neurons form synaptic sequences, which means that the great majority of our neurons fire because other neurons have fired beforehand. We very largely think because we think, not because something really happens out there. Neurons in charge of early-stage collection of sensory data compete for the attention of our brain stem, which, in turn, proposes its pre-selected information to the limbic system, and the emotional exultation of the latter incites the cortical areas to think about the whole thing. From there, further cortical activity happens just because other cortical activity has been happening so far.

Let me propose a quick self-check: think about what you are thinking right now, and ask yourself how much of what you are thinking about is really connected to what is happening around you. Are you thinking a lot about the gradient of temperature close to your skin? No, not really? Really? Are you giving a lot of conscious attention to the chemical composition of the surface you are touching right now with your fingertips? Not really a lot of conscious thinking about this one either? Now, how much conscious attention are you devoting to what [fill in the blank] said about [fill in the blank], yesterday? Quite a lot of attention, isn’t it?

The point is that some ideas die out, in us, quickly and sort of silently, whilst others are tough survivors and keep popping up to the surface of our awareness. Why? How does it happen? What if there is some kind of competition between synaptic paths? Thoughts, or components thereof, that win one stage of the competition pass to the next, where they compete again.           

Internal competition requires complexity. There needs to be something to compete for, a next step in the chain of thinking. A neural network with internal competition reflects a collective intelligence with internal hierarchies that offer rewards. Interestingly, there is research showing that greater complexity gives more optimizing accuracy to a neural network, but just as long as we are talking about really low complexity, like 3 layers of neurons instead of two. As complexity is further developed, accuracy decreases noticeably. Complexity is not the best solution for optimization: see Olawoyin and Chen (2018[9]).

Relative resistance to new information corresponds to the way that an intelligent structure deals with cognitive dissonance. In order to have any cognitive dissonance whatsoever, we need at least two pieces of information: one that we have already appropriated as our knowledge, and the new stuff, which could possibly disturb the placid self-satisfaction of the I-already-know-how-things-work. Cognitive dissonance is a potent factor of stress in human beings as individuals, and in whole societies. Galileo would have a few words to say about it. Question: how to represent in a mathematical form the stress connected to cognitive dissonance? My provisional answer is: by division. Cognitive dissonance means that I consider my acquired knowledge as more valuable than new information. If I want to decrease the importance of B in relation to A, I divide B by a factor greater than 1, whilst leaving A as it is. The denominator of new information is supposed to grow over time: I am more resistant to the really new stuff than I am to the already slightly processed information, which was new yesterday. In a more elaborate form, I can use the exponential progression (see The really textbook-textbook exponential growth).
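
The division I have in mind can be sketched as follows. I assume, for illustration only, that the denominator is an exponential progression in the number of the experimental round, with a hypothetical growth rate of 5% per round; both the function name and that rate are placeholders, not calibrated values.

```python
def resisted(new_information, round_number, rate=0.05):
    """Divide new information by a denominator that grows exponentially.

    rate is a hypothetical parameter: the higher it is, the more stress
    the structure feels in the presence of cognitive dissonance.
    """
    return new_information / (1.0 + rate) ** round_number

acquired_knowledge = 1.0  # A: left as it is
for t in (0, 1, 10, 50):
    # B: the same piece of news weighs less and less as rounds go by
    print(t, acquired_knowledge, round(resisted(1.0, t), 4))
```

With rate set to zero, the denominator stays at 1 and the structure has no resistance at all: new information counts just as much as acquired knowledge.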

I noticed an interesting property of the neural network I use for studying energy efficiency. When I introduce choice, internal competition and hierarchy between neurons, the perceptron gets sort of wild: it produces increasing error instead of decreasing error, so it basically learns how to swing more between possible states, rather than how to narrow its own trial and error down to one recurrent state. When I add a pinch of resistance to new information, i.e. when I purposefully create stress in the presence of cognitive dissonance, the perceptron calms down a bit, and can produce a decreasing error.

Selection of information can occur already at the level of primary perception. I developed on this one in « Thinking Poisson, or ‘WTF are the other folks doing?’ ». Let’s suppose that new science emerges about how to use particular sources of energy. We can imagine two scenarios of reaction to that new science. On the one hand, the society can react in a perfectly flexible way, i.e. each new piece of scientific research gets evaluated for its real utility for energy management, and gets smoothly included into the existing body of technologies. On the other hand, the same society (well, not quite the same, an alternative one) can sharply distinguish those new pieces of science into ‘useful stuff’ and ‘crap’, with little nuance in between.
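
Those two scenarios translate into two ways of standardizing input data before the network even starts learning. The sketch below is one hypothetical rendering: the flexible society scales every value over the maximum, whilst the alternative one cuts the same scale into 1 (‘useful stuff’) and 0 (‘crap’) at an arbitrary threshold. The function names, the threshold and the scores are my assumptions, and the scaling presumes positive values.

```python
def smooth_perception(values):
    """Flexible scenario: every value is scaled over the observed maximum."""
    top = max(values)
    return [v / top for v in values]

def sharp_perception(values, threshold=0.5):
    """Alternative scenario: 'useful stuff' (1.0) or 'crap' (0.0), no nuance."""
    return [1.0 if v >= threshold else 0.0 for v in smooth_perception(values)]

# Hypothetical utility scores of four new pieces of science
utility_scores = [2.0, 5.0, 7.0, 10.0]

print(smooth_perception(utility_scores))
print(sharp_perception(utility_scores))
```

The same raw data, fed through one function or the other, gives the perceptron two quite different worlds to learn from.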

What do we know about collective learning and collective intelligence? Three essential traits come to my mind. Firstly, we make social structures, i.e. recurrent combinations of social relations, and those structures tend to be quite stable. We like having stable social structures. We almost instinctively create rituals, rules of conduct, enforceable contracts etc., thus we make stuff that is supposed to make the existing stuff last. An unstable social structure is prone to wars, coups etc. Our collective intelligence values stability. Still, stability is not the same as perfect conservatism: our societies have imperfect recall. This is the second important trait. Over (long periods of) time we collectively shake off, and replace old rules of social games with new rules, and we do it without disturbing the fundamental social structure. In other words: stable as they are, our social structures have mechanisms of adaptation to new conditions, and yet those mechanisms require forgetting something about our past. OK, not just forgetting something: we collectively forget a shitload of something. Thirdly, there have been many local human civilisations, and each of them eventually collapsed, i.e. their fundamental social structures disintegrated. The civilisations we have made so far had a limited capacity to learn. Sooner or later, they would bump against a challenge which they were unable to adapt to. The mechanism of collective forgetting and shaking off, in every historically documented case, had a limited efficiency.

I intuitively guess that simulating collective intelligence with artificial intelligence is likely to be the most fruitful when we simulate various capacities to learn. I think we can model something like a perfectly adaptable collective intelligence, i.e. the one which has no cognitive dissonance and processes information uniformly over time, whilst having a broad range of choice and internal competition. Such a neural network behaves in the opposite way to what we tend to associate with AI: instead of optimizing and narrowing down the margin of error, it creates new alternative states, possibly in a broadening range. This is a collective intelligence with lots of capacity to learn, but little capacity to steady itself as a social structure. From there, I can muzzle the collective intelligence with various types of stabilizing devices, making it progressively more and more structure-making, and less flexible. Down that avenue, the solver-type of artificial intelligence lies, thus a neural network that just solves a problem, with one, temporarily optimal solution.

I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. You can communicate with me directly, via the mailbox of this blog: goodscience@discoversocialsciences.com. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project BeFund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide to do so, I will be grateful if you tell me two things that Patreon suggests I ask you. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?

[1] https://data.worldbank.org/indicator/EG.ELC.ACCS.ZS last access May 17th, 2019

[2] https://data.worldbank.org/indicator/EG.ELC.LOSS.ZS?end=2016&start=1990&type=points&view=chart last access May 17th, 2019

[3] https://data.worldbank.org/indicator/EG.ELC.NUCL.ZS?end=2014&start=1960&type=points&view=chart last access May 17th, 2019

[4] https://data.worldbank.org/indicator/EG.ELC.HYRO.ZS?end=2014&start=1960&type=points&view=chart last access May 17th, 2019

[5] https://data.worldbank.org/indicator/EG.ELC.RNWX.ZS?type=points last access May 17th, 2019

[6] https://data.worldbank.org/indicator/EG.FEC.RNEW.ZS?type=points last access May 17th, 2019

[7] https://data.worldbank.org/indicator/EG.ELC.FOSL.ZS?end=2014&start=1960&type=points&view=chart last access May 17th, 2019

[8] https://data.worldbank.org/indicator/EG.USE.PCAP.KG.OE?type=points last access May 17th, 2019

[9] Olawoyin, A., & Chen, Y. (2018). Predicting the Future with Artificial Neural Network. Procedia Computer Science, 140, 383-392.

Intelligent and not quite rational existence

My editorial on You Tube

I continue with the topic of artificial intelligence. I am developing on the content of my last update in English: « Thinking Poisson, or ‘WTF are the other folks doing?’ ». I want to build a coherent line of reasoning about the rationale and the method of using a neural network as a prediction tool in social sciences. I feel that in order to do so, I need to step back and clearly articulate the sources of my fascination with neural networks. I remember the first time I used, still very clumsily, a very simple neural network algorithm (see « Ce petit train-train des petits signaux locaux d’inquiétude »). What fascinated me back then was the possibility of watching, from the outside, a thing, a logical thing, learn. It was as if I were observing someone finding their way by touch, blindfolded, except that this someone was a sequence of 6 equations.

Two years ago, I presented, at a conference, some empirical evidence that human civilisation has, as an essential trait, the maximization of energy absorption from the environment. In fact, the technological changes in our civilisation since 1960 have had the effect of increasing that absorption of energy. This is one of the intellectual paths I am passionate about. When I think about the different manifestations of biological life, every species maximizes its absorption of energy. We, humans, are no exception to this rule. In another article, I presented a creative application of the good old production function, as you can find it in the article by Charles Cobb and Paul Douglas, to the phenomenon of human societies adapting to their local environments, given the available amount of energy and food. The general conclusion I draw from the research presented in these two articles is that the existence of human societies is a story of intelligent, albeit imperfectly rational, learning at many levels. Not really original, you will say. Yes, not very original, but it gives inspiration and it stirs my curiosity.

Stories unfold. I am curious where this intelligent and not quite rational existence may lead us. It is logical. I am a researcher in social sciences and I try to predict, again and again, as I receive new information, what shape society is going to take in the future. How are we going to adapt to climate change? How can we stop or reverse that change? How will we behave, in Europe, if a food shortage on a continental scale occurs? What will tomorrow’s law be? Will it punish any verbal offence against anyone’s sensitivity? Will the law regulate access to drinking water? How will we vote in parliamentary elections 100 years from now? Will there be any parliamentary elections?

So many questions, and they provoke two types of attitude. “Who knows? There are so many variables at play that it is impossible to say anything even moderately reasonable” is the first. “Who knows? Let’s try to formulate hypotheses, for a start. Hypotheses give a starting point. Then we can assess the new information we will gain in the future in the light of those hypotheses, and understand a bit more of what is going on”. That is the second possible approach, and I subscribe to it. I am a researcher, science is my passion, I am curious and I prefer knowing to not knowing.

For practically a year now, I have been striving to work out a concept of financial business that I have named EneFin. In general, it is about stimulating the development of new sources of energy, mostly small local installations based on renewables, through a financial mechanism that combines a cooperative structure with typically capitalistic solutions, a bit like in crowdfunding. There is something strange about this idea, or rather about my attempts to develop it. At first sight, it seems attractive in its simplicity. When I set about describing and developing this idea, either as a business plan or as a scientific article, I bump against… Well, I don’t know exactly against what. There is something like a blockage in my brain. As I try to understand the nature of this blockage, it seems to be something like residual complexity. It is as if a part of my intellect kept telling me, again and again: “This thing is more complex than you think. You have not uncovered all the cards in this game. It is too early to present it as a ready-made idea. You need to keep searching and discovering before you present”.

EneFin is an essentially financial concept. Finance tends to work in feedback loops: phenomena which, just a moment ago, were the cause and the driving force of something become the effect of that same something. This is one of the reasons why classical stochastic methods, such as linear regression, give very unsatisfactory results when it comes to predicting financial markets. The stochastic method aims at finding a mathematical function that gives a mathematically coherent representation of empirical data, with as small a standard error as possible. Prediction, strictly speaking, consists in projecting that function into a possible and uncertain future. The quality of prediction is judged, in fact, after the event, thus when the future of yesteryear has become the past, even if only the immediate past, of the present. There is an assumption deeply hidden in this method: the assumption that we know everything there is to know.

The stochastic method requires saying openly that the sample of empirical data I use to trace a function is a representative sample. Following the logic of de Moivre-Laplace, my sample has stochastic value only when its arithmetic mean is identical to the mean observable in reality at large, or else is close enough to that real mean for the difference to be insignificant. Saying that my observation of reality is representative of that reality creates a special cognitive perspective, where I claim to know everything that is necessary to know about the world around me.

If you are working on a project and someone tells you “Go in direction A, I know perfectly well that I am right”, you will probably answer “With all due respect, no, you cannot know for sure whether you are right. Reality changes and surprises”. Here is the Achilles’ heel of the stochastic method. Although officially different from good old determinism, it retains some of its characteristics. With all its undeniable advantages, it is very exposed to the error of incomplete observation.

There is this joke about economics, that it is the art of making forecasts that do not hold. Cruel and exaggerated, that joke, yet frequently true. This is probably why a slightly different niche has developed in social sciences, one that draws on physical sciences and uses theoretical models such as Brownian motion or the Itô process. In this approach, the function of empirical data explicitly includes a component of random change.

A neural network goes in yet a slightly different direction. Instead of pooling all empirical observations and deriving a common function from them, a neural network experiments with small subsets of the complete sample. After each experiment, the network tests its capacity to obtain a result equal to a reference value. The outcome of this test is then used as additional information in subsequent experiments. Artificial intelligence enjoys the success it enjoys because we know that certain sequences of mathematical functions have the capacity to optimize real functions, for example the functioning of a floor-cleaning robot.
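
That loop of small experiments, tests against a reference value, and feedback can be sketched with a single neuron. This is a generic illustration, not my actual perceptron: I assume a hyperbolic tangent activation and the textbook delta rule for feeding the error back into the weights, and the five observations are made up so that a perfect answer exists.

```python
import math
import random

def train(samples, rounds=2000, learning_rate=0.1, seed=0):
    """One neuron learning by repeated small experiments.

    Each round: pick one observation, compute an output, compare it with
    the reference value, and feed the error back into the weights
    (delta rule with the derivative of tanh; an assumed, textbook choice).
    """
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in samples[0][0]]
    errors = []
    for _ in range(rounds):
        inputs, reference = rng.choice(samples)       # one small experiment
        output = math.tanh(sum(w * x for w, x in zip(weights, inputs)))
        error = reference - output                    # test against reference
        gradient = error * (1.0 - output ** 2)        # derivative of tanh
        weights = [w + learning_rate * gradient * x
                   for w, x in zip(weights, inputs)]  # feedback for next rounds
        errors.append(abs(error))
    return weights, errors

# Hypothetical observations; references come from a known function
points = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1), (0.3, 0.7), (0.7, 0.2)]
samples = [([a, b], math.tanh(0.8 * a - 0.4 * b)) for a, b in points]

weights, errors = train(samples)
print("mean error, first 100 rounds:", sum(errors[:100]) / 100)
print("mean error, last 100 rounds: ", sum(errors[-100:]) / 100)
```

Comparing the two printed averages is the simplest way of watching, from the outside, a logical thing learn.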

If a sequence of actions has the capacity to optimize itself, it behaves like the intelligence of a living organism: it learns. Here is the method I need in order to work thoroughly through my idea of a financial solution for renewable energies. Finance contains multiple feedback loops between the variables at play, which are a big problem for stochastic models. For a neural network, feedback loops are precisely what the artificial intelligence of the network is made for.

By the way, I have found an interesting article on the methodology of using neural networks as prediction tools, alternative or complementary to stochastic models. Olawoyin and Chen (2018[9]) discuss the predictive value of several possible architectures of a multi-layer perceptron. The predictive value is assessed by applying the perceptrons, on the one hand, and an ARIMA model, on the other hand, to the prediction of the same variables in the same sample of empirical data. The multi-layer perceptron does better than the stochastic model, whatever the exact conditions of the experiment. Olawoyin and Chen find two interesting things about the architecture of the neural network. Firstly, the perceptron based on the hyperbolic tangent as the neural activation function is generally more accurate in its prediction than the one based on the sigmoid function. Secondly, multiplying the layers of neurons in the perceptron does not translate directly into predictive value. In Olawoyin and Chen, the 3-layer network generally seems to do better than the 4-layer one.

It may be good for me to explain this business of layers. In an artificial neural network, a neuron is a mathematical function with a precise task to perform. Assigning random weighting coefficients to input variables is a function distinct from computing the output variable through a neural activation function. I thus have two distinct neurons: one that assigns the random coefficients, and another that computes the activation function. Logically, the latter needs the values created by the former, so the assignment of random coefficients is the preceding neural layer with respect to the computation of the activation function, which is therefore located in the following layer. Generally speaking, if equation A requires the result of equation B, equation B will be in the preceding layer and equation A will find its expression in the following layer. It is like in a brain: in order to contemplate the beauty of a painting by Cézanne I need to see it, so the neurons directly engaged in vision are in an upstream layer, and the neurons responsible for the chuckles of admiration make the following layer.
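
Here is that two-layer logic as a minimal sketch: one function plays the neuron that assigns random weights, another plays the neuron that computes the activation and cannot work before the first one has delivered. I run it with both the sigmoid and the hyperbolic tangent, the two activation functions mentioned above. The input values, the seed and the function names are hypothetical.

```python
import math
import random

def weighting_neuron(inputs, rng):
    """Preceding layer: attach a random weight in [0, 1) to each input."""
    return [rng.random() * x for x in inputs]

def activation_neuron(weighted_inputs, activation):
    """Following layer: needs the weighted values before it can fire."""
    return activation(sum(weighted_inputs))

def sigmoid(h):
    """Logistic function, output between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-h))

rng = random.Random(42)
weighted = weighting_neuron([0.4, 0.7, 0.1], rng)   # layer 1

out_sigmoid = activation_neuron(weighted, sigmoid)  # layer 2, variant 1
out_tanh = activation_neuron(weighted, math.tanh)   # layer 2, variant 2
print(out_sigmoid, out_tanh)
```

Swapping the activation function changes only the second layer; the first one does not even notice.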

Why talk about layers rather than single neurons? It is a discovery that even I, a neophyte barely initiated into the fundamentals of neural networks, already understand: when I place multiple neurons in the same functional layer of the network, I can put them in competition, i.e. the neurons of the following layer can choose between the different results supplied by the distinct neurons of the preceding layer. I started testing this in « Surpopulation sauvage ou compétition aux États-Unis ». Incidentally, I had then discovered more or less the same thing that Olawoyin and Chen (2018) present in their article: more complexity in the architecture of a neural network creates more possibilities rather than more predictive accuracy. When it comes to prediction strictly speaking, the simpler the network, the more accuracy it gives. Conversely, when it comes to formulating precise alternative hypotheses, more complexity broadens the repertoire of the perceptron’s possible behaviours and gives more scope in the description of alternative states of the same situation.

I keep supplying you with good science, almost new, just a bit dented in the process of conception. I remind you that you can download the business plan of the BeFund project (also accessible in the English version). You can also download my book entitled “Capitalism and Political Power”. I want to use crowdfunding to give myself a financial base in this effort. You can support my research financially, according to your best judgment, through my PayPal account. You can also register as my patron on my Patreon account. If you do so, I will be grateful if you tell me two important things: what kind of reward do you expect in exchange for your patronage, and what milestones would you like to see in my work? You can contact me through the mailbox of this blog: goodscience@discoversocialsciences.com .

[1] Olawoyin, A., & Chen, Y. (2018). Predicting the Future with Artificial Neural Network. Procedia Computer Science, 140, 383-392.

Thinking Poisson, or ‘WTF are the other folks doing?’

My editorial on You Tube

I think I have just put a nice label on all those ideas I have been rummaging in for the last 2 years. The last 4 months, during which I have been progressively initiating myself into artificial intelligence, have helped me to put it all in a nice frame. Here is the idea for a book, or rather for THE book, which I have been drafting for some time. « Our artificial intelligence »: this is the general title. The first big chapter, which might very well turn into the first book out of a whole series, will be devoted to energy and technological change. After that, I want to have a go at two other big topics: food and agriculture, then laws and institutions.

I explain. What does « Our artificial intelligence » mean? As I have been working with an initially simple algorithm of a neural network, and progressively developing it, I have understood a few things about the link between what we call, for lack of a better word, artificial intelligence, and the way my own brain works. No, not my brain. It would be an overstatement to say that I fully understand my own brain. My mind: this is the right expression. What I call « mind » is an idealized, i.e. linguistic, description of what happens in my nervous system. As I have been working with a neural network, I have discovered that the artificial intelligence that I make, and use, is a mathematical expression of my mind. I project my way of thinking into a set of mathematical expressions, made into an algorithmic sequence. When I run the sequence, I have the impression of dealing with something clever, yet slightly alien: an artificial intelligence. Still, when I stop staring at the thing, and start thinking about it scientifically (you know: initial observation, assumptions, hypotheses, empirical check, new assumptions and new hypotheses etc.), I become aware that the alien thing in front of me is just a projection of my own way of thinking.

This is important about artificial intelligence: this is our own, human intelligence, just seen from outside and projected into electronics. This particular point is an important piece of theory I want to develop in my book. I want to compile research in neurophysiology, especially in the neurophysiology of meaning, language, and social interactions, in order to give scientific clothes to that idea. When we sometimes ask ourselves whether artificial intelligence can eliminate humans, it boils down to asking: ‘Can human intelligence eliminate humans?’. Well, where I come from, i.e. Central Europe, the answer is certainly ‘yes, it can’. As a matter of fact, when I raise my head and look around, the same answer is true for any part of the world. Human intelligence can eliminate humans, and it can do so because it is human, not because it is ‘artificial’.

When I think about the meaning of the word ‘artificial’, it comes from the Latin ‘artificium’, which, in turn, designates something made with skill and demonstrable craft. Artificium means seasoned skills made into something durable so as to express those skills. Artificial intelligence is a crafty piece of work made with one of the big human inventions: mathematics. Artificial intelligence is mathematics at work. Really at work, i.e. not just as another idealization of reality, but as an actual tool. When I study the working of algorithms in neural networks, I have a vision of an architect in Ancient Greece, where the first mathematics we know of seem to come from. I have a wall and a roof, and I want them both to hold in balance, so what is the proportion between their respective lengths? I need to learn it by trial and error, as I haven’t any architectural knowledge yet. Although devoid of science, I have common sense, and I make small models of the building I want (have?) to erect, and I test various proportions. Some of those maquettes are more successful than others. I observe, I make my synthesis about the proportions which give the least error, and so I come up with something like the Pythagorean z² = x² + y², something like π = 3,14 etc., or something like the discovery that, for a given angle, the tangent proportion y/x always gives the same number, whatever the empirical lengths of y and x.

This is exactly what artificial intelligence does. It makes small models of itself, tests the error resulting from comparison between those models and something real, and generalizes the observation of those errors. Really: this is what a face recognition piece of software does at an airport, or what Google Ads does. This is human intelligence, just unloaded into a mathematical vessel. This is the first discovery that I have made about AI. Artificial intelligence is actually our own intelligence. Studying the way AI behaves allows seeing, like under a microscope, the workings of human intelligence.

The second discovery is that when I put a neural network to work with empirical data of social sciences, it produces strange, intriguing patterns, something like neighbourhoods of the actual reality. In my root field of research – namely economics – there is a basic concept that we, economists, use a lot and still wonder what it actually means: equilibrium. It is an old observation that networks of exchange in human societies tend to find balance in some precise proportions, for example proportions between demand, supply, price and quantity, or those between labour and capital.

Half of economic sciences is about explaining the equilibriums we can empirically observe. The other half employs itself at discarding what that first half comes up with. Economic equilibriums are something we know to exist, and whose mechanics we constantly try to understand, but those states of society remain obscure to a large extent. What we know is that networks of exchange are like machines: some designs just work, some others just don’t. One of the most important arguments in economic sciences is whether a given society can find many alternative equilibriums, i.e. whether it can use its resources optimally at many alternative proportions between economic variables, or, conversely, whether there is just one point of balance in a given place and time. From there on, it is a rabbit hole. What does ‘using our resources optimally’ mean? Is it when we have the lowest unemployment, or when we have just some healthy amount of unemployment? Theories are welcome.

When trying to make predictions about the future, using the apparatus of what can now be called classical statistics, social sciences always face the same dilemma: rigor vs cognitive depth. The most interesting correlations are usually somehow wobbly, and mathematical functions we derive from regression always leave a lot of residual errors.    

This is when AI can step in. Neural networks can be used as tools for optimization in digital systems. Still, they have another useful property: observing a neural network at work gives an insight into how intelligent structures optimize. If I want to understand how economic equilibriums take shape, I can observe a piece of AI producing many alternative combinations of the relevant variables. Here comes my third fundamental discovery about neural networks: with a few, otherwise quite simple assumptions built into the algorithm, AI can produce very different mechanisms of learning, and, consequently, a broad range of those weird, yet intellectually appealing, alternative states of reality. Here is an example: when I make a neural network observe its own numerical properties, such as its own kernel or its own fitness function, its way of learning changes dramatically. Sounds familiar? When you make a human being perform tasks, and you allow them to see the MRI of their own brain while performing those tasks, the actual performance changes.

When I want to talk about applying artificial intelligence, it is a good thing to return to the sources of my own experience with AI, and explain how it works. Some sequences of mathematical equations, when run recurrently many times, behave like intelligent entities: they experiment, they make errors, and after many repeated attempts they come up with a logical structure that minimizes the error. I am looking for a good, simple example from real life; a situation which I experienced personally, and which forced me to learn something new. Recently, I went to Marrakech, Morocco, and I had the kind of experience that most European first-timers have there: the Jemaa El Fna market place, its surrounding souks, and its merchants. The experience consists in finding your way out of the maze-like structure of the alleys adjacent to the Jemaa El Fna. You walk down an alley, you turn into another one, then into still another one, and what you notice only after quite a few such turns is that the whole architectural structure doesn’t follow AT ALL the European concept of urban geometry.

Thus, you face the length of an alley. You notice five lateral openings and you see a range of lateral passages. In a European town, most of those lateral passages would lead somewhere. A dead end is an exception, and passages between buildings are passages in the strict sense of the term: from one open space to another open space. At Jemaa El Fna, it’s different: most of the lateral ways lead into deep, dead-end niches, with more shops and stalls inside, yet some others open up into other alleys, possibly leading to the main square, or at least to a main street.

You pin down a goal: get back to the main square in less than… what? One full day? Just kidding. Let’s peg that goal down at 15 minutes. For lack of a good-quality drone, equipped with thermovision, flying over the whole structure of the souk and guiding you, you need to experiment. You need to test various routes out of the maze and to trace those which allow a time of x ≤ 15 minutes. If all the possible routes allowed you to get out to the main square in exactly 15 minutes, experimenting would be useless. There is a point in experimenting only if some of the possible routes yield a suboptimal outcome. You are facing a paradox: in order not to make (too many) errors in your future strolls across Jemaa El Fna, you need to make some errors when you learn how to stroll through.
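That paradox of learning through errors can be sketched as a toy simulation. Everything in it is an assumption made for illustration: the five route labels, their walking times, the number of attempts and the function name are hypothetical, and picking routes purely at random is the crudest possible form of experimenting, not a neural network.

```python
import random


def explore_routes(route_times, goal_minutes=15.0, attempts=20, seed=0):
    """Try routes at random, keep those meeting the goal, count the errors."""
    rng = random.Random(seed)        # fixed seed so the run is repeatable
    good_routes = {}                 # routes found to satisfy x <= goal
    errors = 0                       # attempts that took too long
    for _ in range(attempts):
        route = rng.choice(sorted(route_times))
        minutes = route_times[route]
        if minutes <= goal_minutes:
            good_routes[route] = minutes
        else:
            errors += 1
    return good_routes, errors


# Hypothetical walking times, in minutes, for five ways out of the souk.
route_times = {"A": 12.0, "B": 40.0, "C": 14.5, "D": 25.0, "E": 55.0}

good, errors = explore_routes(route_times)
print("routes worth remembering:", good)
print("errors made while learning:", errors)
```

If every route in `route_times` took exactly 15 minutes, `errors` would stay at zero and the exploration would teach nothing, which is the point made in the paragraph above.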

Now, imagine a fancy app in your smartphone, simulating the possible errors you can make when trying to find your way through the souk. You could watch an imaginary you, on the screen, wandering through the maze of alleys and dead-ends, learning by trial and error to drive the time of passage down to no more than 15 minutes. That would be interesting, wouldn’t it? You could see your possible errors from outside, and you could study the way you can possibly learn from them. Of course, you could always say: ‘it is not the real me, it is just a digital representation of what I could possibly do’. True. Still, I can guarantee you: whatever you say, however strong the grip you would try to keep on the actual, here-and-now you, you just couldn’t help being fascinated.

Is there anything more, beyond fascination, in observing ourselves making many possible future mistakes? Let’s think for a moment. I can see, somehow from outside, how a copy of me deals with the things of life. Question: how does the fact of seeing a copy of me trying to find a way through the souk differ from just watching a digital map of said souk, with GPS, such as Google Maps? I tried the latter, and I have two observations. Firstly, in some structures, such as that of maze-like alleys adjacent to Jemaa El Fna, seeing my own position on Google Maps is of very little help. I cannot put my finger on the exact reason, but my impression is that when the environment becomes just too bizarre for my cognitive capacities, having a bird’s eye view of it is virtually no good. Secondly, when I use Google Maps with GPS, I learn very little about my route. I just follow directions on the screen, and ultimately, I get out into the main square, but I know that I couldn’t reproduce that route without the device. Apparently, there is no way around learning stuff by myself: if I really want to learn how to move through the souk, I need to mess around with different possible routes. A device that allows me to see how exactly I can mess around looks like having some potential.

Question: how do I know that what I see, in that imaginary app, is a functional copy of me, and how can I assess the accuracy of that copy? This is, very largely, the rabbit hole I have been diving into for the last 5 months or so. The first path to follow is to look at the variables used. Artificial intelligence works with numerical data, i.e. with local instances of abstract variables. Similarity between the real me and the me reproduced as artificial intelligence is to be found in the variables used. In real life, variables are the kinds of things which: a) are correlated with my actions, both as outcomes and as determinants, and b) I care about, and yet I am not bound to be conscious of caring about.

Here comes another discovery I made on my journey through the realm of artificial intelligence: even if, in the simplest possible case, I just write the equations of my neural network so that they represent what I think is the way I think, and I drop some completely random values of the relevant variables into the first round of experimentation, the neural network produces something disquietingly logical and coherent. In other words, if I am even moderately honest in describing, in the form of equations, my way of apprehending reality, the AI I thus create really processes information in the way I would.

Another way of assessing the similarity between a piece of AI and myself is to compare the empirical data we use: I can make a neural network think more or less like me if I feed it with an accurate description of my so-far experience. In this respect, I discovered something that looks like a keystone in my intellectual structure: as I feed my neural network with more and more empirical data, the scope of the possible ways of learning something meaningful narrows down. When I minimise the amount of empirical data fed into the network, the latter can produce interesting, meaningful results via many alternative sequences of equations. As the volume of real-life information swells, some sequences of equations just naturally drop out of the game: they drive the neural network into a state of structural error, where it stops performing calculations.

At this point, I can see some similarity between AI and quantum physics. Quantum mechanics has grown as a methodology, as it proved to be exceptionally accurate in predicting the outcomes of experiments in physics. That accuracy was based on the capacity to formulate very precise hypotheses regarding empirical reality, and the capacity to increase the precision of those hypotheses through the addition of empirical data from past experiments.

Those fundamental observations I made about the workings of artificial intelligence have progressively brought me to use AI in social sciences. An analytical tool has become a topic of research for me. Happens all the time in science, mind you. Geometry, way back in the day, was a thoroughly practical set of tools, which served to make good boats, ships and buildings. With time, geometry has become a branch of science in its own right. In my case, it is artificial intelligence. It is a tool, essentially, invented back in the 1960s and 1970s, and developed over the last 20 years, and it serves practical purposes: facial identification, financial investment etc. Still, as I have been working with a very simple neural network for the last 4 months, and as I have been developing the logical structure of that network, I am discovering a completely new opening in my research in social sciences.

I am mildly obsessed with the topic of collective human intelligence. I have that deeply rooted intuition that collective human behaviour is always functional regarding some purpose. I perceive social structures such as financial markets or political institutions as something akin to endocrine systems in a body: a complex set of signals with a random component in their distribution, and yet a very coherent outcome. I follow up on that intuition by assuming that we, humans, are most fundamentally, collectively intelligent regarding our food and energy base. We shape our social structures according to the quantity and quality of available food and non-edible energy. For quite a while, I was struggling with the methodological issue of precise hypothesis-making. What states of human society can be posited as coherent hypotheses, possible to check or, failing that, to speculate about in an informed way?

The neural network I am experimenting with does precisely this: it produces strange, puzzling, complex states, defined by the quantitative variables I use. As I am working with that network, I have come to redefine the concept of artificial intelligence. A movie-based approach to AI is that it is fundamentally non-human. As I think about it step by step, AI is human, as it has been developed on the grounds of human logic. It is human meaning, and therefore an expression of human neural wiring. It is just selective in its scope. Natural human intelligence has no other way of comprehending but comprehending IT ALL, i.e. the whole of perceived existence. Artificial intelligence is limited in scope: it works just with the data we assign it to work with. AI can really afford not to give a f**k about something otherwise important. AI is focused in the strict sense of the term.

During that recent stay in Marrakech, Morocco, I was observing people around me and their ways of doing things. As is my habit, I am patterning human behaviour. I am connecting the dots about the ways of using energy (for the moment, I haven’t seen any making of energy) and food. I am patterning the urban structure around me and the way people live in it.

Superbly kept gardens and buildings marked by a sense of instability. Human generosity combined with somehow erratic behaviour in the same humans. Of course, women are fully dressed, from head to toe, but surprisingly enough, men too. With close to 30 degrees Celsius outside, most local dudes are dressed like a Polish guy would dress at 10 degrees Celsius. They dress for the heat as I would dress for noticeable cold. Exquisitely fresh and firm fruit and vegetables are a surprise. After having visited Croatia, on the Southern coast of Europe, I would rather expect those tomatoes to be soft and somehow past due. Still, they are excellent. Loads of sugar in very nearly everything. Meat is scarce and tough. All that has been already described and explained by many a researcher, wannabe researchers included. I think about those things around me as local instances of a complex logical structure: a collective intelligence able to experiment with itself. I wonder what other, hypothetical forms this collective intelligence could take, close to the actually observable reality, as well as at some distance from it.

The idea I can see burgeoning in my mind is that I can understand better the actual reality around me if I use some analytical tool to represent slight hypothetical variations in said reality. Human behaviour first. What exactly makes me perceive Moroccans as erratic in their behaviour, and how can I represent it in the form of artificial intelligence? Subjectively perceived erraticism is a perceived dissonance between sequences. I expect a certain sequence to happen in other people’s behaviour. The sequence that really happens is different, and possibly more differentiated than what I expect to happen. When I perceive the behaviour of Moroccans as erratic, does it connect functionally with their ways of making and using food and energy?  

A behavioural sequence is marked by a certain order of actions, and a timing. In a given situation, humans can pick their behaviour from a total basket of Z = {a1, a2, …, az} possible actions. These, in turn, can combine into zPk = z!/(z – k)! = (1*2*…*z) / [1*2*…*(z – k)] possible permutations of k component actions. Each such permutation happens with a certain frequency. The way a human society works can be described as a set of frequencies in the happening of those zPk permutations. Well, that’s exactly what a neural network such as mine can do. It operates with values standardized between 0 and 1, and these can be very easily interpreted as frequencies of happening. I have a variable named ‘energy consumption per capita’. When I use it in the neural network, I routinely standardize each empirical value over the maximum of this variable in the entire empirical dataset. Still, standardization can convey a bit more of a mathematical twist and can be seen as the density of probability under the curve of a statistical distribution.
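A quick way to check the counting formula z!/(z – k)! is to enumerate the sequences directly. The sketch below is only an illustration: the four action labels and the choice of k = 2 are hypothetical, and the function name is mine.

```python
import math
from itertools import permutations


def count_sequences(z, k):
    """Number of ordered sequences of k distinct actions out of z: z!/(z-k)!."""
    return math.factorial(z) // math.factorial(z - k)


actions = ["a1", "a2", "a3", "a4"]   # hypothetical basket of z = 4 actions
k = 2                                # length of a behavioural sequence

# Enumerate every ordered sequence of k distinct actions.
sequences = list(permutations(actions, k))

print(count_sequences(len(actions), k))   # count given by the formula
print(len(sequences))                     # count obtained by enumeration
```

Both counts agree, and attaching an observed frequency to each element of `sequences` would give the kind of description of a society that the paragraph above has in mind.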

When I feel like giving such a twist, I can make my neural network stroll down different avenues of intelligence. I can assume that all kinds of things happen, and all those things are sort of densely packed one next to the other, and some of those things are sort of more expected than others, and thus I can standardize my variables under the curve of the normal distribution. Alternatively, I can see each empirical instance of each variable in my database as a rare event in an interval of time, and then I standardize under the curve of the Poisson distribution. A quick check with the database I am using right now brings an important observation: the same empirical data standardized with a Poisson distribution becomes much more disparate as compared to the same data standardized with the normal distribution. When I use Poisson, I lead my neural network to divide empirical data sharply into important stuff on the one hand, and all the rest, not even worth bothering about, on the other hand.

I am giving an example. Here comes energy consumption per capita in Ecuador (1992) = 629,221 kg of oil equivalent (koe), Slovak Republic (2000) = 3 292,609 koe, and Portugal (2003) = 2 400,766 koe. These are three different states of human society, characterized by a certain level of energy consumption per person per year. They are different. I can choose between three different ways of making sense out of their disparity. I can see them quite simply as ordinals on a scale of magnitude, i.e. I can standardize them as fractions of the greatest energy consumption in the whole sample. When I do so, they become: Ecuador (1992) =  0,066733839, Slovak Republic (2000) =  0,349207223, and Portugal (2003) =  0,254620211.
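Standardizing over the maximum is a plain division, as the sketch below shows. One loud assumption: the sample maximum is not quoted in the text, so the value of about 9 428,8 koe used here is back-computed from the three fractions given above, and it should be read as an inference rather than as a figure from the dataset.

```python
def standardize_over_max(value, sample_max):
    """Express a value as a fraction of the greatest value in the sample."""
    return value / sample_max


# ASSUMPTION: back-computed from the quoted fractions, not quoted in the text.
SAMPLE_MAX_KOE = 9428.8

# Energy consumption per capita, in kg of oil equivalent, as quoted above.
observations = {
    "Ecuador (1992)": 629.221,
    "Slovak Republic (2000)": 3292.609,
    "Portugal (2003)": 2400.766,
}

standardized = {
    name: standardize_over_max(value, SAMPLE_MAX_KOE)
    for name, value in observations.items()
}

for name, share in standardized.items():
    print(f"{name}: {share:.6f}")
```

Under that assumed maximum, the three fractions come out within rounding distance of the ones given in the text.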

In an alternative worldview, I can perceive those three different situations as neighbourhoods of an expected average energy consumption, in the presence of an average, standard deviation from that expected value. In other words, I assume that it is normal that countries differ in their energy consumption per capita, as well as it is normal that years of observation differ in that respect. I am thinking normal distribution, and then my three situations come as: Ecuador (1992) = 0,118803134, Slovak Republic (2000) = 0,556341893, and Portugal (2003) = 0,381628627.
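Standardizing under the normal curve can be read as taking the cumulative probability of each observation, given the mean and the standard deviation of the whole sample. The sketch below assumes that reading. The text does not report the two parameters, so the mean of roughly 3 007 koe and the standard deviation of roughly 2 014 koe are my own back-computation from the three standardized values quoted above.

```python
from statistics import NormalDist

# ASSUMPTION: both parameters are back-computed from the quoted results;
# neither is stated in the text.
SAMPLE_MEAN_KOE = 3007.3
SAMPLE_STDEV_KOE = 2013.6

distribution = NormalDist(mu=SAMPLE_MEAN_KOE, sigma=SAMPLE_STDEV_KOE)

# Energy consumption per capita, in kg of oil equivalent, as quoted above.
observations = {
    "Ecuador (1992)": 629.221,
    "Slovak Republic (2000)": 3292.609,
    "Portugal (2003)": 2400.766,
}

# Cumulative probability of observing a value no greater than each one.
normal_standardized = {
    name: distribution.cdf(value) for name, value in observations.items()
}

for name, probability in normal_standardized.items():
    print(f"{name}: {probability:.4f}")
```

With those assumed parameters the three probabilities fall close to the values in the paragraph above; an observation exactly at the mean would be standardized to 0,5.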

I can adopt an even more convoluted approach. I can assume that energy consumption in each given country is the outcome of a unique, hardly reproducible process of local adjustment. Each country, with its energy consumption per capita, is a rare event. Seen from this angle, my three empirical states of energy consumed per capita could occur with the probability of the Poisson distribution, estimated with the whole sample of data. With this specific take on the thing, my three empirical values become: Ecuador (1992) = 0, Slovak Republic (2000) = 0,999999851, and Portugal (2003) = 9,4384E-31.
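The sharp split produced by the Poisson approach can be reproduced with a cumulative Poisson probability. The sketch below rests on several assumptions of mine: that the standardized values are cumulative probabilities, that each observation is truncated to a whole number first, and that the expected value is a sample mean of about 3 007 koe, which is back-computed rather than quoted in the text.

```python
import math


def poisson_cdf(x, lam):
    """Cumulative Poisson probability P(X <= floor(x)) for expected value lam.

    Each term is computed in logarithms, so that very large factorials
    never have to be evaluated directly.
    """
    k = math.floor(x)
    log_lam = math.log(lam)
    total = 0.0
    for i in range(k + 1):
        total += math.exp(i * log_lam - lam - math.lgamma(i + 1))
    return min(total, 1.0)


# ASSUMPTION: back-computed sample mean, not a figure from the text.
SAMPLE_MEAN_KOE = 3007.0

# Energy consumption per capita, in kg of oil equivalent, as quoted above.
observations = {
    "Ecuador (1992)": 629.221,
    "Slovak Republic (2000)": 3292.609,
    "Portugal (2003)": 2400.766,
}

poisson_standardized = {
    name: poisson_cdf(value, SAMPLE_MEAN_KOE)
    for name, value in observations.items()
}

for name, probability in poisson_standardized.items():
    print(f"{name}: {probability:.6g}")
```

Because the standard deviation of a Poisson distribution is only the square root of its mean, about 55 koe here, anything a few hundred koe away from the mean is pushed to practically 0 or practically 1, which is the all-or-nothing pattern visible in the three values above.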

I come back to Morocco. I perceive some behaviours in Moroccans as erratic. I think I tend to think Poisson distribution. I expect some very tightly defined, rare event of behaviour, and when I see none around, I discard everything else as completely not fitting the bill. As I think about it, I guess most of our human intelligence is Poisson-based. We think ‘good vs bad’, ‘edible vs not food’, ‘friend vs foe’ etc.

I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide to do so, I will be grateful if you tell me two things that Patreon suggests I ask you. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?