Sketching quickly alternative states of nature

My editorial on You Tube

I am thinking about a few things, as usual, and, as usual, it is a laborious process. The first one is a big one: what the hell am I doing what I am doing for? I mean, what’s the purpose and the point of applying artificial intelligence to simulating collective intelligence? There is one particular issue that I am entertaining in this regard: the experimental check. A neural network can help me formulate very precise hypotheses as to how a given social structure can behave. Yet, these are hypotheses. How can I have them checked?

Here is an example. Together with a friend, we are doing some research about the socio-economic development of big cities in Poland, with a view to seeing them turn into so-called ‘smart cities’. We came to an interesting set of hypotheses generated by a neural network, but we have a tiny little problem: we propose, in the article, a financial scheme for cities but we don’t quite understand why we propose this exact scheme. I know it sounds idiotic, but well: it is what it is. We have an idea, and we don’t know exactly where that idea came from.

I have already discussed the idea in itself on my blog, in « Locally smart. Case study in finance. »: a local investment fund, created by the local government, to finance local startup businesses. Business means investment, especially at the aggregate scale and in the long run. This is how business works: I invest, and I have (hopefully) a return on my investment. If there is more and more private business popping up in those big Polish cities, and, at the same time, local governments are backing off from investment in fixed assets, let’s make those business people channel capital towards the same type of investment that local governments are withdrawing from. What we need is an institutional scheme where local governments financially fuel local startup businesses, and those businesses implement investment projects.

I am going to try and deconstruct the concept, sort of backwards. I am sketching the landscape, i.e. the piece of empirical research that brought us to formulating the whole idea of an investment fund paired with crowdfunding. Big Polish cities show an interesting pattern of change: local populations, whilst largely stagnating demographically, are becoming more and more entrepreneurial, which is observable as an increasing number of startup businesses per 10 000 inhabitants. On the other hand, local governments (city councils) are spending a consistently decreasing share of their budgets on infrastructural investment. There is more and more business going on per capita, and, at the same time, local councils seem to be slowly backing off from investment in infrastructure. The cities we studied for this phenomenon are: Wroclaw, Lodz, Krakow, Gdansk, Kielce, Poznan, Warsaw.

More specifically, the concept tested through the neural network consists in selecting, each year, 5% of the most promising local startups, and funding each of them with €80 000. The logic behind this concept is that when a phenomenon becomes more and more frequent (and this is the case of startups in big Polish cities), an interesting strategy is to fish out, consistently, the ‘crème de la crème’ from among those frequent occurrences. It is as if we were soccer promoters in a country where more and more young people start playing at a competitive level. A viable strategy consists, in such a case, in selecting, over and over again, the most promising players from the top of the heap and promoting them further.

Thus, in that hypothetical scheme, the local investment fund selects and supports the most promising from amongst the local startups. Mind you, that 5% rate of selection is just an idea. It could be 7% or 3% just as well. A number had to be picked, in order to simulate the whole thing with a neural network, which I present further. The 5% rate can be seen as an intuitive transference from Student’s t-test of significance in statistics. When you test a correlation for its significance, with Student’s t-test, you commonly assume that at least 95% of all the observations under scrutiny are covered by that correlation, and you can tolerate a 5% margin of outliers and fringe cases. I suppose this is why we picked, intuitively, that 5% rate of selection among the local startups: 5% sounds just about right to delineate the subset of most original ideas.

Anyway, the basic idea consists in creating a local investment fund controlled by the local government, and this fund would provide a standard capital injection of €80 000 to 5% of the most promising local startups. The absolute number STF (i.e. financed startups) those 5% translate into can be calculated as: STF = 5% * (N/10 000) * ST10 000, where N is the population of the given city, and ST10 000 is the coefficient of startup businesses per 10 000 inhabitants. Just to give you an idea of what it looks like empirically, I am presenting data for Krakow (KR, my hometown) and Warsaw (WA, the Polish capital), in 2008 and 2017, which I designate, respectively, as STF(city_acronym; 2008) and STF(city_acronym; 2017). It goes like:

STF(KR; 2008) = 5% * (754 624/ 10 000) * 200 = 755

STF(KR; 2017) = 5% * (767 348/ 10 000) * 257 = 986

STF(WA; 2008) = 5% * (1709781/ 10 000) * 200 = 1 710

STF(WA; 2017) = 5% * (1764615/ 10 000) * 345 = 3 044   
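The four figures above can be re-computed with a few lines of Python. This is only a sketch of the STF formula as stated in the text; the function name and the dictionary layout are my own illustrative choices, not part of the original research.

```python
def financed_startups(population, startups_per_10k, selection_rate=0.05):
    """STF = selection rate * (N / 10 000) * ST10 000, as defined in the text."""
    return selection_rate * (population / 10_000) * startups_per_10k


# (city, year) -> (population N, startups per 10 000 inhabitants), taken from the text
observations = {
    ("KR", 2008): (754_624, 200),
    ("KR", 2017): (767_348, 257),
    ("WA", 2008): (1_709_781, 200),
    ("WA", 2017): (1_764_615, 345),
}

for (city, year), (population, st_10k) in observations.items():
    # Rounded to whole startups, as in the figures quoted above
    print(city, year, round(financed_startups(population, st_10k)))
```

Changing `selection_rate` to 0.03 or 0.07 shows how the number of financed startups would move if a different rate of selection were picked.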

That glimpse of empirics allows guessing why we applied a neural network to that whole thing: the two core variables, namely population and the coefficient of startups per 10 000 people, can change with a lot of autonomy vis-à-vis each other. In the whole sample that we used for basic stochastic analysis, thus 7 cities from 2008 through 2017, which equals 70 observations, those two variables are Pearson-correlated at r = 0,6267. There is some significant correlation, and yet, with r2 ≈ 0,39, some 61% of observable variance in each of those variables doesn’t give a f**k about the variance of the other variable. The covariance of these two seems to be dominated by the variability in population rather than by uncertainty as for the average number of startups per 10 000 people.

What we have is quite a predictable trend of growing propensity to entrepreneurship, combined with a bit of randomness in demographics. Those two can come in various duos, and their duos tend to be actually trios, ‘cause we have that other thing, which I already mentioned: investment outlays of local governments and the share of those outlays in the overall local budgets. Our (my friend’s and mine) intuitive take on that picture was that it is really interesting to know the different ways those Polish cities can go in the future, rather than setting one central model. I mean, the central stochastic model is interesting too. It says, for example, that the natural logarithm of the number of startups per 10 000 inhabitants, whilst being negatively correlated with the share of investment outlays in the local government’s budget, is positively correlated with the absolute amount of those outlays. The more a local government spends on fixed assets, the more startups it can expect per 10 000 inhabitants. That latter variable is subject to some kind of scale effects from the part of the former. Interesting. I like scale effects. They are intriguing. They show phenomena which change in a way akin to what happens when I heat up a pot full of water: the more heat I supply to the water, the more different kinds of stuff can happen. We call it an increase in the number of degrees of freedom.

The stochastically approached degrees of freedom in the coefficient of startups per 10 000 inhabitants, you can see them in Table 1, below. The ‘Ln’ prefix means, of course, natural logarithms. Further below, I return to the topic of collective intelligence in this specific context, and to using artificial intelligence to simulate the thing.

Table 1

Explained variable: Ln(number of startups per 10 000 inhabitants); R2 = 0,608; N = 70

Explanatory variable | Coefficient of regression | Standard error | Significance level
Ln(investment outlays of the local government) | -0,093 | 0,048 | p = 0,054
Ln(total budget of the local government) | 0,565 | 0,083 | p < 0,001
Ln(population) | -0,328 | 0,09 | p < 0,001
Constant | -0,741 | 0,631 | p = 0,245

I take the correlations from Table 1, thus the coefficients of regression from the first numerical column, and I check their credentials with the significance level from the last numerical column. As I want to understand them as real, actual things that happen in the cities studied, I recreate the real values. We are talking about coefficients of startups per 10 000 people, comprised somewhere between the observable minimum ST10 000 = 140 and the maximum equal to ST10 000 = 345, with a mean at ST10 000 = 223. In terms of natural logarithms, that world folds into something between ln(140) = 4,941642423 and ln(345) = 5,843544417, with the expected mean at ln(223) = 5,407171771. Standard deviation Ω from that mean can be reconstructed from the standard error, which is calculated as s = Ω/√N, and, consequently, Ω = s*√N. In this case, with N = 70, standard deviation Ω = 0,631*√70 = 5,279324767.
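For readers who want to re-run that arithmetic, here is a minimal sketch in Python. It only reproduces the numbers quoted in the paragraph above, using the same reconstruction Ω = s*√N; the variable names are mine and purely illustrative.

```python
import math

N = 70                  # observations: 7 cities x 10 years
STANDARD_ERROR = 0.631  # standard error of the constant, from Table 1

# Natural logarithms of the minimum, maximum and mean of startups per 10 000 people
ln_min = math.log(140)
ln_max = math.log(345)
ln_mean = math.log(223)

# Standard deviation reconstructed from the standard error: omega = s * sqrt(N)
omega = STANDARD_ERROR * math.sqrt(N)

print(f"ln(140) = {ln_min:.9f}")
print(f"ln(345) = {ln_max:.9f}")
print(f"ln(223) = {ln_mean:.9f}")
print(f"omega   = {omega:.9f}")
```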

That regression is interesting to the extent that it leads to an absurd prediction. If the population of a city shrinks asymptotically down to zero, and if, at the same time, the budget of the local government swells up to infinity, the occurrence of entrepreneurial behaviour (number of startups per 10 000 inhabitants) will tend towards infinity as well. There is that nagging question: how the hell can the budget of a local government expand when its tax base (the population) is collapsing? I am an economist and I am supposed to answer questions like that.
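That absurd limit is easy to see when the regression from Table 1 is written down as a function. The sketch below uses the coefficients from Table 1; the input values (budget, investment outlays, population) are hypothetical placeholders that I made up for illustration, and the units of the original dataset are not stated here, so only the direction of change should be read from it.

```python
import math

# Coefficients of regression from Table 1 (log-log model)
CONSTANT = -0.741
B_INVESTMENT = -0.093
B_BUDGET = 0.565
B_POPULATION = -0.328


def predicted_startups_per_10k(investment, budget, population):
    """Back-transform the log-linear regression into startups per 10 000 inhabitants."""
    ln_startups = (
        CONSTANT
        + B_INVESTMENT * math.log(investment)
        + B_BUDGET * math.log(budget)
        + B_POPULATION * math.log(population)
    )
    return math.exp(ln_startups)


# Hypothetical baseline city (illustrative values only, not data from the study)
baseline = predicted_startups_per_10k(investment=6e8, budget=4e9, population=760_000)

# Same city with half the population and twice the budget
shrinking_and_rich = predicted_startups_per_10k(investment=6e8, budget=8e9, population=380_000)

print(baseline, shrinking_and_rich)
```

Halving the population and doubling the budget over and over again keeps pushing the prediction upwards without any bound, which is exactly the paradox described above.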

Before being an economist, I am a scientist. I ask embarrassing questions and then I have to invent a way to give an answer. Those stochastic results I have just presented make me think of a somewhat haphazard set of correlations. Such correlations can be called dynamic, and this, in turn, makes me think about the swarm theory and collective intelligence (see Yang et al. 2013[1] or « What are the practical outcomes of those hypotheses being true or false? »). A social structure, for example that of a city, can be seen as a community of agents reactive to some systemic factors, similarly to ants or bees being reactive to pheromones they produce and dump into their social space. Ants and bees are amazingly intelligent collectively, whilst, let’s face it, they are bloody stupid singlehandedly. Ever seen a bee trying to figure things out in the presence of a window? Well, not only can a swarm of bees get that s**t down easily, but they can also invent a way of nesting in and exploiting the whereabouts of the window. The thing is that a bee has its nervous system programmed to behave smartly mostly in social interactions with other bees.

I have already developed on the topic of money and capital being a systemic factor akin to a pheromone (see « Technological change as monetary a phenomenon »). Now, I am walking down this avenue again. What if city dwellers react, through entrepreneurial behaviour or the lack thereof, to a certain concentration of budgetary spending from the local government? What if the budgetary money has two chemical hooks on it, one hook observable as ‘current spending’ and the other signalling ‘investment’, and what if the reaction of inhabitants depends on the kind of hook switched on, in the given million of euros (or rather Polish zlotys, or PLN, as we are talking about Polish cities)?

I am returning, for a moment, to the negative correlation between the headcount of population, on the one hand, and the occurrence of new businesses per 10 000 inhabitants, on the other hand. Cities (at least those 7 Polish cities that my friend and I did our research on) are finite spaces. Fewer people in the city means fewer people per 1 km2 and vice versa. Hence, the occurrence of entrepreneurial behaviour is negatively correlated with the density of population. A behavioural pattern emerges. The residents of big cities in Poland develop entrepreneurial behaviour in response to a greater concentration of current budgetary spending by local governments, and to a lower density of population. On the other hand, a greater density of population or less money spent as current payments from the local budget act as inhibitors of entrepreneurship. Mind you, a greater density of population means a greater need for infrastructure (yes, those humans tend to crap and charge their smartphones all over the place), whence a greater pressure on the local governments to spend money in the form of investment in fixed assets, whence the negative correlation, secondary in its force, between entrepreneurial behaviour and investment outlays from local budgets.

This is a general, behavioural hypothesis. Now, the cognitive challenge consists in translating the general idea into empirical hypotheses as precise as possible. What precise states of nature can happen in those cities? This is when artificial intelligence (a neural network) can serve, and this is when I finally understand where that idea of an investment fund came from. A neural network is good at producing plausible combinations of values in a pre-defined set of variables, and this is what we need if we want to formulate precise hypotheses. Still, a neural network is made for learning. If I want the thing to make those hypotheses for me, I need to give it a purpose, i.e. a variable to optimize, and let it learn as it is optimizing.

In social sciences, entrepreneurial behaviour is assumed to be a good thing. When people recurrently start new businesses, they are in a generally go-getting frame of mind, and this carries over into social activism, into the formation of institutions etc. In an initial outburst of neophyte enthusiasm, I might program my neural network so as to optimize the coefficient of startups per 10 000 inhabitants. There is a catch, though. When I tell a neural network to optimize a variable, it takes the most likely value of that variable, thus, stochastically, its arithmetical average, and it keeps recombining all the other variables so as to have this one nailed down, as close to that most likely value as possible. Therefore, if I want a neural network to imagine relatively high occurrences of entrepreneurial behaviour, I shouldn’t set said behaviour as the outcome variable. I should mix it with others, as an input variable. It is very human, by the way. You brace for achieving a goal, you struggle the s**t out of yourself, and you discover, with negative amazement, that instead of moving forward, you are actually repeating the same existential pattern over and over again. You can set your personal compass, though, on just doing a good job and having fun with it, and then, something strange happens. Things get done, sort of without you even noticing when and how. Goals get nailed down even without being phrased explicitly as goals. And you are having fun with the whole thing, i.e. with life.

Same for artificial intelligence, as it is, as a matter of fact, an artful expression of our own, human intelligence: it produces the most interesting combinations of variables as a by-product of optimizing something boring. Thus, I want my neural network to optimize on something not-necessarily-fascinating and see what it can do in terms of people and their behaviour. Here comes the idea of an investment fund. As I have been racking my brains in search of the place where that idea had come from, I finally understood: an investment fund is both an institutional scheme and a metaphor. As a metaphor, it allows decomposing an aggregate stream of investment into a set of more or less autonomous projects, and decisions attached thereto. An investment fund is a set of decisions coordinated in a dynamically correlated manner: yes, there are ways and patterns to those decisions, but there is a lot of autonomous figuring-out-the-thing in each individual case.
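To make the logic of ‘optimize something boring, keep the interesting variable among the inputs’ more tangible, here is a minimal sketch of a single-neuron perceptron in plain Python. It is not the network used in the research: the delta-rule training, the variable names and all the numbers are assumptions made up for illustration. Entrepreneurship sits among the inputs, and a hypothetical, dull outcome (scaled outlays of the fund) is what the neuron learns to hit.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_perceptron(rows, targets, rounds=500, rate=0.5):
    """Train one sigmoid neuron with the delta rule; return weights and error per round."""
    weights = [0.0] * (len(rows[0]) + 1)  # the last weight is the bias
    errors = []
    for _ in range(rounds):
        total_error = 0.0
        for inputs, target in zip(rows, targets):
            z = sum(w * v for w, v in zip(weights, inputs)) + weights[-1]
            output = sigmoid(z)
            error = target - output
            gradient = error * output * (1.0 - output)  # error times sigmoid derivative
            for i, value in enumerate(inputs):
                weights[i] += rate * gradient * value
            weights[-1] += rate * gradient
            total_error += abs(error)
        errors.append(total_error / len(rows))  # mean absolute error in this round
    return weights, errors


# Hypothetical observations, all scaled to the 0..1 range:
# (startups per 10 000, population, share of investment in the local budget)
rows = [(0.1, 0.9, 0.8), (0.4, 0.7, 0.6), (0.6, 0.4, 0.5), (0.9, 0.2, 0.3)]
# Hypothetical 'boring' outcome variable: scaled outlays of the investment fund
targets = [0.2, 0.4, 0.6, 0.8]

weights, errors = train_perceptron(rows, targets)
print("error in first round:", errors[0])
print("error in last round: ", errors[-1])
```

The combinations of input values that the neuron passes through while driving the error down are the raw material for alternative states of nature, under the assumptions stated above.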

Thus, if I want to put functionally together those two social phenomena – investment channelled by local governments and entrepreneurial behaviour in local population – an investment fund is a good institutional vessel to that purpose. Local government invests in some assets, and local homo sapiens do the same in the form of startups. What if we mix them together? What if the institutional scheme known as public-private partnership becomes something practiced serially, as a local market for ideas and projects?

When we were designing that financial scheme for local governments, my friend and I had the idea of dropping a bit of crowdfunding into the cooking pot, and, as strange as it could seem, we are a bit confused as to where this idea came from. Why did we think about crowdfunding? If I want to understand how a piece of artificial intelligence simulates collective intelligence in a social structure, I need to understand what kind of logical connections I had projected into the neural network. Crowdfunding is sort of spontaneous. When I am having a look at the typical conditions proposed by businesses crowdfunded at Kickstarter or at StartEngine, these are shitty contracts, with all due respect. Having a Master’s in law, when I look at the contracts offered to investors in those schemes, I wouldn’t sign such a contract if I had any room for negotiation. I wouldn’t even sign a contract the way I am supposed to sign it via a crowdfunding platform.

There is quite a strong piece of legal and business science to claim that crowdfunding contracts are a serious disruption to the established contractual patterns (Savelyev 2017[2]). Crowdfunding largely rests on the so-called smart contracts, i.e. agreements written and signed as software on blockchain-based platforms. Those contracts are unusually flexible, as each amendment, be it general or specific, can be hash-coded into the history of the individual contractual relation. That puts a large part of legal science on its head. The basic intuition of any trained lawyer is that we negotiate the s**t out of ourselves before the signature of the contract, thus before the formulation of general principles, and anything that happens later is just secondary. With smart contracts, we are pretty relaxed when it comes to setting the basic skeleton of the contract. We just put the big bones in, and expect we’re gonna make up the more sophisticated stuff as we go along.

With the abundant usage of smart contracts, crowdfunding platforms have peculiar legal flexibility. Today you sign up for having a discount of 10% on one Flower Turbine, in exchange for £400 in capital crowdfunded via a smart contract. Next week, you learn that you can turn your 10% discount on one turbine into 7% on two turbines if you drop just £100 more into that pig coin. Already the first step (£400 against the discount of 10%) would be a bit hard to squeeze into classical contractual arrangements as for investing in the equity of a business, let alone the subsequent amendment (Armour, Enriques 2018[3]).

Yet, with a smart contract on a crowdfunding platform, anything is just a few clicks away, and, as astonishing as it could seem, the whole thing works. The click-based smart contracts are actually enforced and respected. People do sign those contracts, and moreover, when I mentally step out of my academic lawyer’s shoes, I admit being tempted to sign such a contract too. There is a specific behavioural pattern attached to crowdfunding, something like the Russian ‘Davaj, riebiata!’ (‘Давай, ребята!’ in the original spelling). ‘Let’s do it together! Now!’, that sort of thing. It is almost as if I were giving someone the power of attorney to be entrepreneurial on my behalf. If people in big Polish cities found more and more startups, per 10 000 residents, it is a more and more recurrent manifestation of entrepreneurial behaviour, and crowdfunding touches the very heart of entrepreneurial behaviour (Agrawal et al. 2014[4]). It is entrepreneurship broken into small, tradable units. The whole concept we invented is generally placed in the European context, and in Europe crowdfunding is way below the popularity it has reached in North America (Rupeika-Apoga, Danovi 2015[5]). As a matter of fact, European entrepreneurs seem to consider crowdfunding as really a secondary source of financing.

Time to sum up a bit all those loose thoughts. Using a neural network to simulate collective behaviour of human societies involves a few deep principles, and a few tricks. When I study a social structure with classical stochastic tools and I encounter strange, apparently paradoxical correlations between phenomena, artificial intelligence may serve. My intuitive guess is that a neural network can help in clarifying what is sometimes called ‘background correlations’ or ‘transitive correlations’: variable A is correlated with variable C through the intermediary of variable B, i.e. A is significantly correlated with B, and B is significantly correlated with C, but the correlation between A and C remains insignificant.
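A toy illustration of such a transitive correlation, in Python, may help. The three short series below are invented purely to show the structure: B is built as the sum of A and C, so it correlates with each of them, whilst A and C are uncorrelated with each other. With only four points, nothing here says anything about statistical significance; the `pearson` helper is my own, written out so the sketch depends on nothing beyond the standard library.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Invented series: A and C vary independently, B is driven by both
A = [1, -1, 1, -1]
C = [1, 1, -1, -1]
B = [a + c for a, c in zip(A, C)]

r_ab = pearson(A, B)
r_bc = pearson(B, C)
r_ac = pearson(A, C)
print(f"r(A,B) = {r_ab:.4f}, r(B,C) = {r_bc:.4f}, r(A,C) = {r_ac:.4f}")
```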

When I started to use a neural network in my research, I realized how important it is to formulate very precise and complex hypotheses rather than definitive answers. Artificial intelligence allows sketching alternative states of nature quickly, by gazillions. For a moment, I am leaving the topic of those financial solutions for cities, and I return to my research on energy, more specifically on energy efficiency. In a draft article I wrote last autumn, I started to study the relative impact of the velocity of money, as well as that of the speed of technological change, upon the energy efficiency of national economies. Initially, I approached the thing in a nicely and classically stochastic way. I came up with conclusions of the type: ‘variance in the supply of money makes 7% of the observable variance in energy efficiency, and the correlation is robust’. Good, this is a step forward. Still, in practical terms, what does it give? Does it mean that we need to add money to the system in order to have greater energy efficiency? Might well be the case, only you don’t add money to the system just like that, ‘cause most of said money is account money on current bank accounts, and the current balances of those accounts reflect the settlement of obligations resulting from complex private contracts. There is no government that could possibly add more complex contracts to the system.

Thus, stochastic results, whilst looking and sounding serious and scientific, have only a remote connection to practical applications. On the other hand, if I take the same empirical data and feed it into a neural network, I get alternative states of nature, and those states are bloody interesting. Artificial intelligence can show me, for example, what happens to energy efficiency if a social system is more or less conservative in its experimenting with itself. In short, artificial intelligence allows super-fast simulation of social experiments, and that simulation is theoretically robust.

I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. You can communicate with me directly, via the mailbox of this blog: goodscience@discoversocialsciences.com. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful for your answers to two questions that Patreon suggests I ask you. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?


[1] Yang, X. S., Cui, Z., Xiao, R., Gandomi, A. H., & Karamanoglu, M. (2013). Swarm intelligence and bio-inspired computation: theory and applications.

[2] Savelyev, A. (2017). Contract law 2.0: ‘Smart’ contracts as the beginning of the end of classic contract law. Information & Communications Technology Law, 26(2), 116-134.

[3] Armour, J., & Enriques, L. (2018). The promise and perils of crowdfunding: Between corporate finance and consumer contracts. The Modern Law Review, 81(1), 51-84.

[4] Agrawal, A., Catalini, C., & Goldfarb, A. (2014). Some simple economics of crowdfunding. Innovation Policy and the Economy, 14(1), 63-97.

[5] Rupeika-Apoga, R., & Danovi, A. (2015). Availability of alternative financial resources for SMEs as a critical part of the entrepreneurial eco-system: Latvia and Italy. Procedia Economics and Finance, 33, 200-210.

Lean, climbing trends


Our artificial intelligence: the working title of my research, for now. Volume 1: Energy and technological change. I am doing a little bit of rummaging in available data, just to make sure I keep contact with reality. Here comes a metric: access to electricity in the world, measured as the % of total human population[1]. The trend line looks proudly ascending. In 2016, 87,38% of mankind had at least one electric socket in their place. Ten years earlier, by the end of 2006, the figure was 81,2%. Optimistic. Looks like something growing almost linearly. Another one: « Electric power transmission and distribution losses »[2]. This one looks different: instead of a clear trend, I observe something shaking and oscillating, with the width of variance narrowing gently down, as time passes. By the end of 2014 (last data point in this dataset), we were globally at 8,25% of electricity lost in transmission. The lowest coefficient of loss occurred in 1998: 7,13%.

I move from distribution to production of electricity, and to its percentage supplied from nuclear power plants[3]. Still another shape, that of a steep bell with surprisingly lean edges. Initially, it was around 2% of global electricity supplied by nuclear power. At the peak of fascination, it was 17,6%, and at the end of 2014, we went down to 10,6%. The thing seems to be temporarily stable at this level. As I move to water, and to the percentage of electricity derived from hydro[4], I see another type of change: a deeply serrated, generally descending trend. In 1971, we had 20,2% of our total global electricity from hydro, and by the end of 2014, we were at 16,24%. In the meantime, it looked like a rollercoaster. Yet, as I am having a look at other renewables (i.e. other than hydroelectricity) and their share in the total supply of electricity[5], the shape of the corresponding curve looks like a snake, trying to figure something out about a vertical wall. Between 1971 and 1988, the share of those other renewables in the total electricity supplied moved from 0,25% to 0,6%. Starting from 1989, it is an almost perfectly exponential growth, to reach 6,77% in 2015.

Just to have a complete picture, I shift slightly, from electricity to energy consumption as a whole, and I check the global share of renewables therein[6]. Surprise! This curve does not behave at all as it is expected to behave, after having seen the previously cited share of renewables in electricity. Instead of a snake sniffing a wall, we can see a snake seen from above, or something like a meandering river. This seems to be a cycle over some 25 years (could it be Kondratiev’s?), with a peak around 18% of renewables in the total consumption of energy, and a trough somewhere around 16,9%. Right now, we seem to be close to the peak.

I am having a look at the big, ugly brother of hydro: the oil, gas and coal sources of electricity and their share in the total amount of electricity produced[7]. Here, I observe a different shape of change. Between 1971 and 1986, the fossils dropped their share from 62% to 51,47%. Then, it rockets back up to 62% in 1990. Later, a slowly ascending trend starts, just to reach a peak, and oscillate for a while around some 65 to 67% between 2007 and 2011. Since then, the fossils have been dropping again: the short-term trend is descending.

Finally, one of the basic metrics I have been using frequently in my research on energy: the final consumption thereof, per capita, measured in kilograms of oil equivalent[8]. Here, we are back in the world of relatively clear trends. This one is ascending, with some bumps on the way, though. In 1971, we were at 1336,2 koe per person per year. In 2014, it was 1920,655 koe.

Thus, what are all those curves telling me? I can see three clearly different patterns. The first is the ascending trend, observable in the access to electricity, in the consumption of energy per capita, and, since the late 1980s, in the share of electricity derived from renewable sources. The second is a cyclical variation: share of renewables in the overall consumption of energy, to some extent the relative importance of hydroelectricity, as well as that of nuclear power. Finally, I can observe a descending trend in the relative importance of nuclear power since 1988, as well as in some episodes from the life of hydroelectricity, coal and oil.
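To put rough numbers on the ascending trends, the endpoints quoted above can be turned into average rates of change. The sketch below, in Python, uses only the start and end values cited in this post; it ignores everything that happened in between, so it is a back-of-the-envelope check rather than a fitted trend, and the function names are my own.

```python
def annual_step(start_value, end_value, years):
    """Average change per year, in the units of the variable (linear reading)."""
    return (end_value - start_value) / years


def compound_annual_growth(start_value, end_value, years):
    """Constant yearly growth rate linking the two endpoints (exponential reading)."""
    return (end_value / start_value) ** (1 / years) - 1


# Access to electricity: 81,2% of population in 2006, 87,38% in 2016
access_step = annual_step(81.2, 87.38, 2016 - 2006)

# Renewables other than hydro in electricity: 0,6% in 1988, 6,77% in 2015
renewables_growth = compound_annual_growth(0.6, 6.77, 2015 - 1988)

# Energy use per capita: 1336,2 koe in 1971, 1920,655 koe in 2014
energy_growth = compound_annual_growth(1336.2, 1920.655, 2014 - 1971)

print(f"access to electricity: {access_step:.3f} percentage points per year")
print(f"other renewables in electricity: {renewables_growth:.2%} per year")
print(f"energy use per capita: {energy_growth:.2%} per year")
```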

On top of that, I can distinguish different patterns in, respectively, the production of energy, on the one hand, and its consumption, on the other hand. The latter seems to change along relatively predictable, long-term paths. The former looks like a set of parallel, and partly independent experiments with different sources of energy. We are collectively intelligent: I deeply believe that. I mean, I hope. If bees and ants can be smarter collectively than singlehandedly, there is some potential in us as well.

Thus, I am progressively designing a collective intelligence, which experiments with various sources of energy, just to produce those two relatively lean, climbing trends: more energy per capita and an ever-growing percentage of capitae with access to electricity. Which combinations of variables can produce a rationally desired energy efficiency? How is the supply of money changing as we reach different levels of energy efficiency? Can artificial intelligence make energy policies? Empirical check: take a real energy policy and build a neural network which reflects the logical structure of that policy. Then add a method of learning and see what it produces as a hypothetical outcome.

What is the cognitive value of hypotheses made with a neural network? The answer to this question starts with another question: how do hypotheses made with a neural network differ from any other set of hypotheses? The hypothetical states of nature produced by a neural network reflect the outcomes of logically structured learning. The process of learning should represent real social change and real collective intelligence. There are four most important distinctions I have observed so far, in this respect: a) awareness of internal cohesion b) internal competition c) relative resistance to new information and d) perceptual selection (different ways of standardizing input data).

The awareness of internal cohesion, in a neural network, is a function that feeds into the consecutive experimental rounds of learning the information on relative cohesion (Euclidean distance) between variables. We assume that each variable used in the neural network reflects a sequence of collective decisions in the corresponding social structure. Cohesion between variables represents the functional connection between sequences of collective decisions. Awareness of internal cohesion, as a logical attribute of a neural network, corresponds to situations when societies are aware of how mutually coherent their different collective decisions are. The lack of logical feedback on internal cohesion represents situations when societies do not have that internal awareness.
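One possible way to compute such a cohesion signal is sketched below in Python. It is an assumption on my part about how the Euclidean distance between variables could be aggregated, namely as the average pairwise distance between the standardized series of each variable; the function names and the sample values are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equally long series of values."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_cohesion_distance(variables):
    """Average pairwise distance between variables; lower means more cohesive."""
    pairs = list(combinations(variables.values(), 2))
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)


# Hypothetical standardized observations of three variables over four periods
variables = {
    "energy_per_capita": [0.70, 0.74, 0.79, 0.83],
    "access_to_electricity": [0.81, 0.83, 0.85, 0.87],
    "share_of_renewables": [0.17, 0.17, 0.18, 0.18],
}

cohesion = mean_cohesion_distance(variables)
print(f"mean Euclidean distance between variables: {cohesion:.4f}")
```

In a network with awareness of internal cohesion, a value like `cohesion` would be fed back as an additional input in the next round of learning; without that awareness, it would simply be left out.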

I metaphorically look around and ask myself what awareness I have of important collective decisions in my local society. I can observe and pattern people’s behaviour, for one. Next thing: I can read (very literally) the formalized, official information regarding legal issues. On top of that, I can study (read, mostly) quantitatively formalized information on measurable attributes of the society, such as GDP per capita, supply of money, or emissions of CO2. Finally, I can have that semi-formalized information from what we call “media”, whatever prefix they come with: mainstream media, social media, rebel media, the-only-true-media etc.

As I look back upon my own life and the changes which I have observed on those four levels of social awareness, the fourth one, namely the media, has been, and still is, the biggest game changer. I remember the cultural earthquake in 1990 and later, when, after decades of state-controlled media in communist Poland, we suddenly had a free press and complete freedom of publishing. Man! It was like one of those moments when you step out of a calm, dark alleyway right into the middle of heavy traffic in the street. Information, it just whizzed past.

There is something about media, both those called ‘mainstream’, and the modern platforms like Twitter or You Tube: they adapt to their audience, and the pace of that adaptation is accelerating. With Twitter, it is obvious: when I log into my account, I can see the Tweets only from people and organizations whom I specifically subscribed to observe. With You Tube, on my starting page, I can see the subscribed channels, for one, and a ton of videos suggested by artificial intelligence on the grounds of what I watched in the past. Still, the mainstream media go down the same avenue. When I go to bbc.com, the types of news presented are very largely what the editorial team hopes will max out on clicks per hour, which, in turn, is based on the types of news that totalled the most clicks in the past. The same was true for printed newspapers, 20 years ago: the stuff that got into the headlines was the kind of stuff that made sales.

Thus, when I simulate collective intelligence of a society with a neural network, the function allowing the network to observe its own, internal cohesion seems to be akin to the presence of media platforms. Actually, I have already observed, many times, that adding this specific function to a multi-layer perceptron (a type of neural network) makes that perceptron less cohesive. Looks like a paradox: observing the relative cohesion between its own decisions makes a piece of AI less cohesive. Still, real life confirms that observation. Social media favour the phenomenon known as « echo chamber »: if I want, I can expose myself only to the information that minimizes my cognitive dissonance and cut myself off from anything that pumps my adrenaline up. On a large scale, this behavioural pattern produces a galaxy of relatively small groups encapsulated in highly distilled, mutually incoherent worldviews. Have you ever wondered what it would be like to use GPS navigation to find your way, in the company of a hardcore flat-Earther?

When I run my perceptron over samples of data regarding the energy efficiency of national economies, including the function of feedback on the so-called fitness function is largely equivalent to simulating a society with abundant media activity. The absence of such feedback is, on the other hand, like a society without much of a media sector.

Internal competition, in a neural network, is the deep underlying principle for structuring a multi-layer perceptron into separate layers, and manipulating the number of neurons in each layer. Let’s suppose I have two neural layers in a perceptron: A, and B, in this exact order. If I put three neurons in the layer A, and one neuron in the layer B, the one in B will be able to choose between the 3 signals sent from the layer A. Seen from the A perspective, each neuron in A has to compete against the two others for the attention of the single neuron in B. Choice on one end of a synapse equals competition on the other end.
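That two-layer example can be sketched in a few lines of Python. This is just an illustration under my own assumptions: the function names are hypothetical, I use the hyperbolic tangent as the activation of the single neuron in B, and I measure ‘attention’ as the share of each neuron from A in the total absolute input received by B.

```python
import math

def layer_b_neuron(signals, weights):
    """Single neuron in layer B: weighted sum of the layer-A signals, through tanh."""
    weighted = [s * w for s, w in zip(signals, weights)]
    return math.tanh(sum(weighted)), weighted

def attention_shares(weighted):
    """Share of each layer-A neuron in the total absolute input of the B neuron."""
    total = sum(abs(x) for x in weighted)
    if total == 0:
        return [0.0 for _ in weighted]
    return [abs(x) / total for x in weighted]

# Three neurons in A send their signals to the one neuron in B.
output, weighted = layer_b_neuron([0.2, 0.5, 0.9], [0.5, 0.5, 0.5])
shares = attention_shares(weighted)  # the three shares sum up to 1
```

As the shares always sum up to 1, any gain in attention for one neuron in A is a loss for the two others: this is the competition seen from the A perspective.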

When I want to introduce choice in a neural network, I need to introduce internal competition as well. If any neuron is to have a choice between processing input A and its rival, input B, there must be at least two distinct neurons, A and B, in a functionally distinct, preceding neural layer. In a collective intelligence, choice requires competition, and there seems to be no way around it. In a real brain, neurons form synaptic sequences, which means that the great majority of our neurons fire because other neurons have fired beforehand. We very largely think because we think, not because something really happens out there. Neurons in charge of early-stage collection of sensory data compete for the attention of our brain stem, which, in turn, proposes its pre-selected information to the limbic system, and the emotional exultation of the latter incites the cortical areas to think about the whole thing. From there, further cortical activity happens just because other cortical activity has been happening so far.

I propose a quick self-check: think about what you are thinking right now, and ask yourself how much of what you are thinking about is really connected to what is happening around you. Are you thinking a lot about the gradient of temperature close to your skin? No, not really? Really? Are you giving a lot of conscious attention to the chemical composition of the surface you are touching right now with your fingertips? Not really a lot of conscious thinking about this one either? Now, how much conscious attention are you devoting to what [fill in the blank] said about [fill in the blank], yesterday? Quite a lot of attention, isn’t it?

The point is that some ideas die out, in us, quickly and sort of silently, whilst others are tough survivors and keep popping up to the surface of our awareness. Why? How does it happen? What if there is some kind of competition between synaptic paths? Thoughts, or components thereof, that win one stage of the competition pass to the next, where they compete again.           

Internal competition requires complexity. There needs to be something to compete for, a next step in the chain of thinking. A neural network with internal competition reflects a collective intelligence with internal hierarchies that offer rewards. Interestingly, there is research showing that greater complexity gives more optimizing accuracy to a neural network, but just as long as we are talking about really low complexity, like 3 layers of neurons instead of two. As complexity is further developed, accuracy decreases noticeably. Complexity is not the best solution for optimization: see Olawoyin and Chen (2018[9]).

Relative resistance to new information corresponds to the way that an intelligent structure deals with cognitive dissonance. In order to have any cognitive dissonance whatsoever, we need at least two pieces of information: one that we have already appropriated as our knowledge, and the new stuff, which could possibly disturb the placid self-satisfaction of the I-already-know-how-things-work. Cognitive dissonance is a potent factor of stress in human beings as individuals, and in whole societies. Galileo would have a few words to say about it. Question: how to represent in a mathematical form the stress connected to cognitive dissonance? My provisional answer is: by division. Cognitive dissonance means that I consider my acquired knowledge as more valuable than new information. If I want to decrease the importance of B in relation to A, I divide B by a factor greater than 1, whilst leaving A as it is. The denominator of new information is supposed to grow over time: I am more resistant to the really new stuff than I am to the already slightly processed information, which was new yesterday. In a more elaborate form, I can use the exponential progression (see The really textbook-textbook exponential growth).
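A minimal sketch of that division, in Python, could look as follows. It rests on my own assumptions: the names `resisted` and `blend` are hypothetical, the denominator grows exponentially with the number of the experimental round, and the rate of 0.05 is an arbitrary value chosen for illustration.

```python
import math

def resisted(new_information, round_number, rate=0.05):
    """Divide new information by a denominator that grows with the round number."""
    # For any round_number >= 1 and rate > 0 the denominator is greater than 1.
    denominator = math.exp(rate * round_number)
    return new_information / denominator

def blend(acquired_knowledge, new_information, round_number, rate=0.05):
    """Acquired knowledge (A) stays as it is; new information (B) is discounted."""
    return acquired_knowledge + resisted(new_information, round_number, rate)
```

With these assumptions, `resisted(1.0, 10)` is smaller than `resisted(1.0, 1)`: the later the round, the fresher the information, and the stronger the resistance to it.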

I noticed an interesting property of the neural network I use for studying energy efficiency. When I introduce choice, internal competition and hierarchy between neurons, the perceptron gets sort of wild: it produces increasing error instead of decreasing error, so it basically learns how to swing more between possible states, rather than how to narrow its own trial and error down to one recurrent state. When I add a pinchful of resistance to new information, i.e. when I purposefully create stress in the presence of cognitive dissonance, the perceptron calms down a bit, and can produce a decreasing error.   

Selection of information can occur already at the level of primary perception. I elaborated on this one in « Thinking Poisson, or ‘WTF are the other folks doing?’ ». Let’s suppose that new science comes as for how to use particular sources of energy. We can imagine two scenarios of reaction to that new science. On the one hand, the society can react in a perfectly flexible way, i.e. each new piece of scientific research gets evaluated as for its real utility for energy management, and gets smoothly included into the existing body of technologies. On the other hand, the same society (well, not quite the same, an alternative one) can sharply distinguish those new pieces of science into ‘useful stuff’ and ‘crap’, with little nuance in between.
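Those two scenarios can be sketched as two ways of standardizing the same input data. The Python below is an illustration under my own assumptions: the function names are hypothetical, the input values are assumed to be positive, and the threshold of 0.5 separating ‘useful stuff’ from ‘crap’ is arbitrary.

```python
def flexible_perception(values):
    """Nuanced perception: each value is standardized over the maximum."""
    top = max(values)
    return [v / top for v in values]

def sharp_perception(values, threshold=0.5):
    """Sharp perception: each value ends up as 1.0 ('useful stuff') or 0.0 ('crap')."""
    return [1.0 if v >= threshold else 0.0 for v in flexible_perception(values)]
```

For the same three pieces of new science, valued at 2, 5 and 10, the flexible society perceives 0.2, 0.5 and 1.0, whilst the sharp one perceives 0, 1 and 1, with nothing in between.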

What do we know about collective learning and collective intelligence? Three essential traits come to my mind. Firstly, we make social structures, i.e. recurrent combinations of social relations, and those structures tend to be quite stable. We like having stable social structures. We almost instinctively create rituals, rules of conduct, enforceable contracts etc., thus we make stuff that is supposed to make the existing stuff last. An unstable social structure is prone to wars, coups etc. Our collective intelligence values stability. Still, stability is not the same as perfect conservatism: our societies have imperfect recall. This is the second important trait. Over (long periods of) time we collectively shake off, and replace, old rules of social games with new rules, and we do it without disturbing the fundamental social structure. In other words: stable as they are, our social structures have mechanisms of adaptation to new conditions, and yet those mechanisms require forgetting something about our past. OK, not just forgetting something: we collectively forget a shitload of something. Thirdly, there have been many local human civilisations, and each of them eventually collapsed, i.e. their fundamental social structures disintegrated. The civilisations we have made so far had a limited capacity to learn. Sooner or later, they would bump against a challenge which they were unable to adapt to. The mechanism of collective forgetting and shaking off, in every historically documented case, had a limited efficiency.

I intuitively guess that simulating collective intelligence with artificial intelligence is likely to be the most fruitful when we simulate various capacities to learn. I think we can model something like a perfectly adaptable collective intelligence, i.e. the one which has no cognitive dissonance and processes information uniformly over time, whilst having a broad range of choice and internal competition. Such a neural network behaves in the opposite way to what we tend to associate with AI: instead of optimizing and narrowing down the margin of error, it creates new alternative states, possibly in a broadening range. This is a collective intelligence with lots of capacity to learn, but little capacity to steady itself as a social structure. From there, I can muzzle the collective intelligence with various types of stabilizing devices, making it progressively more and more structure-making, and less flexible. Down that avenue lies the solver type of artificial intelligence, i.e. a neural network that just solves a problem, with one, temporarily optimal solution.

I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. You can communicate with me directly, via the mailbox of this blog: goodscience@discoversocialsciences.com. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful if you suggest two things that Patreon suggests I ask you about. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?


[1] https://data.worldbank.org/indicator/EG.ELC.ACCS.ZS last access May 17th, 2019

[2] https://data.worldbank.org/indicator/EG.ELC.LOSS.ZS?end=2016&start=1990&type=points&view=chart last access May 17th, 2019

[3] https://data.worldbank.org/indicator/EG.ELC.NUCL.ZS?end=2014&start=1960&type=points&view=chart last access May 17th, 2019

[4] https://data.worldbank.org/indicator/EG.ELC.HYRO.ZS?end=2014&start=1960&type=points&view=chart last access May 17th, 2019

[5] https://data.worldbank.org/indicator/EG.ELC.RNWX.ZS?type=points last access May 17th, 2019

[6] https://data.worldbank.org/indicator/EG.FEC.RNEW.ZS?type=points last access May 17th, 2019

[7] https://data.worldbank.org/indicator/EG.ELC.FOSL.ZS?end=2014&start=1960&type=points&view=chart last access May 17th, 2019

[8] https://data.worldbank.org/indicator/EG.USE.PCAP.KG.OE?type=points last access May 17th, 2019

[9] Olawoyin, A., & Chen, Y. (2018). Predicting the Future with Artificial Neural Network. Procedia Computer Science, 140, 383-392.

Thinking Poisson, or ‘WTF are the other folks doing?’


I think I have just put a nice label on all those ideas I have been rummaging in for the last 2 years. The last 4 months, when I have been progressively initiating myself into artificial intelligence, have helped me to put it all in a nice frame. Here is the idea for a book, or rather for THE book, which I have been drafting for some time. « Our artificial intelligence »: this is the general title. The first big chapter, which might very well turn into the first book out of a whole series, will be devoted to energy and technological change. After that, I want to have a go at two other big topics: food and agriculture, then laws and institutions.

I explain. What does « Our artificial intelligence » mean? As I have been working with an initially simple algorithm of a neural network, and I have been progressively developing it, I understood a few things about the link between what we call, for lack of a better word, artificial intelligence, and the way my own brain works. No, not my brain. It would be an overstatement to say that I fully understand my own brain. My mind, this is the right expression. What I call « mind » is an idealized, i.e. linguistic description of what happens in my nervous system. As I have been working with a neural network, I have discovered that the artificial intelligence that I make, and use, is a mathematical expression of my mind. I project my way of thinking into a set of mathematical expressions, made into an algorithmic sequence. When I run the sequence, I have the impression of dealing with something clever, yet slightly alien: an artificial intelligence. Still, when I stop staring at the thing, and start thinking about it scientifically (you know: initial observation, assumptions, hypotheses, empirical check, new assumptions and new hypotheses etc.), I become aware that the alien thing in front of me is just a projection of my own way of thinking.

This is important about artificial intelligence: this is our own, human intelligence, just seen from outside and projected into electronics. This particular point is an important piece of theory I want to develop in my book. I want to compile research in neurophysiology, especially in the neurophysiology of meaning, language, and social interactions, in order to give scientific clothes to that idea. When we sometimes ask ourselves whether artificial intelligence can eliminate humans, it boils down to asking: ‘Can human intelligence eliminate humans?’. Well, where I come from, i.e. Central Europe, the answer is certainly ‘yes, it can’. As a matter of fact, when I raise my head and look around, the same answer is true for any part of the world. Human intelligence can eliminate humans, and it can do so because it is human, not because it is ‘artificial’.

When I think about the meaning of the word ‘artificial’, it comes from the Latin ‘artificium’, which, in turn, designates something made with skill and demonstrable craft. Artificium means seasoned skills made into something durable so as to express those skills. Artificial intelligence is a crafty piece of work made with one of the big human inventions: mathematics. Artificial intelligence is mathematics at work. Really at work, i.e. not just as another idealization of reality, but as an actual tool. When I study the working of algorithms in neural networks, I have a vision of an architect in Ancient Greece, where the first mathematics we know seem to be coming from. I have a wall and a roof, and I want them both to hold in balance, so what is the proportion between their respective lengths? I need to learn it by trial and error, as I haven’t any architectural knowledge yet. Although devoid of science, I have common sense, and I make small models of the building I want (have?) to erect, and I test various proportions. Some of those maquettes are more successful than others. I observe, I make my synthesis about the proportions which give the least error, and so I come up with something like the Pythagorean z² = x² + y², something like π ≈ 3.14 etc., or something like the discovery that, for a given angle, the tangent proportion y/x always makes the same number, whatever the empirical lengths of y and x.

This is exactly what artificial intelligence does. It makes small models of itself, tests the error resulting from comparison between those models and something real, and generalizes the observation of those errors. Really: this is what a face recognition piece of software does at an airport, or what Google Ads does. This is human intelligence, just unloaded into a mathematical vessel. This is the first discovery that I have made about AI. Artificial intelligence is actually our own intelligence. Studying the way AI behaves allows seeing, like under a microscope, the workings of human intelligence.
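That loop of small models, errors and generalization can be shown with the simplest learner I can think of: one artificial neuron which learns a proportion y/x by trial and error, a bit like the architect with his maquettes. This is a sketch under my own assumptions, not any particular piece of software: the function name, the learning rate of 0.1 and the 50 rounds are all chosen just for illustration.

```python
def learn_proportion(pairs, rounds=50, learning_rate=0.1):
    """Learn the proportion y/x from (x, y) pairs by repeated error correction."""
    weight = 0.0   # the initial small model: 'y is zero times x'
    errors = []    # average absolute error observed in each round
    for _ in range(rounds):
        total_error = 0.0
        for x, y in pairs:
            guess = weight * x                   # test the current model
            error = y - guess                    # compare with something real
            weight += learning_rate * error * x  # correct the model accordingly
            total_error += abs(error)
        errors.append(total_error / len(pairs))
    return weight, errors

weight, errors = learn_proportion([(1.0, 0.5), (2.0, 1.0), (3.0, 1.5)])
```

With those three observations, the weight settles close to 0.5 and the error shrinks from round to round. The neuron has never been told the proportion; it only generalized from its own errors.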

The second discovery is that when I put a neural network to work with empirical data of social sciences, it produces strange, intriguing patterns, something like neighbourhoods of the actual reality. In my root field of research – namely economics – there is a basic concept that we, economists, use a lot and still wonder what it actually means: equilibrium. It is an old observation that networks of exchange in human societies tend to find balance in some precise proportions, for example proportions between demand, supply, price and quantity, or those between labour and capital.

Half of economic sciences is about explaining the equilibriums we can empirically observe. The other half employs itself at discarding what that first half comes up with. Economic equilibriums are something we know exists, and whose mechanics we constantly try to understand, but those states of society remain obscure to a large extent. What we know is that networks of exchange are like machines: some designs just work, some others just don’t. One of the most important arguments in economic sciences is whether a given society can find many alternative equilibriums, i.e. whether it can use its resources optimally at many alternative proportions between economic variables, or, conversely, whether there is just one point of balance in a given place and time. From there on, it is a rabbit hole. What does ‘using our resources optimally’ mean? Is it when we have the lowest unemployment, or when we have just some healthy amount of unemployment? Theories are welcome.

When trying to make predictions about the future, using the apparatus of what can now be called classical statistics, social sciences always face the same dilemma: rigor vs cognitive depth. The most interesting correlations are usually somehow wobbly, and mathematical functions we derive from regression always leave a lot of residual errors.    

This is when AI can step in. Neural networks can be used as tools for optimization in digital systems. Still, they have another useful property: observing a neural network at work allows having an insight into how intelligent structures optimize. If I want to understand how economic equilibriums take shape, I can observe a piece of AI producing many alternative combinations of the relevant variables. Here comes my third fundamental discovery about neural networks: with a few, otherwise quite simple assumptions built into the algorithm, AI can produce very different mechanisms of learning, and, consequently, a broad range of those weird, yet intellectually appealing, alternative states of reality. Here is an example: when I make a neural network observe its own numerical properties, such as its own kernel or its own fitness function, its way of learning changes dramatically. Sounds familiar? When you make human beings perform tasks, and you allow them to see the MRI of their own brain when performing those tasks, the actual performance changes.

When I want to talk about applying artificial intelligence, it is a good thing to return to the sources of my own experience with AI, and explain how it works. Some sequences of mathematical equations, when run recurrently many times, behave like intelligent entities: they experiment, they make errors, and after many repeated attempts they come up with a logical structure that minimizes the error. I am looking for a good, simple example from real life; a situation which I experienced personally, and which forced me to learn something new. Recently, I went to Marrakech, Morocco, and I had the kind of experience that most European first-timers have there: the Jemaa El Fna market place, its surrounding souks, and its merchants. The experience consists in finding your way out of the maze-like structure of the alleys adjacent to the Jemaa El Fna. You walk down an alley, you turn into another one, then into still another one, and what you notice only after quite a few such turns is that the whole architectural structure doesn’t follow AT ALL the European concept of urban geometry.

Thus, you face the length of an alley. You notice five lateral openings and you see a range of lateral passages. In a European town, most of those lateral passages would lead somewhere. A dead end is an exception, and passages between buildings are passages in the strict sense of the term: from one open space to another open space. At Jemaa El Fna, it’s different: most of the lateral ways lead into deep, dead-end niches, with more shops and stalls inside, yet some others open up into other alleys, possibly leading to the main square, or at least to a main street.

You pin down a goal: get back to the main square in less than… what? One full day? Just kidding. Let’s peg that goal down at 15 minutes. For lack of a good-quality drone, equipped with thermovision, flying over the whole structure of the souk, and guiding you, you need to experiment. You need to test various routes out of the maze and to keep track of those which allow a time of x ≤ 15 minutes. If all the possible routes allowed you to get out to the main square in exactly 15 minutes, experimenting would be useless. There is a point in experimenting only if some of the possible routes yield a suboptimal outcome. You are facing a paradox: in order not to make (too many) errors in your future strolls across Jemaa El Fna, you need to make some errors when you learn how to stroll through.
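That kind of experimenting can be sketched as a toy simulation in Python. Everything in it is my assumption, made up for illustration: the tiny map `SOUK` with its two dead-end niches, the 3 minutes that each move takes, and the purely random choice of the next passage. It has nothing to do with the real layout around Jemaa El Fna.

```python
import random

# Hypothetical toy souk: each spot lists the spots reachable from it.
SOUK = {
    "start": ["alley_1", "niche_1"],
    "niche_1": ["start"],                       # dead end: the only way is back
    "alley_1": ["start", "alley_2", "niche_2"],
    "niche_2": ["alley_1"],                     # another dead end
    "alley_2": ["alley_1", "square"],
    "square": [],
}
MINUTES_PER_MOVE = 3  # assumption: every move takes the same time

def stroll(rng, limit=15):
    """One random walk from the start, stopped at the square or at the time limit."""
    position, route, minutes = "start", ["start"], 0
    while position != "square" and minutes + MINUTES_PER_MOVE <= limit:
        position = rng.choice(SOUK[position])
        route.append(position)
        minutes += MINUTES_PER_MOVE
    return route, minutes, position == "square"

def experiment(trials=200, seed=1):
    """Repeat the stroll many times and keep only the routes that made it in time."""
    rng = random.Random(seed)
    successes = []
    for _ in range(trials):
        route, minutes, reached = stroll(rng)
        if reached:
            successes.append((minutes, route))
    return successes
```

Most of the simulated strolls end up wasted in a niche or in going back and forth, and this is exactly the point: the routes kept by `experiment()` are known to be good only because the other ones were tried and failed.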

Now, imagine a fancy app in your smartphone, simulating the possible errors you can make when trying to find your way through the souk. You could watch an imaginary you, on the screen, wandering through the maze of alleys and dead-ends, learning by trial and error to drive the time of passage down to no more than 15 minutes. That would be interesting, wouldn’t it? You could see your possible errors from outside, and you could study the way you can possibly learn from them. Of course, you could always say: ‘it is not the real me, it is just a digital representation of what I could possibly do’. True. Still, I can guarantee you: whatever you say, however strong the grip you would try to keep on the actual, here-and-now you, you just couldn’t help being fascinated.

Is there anything more, beyond fascination, in observing ourselves making many possible future mistakes? Let’s think for a moment. I can see, somehow from outside, how a copy of me deals with the things of life. Question: how does the fact of seeing a copy of me trying to find a way through the souk differ from just watching a digital map of said souk, with GPS, such as Google Maps? I tried the latter, and I have two observations. Firstly, in some structures, such as that of maze-like alleys adjacent to Jemaa El Fna, seeing my own position on Google Maps is of very little help. I cannot put my finger on the exact reason, but my impression is that when the environment becomes just too bizarre for my cognitive capacities, having a bird’s eye view of it is virtually no good. Secondly, when I use Google Maps with GPS, I learn very little about my route. I just follow directions on the screen, and ultimately, I get out into the main square, but I know that I couldn’t reproduce that route without the device. Apparently, there is no way around learning stuff by myself: if I really want to learn how to move through the souk, I need to mess around with different possible routes. A device that allows me to see how exactly I can mess around looks like having some potential.

Question: how do I know that what I see, in that imaginary app, is a functional copy of me, and how can I assess the accuracy of that copy? This is, very largely, the rabbit hole I have been diving into for the last 5 months or so. The first path to follow is to look at the variables used. Artificial intelligence works with numerical data, i.e. with local instances of abstract variables. Similarity between the real me, and the me reproduced as artificial intelligence, is to be found in the variables used. In real life, variables are the kinds of things which: a) are correlated with my actions, both as outcomes and as determinants, and b) I care about, and yet I am not bound to be conscious of caring about.

Here comes another discovery I made on my journey through the realm of artificial intelligence: even if, in the simplest possible case, I just make the equations of my neural network so that they represent what I think is the way I think, and I drop some completely random values of the relevant variables into the first round of experimentation, the neural network produces something disquietingly logical and coherent. In other words, if I am even moderately honest in describing, in the form of equations, my way of apprehending reality, the AI I thus created really processes information in the way I would.

Another way of assessing the similarity between a piece of AI and myself is to compare the empirical data we use: I can make a neural network think more or less like me if I feed it with an accurate description of my experience so far. In this respect, I discovered something that looks like a keystone in my intellectual structure: as I feed my neural network with more and more empirical data, the scope of the possible ways of learning something meaningful narrows down. When I minimise the amount of empirical data fed into the network, the latter can produce interesting, meaningful results via many alternative sequences of equations. As the volume of real-life information swells, some sequences of equations just naturally drop out of the game: they drive the neural network into a state of structural error, when it stops performing calculations.

At this point, I can see some similarity between AI and quantum physics. Quantum mechanics has grown as a methodology, as it proved to be exceptionally accurate in predicting the outcomes of experiments in physics. That accuracy was based on the capacity to formulate very precise hypotheses regarding empirical reality, and the capacity to increase the precision of those hypotheses through the addition of empirical data from past experiments.

Those fundamental observations I made about the workings of artificial intelligence have progressively brought me to use AI in social sciences. An analytical tool has become a topic of research for me. Happens all the time in science, mind you. Geometry, way back in the day, was a thoroughly practical set of tools, which served to make good boats, ships and buildings. With time, geometry has become a branch of science in its own right. In my case, it is artificial intelligence. It is a tool, essentially, invented back in the 1960s and 1970s, and developed over the last 20 years, and it serves practical purposes: facial identification, financial investment etc. Still, as I have been working with a very simple neural network for the last 4 months, and as I have been developing the logical structure of that network, I am discovering a completely new opening in my research in social sciences.

I am mildly obsessed with the topic of collective human intelligence. I have that deeply rooted intuition that collective human behaviour is always functional regarding some purpose. I perceive social structures such as financial markets or political institutions as something akin to endocrine systems in a body: a complex set of signals with a random component in their distribution, and yet a very coherent outcome. I follow up on that intuition by assuming that we, humans, are most fundamentally, collectively intelligent regarding our food and energy base. We shape our social structures according to the quantity and quality of available food and non-edible energy. For quite a while, I was struggling with the methodological issue of precise hypothesis-making. What states of human society can be posited as coherent hypotheses, possible to check or, failing that, to speculate about in an informed way?

The neural network I am experimenting with does precisely this: it produces strange, puzzling, complex states, defined by the quantitative variables I use. As I am working with that network, I have come to redefining the concept of artificial intelligence. A movie-based approach to AI is that it is fundamentally non-human. As I think about it sort of step by step, AI is human, as it has been developed on the grounds of human logic. It is human meaning, and therefore an expression of human neural wiring. It is just selective in its scope. Natural human intelligence has no other way of comprehending but comprehending IT ALL, i.e. the whole of perceived existence. Artificial intelligence is limited in scope: it works just with the data we assign it to work with. AI can really afford not to give a f**k about something otherwise important. AI is focused in the strict sense of the term.

During that recent stay in Marrakech, Morocco, I was observing people around me and their ways of doing things. As is my habit, I am patterning human behaviour. I am connecting the dots about the ways of using energy (for the moment I haven’t seen any making of energy, yet) and food. I am patterning the urban structure around me and the way people live in it.

Superbly kept gardens and buildings marked by a sense of instability. Human generosity combined with somehow erratic behaviour in the same humans. Of course, women are fully dressed, from head to toe, but surprisingly enough, men too. With close to 30 degrees Celsius outside, most local dudes are dressed like a Polish guy would dress at 10 degrees Celsius. They dress for the heat as I would dress for noticeable cold. Exquisitely fresh and firm fruit and vegetables are a surprise. After having visited Croatia, on the Southern coast of Europe, I would rather expect those tomatoes to be soft and somehow past due. Still, they are excellent. Loads of sugar in very nearly everything. Meat is scarce and tough. All that has been already described and explained by many a researcher, wannabe researchers included. I think about those things around me as about local instances of a complex logical structure: a collective intelligence able to experiment with itself. I wonder what other, hypothetical forms this collective intelligence could take, close to the actually observable reality, as well as some distance from it.

The idea I can see burgeoning in my mind is that I can understand better the actual reality around me if I use some analytical tool to represent slight hypothetical variations in said reality. Human behaviour first. What exactly makes me perceive Moroccans as erratic in their behaviour, and how can I represent it in the form of artificial intelligence? Subjectively perceived erraticism is a perceived dissonance between sequences. I expect a certain sequence to happen in other people’s behaviour. The sequence that really happens is different, and possibly more differentiated than what I expect to happen. When I perceive the behaviour of Moroccans as erratic, does it connect functionally with their ways of making and using food and energy?  

A behavioural sequence is marked by a certain order of actions, and a timing. In a given situation, humans can pick their behaviour from a total basket of Z = {a1, a2, …, az} possible actions. These, in turn, can combine into zPk = z!/(z – k)! = (1*2*…*z) / [1*2*…*(z – k)] possible permutations of k component actions. Each such permutation happens with a certain frequency. The way a human society works can be described as a set of frequencies in the happening of those zPk permutations. Well, that’s exactly what a neural network such as mine can do. It operates with values standardized between 0 and 1, and these can be very easily interpreted as frequencies of happening. I have a variable named ‘energy consumption per capita’. When I use it in the neural network, I routinely standardize each empirical value over the maximum of this variable in the entire empirical dataset. Still, standardization can convey a bit more of a mathematical twist and can be seen as the density of probability under the curve of a statistical distribution.
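To make the counting concrete, here is a minimal sketch in Python. The basket of four actions and the short log of observed sequences are purely hypothetical, made up for illustration; only the formula zPk = z!/(z – k)! comes from the paragraph above.

```python
from collections import Counter
from itertools import permutations
import math

# Hypothetical basket Z of possible actions (z = 4) and sequence length k = 2.
actions = ["a1", "a2", "a3", "a4"]
k = 2

# All ordered sequences of k distinct actions: zPk = z! / (z - k)!
possible = list(permutations(actions, k))
n_possible = math.perm(len(actions), k)

# Hypothetical log of behavioural sequences actually observed.
observed = [("a1", "a2"), ("a1", "a2"), ("a3", "a1"), ("a2", "a4")]
counts = Counter(observed)

# Frequency of happening of each possible permutation, between 0 and 1.
frequencies = {p: counts[p] / len(observed) for p in possible}

print(n_possible)                 # number of possible permutations
print(frequencies[("a1", "a2")])  # share of observations taken by one sequence
```

Each frequency is a number between 0 and 1, which is the kind of value the neural network described above works with.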

When I feel like giving such a twist, I can make my neural network stroll down different avenues of intelligence. I can assume that all kinds of things happen, and all those things are sort of densely packed one next to the other, and some of those things are sort of more expected than others, and thus I can standardize my variables under the curve of the normal distribution. Alternatively, I can see each empirical instance of each variable in my database as a rare event in an interval of time, and then I standardize under the curve of the Poisson distribution. A quick check with the database I am using right now brings an important observation: the same empirical data standardized with a Poisson distribution becomes much more disparate as compared to the same data standardized with the normal distribution. When I use Poisson, I lead my neural network to divide empirical data sharply into important stuff on the one hand, and all the rest, not even worth bothering about, on the other hand.

I am giving an example. Here comes energy consumption per capita in Ecuador (1992) = 629,221 kg of oil equivalent (koe), Slovak Republic (2000) = 3 292,609 koe, and Portugal (2003) = 2 400,766 koe. These are three different states of human society, characterized by a certain level of energy consumption per person per year. They are different. I can choose between three different ways of making sense out of their disparity. I can see them quite simply as ordinals on a scale of magnitude, i.e. I can standardize them as fractions of the greatest energy consumption in the whole sample. When I do so, they become: Ecuador (1992) =  0,066733839, Slovak Republic (2000) =  0,349207223, and Portugal (2003) =  0,254620211.

In an alternative worldview, I can perceive those three different situations as neighbourhoods of an expected average energy consumption, in the presence of an average, standard deviation from that expected value. In other words, I assume that it is normal that countries differ in their energy consumption per capita, as well as it is normal that years of observation differ in that respect. I am thinking normal distribution, and then my three situations come as: Ecuador (1992) = 0,118803134, Slovak Republic (2000) = 0,556341893, and Portugal (2003) = 0,381628627.

I can adopt an even more convoluted approach. I can assume that energy consumption in each given country is the outcome of a unique, hardly reproducible process of local adjustment. Each country, with its energy consumption per capita, is a rare event. Seen from this angle, my three empirical states of energy consumed per capita could occur with the probability of the Poisson distribution, estimated with the whole sample of data. With this specific take on the thing, my three empirical values become: Ecuador (1992) = 0, Slovak Republic (2000) = 0,999999851, and Portugal (2003) = 9,4384E-31.
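The three ways of standardizing can be sketched in Python as follows. This is an illustration under loud assumptions: it reads 'standardizing under the curve' as taking the cumulative distribution function, and it treats the three observations quoted above as if they were the whole sample. The numbers it prints will therefore differ from those quoted in the text, which were estimated on the full dataset. The function names are hypothetical.

```python
import math
from statistics import NormalDist, mean

def standardize_max(sample):
    """Each value as a fraction of the greatest value in the sample."""
    top = max(sample)
    return [x / top for x in sample]

def standardize_normal(sample):
    """Cumulative normal probability, with mean and deviation taken from the sample."""
    dist = NormalDist.from_samples(sample)
    return [dist.cdf(x) for x in sample]

def poisson_cdf(x, lam):
    """P(X <= floor(x)) for a Poisson variable, summed in log space to avoid overflow."""
    log_lam = math.log(lam)
    total = sum(
        math.exp(n * log_lam - lam - math.lgamma(n + 1))
        for n in range(int(math.floor(x)) + 1)
    )
    return min(1.0, total)

def standardize_poisson(sample):
    """Cumulative Poisson probability, with the sample mean as the expected value."""
    lam = mean(sample)
    return [poisson_cdf(x, lam) for x in sample]

# Ecuador (1992), Slovak Republic (2000), Portugal (2003), in koe per capita.
sample = [629.221, 3292.609, 2400.766]

by_max = standardize_max(sample)
by_normal = standardize_normal(sample)
by_poisson = standardize_poisson(sample)

for name, values in [("max", by_max), ("normal", by_normal), ("Poisson", by_poisson)]:
    print(name, values)
```

Even on this tiny sample, the Poisson version pushes values towards 0 or 1, whilst the normal version keeps them spread in between, which is the sharp division described above.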

I come back to Morocco. I perceive some behaviours in Moroccans as erratic. I think I tend to think Poisson distribution. I expect some very tightly defined, rare event of behaviour, and when I see none around, I discard everything else as completely not fitting the bill. As I think about it, I guess most of our human intelligence is Poisson-based. We think ‘good vs bad’, ‘edible vs not food’, ‘friend vs foe’ etc.


Locally smart. Case study in finance.

 


 

Here I go, at the frontier between research and education. This is how I earn my living, basically, combining research and education. I am presenting an idea I am currently working on, in a team, regarding a financial scheme for local governments. I am going to develop it here as a piece of educational material for my course « Fundamentals of Finance ». I am combining educational explanation with specific techniques of scientific research.

 

Here is the deal: creating a financial scheme, combining pooled funds, crowdfunding, securities, and cryptocurrencies, for facilitating smart urban development through the creation of local start-up businesses. A lot of ideas in one concept, but this is science, for one, and thus anything is possible, and this is education, for two, hence we need to go through as many basic concepts as possible. It goes more or less as follows: a local government creates two financial instruments, a local investment fund, and a local crowdfunding platform. Both serve to facilitate the creation and growth of local start-ups, which, in turn, facilitate smart urban development.

 

We need a universe in order to do anything sensible. Good. Let’s make a universe, out of local governments, local start-up businesses, and local projects in smart urban development. Projects are groups of people with a purpose and a commitment to achieve it together. Yes, wars are projects, just as musical concerts and public fundraising campaigns for saving the grey wolf. Projects in smart urban development are groups of people with a purpose and a commitment to do something interesting about implementing new technologies into the urban infrastructures and thus improving the quality and the sustainability of urban life.

 

A project is like a demon. It needs a physical body, a vessel to carry out the mission at hand. Projects need a physical doorstep to put a clear sign over it. It is called ‘headquarters’, it has an official address, and we usually need it if we want to do something collective and social. This is where letters from the bank should be addressed to. I have the idea to embody local projects of smart urban development in physical bodies of local start-up businesses. This, in turn, implies turning those projects into profitable ventures. What is the point? A business has assets and it has equity. Assets can back equity, and liabilities. Both equity and liabilities can be represented with financial instruments, namely tradable securities. With that, we can do finance.

 

Why securities? The capital I need, and which I don’t have, is the capital somebody is supposed to entrust me with. Thus, by acquiring capital to finance my project, I give other people claims on the assets I am operating with. Those people will be much more willing to entrust me with their capital if those claims are tradable, i.e. when they can back out of the business really quickly. That’s the idea of financial instruments: making those claims flow and float around, a bit like water.

 

Question: couldn’t we just make securities for projects, without embodying them in businesses? Problematic. Any financial instrument needs some assets to back it up, on the active side of the balance sheet. Projects, as long as they have no such back up in assets, are not really in a position to issue any securities. Another question: can we embody those projects in institutional forms other than businesses, e.g. foundations, trusts, cooperatives, associations? Yes, we can. Each institutional form has its pluses and its minuses. Business structures have one peculiar trait, however: they have at their disposal probably the broadest range of clearly defined financial instruments, as compared to other institutional forms.

 

Still, we can think out of the box. We can take some financial instruments peculiar to business, and try to transplant them onto another institutional body, like that of an association. Let’s just try and see what happens. I am a project in smart urban development. I go to a notary, and I write the following note: “Whoever hands this note on December 31st of any calendar year from now until 2030, will be entitled to receive 20% of net profits after tax from the business identified as LHKLSHFKSDHF”. Signature, date of signature, stamp by the notary. Looks like a security? Mmmwweeelll, maybe. Let’s try and put it in circulation. Who wants my note? What? What do I want in exchange? Let’s zeeee… The modest sum of $2 000 000? You good with that offer?

 

Some of you will say: you, project, you stop right there and you explain a few things. First of all, what if you really have those profits, and 20% of them really make it worth handing you $2 000 000 now? How exactly can anyone claim those 20%? How will they know the exact sum they are entitled to? Right, say I (project), we need to write some kind of contract with those rules inside. It can be called corporate bylaw, and we need to write it all down. For example, what if somebody has this note on December 31st, 2025, and then they sell it to someone else on January 2nd, 2026, and the profits for 2025 will be really accounted for in February 2026 at best? Who, then, is entitled to have those 20% of profits: the person who had the note on December 31st, 2025, or the one presenting it in 2026, when all is said and done about profits? Sort of tricky, isn’t it? The note says: ‘Whoever hands this note on December 31st… etc.’, only the act of handing is now separated from the actual disclosure of profits. We keep that in mind: the whole point of making a claim into a security is to make it apt for circulation. If the circulation in itself becomes too troublesome, the security loses a lot of its appeal.

 

See? This note contains a conditional claim. Someone needs to hand the note at the right moment and in the right place, there needs to be some profit to share etc. That’s the thing about conditional claims: you need to know exactly how to apprehend the conditions upon which the claim is enforceable.

 

As I think about the exact contents of that contract, it looks like me and anyone who holds that note are partners in business. We are supposed to share profits. Profits come from the exploitation of some assets, and they become real only after all the current liabilities have been paid. Hence, we actually share equity in those assets. The note is an equity-based security, a bit primitive, yes, certainly, still an equity-based security.

 

Another question from the audience: “Project, with all the due respect, I don’t really want to be partners in business with you. Do you have an alternative solution to propose?”. Maybe I have… What do you say about a slightly different note, like “Whoever hands this note on December 31st of any calendar year from now until 2030, will be entitled to receive $500 000 from the bank POIUYTR not later than until January 15th of the next calendar year”. Looks good? You remember what is that type of note? This is a draft, or routed note, a debt-based security. It embodies an unconditional claim, routed on that bank with an interesting name, a bit hard to spell aloud. No conditions attached, thus less paperwork with a contract. Worth how much? Maybe $2 000 000, again?

 

No conditions, yet a suggestion. If, on the one hand, I grant you a claim on 20% of my net profit after tax, and, on the other hand, I am ready to give an unconditional claim on $500 000, you could search for some mathematical connection between the 20% and the $500 000. Oh, yes, and there are those $2 000 000. You are connecting the dots. Same window in time, i.e. from 2019 through 2030, which makes 11 occasions to hand the note and claim the money. I multiply occasions by unconditional claims, and I go 11*$500 000 = $5 500 000. An unconditional claim on $5 500 000 spread over 11 annual periods is being sold for $2 000 000. Looks like a ton of good business to do, still let’s do the maths properly. You could invest your $2 000 000 in some comfy sovereign bonds, for example the federal German ones. Rock solid, those ones, and they can yield like 2% a year. I simulate: $2 000 000*(1+0,02)^11 = $2 486 748,62. You pay me $2 000 000, you forego the opportunity to earn $486 748,62, and, in exchange, you receive an unconditional claim on $5 500 000. Looks good, at least at first sight. Gives you a positive discount rate of ($5 500 000 – $2 486 748,62)/ $2 486 748,62 = 121,2% on the whole 11 years of the deal, thus 121,2%/11 = 11% a year. Not bad.
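The arithmetic of that comparison can be replayed in a few lines of Python. The variable names are just labels chosen for this sketch; the figures are those from the paragraph above, and the yearly rate is the same simple average (total discount divided by 11), not a compounded one.

```python
price = 2_000_000        # what you pay for the routed note
annual_claim = 500_000   # unconditional claim per year
periods = 11             # occasions to hand the note
bond_yield = 0.02        # assumed yearly yield on sovereign bonds

total_claim = periods * annual_claim

# What the same $2 000 000 would grow to in bonds, compounded yearly.
opportunity_value = price * (1 + bond_yield) ** periods
foregone_gain = opportunity_value - price

# Discount over the whole deal, benchmarked against the bond alternative.
discount_total = (total_claim - opportunity_value) / opportunity_value
discount_per_year = discount_total / periods

print(f"total claim:        {total_claim:,.2f}")
print(f"bond alternative:   {opportunity_value:,.2f}")
print(f"foregone gain:      {foregone_gain:,.2f}")
print(f"discount, 11 years: {discount_total:.1%}")
print(f"discount per year:  {discount_per_year:.1%}")
```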

 

When you have done the maths from the preceding paragraph, you can assume that I expect, in that project of smart urban development, a future stream of net profit after tax, over the 11 fiscal periods to come, somewhere around those $5 500 000. Somewhere around could be somewhere above or somewhere below. Now, we enter the world of behavioural finance. I have laid my cards on the table, with those two notes. Now, you try to figure out my future behaviour, as well as the behaviour to expect in third parties. When you hold a claim, on whatever and whomever you want, this claim has two financial characteristics: enforceability and risk on the one hand, and liquidity on the other hand. You ask yourself what exactly the current holder of the note can enforce in terms of payback on my part, and what kind of business you can do by selling those notes to someone else.

 

In a sense, we are playing a game. You face a choice between different moves. Move #1: buy the equity-based paper and hold. Move #2: buy the equity-based one and sell it to third parties. Move #3: buy the debt-based routed note and hold. Move #4: buy the routed note and sell it shortly after. You can go just for one of those moves, or make a basket thereof, if you have enough money to invest more than one lump injection of $2 000 000 into my project of smart urban development.

 

You make your move, and you might wonder what kind of move I will make, and what other people will do. Down that avenue of thinking, madness lies. Finance means, very largely, domesticated madness, and thus, when you are a financial player, instead of wondering what other people will do, you look for reliable benchmarks in the existing markets. This is an important principle of finance: quantities and prices are informative about the human behaviour to expect. When you face the choice between moves #1 through #4, you will look, in the first place, to the existing markets. I grant you 20% of my profits in exchange for $2 000 000, and those 20% seem to correspond to at least $500 000 of future annual cash flow. If 20% of something is $500 000, the whole something makes $500 000/ 20% = $2 500 000. How much equity does it correspond to? Here it comes to benchmarking. Aswath Damodaran, from NYU Stern Undergraduate College, publishes average ROE (return on equity) in different industries. Let’s suppose that my project of smart urban development is focused on Environmental & Waste Services. It is urban, it claims being smart, hence it could be about waste management. That makes 17,95% of average ROE, i.e. net profit/equity = 17,95%. Logically, equity = net profit/17,95%, thus I go $2 500 000/17,95% = $13 927 576,60 and this is the equity you can reasonably expect I expect to accumulate in that project of smart urban development.

 

Among the numerous datasets published by Aswath Damodaran, there is one containing the so-called ROIC, or return on invested capital, thus on the total equity and debt invested in the business. In the same industry, i.e. Environmental & Waste Services, it is 13,58%. It reads analogously to ROE, thus it is net profit divided by the total capital invested, and, logically, total capital invested = net profit / ROIC = $2 500 000 / 13,58% = $18 409 425,63. Equity alone makes $13 927 576,60, equity plus debt makes $18 409 425,63, therefore debt = $18 409 425,63 – $13 927 576,60 = $4 481 849,02 (computed before rounding).
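The benchmarking chain, from the 20% claim to the implied debt, can be sketched in Python as below. The ROE and ROIC figures are those quoted in the text; the variable names are labels for this sketch only, and ROIC is treated, as in the text, as net profit over total capital invested.

```python
claim_share = 0.20       # share of net profit granted by the equity-based note
annual_claim = 500_000   # annual cash flow suggested by the debt-based note
roe = 0.1795             # average ROE quoted for Environmental & Waste Services
roic = 0.1358            # average ROIC quoted for the same industry

# If 20% of net profit is $500 000, the whole net profit follows.
net_profit = annual_claim / claim_share

# Invert the two ratios to get the capital behind that profit.
equity = net_profit / roe
capital_invested = net_profit / roic
debt = capital_invested - equity

print(f"net profit:       {net_profit:,.2f}")
print(f"equity:           {equity:,.2f}")
print(f"capital invested: {capital_invested:,.2f}")
print(f"debt:             {debt:,.2f}")
```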

 

With those rates of return on, respectively, equity and capital invested, those 11% of annual discount, benchmarked against German sovereign bonds, look acceptable. If I take a look at the financial instruments listed in the AIM market of London Stock Exchange, and I dig a bit, I can find corporate bonds, i.e. debt-based securities issued by incorporated business structures. Here come, for example, the bonds issued by 3i Group, an investment fund. They are identified with ISIN (International Securities Identification Number) XS0104440986, they were issued in 1999, and their maturity date is December 3rd, 2032. They are endowed with an interest rate of 5,75% a year, payable in two semi-annual instalments every year. Once again, the 11% discount offered on those imaginary routed notes of my project looks interesting in comparison.

 

Before I go further, I am once again going to play at anticipating your questions. What is the connection between the interest rate and the discount rate, in this case? I am explaining numerically. Imagine you buy corporate bonds, like those 3i Group bonds, with an interest rate 5,75% a year. You spend $2 000 000 on them. You hold them for 5 years, and then you sell them to third persons. Just for the sake of simplifying, I suppose you sell them for the same face value you bought them, i.e. for $2 000 000. What happened arithmetically, from your point of view, can be represented as follows: – $2 000 000 + 5*5,75%*$2 000 000 + $2 000 000 = $575 000. Now, imagine that instead of those bonds, you bought, for an amount of $2 000 000,  debt-based routed notes of my project, phrased as follows: “Whoever hands this note on December 31st of any calendar year from now until Year +5, will be entitled to receive $515 000 from the bank POIUYTR not later than until January 15th of the next calendar year”. With such a draft (remember: another name for a routed note), you will total – $2 000 000 + 5*$515 000 = $575 000.

 

Same result at the end of the day, just phrased differently. With those routed notes of mine, you earn a discount of $575 000, and with the 3i bonds, you earn an interest of $575 000. You understand? Whatever you do with financial instruments, it sums up to a cash flow. You spend your capital on buying those instruments in the first place, and you write that initial expenditure with a ‘-’ sign in your cash flow. Then you receive some ‘+’ cash flows, under various forms, and variously described. At the end of the day, you sum up the initial outflow (minus) of cash with the subsequent inflows (pluses).
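Written as two lists of signed cash flows, the comparison looks like this in Python. It is a simple, undiscounted sum, exactly as in the example above; the list names are hypothetical.

```python
capital = 2_000_000
years = 5

# Corporate bonds: pay the price, collect 5,75% a year, resell at the same face value.
bond_flows = [-capital] + [0.0575 * capital] * years + [capital]

# Routed notes: pay the price, collect $515 000 on each of the 5 occasions.
note_flows = [-capital] + [515_000] * years

bond_result = sum(bond_flows)
note_result = sum(note_flows)

print(f"bonds:        {bond_result:,.2f}")
print(f"routed notes: {note_result:,.2f}")
```

Interest in one case, discount in the other: the sign convention and the final sum are the same.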

 

Now, I look back, I mean back to the beginning of this update on my blog, and I realize how far I have ventured from the initial strand of ideas. I was about to discuss a financial scheme, combining pooled funds, crowdfunding, securities, and cryptocurrencies, for facilitating smart urban development through the creation of local start-up businesses. Good. I go back to it. My fundamental concept is that of public-private partnership, just peppered with a bit of finance. Local governments do services connected to waste and environmental care. The basic way they finance it is through budgetary spending, and sometimes they create or take interest in local companies specialized in doing it. My idea is to go one step further, and make local governments create and run investment funds specialized in taking interest in such businesses.

 

One of the basic ideas when running an investment fund is to make a portfolio of participations in various businesses, with various temporal horizons attached. We combine the long term with the short one. In some companies we invest for like 10 years, and in some others just for 2 years, and then we sell those shares, bonds, or whatever. When I was working on the business plan for the BeFund project, I had a look at the shape those investment portfolios take. You can sort of follow back that research of mine in « Sort of a classical move » from March 15th, 2018. I had quite a bit of an exploration into the concept of smart cities. See « My individual square of land, 9 meters on 9 », from January 11, 2018, or « Smart cities, or rummaging in the waste heap of culture » from January 31, 2018, as for this topic. What comes out of my research is that the combination of digital technologies with the objectively growing importance of urban structures in our civilisation brings new investment opportunities. Thus, I have this idea of local governments, like city councils, becoming active investors in local businesses, and that local investment would combine the big, steady ventures – like local waste management companies – with a lot of small startup companies.

 

This basic structure in the portfolio of a local investment fund reflects my intuitive take on the way a city works. There is the fundamental, big, heavy stuff that just needs to work – waste management, again, but also water supply, energy supply etc. – and there is the highly experimental part, where the city attempts to implement radically new solutions on the grounds of radically new technologies. The usual policy that I can observe in local governments, now, is to create big local companies for the former category, and to let private businesses take over the latter entirely. Now, imagine that when you pay taxes to the local government, part of your tax money goes into an investment fund, which takes participations in local startups, active in the domain of those experimental solutions and new technologies. Your tax money goes into a portfolio of investments.

 

Imagine even more. There is a local crowdfunding platform, similar to Kickstarter or StartEngine, where you can put your money directly into those local ventures, without passing through the local investment fund as a middleman. On that crowdfunding platform, the same local investment fund can compete for funding with other ventures. A cryptocurrency, internal to that crowdfunding platform, could be used to make clearer financial rules in the investment game.

 

When I filed that idea for review, in the form of an article, with a Polish scientific journal, I received back an interestingly critical review. There were two main lines of criticism. Firstly, where is the advantage of my proposed solution over the presently applied institutional schemes? How could my solution improve smart urban development, as compared to what local governments currently do? Secondly, doesn’t it go too far from the mission of local governments? Doesn’t my scheme push public goods too far into private hands and doesn’t it make local governments too capitalistic?

 

I need to address those questions, both for revising my article, and for giving a nice closure to this particular, educational story in the fundamentals of finance. Functionality first, thus: what is the point? What can be possibly improved with that financial scheme I propose? Finance has two essential functions: it meets the need for liquidity, and, through the mechanism of financial markets. Liquidity is the capacity to enter in transactions. For any given situation there is a total set T of transactions that an entity, finding themselves in this situation, could be willing to enter into. Usually, we can’t enter it all, I mean we, entities. Individuals, businesses, governments: we are limited in our capacity to enter transactions. For the given total set T of transactions, there is just a subset Ti that i-th entity can participate in. The fraction « Ti/T » is a measure of liquidity this entity has.

 

Question: if, instead of doing something administratively, or granting a simple subsidy to a private agent, local governments act as investment funds in local projects, how does it change their liquidity, and the liquidity of local communities they are the governments of? I went to the website of the Polish Central Statistical Office, there I took slightly North-East and landed in their Local Data Bank. I asked around for data regarding the financial stance of big cities in Poland, and I found some about: Wroclaw, Lodz, Krakow, Gdansk, Kielce, Poznan, and Warsaw. I focused on the investment outlays of local governments, the number of new business entities registered every year, per 10 000 residents, and on population. Here below, you can find three summary tables regarding these metrics. You will see for yourself, but in a bird’s eye view, we have more or less stationary populations, and local governments spending a shrinking part of their total budgets on fixed local assets. Local governments back off from financing those assets. At the same time, there is a growing stir in business. There are more and more new business entities registered every year, in relation to population. Those local governments look as if they were out of ideas as for how to work with that local business. Can my idea change the situation? I develop on this one further below those three tables.

 

 

The share of investment outlays in the total expenditures of the city council, in major Polish cities
Year | Wroclaw | Lodz | Krakow | Gdansk | Kielce | Poznan | Warsaw
2008 | 31,8% | 21,0% | 19,7% | 22,6% | 15,3% | 27,9% | 19,8%
2009 | 34,6% | 23,5% | 20,4% | 20,6% | 18,6% | 28,4% | 17,8%
2010 | 24,2% | 15,2% | 16,7% | 24,5% | 21,2% | 29,6% | 21,4%
2011 | 20,3% | 12,5% | 14,5% | 33,9% | 26,9% | 30,1% | 17,1%
2012 | 21,5% | 15,3% | 12,6% | 38,2% | 21,9% | 20,8% | 16,8%
2013 | 15,0% | 19,3% | 11,0% | 28,4% | 18,5% | 18,1% | 15,0%
2014 | 15,6% | 24,4% | 16,4% | 27,0% | 18,6% | 11,8% | 17,5%
2015 | 18,4% | 26,8% | 13,7% | 21,3% | 23,8% | 24,1% | 10,2%
2016 | 13,3% | 14,3% | 11,5% | 15,2% | 10,7% | 17,5% | 9,0%
2017 | 11,7% | 10,2% | 11,5% | 12,2% | 14,1% | 12,3% | 12,0%
Delta 2017 – 2008 | -20,1% | -10,8% | -8,2% | -10,4% | -1,2% | -15,6% | -7,8%

 

 

Population of major cities
Year | Wroclaw | Lodz | Krakow | Gdansk | Kielce | Poznan | Warsaw
2008 | 632 162 | 747 152 | 754 624 | 455 581 | 205 094 | 557 264 | 1 709 781
2009 | 632 146 | 742 387 | 755 000 | 456 591 | 204 835 | 554 221 | 1 714 446
2010 | 630 691 | 730 633 | 757 740 | 460 509 | 202 450 | 555 614 | 1 700 112
2011 | 631 235 | 725 055 | 759 137 | 460 517 | 201 815 | 553 564 | 1 708 491
2012 | 631 188 | 718 960 | 758 334 | 460 427 | 200 938 | 550 742 | 1 715 517
2013 | 632 067 | 711 332 | 758 992 | 461 531 | 199 870 | 548 028 | 1 724 404
2014 | 634 487 | 706 004 | 761 873 | 461 489 | 198 857 | 545 680 | 1 735 442
2015 | 635 759 | 700 982 | 761 069 | 462 249 | 198 046 | 542 348 | 1 744 351
2016 | 637 683 | 696 503 | 765 320 | 463 754 | 197 704 | 540 372 | 1 753 977
2017 | 638 586 | 690 422 | 767 348 | 464 254 | 196 804 | 538 633 | 1 764 615
Delta 2017 – 2008 | 6 424 | (56 730) | 12 724 | 8 673 | (8 290) | (18 631) | 54 834

 

Number of newly registered business entities per 10 000 residents, in major Polish cities
Year | Wroclaw | Lodz | Krakow | Gdansk | Kielce | Poznan | Warsaw
2008 | 190 | 160 | 200 | 190 | 140 | 210 | 200
2009 | 195 | 167 | 205 | 196 | 149 | 216 | 207
2010 | 219 | 193 | 241 | 213 | 182 | 238 | 274
2011 | 221 | 169 | 204 | 195 | 168 | 244 | 249
2012 | 228 | 187 | 230 | 201 | 168 | 255 | 274
2013 | 237 | 187 | 224 | 211 | 175 | 262 | 307
2014 | 236 | 189 | 216 | 217 | 157 | 267 | 303
2015 | 252 | 183 | 248 | 236 | 185 | 283 | 348
2016 | 265 | 186 | 251 | 238 | 176 | 270 | 364
2017 | 272 | 189 | 257 | 255 | 175 | 267 | 345
Delta 2017 – 2008 | 82 | 29 | 57 | 65 | 35 | 57 | 145

 

Let’s take two cases from the tables: my hometown Krakow, and my capital Warsaw. In the former case, the negative gap in the investment outlays of the local government is minus 44 million zlotys (some €10 mln), and in the latter case it is minus 248,46 million zlotys, thus about €56,5 mln. If we want to really get after new technologies in cities, we need to top up those gaps, possibly with a surplus. How can my idea help to save the day?

 

When I try to spend €10 mln more on the urban fixed assets, I need to have all those €10 mln. I need to own them directly, in my balance sheet, before spending them. On the other hand, when I want to create an investment fund, which would take part in local startups, and through them would make those €10 mln worth of assets happen in real life, I need much less. I start with the balance sheet directly attached to those assets: €10 mln in fixed assets = equity of the startup(s) + liabilities of the startup(s). Now, equity of the startup(s) = shares of our investment fund + shares of other partners. At the end of the day, the local government could finance assets of €10 mln with 1 or 2 millions of euro of own equity, maybe even less.
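The balance-sheet logic above can be put into a short Python sketch. The share of liabilities in the startups' financing (50%) and the stake of the local fund in their equity (30%) are purely hypothetical parameters, chosen only to show how one can land in the range of 1 or 2 millions of euro mentioned above; the function name is hypothetical as well.

```python
def fund_equity_needed(assets, liabilities_share, fund_stake):
    """Own equity the local fund must put up to see `assets` created in startups.

    assets            : fixed assets to be financed, in euro
    liabilities_share : fraction of assets financed with startups' liabilities (assumed)
    fund_stake        : fraction of startups' equity held by the fund (assumed)
    """
    startup_equity = assets * (1 - liabilities_share)
    return startup_equity * fund_stake

assets = 10_000_000
needed = fund_equity_needed(assets, liabilities_share=0.5, fund_stake=0.3)

print(f"fixed assets created: {assets:,.0f}")
print(f"fund's own equity:    {needed:,.0f}")
print(f"leverage:             {assets / needed:.1f}x")
```

With other assumed proportions the multiple changes, yet the mechanism stays the same: liabilities and other partners' shares carry most of the assets.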

 

From there on, it went sort of out of hand. I have that mental fixation on things connected to artificial intelligence and neural networks. You can find the latest account in English in the update entitled « What are the practical outcomes of those hypotheses being true or false? ». If you speak French, there is a bit more, and more recent, in « Surpopulation sauvage ou compétition aux États-Unis ». Anyway, I did it. I made a neural network in order to simulate the behaviour of my financial concept. Below, I am presenting a graphical idea of that network. It combines a multilayer perceptron, strictly speaking, with components of deep learning: observation of the fitness function, and the feeding back of it, as well as selection and preference regarding different neural outputs of the network. I am using that neural network as a simulator of collective intelligence.

 

So, as I am assuming that we are collectively intelligent in our local communities, I make the following logical structure. Step 1: I take four input variables, as listed below. They are taken from real statistics about those 7 big Polish cities, named above – Wroclaw, Lodz, Krakow, Gdansk, Kielce, Poznan, Warsaw – over the period from 2008 through 2017.

 

Input variable 1: Investment outlays of the local government [mln]

Input variable 2: Overall expenses of the local government [mln]

Input variable 3: Population [headcount]

Input variable 4: Number of new business entities registered annually [coefficient]

 

In step 2, I attach to those real input variables an output variable. It is a hypothetical variable: capital engaged in the local government’s investment fund, initially calculated as if 5% of new business entities were financed with €100 000 each. I calculate the average value of that variable across the whole sample of 7 cities, and it makes €87 mln as expected value. This is the amount of money the average city among those seven could put in that local investment fund to support local startups and their projects of smart urban development.
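Here is a rough sketch of how that hypothetical variable can be computed. It assumes that input variable 4 is the number of new business entities per 10 000 inhabitants; the function name and that reading of the coefficient are assumptions made for illustration. The 5% and the €100 000 come from the text, and the population and coefficient are the sample averages reported for the seven cities.

```python
def fund_capital_eur(population, new_firms_per_10k,
                     financed_share=0.05, ticket_eur=100_000):
    """Capital engaged in the fund if a share of new firms gets one ticket each.

    Assumption: `new_firms_per_10k` is new business entities per 10 000 people.
    """
    new_firms = population / 10_000 * new_firms_per_10k
    return new_firms * financed_share * ticket_eur


# Sample averages: 721 083 inhabitants, coefficient of 223.
capital = fund_capital_eur(721_083, 223)
print(f"EUR {capital / 1e6:.1f} mln")
```

Applied to the averages, this gives roughly €80 mln, the same order of magnitude as the €87 mln quoted above. The gap is presumably due to the fact that the €87 mln is an average of city-level values, which is not the same thing as the formula applied to average inputs.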

 

In step 3, I run my neural network through the empirical data, and then I make it do an additional 5000 experimental rounds, just to make it look for a match between the input variables – which can change as they want – and the output variable, which I have almost pegged at €87 mln. I say ‘almost’, as in practice the network will generate a bit of wobbling around those €87 mln. I want to see what possible configurations of the input variables can arise, through different patterns of collective learning, around that virtually pegged value of the output variable.

 

I hypothesise 5 different ways of learning, or 5 different selections in that Neuron 4 you can see in the picture above. Learning pattern #1 consists in systematically preferring the neural output of the sigmoid neural function. It is a type of function which systematically calms down any shocks and sudden swings in input phenomena. It is like a collective pretence that whatever kind of s**t is really going on, everything is just fine. Learning pattern #2 prefers the output of the hyperbolic tangent function. This one tends to be honest, and when there is a shock, it yields a shock, without any f**kery about it. It is like a market with clear rules of competition. Learning pattern #3 takes the least error of the two functions. It is the most classical approach in neural networks. The closer I get to the expected value, the better I am learning, that sort of thing. Learning pattern #4 makes an average of those two functions. The greatest value among those being averaged has the greatest impact on the resulting average. Thus, the average of two functions is like a hierarchy of importance, expressed in one number. Finally, learning pattern #5 takes that average, just as #4, but it adds the component of growing resistance to new information. At each experimental round, it divides the value of the error fed back into the network by the consecutive number of the round. The error generated in round 2 gets divided by 2, that generated in round 4000 is divided by 4000, etc. This is like a person who, as they process new information, develops a growing sentiment of being fully schooled on the topic, and is more and more resistant to new input.
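To make the five patterns easier to compare, here is a minimal sketch of the selection step alone. It assumes that the selection operates on the two local errors (one from the sigmoid, one from the hyperbolic tangent) and that 'least error' means least in absolute value; the function and parameter names are made up for illustration, and the rest of the network is left out.

```python
def select_error(pattern, err_sigmoid, err_tanh, round_no=1):
    """Return the error to feed back, following learning patterns #1 to #5."""
    mean_err = (err_sigmoid + err_tanh) / 2
    if pattern == 1:                      # sigmoid systematically preferred
        return err_sigmoid
    if pattern == 2:                      # hyperbolic tangent preferred
        return err_tanh
    if pattern == 3:                      # least error (assumed: by absolute value)
        return min(err_sigmoid, err_tanh, key=abs)
    if pattern == 4:                      # plain average of the two
        return mean_err
    if pattern == 5:                      # average, damped by the round number
        return mean_err / round_no
    raise ValueError(f"unknown learning pattern: {pattern}")


# Same pair of errors, seen through each pattern in experimental round 4.
for p in range(1, 6):
    print(p, select_error(p, err_sigmoid=0.2, err_tanh=-0.6, round_no=4))
```

The only pattern that depends on time is #5: by round 4000 it feeds back one four-thousandth of what pattern #4 would, which is the growing resistance to new information in numerical form.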

 

In the table below, I present the results of those simulations. Learning patterns #2 and #4 develop structures somewhat more modest than the actual reality, expressed as empirical averages in the first numerical line of the table. These are urban communities where that investment fund I am thinking about slightly grows in importance, in relation to the whole municipal budget. Learning patterns #1 and #3 develop crazy magnitudes in those input variables. Populations grow 9 or 10 times bigger than the present ones, the probability of having new businesses in each 10 000 people grows 6 or 7 times, and municipal budgets swell roughly 13 to 14 times. The urban investment fund becomes close to insignificant. Learning pattern #5 goes sort of in the middle between those extremes.

 

 

Type of selection in neural output | Input variable 1 | Input variable 2 | Input variable 3 | Input variable 4 | Output variable
Initial averages of empirical values | €177 mln | €996 mln | 721 083 | 223 | €87 mln
Sample results of simulation with the neural network:
Sigmoid preferred | €2 440 mln | €14 377 mln | 7 093 526,21 | 1 328,83 | €87 mln
Hyperbolic tangent preferred | €145 mln | €908 mln | 501 150,03 | 237,78 | €87 mln
Least error preferred | €2 213 mln | €13 128 mln | 6 573 058,50 | 1 490,28 | €87 mln
Average of the two errors | €122 mln | €770 mln | 432 702,57 | 223,66 | €87 mln
Average of the two errors, with growing resistance to learning | €845 mln | €5 043 mln | 2 555 800,36 | 661,61 | €87 mln
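The figures in the table can be turned into multiples of the empirical averages, and into the weight of the fund in the municipal budget, with a few lines of code. The numbers are copied from the table; the variable and function names are illustrative, and the ratio of the fund to overall expenses is simply one way of reading 'importance in relation to the budget'.

```python
# Order of values: investment outlays [EUR mln], overall expenses [EUR mln],
# population [headcount], new business entities [coefficient].
EMPIRICAL = (177, 996, 721_083, 223)
FUND_MLN = 87  # output variable, pegged at about EUR 87 mln in every run

SIMULATED = {
    "Sigmoid preferred": (2440, 14377, 7_093_526.21, 1328.83),
    "Hyperbolic tangent preferred": (145, 908, 501_150.03, 237.78),
    "Least error preferred": (2213, 13128, 6_573_058.50, 1490.28),
    "Average of the two errors": (122, 770, 432_702.57, 223.66),
    "Average, growing resistance": (845, 5043, 2_555_800.36, 661.61),
}


def multiples(values, baseline=EMPIRICAL):
    """Each simulated variable as a multiple of its empirical average."""
    return [v / b for v, b in zip(values, baseline)]


def fund_share_of_budget(values, fund_mln=FUND_MLN):
    """Fund capital relative to overall expenses of the local government."""
    return fund_mln / values[1]


print(f"{'Empirical averages':<30} fund/budget = {fund_share_of_budget(EMPIRICAL):.1%}")
for name, values in SIMULATED.items():
    ratios = ", ".join(f"{m:.1f}x" for m in multiples(values))
    print(f"{name:<30} {ratios}; fund/budget = {fund_share_of_budget(values):.1%}")
```

Read this way, the fund weighs a little under 9% of the average empirical budget, rises to roughly 10% or 11% under the hyperbolic tangent and the plain average, and falls below 1% when the sigmoid or the least error is preferred.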

 

What is the moral of the fairy tale? As I see it now, it means that for any given initial situation as for that financial scheme I have in mind for cities and their local governments, future development can go two opposite ways. The city can get sort of slightly smaller and smarter, with more or less the same occurrence of new businesses emerging every year. It happens when the local community learns, as a collective intelligence, with little shielding from external shocks. This is like a market-oriented city. In terms of quantitative dynamics, it makes me think about cities like Vienna (Austria), Lyon (France), or my home city, Krakow (Poland). On the other hand, the city can shield itself somehow against socio-economic shocks, for example with heavy subsidies, and then it gets out of control. It grows big like hell, and business starts just to pop around.

 

At first sight, it seems counterintuitive. We associate market-based, open-to-shocks solutions with uncontrolled growth, and interventionist, counter-cyclical policies with sort of a tame status quo. Still, cities are strange beasts. They are like crocodiles. When you make them compete for food and territory, they grow just to a certain size, ‘cause when they grow bigger than that, they die. Yet, when you allow a crocodile to live in a place without much competition, and plenty of food around, it grows to enormous proportions.

 

My temporary conclusion is that my idea of a local investment fund to boost smart change in cities is workable, i.e. has the chances to thrive as a financial mechanism, when the whole city is open to market-based solutions and receives little shielding from economic shocks.

 

I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful for suggesting me two things that Patreon suggests me to suggest you. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?

What are the practical outcomes of those hypotheses being true or false?

 

My editorial on You Tube

 

This is one of those moments when I need to reassess what the hell I am doing. Scientifically, I mean. Of course, it is good to reassess things existentially, too, every now and then, but for the moment I am limiting myself to science. Simpler and safer than life in general. Anyway, I have a financial scheme in mind, where local crowdfunding platforms serve to support the development of local suppliers in renewable energies. The scheme is based on the observable difference between prices of electricity for small users (higher), and those reserved for industrial-scale users (lower). I wonder if small consumers would be ready to pay the normal, relatively higher price in exchange for a package made of: a) electricity and b) shares in the equity of its suppliers.

I have a general, methodological hypothesis in mind, which I have been trying to develop over the last 2 years or so: collective intelligence. I hypothesise that collective behaviour observable in markets can be studied as a manifestation of collective intelligence. The purpose is to go beyond optimization and to define, with scientific rigour, what are the alternative, essentially equiprobable paths of change that a complex market can take. I think such an approach is useful when I am dealing with an economic model with a lot of internal correlation between variables, and that correlation can be so strong that it turns into those variables basically looping on each other. In such a situation, distinguishing independent variables from the dependent ones becomes bloody hard, and methodologically doubtful.

On the grounds of literature, and my own experimentation, I have defined three essential traits of such collective intelligence: a) distinction between structure and instance, b) capacity to accumulate experience, and c) capacity to pass between different levels of freedom in social cohesion. I am using an artificial neural network, a multi-layer perceptron, in order to simulate such collectively intelligent behaviour.

The distinction between structure and instance means that we can devise something, make different instances of that something, each different by some small details, and experiment with those different instances in order to devise an even better something. When I make a mechanical clock, I am a clockmaker. When I am able to have a critical look at this clock, make many different versions of it – all based on the same structural connections between mechanical parts, but differing from each other by subtle details – and experiment with those multiple versions, I become a meta-clock-maker, i.e. someone who can advise clockmakers on how to make clocks. The capacity to distinguish between structures and their instances is one of the basic skills we need in life. Autistic people have a big problem in that department, as they are mostly on the instance side. To a severely autistic person, me in a blue jacket, and me in a brown jacket are two completely different people. Schizophrenic people are on the opposite end of the spectrum. To them, everything is one and the same structure, and they cannot cope with instances. Me in a blue jacket and me in a brown jacket are the same as my neighbour in a yellow jumper, and we all are instances of the same alien monster. I know you think I might be overstating, but my grandmother on the father’s side used to suffer from schizophrenia, and it was precisely that: to her, all strong smells were the manifestation of one and the same volatile poison sprayed in the air by THEM, and every person outside a circle of about 19 people closest to her was a member of THEM. Poor Jadwiga.

In economics, the distinction between structure and instance corresponds to the tension between markets and their underpinning institutions. Markets are fluid and changeable, they are like constant experimenting. Institutions give some gravitas and predictability to that experimenting. Institutions are structures, and markets are ritualized manners of multiplying and testing many alternative instances of those structures.

The capacity to accumulate experience means that as we experiment with different instances of different structures, we can store information we collect in the process, and use this information in some meaningful way. My great compatriot, Alfred Korzybski, in his general semantics, used to designate it as ‘the capacity to bind time’. The thing is not as obvious as one could think. A Nobel-prize-winning mathematician, Reinhard Selten, coined the concept of social games with imperfect recall (Harsanyi, Selten 1988[1]). He argued that as we, collective humans, accumulate and generalize experience about what the hell is going on, from time to time we shake off that big folder, and pick the pages endowed with the most meaning. All the remaining stuff, judged less useful at the moment, is somehow archived in culture, so that it basically stays there, but becomes much harder to access and utilise. The capacity to accumulate experience largely means the way of accumulating experience, and of doing that from-time-to-time archiving. We can observe this basic distinction in everyday life. There are things that we learn sort of incrementally. When I learn to play piano – which I wish I was learning right now, cool stuff – I practice, I practice, I practice and… I accumulate learning from all those practices, and one day I give a concert, in a pub. Still, other things, I learn them sort of haphazardly. Relationships are a good example. I am with someone, one day I am mad at her, the other day I see her as the love of my life, then, again, she really gets on my nerves, and then I think I couldn’t live without her etc. Bit of a bumpy road, isn’t it? Yes, there is some incremental learning, but you become aware of it after like 25 years of conjoint life. Earlier on, you just need to suck ass and keep going.

There is an interesting theory in economics, labelled as « semi-martingale » (see for example: Malkiel, Fama 1970[2]). When we observe changes in stock prices, in a capital market, we tend to say they are random, but they are not. You can test it. If the price is really random, it should fan out according to the pattern of normal distribution. This is what we call a full martingale. Any real price you observe actually swings less broadly than normal distribution: this is a semi-martingale. Still, anyone with any experience in investment knows that prediction inside the semi-martingale is always burdened with a s**tload of error. When you observe stock prices over a long time, like 2 or 3 years, you can see a sequence of distinct semi-martingales. From September through December it swings inside one semi-martingale, then the Ghost of Past Christmases shakes it badly, people panic, and later it settles into another semi-martingale, slightly shifted from the preceding one, and here it goes, semi-martingaling for another dozen weeks etc.

The central theoretical question in this economic theory, and a couple of others, spells: do we learn something durable through local shocks? Does a sequence of economic shocks, of whatever type, make a learning path similar to the incremental learning of piano playing? There are strong arguments in favour of both possible answers. If you get your face punched, over and over again, you must be a really dumb asshole not to learn anything from that. Still, there is that phenomenon called systemic homeostasis: many systems, social structures included, tend to fight for stability when shaken, and they are frequently successful. The memory of shocks and revolutions is frequently erased, and they are assumed to have never existed.

The issue of different levels in social cohesion refers to the so-called swarm theory (Stradner et al 2013[3]). This theory studies collective intelligence by reference to animals, which we know are intelligent just collectively. Bees, ants, hornets: all those beasts, when acting individually, are as dumb as f**k. Still, when they gang up, they develop amazingly complex patterns of action. That’s not all. Those complex patterns of theirs fall into three categories, applicable to human behaviour as well: static coupling, dynamic correlated coupling, and dynamic random coupling.

When we coordinate by static coupling, we always do things together in the same way. These are recurrent rituals, without much room for change. Many legal rules, and institutions they form the basis of, are examples of static coupling. You want to put some equity-based securities in circulation? Good, you do this, and this, and this. You haven’t done the third this? Sorry, man, but you cannot call it a day yet. When we need to change the structure of what we do, we should somehow loosen that static coupling and try something new. We should dissolve the existing business, which is static coupling, and look for creating something new. When we do so, we can sort of stay in touch with our customary business partners, and after some circling and asking around we form a new business structure, involving people we clearly coordinate with. This is dynamic correlated coupling. Finally, we can decide to sail completely uncharted waters, and take our business concept to China, or to New Zealand, and try to work with completely different people. What we do, in such a case, is emitting some sort of business signal into the environment, and waiting for any response from whoever is interested. This is dynamic random coupling. Attracting random followers to a new You Tube channel is very much an example of the same.

At the level of social cohesion, we can be intelligent in two distinct ways. On the one hand, we can keep the given pattern of collective association at the same level, i.e. one of the three I have just mentioned. We keep it ritualized and static, or somehow loose and dynamically correlated, or, finally, we take care of not ritualizing too much and keep it deliberately at the level of random associations. On the other hand, we can shift between different levels of cohesion. We take some institutions, we start experimenting with making them more flexible, at some point we possibly make them as free as possible, and we gain experience, which, in turn, allows us to create new institutions.

When applying the issue of social cohesion in collective intelligence to economic phenomena, we can use a little trick, to be found, for example, in de Vincenzo et al (2018[4]): we assume that quantitative economic variables, which we normally perceive as just numbers, are manifestations of distinct collective decisions. When I have the price of energy, let’s say, €0,17 per kilowatt hour, I consider it as the outcome of collective decision-making. At this point, it is useful to remember the fundamentals of intelligence. We perceive our own, individual decisions as outcomes of our independent thinking. We associate them with the fact of wanting something, and being apprehensive regarding something else etc. Still, neurologically, those decisions are outcomes of some neurons firing in a certain sequence. Same for economic variables, i.e. mostly prices and quantities: they are fruit of interactions between the members of a community. When I buy apples in the local marketplace, I just buy them for a certain price, and, if they look bad, I just don’t buy. This is not any form of purposeful influence upon the market. Still, when 10 000 people like me do the same, sort of ‘buy when price good, don’t when the apple is bruised’, a patterned process emerges. The resulting price of apples is the outcome of that process.

Social cohesion can be viewed as association between collective decisions, not just between individual actions. The resulting methodology is made, roughly speaking, of three steps. Step one: I put all the economic variables in my model over a common denominator (common scale of measurement). Step two: I calculate the relative cohesion between them with the general concept of a fitness function, which I can express, for example, as the Euclidean distance between local values of variables in question. Step three: I calculate the average of those Euclidean distances, and I calculate its reciprocal, like « 1/x ». This reciprocal is the direct measure of cohesion between decisions, i.e. the higher the value of this precise « 1/x », the more cohesion between different processes of economic decision-making.
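The three steps can be sketched in a few lines. This is a minimal illustration rather than the actual implementation behind the research: it assumes that step one has already been done, i.e. that all variables are expressed on a common scale, and that each variable is a single number at a given moment, so that the Euclidean distance between two of them reduces to the absolute difference. The function name is made up.

```python
from itertools import combinations


def cohesion(values):
    """Reciprocal of the average pairwise Euclidean distance between variables.

    `values` holds the local values of variables already put on a common scale.
    For scalars, sqrt((a - b) ** 2) is simply abs(a - b).
    """
    distances = [abs(a - b) for a, b in combinations(values, 2)]
    mean_distance = sum(distances) / len(distances)
    if mean_distance == 0:
        return float("inf")   # identical decisions: cohesion without bound
    return 1 / mean_distance


loose = cohesion([0.2, 0.4, 0.6])    # decisions spread far apart
tight = cohesion([0.2, 0.25, 0.3])   # decisions close to each other
print(loose, tight)
```

The closer the variables sit to each other, the higher the « 1/x », which is the sense in which a higher value means more cohesion between different processes of decision-making.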

Now, those of you with a sharp scientific edge could say: “Wait a minute, doc. How do you know we are talking about different processes of decision making? How do you know that variable X1 comes from a different process than variable X2?”. This is precisely my point. The swarm theory tells me that if I can observe a changing cohesion between those variables, I can reasonably hypothesise that their underlying decision-making processes are distinct. If, on the other hand, their mutual Euclidean distance stays the same, I hypothesise that they come from the same process.

Summing up, here is the general drift: I take an economic model and I formulate three hypotheses as for the occurrence of collective intelligence in that model. Hypothesis #1: different variables of the model come from different processes of collective decision-making.

Hypothesis #2: the economic system underlying the model has the capacity to learn as a collective intelligence, i.e. to durably increase or decrease the mutual cohesion between those processes. Hypothesis #3: collective learning in the presence of economic shocks is different from the instance of learning in the absence of such shocks.

They look nice, those hypotheses. Now, why the hell should anyone bother? I mean what are the practical outcomes of those hypotheses being true or false? In my experimental perceptron, I express the presence of economic shocks by using hyperbolic tangent as neural function of activation, whilst the absence of shocks (or the presence of countercyclical policies) is expressed with a sigmoid function. Those two yield very different processes of learning. Long story short, the sigmoid learns more, i.e. it accumulates more local errors (thus more experimental material for learning), and it generates a steady trend towards a lower cohesion between variables (decisions). The hyperbolic tangent accumulates less experiential material (it learns less), and it is quite random in arriving at any tangible change in cohesion. The collective intelligence I mimicked with that perceptron looks like the kind of intelligence which, when going through shocks, learns only the skill of returning to the initial position after the shock: it does not create any lasting type of change. Lasting change happens only when my perceptron has a device to absorb and alleviate shocks, i.e. the sigmoid neural function.
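For readers who want to see why the sigmoid can stand for shock absorption and the hyperbolic tangent for exposure to shocks, here is a small numerical comparison of the two functions themselves. It shows only the standard activation functions, not the perceptron discussed above; the list of 'shock' values is arbitrary.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: output always between 0 and 1."""
    return 1 / (1 + math.exp(-x))


# The same disturbances passed through both activation functions.
for shock in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"shock {shock:+.1f}: sigmoid = {sigmoid(shock):.3f}, "
          f"tanh = {math.tanh(shock):+.3f}")

# Width of the response when the input swings from -2 to +2.
sigmoid_swing = sigmoid(2.0) - sigmoid(-2.0)
tanh_swing = math.tanh(2.0) - math.tanh(-2.0)
```

The sigmoid squeezes everything into the interval from 0 to 1 and has a slope of 0,25 at zero, so a given swing in input produces a smaller swing in output. The hyperbolic tangent ranges from -1 to 1, keeps the sign of the shock, and has a slope of 1 at zero, so it passes small shocks through almost unchanged.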

When I have my perceptron explicitly feeding back that cohesion between variables (i.e. feeding back the fitness function considered as a local error), it learns less and changes less, but does not necessarily go through fewer shocks. When the perceptron does not care about feeding back the observable distance between variables, there is more learning and more change, but not more shocks. The overall fitness function of my perceptron changes over time. The ‘over time’ depends on the kind of neural activation function I use. In the case of hyperbolic tangent, it is brutal change over a short time, eventually coming back to virtually the same point that it started from. In the hyperbolic tangent, the passage between various levels of association, according to the swarm theory, is super quick, but not really productive. In the sigmoid, it is definitely a steady trend of decreasing cohesion.

I want to know what the hell I am doing. I feel I have made a few steps towards that understanding, but getting to know what I am doing proves really hard.


[1] Harsanyi, J. C., & Selten, R. (1988). A general theory of equilibrium selection in games. MIT Press Books, 1.

[2] Malkiel, B. G., & Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. The journal of Finance, 25(2), 383-417.

[3] Stradner, J., Thenius, R., Zahadat, P., Hamann, H., Crailsheim, K., & Schmickl, T. (2013). Algorithmic requirements for swarm intelligence in differently coupled collective systems. Chaos, Solitons & Fractals, 50, 100-114.

[4] De Vincenzo, I., Massari, G. F., Giannoccaro, I., Carbone, G., & Grigolini, P. (2018). Mimicking the collective intelligence of human groups as an optimization tool for complex problems. Chaos, Solitons & Fractals, 110, 259-266.

How can I possibly learn on that thing I have just become aware I do?

 

My editorial on You Tube

 

I keep working on the application of neural networks to simulate the workings of collective intelligence in humans. I am currently macheting my way through the model proposed by de Vincenzo et al in their article entitled ‘Mimicking the collective intelligence of human groups as an optimization tool for complex problems’ (2018[1]). In the spirit of my own research, I am trying to use optimization tools for a slightly different purpose, that is for simulating the way things are done. It usually means that I relax some assumptions which come along with said optimization tools, and I just watch what happens.

De Vincenzo et al propose a model of artificial intelligence, which combines a classical perceptron, such as the one I have already discussed on this blog (see « More vigilant than sigmoid », for example) with a component of deep learning based on the observable divergences in decisions. In that model, social agents strive to minimize their divergences and to achieve relative consensus. Mathematically, it means that each decision is characterized by a fitness function, i.e. a function of mathematical distance from other decisions made in the same population.

I take the tensors I have already been working with, namely the input tensor TI = {LCOER, LCOENR, KR, KNR, IR, INR, PA;R, PA;NR, PB;R, PB;NR} and the output tensor TO = {QR/N; QNR/N}. Once again, consult « More vigilant than sigmoid » as for the meaning of those variables. In the spirit of the model presented by De Vincenzo et al, I assume that each variable in my tensors is a decision. Thus, for example, PA;R, i.e. the basic price of energy from renewable sources, which small consumers are charged with, is the tangible outcome of a collective decision. Same for the levelized cost of electricity from renewable sources, the LCOER, etc. For each i-th variable xi in TI and TO, I calculate its relative fitness to the overall universe of decisions, as the average of itself, and of its Euclidean distances to other decisions. It looks like:

 

\[ V(x_i) = \frac{1}{N}\left\{ x_i + \left[(x_i - x_{i;1})^2\right]^{0{,}5} + \left[(x_i - x_{i;2})^2\right]^{0{,}5} + \dots + \left[(x_i - x_{i;K})^2\right]^{0{,}5} \right\} \]

 

…where N is the total number of variables in my tensors, and K = N – 1.

 

In the next step, I can calculate the average of averages, i.e. sum up all the individual V(xi)’s and divide that total by N. That average V*(x) = (1/N) * [V(x1) + V(x2) + … + V(xN)] is the measure of aggregate divergence between individual variables considered as decisions.

Now, I imagine two populations: one who actively learns from the observed divergence of decisions, and another one who doesn’t really. The former is represented with a perceptron that feeds back the observable V(xi)’s into consecutive experimental rounds. Still, it is just feeding that V(xi) back into the loop, without any a priori ideas about it. The latter is more or less what it already is: it just yields those V(xi)’s but does not do much about them.

I needed a bit of thinking as for what exactly that feeding back of the fitness function should look like. In the algorithm I finally came up with, it looks different for the input variables on the one hand, and for the output ones on the other hand. You might remember, from the reading of « More vigilant than sigmoid », that my perceptron, in its basic version, learns by estimating local errors observed in the last round of experimentation, and then adding those local errors to the values of input variables, just to make them roll once again through the neural activation function (sigmoid or hyperbolic tangent), and see what happens.

As I upgrade my perceptron with the estimation of fitness function V(xi), I ask: who estimates the fitness function? What kind of question is that? Well, a basic one. I have that neural network, right? It is supposed to be intelligent, right? I add a function of intelligence, namely that of estimating the fitness function. Who is doing the estimation: my supposedly intelligent network or some other intelligent entity? If it is an external intelligence, mine, for a start, it just estimates V(xi), sits on its couch, and watches the perceptron struggling through the meanders of attempts to be intelligent. In such a case, the fitness function is like sweat generated by a body. The body sweats but does not have any way of using the sweat produced.

Now, if the V(xi) is to be used for learning, the perceptron is precisely the incumbent intelligent structure supposed to use it. I see two basic ways for the perceptron to do that. First of all, the input neuron of my perceptron can capture the local fitness functions on input variables and add them, as additional information, to the previously used values of input variables. Second of all, the second hidden neuron can add the local fitness functions, observed on output variables, to the exponent of the neural activation function.
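The two channels just described can be sketched as follows. This is only one possible reading of the text: it assumes that 'adding to the exponent' means shifting the argument of the sigmoid by the fitness observed on the output variable, and it leaves out the weights, the error estimation and the hyperbolic tangent variant. All names are hypothetical.

```python
import math


def augment_inputs(inputs, v_inputs):
    """Channel 1: add local fitness V(x_i) to the previously used input values."""
    return [x + v for x, v in zip(inputs, v_inputs)]


def sigmoid_with_fitness(h, v_output=0.0):
    """Channel 2 (assumed form): fitness on the output shifts the exponent.

    Plain sigmoid is 1 / (1 + exp(-h)); here the exponent becomes -(h + V).
    """
    return 1 / (1 + math.exp(-(h + v_output)))


# Two inputs with their local fitness values, then the activation with and
# without the fitness observed on an output variable.
boosted = augment_inputs([0.26, 0.48], [0.30, 0.31])
plain = sigmoid_with_fitness(0.0)
shifted = sigmoid_with_fitness(0.0, v_output=0.41)
print(boosted, plain, shifted)
```

Under that reading, a network which never looks at V(xi) corresponds to calling both functions with zeros for the fitness terms, which is the 'sweat that the body does not use' case.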

I explain. I am a perceptron. I start my adventure with two tensors: input TI = {LCOER, LCOENR, KR, KNR, IR, INR, PA;R, PA;NR, PB;R, PB;NR} and output TO = {QR/N; QNR/N}. The initial values I start with are slightly modified in comparison to what was being processed in « More vigilant than sigmoid ». I assume that the initial market of renewable energies – thus most variables of quantity with ‘R’ in subscript – is virtually non-existent. More specifically, QR/N = 0,01 and QNR/N = 0,99 in output variables, whilst in the input tensor I have capital invested in capacity IR = 0,46 (thus a readiness to go and generate from renewables), and yet the crowdfunding flow K is KR = 0,01 for renewables and KNR = 0,09 for non-renewables. If you want, it is a sector of renewable energies which is sort of ready to fire off but hasn’t done anything yet in that department. All in all, I start with: LCOER = 0,26; LCOENR = 0,48; KR = 0,01; KNR = 0,09; IR = 0,46; INR = 0,99; PA;R = 0,71; PA;NR = 0,46; PB;R = 0,20; PB;NR = 0,37; QR/N = 0,01; and QNR/N = 0,99.

Being a pure perceptron, I am dumb as f**k. I can learn by pure experimentation. I have ambitions, though, to be smarter, thus to add some deep learning to my repertoire. I estimate the relative mutual fitness of my variables according to the V(xi) formula given earlier, as arithmetical average of each variable separately and its Euclidean distance to others. With the initial values as given, I observe: V(LCOER; t0) = 0,302691788; V(LCOENR; t0) = 0,310267104; V(KR; t0) = 0,410347388; V(KNR; t0) = 0,363680721; V(IR ; t0) = 0,300647174; V(INR ; t0) = 0,652537097; V(PA;R ; t0) = 0,441356844 ; V(PA;NR ; t0) = 0,300683099 ; V(PB;R ; t0) = 0,316248176 ; V(PB;NR ; t0) = 0,293252713 ; V(QR/N ; t0) = 0,410347388 ; and V(QNR/N ; t0) = 0,570485945. All that stuff put together into an overall fitness estimation is like average V*(x; t0) = 0,389378787.
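For anyone who wants to replay that first estimation, here is a minimal sketch of the V(xi) formula applied to the initial values listed earlier. The dictionary keys are plain-text stand-ins for the subscripted variable names, and the function name is made up. As each variable is a single number, the Euclidean distance between two of them is just the absolute difference.

```python
def fitness(values):
    """V(x_i) for each variable: average of x_i itself and its distances to the others."""
    n = len(values)
    return {
        name: (x + sum(abs(x - other)
                       for key, other in values.items() if key != name)) / n
        for name, x in values.items()
    }


# Initial values of the input and output tensors, as listed in the text.
initial = {
    "LCOE_R": 0.26, "LCOE_NR": 0.48, "K_R": 0.01, "K_NR": 0.09,
    "I_R": 0.46, "I_NR": 0.99, "P_A;R": 0.71, "P_A;NR": 0.46,
    "P_B;R": 0.20, "P_B;NR": 0.37, "Q_R/N": 0.01, "Q_NR/N": 0.99,
}

v = fitness(initial)
v_star = sum(v.values()) / len(v)   # average of averages, V*(x)
for name, value in v.items():
    print(f"V({name}) = {value:.4f}")
print(f"V* = {v_star:.4f}")
```

With those two-decimal inputs, the sketch lands close to several of the figures quoted above, for instance about 0,30 for LCOER and 0,41 for KR. I would not expect it to reproduce every digit, presumably because the published initial values are rounded.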

I ask myself: what happens to that fitness function when I process information with my two alternative neural functions, the sigmoid or the hyperbolic tangent? I jump to experimental round 1500, thus to t1500, and I watch. With the sigmoid, I have V(LCOER; t1500) = 0,359529289; V(LCOENR; t1500) = 0,367104605; V(KR; t1500) = 0,467184889; V(KNR; t1500) = 0,420518222; V(IR ; t1500) = 0,357484675; V(INR ; t1500) = 0,709374598; V(PA;R ; t1500) = 0,498194345; V(PA;NR ; t1500) = 0,3575206; V(PB;R ; t1500) = 0,373085677; V(PB;NR ; t1500) = 0,350090214; V(QR/N ; t1500) = 0,467184889; and V(QNR/N ; t1500) = 0,570485945, with average V*(x; t1500) = 0,441479829.

Hmm, interesting. Working my way through intelligent cognition with a sigmoid, after 1500 rounds of experimentation, I have somehow decreased the mutual fitness of decisions I make through individual variables. Those V(xi)’s have changed. Now, let’s see what it gives when I do the same with the hyperbolic tangent: V(LCOER; t1500) =   0,347752478; V(LCOENR; t1500) =  0,317803169; V(KR; t1500) =   0,496752021; V(KNR; t1500) = 0,436752021; V(IR ; t1500) =  0,312040791; V(INR ; t1500) =  0,575690006; V(PA;R ; t1500) =  0,411438698; V(PA;NR ; t1500) =  0,312052766; V(PB;R ; t1500) = 0,370346458; V(PB;NR ; t1500) = 0,319435252; V(QR/N ; t1500) =  0,496752021; and V(QNR/N ; t1500) = 0,570485945, with average V*(x; t1500) =0,413941802.

Well, it is becoming more and more interesting. Being a dumb perceptron, I can, nevertheless, create two different states of mutual fitness between my decisions, depending on the kind of neural function I use. I want to have a bird’s eye view on the whole thing. How can a perceptron have a bird’s eye view of anything? Simple: it rents a drone. How can a perceptron rent a drone? Well, how smart do you have to be to rent a drone? Anyway, it gives something like the graph below:

 

Wow! So this is what I do, as a perceptron, and what I haven’t been aware of so far? Amazing. When I think in sigmoid, I sort of consistently increase the relative distance between my decisions, i.e. I decrease their mutual fitness. The sigmoid, that function which sort of calms down any local disturbance, makes the decision-making process less coherent, more prone to embracing a little chaos. The hyperbolic tangent thinking is different. It occasionally sort of stretches across a broader spectrum of fitness in decisions, but as soon as it does so, it seems to be afraid of its own actions, and returns to the initial level of V*(x). Please, note that as a perceptron, I am almost alive, and I produce slightly different outcomes in each instance of myself. The point is that in the line corresponding to hyperbolic tangent, the comb-like pattern of small oscillations can stretch and move from instance to instance. Still, it keeps the general form of a comb.

OK, so this is what I do, and now I ask myself: how can I possibly learn from that thing I have just become aware I do? As a perceptron, endowed with this precise logical structure, I can do one thing with information: I can arithmetically add it to my input. Still, having some ambitions for evolving, I attempt to change my logical structure, and I venture to incorporate somehow the observable V(xi) into my neural activation function. Thus, the first thing I do with that new learning is to top the values of input variables with local fitness functions observed in the previous round of experimenting. I am doing it already with local errors observed in outcome variables, so why not double the dose of learning? Anyway, it goes like: xi(t) = xi(t-1) + e(xi; t-1) + V(xi; t-1). It looks interesting, but I am still using just a fraction of information about myself, i.e. just that about input variables. Here is where I start being really ambitious. In the equation of the sigmoid function, I change s = 1 / [1 + exp(-∑xi*wi)] into s = 1 / {1 + exp[-(∑xi*wi + V(To))]}, where V(To) stands for local fitness functions observed in output variables. I do the same by analogy in my version based on hyperbolic tangent. The th = [exp(2*∑xi*wi) - 1] / [exp(2*∑xi*wi) + 1] turns into th = {exp[2*∑xi*wi + V(To)] - 1} / {exp[2*∑xi*wi + V(To)] + 1}. I do what I know how to do, i.e. adding information from fresh observation, and I apply it to change the structure of my neural function.
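The two modified activation functions can be sketched in Python as below. This is a minimal illustration, not the original spreadsheet: the function names are hypothetical, I assume the conventional negative exponent in the sigmoid, and I assume that V(To) is simply added to the weighted sum before the sigmoid is applied. In the hyperbolic tangent, V(To) enters the exponent exactly as written in the formula above, i.e. without being doubled.

```python
import math


def sigmoid_with_feedback(weighted_sum, v_output=0.0):
    """Sigmoid of the weighted sum, topped up with the fitness feedback V(To).

    Assumption: the feedback is added to the weighted sum inside the
    negative exponent. With v_output = 0 this is the plain sigmoid.
    """
    return 1.0 / (1.0 + math.exp(-(weighted_sum + v_output)))


def tanh_with_feedback(weighted_sum, v_output=0.0):
    """Hyperbolic tangent with V(To) added to the exponent 2 * weighted_sum.

    With v_output = 0 this is the plain hyperbolic tangent.
    """
    a = math.exp(2.0 * weighted_sum + v_output)
    return (a - 1.0) / (a + 1.0)
```

With `v_output` left at zero, both functions fall back to their textbook forms, which makes it easy to compare learning with and without the feedback on fitness.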

All those ambitious changes in myself, put together, change my pattern of learning as shown in the graph below:

When I think sigmoid, the fact of feeding back my own fitness function does not change much. It makes the learning curve a bit steeper in the early experimental rounds, and makes it asymptotic to a slightly lower threshold in the last rounds, as compared to learning without feedback on V(xi). Yet, it is the same old sigmoid, with just its sleeves ironed. On the other hand, the hyperbolic tangent thinking changes significantly. What used to look like a comb, without feedback, now looks much more aggressive, like a plough on steroids. There is something like a complex cycle of learning on the internal cohesion of decisions made. Generally, feeding back the observable V(xi) increases the finally achieved cohesion in decisions, and, at the same time, it reduces the cumulative error gathered by the perceptron. With that type of feedback, the cumulative error of the sigmoid, which normally hits around 2,2 in this case, falls to something like 0,8. With hyperbolic tangent, cumulative errors which used to be 0,6 ÷ 0,8 without feedback, fall to 0,1 ÷ 0,4 with feedback on V(xi).

 

The (provisional) piece of wisdom I can have as my takeaway is twofold. Firstly, whatever I do, a large chunk of perceptual learning leads to a bit less cohesion in my decisions. As I learn by experience, I allow myself more divergence in decisions. Secondly, looping on that divergence, and including it explicitly in my pattern of learning leads to relatively more cohesion at the end of the day. Still, more cohesion has a price – less learning.

 

I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund  (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’ You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and become my patron. If you decide so, I will be grateful for suggesting me two things that Patreon suggests me to suggest you. Firstly, what kind of reward would you expect in exchange of supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?

[1] De Vincenzo, I., Massari, G. F., Giannoccaro, I., Carbone, G., & Grigolini, P. (2018). Mimicking the collective intelligence of human groups as an optimization tool for complex problems. Chaos, Solitons & Fractals, 110, 259-266.

More vigilant than sigmoid


I keep working on the application of neural networks as simulators of collective intelligence. The particular field of research I am diving into is the sector of energy, its shift towards renewable energies, and the financial scheme I invented some time ago, which I called EneFin. As for that last one, you can consult « The essential business concept seems to hold », in order to grasp the outline.

I continue developing the line of research I described in my last update in French: « De la misère, quoi ». There are observable differences in the prices of energy according to the size of the buyer. In many countries – practically in all the countries of Europe – there are two distinct price brackets. One, which I further designate as PB, is reserved for contracts with big consumers of energy (factories, office buildings etc.) and it is clearly lower. Another one, further called PA, is applied to small buyers, mainly households and really small businesses.

As an economist, I have that intuitive thought in the presence of price forks: that differential in prices is some kind of value. If it is value, why not give it some financial spin? I came up with the idea of the EneFin contract. People buy energy, in the amount Q, from a local supplier who sources it from renewables (water, wind etc.), and they pay the price PA, thus generating a financial flow equal to Q*PA. That flow buys two things: energy priced at PB, and participatory titles in the capital of their supplier, for the differential Q*(PA – PB). I imagine some kind of crowdfunding platform, which could channel the amount of capital K = Q*(PA – PB).
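The arithmetic of the contract can be sketched in a few lines of Python. The function name `enefin_split` and the numbers in the usage note are hypothetical, made up purely for illustration; only the formula K = Q*(PA – PB) comes from the text.

```python
def enefin_split(quantity, price_small, price_big):
    """Split the payment Q*PA into energy priced at PB and capital K.

    quantity    -- Q, the amount of energy bought
    price_small -- PA, the price paid by small buyers
    price_big   -- PB, the lower price reserved for big buyers
    Returns a tuple (total_flow, energy_part, capital_k).
    """
    if price_small < price_big:
        raise ValueError("the scheme assumes PA is not lower than PB")
    total_flow = quantity * price_small
    energy_part = quantity * price_big
    capital_k = quantity * (price_small - price_big)  # K = Q*(PA - PB)
    return total_flow, energy_part, capital_k
```

For example, with made-up figures of Q = 1000 kWh, PA = 0,20 and PB = 0,15 per kWh, `enefin_split(1000, 0.20, 0.15)` splits a payment of about 200 into about 150 worth of energy and about 50 of capital channelled to the supplier.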

That K remains in some sort of fluid relationship to I, or capital invested in the productive capacity of energy suppliers. Fluid relationship means that each of those capital balances can date other capital balances, no hard feelings held. As we talk (OK, I talk) about prices of energy and capital invested in capacity, it is worth referring to LCOE, or Levelized Cost Of Electricity. The LCOE is essentially the average lifetime cost of producing one unit of energy, and thus a no-go-below limit for energy prices in the long run.

I want to simulate the possible process of introducing that general financial concept, namely K = Q*(PA – PB), into the market of energy, in order to promote the development of diversified networks, made of local suppliers in renewable energy.

Here comes my slightly obsessive methodological idea: use artificial intelligence in order to simulate the process. In classical economic method, I make a model, I take empirical data, I regress some of it on another some of it, and I come up with coefficients of regression, and they tell me how the thing should work if we were living in a perfect world. Artificial intelligence opens a different perspective. I can assume that my model is a logical structure, which keeps experimenting with itself, and we don’t know where the hell exactly that experimentation leads. I want to use neural networks in order to represent the exact way that social structures can possibly experiment with that K = Q*(PA – PB) thing. Instead of optimizing, I want to see the way that possible optimization can occur.

I have that simple neural network, which I already referred to in « The point of doing manually what the loop is supposed to do » and which is basically quite dumb, as it does not do any abstraction. Still, it nicely experiments with logical structures. I am sketching its logical structure in the picture below. I distinguish four layers of neurons: input, hidden 1, hidden 2, and output. When I say ‘layers’, it is a bit of grand language. For the moment, I am working with one single neuron in each layer. It is more of a synaptic chain.

Anyway, the input neuron feeds data into the chain. In the first round of experimentation, it feeds the source data in. In consecutive rounds of learning through experimentation, that first neuron assesses and feeds back local errors, measured as discrepancies between the output of the output neuron, and the expected values of output variables. The input neuron is like the first step in a chain of perception, in a nervous system: it receives and notices the raw external information.

The hidden layers – or the hidden neurons in the chain – modify the input data. The first hidden neuron generates quasi-random weights, which the second hidden neuron attributes to the input variables. Just as in a nervous system, the input stimuli are assessed as for their relative importance. In the original algorithm of perceptron, which I used to design this network, those two functions, i.e. generating the random weights and attributing them to input variables, were fused in one equation. Still, my fundamental intent is to use neural networks to simulate collective intelligence, and I intuitively guess those two functions are somehow distinct. Pondering the importance of things is one action and using that ponderation for practical purposes is another. It is like scientists debating about the way to run a policy, and the government having the actual thing done. These are two separate paths of action.

Whatever. What the second hidden neuron produces is a compound piece of information: the summation of input variables multiplied by random weights. The output neuron transforms this compound data through a neural function. I prepared two versions of this network, with two distinct neural functions: the sigmoid, and the hyperbolic tangent. As I found out, the way they work is very different, just as the results they produce. Once the output neuron generates the transformed data – the neural output – the input neuron measures the discrepancy between the original, expected values of output variables, and the values generated by the output neuron. The exact way of computing that discrepancy is made of two operations: calculating the local derivative of the neural function, and multiplying that derivative by the residual difference ‘original expected output value minus output value generated by the output neuron’. The so calculated discrepancy is considered as a local error, and is being fed back into the input neuron as an addition to the value of each input variable.
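The chain described above can be sketched in Python as follows. It is a rough illustration under stated assumptions, not the original implementation: the names (`run_chain`, `seed` and so on) are hypothetical, the quasi-random weights are drawn uniformly from [0, 1), and, since there is one output neuron for two output variables, I assume the local error is averaged over the expected output values.

```python
import math
import random


def sigmoid(z):
    # Numerically stable form of 1 / (1 + exp(-z))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh_derivative(z):
    return 1.0 - math.tanh(z) ** 2


def run_chain(inputs, expected, activation, derivative, rounds=5000, seed=0):
    """Loop the four-step chain: input -> weights -> weighted sum -> output.

    Returns the final input values and the cumulative error fed back
    into each of them over all the rounds.
    """
    rng = random.Random(seed)
    x = list(inputs)
    cumulative_error = 0.0
    for _ in range(rounds):
        # Hidden neuron 1: quasi-random weights (assumed uniform in [0, 1))
        weights = [rng.random() for _ in x]
        # Hidden neuron 2: summation of inputs multiplied by the weights
        z = sum(xi * wi for xi, wi in zip(x, weights))
        # Output neuron: the neural function and its local derivative
        output = activation(z)
        slope = derivative(z)
        # Local error: derivative times residual (expected minus generated),
        # averaged over the output variables (an assumption of this sketch)
        local_error = sum(slope * (target - output) for target in expected)
        local_error /= len(expected)
        # Input neuron: feed the local error back into every input variable
        x = [xi + local_error for xi in x]
        cumulative_error += local_error
    return x, cumulative_error
```

Calling `run_chain(values, [0.95, 0.48], sigmoid, sigmoid_derivative)` and then `run_chain(values, [0.95, 0.48], math.tanh, tanh_derivative)` on the same list of ten input values gives two parallel instances of learning to compare. Changing `seed` gives a different instance of the same experiment.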

Before I go into describing the application I made of that perceptron, as regards my idea for financial scheme, I want to delve into the mechanism of learning triggered through repeated looping of that logical structure. The input neuron measures the arithmetical difference between the expected output and the output of the network in the preceding round of experimentation, and that difference is being multiplied by the local derivative of said output. Derivative functions, in their deepest, Newtonian sense, are magnitudes of change in something else, i.e. in their base function. In the Newtonian perspective, everything that happens can be seen either as change (derivative) in something else, or as an integral (an aggregate that changes its shape) of still something else. When I multiply the local deviation from expected values by the local derivative of the estimated value, I assume this deviation is as important as the local magnitude of change in its estimation. The faster things happen, the more important they are, so to say. My perceptron learns by assessing the magnitude of local changes it induces in its own estimations of reality.

I took that general logical structure of the perceptron, and I applied it to my core problem, i.e. the possible adoption of the new financial scheme in the market of energy. Here comes sort of an originality in my approach. The basic way of using neural networks is to give them a substantial set of real data as learning material, make them learn on that data, and then make them optimize a hypothetical set of data. Here you have those 20 old cars, take them into pieces and try to put them back together, observe all the anomalies you have thus created, and then make me a new car on the grounds of that learning. I adopted a different approach. My focus is to study the process of learning in itself. I took just one set of actual input values, exogenous to my perceptron, something like an initial situation. I ran 5000 rounds of learning in the perceptron, on the basis of that initial set of values, and I observed how learning was taking place.

My initial set of data is made of two tensors: input TI and output TO.

The thing I am the most focused on is the relative abundance of energy supplied from renewable sources. I express the ‘abundance’ part mathematically as the coefficient of energy consumed per capita, or Q/N. The relative bend towards renewables, or towards the non-renewables is apprehended as the distinction between renewable energy QR/N consumed per capita, and the non-renewable one, the QNR/N, possibly consumed by some other capita. Hence, my output tensor is TO = {QR/N; QNR/N}.

I hypothesise that TO is being generated by input made of prices, costs, and capital outlays. I split my price fork PA – PB (price for the small ones minus price for the big ones) into renewables and non-renewables, namely into: PA;R, PA;NR, PB;R, and PB;NR. I mirror the distinction in prices with that in the cost of energy, which I call LCOER and LCOENR. I want to create a financial scheme that generates a crowdfunded stream of capital K, to finance new productive capacities, and I want it to finance renewable energies, so I call it KR. Still, some other people, like my compatriots in Poland, might be so attached to fossils they might be willing to crowdfund new installations based on non-renewables. Thus, I need to take into account a KNR in the game. When I say capital, and I say LCOE, I sort of feel compelled to say aggregate investment in productive capacity, in renewables, and in non-renewables, and I call it, respectively, IR and INR. All in all, my input tensor spells TI = {LCOER, LCOENR, KR, KNR, IR, INR, PA;R, PA;NR, PB;R, PB;NR}.

The next step is scale and measurement. The neural functions I use in my perceptron like having their input standardized. Their tastes in standardization differ a little. The sigmoid likes it nicely spread between 0 and 1, whilst the hyperbolic tangent, the more reckless of the two, tolerates -1 ≤ x ≤ 1. I chose to standardize the input data between 0 and 1, so as to make it fit into both. My initial thought was to aim for an energy market with great abundance of renewable energy, and a relatively declining supply of non-renewables. I generally trust my intuition, only I like to leverage it with a bit of chaos, every now and then, and so I ran some pseudo-random strings of values and I chose an output tensor made of TO = {QR/N = 0,95; QNR/N = 0,48}.

That state of output is supposed to be somehow logically connected to the state of input. I imagined a market, where the relative abundance in the consumption of, respectively, renewable energies and non-renewable ones is mostly driven by growing demand for the former, and a declining demand for the latter. Thus, I imagined relatively high a small-user price for renewable energy and a large fork between that PA;R and the PB;R. As for non-renewables, the fork in prices is more restrained (than in the market of renewables), and its top value is relatively lower. The non-renewable power installations are almost fed up with investment INR, whilst the renewables could still do with more capital IR in productive assets. The LCOENR of non-renewables is relatively high, although not very: yes, you need to pay for the fuel itself, but you have economies of scale. As for the LCOER for renewables, it is pretty low, which actually reflects the present situation in the market.

The last part of my input tensor regards the crowdfunded capital K. I assumed two different initial situations. In the first one, there is virtually no crowdfunding, thus a very low K. In the second one, some crowdfunding is already alive and kicking, and it is slightly above half of what people expect in the industry.

Once again, I applied those qualitative assumptions to a set of pseudo-random values between 0 and 1. Here comes the result, in the table below.

 

Table 1 – The initial values for learning in the perceptron

Tensor       Variable   Market with virtually no crowdfunding   Market with significant crowdfunding
Input TI     LCOER      0,26                                    0,26
Input TI     LCOENR     0,48                                    0,48
Input TI     KR         0,01                                    0,56
Input TI     KNR        0,01                                    0,52
Input TI     IR         0,46                                    0,46
Input TI     INR        0,99                                    0,99
Input TI     PA;R       0,71                                    0,71
Input TI     PA;NR      0,46                                    0,46
Input TI     PB;R       0,20                                    0,20
Input TI     PB;NR      0,37                                    0,37
Output TO    QR/N       0,95                                    0,95
Output TO    QNR/N      0,48                                    0,48

(!! The two scenarios differ only in KR and KNR.)

 

The way the perceptron works means that it generates and feeds back local errors in each round of experimentation. Logically, over the 5000 rounds of experimentation, each input variable gathers those local errors, like a snowball rolling downhill. I take the values of input variables from the last, i.e. the 5000th round: they have the initial values, from the table above, and, on the top of them, there is cumulative error from the 5000 experiments. How to standardize them, so as to make them comparable with the initial ones? I observe: all those final input values have the same cumulative error in them, across the whole TI input tensor. I choose a simple method for standardization. As the initial values were standardized over the interval between 0 and 1, I standardize the outcoming values over the interval 0 ≤ x ≤ (1 + cumulative error).
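One way to read that standardization is sketched below in Python. It is an interpretation, not the original calculation: I assume that each final value equals the initial value plus the cumulative error, and that it is divided by the new upper bound (1 + cumulative error). The function name `restandardize` is hypothetical.

```python
def restandardize(initial_values, cumulative_error):
    """Bring the final input values back to a scale comparable with 0..1.

    initial_values   -- dict of variable name -> initial standardized value
    cumulative_error -- error accumulated over all the rounds of learning
    Assumption: final value = initial value + cumulative error, and the
    new scale runs from 0 to (1 + cumulative error).
    """
    upper_bound = 1.0 + cumulative_error
    if upper_bound <= 0:
        raise ValueError("1 + cumulative error must be positive")
    return {name: (value + cumulative_error) / upper_bound
            for name, value in initial_values.items()}
```

Under this reading, a variable that started at 1 stays at 1, and all the others get pushed up towards it as the cumulative error grows. A negative cumulative error pushes them down instead, possibly below zero.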

I observe the unfolding of cumulative error along the path of learning, made of 5000 steps. There is a peculiarity in each of the neural functions used: the sigmoid, and the hyperbolic tangent. The sigmoid learns in a slightly Hitchcockian way. Initially, local errors just rocket up. It is as if that sigmoid was initially yelling: ‘F******k! What a ride!’. Then, the value of errors drops very sharply, down to something akin to a vanishing tremor, and starts hovering lazily over some implicit asymptote. Hyperbolic tangent learns differently. It seems to do all it can to minimize local errors whenever it is possible. Obviously, it is not always possible. Every now and then, that hyperbolic tangent produces an explosively high value of local error, like a sudden earthquake, just to go back into forced calm right after. You can observe those two radically different ways of learning in the two graphs below.

Two ways of learning – the sigmoidal one and the hyper-tangential one – bring interestingly different results, just as differentiated are the results of learning depending on the initial assumptions as for crowdfunded capital K. Tables 2 – 5, further below, list the results I got. A bit of additional explanation will not hurt. For every version of learning, i.e. sigmoid vs hyperbolic tangent, and K = 0,01 vs K ≈ 0,5, I ran 5 instances of 5000 rounds of learning in my perceptron. This is the meaning of the word ‘Instance’ in those tables. One instance is like a tensor of learning: one happening of 5000 consecutive experiments. The values of output variables remain constant all the time: TO = {QR/N = 0,95; QNR/N = 0,48}. The perceptron sweats in order to come up with some interesting combination of input variables, given this precise tensor of output.

 

Table 2 – Outcomes of learning with the sigmoid, no initial crowdfunding

The learnt values of input variables after 5000 rounds of learning

Variable           Instance 1   Instance 2   Instance 3   Instance 4   Instance 5
cumulative error   2,11         2,11         2,09         2,12         2,16
LCOER              0,7617       0,7614       0,7678       0,7599       0,7515
LCOENR             0,8340       0,8337       0,8406       0,8321       0,8228
KR                 0,6820       0,6817       0,6875       0,6804       0,6729
KNR                0,6820       0,6817       0,6875       0,6804       0,6729
IR                 0,8266       0,8262       0,8332       0,8246       0,8155
INR                0,9966       0,9962       1,0045       0,9943       0,9832
PA;R               0,9062       0,9058       0,9134       0,9041       0,8940
PA;NR              0,8266       0,8263       0,8332       0,8247       0,8155
PB;R               0,7443       0,7440       0,7502       0,7425       0,7343
PB;NR              0,7981       0,7977       0,8044       0,7962       0,7873

 

 

Table 3 – Outcomes of learning with the sigmoid, with substantial initial crowdfunding

The learnt values of input variables after 5000 rounds of learning

Variable           Instance 1   Instance 2   Instance 3   Instance 4   Instance 5
cumulative error   1,98         2,01         2,07         2,03         1,96
LCOER              0,7511       0,7536       0,7579       0,7554       0,7494
LCOENR             0,8267       0,8284       0,8314       0,8296       0,8255
KR                 0,8514       0,8529       0,8555       0,8540       0,8504
KNR                0,8380       0,8396       0,8424       0,8407       0,8369
IR                 0,8189       0,8207       0,8238       0,8220       0,8177
INR                0,9965       0,9965       0,9966       0,9965       0,9965
PA;R               0,9020       0,9030       0,9047       0,9037       0,9014
PA;NR              0,8189       0,8208       0,8239       0,8220       0,8177
PB;R               0,7329       0,7356       0,7402       0,7375       0,7311
PB;NR              0,7891       0,7913       0,7949       0,7927       0,7877

 

 

 

 

 

Table 4 – Outcomes of learning with the hyperbolic tangent, no initial crowdfunding

The learnt values of input variables after 5000 rounds of learning

Variable           Instance 1   Instance 2   Instance 3   Instance 4   Instance 5
cumulative error   1,1          1,27         0,69         0,77         0,88
LCOER              0,6470       0,6735       0,5599       0,5805       0,6062
LCOENR             0,7541       0,7726       0,6934       0,7078       0,7257
KR                 0,5290       0,5644       0,4127       0,4403       0,4746
KNR                0,5290       0,5644       0,4127       0,4403       0,4746
IR                 0,7431       0,7624       0,6797       0,6947       0,7134
INR                0,9950       0,9954       0,9938       0,9941       0,9944
PA;R               0,8611       0,8715       0,8267       0,8349       0,8450
PA;NR              0,7432       0,7625       0,6798       0,6948       0,7135
PB;R               0,6212       0,6497       0,5277       0,5499       0,5774
PB;NR              0,7009       0,7234       0,6271       0,6446       0,6663

 

 

Table 5 – Outcomes of learning with the hyperbolic tangent, substantial initial crowdfunding

The learnt values of input variables after 5000 rounds of learning

Variable           Instance 1   Instance 2   Instance 3   Instance 4   Instance 5
cumulative error   -0,33        0,2          -0,06        0,98         -0,25
LCOER              -0,1089      0,3800       0,2100       0,6245       0,0110
LCOENR             0,2276       0,5681       0,4497       0,7384       0,3111
KR                 0,3381       0,6299       0,5284       0,7758       0,4096
KNR                0,2780       0,5963       0,4856       0,7555       0,3560
IR                 0,1930       0,5488       0,4251       0,7267       0,2802
INR                0,9843       0,9912       0,9888       0,9947       0,9860
PA;R               0,5635       0,7559       0,6890       0,8522       0,6107
PA;NR              0,1933       0,5489       0,4252       0,7268       0,2804
PB;R               -0,1899      0,3347       0,1522       0,5971       -0,0613
PB;NR              0,0604       0,4747       0,3306       0,6818       0,1620

 

The cumulative error, the first numerical line in each table, is something like memory. It is a numerical expression of how much experience the perceptron has accumulated in the given instance of learning. Generally, the sigmoid neural function accumulates more memory, as compared to the hyper-tangential one. Interesting. The way of processing information affects the amount of experiential data stored in the process. If you use the links I gave earlier, you will see different logical structures in those two functions. The sigmoid generally smoothes out anything it receives as input. It puts the incoming, compound data in the negative exponent of Euler’s number e ≈ 2,72, and then it puts the resulting value, increased by 1, in the denominator of a fraction whose numerator is 1. The sigmoid is like a bumper: it absorbs shocks. The hyperbolic tangent is different. It sort of exposes small discrepancies in input. In human terms, the hyper-tangential function is more vigilant than the sigmoid. As can be observed in this precise case, absorbing shocks leads to more accumulated experience than vigilantly reacting to observable change.

The difference in cumulative error, observable in the sigmoid-based perceptron vs that based on hyperbolic tangent, is particularly sharp in the case of a market with substantial initial crowdfunding K. In 3 instances out of 5, in that scenario, the hyper-tangential perceptron yields a negative cumulative error. It can be interpreted as the removal of some memory, implicitly contained in the initial values of input variables. When the initial K is assumed to be 0,01, the difference in accumulated memory, observable between the two neural functions, significantly shrinks. It looks as if K ≥ 0,5 was some kind of disturbance that the vigilant hyperbolic tangent attempts to eliminate. That impression of disturbance created by K ≥ 0,5 is even reinforced as I synthetically compare all the four sets of outcomes, i.e. tables 2 – 5. The case of learning with the hyperbolic tangent, and with substantial initial crowdfunding, looks radically different from everything else. The discrepancy between alternative instances seems to be the greatest in this case, and the occasional negative values in the input tensor suggest some kind of deep shakeoff. Negative prices and/or negative costs mean that someone external is paying for the ride, probably the taxpayers, in the form of some fiscal stimulation.
