# The kind of puzzle that Karl Friedrich was after

My editorial on YouTube

Over the last few updates, I have been indulging in the mathematical logic of the Gaussian process, eating it with the spoon of mean-reversion. My experience so far with the logic of the Gaussian process concerns my personal strategy of investment in the stock market, and especially those short, periodical episodes of reshuffling in my investment portfolio, when I am exposed to, and frequently yield to, the gambling-like temptation of short trade (see Acceptably dumb proof. The method of mean-reversion, Fast + slower = compound rhythm, the rhythm of life, and We really don’t see small change). Gambling-like is the key concept here. I engage in quick trading, and I feel that special flow, peculiar to gambling behaviour, and yet I want that flow to weave around a rational strategy, very much in the spirit of Abraham de Moivre’s ‘The doctrine of chances: or, A method of calculating the probabilities of events in play’, published in 1756. A bit of gambling, yes, but informed gambling.

I am trying to understand why a neural network based on mean-reversed prices as input consistently underestimates the real price, and why the whole method of mean-reversion fails with super-stable prices, such as those of cobalt or uranium (see We really don’t see small change).

I like understanding things. I like understanding the deep logic of the things I do and the methods I use. Here comes the object of my deep intellectual dive, the normal distribution. In the two pictures below, you can see the initial outline of the problem.

How does a function, namely that of normal distribution, assist my process of decision making? Of course, the first-order answer is simple: ‘it gives you numbers, bro, and when you see those numbers you essentially know what to do’. Good, great, but I want to understand HOW EXACTLY those numbers, thus the function I use, match with my thinking and my action.

Good. I have a function, i.e. that of normal distribution, and for some reason that function works. It works geometrically. The whole mathematical expression serves to create a fraction. If you look carefully at the equation, you will see that it is built as a fraction: a numerator – the exponential part, which never exceeds 1 – divided by a bigger something, the denominator σ·(2π)^0.5 = σ·2.506628275 (elevation to the power 0.5 replaces the sign of square root, which I cannot reproduce exactly from the keyboard). A fraction can be seen from different angles. Firstly, it is a portion of something, like a / b, where a < b. Secondly, as we talk about denominators, a fraction is a change in units of measurement. Instead of measuring reality in units of 1, we measure reality in units of whatever we put in the denominator of the fraction. Thirdly, a fraction is a proportion between two sides of a rectangle, namely the proportion between the shorter side and the longer side.
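To make that fraction tangible, here is a minimal sketch in plain Python. It is my own illustration, with function and variable names of my invention: the numerator of the density never exceeds 1, and the denominator is that σ·(2π)^0.5 discussed above.

```python
import math

def normal_density(x, mu=0.0, sigma=1.0):
    """The normal density written deliberately as a fraction: a numerator
    which never exceeds 1, over the denominator sigma * (2*pi)**0.5."""
    numerator = math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))  # always in (0, 1]
    denominator = sigma * (2.0 * math.pi) ** 0.5                 # sigma * 2.506628...
    return numerator / denominator

# With sigma = 1, the peak of the curve (at x = mu) is 1 / 2.506628... ≈ 0.3989.
peak = normal_density(0.0)
```

Note, in passing, that for a very small σ (below 1/(2π)^0.5 ≈ 0.3989) the denominator itself drops below 1 and the density can exceed 1, which is why it is a density, not a probability.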

Good, so what this function of normal distribution represents is a portion cut out of a bigger something equal to σ·(2π)^0.5, and that something is my unit of measurement, and, at the same time, it is the longer side of a rectangle. The expression σ·(2π)^0.5 is something like one dimension of my world, whilst the whole equation of normal distribution, i.e. the value of that function, makes the other dimension. Is the Gaussian world a rectangular world? I need to know. I start talking to dead people. Usually helps. This time, my interlocutor is Karl Friedrich Gauss, in his General Investigations of Curved Surfaces, presented to the Royal Society, October 8th, 1827.

What many people ignore today is that what we call a Gaussian curve is the outcome of a mathematical problem which, initially, had virtually nothing to do with probability. What Karl Friedrich Gauss (almost) solved was the problem of geodetic measurements, i.e. the distinction between the bird’s-flight distance, and the actual length of the same distance on the rugged and uneven surface of the Earth. I know, when we go through mountains, it is sometimes uphill, sometimes downhill, and, on average, it is flat. Still, when you have to build a railroad through the same mountains, the actual length (read: cost) of rails to put on the ground is much greater than what would be needed for building the same railroad in the plain. That’s the type of puzzle that Karl Friedrich was after.

Someone could say there is no puzzle. You want to know how long a rail you need to go over a mountain, you send surveyors and they measure it. Splendid. Yet, civil engineering involves some kind of interference with the landscape. I can come up with the idea of running my railroad along, like, the half-height of the mountain (instead of going right over its top), or maybe we could sort of shave off the top, couldn’t we, civilised people that we are? Yes, those ideas are all valid, and I can have a lot of them. Sending surveyors each time I come up with a new concept can become terribly time- and money-consuming. What I could do with is a method of approximating each of those alternative distances on a curved surface, a method which strikes a good compromise between exactitude and simplicity.

Gauss assumed that when we convert the observation of anything curved – rugged land, or the orbit of a planet – into linear equations, we lose information. The challenge is to lose as little of it as possible. And here the story starts. Below, you will find a short quote from Gauss: the first paragraph of the introduction.

1.

‘Investigations, in which the directions of various straight lines in space are to be considered, attain a high degree of clearness and simplicity if we employ, as an auxiliary, a sphere of unit radius described about an arbitrary centre, and suppose the different points of the sphere to represent the directions of straight lines parallel to the radii ending at these points. As the position of every point in space is determined by three coordinates, that is to say, the distances of the point from three mutually perpendicular fixed planes, it is necessary to consider, first of all, the directions of the axes perpendicular to these planes. The points on the sphere, which represent these directions, we shall denote by (1), (2), (3). The distance of any one of these points from either of the other two will be a quadrant; and we shall suppose that the directions of the axes are those in which the corresponding coordinates increase.’

Before I go further, a disclaimer is due. What follows is my own development on Karl Friedrich Gauss’s ideas, not an exact summary of his thoughts. If you want to go to the source, go to the source, i.e. to Gauss’s original writings.

In this introductory paragraph, reality is a sphere. Question: what geometrical shape does my perception of reality have? Do I perceive reality as a flat surface, as a sphere (as is the case with Karl Friedrich Gauss), or maybe as a cone, or a cube? How can I know what the geometrical shape of my perception is? Good. I feel my synapses firing a bit faster. There is nothing like an apparently absurd, mindf**king question to kick my brain into higher gear. If I want to know what shape of reality I am perceiving, it is essentially about distance.

I approach the thing scientifically, and I start by positing hypotheses. My perceived reality could be just a point, i.e. everything could be happening together, without any perceived dimension to it. Sort of a super small and stationary life. I could stretch into a segment, and thus give my existence at least one dimension to move along, yet within some limits. If I allow the unknown and the unpredictable into my reality, I can perceive it in the form of a continuous, endless, straight line. Sometimes, my existence can be like a bundle of separate paths, each endowed with its own indefiniteness and its own expanse: this is reality made of a few straight lines in front of me, crossing or parallel to each other. Of course, I can stop messing around with discontinuities and I can generalise those few straight lines into a continuous plane. This could make me ambitious, and I could come to the conclusion that flat is boring. Then I bend the plane into a sphere, and, finally, things get really interesting when I assume that what I initially thought was a sphere is actually a space, i.e. a Russian doll made of a lot of spheres with different radii, packed one into the other.

I am pretty sure that anything else can be made out of those seven cases. If, for example, my perceived reality is a tetrahedron (i.e. any of the Egyptian pyramids after having taken flight, as any spaceship should, from time to time; just kidding), it is a reality made of semi-planes delimited by segments, thus the offspring of a really tumultuous relationship between a segment and a plane etc.

Let’s take any two points in my universe. Why two and not just one? ‘Cause it’s more fun, in the first place, and then, because of an old, almost forgotten technique called triangulation. I did it in my boy scout times, long before the Internet and the commercial use of the Global Positioning System. You are in the middle of nowhere, and you have just a very faint idea of where exactly that nowhere is, and yet you have a map of it. On the map of nowhere, you find points which you are sort of spotting in the vicinity. That mountain on your 11:00 o’clock looks almost exactly like the mountain (i.e. the dense congregation of concentric contour lines) on the map. That radio tower on your 01:00 o’clock looks like the one marked on the map etc. Having just two points, i.e. the mountain and the radio tower, you can already find your position. You need a flat surface to put your map on, a compass (or elementary orientation by the position of the sun), a pencil and a ruler (or anything with a straight, smooth, hard edge). You position your map conformingly to the geographical directions, i.e. the top edge of the map should be perpendicular to the East-West axis (or, in other words, the top edge of the map should be facing North). You position the ruler on the map so that it marks an imaginary line from the mountain in the real landscape to the mountain on the map. You draw that straight line with the pencil. You do the same for the radio tower, i.e. you draw, on the map, a line connecting the real radio tower you can see to the radio tower on the map. Those lines cross on the map, and the crossing point is your most likely position.
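For the programmers among my readers, the crossing of two bearing lines can be sketched in a few lines of Python. This is my own toy illustration (the coordinates, bearings and function names are all made up for the occasion), not any standard surveying routine.

```python
import math

def triangulate(p1, bearing1, p2, bearing2):
    """Find the crossing point of two bearing lines drawn through known
    landmarks p1 and p2. Bearings are compass-style: degrees clockwise
    from North. Coordinates are (east, north) pairs."""
    def direction(bearing):
        rad = math.radians(bearing)
        return (math.sin(rad), math.cos(rad))   # unit vector (east, north)
    d1, d2 = direction(bearing1), direction(bearing2)
    # Solve p1 + t*d1 == p2 + s*d2 for t, with Cramer's rule.
    det = d1[0] * (-d2[1]) - (-d2[0]) * d1[1]
    if abs(det) < 1e-12:
        raise ValueError("parallel bearings: no unique crossing point")
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t = (rx * (-d2[1]) - (-d2[0]) * ry) / det
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# A line bearing 90° (due East) through the mountain at (0, 0), and a line
# bearing 180° (due South) through the radio tower at (5, 5), cross at (5, 0).
position = triangulate((0.0, 0.0), 90.0, (5.0, 5.0), 180.0)
```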

Most likely is different from exact. From my own experience of applying triangulation in the real outdoors (back in the day, before Google Maps, and almost right after Gutenberg printed his first Bible), I know that triangulating with two points is sort of tricky. If my map is really precise (detailed, like military grade), and if it is my lucky day, two points yield a reliable positioning. Still, what used to happen more frequently were doubtful situations. Is the mountain I can see on the horizon the mountain I think it is on the map? Sometimes it is, sometimes not quite. The more points I triangulate my position on, the closer I come to my exact location. If I have like 5 points or more, triangulating on them can even compensate for slight inexactitude in the North-positioning of my map.

The partial moral of the fairy tale is that representing my reality as a sphere around me comes with some advantages: I can find my place in that reality (the landscape) by using just an imperfect representation thereof (the map), and some thinking (the pencil, the ruler, and the compass). I perceive my reality as a sphere, and I assume, following the intuitions of William James, expressed in his ‘Essays in Radical Empiricism’, that “there is only one primal stuff or material in the world, a stuff of which everything is composed, and if we call that stuff ‘pure experience,’ then knowing can easily be explained as a particular sort of relation towards one another into which portions of pure experience may enter. The relation itself is a part of pure experience; one of its ‘terms’ becomes the subject or bearer of the knowledge, the knower, […] the other becomes the object known.” (William James, ‘Essays in Radical Empiricism’).

Good. I’m lost. I can have two alternative shapes of my perceptual world: it can be a flat rectangle, or a sphere, and I keep in mind that both shapes are essentially my representations, i.e. my relations with the primal stuff of what’s really going on. The rectangle serves me to measure the likelihood of something happening, and the unit of likelihood is σ·(2π)^0.5. The sphere, on the other hand, has an interesting property: being in the centre of the sphere is radically different from being anywhere else. When I am in the centre, all points on the sphere are equidistant from me. Whatever happens is always at the same distance from my position: everything is equiprobable. On the other hand, when my current position is somewhere else than the centre of the sphere, points on the sphere are at different distances from me.

Now, things become a bit complicated geometrically, yet they remain logical. Imagine that your world is essentially spherical, and that you have two complementary, perceptual representations thereof, thus two types of maps, and they are both spherical as well. One of those maps locates you in its centre: it is a map of all the phenomena which you perceive as equidistant from you, thus equiprobable as for their possible occurrence. C’mon, you know, we all have that thing: anything can happen, and we don’t even bother which exact thing happens in the first place. This is a state of mind which can be a bit disquieting – it is essentially chaos acknowledged – yet, once you get the hang of it, it becomes interesting. The second spherical map locates you away from its centre, and automatically makes real phenomena different in their distance from you, i.e. in their likelihood of happening. That second map is more structured than the first one. Whilst the first is chaos, the second is order.

The next step is to assume that I can have many imperfectly overlapping chaoses in an otherwise ordered reality. I can squeeze, into an overarching, ordered representation of reality, many local, chaotic representations thereof. Then, I can just slice through the big and ordered representation of reality, following one of its secant planes. I can obtain something that I try to represent graphically in the picture below. Each point under the curve of normal distribution can correspond to the centre of a local sphere, with points on that sphere being equidistant from the centre. This is a local chaos. I can fit indefinitely many local chaoses of different size under the curve of normal distribution. The sphere in the middle, the one that touches the very belly of the Gaussian curve, roughly corresponds to what is called ‘standard normal distribution’, with mean μ = 0, and standard deviation σ = 1. This is my central chaos, if you want, and it can have indefinitely many siblings, i.e. other local chaoses, located further towards the tails of the Gaussian curve.

An interesting proportion emerges between the sphere in the middle (my central chaos), and all the other spheres I can squeeze under the curve of normal distribution. That central chaos groups all the phenomena which are one standard deviation away from me; remember: σ = 1. All the points on the curve correspond to indefinitely many intersections between indefinitely many smaller spheres (smaller local chaoses), and the likelihood of each of those intersections happening is always a fraction of σ·(2π)^0.5 = σ·2.506628275. The normal curve, with its inherent proportions, represents the combination of all the possible local chaoses in my complex representation of reality.

Good, so when I use the logic of mean-reversion to study stock prices and to elaborate an investment strategy, thus when I denominate the differences between those prices and their moving averages in units of standard deviation, it is as if I assumed that one standard deviation makes my unit: σ = 1. In other words, I am in the sphere of central chaos, and I discriminate stock prices into three categories, depending on the mean-reversed price. Those in the interval -1 ≤ mean-reversed price ≤ 1 are in my central chaos, which is essentially the ‘hold stock’ chaos. Those which bear a mean-reversed price < -1 are in the peripheral chaos of the ‘buy’ strategy. Conversely, those with mean-reversed price > 1 are in another peripheral chaos, that of the ‘sell’ strategy.
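That discrimination into three chaoses translates into a trivially simple rule. A sketch in Python, with naming of my own and the thresholds set exactly at ±1 standard deviation:

```python
def signal(price, moving_average, moving_std):
    """Classify a price by its mean-reversed value: the deviation from the
    moving average, denominated in units of moving standard deviation."""
    z = (price - moving_average) / moving_std
    if z < -1.0:
        return "buy"    # peripheral chaos below the central sphere
    if z > 1.0:
        return "sell"   # peripheral chaos above it
    return "hold"       # the central chaos: -1 <= z <= 1
```

For instance, a price of 80 against a moving average of 100 and a moving standard deviation of 10 sits at z = -2, deep in the ‘buy’ chaos.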

Now, I am trying to understand why a neural network based on mean-reversed prices as input consistently underestimates the real price, and why the whole method of mean-reversion fails with super-stable prices, such as those of cobalt or uranium (see We really don’t see small change). When prices are super-stable, thus when the moving standard deviation is σ = 0, mean-reversion, with its denomination in standard deviations, yields the ‘Division by zero!’ error, which is the mathematical equivalent of ‘WTF?’. When σ = 0, my central chaos (the central sphere under the curve) shrinks to a point, devoid of any radius. Interesting. Things that change below the level of my perception deprive me of my central sphere of chaos. I am left with just the possible outliers (peripheral chaoses), without a ruler to measure them.
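The ‘WTF?’ moment is easy to reproduce. A quick sketch, again in plain Python (population standard deviation, names of my own invention), of how a perfectly flat price series kills the denomination:

```python
import statistics

def mean_reverse(prices):
    """Mean-reverse a price series: each deviation from the mean,
    denominated in units of standard deviation."""
    mu = statistics.mean(prices)
    sigma = statistics.pstdev(prices)
    if sigma == 0.0:
        # Super-stable prices: the central sphere shrinks to a point,
        # and there is no ruler left to measure anything with.
        raise ZeroDivisionError("sigma = 0: no unit of measurement left")
    return [(p - mu) / sigma for p in prices]
```

With a series like [40.0, 40.0, 40.0] – think those super-stable quotations of cobalt or uranium – the function has nothing to divide by.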

As regards the estimated output of my neural network (I mean, not the one in my head, the one I programmed) being consistently below real prices, I understand it as a proclivity of said network to overestimate the relative importance of peripheral chaoses in the [x < -1] ‘buy’ zone, and, on the other hand, to underestimate peripheral chaoses existing in the [x > 1] ‘sell’ zone. My neural network is sort of myopic to peripheral chaoses located far above (or to the right of, if you prefer) the centre of my central chaos. If, as I deeply believe, the logic of mean-reversion represents an important cognitive structure in my mind, said mind tends to sort of leave one gate unguarded. In the case of price estimation, it is the gate of ‘sell’ opportunities, which, in turn, leads me to buy and hold whatever I invest in, rather than exchanging it back into money (which is the exact economic content of what we call ‘selling’).

Interesting. When I use the normal distribution to study stock prices, one tail of the distribution – the one with abnormally high values – is sort of neglected to the benefit of the other tail, that with low values. It looks like the normal distribution is not really normal, but biased.

# The expected amount of what can happen

I am working on that customer forecast thing for my EneFin project. I want to make my forecast of the volume and value of sales, as well as of the number of customers, as accurate as possible. In my last three updates – Le modèle d’un marché relativement conformiste, Safely narrow down the apparent chaos, and La valeur espérée – I sort of kept turning around that forecast, testing and discussing various angles of approach. So far, I have been sticking to one central, and somehow implicit, assumption, namely that the EneFin project – that transactional platform for trading complex contracts, combining futures on energy with participatory deeds – will tap into patterns observable in the market of energy. Still, EneFin is essentially a FinTech concept, which just explores those large disparities between the retail prices of electricity in Europe. Essentially, the concept is applicable in any market with noticeable variance in prices at the same tier of the value chain. Thus, I could look for good patterns and assumptions in the market of financial services, even very straightforwardly in the FinTech sector.

Good, time to work up to the desired synaptic tension. I open up calmly, with the financial results of Square Inc., a big, US-based FinTech company. I am rummaging in their SEC filings, and more specifically in their 10-K annual report for 2017. I am spotting that nice history of revenues, which I present here below, first as a table with values given in millions of dollars, then as two consecutive graphs, just to give you an idea of proportions.

Table 1

Revenue of Square Inc., USD mln

| Year | Transaction-based revenue | Subscription-based revenue | Hardware revenue | Total net revenue |
|------|---------------------------|----------------------------|------------------|-------------------|
| 2013 | 433.74 | – | 4.24 | 552.43 |
| 2014 | 707.80 | 12.05 | 7.32 | 850.19 |
| 2015 | 1 050.45 | 58.01 | 16.38 | 1 267.12 |
| 2016 | 1 456.16 | 129.35 | 44.31 | 1 708.72 |
| 2017 | 1 920.17 | 252.66 | 41.42 | 2 214.25 |

Graph 1

Graph 2

The revenues of Square Inc., in terms of sheer size, are a bit out of reach for any startup in the FinTech industry. What I am interested in are mostly proportions. Here, in this update, I am going to apply one particular path of thinking to studying those sizes and proportions. Mind you: this is basic science in action. ‘Basic’ means that I take the very basic analytical tools of logic and mathematics, and I sort of count my way through that data. In educational terms, it is a good example of how you can use the most fundamental logical structures you have in your personal toolbox and invent a method of discovering reality.

And so I discover. I start with the category ‘Subscription-based revenue’, as it looks very much like a startup inside an established business, i.e. it starts from scratch. Intrapreneurship, it is called, I believe. My goal is to find benchmarks for my EneFin project, and, more specifically, to form some understanding about the way a FinTech project can build its customer base. The specific history of subscription-based revenue at Square Inc. is a process I want to squeeze as much information out of as possible. So I start squeezing. A process is an account of happening. It is like a space made of phenomena, carved out of a larger space where, technically, anyone can do anything. I take each year of that time series, from 2013 through 2017, as a point in a space, and in a space, distance matters. So I measure distances, the Euclidean ones. In a unidimensional space, as is the case here, the Euclidean distance between two points is very much akin to the local deviation. I subtract the value at point B from the value at point A, and, just to be sure of getting rid of that impertinent minus that could possibly poke its head out of the computation, I take the so-obtained difference to its square power, so I do (A – B)^2, just to take a square root of that square immediately afterwards: [(A – B)^2]^0.5.

The logic of the Euclidean distance is basically made for planes, i.e. for two-dimensional spaces. In that natural environment of its own, the Euclidean distance looks very much like the I-hope-really-familiar-to-you Pythagorean theorem. C’mon, you know that: a^2 + b^2 = c^2, in a right triangle. Now, if you place your right triangle in a manifold with numerical coordinates, your line segments a, b, and c become a = x2 – x1, b = y2 – y1, and c = [(x2 – x1)^2 + (y2 – y1)^2]^0.5. If you have more than two dimensions, i.e. when your space truly becomes a space, the formula generalises gracefully: you sum up the squared differences along each dimension, and you take the square root of the whole. Complicated? I hope so, honestly. If it wasn’t, I couldn’t play the smart guy here.
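In code, the whole thing collapses into one line, whatever the number of dimensions. A sketch of my own, nothing more:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples.
    In one dimension it boils down to the absolute difference |A - B|;
    in two, it is exactly the Pythagorean hypotenuse."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# One dimension: euclidean((3,), (7,)) gives 4.0, i.e. |3 - 7|.
# Two dimensions: euclidean((0, 0), (3, 4)) gives 5.0, the hypotenuse.
```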

Right, my Square Inc. case study. I am coming back to it. I take that history of growing revenues in the ‘Subscription-based’ category and I consider it as a specific, local unfolding of events in a space. I calculate distances, in millions of dollars, between each pair of years. I take the value of revenues in a given year and I subtract it from the value of revenues in any other given year. I treat the so-obtained difference with that anti-minus, square-root-of-square-power therapy. The picture below summarizes that part of the analytical process, and Table 2, further below the picture, gives the numerical results, i.e. the Euclidean distances between each given pair of years, in millions of dollars in revenue, and corrected for the temporal distance in that given pair of years.
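Here is how the numbers of Table 2 can be reconstructed in Python. The correction for temporal distance seems to amount to treating each year as a point (year, revenue) in a plane – that reading is my own assumption, but it reproduces the values in Table 2 to within rounding:

```python
import math

years = [2013, 2014, 2015, 2016, 2017]
# Subscription-based revenue of Square Inc., USD mln (nil in 2013).
subscription = [0.0, 12.05, 58.01, 129.35, 252.66]

def year_distance(i, j):
    """Euclidean distance between two years treated as points (year, revenue):
    the jump in revenue combined with the temporal distance."""
    return math.sqrt((subscription[i] - subscription[j]) ** 2
                     + (years[i] - years[j]) ** 2)

distances = [[year_distance(i, j) for j in range(5)] for i in range(5)]
# distances[0][1] ≈ 12.09 and distances[0][4] ≈ 252.70, as in Table 2.
```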

Table 2

Euclidean distance in subscription-based revenues, USD mln, over time

| Year | 2013 | 2014 | 2015 | 2016 | 2017 |
|------|------|------|------|------|------|
| 2013 | – | 12.09 | 58.05 | 129.39 | 252.70 |
| 2014 | 12.09 | – | 45.98 | 117.32 | 240.64 |
| 2015 | 58.05 | 45.98 | – | 71.35 | 194.66 |
| 2016 | 129.39 | 117.32 | 71.35 | – | 123.32 |
| 2017 | 252.70 | 240.64 | 194.66 | 123.32 | – |

Now, as we have those results, what’s the next step? The next step consists in a bit of intellectual gymnastics. Those Euclidean distances in Table 2, they are happenings. They reflect the amount of sales that happened in between those pairs of years. Each year is a checkpoint: those revenues are measured at the end – or, more exactly, after the closure – of the fiscal year. Between 2014 and 2015, there are 365 days of temporal distance etc.

We have a set of happenings. What is the kind of happening that we can expect the most to happen? Answer: the average. Yes, the average. Why the average? Because the average is the expected value in a set of numerical observations. You can go back to Safely narrow down the apparent chaos if you need to refresh your background. This is the intuition running from de Moivre to Laplace, formalised in the law of large numbers: as observations accumulate, their average closes in on the expected value. I am just reverting the order of ideas. I claim that the average is the expected value.

The average from Table 2 is $124.5 mln. This is the expected amount of what can happen, in one year, to the revenues of Square Inc. from subscription-based sales. It serves me to denominate the actual revenues as reported in Table 1. By denominating, I mean taking the actual, subscription-based revenue from each year, and dividing it by that average Euclidean distance. You can see the result in the picture below. Some kind of cycle seems to emerge: this particular branch of business at Square Inc. needed like 4 years to exceed the expected amount of what can happen in one year, namely the average Euclidean distance.
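The denomination step, spelled out in Python (the distances are the ten distinct values of Table 2; the variable names are mine):

```python
# The ten distinct Euclidean distances from Table 2, USD mln.
distances = [12.09, 58.05, 129.39, 252.70, 45.98,
             117.32, 240.64, 71.35, 194.66, 123.32]
expected_happening = sum(distances) / len(distances)     # ≈ 124.55

# Subscription-based revenue, denominated in units of that expectation.
subscription = [0.0, 12.05, 58.01, 129.35, 252.66]
denominated = [r / expected_happening for r in subscription]
# The ratio first climbs above 1 in 2016, the fourth year of the series.
```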

A good scientist checks his facts. Firstly, it is in order to make sure they are his facts. Sometimes, quite embarrassingly, they turn out to be somebody else’s facts, and that creates awkward situations when it comes to sharing the merit, and the cash, coming with a Nobel award. Secondly, checking facts broadens one’s intellectual horizons, although it might hurt a bit. So I am checking my facts. Good scientist, check!

I repeat the same computational procedure with the two remaining categories of revenues at Square Inc.: the transaction-based ones, and those coming from the sales of hardware. Still, what I do is only almost the same computational procedure. The ‘almost’ part regards the fact that those two other fields of business had non-null revenues in 2013, when the publicly disclosed financial reporting starts. Subscription-based revenues started from literal scratch, and those two others already had something under their respective belts in 2013. In order to make my calculations mutually comparable, I need to transform the time series of transaction-based and hardware-based revenues so that they look as if they were starting from nearly nothing.

This is simple. You want to make people look as if they were starting from scratch? Just take their money from them. Usually works, this one. This is what I do. I take $433.73 mln from each year of transaction-based sales, and $4.23 mln from each year of hardware-based revenues. Instantaneously, both look younger, and, as soon as they do, I make them do the same gymnastics. Get Euclidean, one! Compute the expected Euclidean, two! Divide reality by the expected Euclidean, three!
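The same gymnastics, one-two-three, in Python. My own sketch, on the transaction-based series, with the temporal correction again read as treating each year as a point (year, value) in a plane:

```python
import math

def expected_euclidean(values, years):
    """Average Euclidean distance over all distinct pairs of years, with
    each year treated as a point (year, value) in a plane."""
    pairs = [(i, j) for i in range(len(values)) for j in range(i + 1, len(values))]
    dists = [math.sqrt((values[i] - values[j]) ** 2 + (years[i] - years[j]) ** 2)
             for i, j in pairs]
    return sum(dists) / len(dists)

years = [2013, 2014, 2015, 2016, 2017]
transaction = [433.74, 707.80, 1050.45, 1456.16, 1920.17]
shifted = [r - 433.73 for r in transaction]    # look younger: start near zero
unit = expected_euclidean(shifted, years)      # the expected amount of happening
denominated = [r / unit for r in shifted]      # divide reality by it
```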

Seems to work. In those two other categories of revenues, I can observe a slightly shorter cycle of achieving the expected amount of happening, like 3+ years. Useful for that business plan of mine, for the EneFin project.

You can see the general drift of those calculations in the pictures and tables that follow below. Now, one thing to keep in mind. What I am doing here is having fun with science, just as we can have fun with painting, photography, sport or travel: you take some simple tools, and you just see what happens when you use them the way you think could be interesting. This is probably the strongest message I want to deliver in this entire scientific blog of mine: it is fun to have fun with science.

Table 3

Euclidean distance in transaction-based revenue, USD mln, over time

| Year | 2013 | 2014 | 2015 | 2016 | 2017 |
|------|------|------|------|------|------|
| 2013 | – | 274.06 | 616.71 | 1 022.43 | 1 486.44 |
| 2014 | 274.06 | – | 342.65 | 748.36 | 1 212.38 |
| 2015 | 616.71 | 342.65 | – | 405.72 | 869.73 |
| 2016 | 1 022.43 | 748.36 | 405.72 | – | 464.02 |
| 2017 | 1 486.44 | 1 212.38 | 869.73 | 464.02 | – |

Table 4

Euclidean distance in hardware-based revenues, USD mln, over time

| Year | 2013 | 2014 | 2015 | 2016 | 2017 |
|------|------|------|------|------|------|
| 2013 | – | 3.24 | 12.30 | 40.18 | 37.39 |
| 2014 | 3.24 | – | 9.11 | 37.04 | 34.22 |
| 2015 | 12.30 | 9.11 | – | 27.95 | 25.12 |
| 2016 | 40.18 | 37.04 | 27.95 | – | 3.06 |
| 2017 | 37.39 | 34.22 | 25.12 | 3.06 | – |

I am consistently delivering good, almost new science to my readers, and I love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful for your answers to two questions that Patreon suggests I ask you. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?


# Safely narrow down the apparent chaos

There is that thing about me: I like understanding. I represent my internal process of understanding as the interplay of three imaginary entities: the curious ape, the happy bulldog, and the austere monk. The curious ape is the part of me who instinctively reaches for anything new and interesting. The curious ape does basic gauging of that new thing: ‘can kill or hopefully not always?’, ‘edible or unfortunately not without risk?’ etc. When it does not always kill and can be eaten, the happy bulldog is released from its leash. It takes pleasure in rummaging around things, sniffing and digging in search of adjacent phenomena. Believe me, when my internal happy bulldog starts sniffing around and digging things out, they just pile up. Whenever I study a new topic, the folder I have assigned to it swells like a balloon, with articles, books, reports, websites etc. A moment comes when those piles of adjacent phenomena start needing some order, and this is when my internal austere monk steps into the game. His basic tool is Ockham’s razor, which cuts the obvious from the dubious, and thus, eventually, cuts the bullshit off.

In my last update in French, namely in Le modèle d’un marché relativement conformiste, I returned to that business plan for the project EneFin, and the first thing my internal curious ape is gauging right now is the so-called absorption by the market. EneFin is supposed to be an innovative concept, and, as any innovation, it will need to kind of get into the market. It can do so as people in the market opt to shift from being just potential users to being actual ones. In other words, the success of any business depends on a sequence of decisions taken by people who are supposed to be customers.

People are supposed to make decisions regarding my new products or technologies. Decisions have their patterns. I wrote more about this particular issue in an update on this blog, entitled ‘And so I ventured myself into the realm of what people think they can do’, for example. Now, I am interested in the more marketing-oriented, aggregate outcome of those decisions. The commonly used theoretical tool here is the normal distribution (see for example Robertson): we assume that, as customers switch to purchasing that new thing, the population of users grows as a cumulative normal fraction (i.e. a fraction based on the normal distribution) of the general population.
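To be specific about that theoretical tool, here is a sketch of the cumulative-normal absorption model. The parameter names t_mid and spread are my own; the normal CDF itself is computed with the error function:

```python
import math

def adoption_fraction(t, t_mid, spread):
    """Share of the general population that has switched to the new product
    by moment t, modelled as a cumulative normal: t_mid is the moment of
    50% adoption, spread is the standard deviation of switching moments."""
    z = (t - t_mid) / spread
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # the normal CDF

# Well before the midpoint almost nobody has switched; at the midpoint,
# exactly half of the market has; well after it, nearly everyone.
```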

As I said, I like understanding. What I want is to really understand the logic behind simulating aggregate outcomes of customers’ decisions with the help of the normal distribution. Right, then let’s do some understanding. Below, I am introducing two graphical presentations of the normal distribution: the first is the ‘official’ one, the second, further below, is my own, uncombed and freshly woken up interpretation.

So, the logic behind the equation starts biblically: in the beginning, there is chaos. Everyone can do anything. Said chaos occurs in a space based on the constant e = 2,71828, known as the base of the natural logarithm and reputed to be really handy for studying dynamic processes. This space is e^x. Any customer can take any decision in a space made by ‘e’ elevated to the power ‘x’, or the power of the moment. Yes, ‘x’ is a moment, i.e. the moment when we observe the distribution of customers’ decisions.

Chaos gets narrowed down by referring to µ, or the arithmetical average of all the moments studied. This gives the expression (x – µ)², or the local variance, observable in the moment x. In order to have an arithmetical average, and have it the same in all the moments ‘x’, we need to close the frame, i.e. to define the set of x’s. Essentially, we are saying to that initial chaos: ‘Look, chaos, it is time to pull yourself together a bit, and so we peg down the set of moments you contain, we draw an average of all those moments, and that average is sort of the point where 50% of you, chaos, is being taken and recognized, and we position every moment x regarding its distance from the average moment µ’.

Thus, the initial chaos ‘e power x’ gets dressed a little, into ‘e power (x – µ)²’. Still, a dressed chaos is still chaos. Now, there is that old intuition, progressively unfolded by Isaac Newton, Gottfried Wilhelm Leibniz and Abraham de Moivre at the turn of the 17th and 18th centuries, then grounded by Carl Friedrich Gauss and Thomas Bayes: chaos is a metaphysical concept born out of insufficient understanding, ‘cause your average reality, babe, has patterns and structures in it.

The way that things structure themselves is most frequently sort of a mainstream fashion that most events stick to, accompanied by fringe phenomena that want to be remembered as the rebels of their time (right, space-time). The mainstream fashion is observable as an expected value. The big thing about maths is being able to discover by yourself that when you add up all the moments in the apparent chaos, and then divide the so-obtained sum by the number of moments added, you get a value which we call the arithmetical average, which doesn’t actually exist in that set of moments, and yet which sets the mainstream fashion for all the moments in that apparent chaos. Moments tend to stick around the average, whose habitual nickname is ‘µ’.

Once you have the expected value, you can slice your apparent chaos in two, sort of respectively on the right, and on the left of the expected value that doesn’t actually exist. In each of the two slices you can repeat the same operation: add up everything, then divide by the number of items in that everything, and get something expected that doesn’t exist. That second average can have two alternative properties as regards structuring. On the one hand, it can set another mainstream, sort of next door to that first mainstream: moments on one side of the first average tend to cluster and pile up around that second average. If so, it means that we have another expected value, and we should split our initial, apparent chaos into two separate chaoses, each with its expected value inside, and study each of them separately. On the other hand, that second average can be rather insignificant in its power of clustering moments: it is then just the expected distance from the first average (strictly speaking, the square root of the average squared distance), and we call it standard deviation, habitually represented with the Greek sigma.
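
That double averaging is easy to sketch numerically. A minimal example on a made-up set of moments (the numbers mean nothing in particular); note that the plain average of distances and the standard deviation are two slightly different animals:

```python
# Sketch of the two averaging steps described above, on a hypothetical
# set of "moments" (the numbers are made up for illustration).
moments = [2, 3, 5, 7, 11, 13, 17, 19]

# First pass: add everything up, divide by the count -> the expected
# value µ, which need not exist anywhere in the set itself.
mu = sum(moments) / len(moments)

# Second pass: average the distances from µ. The plain average of
# absolute distances is one option; the standard deviation averages
# the *squared* distances and then takes the square root.
mean_abs_dev = sum(abs(x - mu) for x in moments) / len(moments)
sigma = (sum((x - mu) ** 2 for x in moments) / len(moments)) ** 0.5

print(mu)            # 9.625 — not a member of the set
print(mean_abs_dev)  # 5.375
print(sigma)         # about 5.98 — close to, but not equal to, the above
```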

We have the expected distance (i.e. standard deviation) from the expected value in our apparent chaos, and it allows us to call our chaos in for further tidying up. We go and slice off some parts of that chaos which seem not really relevant to our mainstream. Firstly, we do it by dividing the exponent, being the local variance (x – µ)², by twice the general variance, or two times sigma to the power of two. We can be even meaner and add a minus sign in front of that divided local variance, and it means that instead of expanding our constant e = 2,71828 into a larger space, we are actually folding it into a smaller space. Thus, we get a space much smaller than the initial ‘e power (x – µ)²’.
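
A quick numeric look at that folding. The values of µ and σ below are illustrative assumptions of mine, not anything fixed by the argument yet:

```python
import math

mu, sigma = 84.0, 42.0  # illustrative values, assumed for the sketch

def unfolded(x):
    # the "dressed chaos": e to the power (x - µ)², exploding away from µ
    return math.exp((x - mu) ** 2)

def folded(x):
    # divide the local variance by 2σ² and flip the sign:
    # now the space shrinks as x moves away from µ
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

print(unfolded(mu + 3))  # e^9, already around 8103 — and it only gets worse
print(folded(mu))        # exactly 1.0 at the average
print(folded(mu + 84))   # two sigmas away: e^(-2), about 0.135
```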

Now, we progressively chip some bits out of that smaller, folded space. We divide it by the standard deviation. I know, technically we multiply it by one divided by the standard deviation, but if you are like older than twelve, you can easily understand the equivalence here. Next, we multiply the so-obtained quotient by that funny constant: one divided by the square root of two times π. This constant is 0,39894228 and, if my memory is correct, it was a big discovery on the part of Carl Friedrich Gauss: in any apparent chaos, you can safely narrow down the number of the realistically possible occurrences to like four tenths of that initial chaos.
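
The whole chipping sequence can be assembled step by step. A minimal sketch, with the function name being mine and µ, σ again just illustrative; it also checks that funny constant along the way:

```python
import math

GAUSS_CONST = 1 / math.sqrt(2 * math.pi)  # the 0,39894228 quoted above

def normal_pdf(x, mu, sigma):
    # fold the local variance into 2σ², divide by σ, apply the constant
    folded = math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
    return GAUSS_CONST * folded / sigma

# At x = µ the folded part equals 1, so the curve peaks at 0,39894228 / σ.
print(round(GAUSS_CONST, 8))   # 0.39894228
print(normal_pdf(84, 84, 84))  # peak of a µ = 84, σ = 84 curve
```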

After all that chipping we did to our initial, charmingly chaotic ‘e power x’ space, we get the normal space, or that contained under the curve of normal distribution. This is what the whole theory of probability, and its rich pragmatic cousin, statistics, are about: narrowing down the range of uncertain, future occurrences to a space smaller than ‘anything can happen’. You can do it in many ways, i.e. we have many different statistical distributions. The normal one is like the top dog in that yard, but you can easily experiment with the steps described above and see by yourself what happens. You can kick that Gaussian constant 0,39894228 out of the equation, or you can make it stronger by taking away the square root and just keeping two times π in its denominator; you can divide the local variance (x – µ)² by just one time its cousin the general variance instead of twice, etc. I am persuaded that this is what Carl Friedrich Gauss did: he kept experimenting with equations until he came up with something practical.
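
Those experiments are cheap to run numerically. A sketch, under arbitrary choices of integration range and step: of the variants below, only the standard recipe narrows the whole space down to exactly 1, i.e. to a proper probability:

```python
import math

def integrate(f, lo, hi, n=20_000):
    # crude midpoint-rule integration, good enough for a sanity check
    step = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * step) for i in range(n)) * step

mu, sigma = 0.0, 1.0  # illustrative values

def standard(x):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def no_constant(x):
    # the Gaussian constant kicked out of the equation
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def one_variance(x):
    # the local variance divided by one time σ² instead of 2σ²
    return math.exp(-((x - mu) ** 2) / (sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

print(integrate(standard, -10, 10))      # ≈ 1.0
print(integrate(no_constant, -10, 10))   # ≈ √(2π) ≈ 2.5066
print(integrate(one_variance, -10, 10))  # ≈ 1/√2 ≈ 0.7071
```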

And so am I, I mean I keep experimenting with equations so as to come up with something practical. I am applying all that elaborate philosophy of harnessed chaos to my EneFin thing and to predicting the number of my customers. As I am using the normal distribution as my basic quantitative screwdriver, I start with assuming that however many customers I get, that ‘however many’ is always a fraction (percentage) of a total population. This is what statistical distributions are meant to yield: a probability, thus a fraction of reality, elegantly expressed as a percentage.

I take a planning horizon of three years, just as I do in the Business Planning Calculator, that analytical tool you can download from a subpage of https://discoversocialsciences.com. In order to make my curves smoother, I represent those three years as 36 months. This is my set of moments ‘x’, ranging from 1 to 36. The expected, average value that does not exist in that range of moments is the average time that a typical potential customer, out there, in the total population, needs to try and buy energy via EneFin. I have no clue, although I have an intuition. In the research on innovative activity in the realm of renewable energies, I have discovered something like a cycle. It is the time needed for the annual number of patent applications to double, with respect to a given technology (wind, photovoltaic etc.). See Time to come to the ad rem, for example, for more details. That cycle seems to be 7 years in Europe and in the United States, whilst it drops down to 3 years in China.

I stick to 7 years, as I am mostly interested, for the moment, in the European market. Seven years equals 7*12 = 84 months. I provisionally choose those 84 months as my average µ for using the normal distribution in my forecast. Now, the standard deviation. Once again, no clue, and an intuition. The intuition’s name is ‘coefficient of variability’, which I baptise ß for the moment. Variability is the coefficient that you get when you divide the standard deviation by the average value. Another proportion. The greater the ß, the more dispersed is my set of customers into different subsets: lifestyles, cities, neighbourhoods etc. Conversely, the smaller the ß, the more conformist is that population, with relatively more people sailing in the mainstream. I casually assume my variability to be found somewhere in 0,1 ≤ ß ≤ 2, with a step of 0,1. With µ = 84, that makes my Ω (another symbol for sigma, or standard deviation) fall into 0,1*84 ≤ Ω ≤ 2*84 <=> 8,4 ≤ Ω ≤ 168. At ß = 0,1 => Ω = 8,4, my customers are boringly similar to each other, whilst at ß = 2 => Ω = 168 they are like separate tribes.
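
That grid of assumptions can be spelled out explicitly; a trivial sketch, just to keep the decimals honest:

```python
# The assumed grid: ß from 0.1 to 2.0 in steps of 0.1, with Ω = ß·µ and µ = 84.
mu = 84
betas = [round(0.1 * k, 1) for k in range(1, 21)]
omegas = [round(b * mu, 1) for b in betas]

print(betas[0], omegas[0])    # 0.1 8.4   -> boringly similar customers
print(betas[-1], omegas[-1])  # 2.0 168.0 -> separate tribes
```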

In order to make my presentation simpler, I take three checkpoints in time, namely the end of each consecutive year out of the three. Denominated in months, it gives: the 12th month, the 24th month, and the 36th month. In Table 1, below, you can find the results: the percentage of the market I expect to absorb into EneFin, with the average time of behavioural change in my customers pegged at µ = 84, and at various degrees of disparity between individual behavioural changes.

Table 1 Simulation of absorption in the market, with the average time of behavioural change equal to µ = 84 months

| Variability of the population (ß) | Standard deviation with µ = 84 (Ω) | Market absorbed, 12th month | Market absorbed, 24th month | Market absorbed, 36th month |
|---|---|---|---|---|
| 0,1 | 8,4 | 8,1944E-18 | 6,82798E-13 | 7,65322E-09 |
| 0,2 | 16,8 | 1,00458E-05 | 0,02% | 0,23% |
| 0,3 | 25,2 | 0,18% | 0,86% | 2,93% |
| 0,4 | 33,6 | 1,02% | 3,18% | 7,22% |
| 0,5 | 42 | 2,09% | 5,49% | 10,56% |
| 0,6 | 50,4 | 2,92% | 7,01% | 12,42% |
| 0,7 | 58,8 | 3,42% | 7,80% | 13,18% |
| 0,8 | 67,2 | 3,67% | 8,10% | 13,28% |
| 0,9 | 75,6 | 3,74% | 8,09% | 13,02% |
| 1 | 84 | 3,72% | 7,93% | 12,58% |
| 1,1 | 92,4 | 3,64% | 7,67% | 12,05% |
| 1,2 | 100,8 | 3,53% | 7,38% | 11,50% |
| 1,3 | 109,2 | 3,41% | 7,07% | 10,95% |
| 1,4 | 117,6 | 3,28% | 6,76% | 10,43% |
| 1,5 | 126 | 3,14% | 6,46% | 9,93% |
| 1,6 | 134,4 | 3,02% | 6,18% | 9,47% |
| 1,7 | 142,8 | 2,89% | 5,91% | 9,03% |
| 1,8 | 151,2 | 2,78% | 5,66% | 8,63% |
| 1,9 | 159,6 | 2,67% | 5,42% | 8,26% |
| 2 | 168 | 2,56% | 5,20% | 7,91% |
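
As far as I can tell from the figures, Table 1 adds up the monthly values of the normal density from month 1 to each checkpoint, rather than using the continuous cumulative distribution. A sketch under that assumption, with the helper names being mine:

```python
import math

MU = 84.0  # average time of behavioural change, in months

def monthly_density(x, sigma):
    # normal density at month x, as built step by step earlier
    return math.exp(-((x - MU) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def absorbed(month, beta):
    # cumulative fraction of the market absorbed by a given month:
    # the monthly density values summed from month 1 onwards, with Ω = ß·µ
    sigma = beta * MU
    return sum(monthly_density(x, sigma) for x in range(1, month + 1))

for beta in (0.5, 1.0, 2.0):  # a few rows of Table 1
    print(beta, [f"{absorbed(m, beta):.2%}" for m in (12, 24, 36)])
```

For ß = 1 this yields 3,72%, 7,93% and 12,58% at the three checkpoints, matching the middle row of the table.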

I think it is enough science for today. That sunlight will not enjoy itself. It needs me to enjoy it. I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful for your suggestions on two things that Patreon suggests I ask you about. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?
