Unconditional claim, remember? Educational about debt-based securities

My editorial on YouTube

 

Here comes another piece of educational content about the fundamentals of finance, namely a short presentation of debt-based securities. As I discuss the topic below, I will compare those financial instruments to equity-based securities, which I already discussed in « Finding the right spot in that flow: educational about equity-based securities ».

In short, debt-based securities are financial instruments which transform a big chunk of debt, thus a big obligatory contract, into a set of small, tradable pieces, and give that debt more liquidity.

In order to understand how debt-based securities work in finance, it is a good idea to turn a few clichés on their head and make them hold that stance. First of all, we normally associate debt with a relation of power: the CREDITOR, i.e. the person who lends to somebody else, has a dominant position over the DEBTOR, who borrows. That is sometimes true, but only sometimes, and it is just one point of view. Debt can be considered as a way of transferring capital from entity A to entity B. Entity A has more cash than they currently need, whilst B has less. Entity A can transfer the excess cash to B, only they need a contractual base to do it in a civilized way. In my last educational piece, regarding equity-based securities, I presented one way of transferring capital: in exchange for a conditional claim on B's assets, and for the corresponding decisional power; that would be investing in B's equity. Another way is to acquire an unconditional claim on B's future cash flows, and this is debt. Historically, both ways have been used and developed into specific financial instruments.

Anyway, the essential concept of debt-based securities is to transform one big, obligatory claim of one entity on another entity into many small pieces, each expressed as a tradable deed (document). How the hell is it possible to transform a debt – thus future money that is not there yet – into securities? Here come two important, general concepts of finance: liquidity, and security. Liquidity, in financial terms, is something we spontaneously associate with being able to pay whatever we need to pay right now. The boss of a company can say they have financial liquidity when they have enough cash on their balance sheet to pay the bills currently on the desk. If some of those bills cannot be paid (not enough cash), the boss can say ‘Sorry, not enough liquidity’.

You can generalize from there: liquidity is the capacity to enter into new economic transactions, and to fulfil obligations resulting from such transactions. In the markets that we, humans, put in place, there is a peculiar phenomenon to notice: we swing between various levels of required liquidity. In some periods, people in a given market will be immersed in routine. They will repeat the same transactions over and over again, in recurrent amounts. It is like an average Kowalski (the Polish equivalent of the English average Smith, or the French average Dupont) paying their electricity bills. Your electricity bill comes in the form of a six-month plan of instalments. Each month you will have to pay the same, fixed amount, which results from the last reading of your electricity meter. That amount is most likely to be similar to the amounts from previous six-month periods, unless you have just decided to grow some marijuana and you need extra electricity for those greenhouse lamps. If you manage to keep your head above water, in day-to-day financial terms, you have probably incorporated those payments for electricity into your monthly budget, more or less consciously. You don’t need extra liquidity to meet those obligations. This is the state of a market when it runs on routine transactions.

Still, there are times when a lot of new business is to be done. New technologies are elbowing their way into our industry, or a new trade agreement has been signed with another country, or the government has had the excellent idea of forcing every entity in the market to equip themselves with that absolutely-necessary-thingy-which-absolutely-incidentally-is-being-marketed-by-the-minister’s-cousin. When we need to enter into new transactions, or when we just need to be ready to enter them, we need a reserve of liquidity, i.e. we need additional capacity to transact. Our market has entered a period of heightened need for liquidity.

When I lend someone a substantial amount of money in a period of low need for liquidity, I can just sit and wait until they pay me back. No hurry. On the other hand, when I lend during a period of increased need for liquidity, my approach is different: I want to recoup my capital as soon as possible. My debtor, i.e. the person whom I have lent to, cannot pay me back immediately. If they could, they would not need to borrow from me. Stands to reason. What I can do is to express that lending-borrowing transaction as an exchange of securities against money.

You can find an accurate description of that link between actual business, its required liquidity, and all the lending business in: Adam Smith – “An Inquiry Into The Nature And Causes Of The Wealth of Nations”, Book II: Of The Nature, Accumulation, and Employment of Stock, Chapter IV: Of Stock Lent At Interest: “Almost all loans at interest are made in money, either of paper, or of gold and silver; but what the borrower really wants, and what the lender readily supplies him with, is not the money, but the money’s worth, or the goods which it can purchase. If he wants it as a stock for immediate consumption, it is those goods only which he can place in that stock. If he wants it as a capital for employing industry, it is from those goods only that the industrious can be furnished with the tools, materials, and maintenance necessary for carrying on their work. By means of the loan, the lender, as it were, assigns to the borrower his right to a certain portion of the annual produce of the land and labour of the country, to be employed as the borrower pleases.”

Here, we come to the concept of financial security. Anything in the future is subject to uncertainty and risk. We don’t know how exactly things are going to happen. This generates risk. Future events can meet my expectations, or they can do me harm. If I can sort of divide both my expectations, and the possible harm, into small pieces, and make each such small piece sort of independent from other pieces, I create a state of dispersed expectations, and dispersed harm. This is the fundamental idea of a security. How can I create mutual autonomy between small pieces of my future luck or lack thereof? By allowing people to trade those pieces independently from each other.

It is time to explain how the hell we can give more liquidity to debt by transforming it into securities. First things first, let’s see the typical ways of doing it: a note, and a bond. A note, AKA promissory note, or bill of exchange, in its most basic form is a written, unconditional promise to pay a certain amount of money to whoever presents the note on a given date. You can see it in the graphic below.

Now, those of you who, hopefully, paid attention in the course of microeconomics might ask: “Whaaait a minute, doc! Where is the interest on that loan? You told us: there ain’t free money…”. Indeed, there ain’t. Notes were invented long ago. The oldest ones we have in European museums date back to the 12th century A.D. Still, given what we know about the ways of doing business in the past, they had been in use even earlier. As you might know, it was frequently forbidden by law to lend money at interest. It was called usury, it was considered at least a misdemeanour, if not a crime, and you could even be hanged for it. In the world of Islamic finance, lending at interest is forbidden even today.

One of the ways to bypass the ban on interest-based lending is to calculate how much money that precise interest would make on that precise loan. I lend €9 000 at 12%, for one year, and it makes €9 000 * 12% = €1 080. So I lend €9 000, for one year, and I make my debtor liable for €10 080. Interest? Who’s talking about interest? It is ordinary discount!

Discount is the difference between the nominal value of a financial instrument (AKA face value) and its actual price in exchange, thus the amount of money you can have in exchange for that instrument.
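As a quick numerical illustration of that trick – rolling pre-calculated interest into the face value of a note and then calling the price of debt a discount – here is a minimal sketch in Python, using the numbers from the example above (the function names are mine, purely for illustration):

```python
# Rolling pre-calculated interest into the face value of a note,
# then expressing the same price of debt as a discount on that face value.

def face_value_with_rolled_in_interest(amount_lent: float, interest_rate: float, years: float = 1.0) -> float:
    """Face value of a note that hides simple interest inside the amount due."""
    return amount_lent * (1 + interest_rate * years)

def discount_rate(face_value: float, amount_lent: float) -> float:
    """Discount expressed as a fraction of the note's face value."""
    return (face_value - amount_lent) / face_value

amount_lent = 9_000.0                                                # what the lender hands over, in EUR
face_value = face_value_with_rolled_in_interest(amount_lent, 0.12)   # 10 080 EUR
print(f"Face value of the note: {face_value:,.2f} EUR")
print(f"Discount: {face_value - amount_lent:,.2f} EUR "
      f"({discount_rate(face_value, amount_lent):.2%} of face value)")
```

Note that the same €1 080 is 12% of the amount lent, but only about 10.7% of the note’s face value: which denominator you use is part of what distinguishes an interest rate from a discount rate.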

A few years ago, I found that same pattern in an innocent-looking contract, which underpinned a loan that my wife and I were taking out for 50% of the price of a new car. The person who negotiated the deal at the car dealer’s announced joyfully: ‘This is a zero-interest loan. No interest!’. Great news, isn’t it? Still, as I was going through the contract, I found that we had to pay, at the signature, a ‘contractual fee’. The fee was strangely precise, I mean there were grosze (the Polish equivalent of cents) after the decimal point. I did my maths: that ‘contractual fee’ was exactly and rigorously equal to the interest we would have had to pay on that loan, should it have been officially interest-bearing at ordinary, market rates.

The usage of discount instead of interest points to an important correlate of notes, and of debt-based securities in general: risk. That scheme with pre-calculated interest rolled into the face value of the note is only any good when I can reliably predict when exactly the debtor will pay back (buy the note back). Moreover, as the discount is supposed to reflect pre-calculated interest, it also reflects that part of the interest rate which accounts for credit risk.

Say there are 1 000 borrowers, who borrow from an unspecified number of lenders. Each loan bears a principal (i.e. nominal amount) of €3 000, which makes a total market of €3 000 000 lent and borrowed. Out of those 1 000, a certain number is bound to default on paying back. Let it be 4%. That makes 4% * 1 000 * €3 000 = €120 000, which, spread over the whole population of borrowers, makes €120 000 / 1 000 = €120 per borrower, or €120 / €3 000 = 4% of each loan. Looks like a logical loop, and for a good reason: you cannot escape it. In a large set of people, some will default on their obligations. This is a fact. Their collective default is an aggregate financial risk – credit risk – which has to be absorbed by the market, somehow. The simplest way to absorb it is to make each borrower pay a small part of it. When I take a loan, in a bank, the interest rate I pay always reflects the credit risk in the whole population of borrowers. When I issue a note, the discount I have to give to my lender will always include the financial risk that recurrently materializes in the given market.
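Here is a minimal sketch of that aggregation, in Python, with the hypothetical numbers from the paragraph above:

```python
# Spreading aggregate credit risk over a whole population of borrowers.

borrowers = 1_000
principal_per_loan = 3_000.0      # EUR
default_rate = 0.04               # 4% of borrowers are expected to default

total_lent = borrowers * principal_per_loan                   # 3 000 000 EUR
expected_loss = default_rate * total_lent                     # 120 000 EUR
risk_charge_per_borrower = expected_loss / borrowers          # 120 EUR
risk_premium = risk_charge_per_borrower / principal_per_loan  # 4% of each loan

print(f"Total lent:                {total_lent:,.0f} EUR")
print(f"Expected aggregate loss:   {expected_loss:,.0f} EUR")
print(f"Risk charge per borrower:  {risk_charge_per_borrower:,.0f} EUR")
print(f"Risk premium on each loan: {risk_premium:.0%}")
```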

The discount rate is a price of debt, just as the interest rate is. Both can be used, and the prevalence of one or the other depends on the market. Whenever debt gets massively securitized, i.e. transformed into tradable securities, discount becomes somewhat handier and smoother to use. Another quote from the invaluable Adam Smith sheds some light on this issue (Adam Smith – “An Inquiry Into The Nature And Causes Of The Wealth of Nations”, Book II: Of The Nature, Accumulation, and Employment of Stock, Chapter IV: Of Stock Lent At Interest): “As the quantity of stock to be lent at interest increases, the interest, or the price which must be paid for the use of that stock, necessarily diminishes, not only from those general causes which make the market price of things commonly diminish as their quantity increases, but from other causes which are peculiar to this particular case. As capitals increase in any country, the profits which can be made by employing them necessarily diminish. It becomes gradually more and more difficult to find within the country a profitable method of employing any new capital. There arises, in consequence, a competition between different capitals, the owner of one endeavouring to get possession of that employment which is occupied by another; but, upon most occasions, he can hope to justle that other out of this employment by no other means but by dealing upon more reasonable terms.”

The presence of financial risk, and the necessity to account for it whilst maintaining proper liquidity in the market, brought two financial inventions: endorsement, and routed notes. Notes used to be (and still are) issued for a relatively short time, usually not longer than one year. If the lender needs to have their money back before the due date of the note, they can do something called endorsement: they can present that note as their own to a third party, who will advance them money in exchange. Presenting a note as my own means making myself liable for up to 100% of the original, i.e. signing the note, with a date. You can find an example in the graphic below.

Endorsement used to be a normal way of assuring liquidity in a market financed with notes. Endorsers’ signatures made a chain of liability, ordered by dates. The same scheme is used today in cryptocurrencies, as the chain of hashed digital signatures. Another solution was to put into the system someone super-reliable, like a banker. Such a trusted payer, who, on their part, had tons of reserve money to provide liquidity, made the whole game calmer and less risky, and thus the price of credit (the discount rate) was lower. The way of putting a banker in the game was to write them into the note as the entity liable for payment. Such a note was designated as a routed one, or as a draft. Below, I am presenting an example.

As banks entered the game of securitized debt, it opened the gates of hell, i.e. the way to paper money. Adam Smith was very apprehensive about it (Adam Smith – “Wealth of Nations”, Book II: Of The Nature, Accumulation, and Employment of Stock, Chapter II: Of Money, Considered As A Particular Branch Of The General Stock Of The Society, Or Of The Expense Of Maintaining The National Capital): “The trader A in Edinburgh, we shall suppose, draws a bill upon B in London, payable two months after date. In reality B in London owes nothing to A in Edinburgh; but he agrees to accept of A’s bill, upon condition, that before the term of payment he shall redraw upon A in Edinburgh for the same sum, together with the interest and a commission, another bill, payable likewise two months after date. B accordingly, before the expiration of the first two months, redraws this bill upon A in Edinburgh; who, again before the expiration of the second two months, draws a second bill upon B in London, payable likewise two months after date; and before the expiration of the third two months, B in London redraws upon A in Edinburgh another bill payable also two months after date. This practice has sometimes gone on, not only for several months, but for several years together, the bill always returning upon A in Edinburgh with the accumulated interest and commission of all the former bills. The interest was five per cent. in the year, and the commission was never less than one half per cent. on each draught. This commission being repeated more than six times in the year, whatever money A might raise by this expedient might necessarily have cost him something more than eight per cent. in the year and sometimes a great deal more, when either the price of the commission happened to rise, or when he was obliged to pay compound interest upon the interest and commission of former bills. This practice was called raising money by circulation.”

Notes were quick to issue, but a bit clumsy when it came to financing really big ventures, like governments. When you are a king, and you need cash for waging war on another king, issuing a few notes can be tricky. Same in the corporate sector. When we are talking about really big money, making the debt tradable is just one part, and another part is to make it nicely spread over the landscape. This is how bonds came into being, as financial instruments. The idea of bonds was to make the market of debt a bit steadier across space and over time. Notes worked well for short-term borrowing, but long-term projects, which required financing for 5 or 6 years, encountered a problem of price, i.e. discount rate. If I issue a note to back a loan for 5 years, the receiver of the note, i.e. the lender, knows they will have to wait really long to see their money back. Below, in the graphic, you have the idea explained sort of in capital letters.

The first thing is the face value. The note presented earlier proudly displayed €10 000 of face value. The bond is just €100. You divide €10 000 into 100 separate bonds, each tradable independently, and you have something like a moving, living mass of things, flowing, coming and going. Yep, babe. Liquidity, liquidity, and once again liquidity. A lot of small debts flow much more smoothly than one big one.

The next thing is the interest. You can see it here designated as “5%, annuity”, with the word ‘coupon’ added. If we have the interest rate written explicitly, it means the whole thing was invented when lending at interest had become a normal thing, probably in the late 1700s. The term ‘annuity’ means that every year, those 5% are paid to the holder of the bond, like a fixed annual income. This is where the word ‘coupon’ comes from. Back in the day, when bonds were paper documents (they are not anymore), they had detachable strips, as in a cinema ticket, one strip per year. When the issuer of the bond paid annuities to the holders, those strips were cut off.

The maturity date of the bond is the moment when the issuer is supposed to buy it back. It is a general convention that bonds are issued for many years. This is when the manner of counting and compounding the interest plays a role, and this is when we need to recall one fundamental thing – bonds are made for big borrowers. Anyone can make a note, and many different anyones can make it circulate, by endorsement or otherwise. Only big entities can issue bonds, and because they are big, bonds are usually considered safe placements, endowed with low risk. Low risk means a low price of debt. When I can convince many small lenders that I, the big borrower, am rock solid in my future solvency, I can play on that interest rate. When I guarantee an annuity, it can be lower than interest paid only at the very end of maturity, i.e. in 2022 as regards this case. When loans all around us are given at 10% or 12%, an annuity backed with the authority of a big institution can be just 5%, and no one bothers.
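To make the mechanics of the coupon concrete, here is a minimal sketch of the cash flows behind such a bond, assuming the €100 face value and the 5% annual coupon from the graphic; the 5-year maturity is my own assumption for the example:

```python
# Cash flows of a coupon bond: a fixed annuity every year, face value repaid at maturity.

def bond_cash_flows(face_value: float, coupon_rate: float, years_to_maturity: int):
    """Return the list of (year, payment) received by the bondholder."""
    flows = [(year, face_value * coupon_rate) for year in range(1, years_to_maturity + 1)]
    # At maturity the issuer buys the bond back, i.e. repays the face value on top of the last coupon.
    last_year, last_coupon = flows[-1]
    flows[-1] = (last_year, last_coupon + face_value)
    return flows

for year, payment in bond_cash_flows(face_value=100.0, coupon_rate=0.05, years_to_maturity=5):
    print(f"Year {year}: {payment:.2f} EUR")
```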

Over time, bonds have come to dominate the market of debt. They are more flexible, and thus assure more liquidity. They offer interesting possibilities for risk management and discount. When big entities issue bonds, it is a possibility for other big entities to invest large amounts of capital at a fixed, guaranteed rate of return, i.e. the interest rate. Think about it: you have an investment the size of a big, incorporated business, and yet you have a virtually risk-free return. Unconditional claim, remember? Hence, over time, what professional investors started doing was building a portfolio of investments with equity-based securities for high yield and high risk, plain lending contracts for moderate yield (high interest rate) and moderate risk, and, finally, bonds for low yield and low risk. Creating a highly liquid market of debt, by putting a lot of bonds into circulation, was like creating a safe harbour for investors. Whatever crazy s**t they were after, they could compensate the resulting risk through the inclusion of bonds in their portfolios.

I am consistently delivering good, almost new science to my readers, and I love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide to do so, I will be grateful if you suggest two things that Patreon suggests I ask you about. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?

What are the practical outcomes of those hypotheses being true or false?

 

My editorial on YouTube

 

This is one of those moments when I need to reassess what the hell I am doing. Scientifically, I mean. Of course, it is good to reassess things existentially, too, every now and then, but for the moment I am limiting myself to science. Simpler and safer than life in general. Anyway, I have a financial scheme in mind, where local crowdfunding platforms serve to support the development of local suppliers in renewable energies. The scheme is based on the observable difference between prices of electricity for small users (higher) and those reserved for industrial-scale users (lower). I wonder if small consumers would be ready to pay the normal, relatively higher price in exchange for a package made of: a) electricity and b) shares in the equity of its suppliers.

I have a general, methodological hypothesis in mind, which I have been trying to develop over the last two years or so: collective intelligence. I hypothesise that collective behaviour observable in markets can be studied as a manifestation of collective intelligence. The purpose is to go beyond optimization and to define, with scientific rigour, what the alternative, essentially equiprobable paths of change are that a complex market can take. I think such an approach is useful when I am dealing with an economic model with a lot of internal correlation between variables, and that correlation can be so strong that those variables basically end up looping on each other. In such a situation, distinguishing independent variables from dependent ones becomes bloody hard, and methodologically doubtful.

On the grounds of the literature, and of my own experimentation, I have defined three essential traits of such collective intelligence: a) the distinction between structure and instance, b) the capacity to accumulate experience, and c) the capacity to pass between different levels of freedom in social cohesion. I am using an artificial neural network, a multi-layer perceptron, in order to simulate such collectively intelligent behaviour.

The distinction between structure and instance means that we can devise something, make different instances of that something, each differing by some small details, and experiment with those different instances in order to devise an even better something. When I make a mechanical clock, I am a clockmaker. When I am able to have a critical look at this clock, make many different versions of it – all based on the same structural connections between mechanical parts, but differing from each other by subtle details – and experiment with those multiple versions, I become a meta-clock-maker, i.e. someone who can advise clockmakers on how to make clocks. The capacity to distinguish between structures and their instances is one of the basic skills we need in life. Autistic people have a big problem in that department, as they are mostly on the instance side. To a severely autistic person, me in a blue jacket and me in a brown jacket are two completely different people. Schizophrenic people are on the opposite end of the spectrum. To them, everything is one and the same structure, and they cannot cope with instances. Me in a blue jacket and me in a brown jacket are the same as my neighbour in a yellow jumper, and we all are instances of the same alien monster. I know you might think I am overstating, but my grandmother on my father’s side used to suffer from schizophrenia, and it was precisely that: to her, all strong smells were the manifestation of one and the same volatile poison sprayed in the air by THEM, and every person outside a circle of about 19 people closest to her was a member of THEM. Poor Jadwiga.

In economics, the distinction between structure and instance corresponds to the tension between markets and their underpinning institutions. Markets are fluid and changeable, they are like constant experimenting. Institutions give some gravitas and predictability to that experimenting. Institutions are structures, and markets are ritualized manners of multiplying and testing many alternative instances of those structures.

The capacity to accumulate experience means that as we experiment with different instances of different structures, we can store the information we collect in the process, and use this information in some meaningful way. My great compatriot, Alfred Korzybski, in his general semantics, used to designate it as ‘the capacity to bind time’. The thing is not as obvious as one could think. A Nobel-prized mathematician, Reinhard Selten, coined the concept of social games with imperfect recall (Harsanyi, Selten 1988[1]). He argued that as we, collective humans, accumulate and generalize experience about what the hell is going on, from time to time we shake out that big folder, and pick the pages endowed with the most meaning. All the remaining stuff, judged less useful at the moment, is somehow archived in culture, so that it basically stays there, but becomes much harder to access and utilise. The capacity to accumulate experience is largely a matter of how we accumulate experience, and of doing that from-time-to-time archiving. We can observe this basic distinction in everyday life. There are things that we learn sort of incrementally. When I learn to play piano – which I wish I were learning right now, cool stuff – I practice, I practice, I practice and… I accumulate learning from all those practices, and one day I give a concert, in a pub. Still, other things I learn sort of haphazardly. Relationships are a good example. I am with someone, one day I am mad at her, the other day I see her as the love of my life, then, again, she really gets on my nerves, and then I think I couldn’t live without her etc. Bit of a bumpy road, isn’t it? Yes, there is some incremental learning, but you become aware of it after like 25 years of conjoint life. Earlier on, you just need to suck it up and keep going.

There is an interesting notion in economics, labelled « semi-martingale » (see for example: Malkiel, Fama 1970[2]). When we observe changes in stock prices, in a capital market, we tend to say they are random, but they are not quite. You can test it. If the price were really random, it should fan out according to the pattern of normal distribution. This is what we call a full martingale. Any real price you observe actually swings less broadly than normal distribution would suggest: this is a semi-martingale. Still, anyone with any experience in investment knows that prediction inside the semi-martingale is always burdened with a s**tload of error. When you observe stock prices over a long time, like 2 or 3 years, you can see a sequence of distinct semi-martingales. From September through December the price swings inside one semi-martingale, then the Ghost of Past Christmases shakes it badly, people panic, and later it settles into another semi-martingale, slightly shifted from the preceding one, and there it goes, semi-martingaling for another dozen weeks etc.
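Here is a minimal numerical sketch of that intuition. A pure random walk (my stand-in for the ‘full martingale’ in the loose sense used above) fans out roughly like a normal distribution with growing variance, whilst a damped, mean-reverting walk – my stand-in for a semi-martingale-like price – swings visibly less broadly. The damping parameter is purely illustrative:

```python
# Compare the spread of a pure random walk with that of a damped (mean-reverting) walk.
import random
import statistics

random.seed(42)

def final_values(damping: float, steps: int = 250, runs: int = 2_000):
    """Simulate many price paths and return their final values."""
    finals = []
    for _ in range(runs):
        x = 0.0
        for _ in range(steps):
            x = (1.0 - damping) * x + random.gauss(0.0, 1.0)
        finals.append(x)
    return finals

full_martingale = final_values(damping=0.0)   # plain random walk: fans out broadly
semi_like       = final_values(damping=0.1)   # damped walk: swings much less broadly

print(f"Standard deviation, pure random walk: {statistics.stdev(full_martingale):.2f}")
print(f"Standard deviation, damped walk:      {statistics.stdev(semi_like):.2f}")
```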

The central theoretical question in this economic theory, and in a couple of others, is: do we learn something durable through local shocks? Does a sequence of economic shocks, of whatever type, make a learning path similar to the incremental learning of piano playing? There are strong arguments in favour of both possible answers. If you get your face punched, over and over again, you must be a really dumb asshole not to learn anything from that. Still, there is that phenomenon called systemic homeostasis: many systems, social structures included, tend to fight for stability when shaken, and they are frequently successful. The memory of shocks and revolutions is frequently erased, and they are assumed to have never existed.

The issue of different levels in social cohesion refers to the so-called swarm theory (Stradner et al. 2013[3]). This theory studies collective intelligence by reference to animals, which, as far as we know, are intelligent only collectively. Bees, ants, hornets: all those beasts, when acting individually, are as dumb as f**k. Still, when they gang up, they develop amazingly complex patterns of action. That’s not all. Those complex patterns of theirs fall into three categories, applicable to human behaviour as well: static coupling, dynamic correlated coupling, and dynamic random coupling.

When we coordinate by static coupling, we always do things together in the same way. These are recurrent rituals, without much room for change. Many legal rules, and the institutions they form the basis of, are examples of static coupling. You want to put some equity-based securities in circulation? Good, you do this, and this, and this. You haven’t done the third this? Sorry, man, but you cannot call it a day yet. When we need to change the structure of what we do, we should somehow loosen that static coupling and try something new. We can dissolve the existing business, which is static coupling, and look to create something new. When we do so, we can sort of stay in touch with our customary business partners, and after some circling and asking around we form a new business structure, involving people we clearly coordinate with. This is dynamic correlated coupling. Finally, we can decide to sail completely uncharted waters, and take our business concept to China, or to New Zealand, and try to work with completely different people. What we do, in such a case, is emit some sort of business signal into the environment, and wait for a response from whoever is interested. This is dynamic random coupling. Attracting random followers to a new YouTube channel is very much an example of the same.

At the level of social cohesion, we can be intelligent in two distinct ways. On the one hand, we can keep the given pattern of collective association at the same level, i.e. at one of the three I have just mentioned. We keep it ritualized and static, or somehow loose and dynamically correlated, or, finally, we take care not to ritualize too much and keep it deliberately at the level of random associations. On the other hand, we can shift between different levels of cohesion. We take some institutions, we start experimenting with making them more flexible, at some point we possibly make them as free as possible, and we gain experience, which, in turn, allows us to create new institutions.

When applying the issue of social cohesion in collective intelligence to economic phenomena, we can use a little trick, to be found, for example, in de Vincenzo et al. (2018[4]): we assume that quantitative economic variables, which we normally perceive as just numbers, are manifestations of distinct collective decisions. When I have the price of energy, let’s say €0.17 per kilowatt hour, I consider it as the outcome of collective decision-making. At this point, it is useful to remember the fundamentals of intelligence. We perceive our own, individual decisions as outcomes of our independent thinking. We associate them with the fact of wanting something, and being apprehensive about something else, etc. Still, neurologically, those decisions are outcomes of some neurons firing in a certain sequence. Same for economic variables, i.e. mostly prices and quantities: they are the fruit of interactions between the members of a community. When I buy apples in the local marketplace, I just buy them at a certain price, and, if they look bad, I just don’t buy. This is not any form of purposeful influence upon the market. Still, when 10 000 people like me do the same, sort of ‘buy when the price is good, don’t when the apple is bruised’, a patterned process emerges. The resulting price of apples is the outcome of that process.

Social cohesion can be viewed as association between collective decisions, not just between individual actions. The resulting methodology is made, roughly speaking, of three steps. Step one: I put all the economic variables in my model over a common denominator (a common scale of measurement). Step two: I calculate the relative cohesion between them with the general concept of a fitness function, which I can express, for example, as the Euclidean distance between the local values of the variables in question. Step three: I calculate the average of those Euclidean distances, and then its reciprocal, i.e. « 1/x ». This reciprocal is the direct measure of cohesion between decisions, i.e. the higher the value of this precise « 1/x », the more cohesion between the different processes of economic decision-making.
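Here is a minimal sketch of those three steps in Python, with hypothetical data and with variable and function names of my own:

```python
# Step one: put all variables over a common scale; step two: average pairwise
# Euclidean distance between them; step three: take its reciprocal as the cohesion measure.
import numpy as np

def standardize(data: np.ndarray) -> np.ndarray:
    """Common denominator: here I simply divide each variable by its maximum."""
    return data / data.max(axis=0)

def cohesion(data: np.ndarray) -> float:
    """Reciprocal of the average pairwise Euclidean distance between standardized variables."""
    z = standardize(data)
    n_vars = z.shape[1]
    distances = [np.linalg.norm(z[:, i] - z[:, j])
                 for i in range(n_vars) for j in range(i + 1, n_vars)]
    return 1.0 / np.mean(distances)

# Hypothetical observations: rows are local observations, columns are economic variables.
rng = np.random.default_rng(0)
sample = rng.uniform(1.0, 100.0, size=(50, 4))
print(f"Cohesion between the four variables: {cohesion(sample):.3f}")
```

Dividing by the maximum is just one way of creating that common denominator; any consistent standardization would do for the purpose of comparing cohesion across rounds of simulation.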

Now, those of you with a sharp scientific edge could say: “Wait a minute, doc. How do you know we are talking about different processes of decision-making? How do you know that variable X1 comes from a different process than variable X2?”. This is precisely my point. The swarm theory tells me that if I can observe changing cohesion between those variables, I can reasonably hypothesise that their underlying decision-making processes are distinct. If, on the other hand, their mutual Euclidean distance stays the same, I hypothesise that they come from the same process.

Summing up, here is the general drift: I take an economic model and I formulate three hypotheses as to the occurrence of collective intelligence in that model. Hypothesis #1: different variables of the model come from different processes of collective decision-making.

Hypothesis #2: the economic system underlying the model has the capacity to learn as a collective intelligence, i.e. to durably increase or decrease the mutual cohesion between those processes. Hypothesis #3: collective learning in the presence of economic shocks is different from learning in the absence of such shocks.

They look nice, those hypotheses. Now, why the hell should anyone bother? I mean, what are the practical outcomes of those hypotheses being true or false? In my experimental perceptron, I express the presence of economic shocks by using the hyperbolic tangent as the neural activation function, whilst the absence of shocks (or the presence of countercyclical policies) is expressed with a sigmoid function. Those two yield very different processes of learning. Long story short, the sigmoid learns more, i.e. it accumulates more local error (thus more experimental material for learning), and it generates a steady trend towards lower cohesion between variables (decisions). The hyperbolic tangent accumulates less experiential material (it learns less), and it is quite random in arriving at any tangible change in cohesion. The collective intelligence I mimicked with that perceptron looks like the kind of intelligence which, when going through shocks, learns only the skill of returning to the initial position after the shock: it does not create any lasting type of change. The latter happens only when my perceptron has a device to absorb and alleviate shocks, i.e. the sigmoid neural function.
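Below is a toy reconstruction, under my own simplifying assumptions, of the kind of experiment described here: a single-layer, perceptron-style learning loop run twice, once with the hyperbolic tangent and once with the sigmoid as activation function, so that one can compare how much local error each version accumulates. It is a sketch of the technique, not the exact network used in my research:

```python
# Toy comparison of how much local error a simple perceptron-like learner
# accumulates with tanh vs sigmoid activation.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_perceptron(activation, derivative, rounds=2_000, seed=1):
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.1, 1.0, size=(rounds, 4))     # standardized input variables
    target = x.mean(axis=1)                         # illustrative expected outcome
    weights = rng.uniform(-0.1, 0.1, size=4)
    accumulated_error = 0.0
    for k in range(rounds):
        signal = float(x[k] @ weights)
        output = activation(signal)
        error = target[k] - output                  # local error of round k
        accumulated_error += abs(error)
        weights += 0.05 * error * derivative(signal) * x[k]
    return accumulated_error

tanh_error = run_perceptron(np.tanh, lambda s: 1.0 - np.tanh(s) ** 2)
sig_error = run_perceptron(sigmoid, lambda s: sigmoid(s) * (1.0 - sigmoid(s)))

print(f"Accumulated local error with tanh:    {tanh_error:.1f}")
print(f"Accumulated local error with sigmoid: {sig_error:.1f}")
```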

When I have my perceptron explicitly feeding back that cohesion between variables (i.e. feeding back the fitness function considered as a local error), it learns less and changes less, but does not necessarily go through fewer shocks. When the perceptron does not care about feeding back the observable distance between variables, there is more learning and more change, but not more shocks. The overall fitness function of my perceptron changes over time. How it changes over time depends on the kind of neural activation function I use. In the case of the hyperbolic tangent, it is brutal change over a short time, eventually coming back to virtually the same point it started from. With the hyperbolic tangent, the passage between various levels of association, in the sense of the swarm theory, is super quick, but not really productive. With the sigmoid, there is definitely a steady trend of decreasing cohesion.

I want to know what the hell I am doing. I feel I have made a few steps towards that understanding, but getting to know what I am doing proves really hard.


[1] Harsanyi, J. C., & Selten, R. (1988). A general theory of equilibrium selection in games. MIT Press Books, 1.

[2] Malkiel, B. G., & Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. The Journal of Finance, 25(2), 383-417.

[3] Stradner, J., Thenius, R., Zahadat, P., Hamann, H., Crailsheim, K., & Schmickl, T. (2013). Algorithmic requirements for swarm intelligence in differently coupled collective systems. Chaos, Solitons & Fractals, 50, 100-114.

[4] De Vincenzo, I., Massari, G. F., Giannoccaro, I., Carbone, G., & Grigolini, P. (2018). Mimicking the collective intelligence of human groups as an optimization tool for complex problems. Chaos, Solitons & Fractals, 110, 259-266.

Finding the right spot in that flow: educational about equity-based securities

 

My editorial on YouTube

 

I am returning to educational content, and more specifically to finance. Incidentally, it is quite connected to my current research – crowdfunding in the market of renewable energies – and I feel like returning to the roots of financial theory. In this update, I am taking on a classical topic in finance: equity-based securities.

First things first, a short revision of what equity is. We have things, and we can have them in two ways. We can sort of have them, or have them actually. When I have something, like a house worth $1 mln, and, at the same time, I owe somebody $1.2 mln, what is really mine, at the end of the day, is $1 mln – $1.2 mln = –$0.2 mln, i.e. net debt. As a matter of fact, I have no equity in this house. I just sort of have it. In the opposite case, when the house is worth $1.2 mln and my debt is just $1 mln, I really have $1.2 mln – $1 mln = $0.2 mln in equity.
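A two-line sketch of that arithmetic, for the record:

```python
def equity(assets: float, debt: float) -> float:
    """What is really mine: the value of what I hold, minus what I owe."""
    return assets - debt

print(equity(assets=1_000_000, debt=1_200_000))   # -200000.0 : I just sort of have the house
print(equity(assets=1_200_000, debt=1_000_000))   #  200000.0 : I actually have some of it
```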

There is a pattern in doing business: when we do a lot of it, we most frequently do it in a relatively closed circle of recurrent business partners. Developing durable business relations is even taught in business studies as one of the fundamental skills. When we recurrently do business with the same people, we have claims on each other. Some people owe me something, I owe something to others. The capital account, which we call « balance sheet », expresses the balance between those two types of claims: those of other people on me, against my claims on other people. The art of doing business consists very largely in having more claims on others than others have on us. That “more” is precisely our equity.

When we do business, people expect us to have and maintain positive equity in it. A business person is expected to have that basic skill of keeping a positive balance between claims they have on other people, and the claims that other people have on them.

There are two types of business people, and, correspondingly, two types of strategies regarding equity in business. Type A is mono-business. We do one business, and have one equity. Type B is multi-business. Type B is a bit ADHDish: those are people who would like to participate in oil drilling, manufacturing of solar modules, space travel to Mars, launching a new smartphone, and growing some marijuana, all at the same or nearly the same time. It is a fact of life that the wealthiest people in any social group are to be found in the second category. There is a recurrent pattern in climbing the ladder of social hierarchy: being restless, or at least open, in the pursuit of different business opportunities rather than being consistent in pursuing just one. If you think about it, it is something more general: being open to many opportunities in life offers a special path of personal development. Yes, consistency and perseverance matter, but they matter even more when we can be open to novelty and consistent at the same time.

We tend to do things together. This is how we survived, over millennia, all kinds of s**t: famine, epidemics, them sabretooth tigers and whatnot. Same for business: over time, we have developed institutions for doing business together.

When we do something again and again, we figure out a way of optimizing the doing of that something. In business law, we (i.e. homo sapiens) have therefore invented institutions for both type A and type B. You want to do the same business for a long time, and to do it together with other people, type A just like you? You will look for something like a limited liability partnership. If, on the other hand, you are rather the restless B type, you will need something like a joint stock company, and you will need equity-based securities.

The essential idea of an equity-based security is… well, there is more than one idea inside. This is a good example of what finance is: we invent something akin to a social screwdriver, i.e. a tool which unfolds its many utilities as it is being used. Hence, I start with the initial idea rather than with the essential one, and the initial one is to do business with, or between, those B-type people: restless, open-minded, constantly rearranging their horizon of new ventures. Such people need a predictable way to swing between different businesses and/or to build a complex portfolio thereof.

Thus, we have the basic deal presented graphically above: we set up a company, we endow it with an equity of €3 000 000, we divide that equity into 10 000 shares of €300 each, and we distribute those shares among some initial group of shareholders. Question: why should anyone bother to be our shareholder, i.e. to pay those €300 for one share? What do they get in exchange? Well, each shareholder who pays €300 receives in exchange one share, nominally worth €300, a bundle of intangible rights, and the opportunity to trade that share in the so-called « stock market », i.e. the market of shares. Let’s discuss these one by one.
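Before going through them one by one, here is a minimal sketch of the arithmetic of that basic deal, with the numbers from the example above; the little Share class is purely illustrative:

```python
# The basic deal: equity of 3 000 000 EUR divided into 10 000 tradable shares.
from dataclasses import dataclass

@dataclass
class Share:
    nominal_value: float   # face value of the share
    votes: int = 1         # voting power attached (privileged shares may carry more)

total_equity = 3_000_000.0
number_of_shares = 10_000

nominal_value = total_equity / number_of_shares    # 300 EUR per share
shares = [Share(nominal_value) for _ in range(number_of_shares)]

print(f"Nominal value per share: {nominal_value:.2f} EUR")
print(f"Total votes at the General Assembly: {sum(s.votes for s in shares)}")
```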

Apparently the most unequivocal thing, i.e. the share in itself, nominally worth €300, is, in itself, the least valuable part. It is important to know: the fact of holding shares in an incorporated company does not give the shareholder any pre-defined, unconditional claim on the company. This is the big difference between a share and a corporate bond. The fact of holding one €300 share does not entitle you to a payback of €300 from the company. You have decided to invest in our equity, bro? That’s great, but investment means risk. There is no refund possible. Well, almost no refund. There are contracts called « buyback schemes », which I discuss further on.

The intangible rights attached to an equity-based security (share) fall into two categories: voting power on the one hand, and conditional claims on assets on the other hand.

Joint stock companies have official, decision-making bodies: the General Assembly, the Board of Directors, the Executive Management, and they can have additional committees, defined by the statute of the company. As a shareholder, I can directly exercise my voting power at the General Assembly of Shareholders. Normally, one share means one vote. There are privileged shares, with more than one vote attached to them. These are usually reserved for the founders of a company. There can also be shares with reduced voting power, when the company wants to reward someone with its own shares, but does not want to give them influence on the course of the business.

The General Assembly is the corporate equivalent of a parliament. It is the source of all decisional power in the company. The General Assembly appoints the Board of Directors and, depending on the exact phrasing of the company’s statute, has various competences in appointing the Executive Management. The Board of Directors directs, i.e. it makes the strategic, long-term decisions, whilst the Executive Management is there for current matters. Now, long story short: the voting power attached to equity-based securities, in a company, is only any good if it is decisive in the appointment of Directors. This is what much of corporate law sums up to. If my shares give me direct leverage upon who will sit on the Board of Directors, then I really have voting power.

Sometimes, when holding a small parcel of shares in a company, you can be approached by nice people who will offer you money (not much, really) in exchange for granting them power of attorney in the General Assembly, i.e. to vote there in your name. In corporate language this is called proxy power, and those people, after having collected a lot of such small, individual powers of attorney, can run so-called proxy votes. Believe me or not, but proxy powers are sort of tradable, too. If you have accumulated enough proxy power in the General Assembly of a company, you, in turn, might be approached by even nicer people, who will offer you (even more) money in exchange for having that conglomerate proxy voting power of yours on their side when appointing a good friend of theirs to the Board of Directors.

Here you have a glimpse of what equity-based securities are in essence: they are tradable, abstract building blocks of an incorporated business structure. Knowing that, let’s have a look at the conditional claims on assets that come with a corporate share. Say the company makes some net profit at the end of the year, happens even to have free cash corresponding to that profit, and the General Assembly decides to have 50% of the net profit paid to shareholders as a dividend. Still, voting in a company is based on majority, and, as I already said, a majority is there when it can back someone to be a member of the Board of Directors. In practical terms it means that decisions about the dividend are taken by a majority in the Board of Directors, who, in turn, represent a majority in the General Assembly.

The claim to a dividend that you can have, as a shareholder, is conditional on: a) the fact of the company having any profit after tax, b) the company having any free cash on the balance sheet, corresponding to that profit after tax, and c) the majority of voting power in the General Assembly backing the idea of paying a dividend to shareholders. Summing up, the dividend is your conditional claim on the liquid assets of the company. Why do I say it is a conditional claim on assets, and not on net profit? Well, profit is a result. It is an abstract value. What is really there, to distribute, is some cash. That cash can come from many sources. It is just its arithmetical value that must correspond to a voted percentage of net profit after tax. Your dividend might actually be paid with cash that comes from the sale of some used equipment previously owned by the company.
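Here is a minimal sketch of those three conditions as a hypothetical helper function; the function name and the way the payout is capped by free cash are my own assumptions, made for illustration:

```python
def dividend_payable(net_profit_after_tax: float,
                     free_cash: float,
                     payout_ratio_voted: float,
                     majority_backs_dividend: bool) -> float:
    """Return the dividend actually payable to shareholders, or 0.0 if any condition fails."""
    if net_profit_after_tax <= 0.0:        # condition a) there is profit after tax
        return 0.0
    if not majority_backs_dividend:        # condition c) the General Assembly majority votes for it
        return 0.0
    voted_amount = payout_ratio_voted * net_profit_after_tax
    return min(voted_amount, free_cash)    # condition b) capped by the cash actually there

print(dividend_payable(net_profit_after_tax=500_000, free_cash=200_000,
                       payout_ratio_voted=0.5, majority_backs_dividend=True))
# -> 200000.0 : the voted 250 000 is capped by the free cash available
```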

Another typical case of a conditional claim on assets is that of liquidation and dissolution. When business goes really bad, the company might be forced to sell off its fixed assets in order to pay its debts. When there is really a lot of debt to pay, the shareholders of the company might decide to sell off everything, and to dissolve the incorporation. In such a case, should any assets be left at the moment of dissolution, free of other claims, the proceeds from their sale can be distributed among the incumbent shareholders.

Right, but voting, giving or receiving proxy power, claiming the dividend or the proceeds from dissolution – it is all about staying in a company, and we were talking about the utility of equity-based securities for those B-type capitalists, who would rather trade their shares than hold them. These people can use the stock market.

It is a historical fact that whenever and wherever it became a common practice to incorporate business in the form of companies, and to issue equity-based securities corresponding to shares, a market for those securities arose. Military legions in Ancient Rome were incorporated businesses, which would issue (something akin to) equity-based securities, and there were special places, called ‘counters’, where those securities would be traded. This is a peculiar pattern in human civilisation: when we practice some kind of repetitive deals, whose structure can be standardized, we tend to single out some claims out of those contracts, and turn those claims into tradable financial instruments. We call them ‘financial instruments’, because they are traded as goods, whilst not having any intrinsic utility, besides the fact of representing some claims.

Probably the first modern stock exchange in Europe was founded in Angers, France, sometime in the 15th century. At the time, there were (virtually) no incorporated companies. Still, there was another type of equity. Goods used to be transported slowly. A cargo of wheat could take weeks to sail from port A to port B, and then to be transported inland by barges or carts pulled by oxen. If you were the restless type of capitalist, you could eat your fingernails out of restlessness when waiting for your money, invested in that wheat, to come back to you. Thus, merchants invented securities which represented an abstract arithmetical fraction of the market value ascribed to such a stock of wheat. They were called different names, and usually fell under the general category of warrants, i.e. securities that give the right to pick up something from somewhere. Those warrants were massively traded in that stock exchange in Angers, and in other similar places, like Cadiz, in Spain. Thus, I bought a stock of wheat in Poland (excellent quality and good price), I had it shipped (horribly slowly) to Italy, and as soon as I owned that stock, I made a series of warrants on it, like one warrant per 100 pounds of wheat, and I started trading those warrants.

By the way, this is where the name ‘stock market’ comes from. The word ‘stock’ initially meant, and still means, a large quantity of some tradable goods. Places such as Angers or Cadiz, where warrants on such goods were being traded, were commonly called ‘stock markets’. When you think of it, those warrants on corn, cotton, wool, wine etc. were equity-based securities. As long as the issuer of the warrants had any equity in that stock, i.e. as long as their debt did not exceed the value of that stock, said value was equity, and warrants on those goods were securities backed with equity.

That little historical sketch gives an idea of what finance is. It is a set of institutionalized behavioural patterns and rituals, which allow a faster reaction to changing conditions by creating something like a social hormone: symbols subject to exchange, and markets for those symbols.

Here comes an important behavioural pattern, observable in the capital market. There are companies which are recommended by analysts and brokers as ‘dividend companies’ or ‘dividend stock’. It is recommended to hold their stock for a long time, as a long-term investment. The fact of recommending them comes from another fact: in these companies, a substantial percentage of shares stays, for years, in the hands of the same people. This is how they can have their dividend. We can observe relatively low liquidity in their stock. Here is a typical loop, peculiar to financial markets. Some people like holding the stock of some companies for a long time. That creates little liquidity in that stock, and, indirectly, little variation in the market price of that stock. Little variation in price means that whatever you can expect to gain on that stock, you will not really make those gains overnight. Thus, you hold. As you hold, and as other people do the same, there is little liquidity in that stock, and little variation in its price, and analysts recommend it as ‘dividend stock’. And so the loop spins.

I generalize. You have some equity-based securities, whose market value comes mostly from the fact that we have a market for them. People do something specific about those securities, and their behavioural pattern creates a pattern in prices and quantities of trade in that stock. Other people watch those prices and quantities, and conclude that the best thing to do regarding those securities is to clone the behavioural pattern, which made those prices and quantities. The financial market works as a market for strategies. Prices and quantities become signals as for what strategy is recommended.

On the other hand, there are shares just made for being traded. Holding them for more than two weeks seems like preventing a race horse from having a run on the track. People buy and sell them quickly, there is a lot of turnover and liquidity, we are having fun with trade, and the price swings madly. Other people are having a look at the market, and they conclude that with those swings in price, they should buy and sell that stock really quickly. Another loop spins. The stock market gives two types of signals, for two distinct strategies. And thus, two types of capitalists are in the game: the calm and consistent A type, and the restless B type. The financial market and the behavioural patterns observable in business people mutually reinforce and sharpen each other.

Sort of in the shade of those ‘big’ strategies, there is another one. We have ambitions, but we have no capital. We convince other people to finance the equity of a company, where we become Directors or Executive Management. With time, we grant ourselves so-called ‘management packages’, i.e. parcels of the company’s stock, paid to us as additional compensation. We reasonably assume that the value of those management packages is defined by the price we can sell this stock at. The best price is the price we make: this is one of the basic lessons in the course of macroeconomics. Hence, we make a price for our stock. As the Board of Directors, we officially decide to buy some stock back from shareholders, at a price which accidentally hits the market maximums or even higher. The company buys some stock from its own shareholders. That stock is usually specified. Just some stock is being bought back, in what we call a buyback scheme. Accidentally, that ‘just some stock’ is the stock contained in the management packages we hold as Directors. Pure coincidence. In some legal orders, an incorporated company cannot hold its own stock, and the shares purchased back must be nullified and terminated. Thus, the company makes some shares, issues them, gives them to selected people, who later vote to sell them back to the company, with a juicy surplus, and ultimately those shares disappear. In other countries, the shares acquired back by the company pass into the category of ‘treasury shares’, i.e. they become assets, without voting power or claim on dividend. This is the Dark Side of the stock market. When there is a lot of hormones flowing, you can have a position of power just by finding the right spot in that flow. Brains know it better than anyone else.

Now, some macroeconomics, thus the bird’s eye view. The bird is lazy, and it prefers having a look at the website of the World Bank, where it picks two metrics: a) gross capital formation as % of GDP and b) stocks traded, total value, as % of GDP. The former measures the value of new fixed assets that pop up in the economic system, the latter estimates the value of all corporate stock traded in capital markets. Both are denominated in units of real output, i.e. as % of GDP, and both have a line labelled ‘World’, i.e. the value estimated for the whole planet taken as one economic system. Here comes a table, and a graph. The latter shows the liquidity of capital formation, measured as the value of stocks traded divided by the gross value of fixed capital formed (see the sketch after the table). Some sort of ascending cycle emerges, just as if we, humans, were experimenting with more and more financial liquidity in new fixed assets, and as if, from time to time, we had to back off a bit on that liquidity.

 

Year   Gross capital formation (% of GDP), World   Stocks traded, total value (% of GDP), World   ||   Year   Gross capital formation (% of GDP), World   Stocks traded, total value (% of GDP), World
1984 25,4% 17,7% 2001 24,0% 104,8%
1985 25,4% 23,7% 2002 23,4% 82,8%
1986 25,1% 32,4% 2003 23,9% 76,0%
1987 25,4% 46,8% 2004 24,7% 83,8%
1988 26,2% 38,1% 2005 25,0% 99,8%
1989 26,6% 44,5% 2006 25,4% 118,5%
1990 26,0% 31,9% 2007 25,8% 161,9%
1991 25,4% 24,1% 2008 25,6% 140,3%
1992 25,2% 22,5% 2009 23,4% 117,3%
1993 25,0% 30,7% 2010 24,2% 112,5%
1994 25,0% 34,0% 2011 24,5% 104,8%
1995 24,8% 34,1% 2012 24,3% 82,4%
1996 24,7% 41,2% 2013 24,2% 87,7%
1997 24,7% 58,9% 2014 24,4% 101,2%
1998 24,5% 73,1% 2015 24,2% 163,4%
1999 24,1% 103,5% 2016 23,8% 124,5%
2000 24,5% 145,7%
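Just to make the construction of that graph explicit, here is a minimal Python sketch which recomputes the liquidity-of-capital-formation ratio for a handful of years from the table above (the dictionary names are mine, and the figures are simply re-typed from the table as decimal fractions of GDP):

# Liquidity of capital formation = stocks traded (% of GDP) / gross capital formation (% of GDP).
gross_capital_formation = {1984: 0.254, 1990: 0.260, 2000: 0.245, 2007: 0.258, 2016: 0.238}
stocks_traded = {1984: 0.177, 1990: 0.319, 2000: 1.457, 2007: 1.619, 2016: 1.245}

for year in sorted(gross_capital_formation):
    liquidity = stocks_traded[year] / gross_capital_formation[year]
    print(f"{year}: stocks traded are {liquidity:.2f} times the gross capital formed")

Run over the full table, the ratio climbs from roughly 0,7 in 1984 to roughly 6 around 2000 and 2007, which is the ascending cycle mentioned above.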

I am consistently delivering good, almost new science to my readers, and I love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful for your suggestions on two things that Patreon suggests I should ask you about. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?

Joseph et le perceptron

 

My editorial on You Tube

 

So my idea of applying artificial intelligence to simulate collective decisions – and more specifically the possible implementation of a participatory financial scheme at the local scale – is taking on some colour. I have just been working a bit on the idea of mutual cohesion between collective decisions, very much in the spirit of the articles I cited in « Si je permets plus d’extravagance ». I have included the cohesion component in the perceptron which I have been describing on this blog, and applying in my research, for about 2 months. It is starting to give interesting results: a perceptron which takes into account the relative cohesion of its own variables acquires a certain nobility of deep learning. You could read the first results of this approach in « How can I possibly learn on that thing I have just become aware I do? ».

Since that last update, I have moved forward a bit with the application of this idea of cohesion, and of swarm theory, in my own research. I have noticed a clear difference in the cohesion generated by the perceptron, depending on the way that cohesion is observed. I can adopt two simulation strategies as regards the role of mutual cohesion between variables. Strategy no. 1: I calculate the mutual cohesion between the decisions represented by the variables, and I stop at observation. The perceptron does not use the fitness function as a parameter in the learning process: the neural network does not know what degree of cohesion it yields. I thus add an additional dimension to the observation of what the neural network does, but I do not change the logical structure of the network. Strategy no. 2: I include the local values of the fitness function – thus of the measure of cohesion between variables – as a parameter used by the perceptron. The cohesion measured in experimental round « k – 1 » is used as data in round « k ». The cohesion between variables modifies the logical structure of the network recurrently. I thereby create a component of deep learning: the perceptron, initially oriented towards pure experimentation, starts taking into account the internal cohesion between its own decisions.

This taking into account is indirect. The cohesion of variables observed in round « k – 1 » is added, as additional information, to the values of those variables taken into account in round « k ». Consequently, that measure of cohesion modifies the local error generated by the network, and thus influences the learning process. The mechanism of learning on the basis of cohesion between variables is therefore identical to this perceptron’s basic mechanism: the fitness function, whose value is inversely proportional to the cohesion between variables, is one more source of error. The perceptron tends to minimize local error, so it should, logically, minimize the fitness function as well, and thus maximize cohesion. That seems logical at first sight.
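A minimal Python sketch of the difference between the two strategies (the function names, and the exact way cohesion is folded back into the inputs, are my own simplification of the procedure described above, not a literal transcription of my spreadsheet):

import math

def fitness(values):
    # V(x) for each variable: the variable itself plus its Euclidean distances
    # to the other variables, averaged over the number of variables.
    n = len(values)
    return [(values[i] + sum(math.sqrt((values[i] - values[j]) ** 2)
                             for j in range(n) if j != i)) / n
            for i in range(n)]

def experimental_round(variables, strategy):
    v = fitness(variables)                       # cohesion observed in round k - 1
    if strategy == 2:                            # strategy no. 2: cohesion re-enters the loop
        variables = [x + vx for x, vx in zip(variables, v)]
    # ... here the perceptron would push `variables` through its activation function ...
    return variables, v

x = [0.26, 0.48, 0.01, 0.09]
x_next, v_observed = experimental_round(x, strategy=2)

Under strategy no. 1 the function is called with strategy=1 and v is merely logged; under strategy no. 2 the observed cohesion modifies the data fed into the next round.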

Well, here my perceptron surprises me, and it surprises me in different ways depending on the activation function it uses to learn. When learning happens through the sigmoid function, the perceptron always yields less cohesion at the end of the 5000 rounds of experimentation than it did at the beginning. The sigmoid seems to systematically inflate the Euclidean distance between its variables, whatever the learning strategy, no. 1 or no. 2. With strategy no. 2 (thus when cohesion is explicitly taken into account), the perceptron generates a significantly smaller cumulative error and greater cohesion at the end. Less cumulative error means less learning: a perceptron learns through the analysis of its own errors. At the same time, the sigmoid which has to actively observe its own cohesion becomes a bit less stable. The curve of cumulative error – thus the essential learning curve – becomes a bit more rugged, with momentary convulsions.

On the other hand, when my perceptron strives to be intelligent through the hyperbolic tangent, it does not really change the fundamental cohesion between variables. It behaves in the way typical of the hyperbolic tangent – it panics locally without that changing much in the long run – but at the end of the day the overall cohesion between variables differs little or not at all from the starting position. Whilst the sigmoid seems to learn something about its cohesion – and that something seems to boil down to saying that cohesion really needs to be reduced – the hyperbolic tangent seems almost unable to learn anything significant. Moreover, when the hyperbolic tangent explicitly takes its own cohesion into account, its cumulative error becomes a bit smaller, but above all its behaviour becomes much more random, along several dimensions. The curve of local error acquires much more amplitude, and at the same time the cumulative error after 5000 rounds of experimentation varies more from one instance of 5000 to another. The cohesion curve is just as agitated, but at the end of the day there is not much change in terms of cohesion.

I have already had, several times, this basic intuition about these two neural activation functions: they represent fundamentally different paths of learning. The sigmoid is like an engineer locked inside an armoured capsule. It absorbs and dampens local shocks whilst gently steering them in a well-defined direction. The hyperbolic tangent, for its part, behaves like a neurotic chimpanzee: it screams out loud at the slightest dissonance, but it does not draw many conclusions. I am tempted to say that the sigmoid is like intellect, and the hyperbolic tangent represents emotions. A weighed, rational reaction on the one hand, a vivid, panicky reaction on the other.

I am trying to find a graphical representation for all that stuff, something which is both synthetic and relevant to what I have already presented in my previous updates. I want to show you the way the perceptron learns under different conditions. I start with the image. Below, you will find 3 graphs which describe the way my perceptron learns under different conditions. Further on, after the graphs, I develop a discussion.

 

Right, let me discuss. First of all, the variables. I have four of them in these graphs. The first one, marked as a blue line with discrete red markers, is the accumulated error generated with the sigmoid. One remark: this time, when I say « accumulated error », it is really accumulated. It is the sum of all the local errors, accumulated over consecutive rounds of experimentation. It is thus like ∑e(k), where « e(k) » is the local error – i.e. the deviation from the initial values – observed on the output variables in the k-th round of experimentation. The orange line gives the same thing, i.e. the accumulated error, only with the hyperbolic tangent function.

Accumulated error, for me, is the most direct measure of learning understood in purely quantitative terms. The more accumulated error, the more experience my perceptron has gained. Whatever the scenario represented in the graphs, the perceptron accumulates learning differently depending on the neural function. The sigmoid accumulates experience unequivocally and systematically. With the hyperbolic tangent, it is different. When I observe that rugged curve, I have the intuitive impression of learning by fits and starts. I also see something else, which I even struggle to name precisely. If the curve of accumulated error – thus of experience encountered – drops abruptly, what does that mean? What comes to my mind is the idea of contrary experiences producing contradictory learning. One day, I am very happy with the collaboration with those people from across the street (across the ocean, the ideology etc.) and I am full of conclusions like « Agreement is better than discord » etc. The next day, when I try to reproduce that positive experience, suddenly there is sand in the gears. I cannot find a common language with those guys, they are hopeless, working together seems to lead nowhere. Every day, I face the equiprobable possibility of swinging into one or the other of those extremes.

I don’t know about you, but I recognize that kind of learning. It is daily bread, in fact. Unless we have a progressive method of learning something – thus the sigmoid – we frequently learn like that, i.e. by developing contradictory behavioural patterns.

Next, I introduce a measure of cohesion between the perceptron’s variables, as the inverse of the fitness function, thus as « 1/V(x) ». I decided to use this inverse, instead of the fitness function strictly speaking, for the sake of a clearer explanation. The fitness function strictly speaking is the Euclidean distance between the local values of my perceptron’s variables. Interpreted as such, the fitness function is thus the opposite of cohesion. I refer to swarm theory, as I discussed it in « Si je permets plus d’extravagance ». When the curve of « 1/V(x) » goes down, it means less cohesion: the swarm loosens its internal associations. When « 1/V(x) » goes up, a new rigidity creeps into the associations of that same swarm.

Question: can I legitimately consider my two tensors – thus structured collections of numerical variables – as a social swarm? I think I can regard them as the complex result of the activity of such a swarm: multiple decisions, associated with each other in a changing way, can be seen as the manifestation of collective learning.

With that assumption, I see, once again, two ways of learning through the two different neural functions. The sigmoid always produces progressively decreasing cohesion. A social swarm that works according to this behavioural model learns progressively (it progressively accumulates coherent experience) and, as it learns, it loosens its internal cohesion in a controlled way. A swarm that behaves more hyperbolic-tangent-style does something different: it oscillates between different levels of cohesion, as if it were testing what happens when one allows oneself more freedom to experiment.

Right, those are my impressions after having made my perceptron work under different conditions. Now, can I find logical connections between what my perceptron does and economic theory? I have to remind myself, again and again, that the perceptron, and the whole collective-intelligence apparatus, serves me to predict the possible absorption of my financial concept into the market of renewable energies.

Observation of the perceptron suggests that the market is likely to react to this new idea in two different ways: agitation on the one hand, and progressive change on the other. In fact, in terms of economic theory, I see an association with Joseph Schumpeter’s theory of business cycles. Joseph assumed that technological change associates with social change along two distinct, parallel paths: creative destruction, which often hurts in the given moment, forces the structure of the economic system to change in a progressive way.

I keep delivering good science to you, almost new, just a bit dented in the process of conception. I remind you that you can download the business plan of the BeFund project (also available in English). You can also download my book entitled “Capitalism and Political Power”. I want to use crowdfunding to give myself financial footing in this effort. You can support my research financially, according to your best judgment, through my PayPal account. You can also register as my patron on my Patreon page. If you do so, I will be grateful if you point out two important things to me: what kind of reward do you expect in exchange for your patronage, and what stages would you like to see in my work?

How can I possibly learn on that thing I have just become aware I do?

 

My editorial on You Tube

 

I keep working on the application of neural networks to simulate the workings of collective intelligence in humans. I am currently macheting my way through the model proposed by de Vincenzo et al in their article entitled ‘Mimicking the collective intelligence of human groups as an optimization tool for complex problems’ (2018[1]). In the spirit of my own research, I am trying to use optimization tools for a slightly different purpose, that is for simulating the way things are done. It usually means that I relax some assumptions which come along with said optimization tools, and I just watch what happens.

Vincenzo et al propose a model of artificial intelligence, which combines a classical perceptron, such as the one I have already discussed on this blog (see « More vigilant than sigmoid », for example) with a component of deep learning based on the observable divergences in decisions. In that model, social agents strive to minimize their divergences and to achieve relative consensus. Mathematically, it means that each decision is characterized by a fitness function, i.e. a function of mathematical distance from other decisions made in the same population.

I take the tensors I have already been working with, namely the input tensor TI = {LCOER, LCOENR, KR, KNR, IR, INR, PA;R, PA;NR, PB;R, PB;NR} and the output tensor TO = {QR/N; QNR/N}. Once again, consult « More vigilant than sigmoid » for the meaning of those variables. In the spirit of the model presented by Vincenzo et al, I assume that each variable in my tensors is a decision. Thus, for example, PA;R, i.e. the basic price of energy from renewable sources, which small consumers are charged, is the tangible outcome of a collective decision. Same for the levelized cost of electricity from renewable sources, the LCOER, etc. For each i-th variable xi in TI and TO, I calculate its relative fitness to the overall universe of decisions, as the average of itself and of its Euclidean distances to the other decisions. It looks like:

 

V(xi) = (1/N) * { xi + [(xi – xi;1)^2]^0,5 + [(xi – xi;2)^2]^0,5 + … + [(xi – xi;K)^2]^0,5 }

 

…where N is the total number of variables in my tensors, and K = N – 1.

 

In a next step, I can calculate the average of those averages: I sum up all the individual V(xi)’s and divide the total by N. That average V*(x) = (1/N) * [V(x1) + V(x2) + … + V(xN)] is the measure of aggregate divergence between individual variables considered as decisions.
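Here is a minimal Python sketch of those two formulas, under the assumption that the variables of both tensors come as one flat list of numbers (the function names are mine):

import math

def v_local(values, i):
    # V(xi): the variable itself plus its Euclidean distances to the K = N - 1
    # other variables, the whole thing divided by N.
    n = len(values)
    xi = values[i]
    distances = sum(math.sqrt((xi - values[j]) ** 2) for j in range(n) if j != i)
    return (xi + distances) / n

def v_aggregate(values):
    # V*(x): the average of the individual V(xi)'s, i.e. the aggregate divergence.
    n = len(values)
    return sum(v_local(values, i) for i in range(n)) / n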

Now, I imagine two populations: one who actively learns from the observed divergence of decisions, and another one who doesn’t really. The former is represented with a perceptron that feeds back the observable V(xi)’s into consecutive experimental rounds. Still, it is just feeding that V(xi) back into the loop, without any a priori ideas about it. The latter is more or less what it already is: it just yields those V(xi)’s but does not do much about them.

I needed a bit of thinking as to how exactly that feeding back of the fitness function should look. In the algorithm I finally came up with, it looks different for the input variables, on the one hand, and for the output ones, on the other. You might remember, from reading « More vigilant than sigmoid », that my perceptron, in its basic version, learns by estimating the local errors observed in the last round of experimentation, and then adding those local errors to the values of input variables, just to make them roll once again through the neural activation function (sigmoid or hyperbolic tangent), and see what happens.

As I upgrade my perceptron with the estimation of fitness function V(xi), I ask: who estimates the fitness function? What kind of question is that? Well, a basic one. I have that neural network, right? It is supposed to be intelligent, right? I add a function of intelligence, namely that of estimating the fitness function. Who is doing the estimation: my supposedly intelligent network or some other intelligent entity? If it is an external intelligence, mine, for a start, it just estimates V(xi), sits on its couch, and watches the perceptron struggling through the meanders of attempts to be intelligent. In such a case, the fitness function is like sweat generated by a body. The body sweats but does not have any way of using the sweat produced.

Now, if the V(xi) is to be used for learning, the perceptron is precisely the incumbent intelligent structure supposed to use it. I see two basic ways for the perceptron to do that. First of all, the input neuron of my perceptron can capture the local fitness functions on input variables and add them, as additional information, to the previously used values of input variables. Second of all, the second hidden neuron can add the local fitness functions, observed on output variables, to the exponent of the neural activation function.

I explain. I am a perceptron. I start my adventure with two tensors: input TI = {LCOER, LCOENR, KR, KNR, IR, INR, PA;R, PA;NR, PB;R, PB;NR} and output TO = {QR/N; QNR/N}. The initial values I start with are slightly modified in comparison to what was being processed in « More vigilant than sigmoid ». I assume that the initial market of renewable energies – thus most variables of quantity with ‘R’ in subscript – is quasi inexistent. More specifically, QR/N = 0,01 and  QNR/N = 0,99 in output variables, whilst in the input tensor I have capital invested in capacity IR = 0,46 (thus a readiness to go and generate from renewables), and yet the crowdfunding flow K is KR = 0,01 for renewables and KNR = 0,09 for non-renewables. If you want, it is a sector of renewable energies which is sort of ready to fire off but hasn’t done anything yet in that department. All in all, I start with: LCOER = 0,26; LCOENR = 0,48; KR = 0,01; KNR = 0,09; IR = 0,46; INR = 0,99; PA;R = 0,71; PA;NR = 0,46; PB;R = 0,20; PB;NR = 0,37; QR/N = 0,01; and QNR/N = 0,99.

Being a pure perceptron, I am dumb as f**k. I can learn by pure experimentation. I have ambitions, though, to be smarter, thus to add some deep learning to my repertoire. I estimate the relative mutual fitness of my variables according to the V(xi) formula given earlier, as arithmetical average of each variable separately and its Euclidean distance to others. With the initial values as given, I observe: V(LCOER; t0) = 0,302691788; V(LCOENR; t0) = 0,310267104; V(KR; t0) = 0,410347388; V(KNR; t0) = 0,363680721; V(IR ; t0) = 0,300647174; V(INR ; t0) = 0,652537097; V(PA;R ; t0) = 0,441356844 ; V(PA;NR ; t0) = 0,300683099 ; V(PB;R ; t0) = 0,316248176 ; V(PB;NR ; t0) = 0,293252713 ; V(QR/N ; t0) = 0,410347388 ; and V(QNR/N ; t0) = 0,570485945. All that stuff put together into an overall fitness estimation is like average V*(x; t0) = 0,389378787.
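Assuming the v_local() and v_aggregate() sketches given a few paragraphs above, those t0 figures can be approximately reproduced from the initial values (the dictionary keys below are mine):

# Initial values of TI and TO, as listed above.
initial = {'LCOER': 0.26, 'LCOENR': 0.48, 'KR': 0.01, 'KNR': 0.09, 'IR': 0.46,
           'INR': 0.99, 'PA_R': 0.71, 'PA_NR': 0.46, 'PB_R': 0.20, 'PB_NR': 0.37,
           'QR_N': 0.01, 'QNR_N': 0.99}
values = list(initial.values())
for i, name in enumerate(initial):
    print(name, round(v_local(values, i), 4))   # e.g. LCOER comes out around 0,30, KR around 0,41, INR around 0,65
print('V*(x; t0) =', round(v_aggregate(values), 4))

Most of the per-variable values land close to those quoted above; exact agreement is not to be expected, since the initial values are quoted here rounded to two decimals.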

I ask myself: what happens to that fitness function as I process information with my two alternative neural functions, the sigmoid or the hyperbolic tangent? I jump to experimental round 1500, thus to t1500, and I watch. With the sigmoid, I have V(LCOER; t1500) = 0,359529289; V(LCOENR; t1500) = 0,367104605; V(KR; t1500) = 0,467184889; V(KNR; t1500) = 0,420518222; V(IR; t1500) = 0,357484675; V(INR; t1500) = 0,709374598; V(PA;R; t1500) = 0,498194345; V(PA;NR; t1500) = 0,3575206; V(PB;R; t1500) = 0,373085677; V(PB;NR; t1500) = 0,350090214; V(QR/N; t1500) = 0,467184889; and V(QNR/N; t1500) = 0,570485945, with average V*(x; t1500) = 0,441479829.

Hmm, interesting. Working my way through intelligent cognition with a sigmoid, after 1500 rounds of experimentation, I have somehow decreased the mutual fitness of decisions I make through individual variables. Those V(xi)’s have changed. Now, let’s see what it gives when I do the same with the hyperbolic tangent: V(LCOER; t1500) =   0,347752478; V(LCOENR; t1500) =  0,317803169; V(KR; t1500) =   0,496752021; V(KNR; t1500) = 0,436752021; V(IR ; t1500) =  0,312040791; V(INR ; t1500) =  0,575690006; V(PA;R ; t1500) =  0,411438698; V(PA;NR ; t1500) =  0,312052766; V(PB;R ; t1500) = 0,370346458; V(PB;NR ; t1500) = 0,319435252; V(QR/N ; t1500) =  0,496752021; and V(QNR/N ; t1500) = 0,570485945, with average V*(x; t1500) =0,413941802.

Well, it is becoming more and more interesting. Being a dumb perceptron, I can, nevertheless, create two different states of mutual fitness between my decisions, depending on the kind of neural function I use. I want to have a bird’s eye view on the whole thing. How can a perceptron have a bird’s eye view of anything? Simple: it rents a drone. How can a perceptron rent a drone? Well, how smart do you have to be to rent a drone? Anyway, it gives something like the graph below:

 

Wow! So this is what I do, as a perceptron, and what I haven’t been aware of so far? Amazing. When I think in sigmoid, I sort of consistently increase the relative distance between my decisions, i.e. I decrease their mutual fitness. The sigmoid, that function which sort of calms down any local disturbance, makes the decision-making process less coherent, more prone to embracing a little chaos. The hyperbolic tangent thinking is different. It occasionally stretches across a broader spectrum of fitness in decisions, but as soon as it does so, it seems afraid of its own actions, and returns to the initial level of V*(x). Please note that, as a perceptron, I am almost alive, and I produce slightly different outcomes in each instance of myself. The point is that in the line corresponding to the hyperbolic tangent, the comb-like pattern of small oscillations can stretch and move from instance to instance. Still, it keeps the general form of a comb.

OK, so this is what I do, and now I ask myself: how can I possibly learn on that thing I have just become aware I do? As a perceptron, endowed with this precise logical structure, I can do one thing with information: I can arithmetically add it to my input. Still, having some ambitions to evolve, I attempt to change my logical structure, and I venture into incorporating, somehow, the observable V(xi) into my neural activation function. Thus, the first thing I do with that new learning is to top the values of input variables with the local fitness functions observed in the previous round of experimenting. I am already doing it with local errors observed in outcome variables, so why not double the dose of learning? Anyway, it goes like: xi(t) = xi(t-1) + e(xi; t-1) + V(xi; t-1). It looks interesting, but I am still using just a fraction of information about myself, i.e. just that about input variables. Here is where I start being really ambitious. In the equation of the sigmoid function, I change s = 1 / [1 + exp(∑xi*Wi)] into s = 1 / [1 + exp(∑xi*Wi + V(TO))], where V(TO) stands for the local fitness functions observed in output variables. I do the same, by analogy, in my version based on the hyperbolic tangent. The th = [exp(2*∑xi*wi) – 1] / [exp(2*∑xi*wi) + 1] turns into th = {exp[2*∑xi*wi + V(TO)] – 1} / {exp[2*∑xi*wi + V(TO)] + 1}. I do what I know how to do, i.e. add information from fresh observation, and I apply it to change the structure of my neural function.
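A minimal Python sketch of those modified functions, following the formulas exactly as written above (weights stands for the quasi-random weights and v_out for the fitness observed on output variables; the names are mine):

import math

def update_inputs(inputs, local_errors, local_fitness):
    # xi(t) = xi(t-1) + e(xi; t-1) + V(xi; t-1): error and fitness both feed back into the input.
    return [x + e + v for x, e, v in zip(inputs, local_errors, local_fitness)]

def sigmoid_with_fitness(inputs, weights, v_out):
    # s = 1 / [1 + exp(sum(xi*Wi) + V(TO))], written exactly as in the text above.
    return 1.0 / (1.0 + math.exp(sum(x * w for x, w in zip(inputs, weights)) + v_out))

def tanh_with_fitness(inputs, weights, v_out):
    # th = {exp[2*sum(xi*wi) + V(TO)] - 1} / {exp[2*sum(xi*wi) + V(TO)] + 1}
    e = math.exp(2.0 * sum(x * w for x, w in zip(inputs, weights)) + v_out)
    return (e - 1.0) / (e + 1.0)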

All those ambitious changes in myself, put together, change my pattern of learning, as shown in the graph below:

When I think sigmoid, the fact of feeding back my own fitness function does not change much. It makes the learning curve a bit steeper in the early experimental rounds, and makes it asymptotic to a slightly lower threshold in the last rounds, as compared to learning without feedback on V(xi). Yet, it is the same old sigmoid, with just its sleeves ironed. On the other hand, the hyperbolic tangent thinking changes significantly. What used to look like a comb, without feedback, now looks much more aggressive, like a plough on steroids. There is something like a complex cycle of learning on the internal cohesion of the decisions made. Generally, feeding back the observable V(xi) increases the finally achieved cohesion in decisions, and, at the same time, it reduces the cumulative error gathered by the perceptron. With that type of feedback, the cumulative error of the sigmoid, which normally hits around 2,2 in this case, falls to something like 0,8. With the hyperbolic tangent, cumulative errors which used to be 0,6 – 0,8 without feedback fall to 0,1 – 0,4 with feedback on V(xi).

 

The (provisional) piece of wisdom I can have as my takeaway is twofold. Firstly, whatever I do, a large chunk of perceptual learning leads to a bit less cohesion in my decisions. As I learn by experience, I allow myself more divergence in decisions. Secondly, looping on that divergence, and including it explicitly in my pattern of learning leads to relatively more cohesion at the end of the day. Still, more cohesion has a price – less learning.

 

I am consistently delivering good, almost new science to my readers, and I love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful for your suggestions on two things that Patreon suggests I should ask you about. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?

[1] De Vincenzo, I., Massari, G. F., Giannoccaro, I., Carbone, G., & Grigolini, P. (2018). Mimicking the collective intelligence of human groups as an optimization tool for complex problems. Chaos, Solitons & Fractals, 110, 259-266.

Si je permets plus d’extravagance

 

My editorial on You Tube

 

I continue my little intellectual adventure with the application of neural networks – thus of artificial intelligence – to the simulation of the collective intelligence of human societies. I have discovered a research niche – documented in the literature – which deals almost exclusively with the topic of collective intelligence in general. Here is an article by Stradner et al. (2013[1]) which studies the algorithms pertaining to swarm theory, or distributed intelligence. We humans always hesitate about our own collective intelligence, yet we have very firm ideas about that of bees or ants. Swarm theory is a behaviourist approach to collective intelligence: it assumes that action in general, intelligent or not quite, can be considered collective if it is coordinated. The collective intelligence of a set of more or less autonomous entities can be observed as coordination, and more exactly as association. We can perceive any action as movement, and then we can judge the coordination in that action by the degree of association between individual movements.

 

There are three classes of association: random dynamic, correlated dynamic, and static. The distinction is made according to the interactions observed between the entities that make up the given set. If the interactions are random contacts, devoid of habitual patterns – without ritualization, if you will – we speak of random association. We associate, from time to time, but we don’t make friends in the process. If there is an exchange of information in random association, we lose a significant part of the communicated content, since we have no social rituals for coding and decoding, or for recognizing the source. If I am a small stock-market investor, among a crowd of other small investors whom I cross only accidentally, it is largely a case of random dynamic association.

 

Where is the intelligence in random dynamic association? Is there any intelligence at all in that type of interaction? A TV commercial is a message thrown into the air in the hope that someone catches it. When a new version of the Renault Mégane is launched, it is a technological concept introduced into a heterogeneous population, in the hope of finding sufficiently pronounced interest on the part of some people in that population. These are cases of interaction based on random dynamic associations. It is a case of collective intelligence based on the trial-and-error method. We produce some change, and we know that in most social interactions this change will be without major consequence; nevertheless, we count on the fact that, from a probabilistic perspective, some interactions will translate local change into something more durable.

 

In the expression « random dynamic », the word « dynamic » means that each entity in the swarm enters a changing number of interactions with other entities. If I am in a dynamic swarm, I can interact with 5 other entities or with 50 others, it depends. When that « it depends » takes a recurrent form, with a visible pattern, my position in the swarm becomes correlated dynamic. The fact of interacting with entity A entails, with a definable probability, interaction with entity B, etc. This repetitive pattern makes it possible to ritualize communication and to make it more efficient, with fewer losses of content. I come back to the example of the new version of the Renault Mégane. The doings of a Renault dealer well established in a local community, when said dealer contacts his loyal local customers, i.e. those whose history of having their current Renault cars serviced predisposes them to buy that attractive new beast: that is correlated dynamic.

 

When the recurrent pattern of interaction becomes rigid, thus when each entity in the swarm interacts within a fixed network of other entities – as in a crystalline structure – then we speak of static association. Rigid interactions are highly formalized, including the mode of communication. What happens in a Parliament – well, what officially happens there – is this type of interaction.

 

These fundamental distinctions can be applied to neural networks and to the artificial intelligence that follows from them. Probabilistic networks, which create agglomerations of phenomena on the basis of distinctive features defined by observation, without a priori assumptions, are forms of random dynamic association. We group around whatever emerges, in that precise situation, as the most obvious thing to group around. The algorithm called BEECLUST is an example of this type of collective intelligence. At the other end we have algorithms for simulating more or less static relations. One of them is the Artificial Hormone Network. Imagine a swarm of bees – thus a BEECLUST – which, instead of agglomerating in a fully spontaneous, situation-specific way, aggregates around a chemical agent, like pheromones. If I have a little universe made of phenomena described with quantitative variables, I can include an artificial agglomeration agent, like an artificial hormone. In a market, I can introduce a function that distributes financial assets among economic agents and impose a rule of agglomeration around those financial assets. The general idea is that if, into a randomly associated structure, I introduce a controlled agglomeration agent, I can transform random associations into correlated associations, or even into static ones.

 

The interesting trick is to define which type of association is the most appropriate for the given case. There is a class of algorithms which follow the model of Virtual Embryogenesis (Thenius et al. 2009[2]). In this approach, each entity in the swarm defines and redefines its associations with other entities as the swarm changes size and/or according to conditions external to the swarm. It is just like the cells of an embryo, which redefine their associations with other cells as the whole organism grows.

 

I found another article, by Kolonin et al. (2016[3]), which goes in a completely different direction in the study of collective intelligence. Kolonin et al assume that an intelligent structure – the human brain, for example – is usually a complex structure, composed of distinct sub-structures. Intelligence is the interaction both between those sub-units and inside each of them separately. A collectivity is usually formed of cultural sub-groups which distinguish themselves from one another by their respective systems of beliefs. The formation of those sub-groups follows a quasi-colloidal pattern: a system of beliefs has a semantic boundary – like a cell membrane – which defines the relative compatibility of that system with other systems.

Kolonin et al conducted an experiment with the participation of 2784 users of social networks. The experiment aimed at studying the way this randomly chosen population would structure itself into sub-groups with distinct systems of beliefs. The experimental method consisted in creating conditions for those 2784 individuals to communicate with each other in the way typical of a social network, with the possible introduction of constraints.

The first interesting finding is that as new information was put into circulation in the experimental network, it was used by the participants to form systems of beliefs which became more and more distinct and divergent. The speed of divergence tends to grow with time: the marginal utility of each new unit of information is increasing. Moreover, divergence occurred most dynamically when the experimental population met no significant constraint. When controlled constraints were imposed on the group – a relative lack of resources, for example – the sub-groups started looking for ways to bridge the differences in beliefs and to create inter-group links.

Now, if that experiment by Kolonin et al was robust in terms of scientific rigour, a whole new theory emerges from it. We keep asking ourselves questions about that bizarre verbal violence which occurs both within and through social networks. The cited experiment shows that it is absolutely normal. Take a collectivity, assure it decent living conditions, and give it a medium of communication that facilitates the sharing of information. That collectivity seems doomed to split into sub-groups which will develop more and more pronounced dissonances in their systems of beliefs. Only environmental pressure felt as significant can prompt the search for agreement.

I try to associate the two different lines of reasoning in the two articles I have just cited. Is there a link between swarm theory and the possible types of association in a swarm, on the one hand, and the phenomenon discovered in the experiment by Kolonin et al, on the other? I tell myself that an ever-deeper split of a collectivity into distinct sub-groups with distinct systems of beliefs means less and less randomness in the associations of individuals, and more and more predictable correlation. If I tend to make friends with a group of people who display beliefs similar to mine, and at the same time I create a distance towards those-guys-who-are-completely-wrong-in-their-vision-of-the-world, my associations become ritualized and more rigid. In an extreme case, a very far-reaching divergence of belief systems can create a rigid swarm, where each entity follows a very strict protocol in its relations with the others, and everyone is afraid of going beyond that protocol.

Here is yet another article, by de Vincenzo et al (2018[4]), with which I come back to algorithms of artificial intelligence. The general algorithm – or rather the model of an algorithm – which Vincenzo et al present is an example of deep learning, specifically oriented as a representation of collective human decisions. There is thus a perceptron part, similar to what I currently use in my research on the energy market, and a part which uses swarm theory to evaluate the quality of the solutions previously defined experimentally by the perceptron. I feel I am going to spend some time with this article. For the moment, this particular reading pushes me to ask a question at the frontier of maths and philosophy: how do random numbers exist inside an intelligent structure?

Let me explain. Here is a simple perceptron, like the one I discussed in « De la misère, quoi », where the input variables, grouped in a tensor, receive random weighting coefficients. Software, programming languages included, generates random numbers as, in fact, quasi-random numbers. The software runs something like a lottery draw over a predefined set of numbers between 0 and 1, so the numbers returned after a command of the « random » type are, in a sense, numbers already laid out in advance. Weighting the input variables with random coefficients thus consists, mathematically, in creating a local combination of the tensor of those variables with another, much more complex tensor of quasi-random numbers. How does that reflect human intelligence? When someone asks me to say a number at random, just like that, by surprise, does my brain create that number, or do I draw from a library of numbers whose very existence is beyond my momentary awareness?

Vincenzo et al present an interesting way of using a perceptron to simulate collective decisions: the perceptron’s input variables can be presented as so many distinct decisions. When, in my research, I have that input tensor made of energy prices, of the levelized cost LCOE, and of invested capital, those component variables can be seen as so many decisions. If we make several distinct decisions, it is important to understand their mutual links. To do so, Vincenzo et al use a procedure commonly called « NK » or « fitness landscape ». If I have N decision variables in my input tensor, each of them can enter into K = N – 1 interactions with the K = N – 1 others.

For each decision « di » from the set N = {d1, d2, …, dN} of decisions, I can define a fitness function V(di). The fitness function, in turn, rests on the assumption that distinct decisions can form a functional whole if they are mutually coherent. I can express the mutual coherence of anything with two basic mathematical operations: Euclidean distance or division. Euclidean distance is simply subtraction that is afraid of minuses: instead of computing d1 – d2, I compute [(d1 – d2)^2]^0,5, just in case d2 > d1.

My fitness function can thus take the form V(di) = [(di – d1)^2]^0,5 + [(di – d2)^2]^0,5 + … + [(di – dK)^2]^0,5, or the form V(di) = di/d1 + di/d2 + … + di/dK. Which of those two forms should we choose for a given problem? In theory, Euclidean distance reflects difference, whilst division represents a proportion. Go and see which is more appropriate for your problem. If we are apprehending differences of opinion, for example, as in Kolonin et al, cited above, Euclidean distance seems more appropriate. On the other hand, when I take into account the magnitude of the outcomes of distinct decisions – such as rates of return on a portfolio of distinct investments – I can try the proportion, thus division. In practice, try and see what works computationally. My personal experience with neural networks is that of a neophyte, but I have already learnt one thing: a neural network is like clockwork. There are combinations of equations that work and combinations that don’t, full stop. Replace the hyperbolic tangent with the hyperbolic sine, with the same set of initial values, and the perceptron stalls.
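A minimal Python sketch of those two alternative forms (the function names are mine):

def fitness_by_distance(d_i, others):
    # V(di) as the sum of Euclidean distances ((di - dk)^2)^0.5 to the K other decisions.
    return sum(((d_i - d_k) ** 2) ** 0.5 for d_k in others)

def fitness_by_ratio(d_i, others):
    # V(di) as the sum of proportions di/dk; only usable when no dk equals zero.
    return sum(d_i / d_k for d_k in others)

print(fitness_by_distance(0.5, [0.2, 0.8]))   # 0.6
print(fitness_by_ratio(0.5, [0.2, 0.8]))      # 3.125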

Another thing I have learnt: perceptrons work well with relatively small, smoothed numerical values. So I take V(di) = [(di – d1)^2]^0,5 + [(di – d2)^2]^0,5 + … + [(di – dK)^2]^0,5 and I make it do two dance steps. Step no. 1: include di in that sum. The fitness function of a decision is the decision itself plus the sum of its differences with respect to the other K decisions in the lot. That gives V(di) = di + [(di – d1)^2]^0,5 + [(di – d2)^2]^0,5 + … + [(di – dK)^2]^0,5. Then I reduce the value of V(di) by dividing it by N, the total number of decisions in the game. That gives V(di) = {di + [(di – d1)^2]^0,5 + [(di – d2)^2]^0,5 + … + [(di – dK)^2]^0,5} / N, and that is more or less the working version of the fitness function. It is complex enough to reflect the structure of the problem, and simplified enough to be used in an algorithm.

The time has come to put this update online, on my blog. It is thus time to conclude. As I review the literature on the phenomenon of collective intelligence, I see a common denominator across those extremely diverse lines of research: a collective intelligence is a culture, and as such it is characterized by a certain internal cohesion. High cohesion means efficiency at the expense of discovery. When I loosen the cohesion and allow more extravagance, I can discover more, but the culture becomes difficult to steer and may generate conflicts.

I keep delivering good science to you, almost new, just a bit dented in the process of conception. I remind you that you can download the business plan of the BeFund project (also available in English). You can also download my book entitled “Capitalism and Political Power”. I want to use crowdfunding to give myself financial footing in this effort. You can support my research financially, according to your best judgment, through my PayPal account. You can also register as my patron on my Patreon page. If you do so, I will be grateful if you point out two important things to me: what kind of reward do you expect in exchange for your patronage, and what stages would you like to see in my work?

[1] Stradner, J., Thenius, R., Zahadat, P., Hamann, H., Crailsheim, K., & Schmickl, T. (2013). Algorithmic requirements for swarm intelligence in differently coupled collective systems. Chaos, Solitons & Fractals, 50, 100-114.

[2] Thenius, R., Bodi, M., Schmickl, T., & Crailsheim, K. (2009). Evolving virtual embryogenesis to structure complex controllers. PerAdaMagazine.

[3] Kolonin, A., Vityaev, E., & Orlov, Y. (2016). Cognitive architecture of collective intelligence based on social evidence. Procedia Computer Science, 88, 475-481.

[4] De Vincenzo, I., Massari, G. F., Giannoccaro, I., Carbone, G., & Grigolini, P. (2018). Mimicking the collective intelligence of human groups as an optimization tool for complex problems. Chaos, Solitons & Fractals, 110, 259-266.

More vigilant than sigmoid

My editorial on You Tube

 

I keep working on the application of neural networks as simulators of collective intelligence. The particular field of research I am diving into is the sector of energy, its shift towards renewable energies, and the financial scheme I invented some time ago, which I called EneFin. As for that last one, you can consult « The essential business concept seems to hold », in order to grasp the outline.

I continue developing the line of research I described in my last update in French: « De la misère, quoi ». There are observable differences in the prices of energy according to the size of the buyer. In many countries – practically in all the countries of Europe – there are two, distinct price brackets. One, which I further designated as PB, is reserved to contracts with big consumers of energy (factories, office buildings etc.) and it is clearly lower. Another one, further called PA, is applied to small buyers, mainly households and really small businesses.

As an economist, I have that intuitive thought in the presence of price forks: that differential in prices is some kind of value. If it is value, why not give it some financial spin? I came up with the idea of the EneFin contract. People buy energy, in the amount Q, from a local supplier who sources it from renewables (water, wind etc.), and they pay the price PA, thus generating a financial flow equal to Q*PA. That flow buys two things: energy priced at PB, and participatory titles in the capital of their supplier, for the differential Q*(PA – PB). I imagine some kind of crowdfunding platform, which could channel the amount of capital K = Q*(PA – PB).
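A short, hypothetical numerical illustration of that contract (all the figures below are made up for the sake of the example):

# Hypothetical EneFin contract: a household buys Q kWh at the small-consumer price PA.
Q = 1000.0        # kWh bought over the contract period
PA = 0.25         # price per kWh charged to small consumers
PB = 0.18         # price per kWh reserved for big consumers

payment = Q * PA                  # what the household actually pays
energy_value = Q * PB             # the part that buys energy, priced as for big consumers
capital_K = Q * (PA - PB)         # the part converted into participatory titles of the supplier

print(payment, energy_value, capital_K)   # 250.0, 180.0, 70.0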

That K remains in some sort of fluid relationship to I, or capital invested in the productive capacity of energy suppliers. Fluid relationship means that each of those capital balances can date other capital balances, no hard feelings held. As we talk (OK, I talk) about prices of energy and capital invested in capacity, it is worth referring to LCOE, or Levelized Cost Of Electricity. The LCOE is essentially the marginal cost of energy, and a no-go-below limit for energy prices.

I want to simulate the possible process of introducing that general financial concept, namely K = Q*(PA – PB), into the market of energy, in order to promote the development of diversified networks, made of local suppliers in renewable energy.

Here comes my slightly obsessive methodological idea: use artificial intelligence in order to simulate the process. In the classical economic method, I make a model, I take empirical data, I regress some of it on another some of it, and I come up with coefficients of regression, and they tell me how the thing should work if we were living in a perfect world. Artificial intelligence opens a different perspective. I can assume that my model is a logical structure which keeps experimenting with itself, and we don’t know where the hell that experimentation leads exactly. I want to use neural networks in order to represent the exact way that social structures can possibly experiment with that K = Q*(PA – PB) thing. Instead of optimizing, I want to see the way that possible optimization can occur.

I have that simple neural network, which I already referred to in « The point of doing manually what the loop is supposed to do » and which is basically quite dumb, as it does not do any abstraction. Still, it nicely experiments with logical structures. I am sketching its logical structure in the picture below. I distinguish four layers of neurons: input, hidden 1, hidden 2, and output. When I say ‘layers’, it is a bit of grand language. For the moment, I am working with one single neuron in each layer. It is more of a synaptic chain.

Anyway, the input neuron feeds data into the chain. In the first round of experimentation, it feeds the source data in. In consecutive rounds of learning through experimentation, that first neuron assesses and feeds back local errors, measured as discrepancies between the output of the output neuron, and the expected values of output variables. The input neuron is like the first step in a chain of perception, in a nervous system: it receives and notices the raw external information.

The hidden layers – or the hidden neurons in the chain – modify the input data. The first hidden neuron generates quasi-random weights, which the second hidden neuron attributes to the input variables. Just as in a nervous system, the input stimuli are assessed for their relative importance. In the original algorithm of the perceptron, which I used to design this network, those two functions, i.e. generating the random weights and attributing them to input variables, were fused in one equation. Still, my fundamental intent is to use neural networks to simulate collective intelligence, and I intuitively guess that those two functions are somehow distinct. Pondering the importance of things is one action, and using that ponderation for practical purposes is another. It is like scientists debating the way to run a policy, and the government having the actual thing done. These are two separate paths of action.

Whatever. What the second hidden neuron produces is a compound piece of information: the summation of input variables multiplied by random weights. The output neuron transforms this compound data through a neural function. I prepared two versions of this network, with two distinct neural functions: the sigmoid, and the hyperbolic tangent. As I found out, the way they work is very different, just as the results they produce. Once the output neuron generates the transformed data – the neural output – the input neuron measures the discrepancy between the original, expected values of output variables, and the values generated by the output neuron. The exact way of computing that discrepancy is made of two operations: calculating the local derivative of the neural function, and multiplying that derivative by the residual difference ‘original expected output value minus output value generated by the output neuron’. The so calculated discrepancy is considered as a local error, and is being fed back into the input neuron as an addition to the value of each input variable.

Before I go into describing the application I made of that perceptron, as regards my idea for a financial scheme, I want to delve into the mechanism of learning triggered through repeated looping of that logical structure. The input neuron measures the arithmetical difference between the expected values of output and the output generated by the network in the preceding round of experimentation, and that difference is multiplied by the local derivative of said output. Derivative functions, in their deepest, Newtonian sense, are magnitudes of change in something else, i.e. in their base function. In the Newtonian perspective, everything that happens can be seen either as change (derivative) in something else, or as an integral (an aggregate that changes its shape) of still something else. When I multiply the local deviation from expected values by the local derivative of the estimated value, I assume this deviation is as important as the local magnitude of change in its estimation. The faster things happen, the more important they are, so to say. My perceptron learns by assessing the magnitude of the local changes it induces in its own estimations of reality.
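Here is a minimal, simplified Python sketch of one such learning loop, with a single output variable and the textbook sigmoid used as the neural function (the structure follows the description above; the variable names, the 0,95 target and the use of the standard sigmoid rather than my exact spreadsheet formulas are my own simplifications):

import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def one_round(inputs, expected_output):
    weights = [random.random() for _ in inputs]          # hidden neuron 1: quasi-random weights
    h = sum(x * w for x, w in zip(inputs, weights))      # hidden neuron 2: weighted summation
    output = sigmoid(h)                                  # output neuron: neural transformation
    # input neuron, next round: local error = local derivative * residual difference
    error = sigmoid_derivative(h) * (expected_output - output)
    return [x + error for x in inputs], error            # the error is added to each input variable

inputs = [0.26, 0.48, 0.01, 0.09, 0.46, 0.99, 0.71, 0.46, 0.20, 0.37]
for k in range(5000):
    inputs, local_error = one_round(inputs, expected_output=0.95)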

I took that general logical structure of the perceptron, and I applied it to my core problem, i.e. the possible adoption of the new financial scheme in the market of energy. Here comes sort of an originality in my approach. The basic way of using neural networks is to give them a substantial set of real data as learning material, make them learn on that data, and then make them optimize a hypothetical set of data. Here you have those 20 old cars, take them to pieces and try to put them back together, observe all the anomalies you have thus created, and then make me a new car on the grounds of that learning. I adopted a different approach. My focus is to study the process of learning in itself. I took just one set of actual input values, exogenous to my perceptron, something like an initial situation. I ran 5000 rounds of learning in the perceptron, on the basis of that initial set of values, and I observed how learning takes place.

My initial set of data is made of two tensors: input TI and output TO.

The thing I am the most focused on is the relative abundance of energy supplied from renewable sources. I express the ‘abundance’ part mathematically as the coefficient of energy consumed per capita, or Q/N. The relative bend towards renewables, or towards the non-renewables is apprehended as the distinction between renewable energy QR/N consumed per capita, and the non-renewable one, the QNR/N, possibly consumed by some other capita. Hence, my output tensor is TO = {QR/N; QNR/N}.

I hypothesise that TO is being generated by input made of prices, costs, and capital outlays. I split my price fork PA – PB (price for the big ones minus price for the small ones) into renewables and non-renewables, namely into: PA;R, PA;NR, PB;R, and PB;NR. I mirror the distinction in prices with that in the cost of energy, and so I call LCOER and LCOENR. I want to create a financial scheme that generates a crowdfunded stream of capital K, to finance new productive capacities, and I want it to finance renewable energies, and I call KR. Still, some other people, like my compatriots in Poland, might be so attached to fossils they might be willing to crowdfund new installations based on non-renewables. Thus, I need to take into account a KNR in the game. When I say capital, and I say LCOE, I sort of feel compelled to say aggregate investment in productive capacity, in renewables, and in non-renewables, and I call it, respectively, IR and INR. All in all, my input tensor spells TI = {LCOER, LCOENR, KR, KNR, IR, INR, PA;R, PA;NR, PB;R, PB;NR}.

The next step is scale and measurement. The neural functions I use in my perceptron like having their input standardized. Their tastes in standardization differ a little. The sigmoid likes it nicely spread between 0 and 1, whilst the hyperbolic tangent, the more reckless of the two, tolerates -1 ≤ x ≤ 1. I chose to standardize the input data between 0 and 1, so as to make it fit into both. My initial thought was to aim for an energy market with great abundance of renewable energy, and a relatively declining supply of non-renewables. I generally trust my intuition, only I like to leverage it with a bit of chaos, every now and then, and so I ran some pseudo-random strings of values and I chose an output tensor made of TO = {QR/N = 0,95; QNR/N = 0,48}.
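For the record, here is a generic min-max scaling recipe in Python, in case anyone wants to squeeze their own raw data into the 0-1 interval before feeding it to either function (this is a standard recipe, not the exact procedure I used to produce the pseudo-random values below):

def scale_01(raw_values):
    # Min-max standardization: the smallest value maps onto 0, the largest onto 1.
    lo, hi = min(raw_values), max(raw_values)
    return [(x - lo) / (hi - lo) for x in raw_values]   # assumes hi > lo

print(scale_01([12.0, 30.0, 48.0]))   # [0.0, 0.5, 1.0]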

That state of output is supposed to be somehow logically connected to the state of input. I imagined a market where the relative abundance in the consumption of, respectively, renewable energies and non-renewable ones is mostly driven by growing demand for the former, and declining demand for the latter. Thus, I imagined a relatively high small-user price for renewable energy and a large fork between that PA;R and the PB;R. As for non-renewables, the fork in prices is more restrained (than in the market of renewables), and its top value is relatively lower. The non-renewable power installations are almost fed up with investment INR, whilst the renewables could still do with more capital IR in productive assets. The LCOENR of non-renewables is relatively high, although not very: yes, you need to pay for the fuel itself, but you have economies of scale. As for the LCOER of renewables, it is pretty low, which actually reflects the present situation in the market.

The last part of my input tensor regards the crowdfunded capital K. I assumed two different initial situations. In the first one, there is virtually no crowdfunding, thus a very low K. In the second one, some crowdfunding is already alive and kicking, and it sits slightly above half of what people in the industry would expect.

Once again, I applied those qualitative assumptions to a set of pseudo-random values between 0 and 1. Here comes the result, in the table below.

 

Table 1 – The initial values for learning in the perceptron

Tensor      Variable   The Market with virtually no crowdfunding   The Market with significant crowdfunding
Input TI    LCOER      0,26                                          0,26
            LCOENR     0,48                                          0,48
            KR         0,01                  <= !! =>                0,56
            KNR        0,01                                          0,52
            IR         0,46                                          0,46
            INR        0,99                                          0,99
            PA;R       0,71                                          0,71
            PA;NR      0,46                                          0,46
            PB;R       0,20                                          0,20
            PB;NR      0,37                                          0,37
Output TO   QR/N       0,95                                          0,95
            QNR/N      0,48                                          0,48
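For readers who like to see the nuts and bolts, here is a minimal sketch, in Python, of the learning loop described above, as I reconstruct my own procedure: the first neuron draws random weights, the second one applies the neural function, and the summed local error (deviation times local derivative) is fed back into the input values at every round. The starting point below is the ‘significant crowdfunding’ column of Table 1; it is an illustration, not the exact file I worked with.

import numpy as np

def run_learning(TI, TO, activation, derivative, rounds=5000):
    # one instance of learning: 'rounds' consecutive experiments
    x = np.array(TI, dtype=float)
    target = np.array(TO, dtype=float)
    errors = []
    for _ in range(rounds):
        # neuron 1: random weighting coefficients, one per input variable and per output
        weights = np.random.rand(len(x), len(target))
        estimate = activation(x @ weights)          # neuron 2: hypothetical output
        deviation = target - estimate               # distance from the fixed output tensor
        local_error = (deviation * derivative(estimate)).sum()
        x = x + local_error                         # the summed error goes back into the input
        errors.append(local_error)
    return x, sum(errors), errors

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
sigmoid_prime = lambda y: y * (1.0 - y)             # derivative written in terms of the output

TI = [0.26, 0.48, 0.56, 0.52, 0.46, 0.99, 0.71, 0.46, 0.20, 0.37]
TO = [0.95, 0.48]
learnt, cumulative_error, error_path = run_learning(TI, TO, sigmoid, sigmoid_prime)
print(np.round(learnt, 4), round(cumulative_error, 2))

Swapping np.tanh, together with its derivative 1 − y², in place of the sigmoid gives the hyper-tangential version of the same loop.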

 

The way the perceptron works means that it generates and feeds back local errors in each round of experimentation. Logically, over the 5000 rounds of experimentation, each input variable gathers those local errors, like a snowball rolling downhill. I take the values of input variables from the last, i.e. the 5000th round: they carry the initial values from the table above and, on top of them, the cumulative error from the 5000 experiments. How to standardize them, so as to make them comparable with the initial ones? I observe: all those final values carry the same cumulative error, across the whole input tensor TI. I choose a simple method of standardization. As the initial values were standardized over the interval between 0 and 1, I standardize the resulting values over the interval 0 ≤ x ≤ (1 + cumulative error).
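Here is a minimal sketch of that re-scaling step, under the assumption that every learnt input value equals its initial, 0-to-1 standardized level plus the same cumulative error; the figures in the comments roughly reproduce instance 1 of Table 2, which is how I sanity-check the method.

def restandardize(learnt_values, cumulative_error):
    # initial values lived between 0 and 1; after learning they live between
    # 0 and (1 + cumulative error), so we divide by that stretched ceiling
    ceiling = 1.0 + cumulative_error
    return [value / ceiling for value in learnt_values]

# LCOER started at 0,26 and LCOENR at 0,48; with a cumulative error of about 2,11
# the re-standardized results come out close to the 0,7617 and 0,8340 shown in Table 2
print(restandardize([0.26 + 2.11, 0.48 + 2.11], 2.11))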

I observe the unfolding of the cumulative error along the path of learning, made of 5000 steps. There is a peculiarity in each of the neural functions used: the sigmoid, and the hyperbolic tangent. The sigmoid learns in a slightly Hitchcockian way. Initially, local errors just rocket up. It is as if that sigmoid was initially yelling: ‘F******k! What a ride!’. Then, the value of errors drops very sharply, down to something akin to a vanishing tremor, and starts hovering lazily over some implicit asymptote. The hyperbolic tangent learns differently. It seems to do all it can to minimize local errors whenever possible. Obviously, it is not always possible. Every now and then, that hyperbolic tangent produces an explosively high value of local error, like a sudden earthquake, just to go back into forced calm right after. You can observe those two radically different ways of learning in the two graphs below.
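The two graphs are easy to reproduce with the run_learning() sketch given after Table 1 (this fragment assumes that function, together with sigmoid and sigmoid_prime, is already in scope); the hyperbolic tangent and its derivative are the only new ingredients.

import numpy as np
import matplotlib.pyplot as plt

TI = [0.26, 0.48, 0.56, 0.52, 0.46, 0.99, 0.71, 0.46, 0.20, 0.37]
TO = [0.95, 0.48]
tanh_prime = lambda y: 1.0 - y ** 2   # derivative of tanh, written in terms of its output

_, _, sigmoid_errors = run_learning(TI, TO, sigmoid, sigmoid_prime)
_, _, tanh_errors = run_learning(TI, TO, np.tanh, tanh_prime)

plt.plot(sigmoid_errors, label="sigmoid")
plt.plot(tanh_errors, label="hyperbolic tangent")
plt.xlabel("round of experimentation")
plt.ylabel("local error fed back into the input")
plt.legend()
plt.show()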

The two ways of learning – the sigmoidal one and the hyper-tangential one – bring interestingly different results, and the results of learning differ just as much depending on the initial assumptions about the crowdfunded capital K. Tables 2 – 5, further below, list the results I got. A bit of additional explanation will not hurt. For every version of learning, i.e. sigmoid vs hyperbolic tangent, and K = 0,01 vs K ≈ 0,5, I ran 5 instances of 5000 rounds of learning in my perceptron. This is the meaning of the word ‘Instance’ in those tables. One instance is like a tensor of learning: one happening of 5000 consecutive experiments. The values of output variables remain constant all the time: TO = {QR/N = 0,95; QNR/N = 0,48}. The perceptron sweats in order to come up with some interesting combination of input variables, given this precise tensor of output.
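For clarity, here is a sketch of that experimental design, i.e. five instances of 5000 rounds for every combination of neural function and initial crowdfunding level; it assumes, once again, the run_learning() sketch given after Table 1, and it only prints the cumulative error of each instance.

import numpy as np

head = [0.26, 0.48]                               # LCOER, LCOENR
tail = [0.46, 0.99, 0.71, 0.46, 0.20, 0.37]       # IR, INR, PA;R, PA;NR, PB;R, PB;NR
TO = [0.95, 0.48]
scenarios = {"no initial crowdfunding": (0.01, 0.01),
             "significant crowdfunding": (0.56, 0.52)}
functions = {"sigmoid": (sigmoid, sigmoid_prime),
             "hyperbolic tangent": (np.tanh, lambda y: 1.0 - y ** 2)}

for scenario, (kr, knr) in scenarios.items():
    for name, (act, deriv) in functions.items():
        for instance in range(1, 6):
            _, cumulative_error, _ = run_learning(head + [kr, knr] + tail, TO, act, deriv)
            print(scenario, "|", name, "| instance", instance,
                  "| cumulative error", round(cumulative_error, 2))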

 

Table 2 – Outcomes of learning with the sigmoid, no initial crowdfunding

 

The learnt values of input variables after 5000 rounds of learning
Learning with the sigmoid, no initial crowdfunding
Instance 1 Instance 2 Instance 3 Instance 4 Instance 5
cumulative error 2,11 2,11 2,09 2,12 2,16
LCOER 0,7617 0,7614 0,7678 0,7599 0,7515
LCOENR 0,8340 0,8337 0,8406 0,8321 0,8228
KR 0,6820 0,6817 0,6875 0,6804 0,6729
KNR 0,6820 0,6817 0,6875 0,6804 0,6729
IR 0,8266 0,8262 0,8332 0,8246 0,8155
INR 0,9966 0,9962 1,0045 0,9943 0,9832
PA;R 0,9062 0,9058 0,9134 0,9041 0,8940
PA;NR 0,8266 0,8263 0,8332 0,8247 0,8155
PB;R 0,7443 0,7440 0,7502 0,7425 0,7343
PB;NR 0,7981 0,7977 0,8044 0,7962 0,7873

 

 

Table 3 – Outcomes of learning with the sigmoid, with substantial initial crowdfunding

 

The learnt values of input variables after 5000 rounds of learning
Learning with the sigmoid, substantial initial crowdfunding
Instance 1 Instance 2 Instance 3 Instance 4 Instance 5
cumulative error 1,98 2,01 2,07 2,03 1,96
LCOER 0,7511 0,7536 0,7579 0,7554 0,7494
LCOENR 0,8267 0,8284 0,8314 0,8296 0,8255
KR 0,8514 0,8529 0,8555 0,8540 0,8504
KNR 0,8380 0,8396 0,8424 0,8407 0,8369
IR 0,8189 0,8207 0,8238 0,8220 0,8177
INR 0,9965 0,9965 0,9966 0,9965 0,9965
PA;R 0,9020 0,9030 0,9047 0,9037 0,9014
PA;NR 0,8189 0,8208 0,8239 0,8220 0,8177
PB;R 0,7329 0,7356 0,7402 0,7375 0,7311
PB;NR 0,7891 0,7913 0,7949 0,7927 0,7877

 

 

 

 

 

Table 4 – Outcomes of learning with the hyperbolic tangent, no initial crowdfunding

 

The learnt values of input variables after 5000 rounds of learning
Learning with the hyperbolic tangent, no initial crowdfunding
Instance 1 Instance 2 Instance 3 Instance 4 Instance 5
cumulative error 1,1 1,27 0,69 0,77 0,88
LCOER 0,6470 0,6735 0,5599 0,5805 0,6062
LCOENR 0,7541 0,7726 0,6934 0,7078 0,7257
KR 0,5290 0,5644 0,4127 0,4403 0,4746
KNR 0,5290 0,5644 0,4127 0,4403 0,4746
IR 0,7431 0,7624 0,6797 0,6947 0,7134
INR 0,9950 0,9954 0,9938 0,9941 0,9944
PA;R 0,8611 0,8715 0,8267 0,8349 0,8450
PA;NR 0,7432 0,7625 0,6798 0,6948 0,7135
PB;R 0,6212 0,6497 0,5277 0,5499 0,5774
PB;NR 0,7009 0,7234 0,6271 0,6446 0,6663

 

 

Table 5 – Outcomes of learning with the hyperbolic tangent, substantial initial crowdfunding

 

The learnt values of input variables after 5000 rounds of learning
Learning with the hyperbolic tangent, substantial initial crowdfunding
Instance 1 Instance 2 Instance 3 Instance 4 Instance 5
cumulative error -0,33 0,2 -0,06 0,98 -0,25
LCOER (0,1089) 0,3800 0,2100 0,6245 0,0110
LCOENR 0,2276 0,5681 0,4497 0,7384 0,3111
KR 0,3381 0,6299 0,5284 0,7758 0,4096
KNR 0,2780 0,5963 0,4856 0,7555 0,3560
IR 0,1930 0,5488 0,4251 0,7267 0,2802
INR 0,9843 0,9912 0,9888 0,9947 0,9860
PA;R 0,5635 0,7559 0,6890 0,8522 0,6107
PA;NR 0,1933 0,5489 0,4252 0,7268 0,2804
PB;R (0,1899) 0,3347 0,1522 0,5971 (0,0613)
PB;NR 0,0604 0,4747 0,3306 0,6818 0,1620

 

The cumulative error, the first numerical line in each table, is something like memory. It is a numerical expression of how much experience the perceptron has accumulated in the given instance of learning. Generally, the sigmoid neural function accumulates more memory, as compared to the hyper-tangential one. Interesting. The way of processing information affects the amount of experiential data stored in the process. If you use the links I gave earlier, you will see different logical structures in those two functions. The sigmoid generally smooths out anything it receives as input. It puts the incoming, compound signal in the negative exponent of Euler’s constant e ≈ 2,72, and then uses the result in the denominator, i.e. it computes 1/(1 + e^(−x)). The sigmoid is like a bumper: it absorbs shocks. The hyperbolic tangent is different. It sort of exposes small discrepancies in input. In human terms, the hyper-tangential function is more vigilant than the sigmoid. As can be observed in this precise case, absorbing shocks leads to more accumulated experience than vigilantly reacting to observable change.

The difference in cumulative error, observable between the sigmoid-based perceptron and the one based on the hyperbolic tangent, is particularly sharp in the case of a market with substantial initial crowdfunding K. In 3 instances out of 5, in that scenario, the hyper-tangential perceptron yields a negative cumulative error. It can be interpreted as the removal of some memory, implicitly contained in the initial values of input variables. When the initial K is assumed to be 0,01, the difference in accumulated memory, observable between the two neural functions, shrinks significantly. It looks as if K ≥ 0,5 were some kind of disturbance that the vigilant hyperbolic tangent attempts to eliminate. That impression of disturbance created by K ≥ 0,5 is even reinforced as I synthetically compare all four sets of outcomes, i.e. tables 2 – 5. The case of learning with the hyperbolic tangent, and with substantial initial crowdfunding, looks radically different from everything else. The discrepancy between alternative instances seems to be the greatest in this case, and the occasionally negative values in the input tensor suggest some kind of deep shake-up. Negative prices and/or negative costs mean that someone external is paying for the ride, probably the taxpayers, in the form of some fiscal stimulation.

I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful if you suggest me two things, which Patreon suggests I should ask you about. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?

Sheer misery, in short

 

My editorial on You Tube

 

I am coming back to my research on the energy market and, for the umpteenth time, I am trying to formalize, in a scientifically rigorous way, my EneFin concept, i.e. that of a financial solution for promoting the development of renewable energies. Last year, I had already circled around the subject quite a lot, and there is a certain je ne sais quoi of obscurity in there. Something that blocks me intellectually. You can consult, for example, « Alois in the middle » to learn more about my conceptual contortions on this topic.

So, one more approach. I open canonically, with the premises, i.e. with the reasons why anyone should care at all about this whole business. In a report on the energy sector published by IRENA (International Renewable Energy Agency), two observations give me a bit of an (economic) itch. On the one hand, the cost of generating renewable energy, the so-called LCOE (Levelized Cost of Electricity), has dropped abruptly in recent years. On the other hand, investment in new capacity for generating renewables has dropped too. Renewable energies have the peculiarity of costing nothing as such; what costs money is setting up and maintaining the technologies. See « Ce que le prof en moi veut dire sur LCOE » to learn more. All in all, renewable energy technologies seem to be entering a phase of commoditisation. The technology costs less and less, it loses value faster and faster, and its final product gets cheaper and cheaper too. On top of that, well-structured energy markets tend to develop two price zones: relatively low prices for big institutional consumers, and higher ones for small individual consumers (as well as small institutional ones). You can consult « Deux cerveaux, légèrement différents l’un de l’autre » to learn more.

There is another thing that is starting to take shape in my recent research. The quantitative development of the energy sector in general seems to happen through abrupt, short-term shocks rather than through long trends. On this precise subject, I recently churned out an article, and I am trying to convince someone that it makes sense. The draft is entitled « Apprehending Energy Efficiency ».

I carry on canonically, still. The objective of my research is to devise a financing mechanism for small, local installations of renewable energy, which would use precisely these two phenomena: the price disparity that manifests itself as the market develops and structures itself, and the industry’s predisposition to react to shocks rather than to gentle, patient stimuli. My working hypothesis is that the observable disparity in energy prices can be used to create controlled, local financial shocks, which in turn can stimulate the development of said small, local installations.

The general method for exploring and verifying this hypothesis consists in testing, from several different angles, a financial scheme that exploits, precisely, the disparity of prices. A local energy supplier sells a certain quantity Q of energy to equally local consumers at a relatively high price, PA, typical for the market of small consumers, but it sells that energy in complex packages, which contain energy strictly spoken, priced at the relatively cheap PB normally reserved for big industrial consumers, plus participatory titles in the supplier’s equity. These participatory titles represent a set of claims on the supplier’s assets, and the book value of those claims is K. The market value of the energy sold is Q*PB = E. The aggregate margin Q*(PA – PB), created by selling the quantity Q of energy, is thus equivalent to capital K invested in the energy supplier’s balance sheet. Logically, the market value that the energy Q would have at the small-consumer price PA equals the sum of the capital K and the market value E. In the equations below, I give the general idea.

 

Q*(PA – PB) = K

Q*PB = E

Q*PA = K + E

PA > PB

PB  ≥  LCOE

My next idea is to explore the feasibility conditions of this financial scheme, and to optimize it. The general structure of the cost of producing energy, thus of the LCOE, says that the quantity of energy produced is a function of the capital invested in production capacity. The capital K in my equations stays in a certain proportion to the capital I invested in productive assets. Consequently, K has a functional influence on Q, and this is how the function f1, such that f1(K) = Q, enters the game. The same logical structure of the LCOE suggests that energy prices are manifestations of the way capital K is used, since they depend on the coefficient K/Q, and at the same time they depend on the competitive structure of the market as well as on its institutional structure. Only that would be too simple. Keynesian logic suggests it works the other way round as well: the capital I invested in production capacity, as well as its fraction K collected through the financial scheme I have just sketched, both depend on the prices and on the quantities of energy produced.

So I have here a nice little logical knot: variables that depend mutually on one another. This is also a good opportunity to take one more step out of my cave of a classical, Smithian economist, and to turn towards artificial intelligence and neural networks. I assume, then, that the energy sector is an intelligent structure, capable of adapting to the imperatives of human civilization – surviving and having access to Netflix – and that this adaptation can take place along the two paths that any intelligence worthy of the name follows: experimentation and abstraction.

So I imagine an intelligent structure more or less conforming to those equations above. What I want is an abundant supply of renewable energy. ‘Abundant’ is one aspect of the thing, ‘renewable’ is another. As for the abundance of energy, final annual consumption per capita, frequently measured in kilograms (or tonnes) of oil equivalent, seems to be a measure with a strong empirical footing. I structure that relative abundance into two types: renewable and non-renewable. Here I repeat a remark about this classification, a remark I had already made in « Les 2326 kWh de civilisation »: formally, when we burn biofuels, such as straw or sawdust, it counts as renewable in the sense that it is neither fossil nor nuclear fission. Still, you have to come to where I live to understand that this precise type of renewable is not exactly sustainable in the long run. Do you want to literally see what you breathe, without being able to see much else? Well, come to Krakow, Poland, during the heating season. You will see for yourselves what the abundant use of biofuels means.

In any case, my intelligent structure distinguishes two sub-categories of Q (I know, the pun, let’s move on please): QR/N for the consumption of renewable energy per capita, and QNR/N for non-renewables per the same capita. Well, not quite the same capita, since the capita that runs its life on renewables most probably displays behavioural patterns different from the one that still sticks to fossils when it comes to feeding the fridge. I want QR/N to grow, and I want to put QNR/N gently on the back burner, just in case another glaciation comes along and there is a need to warm the planet up, just a touch.

In any case, I have two output variables: [QR/N] and [QNR/N]. My intelligent structure can follow four alternative paths of change. The most desirable of the four is the one where [QR/N] grows and [QNR/N] declines, in negative correlation. The other three are the following: i) [QR/N] declines and [QNR/N] grows, in negative correlation, ii) [QR/N] and [QNR/N] both decline, and finally iii) the case where [QR/N] and [QNR/N] grow in concert.

My input variables are, first of all, the energy prices PA and PB, possibly sub-categorized into prices of renewable and non-renewable energy. One of the things I would like to see nicely simulated by a neural network is precisely that ‘possibly sub-categorized’. Which path of trial and error leads to convergence between the prices of renewables and those of fossils? Which other path can lead to divergence? What brackets of convergence or divergence can appear along those paths? What is the relation with the LCOE? These are interesting things to explore.

Two other input variables are those pertaining to capital: the capital I invested in production capacity, and its subset K, collected through the financial scheme I presented a few paragraphs earlier.

All in all, I land with two tensors: the output tensor TS and the input tensor TE. The input tensor decomposes as TE = [(LCOER), (LCOENR), (KR), (KNR), (IR), (INR), (PA;R), (PA;NR), (PB;R), (PB;NR)], and the output one is TS = [(QR/N), (QNR/N)]. Action at the level of TE produces a result at the level of TS. A neural network can connect the two tensors through two kinds of function: experimentation and abstraction.

Experimentation can take place through a multi-layer perceptron. I take up the same simple algorithm I had already mentioned in « Ce petit train-train des petits signaux locaux d’inquiétude ». So I take my two tensors and I create a first set of empirical values, one value per variable. I standardize them over the interval between 0 and 1. That means that the price (PB;R), for example, is expressed as a percentage of the maximum price observed in the market. If I write PB;R = 0,16, it is a local price equal to 16% of the maximum price ever observed in that precise market. The other variables are standardized the same way.

Now, I do something rather unusual – as far as I know – in the application of neural networks. The normal practice is to give our algorithm a data set as large as possible in the learning phase – to discover the most plausible intervals for each variable – and then to optimize the model on the basis of that learning. Me, I want to observe the way the perceptron learns. I do not want to optimize yet, in the strict sense of the term.

So I take that first set of standardized empirical values for my two tensors. Here they are, in Table 1, below:

 

Table 1

Tensor  Variable  Initial standardized value
TE   LCOER     0,26
     LCOENR    0,48
     KR        0,56
     KNR       0,52
     IR        0,46
     INR       0,99
     PA;R      0,71
     PA;NR     0,46
     PB;R      0,20
     PB;NR     0,37
TS   QR/N      0,95
     QNR/N     0,48

 

The initial situation I simulate is thus one where the consumption of renewable energy per capita QR/N is close to the maximum empirically observable in the sector, whilst the consumption of non-renewables QNR/N is roughly at half (48%) of its respective max. The advantageous energy prices, reserved for large consumers, are respectively at PB;R = 20% and PB;NR = 37% of their observable maxima. The higher prices, normally paid by small users, including households, are at PA;R = 71% of the max for renewables and PA;NR = 46% for non-renewables. The initial margins PA – PB are thus respectively PA;R – PB;R = 71% – 20% = 51% for renewables, and PA;NR – PB;NR = 46% – 37% = 9% for non-renewables.

So this is an initial market where relatively high demand for renewable energies creates a price fork particularly unfavourable to those small clients who want to consume only that type of energy. At the same time, non-renewables are a bit less in demand, and consequently the same price fork PA – PB is much narrower in their case.

The amounts of capital collected through crowdfunding platforms, thus my K, are at KR = 56% of the max for suppliers of renewable energy, and KNR = 52% in the market of non-renewables. Now I come back to my model, more specifically to the equation Q*(PA – PB) = K. With the quantities and prices simulated here, and with the assumption of population N = constant, KR should be at QR*(PA;R – PB;R) = 0,95*(0,71 – 0,2) = 0,4845, whilst its arbitrary initial value is 0,56. Renewables are therefore slightly over-financed through the participative mechanism. For non-renewables, the same calculation reads KNR = QNR*(PA;NR – PB;NR) = 0,48*(0,46 – 0,37) = 0,0432, thus waaay below the KNR = 52% arbitrarily fixed as the initial value. If renewables are slightly over-financed, non-renewables are frankly swimming in unbalanced money.
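The same little test, spelled out as a sketch in Python for anyone who wants to replay it with their own initial values:

def implied_k(q, p_a, p_b):
    # the scheme's balance condition: Q*(PA - PB) = K, with population N held constant
    return q * (p_a - p_b)

# renewables: Q = 0,95, PA;R = 0,71, PB;R = 0,20, against an arbitrary KR = 0,56
print(implied_k(0.95, 0.71, 0.20))   # 0.4845 -> slightly over-financed
# non-renewables: Q = 0,48, PA;NR = 0,46, PB;NR = 0,37, against an arbitrary KNR = 0,52
print(implied_k(0.48, 0.46, 0.37))   # about 0.0432 -> heavily over-financed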

As for investment I in production capacity, it is initially fixed at IR = 0,46 for renewables and INR = 0,99 for non-renewables. Renewables are thus clearly under-invested, whilst fossils and nuclear fission are spoiled in terms of productive assets.

The costs of producing energy, thus the LCOE, are perhaps the hardest to express in standardized values. Indeed, when you look at the economic meaning of the LCOE, the way it moves seems to matter more than the way it stays in place. The initial values I fixed, namely LCOER = 0,26 and LCOENR = 0,48, are an attempt to recreate the present situation in the energy sector, where the LCOE of renewables plunges head first, whilst the LCOE of non-renewables follows a descending trajectory too, although a much more respectable one in its descent.

So, my little perceptron. It is made of just two neurons, one after the other. The first one deals directly with the stimuli of the input tensor TE = [(LCOER), (LCOENR), (KR), (KNR), (IR), (INR), (PA;R), (PA;NR), (PB;R), (PB;NR)], and it assigns a weighting coefficient to each variable of that tensor. It is like those superficial neurons connected to our sensory apparatus, which decide whether it is more important to deal with that big brown spot growing very fast (the grizzly bear charging at me), or with that luminous disc slowly turning from orange to yellow (the sun in the sky).

I don’t know about you, but me, I would rather deal with the bear. It seems a bit more pressing, as sensory stimulation goes. Still, that first-layer neuron has the freedom to experiment with the relative importance of things. It assigns random weighting coefficients to each variable of the tensor TE. It produces an information cocktail of the form: TE(transformed) = [(LCOER)*p1 + (LCOENR)*p2 + (KR)*p3 + (KNR)*p4 + (IR)*p5 + (INR)*p6 + (PA;R)*p7 + (PA;NR)*p8 + (PB;R)*p9 + (PB;NR)*p10]. The ‘pi’ are precisely the weighting coefficients that the first neuron assigns to the input variables.

The second neuron, which consults the first one about what is going on, is the intellectual of the lot. It disposes of a neural transformation function, based, as a general rule, on the exponential function. The tensor TE(transformed) produced by the first neuron is first negated, thus ‘– TE(transformed)’, and that negative is then put in the exponent of the constant e = 2,72 etc. We are thus playing with e^(–TE(transformed)). That done, the intellectual has two usual ways of making cognitive use of it: as a sigmoid, or as a hyperbolic tangent. I have just discovered that this distinction matters, in this precise case. I will come back to it later. In any case, this transformation function – sigmoid or hyperbolic tangent – serves to produce a hypothetical value of the output variables, thus of the tensor TS = [(QR/N), (QNR/N)]. That done, the intellectual neuron computes the local derivative of that hypothetical result, as well as the deviation of said result from the original values TS = [(QR/N) = 0,95; (QNR/N) = 0,48]. The derivative multiplied by the deviation gives a local error. The sum of those local errors is then transmitted back to the first neuron, that janitor at the entrance of the system, with the command: ‘Add this, please, to the initial values of TE, then transform it once again and give me the new value TE(transformed)’.

This repeats itself, over and over again. I opted for 5000 rounds of that ‘over again’, and I observed the learning process of my two neurons. More precisely, I observed the value of the cumulative error (thus over the two output variables) as a function of learning time. Here, the first difference jumps to the eye as regards the neural function applied. I present it in the form of two graphs, below. If the intellectual neuron of the family uses the sigmoid function, the learning process tends to reduce the experimental error rather quickly, and then to oscillate within a much smaller interval. It is a pattern of the type ‘a shock followed by progressive adaptation’. By contrast, the hyperbolic tangent learns through the deliberate creation of impressive shocks, interspersed with long periods of calm.

 

So here are two very different learning paths, and they produce very different results. Table 2, below, presents the values learnt by the two versions of my network. The sigmoid advises pumping up the relative value of all the input variables, whilst the hyperbolic tangent is of the opinion that the only variable of TE worth maximizing is investment in the production capacity of non-renewables, and that the rest should be disciplined. The most intriguing part is the negative values of LCOER and PB;R. LCOER = – 0,11 probably means either strong fiscal stimulation, or a situation where suppliers of renewable energy sell off their productive assets en masse. PB;R = – 0,19 is most likely a call for fiscal stimulation of the prices of renewable energy.

So the sigmoid turns liberal, and the hyperbolic tangent turns statist-interventionist. One more little test with the equation Q*(PA – PB) = K. The values recommended by the liberal sigmoid give QR*(PA;R – PB;R) = 0,95*(0,90 – 0,73) = 0,1615 and QNR*(PA;NR – PB;NR) = 0,48*(0,82 – 0,79) = 0,0144, against the K learnt independently as KR = 0,85 and KNR = 0,84. The liberal sigmoid thus wants to monetize the energy sector significantly. More liquid capital means more flexibility, and a much shorter life cycle for the technologies in place.

The interventionist hyperbolic tangent recommends QR*(PA;R – PB;R) = 0,95*[0,56 – (-0,19)] = 0,7125 and QNR*(PA;NR – PB;NR) = 0,48*(0,19 – 0,06) = 0,0624, against KR = 0,34 and KNR = 0,28. Definitely less money collected through crowdfunding. Sheer misery, in short.

 

Table 2

Tensor  Variable  Initial standardized value  Value learnt by the network based on the sigmoid function  Value learnt by the network based on the hyperbolic tangent
TE   LCOER     0,26     0,75     (0,11)
     LCOENR    0,48     0,83     0,23
     KR        0,56     0,85     0,34
     KNR       0,52     0,84     0,28
     IR        0,46     0,82     0,19
     INR       0,99     1,00     0,98
     PA;R      0,71     0,90     0,56
     PA;NR     0,46     0,82     0,19
     PB;R      0,20     0,73     (0,19)
     PB;NR     0,37     0,79     0,06
TS   QR/N      0,95
     QNR/N     0,48

 

I keep delivering good science to you, almost brand new, just a bit dented in the design process. I remind you that you can download the business plan of the BeFund project (also available in English). You can also download my book entitled “Capitalism and Political Power”. I want to use crowdfunding to give myself a financial footing in this effort. You can support my research financially, according to your best judgment, through my PayPal account. You can also register as my patron on my Patreon page. If you do so, I will be grateful for pointing out two important things to me: what kind of reward do you expect in exchange for your patronage, and what stages would you like to see in my work?

Karl Lagerfeld and some guest blogging from Emilien Chalancon, my student

My editorial on You Tube

 

This time, instead of publishing my own train of thought, I am publishing the work of my student, CHALANCON Emilien, from Université Jean Monnet, Saint-Etienne, France (Department : Business and Administration, IUT Saint-Etienne). This is an essay prepared for a course in International Economic Relations, and devoted to the phenomenon of the so-called Yellow Jacket Movement.

I publish my students’ work, with their consent, of course, when I find a piece of writing particularly mature, with a sharp scientific edge to it. Here comes a piece of writing that shows intellectual maturity and the capacity to think beyond political correctness.

When I read any news about the Yellow Jacket Movement, I recall that public-service advertising campaign from 2008, precisely about yellow vests and road safety, featuring Karl Lagerfeld saying: ‘C’est jaune, c’est moche, mais ça peut vous sauver la vie’, which, in English, means: ‘It’s yellow, it’s ugly, but it can save your life’. Go figure why I make this association; still, I tend to perceive French people in yellow vests as sort of dashing and trendy. Following some trends, anyway.

 

Here comes the exact wording of the essay by Emilien Chalancon.

 

FRANCE and Yellow Jacket movement

 

 To begin…

Liberté. Egalité. Fraternité. Over time, one nation has differentiated itself from others by its freedom, social system and diversity: FRANCE. In fact, this state was the creator of the notion of “soft power” and of the social security system. As you know, a big part of our medical treatment, school education and unemployment benefits is taken care of by the state; it’s free. We call that “sécurité sociale”.

Citizens have the deep conviction that workers must pay contributions for people in need. That’s France: mutual help and respect. This is due to a national redistribution of wealth: 20% of revenue goes to the tax system. The People is the Nation.

But, unfortunately, what made the greatness of my country before is today a national and international element of discord. A part of the population is fed up with the government and with the deductions from our incomes. For some French people, the contributions for the unemployed and for health insurance are too high.

Furthermore, citizens have to pay the wages of civil servants (5 million of them live in France). This collective tax burden, which increases every year, pushes people to develop hatred towards the government.

The social system is also constantly called into question by the crisis and by many national protest movements. The latest one is the most symbolic and violent of this century for my state. Since November, the “yellow jackets” have been rioting with strong aggressiveness, and this has affected the whole country.

The most alarming thing is that it spreads to the economic and financial levels inside and outside France (employment rate, Gross Domestic Product, national stock exchange). This last idea brings us to the following question: What are the economic and financial consequences of the Yellow Jacket movement at the domestic and international level for France?

First, we will explain the yellow jackets movement. Secondly, we will look at the economic and financial consequences at the domestic level. And then, we will finish with the consequences at the international scale.

 

I - Origin of the Yellow Jackets

II - Economic and financial consequences at the domestic level

III - Economic and financial consequences at the international scale

I - Origin of the Yellow Jackets

For a few years now, the French government has had to face endogenous and exogenous elements. The financial crisis of 2008 pushed the State to raise taxes and the cost of living, and made the public debt grow dramatically. This was necessary because of our social system: as I said before, it is based on an equal distribution of income. But people’s salaries did not follow that trend. A certain number of inhabitants are still not able to pay taxes and to live.

The prices of food, clothes, gas, rent and petrol are always increasing. In November 2018, the government raised prices by 8 centimes for petrol and 10 centimes for gasoline (= 0,35 PLN). This was the straw that broke the camel’s back! People are exhausted with the government.

From November 17th, 2018, the movement has been organized around road blockades on highways and national roads. After much violence with the police and no dialogue with the President, yellow jacket demonstrators started to damage symbolic national monuments. Actually, they want to overthrow the established government and believe that Mr. Macron did not listen to the “national anger”.

But why did people react just to that?

From my point of view, I do not understand the reaction. In our history, my people have always made crazy revolutions for very soft reasons. For 2 years, our economy has been in the top 7 on the international ladder, and we also won the last football World Cup in Russia (winning a football cup creates a very powerful social link in a country).

The election of Mr. Macron (pro-entrepreneurial) encouraged many citizens to create their own businesses and promoted commerce. I mean, the environment was very enjoyable. But a little rise in the price of petrol, and everything goes wrong…

To finish with, I think life is too comfortable in France. We are used to living with excessive wellbeing thanks to our social care system. A little inconvenience makes people mad. This is another argument that explains the creation of the Yellow Jacket movement.

The issue is very dramatic for the economy and the finances of France: the Minister of the Economy, Mr. Le Maire, speaks about a “disaster”.

II - Economic and financial consequences at the domestic level

Economic aspect:

First, France has the world’s 6th largest economy by 2018 nominal figures and the 10th largest by PPP figures. It is the 3rd largest economy in the European Union. It is an agricultural, industrial and tourist power on both a European and a global scale. France has a diversified economy: industry, tourism, luxury goods, aerospace engineering.

The industrial sector (chemicals, metallurgy, automobile) is one of the most developed and profitable for the State (12.4% of GDP). There are big groups like Boiron, SNF, ArcelorMittal, Omerin and so on…

The Yellow Jacket movement has strongly compromised the activity of these companies. In fact, among the 280 000 participants in the demonstrations, a large number are workers and other employees of these industrial enterprises. Due to the significant number of demonstrators across the country, businesses have difficulty hiring qualified workers on a large scale.

Furthermore, tourism is another big source of revenue (84 million international tourist arrivals every year). The principal destination is of course Paris. Unfortunately, the damage caused by demonstrators to l’Arc de Triomphe, as well as to le Palais des Tuileries and les Champs-Elysées, dramatically decreased the number of tourists. From November 23 to 27, according to the MKG Consulting Observatory – OlaKala Destination, between 20,000 and 25,000 overnight stays had already been canceled for the whole of December.

All these elements have huge impacts on the economy. On December 10, the Minister of the Economy linked the Yellow Vests movement to the decline in growth, citing a decrease of 0.1 point in the growth of our national wealth.

On December 10, the national bank “Banque de France” announced that it had halved its forecast of the French Gross Domestic Product growth rate because of the Yellow Jacket movement.

We could see an increase in business failures early next year, not only among small traders but also among larger distributors.

This type of event tends to accelerate the structural shift of consumers towards online shopping. 41,000 employees were placed on short-time work or became unemployed because of the closure of businesses.

And finally, the movement has several impacts on other economic sectors: the decline in turnover is estimated at 15% to 25% in supermarkets, 20% to 40% in the retail trade, and 20% to 50% in food service.

Financial aspect: CAC 40 affected

The CAC 40 is a benchmark French stock market index. The index represents a capitalization-weighted measure of the 40 most significant stocks among the 100 highest market caps on Euronext Paris. These stocks correspond to major, highly profitable French companies like Airbus, BNP Paribas, Carrefour, LVMH. The stock exchange, called the “Bourse de Paris”, is located in the business district of “La Défense” in Paris.

The Yellow Vests rioted with strong hostility in “La Défense” against the Chief Executive Officers of CAC 40 companies. This violent demonstration forced some executives to move temporarily to other European financial centres.

The CAC 40 reduced its gains, penalized by Vinci (-3.1%), Carrefour (-3.3%) and AccorHotels (-1.5%): three stocks that fell victim to fears related to the national protest movement of the Yellow Vests.

At the “Bourse de Paris”, the shares of the incumbent operators (EDF and Engie), which supply electricity and gas at regulated tariffs, fell by 7% and 2.9% respectively between December 4 and 10, following the rise in the petrol price.

In the supermarket sector, Carrefour shares lost 13% in one month, and Casino 4%.

French government bondholders now require 45 basis points more compensation than for a 10-year Bund (a first in the history of my state).

III - Economic and financial consequences at the international scale

Economic aspect:

Before the Yellow Jackets

France has always been a model of diplomacy and of the welfare state. The election of Mr. Macron in May 2017 attracted many foreign companies to settle their businesses in France, close to the borders, and reinforced international relations with our partners (Germany, Spain, USA). Our top 10 partners, which account for two thirds of our trade (67%), remain mostly European countries and continue to be so. Germany retains its first position, its weight in our trade, which has decreased very little since 2012 (from 17.2% to 16.9% in 2016), and its gap with our other partners.

Moreover, China even agreed to buy an incredible amount of French public debt thanks to a secure political climate at the end of 2017. This operation also reflects a strong wish to implant Chinese enterprises in France and to influence the economic decisions of the French government (if you control the public debt of a country, then you control its domestic government).

But, following the Yellow Vests riots, all these plans have been completely modified.

Yellow Vests impact

For the rest of the world, the Yellow Jacket movement constitutes a second French revolution. This national spectacle triggered many discussions in other countries about the social and economic stability of my nation. In Europe as in the United States, the media once again put the “yellow vests” on the front page. The image of Macron, so far very favorable, has been degraded. And we all know that Chief Executive Officers of giant groups pay very close attention to the media and to the social stability of a country before setting up their companies there.

The demonstrations in Bordeaux have strongly scared the Japanese giant Toray Carbon Fibers, based in the city. Actually, the group now wants to move to another European country.

Furthermore, at the end of 2018, the Spanish enterprise MECALUX group (number 2 in European logistics) had a project to create a big subsidiary in France. But this move was canceled because of the insecure atmosphere in my state.

And examples of this kind are numerous…

Financial impact:

The business district of “La Défense” in Paris is the second financial centre in Europe. The first rank is held by “The City”, the famous district of London. With the Brexit treaty, many big financial companies and credit rating agencies based in The City want to move to another attractive location.

La Défense was the first choice of many of them. Macron created attractive fiscal conditions, and the district was redesigned to receive these new companies. In short, all the conditions were perfect for this new implantation of financial enterprises in Paris.

But, as always, the Yellow Jacket movement has scared the financial enterprises based in London away from moving to Paris.

At the moment, Italy is probably the new preferred destination of those financial companies.

Finally, some foreign investments from the USA and China in favor of France were cancelled. Walmart and Chinese Petroleum wanted to finance new operations (Walmart subsidiaries in Paris, oil extraction in the central region of France). All of them have been shifted to new projects because of the Yellow Vest movement.

CONCLUSION:

The “Gilets jaunes” have made a mess in my country. Now, I am still facing an inexplicable situation. WHY DOES MY COUNTRY DO STUPID THINGS WHEN EVERYTHING GOES WELL… Why?

Maybe because we have too many things, too much safety.

Maybe because we are spoiled children. But the Yellow Jackets are not the only

 

 


 

Rummaging inside Tesla: my latest exam in Microeconomics

 

My editorial on You Tube

 

One more educational update on my blog. This time, it is the interpretation of an exam in microeconomics, which took place on February 1st, 2019, in two distinct majors of studies, i.e. International Relations, and Management. First, right below, I present the contents of the exam sheet, such as it was distributed to students. Then, further below, I develop an interpretation of possible answers to the questions asked. One preliminary remark is due: the entire exam refers to Tesla Inc. as a business case. In my classes of Microeconomics, as well as in those of Management, I usually base the whole semester of teaching on 4 – 6 comprehensive business cases. This time, during the winter semester 2018/2019, one of those cases was Tesla, and the main source material was Tesla’s Annual Report for 2017. The students who attended this precise exam were notified one week earlier that Tesla was the case to revise.

This said, let’s rock. Here comes the exam sheet:

 

Exam in Microeconomics February 1st, 2019

 

Below, you will find a table with selected financial data of Tesla Inc. Use that data, and your knowledge as regards the business model of this firm, to answer the two open questions below the table. Your answer to each of the questions will be graded on a scale from 0 to 3 points. No answer at all, or major mistakes, give you 0 points. Short descriptive answer, not supported logically with calculations, gives 1 point. Elaborate explanation, logically supported with calculations, gives 2 or 3 points, depending on the exhaustiveness of your answer. Points translate into your overall grade as follows: 6 points – 5,0 (very good); 5 points – 4,5 (+good); 4 points – 4,0 (good); 3 points – 3,5 (+pass); 2 points – 3,0 (pass); 0 ÷ 1 points – 2,0 (fail). 

 

 

Values in thousands of USD
Revenues 2017 2016 2015
Automotive sales    8 534 752       5 589 007       3 431 587    
Automotive leasing    1 106 548         761 759         309 386    
Energy generation and storage    1 116 266         181 394          14 477    
Services and other    1 001 185         467 972         290 575    
Total revenues   11 758 751       7 000 132       4 046 025    
Cost of revenues      
Automotive sales    6 724 480       4 268 087       2 639 926    
Automotive leasing      708 224         481 994         183 376    
Energy generation and storage      874 538         178 332          12 287    
Services and other    1 229 022         472 462         286 933    
Total cost of revenues    9 536 264       5 400 875       3 122 522    
Overall total gross profit    2 222 487       1 599 257         923 503    
Gross profit by segments      
Automotive sales 1 810 272 1 320 920 791 661
Automotive leasing 398 324 279 765 126 010
Energy generation and storage 241 728 3 062 2 190
Services and other (227 837) (4 490) 3 642
       
Operating expenses      
Research and development    1 378 073         834 408         717 900    
Selling, general and administrative    2 476 500       1 432 189         922 232    
Total operating expenses    3 854 573       2 266 597       1 640 132    
Loss from operations   (1 632 086)       (667 340)       (716 629)   

 

Question 1 (open): Which operating segment of Tesla generates the greatest value added in absolute terms? Which segment has the greatest margin of value added? How does it change over time? Are differences across operating segments greater or smaller than changes over time in each operating segment separately? How can you possibly explain those phenomena? Suggestion: refer to the theory of Marshallian equilibrium vs the theory of monopoly.

 

Question 2 (open): Calculate the marginal cost of revenue from 2015 to 2017 (i.e. ∆ cost of revenue / ∆ revenue), for the whole business of Tesla, and for each operating segment separately. Use those calculations explicitly to provide a balanced judgment on the following claim: “The ‘Energy and storage’ operating segment at Tesla presents the greatest opportunities for future profit”.  

 

Interpretation

 

Question 1 (open): Which operating segment of Tesla generates the greatest value added in absolute terms? Which segment has the greatest margin of value added? How does it change over time? Are differences across operating segments greater or smaller than changes over time in each operating segment separately? How can you possibly explain those phenomena? Suggestion: refer to the theory of Marshallian equilibrium vs the theory of monopoly.

 

The answer to that question starts with the correct understanding of the categories in the source table. Value added can be approximated as gross profit. The latter is the difference between revenues and variable cost, thus between the selling price and the price of key intermediate goods. This was one of the first theoretical explanations the students were supposed to start their answer with. As I keep repeating in my classes, good science starts with communicable, empirical observation, and thus you need to say specifically how the facts at hand correspond to the theoretical distinctions we hold.

 

As I could see from the exam papers my students handed back to me, this was the first checkpoint for the understanding of the business model of Tesla. The expression ‘operating segment’ refers to the following four categories from the initial table: automotive sales, automotive leasing, energy generation and storage, and services and other. To my sincere surprise, some of my students thought that the component categories of operating expenses, namely ‘Research and development’ and ‘Selling, general and administrative’, were the operating segments to study. If, in an exam paper, I saw someone laboriously calculating some kind of margin for those two, I had no other solution but to mark the answer with the remark ‘Demonstrable lack of understanding regarding the business model of Tesla’, and that was one of those major mistakes which disqualified the answer to Question 1 and gave 0 points.

 

In the next step, i.e. after matching the concept of value added with the category of gross profit, and explaining why they do so, students had to calculate the margin of value added. Of course, we are talking about the margin of gross profit, i.e. ‘Gross Profit / Revenues’. Here below, I present a table with the margin of gross profit at Tesla Inc.

 

 

Margin of gross profit 2017 2016 2015
Overall 18,9% 22,8% 22,8%
Automotive sales 21,2% 23,6% 23,1%
Automotive leasing 36,0% 36,7% 40,7%
Energy generation and storage 21,7% 1,7% 15,1%
Services and other -22,8% -1,0% 1,3%
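For transparency, here is a short sketch of how those margins can be recomputed from the revenue and gross-profit figures of the exam table; the 2017 column serves as the example, and all values are in thousands of USD, as in the source.

# 2017 figures from the exam sheet, in thousands of USD
revenues_2017 = {
    "Automotive sales": 8_534_752,
    "Automotive leasing": 1_106_548,
    "Energy generation and storage": 1_116_266,
    "Services and other": 1_001_185,
}
gross_profit_2017 = {
    "Automotive sales": 1_810_272,
    "Automotive leasing": 398_324,
    "Energy generation and storage": 241_728,
    "Services and other": -227_837,
}

for segment, revenue in revenues_2017.items():
    margin = gross_profit_2017[segment] / revenue
    print(segment, ":", f"{margin:.1%}")   # e.g. Automotive sales : 21.2%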

 

There was a little analytical challenge in the phrasing of the question. When I ask whether differences across operating segments are greater or smaller than changes over time in each operating segment separately, it is essentially a test of analytical flexibility. The best expected approach that a student could have developed was to use coefficients, like the gross margin for automotive sales in 2017 divided by that in 2015, and, alternatively, divided by the gross margin on energy generation and storage, etc. Thus, what I expected the most in this part of the answer was demonstrable understanding that changes over time can be compared to cross-sectional differences with the use of a universal analytical tool, namely that of proportions expressed as coefficients, like ‘A / B’.

As this particular angle of approach involved a lot of calculations (students could use calculators or smartphones in that exam), one was welcome to take some shortcuts based on empirical observation. Students could write, for example, that ‘The greatest gross profit in absolute terms is generated on automotive sales, thus it seems logical to compare the margin of value added in this segment with other segments…’. Something along those lines. This type of answer gave a clear indication of demonstrable understanding as regards the source data.
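One possible way of writing that coefficient-based comparison down, as a sketch; the margins are simply copied from the table above, and automotive sales serves as the cross-sectional benchmark.

margins = {
    "Automotive sales":              {"2017": 0.212, "2015": 0.231},
    "Automotive leasing":            {"2017": 0.360, "2015": 0.407},
    "Energy generation and storage": {"2017": 0.217, "2015": 0.151},
}

# change over time within one segment, expressed as a coefficient
for segment, m in margins.items():
    print(segment, "2017 vs 2015:", round(m["2017"] / m["2015"], 2))

# cross-sectional differences in 2017, with automotive sales as the benchmark
benchmark = margins["Automotive sales"]["2017"]
for segment, m in margins.items():
    print(segment, "vs automotive sales in 2017:", round(m["2017"] / benchmark, 2))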

As for the theoretical interpretation of those numbers, I openly suggested that my students refer to the theory of Marshallian equilibrium vs the theory of monopoly. Here is how it goes. The margin of value added has two interpretations as regards the market structure. Value added can be what the supplier charges his customers, just because they are willing to accept it, and this is the monopolistic view. As the Austrian school of economics used to state, any market is a monopoly before being a competitive structure. It means that any relation a business can develop with its customers is, first of all, a one-on-one relation. In most businesses there is at least a small window of price within which the supplier can charge their customers whatever they want, and still stay in balance with demand. In clearly monopolistic markets that window can be quite wide.

On the other hand, value added is what the exogenous market equilibria allow a firm to gain as a margin between the market for its final goods and that for its intermediate goods. This is value added understood as a price constraint. Below, I present those two ideas graphically, and I expected my students to force their pens into drawing something similar.

 

Question 2 (open): Calculate the marginal cost of revenue from 2015 to 2017 (i.e. ∆ cost of revenue / ∆ revenue), for the whole business of Tesla, and for each operating segment separately. Use those calculations explicitly to provide a balanced judgment on the following claim: “The ‘Energy and storage’ operating segment at Tesla presents the greatest opportunities for future profit”.  

 

As I reviewed those exam papers, I could see that the concept of marginal change is enormously hard to grasp. It is a pity, as: a) the whole teaching of calculus at high school is essentially about marginal change, and b) the concept of marginal change is one of the theoretical pillars of modern science in general, and it comes straight from grandpa Isaac Newton.

Anyway, what we need, in the first place, is the marginal cost of revenue, from 2015 to 2017, calculated as ‘∆ cost of revenue / ∆ revenue’. The ∆ is, in this case, the difference between values reported in 2017, and those from 2015. The marginal cost of revenue is simply the cost of having one more thousand of dollars in revenue. The corresponding values of marginal cost are given in the table below.

 

Operating segment at Tesla Inc. Marginal cost of revenue from 2015 through 2017
Overall                             0,83
Automotive sales                             0,80
Automotive leasing                             0,66
Energy generation and storage                             0,78
Services and other                             1,33
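A short sketch of that calculation, for anyone who wants to check the table against the revenue and cost-of-revenue figures from the exam sheet (all in thousands of USD):

# (2015, 2017) pairs of revenue and cost of revenue, in thousands of USD
data = {
    "Automotive sales":              {"rev": (3_431_587, 8_534_752), "cost": (2_639_926, 6_724_480)},
    "Automotive leasing":            {"rev": (309_386, 1_106_548),   "cost": (183_376, 708_224)},
    "Energy generation and storage": {"rev": (14_477, 1_116_266),    "cost": (12_287, 874_538)},
    "Services and other":            {"rev": (290_575, 1_001_185),   "cost": (286_933, 1_229_022)},
}

for segment, d in data.items():
    delta_revenue = d["rev"][1] - d["rev"][0]   # change from 2015 to 2017
    delta_cost = d["cost"][1] - d["cost"][0]
    print(segment, ":", round(delta_cost / delta_revenue, 2))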

 

Most of the students who took this exam, on the 1st of February, failed to address the claim phrased in the question, and it was mostly because they apparently did not understand the meaning of what they had calculated. Many had those numbers right, although some were overly zealous and calculated the marginal cost for two windows in time separately: 2015 – 2016, and then 2016 – 2017. I asked specifically to jump from 2015 straight into 2017. Still, the real struggle was the unit of measurement. I saw many papers whose authors transformed those numbers – correctly calculated – into percentages. Now, look, people. In the source table, you have data in thousands of dollars, right? A delta of thousands of dollars is still expressed in thousands of dollars, right? A coefficient made of two such deltas is expressed in dollars of cost per dollar of revenue, not in percent of anything. Those numbers mean that if you want to have one more thousand of those US dollars in revenues at Tesla Inc., you need to spend $830 in cost of revenue, and correspondingly for the particular operating segments.

Thus, when anyone wrote those marginal values as percentages, I was very sorry to mark that answer with the remark ‘Demonstrable lack of understanding regarding the concept of marginal cost’.

When considering the marginal cost of revenue as an estimation of future profits, the lower it is, the greater the profit we can generate. With a given price, the lower the cost, the greater the profit margin. The operating segment labelled ‘Energy generation and storage’ doesn’t look bad at all in that respect, certainly better than ‘Services and other’, still it is the segment of ‘Automotive leasing’ that yields the lowest marginal cost of revenue. Thus, the claim “The ‘Energy and storage’ operating segment at Tesla presents the greatest opportunities for future profit” is false, as seen from this perspective.

I am consistently delivering good, almost new science to my readers, and love doing it, and I am working on crowdfunding this activity of mine. As we talk business plans, I remind you that you can download, from the library of my blog, the business plan I prepared for my semi-scientific project Befund (and you can access the French version as well). You can also get a free e-copy of my book ‘Capitalism and Political Power’. You can support my research by donating directly, any amount you consider appropriate, to my PayPal account. You can also consider going to my Patreon page and becoming my patron. If you decide so, I will be grateful if you suggest me two things, which Patreon suggests I should ask you about. Firstly, what kind of reward would you expect in exchange for supporting me? Secondly, what kind of phases would you like to see in the development of my research, and of the corresponding educational tools?