Essay response to question - Which fundamental principles of intelligence must be considered in the successful design of artificial intelligence?
Fundamental principles - Novelty - Is intelligence a compression of the world or a (chaotic) expansion of the world?
This was submitted as a response to an essay question on the 31st of December 2022. The date is given with precision because in many ways this is a time of rapid development for what currently passes as AI. New products, and tweaks to the currently dominant “transformer” architecture, are appearing all the time, exciting the public, and even causing some to ask for government to step in and control what people imagine it has already become. I would argue, as in this essay, that for all the excitement the current state of knowledge does not address the issue of principles of intelligence at all. It is highly unlikely we have made something without knowing what it is we seek to make. So calls for regulation, at least of intelligence, are at best meaningless, because there is still no really agreed definition of what intelligence might be, let alone of how it might be regulated. The regulation of intelligence! As if we have too much of that in the world!
Though this essay purports to present such principles, whether they might present a danger is another question. Would something we understand be a greater danger than our own intelligence, which we don't understand, or than any other process which might develop without control because it is not understood? Can you control intelligence, or is it more intelligence, for ourselves first, by understanding ourselves, which is needed for control? Is the study of intelligence the true first step needed for control: control, firstly, of our own self-destructive tendencies, or of any self-perpetuating process which is imperfectly understood? Isn't that really the fear, that we cannot control what we don't understand? Perhaps a rebrand of AI as SKC, self-knowledge computing, or HC, humanized computing, or simply “I”, the study of intelligence, not artificial at all, might help.
The essay is reproduced here for reference, and to invite comment.
Which fundamental principles of intelligence must be considered in the successful design of artificial intelligence?
Is that an essay question, or an easy question?
I think the question is easy. What is difficult is that we have been looking at it the wrong way. We have sought structure, when the essence of the system is novelty. Novelty constructed using “priors”, indeed.
But more than just constructed using priors: constructed with priors generating chaotic dynamics, and even generating contradictions.
Chaos is something unlearnable. It can only be generated. Eternally anew.
From my experience trying to learn natural language structure, I am sure the fundamental principle of intelligence is that it structures the world according to cause & effect prediction. Specifically, oscillations in the brain synchronize groups of perceptual stimuli, as sequences of stimuli are grouped by shared contexts.
Most importantly, this process of generating cause & effect groupings forms a dynamical system, and that system is chaotic.
I believe the chaotic expression of this grouping behaviour is the reason large language models find so many parameters (actually hierarchical cause & effect groupings), but never enough to completely prevent them mixing up the facts. It is why we have not been able to find a complete representation for meaning. And never will find a complete representation for meaning. That incompleteness is the essence of it. It allows for infinite creativity/novelty (and actually suggests solutions for consciousness and freewill, as expansion and embodiment aspects of chaos, respectively.)
So, two principles:
1) Cause & effect groupings (c.f. transformers: prediction)
2) Number of parameters is infinite (chaotic, c.f. transformers find no bound in number of parameters.) Otherwise put: novelty.
Or, put another way:
1) Novelty (chaos, constantly generating novelty. Also embodied, the exact body of any instance of chaos cannot be duplicated.)
2) Generated as cause & effect groupings of sequences of sensory stimuli, by clustering contexts in the sequences using oscillations.
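The second principle, grouping sequence elements by shared contexts, can be made concrete in a few lines. This is a minimal sketch of my own, not the author's implementation; the function names, the toy sequences, and the Jaccard threshold are all illustrative assumptions. It collects the (previous, next) contexts of each token and pairs up tokens whose context sets overlap.

```python
# Toy sketch (illustrative, not the essay's system): group elements of
# sequences by the contexts they share, a stand-in for "cause & effect
# groupings" formed over sequences of sensory stimuli.

from itertools import combinations

def contexts(sequences):
    """Map each token to the set of (previous, next) contexts it appears in."""
    ctx = {}
    for seq in sequences:
        for i, tok in enumerate(seq):
            prev = seq[i - 1] if i > 0 else None
            nxt = seq[i + 1] if i < len(seq) - 1 else None
            ctx.setdefault(tok, set()).add((prev, nxt))
    return ctx

def shared_context_pairs(sequences, threshold=0.25):
    """Pairs of tokens whose context sets overlap (Jaccard) above threshold."""
    ctx = contexts(sequences)
    pairs = []
    for a, b in combinations(sorted(ctx), 2):
        overlap = len(ctx[a] & ctx[b]) / len(ctx[a] | ctx[b])
        if overlap >= threshold:
            pairs.append((a, b))
    return pairs

sequences = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["a", "cat", "ran"],
    ["a", "dog", "ran"],
]
print(shared_context_pairs(sequences))
# → [('a', 'the'), ('cat', 'dog'), ('ran', 'sat')]
```

Notice the grouping recovers distributional word classes (determiners, nouns, verbs) purely from shared contexts, with no labels: the static version of what the essay proposes oscillations would do dynamically.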
This chaos (generating eternal novelty, and also embodied) is the element we have been missing in our efforts to understand intelligence.
Large language models come closest. They use transformations, essentially predicting an effect based on a cause, and so structure based on cause & effect. And they are enormously large, so they capture much of the peculiarity, or otherwise said, infinite novelty, of chaos. And they are distributed. Distributed representation also allows some residual chaotic recombination across the fixed cause & effect hierarchies learned by the transformer.
Evidence is replete. The work of Walter Freeman (Kozma & Noack, 2017), analysing intelligence as chaotic patterns, with an emphasis on "intentionality", is particularly well aligned. Walter Freeman’s work only lacked a concrete generative principle. That generative principle is more easily motivated on functional grounds. Linguistics, with a functional focus, is an easier path to an appreciation of it.
Within linguistics it has long been clear that predictive contrast based on shared context is a meaningful structural parameter. What foxed them was that, while useful, it displayed contradiction. A sign of chaotic dynamics defying abstraction.
So contradiction, a sign of chaotic dynamics defying abstraction, is particularly evident starting with a functional analysis of language in linguistics.
But it is also evident in other domains, for those who have eyes to see it.
That constant novelty generated by mathematical chaos might be at the core of what we fail to understand about intelligence is evident in the dominance of distributed representation as the paradigm for the most successful artificial intelligence applications today.
Why has distributed representation come to dominate artificial intelligence in the last decade or so? Discussion is scarce. To the extent we examine the question at all, most might conclude something about machine learning: that neural networks automate the learning process, a convenience. Or they conclude that distributed representation came to dominate because Big Data became available, that a recent explosion in the availability of data explains its dominance.
But why distributed representation? Machine learning attempts to abstract. You can start with big data, but arrive at symbolic abstraction. Why do networks, with their distributed representations, now dominate?
In a sense you can see the progress of artificial intelligence research over the last years, as progress forward, by walking backwards. Distributed representation means less abstraction. We believe we are trying to abstract meaning from the world. But actually our solutions achieve success in proportion to the extent we do not abstract. They become more successful as their representations become larger. The key distinguishing feature of distributed representation is that it is distributed. The abstraction is reduced.
Why would reduced abstraction result in better results?
I suggest the reason distributed representation has dominated the successful AI solutions of the last ten years is because the system of intelligence is fundamentally generative, chaotic, defying abstraction, and we capture this character better, the less we abstract.
It is evident.
That constant novelty generated by mathematical chaos might be at the core of what we fail to understand about intelligence is evident in the lesser known thread of artificial intelligence research which pursued the properties of artificial organisms: Rodney Brooks and his swarms of "Fast, Cheap, and Out of Control" embodied robots (Morris); Rolf Pfeifer and his (chaotically) dancing robots ("How the Body Shapes the Way We Think: Rolf Pfeifer at TEDxZurich"); and Luc Steels's work on swarms (though Chialvo and Millonas may be a better source for swarms) (Chialvo and Millonas).
This research did capture a chaotic/emergent character to intelligence. What this artificial organism thread of research was missing, like the observational EEG work of Walter Freeman, was an adequate generative principle.
It was intuitively embodied, but too tied to actual bodies. Without a generative principle, a sufficient “prior”, it failed to pursue the abstraction of an embodied principle, which is chaos.
Chaos is embodied. But the principle is not tied to any single body. It is a general principle which explains the uniqueness of particular bodies. But does it in a way which generalizes to many bodies.
It is evident.
That constant novelty generated by mathematical chaos might be at the core of what we fail to understand about intelligence is evident in the perceived significance of novelty. Francois Chollet and others (Cronin) have identified novelty as a key missing element in current attempts at AI.
Chollet cites Elizabeth Spelke as identifying “six different core knowledge systems” (Chollet), among them "objectness & intuitive physics", "agentness", "elementary geometry & topology", and "numbers, counting, quantitative comparisons".
Cause and effect might be seen as spanning “objectness”, and “agentness”.
But chaos introduces a twist. That “objects” are inherently seen as products of novelty. Novelty is not seen as limited to interaction between objects. It is inherent in our perception of the world as objects itself. Other “core knowledge systems” might be important. But this novelty generating, chaotic, and contradictory, aspect to “objectness” is what we have been missing.
It is evident.
That constant novelty generated by mathematical chaos might be at the core of what we fail to understand about intelligence is evident in philosophy.
For the last 200 years or more, this chaotic aspect which I am pitching as the core missing element in our efforts to replicate intelligence, has become increasingly evident to philosophers in the form of the observation that meaning contradicts.
Philosophers have no access to any chaotic underlying process. But they are able to observe that the results of such a process, their perceptions, involve contradictions.
Personally, for me the most poignant observation of this has been the Liar Paradox, dating back millennia: the statement, "I am a liar" (Beall et al.).
It contradicts. If the statement is true, then the person is a liar, so the statement is a lie, and therefore false. But if it is false, then the person is not a liar, so the statement is true. And so on, in infinite recursion.
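The recursion can be made literal. A toy sketch of my own (not from any source): treat the statement as a function of its own assumed truth value and iterate. The truth value never settles.

```python
def liar(assumed_truth):
    # "I am a liar": the statement is true exactly when it is false
    return not assumed_truth

value = True
history = []
for _ in range(6):
    history.append(value)
    value = liar(value)  # feed the statement its own truth value

print(history)  # [True, False, True, False, True, False]: no fixed point
```

The iteration has no fixed point, only an endless oscillation, which is the paradox in computational form.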
This was noted most significantly by Bertrand Russell (“Russell's Paradox (Stanford Encyclopedia of Philosophy)”) during his attempts to find a logical basis for mathematics early last century.
Russell's observation was then used by Kurt Gödel (Chaitin) to prove that mathematics, that ultimate evolution of logic, must be incomplete.
Otherwise said, that there is a random element even at the heart of that pinnacle of human mental clarity, mathematics.
So is all truth random?
This might be said to be the conclusion drawn by philosophy.
Some trace this to Kant: “In the Critique of Pure Reason Kant argues that space and time are merely formal features of how we perceive objects, not things in themselves that exist independently of us” (Stang).
This has evolved within philosophy into a contemporary dominance for postmodernism (Forrester et al.). Postmodern philosophy is at the core of most of our contemporary social problems: the eternal debates over definitions of gender, the replacement of “truth” by lived experience, the underlying framing that everything that happens is the result of conspiracy by one or other group, and independent of any objective physical reality.
Postmodernism also started from an observation of contrasting meaningful structure in language.
Once again, like linguistics, missing the observation that this contradictory structure observable within language might have chaotic properties, postmodernism concluded meaning was completely random. And lacking an objective basis, that it must only be an expression of the power of one or other political group.
Hence the contemporary fixation on identity and conspiracy. All meaning is assumed to be a conspiracy by one or other identity group.
Objective meaning is rejected.
So this then can be the first benefit of artificial intelligence research. To understand ourselves better. And in particular to understand that meaning can be random, but not random in the sense of having no objective basis at all, and merely being the expression of power by different identity groups. Instead random in the sense of Chaitin/Kolmogorov (Dunn), “random” in the sense of a most compact representation of information. And more than most compact, actually expanding, constantly growing larger, the very source of our human creativity.
We need to understand that the process of generating constant novelty can be personal, embodied, the source of free will and consciousness. So not objective in its productions, actually a true source of diversity. But that it can be objective in the sense of the processes generating it. Then we can connect meaning, truth, back to the world, and separate “truth” again from a sole basis in group identity and political power.
So perhaps the first benefit of artificial intelligence research will be to return society to a consensus about the basis for meaning. Not quite the objective truth sought by the Enlightenment. But a new insight into the sources of individual intuition, and diversity, in an objective process. That objective “truth” is an objective process, not an objective product, a process which is constantly generating novelty, creativity, from the world.
That constant novelty generated by mathematical chaos might be at the core of what we fail to understand about intelligence is evident then in multiple domains: in the dominance of deep learning and distributed representation over GOFAI, symbolism, and logic, in the insights of artificial organisms, swarms, and “Fast, Cheap & Out of Control” AI models, in the perceived importance of novelty and creativity, in the confusion of contradictory truth perceived by philosophy. It is evident. But we have not had eyes to see that evidence.
Once this chaotic generative principle is accepted, we need only add what is clear from the historical evidence of linguistics, and more recently the success of large language models, and transformers applied more generally, that cause and effect must be an important structuring principle generating that chaos.
That said, there might be relational principles other than cause and effect, other "priors". The fundamental missing element has been an appreciation that priors can have chaotic dynamics.
I have long been impressed by the work of Chris Domas (Domas), who looked at different relational principles for structuring the raw binary of computer code. The generative parameters he studied were not chaotic. But they did emphasize that novel structure can still be meaningful, if the way elements are related is meaningful. For instance, by implementing relational parameters which preserve locality rather than prediction, he was able to generate patterns which highlight the presence of images in computer code. Or cryptographic keys, or text, or any one of a number of other patterns in code, which are invisible to human cognition, but might be a basis for another intelligence, and another way of looking at, and extracting "meaning" from, the world.
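One such locality-preserving relational parameter can be sketched very simply. This is my own reconstruction in the spirit of Domas's binary visualization, not his code; the sample text and the quadrant check are illustrative assumptions. Each consecutive byte pair (b[i], b[i+1]) becomes a point in a 256×256 grid, and different kinds of data occupy visibly different regions of that grid.

```python
def digraph_histogram(data):
    """Count occurrences of each consecutive byte pair (b[i], b[i+1])."""
    grid = [[0] * 256 for _ in range(256)]
    for x, y in zip(data, data[1:]):
        grid[x][y] += 1
    return grid

text = b"the quick brown fox jumps over the lazy dog " * 10
grid = digraph_histogram(text)

# ASCII text clusters in the low-value quadrant of the grid; machine
# code, images, or cryptographic keys would light up other regions.
ascii_hits = sum(grid[x][y] for x in range(128) for y in range(128))
total = sum(sum(row) for row in grid)
print(ascii_hits == total)  # every pair of this text falls in the ASCII quadrant
```

The relational parameter here is adjacency in the byte stream, nothing more; yet the resulting structure distinguishes text from keys from images, a kind of meaning extracted without prediction at all.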
Who can say what other such "alien" intelligences we might discover by applying different relational parameters?
Or even within human intelligence, there may be other relational parameters. Different ways of connecting sensory experience in a manner which is useful to the organism.
Our particular relational parameter of shared cause & effect seems to be fundamental, but it need not be the end of the story.
However it seems clear that cause and effect, as exemplified by the success of large language models, may be the most fundamental place to start. What continues to be missed, what has foxed us, are the chaotic dynamics of the structures these relational parameters generate.
The essence of this is simplicity. Intelligence has appeared complex to us only because we have failed to understand this expansive, chaotic, nature.
Being simple, it should be easy to implement.
Transformers require enormous resources because they attempt to enumerate what is actually an infinity of predictive structure. Understanding that the process of intelligence is essentially creative, and especially that it is chaotic, means we won't dream of attempting to enumerate it all. Instead we can focus on the actually much simpler task of generating the structure which is appropriate to a given context, as that context arises.
We can take a context, and generate the predictive structure meaningful for that context. We don't need to try to enumerate all predictive structures for all possible contexts beforehand: ever extending beyond the 2^42 parameters jokingly suggested by Geoffrey Hinton (Hinton) in response to the 175 billion learned by GPT-3, or agonizingly traversing the asymptotic tail of self-driving like Elon Musk (“Elon Musk talks to the FT about Twitter, Tesla and Trump”). Forever almost there, like Zeno's fabled eternal journey which ever advances only half the remaining distance but can never arrive, requiring double, or ten times, the effort for each iterative, infinite half step.
Instead we recognize that the structure is infinite. It must be. Otherwise creativity would be limited.
We need chaos to give us infinite novelty. It is both the solution, and the only desirable goal.
The only problem has been that we have either mistaken the goal to be abstraction/compression, historically with logic and symbolic AI, and even an emphasis on "learning" for neural networks, or, as in the case of artificial organism research, we have failed to generalize from embodiment of an organism instance, to chaos. Or that we simply lacked a simple functional generative principle.
It is our mistaken goals which have held us back, the wrong questions, not any intrinsic difficulty in the task itself. Language, as with transformers and large language models, points the way. The solution may be to resolve the world into a prediction problem - the implementation might be as simple as graphing sensory data in a sequence network - and then simply setting that network oscillating in response to different contexts.
The oscillations can perform the same role as learning in a transformer. A transformer learns what groupings tend to predict the next element well. Oscillations will tend to group elements in a network according to how well they share predictions. If the predictive structure of the system were finite, the two would be equivalent, and there would be no point in using oscillations to constantly learn the same thing.
But it is not finite. Instead the dynamics will generate chaos. And the structure generated will contradict, and be infinite.
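The grouping-by-oscillation idea can be sketched with phase oscillators. This is my own toy illustration under the assumptions above, not an implementation from the essay: Kuramoto-style oscillators, one per network element, coupled in proportion to a hypothetical shared-prediction score. Elements that share predictions pull into phase; an element sharing none drifts at its own frequency.

```python
import math

# Hypothetical coupling matrix: elements 0 and 1 share predictions
# strongly; element 2 shares none.
coupling = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
freq = [1.00, 1.05, 1.60]   # natural frequencies
phase = [0.0, 2.0, 4.0]     # initial phases
dt, k = 0.01, 2.0           # time step and coupling gain

for _ in range(5000):
    # Kuramoto update: each phase advances at its natural frequency,
    # pulled toward the phases of the elements it is coupled to.
    phase = [phase[i] + dt * (freq[i] + k * sum(
                 coupling[i][j] * math.sin(phase[j] - phase[i])
                 for j in range(3)))
             for i in range(3)]

def gap(i, j):
    """Phase difference folded into [0, pi]."""
    d = (phase[i] - phase[j]) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

print(gap(0, 1))  # the coupled pair locks into (near-)synchrony
```

Synchrony here plays the role the essay assigns to grouping: which elements end up in phase is read off the dynamics, not stored as a learned parameter, so the groupings can re-form differently as the coupling (the context) changes.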
Such a simple system based on oscillations emphasizing predictions using shared contexts, might have evolved from an even more simple system like an "echo state network”, also chaotic. An echo state network "echoes" the world. You push stimuli into it, and it pushes them back out at you. It doesn't structure them, but it remembers them. A primitive organism might have evolved to record these echoes to more effectively predict its environment. It might then have evolved to enhance these echoes according to their predictive ability. Enhancing them to resonate over multiple similar experiences, essentially stacking events across time according to the way they shared predictions. It would be a natural enhancement to evolve by chance.
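An echo state network is itself only a few lines. A minimal toy sketch of my own (reservoir size and weight scales are arbitrary choices for the illustration, tuned so echoes fade rather than explode): a fixed random recurrent reservoir, with nothing learned, through which a stimulus reverberates and decays.

```python
import math
import random

random.seed(0)
N = 20  # reservoir size (arbitrary choice for the sketch)

# Fixed random reservoir weights, scaled small so the spectral radius
# stays well below 1 and echoes fade, plus random input weights.
W = [[random.uniform(-0.2, 0.2) for _ in range(N)] for _ in range(N)]
W_in = [random.uniform(-1.0, 1.0) for _ in range(N)]

def step(state, u):
    """One reservoir update: mix the old state with the input u."""
    return [math.tanh(sum(W[i][j] * state[j] for j in range(N)) + W_in[i] * u)
            for i in range(N)]

state = step([0.0] * N, 1.0)    # push a stimulus in
e0 = sum(s * s for s in state)  # state energy just after the stimulus
for _ in range(3):
    state = step(state, 0.0)    # then silence
e3 = sum(s * s for s in state)
print(e3 > 0 and e3 < e0)       # the echo persists, but fades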
The insight that cognition has eluded us because we have failed to appreciate its chaotic generative aspect is simplifying in many ways. It is evident. And implementation may be easy.
The answer may be as easy as changing our sense of the question.
In conclusion, I believe the answer to the riddle which has been leading us a dance in AI research, the fundamental principle we must grasp if we seek to create intelligence, is that our intelligence, all intelligence defined as "a useful organization of information in pursuit of a goal", must actually be an expansion of the world, not a compression.
An eternal generation of (meaningful) novelty, if you wish.
In particular that intelligence is a chaotic expansion of the world, which defies abstraction. And actually generates contradictions. But that it is essentially very simple. And understood in the first instance, by analogy with what we observe from language, as a chaotic expansion of structure generated by grouping sequences of sensory data according to their cause & effect properties. Perhaps as simply as setting a network of observed sensory sequences oscillating and seeing how elements in that network synchronize their oscillations, as a result of shared cause & effect.
Works Cited
Beall, Jc, et al. “Liar Paradox (Stanford Encyclopedia of Philosophy).” Stanford Encyclopedia of Philosophy, 20 January 2011, https://plato.stanford.edu/entries/liar-paradox/. Accessed 30 December 2022.
Chaitin, Gregory. “An Algorithmic God | Gregory Chaitin | Inference.” Inference Review, http://inference-review.com/article/an-algorithmic-god. Accessed 30 December 2022.
Chialvo, Dante R., and Mark M. Millonas. “How Swarms Build Cognitive Maps.” https://link.springer.com/chapter/10.1007/978-3-642-79629-6_20.
Chollet, Francois.
Cronin, Lee. Lee Cronin, Chemist: The Origins of Life, Consciousness, Life throughout the Universe, Chemputation.
Domas, Chris. “Christopher Domas The future of RE Dynamic Binary Visualization.”
Dunn, J. Michael. https://www.sciencedirect.com/topics/mathematics/algorithmic-information.
“Elon Musk talks to the FT about Twitter, Tesla and Trump.”
Forrester, J., et al. “Postmodernism (Stanford Encyclopedia of Philosophy).” Stanford Encyclopedia of Philosophy, 30 September 2005, https://plato.stanford.edu/entries/postmodernism/. Accessed 30 December 2022.
Hinton, Geoffrey. “Geoffrey Hinton on Twitter: "Extrapolating the spectacular performance of GPT3 into the future suggests that the answer to life, the universe and everything is just 4.398 trillion parameters."” Twitter, 10 June 2020. Accessed 30 December 2022.
“How the body shapes the way we think: Rolf Pfeifer at TEDxZurich.”
Kozma, Robert, and Raymond Noack. “Freeman's Intentional Neurodynamics.” BiNDS Lab – Biologically Inspired Neural & Dynamical Systems (BINDS) Laboratory, 24 April 2017, https://binds.cs.umass.edu/papers/2017_Kozma_Noack_Freemans_Intentional_Neurodynamics.pdf. Accessed 30 December 2022.
Morris, Errol, director. Fast, Cheap & Out of Control. https://www.imdb.com/title/tt0119107/.
“Russell's Paradox (Stanford Encyclopedia of Philosophy).” Stanford Encyclopedia of Philosophy, 8 December 1995, https://plato.stanford.edu/entries/russell-paradox/. Accessed 30 December 2022.
Stang, Nicholas F. “Kant's Transcendental Idealism (Stanford Encyclopedia of Philosophy).” Stanford Encyclopedia of Philosophy, 4 March 2016, https://plato.stanford.edu/entries/kant-transcendental-idealism/. Accessed 31 December 2022.



You might want to check out Col. John Boyd's 1976 essay on the continual creation and destruction of mental models in the context of war strategy. This was a very disturbing idea at the time and didn't gain much traction outside military circles. But I think this is essential for nearly everything in a competitive world, as well as for architecting artificial general intelligence systems.
Boyd, John. Destruction and Creation. Fort Leavenworth, KS: US Army Command and General Staff College, 1976.
https://scholar.google.com/scholar?cluster=13994112438515980960&hl=en&as_sdt=0,5