24 Comments

Well-written article, and there is probably some truth in it. But you misunderstand Kolmogorov complexity. The surrounding machine that reads the code and instantiates it is *critical*. Is the shortest string describing Tetris maybe just a link on my PC? Why not? After all, the encoding on the floppy disk disregards all of the internal interpretations that are precoded into the computer.

In the same vein, the surrounding biology is *critical* to get from a DNA sequence to an animal or human. This includes not only the chemistry of ribosomes but all of physical reality. (The value might still be lower than one would naively assume, but much higher than the estimates given by DNA.)
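To make the point about the surrounding machine concrete, here is a minimal sketch, not a real Kolmogorov complexity calculation (which is uncomputable). It treats the zlib-compressed size as an upper bound on description length for a machine that only knows zlib, and compares it with a one-byte "link" on a hypothetical machine that already contains the payload. The names `PAYLOAD`, `BUILT_IN`, and `run_on_rich_machine` are illustrative assumptions, not anything from the article.

```python
import zlib

def zlib_description_length(data: bytes) -> int:
    """Bytes needed to describe `data` to a machine that only knows zlib."""
    return len(zlib.compress(data, 9))

# Hypothetical "rich" machine: the payload is already built in, the way a PC
# ships with an OS and libraries. All names here are illustrative assumptions.
PAYLOAD = b"tetris " * 1000  # stand-in for a real program (7000 bytes)
BUILT_IN = {b"T": PAYLOAD}

def run_on_rich_machine(program: bytes) -> bytes:
    """Expand a program using whatever the machine already contains."""
    return BUILT_IN.get(program, program)

raw_len = len(PAYLOAD)                        # description with no machine help
zlib_len = zlib_description_length(PAYLOAD)   # description relative to zlib
link_len = len(b"T")                          # description relative to the rich machine

print(raw_len, zlib_len, link_len)
```

The same output has three very different description lengths depending on what the interpreter already knows, which is why a length comparison only means something once the reference machine is fixed.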

It also disregards the fact that MS Word could be coded in a tiny fraction of the space if Microsoft had any economic interest in doing so.

Well put and completely correct. This whole essay rests on the idea that the length of the instructions is directly proportional to the complexity of the resulting structure, without understanding that structures and languages for interpreting instructions can differ in complexity by orders of magnitude. If the language didn't matter, you could take the genome of E. coli, convert it into binary, and run it as an .exe to "have" an E. coli cell.

I see what you mean, and you're certainly right for the computer program side – I don't know enough about low-level computing to know how much Tetris relies on external libraries, ready-made CPU instructions, or whatever computers do when they execute programs. I agree the size of a software package must vastly underestimate its total complexity. I kind of neglected this, because my point is that biological systems are often simpler than software packages – if the complexity of software turns out to be higher than its size suggests, it only makes biological systems look even simpler in comparison.

For biological systems, however, I still think almost all the information about how the system works has to be contained in nucleic acid sequences, and that includes the nature of nucleic acids themselves. That's what I was getting at in the parenthesis about DNA being the substrate of evolution: if there are any "structures for interpreting instructions" encoded in the chemistry of the ribosome, and if these structures have any non-trivial complexity beyond the baseline complexity of the primordial soup, then this additional complexity must have been selected for through evolution. And, with very few exceptions, evolution operates on nucleic acids. Any hidden information that isn't encoded in the DNA/RNA in some way (if anything, as a kind of redundancy) would have to evolve in a Lamarckian way, and that's just not very powerful compared to Darwinian evolution. From that, I conclude that pretty much all the information about how polymerases/ribosomes/etc. work must show up somewhere in the DNA in the form of a constraint on the sequence. Not *all* the information, but close enough.

Of course none of these estimates include information about the laws of physics etc., but since I'm making a comparison between two things from the same universe, it's just an offset and shouldn't matter for the comparison.

> For biological systems, however, I still think almost all the information about how the system works has to be contained in nucleic acid sequences, and that includes the nature of nucleic acids themselves.

If that were true, then it would in principle be possible to physically transfer naked DNA from species A into the gamete or zygote of species B and produce a viable member of species A.

Alternatively, you should be able to implant a zygote from mammal A into the womb of mammal B.

The general reason why this is somewhere between biologically unlikely and impossible is that epigenetic mechanisms mediate the interaction between DNA and its chemical environment throughout the life cycle. I'm not referring to epigenetic mechanisms of inheritance per se, but to the simple fact that disruptions to the chemical environment in which cells or fetuses exist have a profound effect on their viability.

As you may know, DNA interacts densely and dynamically with an extremely complex protein milieu that depends on cell type and state. This milieu is tightly regulated throughout mitosis and meiosis.

There is never any point in the life cycle of any organism whatsoever at which naked DNA even exists, much less develops a cell or multicellular organism out of an arbitrary chemical soup. DNA is incapable of independently structuring a viable cell or organism outside the context of the biological soma with which it has co-evolved. The soma is arbitrarily complex.

> And, with very few exceptions, evolution operates on nucleic acids. Any hidden information that isn't encoded in the DNA/RNA in some way (if anything, as a kind of redundancy) would have to evolve in a Lamarckian way, and that's just not very powerful compared to Darwinian evolution.

Another way to look at this issue that you may find clarifying is that evolution operates on *ancestral series* of nucleic acids. There is a series of mutational steps that permitted LUCA to evolve into a blue whale, another that permitted it to evolve into pangolins, and another that permitted it to evolve into the Venus flytrap. One way we could potentially ignore the soma as a source of biological complexity is by focusing on ancestral series of genomes as a form of complexity. We can treat the soma as the product of the evolutionary path that gave rise to a particular individual, or even a particular cell.

I'm not even sure how to begin making a theoretical estimate of how much complexity is possible, given average mutation rates, but it's probably much higher than what's possible if you treat a single individual's genome as the upper bound on their complexity.

This argument applies equally to MS Word. MS Word co-evolved with Windows, the programming language in which it's implemented, various hardware architectures, and so on. Its complexity is not bounded by the complexity of its source code.

I think the central point has to be that the genetic code which ended up creating the actual organism we see is probably much less complex than the code that would *guarantee* it, and this is because some aspects of the complexity of an organism are immanentised through its development.

The code for “release chemical X in condition Y” can be very simple, but if it doesn’t fully determine the conditions, you have a code for building something *a bit like the organism in front of you,* but not one that works in all contexts and all circumstances, and not one which is exact.

The complexity of the organism is not the complexity of the code that created it because the code will not produce it exactly. The code for TETRIS will, though?

So there is a difference between the instructions needed to create a complex emergent thing and the thing itself, and the simplicity results from a small number of processes turning on differently in different contexts. But that’s effectively offshoring complexity from the code itself; the contexts are necessary for the organism, so these are two different forms of complexity.

'only 10% of the human genome is actually useful, in the sense that it’s maintained by natural selection. The remaining 90% just seems to be randomly drifting with no noticeable consequences'

This is simply false, and the scientists who study DNA have known it to be false for decades:

https://magazine.washington.edu/feature/no-such-thing-as-junk-dna-researchers-say/

'No such thing as ‘junk’ DNA, researchers say'

'For decades most scientists thought the bulk of the material in the human genome—up to 95 percent—was “junk DNA.”

It now turns out much of this “junk,” far from an evolutionary byproduct, actually contains the vital instructions that switch genes on and off in all kinds of different cells. Changes in these instructions can affect everything from color vision to whether a person develops diabetes or cardiovascular disease or a host of other conditions.

“The junk DNA concept, as it has come to color our perception of the human genome, is somewhat bizarre,” says Stamatoyannopoulos. “If you picked up a Chinese newspaper and you could read only one or two percent of the characters, would you automatically assume the rest was junk?”

The Human Genome Project sequenced the 3 billion letters or DNA bases that make up the genome, and it provided a basic catalog of genes, which occupy only about 2 percent of the genome. But understanding how genes turn on and off is vital to figuring out basic biological processes, like development, or how genes contribute to normal health and disease. It turns out—contrary to expectation—that there are a modest number of genes (around 20,000) but these genes are controlled by millions of DNA “switches,” with the whole unit functioning as a kind of operating system for the cell.'

https://med.stanford.edu/news/all-news/2023/09/junk-dna-diseases.html

'For decades, scientists have known that, despite its name, “junk DNA” in fact plays a critical role: While the coding genes provide blueprints for building proteins, which direct most of the body's functions, some of the noncoding sections of the genome, including regions previously dismissed as “junk,” seem to turn up or down the expression of those genes'

https://biology.mit.edu/so-called-junk-dna-plays-a-key-role-in-speciation/

'So-called “junk” DNA plays a key role in speciation'

'More than 10 percent of our genome is made up of repetitive, seemingly nonsensical stretches of genetic material called satellite DNA that do not code for any proteins. In the past, some scientists have referred to this DNA as “genomic junk.”

Over a series of papers spanning several years, however, Whitehead Institute Member Yukiko Yamashita and colleagues have made the case that satellite DNA is not junk, but instead has an essential role in the cell: it works with cellular proteins to keep all of a cell’s individual chromosomes together in a single nucleus.'

******

When you make sweeping claims, it's best to have your basic facts straight at the outset.

Otherwise, someone might conclude you simply have no idea what you're talking about with regard to any subject.

Great article. This points at one of the central difficulties for (human) engineers hoping to influence biology: everything has multiple functions. There is no gene for a single trait; most traits are influenced by hundreds or thousands of genes, when the entire genome is only tens of thousands of genes! Every possible drug target also has hundreds of other functions!

I am hopeful that the nonlinear gestalt thinking of modern AIs will be able to intuit the emergent function of these n-dimensional biological networks.

I am considerably more complex than just my genome. I learned to understand, speak, and read English, to live in my culture, and to smile or scowl. I then did a degree and worked for 50 years. Had children, earned and spent money, and all that. That stuff is not coded in my genome. The potential is, but all that nurture is part of me too. And it ain't coded in DNA. And there is an awful lot of it in any functioning adult.

While this article presents an intriguing analytical framework, it exemplifies a fundamental limitation in modern scientific discourse that reduces the profound complexity of life to purely mechanical and computational metaphors. This perspective is problematically reductionist.

First, the article's characterization of bacteria as strategic actors that 'decide' to optimize for simplicity reflects a fundamental misunderstanding. Bacteria aren't sentient strategic agents - they don't make decisions in the way the article implies, which stems from modern science's reductionist habit of imposing human-like strategic behavior on natural processes it doesn't fully understand.

They are part of an interconnected living system that operates according to quantum principles rather than the linear cause-and-effect relationships our minds tend to impose on them. The distinction between 'good' and 'bad' bacteria is similarly an artificial construct that reveals more about our linear thinking than the actual nature of these organisms.

Here's an inconvenient truth: the body doesn't know what's a toxin, a protein, enzyme and whatnot. The body operates on a fundamentally different paradigm than the mind. The body's innate intelligence doesn't recognize the dualistic concepts of good versus bad that dominate our mental framework. This mirrors Hippocrates' insight about medicine and poison being distinguished only by dosage. Ironically, it's often the mind's well-intentioned interventions - through its insistence on linear, reductionist approaches - that create harm by imposing its rigid interpretations onto the body's more nuanced systems.

Second, the article's treatment of DNA storage capacity is particularly revealing of this limited perspective. Recent research has already confirmed that DNA functions as a fractal antenna - a sophisticated multidimensional structure capable of receiving, transmitting, and processing information in ways that transcend simple linear sequences (https://pubmed.ncbi.nlm.nih.gov/21457072/). This property is completely overlooked when we reduce DNA to a one-dimensional string of nucleotides measured in megabytes.

This relates to a deeper issue: the fundamental mismatch between how our analytical mind processes information and how biological systems actually operate. The human mind works like a serial processor - it understands ONLY through juxtaposition, creating meaning by placing one thing against another (like plotting points on an X and Y axis). This is why our scientific models tend to be two-dimensional representations that break systems down into comparable components.

But biological systems operate more like quantum parallel processors. They can maintain complete understanding of phenomena without needing to break them down into constituent parts. This is why attempting to measure genetic complexity in terms of conventional data storage units is deeply misleading. What appears as 156 KB in our linear, particle physics model could actually represent thousands of terabytes of information when understood in terms of quantum fields - yet paradoxically show as zero size in quantum measurements because the information exists in a zero-point scalar field state.

The article's comparison of genome sizes to software programs reveals this blind spot. It's not that Microsoft Word is more complex than a living organism - it's that we're measuring complexity using tools designed for linear, sequential information processing while completely missing the quantum, multidimensional nature of biological information storage and processing.

What makes this current paradigm particularly dangerous is its complete disconnection from the true nature of being. Modern science, in its clinical rationality, has elevated this mechanical worldview to an absolute model, while simultaneously severing our connection to the deeper principles that govern life itself. The supreme irony is that this approach, in its very sophistication, has fostered a profound delusion: that we can outsmart nature through pure rational analysis.

This hubris is perfectly reflected in the inventions and aspirations of modern humans, who, in their vanity, style themselves as direct descendants of the gods - capable of improving upon nature's designs through sheer force of analytical intelligence. It's a mindset that mistakes technological sophistication for true wisdom, and computational complexity for genuine understanding.

Until we develop a more humble and holistic understanding that reconnects scientific insight with spiritual wisdom, our models will continue to miss not just the true complexity and elegance of living systems, but also the profound responsibility that comes with attempting to manipulate them.

Fascinating to what extent humans are capable of building more theoretical concepts on top of other purely theoretical concepts like "Evolution" or "Genomics/DNA", which are still no more than hypotheses. (Are you aware, for instance, of the big genomics scientists' meeting more than 20 years ago, where they mutually agreed that NOTHING in that field is clearly defined or experimentally proven?)

I think that reducing human beings to a number of genes in the genome is hugely off -- it doesn't even take account of epigenetics: how genes can be switched on or off by a myriad of factors relating to environment and experience.

It's like saying all there is to be known about a house is the number of bricks, floorboards, rafters and roof tiles.

And what about the brain? Every one has more neural connections than there are stars in the known universe -- constantly forming new connections and having old ones pruned, also directed by environment and experience.

Its complexity is probably beyond measurement in millions of petabytes.

Correction: the brain has more neurons than there are stars in the Milky Way, with 100 trillion synaptic connections. I have no idea how many petabytes of storage that would require.
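As a rough back-of-envelope sketch of that storage question: taking the 100 trillion synapse figure above at face value, the answer depends almost entirely on how many bytes you assume each synapse needs. The per-synapse sizes below (1 byte, 4 bytes, 1 KB) are purely illustrative assumptions, not measured values, and the function name is made up for this example.

```python
SYNAPSES = 100 * 10**12  # 100 trillion, the figure quoted in the comment above

def storage_petabytes(n_synapses: int, bytes_per_synapse: float) -> float:
    """Naive storage estimate in petabytes (1 PB = 10**15 bytes)."""
    return n_synapses * bytes_per_synapse / 1e15

# Assumed bytes per synapse; these are illustrative guesses only.
for bytes_per_synapse in (1, 4, 1000):
    pb = storage_petabytes(SYNAPSES, bytes_per_synapse)
    print(f"{bytes_per_synapse} B/synapse -> {pb} PB")
```

Under these assumptions the estimate ranges from a tenth of a petabyte to about a hundred petabytes, so the result says more about the assumed encoding than about the brain.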

So, maybe I disagree with you about "too simple for us". The point is that evolution doesn't have to understand something; it just has to win the experiment. And because it doesn't have to understand, it can use a whole load of scarily complex non-linear effects (i.e. ones not easily modelled in maths). Because mutation and natural selection are dumb, they just work with anything, and the mathematical or organisational complexity doesn't faze them at all. That means that a whole load of non-linear effects are incorporated into successful designs. A butterfly throwing wee lumps of air swirling down to stay up ain't something an engineer could model, and so would take some persuading to adopt. That's a simple case. Then take the way we think: there's a lot going on in there that is just flipping complex.

No, I think it is two things:

1) the not needing to think it through and understand it and

2) the constant checking to see if it works.

Together these give evolution the ability to use any effect out there, regardless of how tough it is to model.

If the "structural" influences on somatotype (the influence of non-gene parent structure on offspring structure, putatively due to using existing structure as a guide in some way) are seen to be heritable, why would you say that they're not subject to natural selection? Natural selection operates at the level of determining whether the reproduction succeeds (and reproduces). So anything heritable should be subject to natural selection, should it not?

This is a very simplistic view of the information content in the DNA, and the comparison with Microsoft Word is nonsensical.

This seems to compare the sharp knife of natural selection to the cumbersome market forces and manipulation of man-made creations.

• For a life-form and its DNA to survive, it has to both adapt to its environment and out-perform competing life forms. Energy from food, growth of cells and reproduction are just a few of the basics that have to survive a universe that doesn’t play favorites.

• MS Word was born into a protected bubble, where Microsoft had locked out competition through non-compete contracts with manufacturers. It had a built-in greenhouse in which to grow: the millions of workers’ PCs. It had the wealth, influence and power of a multi-national corporation keeping competitors at bay.

If, say, MS Word had been forced to compete with a robust market of invention and geniuses, would it still be here? Would its bloated form still exist?

The author of this completely misses the connection between structure, process, function, and meaning. I haven’t read his whole article yet, but I will. But I think by analogy he’s trying to say that the whole of human literature is encapsulated in only 26 letters.

Let me know when he’s gotten these very simple mechanisms he “describes” to take energy from their environment, grow themselves through several stages, and re-create themselves, independently of any human intervention.

Perhaps your point about the relative complexity of the "descriptive" code is correct, but you overlook the complexity of the brain needed to execute that code. MS Word manages that within its code budget. A human brain develops increasing complexity from day one, and writing out that as code would be a gargantuan task.

The scale of software analogized to complexity is waaaaaay off. Microsoft Word could easily fit in a fraction of its size. In the 90s the software development community totally lost its ability to worry about the size of its product. Speed of development and the notion of reuse became ALL. And apps like Microsoft's depended, generation to generation, on faster machines and more memory. They depended on them wholly, because the entire software industry lost the knowledge of how to build compact, concisely performing code. There is no logical justification for the present size of the Microsoft Word installation. None. It is NOT that much more complex than the version I still have on a single-sided floppy disk.

There are as many possible neuron connections in the human brain as there are particles in the universe. --Norman Doidge, "The Brain That Changes Itself."
