Neuromythography

The Architecture of the Soul

Against the Statisticians

The field of research statistics and modeling has been critiqued far more effectively by others; perhaps the most famous and outspoken contemporary critic is Nassim Nicholas Taleb. Taleb is often misrepresented as a lone crank, but in most cases he is expressing, more colorfully, ideas that have been stated more quietly by mathematical experts. Andrew Gelman, a statistics wizard, is one such expert. Psychometrics was effectively eviscerated by Stephen Jay Gould in The Mismeasure of Man. Much of what I have to say here echoes what has already been said by other people far more knowledgeable in probability than I shall ever become.

The Deification of Randomness

What is randomness? Despite what we are taught in school about normal distributions, there is no single, agreed-upon mathematical definition of randomness. Rather, it is a descriptive property of a system that defies our ability to predict its outcomes precisely.

Since the original study of probability in the context of games of chance, we have understood that even within random systems there exist patterns of order. Many systems follow power laws, behave chaotically, obey fractal rules, or fluctuate between stability and instability. Even random noise follows patterns that have been assigned colors (white noise, red noise, brown noise, etc.).
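As a minimal illustration (a Python sketch with made-up parameters, not anything from the literature), the 'color' of a noise process can be read off the slope of its power spectrum on a log-log scale: white noise is flat, while brown noise, the running sum of white noise, falls off roughly as 1/f².

```python
import numpy as np

rng = np.random.default_rng(42)
white = rng.normal(size=2**16)
brown = np.cumsum(white)              # Brownian ("brown") noise: running sum of white noise

def spectral_slope(x):
    """Fit log-power against log-frequency; the slope identifies the noise color."""
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size)
    keep = freqs > 0                  # drop the zero-frequency (DC) term
    slope, _intercept = np.polyfit(np.log(freqs[keep]), np.log(power[keep]), 1)
    return slope

print(f"white noise spectral slope ~ {spectral_slope(white):+.1f}")  # close to 0
print(f"brown noise spectral slope ~ {spectral_slope(brown):+.1f}")  # close to -2
```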

Randomness became the god of European atheism over the late 19th and early 20th centuries. Random organic chemical reactions, random mutation, and natural selection created a new Genesis story for life. Intrinsic randomness, as postulated by quantum theory, became the new Genesis story for the universe.

Probability

The field of research statistics was invented by the amateur scientist Francis Galton, established by Karl Pearson, and popularized by Ronald Fisher. What united the three in purpose was the desire to establish epistemological support for their prior belief that there are fundamental genetic differences between the races, despite the era's impenetrable ignorance of the underlying biological substrate of inheritance.

This idea is not very politically correct today, and the visceral reaction to ‘hereditarian’ ideas by left-leaning commentators obscures the genuine methodological problems that are identifiable from a dispassionate perspective. Our modern knowledge renders the original hypothesis of distinct subspecies boundaries between populations difficult to justify. A ‘dirty secret’ of evolutionary biology is that few morphotypes (observable body structures) and very few phenotypes (observable biological behaviors) have a clear causal relationship to specific genotypes (DNA alleles). We have a far better understanding of embryological development and the cascade of promoter genes than we do of behavioral genetics, yet even today we are constantly discovering new aspects of how signaling proteins and cells interact to construct the body. We have very little understanding of how genetics affects a posited behavioral phenotype.

Only in 2007 were the actual genes discovered that correspond to Mendel’s original insight about the presence of discrete inheritance components for the phenotypes of pea plants, despite the ubiquity of Mendelian inheritance in pedagogy. Modern knowledge of the complexity of the genome, and of the complex networked gene flow across populations revealed by mtDNA and Y-chromosome evidence, has further eroded support for the notion of strictly segregated races. More importantly, we still have little theoretical understanding of how genetic alleles could control abstract character qualities like disposition and intelligence (although, contrary to the ‘blank slate’ social constructionists, we do have tantalizing evidence of subtle genetic influences). The question of nature vs. environment has become impossibly muddled by the discovery of epigenetic mechanisms such as heritable gene methylation (a heritable genetic modification caused by environment that represents a physical substrate for “racial memory”), heritable bioelectric fields, and even gut bacterial flora.

Despite the decline of the original eugenics raison d’être for research statistics, the rhetorical value of the methods developed by its pioneers has proven irresistible to other fields, because those methods lowered the bar for what is considered valid scientific evidence. The traditional scientific method is ordinarily quite hard: one has to control all hidden variables in order to isolate a single variable, skeptically criticize the logical continuity between hypothetical construct and experiment, and investigate any deviations from repeatability in order to isolate further causal distinctions. With the advent of modern research statistics as promoted by Fisher, error was transformed from a problem to be bounded (as Laplace intended) into a rhetorical asset for persuasion: the correlation concept invented by Galton. The purpose of correlation was to infer the existence of heritable components (later known as genes) that determined the biological measures Galton noticed tended to be normally distributed. The social consensus within science shifted towards accepting correlation, an inverse of error, as a type of evidence. This poisoned chalice allowed a much lower standard of evidence to be presented as ‘scientific’ than had been considered valid prior to the widespread acceptance of research statistics in the 1930s.
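To see how correlation lowers the evidentiary bar, consider a small sketch (hypothetical data, Python with SciPy): measure enough unrelated variables and some pair will correlate 'significantly' by chance alone.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects, n_variables = 30, 40                      # small sample, many measured traits
data = rng.normal(size=(n_variables, n_subjects))     # pure noise by construction

best_r, best_p, n_pairs = 0.0, 1.0, 0
for i in range(n_variables):
    for j in range(i + 1, n_variables):
        r, p = stats.pearsonr(data[i], data[j])       # correlation between two noise variables
        n_pairs += 1
        if abs(r) > abs(best_r):
            best_r, best_p = r, p

print(f"best of {n_pairs} pairs: r = {best_r:.2f}, p = {best_p:.4f}")
# With ~780 pairs of independent noise variables, the best |r| is typically large
# and its naive p-value comfortably clears the conventional 0.05 bar.
```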

It is important to realize that many of the tools of research statistics are not derived from fundamental mathematics but represent social conventions. The p-value (a reductio ad absurdum argument: the probability of obtaining data at least as extreme as the observed result under the assumption that the null hypothesis is true, i.e. that the findings are the result of random statistical error) was first calculated in the 1700s by mathematicians such as Pierre-Simon Laplace. Karl Pearson formally introduced the p-value, and Ronald Fisher famously promoted a threshold standard for the p-value in his influential 1925 book, Statistical Methods for Research Workers:

It is usual and convenient for experimenters to take 5 per cent as a standard level of significance, in the sense that they are prepared to ignore all results which fail to reach this standard, and, by this means, to eliminate from further discussion the greater part of the fluctuations which chance causes have introduced into their experimental results.

If p is between 0.1 and 0.9 there is certainly no reason to suspect the hypothesis tested. If it is below 0.02 it is strongly indicated that the hypothesis fails to account for the whole of the facts. We shall not often be astray if we draw a conventional line at 0.05…

Ronald Fisher, Statistical Methods for Research Workers

It is important to note that Fisher did not derive this heuristic from any mathematical analysis; he simply asserted p < 0.05 as a rule of thumb for evaluating whether a result could be explained by expected measurement error. This became standard epistemology for science, such that calculating p-values is now a kind of ritual tautology used to signal rigor to peers, and it has been used by researchers to rationalize dubious boasts like ‘the probability that my theory is wrong is less likely than being struck by lightning while being eaten by a shark’.
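A small simulation (illustrative only, Python with SciPy) makes the point: when the null hypothesis is true by construction, Fisher's 5 per cent line still flags roughly one experiment in twenty as 'significant'.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_experiments, n_per_group = 10_000, 20
false_positives = 0

for _ in range(n_experiments):
    treatment = rng.normal(size=n_per_group)   # no real effect: both groups drawn
    control = rng.normal(size=n_per_group)     # from exactly the same distribution
    _, p = stats.ttest_ind(treatment, control)
    if p < 0.05:                               # Fisher's conventional line
        false_positives += 1

print(f"'significant' findings with no effect present: {false_positives / n_experiments:.1%}")
# Prints roughly 5%: the threshold is a convention about tolerated error,
# not the probability that a theory is true.
```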

Gaussian Presumptions

Frequentists vs. Bayesians

Academia frequently hears baying over ‘frequentism vs. Bayesianism’. This is, at its core, an argument between modern empiricists, who adopt the correlation-hunting posture criticized previously, and modern rationalists, who wear an empiricist beard known as Bayesian inferencing. I will assume that the reader has familiarity with Bayesian inferencing (I recommend this as a primer).

Bayesian inferencing is an algorithm for uncritical logical induction. Practically speaking, Bayesian inference is a useful framework for building artificial learning intelligences, and there is evidence that neurons in the brain use a form of Bayesian updating when adjusting synaptic weights. However, as an epistemology, Bayesian inferencing has been demonstrated to be tautological sophistry by the likes of Karl Popper and David Miller. Bayesian inferencing presumes its prior, and overlooks issues with variance or unknown variables in its posterior distribution. Bayesian inferencing is prone to overgeneralization. Crucially, Bayesian inferencing suffers from unfalsifiability.
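To make the ‘presumes its prior’ objection concrete, here is a minimal sketch (a hypothetical coin-flip example using SciPy's conjugate Beta-Binomial update, not anything proposed by the authors above): the same data yield quite different posteriors depending on the prior assumed up front.

```python
from scipy import stats

heads, tails = 7, 3                                    # observed coin flips

priors = {
    "flat prior Beta(1, 1)":        (1, 1),
    "skeptical prior Beta(50, 50)": (50, 50),
}
for label, (alpha0, beta0) in priors.items():
    posterior = stats.beta(alpha0 + heads, beta0 + tails)   # conjugate Beta-Binomial update
    print(f"{label}: posterior mean of P(heads) = {posterior.mean():.2f}")

# flat prior      -> ~0.67 (the data dominate)
# skeptical prior -> ~0.52 (the assumed prior dominates)
```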