Survivorship Bias – XKCD

Source: XKCD, Apr 2017

This comic is a parody of entrepreneurial speeches, such as graduation commencement addresses and motivational talks. The idea behind a commencement address is that the entrepreneur, having accumulated wisdom and experience in the process of becoming successful, will share those insights with the students, in the hope that they learn lessons that will help them achieve success as well. Companies hire motivational speakers to motivate employees to work hard.

A common theme in these talks is that the entrepreneur succeeded by persisting through hardship, sometimes despite other people telling them they would be better off giving up. They advise students to do the same, and to keep pursuing their dreams even through subsequent failure. While this isn’t necessarily bad business advice, this can give students a biased vision of reality, and lead them to imagine that they will succeed as long as they keep trying.

This comic makes a joke about survivorship bias, hence the title.

Survivorship bias, or survival bias, is the logical error of concentrating on the people or things that “survived” some process and inadvertently overlooking those that did not because of their lack of visibility.

This can lead to false conclusions in several different ways. The survivors may be actual people, as in a medical study, or could be companies or research subjects or applicants for a job, or anything that must make it past some selection process to be considered further. They may also have “survived” on only some of their attempts. For example, although Donald Trump had some successful businesses, he also had many that went bankrupt.
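The effect described above can be shown with a toy simulation. This is a minimal sketch, not anything taken from the comic: it assumes a made-up world in which every founder persists and success is a 10% coin flip, and the function name, probability, and seed are all illustrative choices.

```python
import random

def simulate_founders(n=10_000, p_success=0.1, seed=0):
    """Simulate n founders who all persist; each succeeds with probability p_success.

    The numbers are hypothetical and chosen only to illustrate the bias.
    """
    rng = random.Random(seed)
    return [rng.random() < p_success for _ in range(n)]

outcomes = simulate_founders()

# The "survivors" are the founders who succeeded, i.e. the only ones
# who get invited to give speeches.
survivors = [succeeded for succeeded in outcomes if succeeded]

# Looking at everyone who persisted: how often did persistence pay off?
true_rate = sum(outcomes) / len(outcomes)

# Looking only at survivors: every one of them persisted and succeeded,
# so persistence appears to work every time.
apparent_rate = sum(survivors) / len(survivors)

print(f"success rate among all who persisted: {true_rate:.1%}")
print(f"success rate among the survivors we hear from: {apparent_rate:.0%}")
```

Sampling only the survivors makes persistence look like a guarantee, even though in this toy world roughly nine in ten persistent founders fail.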

Above the Senior Wrangler – Philippa Fawcett

Source: Mike Dash history website, Oct 2011

On June 7, 1890, for the first and only time, a woman ranked first in the mathematical examinations held at the University of Cambridge. It was the day that Philippa Fawcett placed “above the Senior Wrangler.”

The most serious candidates invariably hired tutors and worked more or less round the clock for months. The historian Alex Craik notes that C.T. Simpson, who ranked as Second Wrangler in 1841, topped off his efforts by studying for 20 hours a day in the week before the exams and “almost broke down from over-exertion…”

G.F. Browne, the secretary of the Cambridge exam board, was also concerned—because he feared that the women entered in the 1890 math exams might be so far below par that they would disgrace themselves. He worried that one might even place last, a position known at Cambridge as “the Wooden Spoon.” Late on the evening of June 6, the day before the results were to be announced, Browne received a visit from the senior examiner, W. Rouse Ball, who confided that he had come to discuss “an unforeseen situation” concerning the women’s rankings. Notes Siklos, citing Browne’s own account:

After a moment’s thought, I said: ‘Do you mean one of them is the Wooden Spoon?’

‘No, it’s the other end!’

‘Then you will have to say, when you read out the women’s list, “Above the Senior Wrangler”; and you won’t get beyond the word “above.”’

By morning, word that something extraordinary was about to occur had electrified Cambridge. Newnham students made their way to the Senate House en masse, and Fawcett’s elderly grandfather drove a horse-drawn buggy 60 miles from the Suffolk coast with her cousins Marion and Christina. Marion reported what happened next:

It was a most exciting scene in the Senate… Christina and I got seats in the gallery and grandpapa remained below. The gallery was crowded with girls and a few men, and the floor of the building was thronged with undergraduates as tightly packed as they could be. The lists were read out from the gallery and we heard splendidly. All the men’s names were read first, the Senior Wrangler [G.T. Bennett of St John’s College] was much cheered.

At last the man who had been reading shouted “Women.”… A fearfully agitating moment for Philippa it must have been…. He signalled with his hand for the men to keep quiet, but had to wait some time. At last he read Philippa’s name, and announced that she was “above the Senior Wrangler.”


The male undergraduates responded to the announcement with loud cheers and repeated calls to “Read Miss Fawcett’s name again.” Back at the college, “all the bells and gongs which could be found were rung,” there was an impromptu feast, bonfires were lit on the field hockey pitch, and Philippa was carried shoulder-high into the main hall, “with characteristic calmness,” Siklos notes, “marking herself ‘in’ on the board” as she swayed past. The men’s reaction was generous, particularly considering that when Cambridge voted against allowing women to become members of the university in 1921, the undergraduates of the day celebrated by battering down Newnham’s college gates.

The triumph was international news for days afterwards, the New York Times running a full column, headlined “Miss Fawcett’s honor: the kind of girl this lady Senior Wrangler is.” It soon emerged that Fawcett had scored 13 percent more points than had Bennett, the leading male, and a friendly examiner confided that “she was ahead on all the papers but two … her place had no element of accident in it.”

Active Reading

Source: Explorable Explanations, date indeterminate

What does it mean to be an active reader?

An active reader asks questions, considers alternatives, questions assumptions, and even questions the trustworthiness of the author. An active reader tries to generalize specific examples, and devise specific examples for generalities. An active reader doesn’t passively sponge up information, but uses the author’s argument as a springboard for critical thought and deep understanding.

Do our reading environments encourage active reading? Or do they utterly oppose it? A typical reading tool, such as a book or website, displays the author’s argument, and nothing else. The reader’s line of thought remains internal and invisible, vague and speculative. We form questions, but can’t answer them. We consider alternatives, but can’t explore them. We question assumptions, but can’t verify them. And so, in the end, we blindly trust, or blindly don’t, and we miss the deep understanding that comes from dialogue and exploration.

Explorable Explanations is my umbrella project for ideas that enable and encourage truly active reading. The goal is to change people’s relationship with text. People currently think of text as information to be consumed. I want text to be used as an environment to think in.

This essay presents examples of a few initial ideas:

A reactive document allows the reader to play with the author’s assumptions and analyses, and see the consequences.

An explorable example makes the abstract concrete, and allows the reader to develop an intuition for how a system works.

Contextual information allows the reader to learn related material just-in-time, and cross-check the author’s claims.

Bret Victor – Humane Representation of Thought



New representations of thought — written language, mathematical notation, information graphics, etc — have been responsible for some of the most significant leaps in the progress of civilization, by expanding humanity’s collectively-thinkable territory.

But at debilitating cost. These representations, having been invented for static media such as paper, tap into a small subset of human capabilities and neglect the rest. Knowledge work means sitting at a desk, interpreting and manipulating symbols. The human body is reduced to an eye staring at tiny rectangles and fingers on a pen or keyboard.

Like any severely unbalanced way of living, this is crippling to mind and body. But it is also enormously wasteful of the vast human potential. Human beings naturally have many powerful modes of thinking and understanding. Most are incompatible with static media. In a culture that has contorted itself around the limitations of marks on paper, these modes are undeveloped, unrecognized, or scorned.

We are now seeing the start of a dynamic medium. To a large extent, people today are using this medium merely to emulate and extend static representations from the era of paper, and to further constrain the ways in which the human body can interact with external representations of thought.

But the dynamic medium offers the opportunity to deliberately invent a humane and empowering form of knowledge work. We can design dynamic representations which draw on the entire range of human capabilities — all senses, all forms of movement, all forms of understanding — instead of straining a few and atrophying the rest.

This talk suggests how each of the human activities in which thought is externalized (conversing, presenting, reading, writing, etc) can be redesigned around such representations.


Related Resources:

Wired, Jan 2014

Most forward-looking designers think about what might happen in the next five years. Bret Victor is more concerned about the next 500.

Print media, Victor says, has served us well. It helped us come up with new ways of representing knowledge that have been instrumental to human progress—things like charts, graphs, and mathematical notation. And yet print media has limitations. Namely, it engages a very narrow slice of our intellectual capabilities. Print is based on our eyeballs interpreting symbols. It doesn’t utilize our innate understanding of spatial relationships; it doesn’t take advantage of how deftly we learn by touching, holding, and manipulating objects with our hands.

Victor dubs the alternative he envisions “the dynamic medium.” The basic idea is some sort of physical matter that has the ability to rearrange itself dynamically; think of some sort of computerized sand that could take any form at any time. Victor’s not concerned with the technological implementation of such a medium, though he’s utterly convinced that it will be feasible. He’s more interested in the intellectual breakthroughs it could yield.

What excites Victor about the dynamic medium is the possibility of new representational tools—leaps equivalent to the charts, graphs and notation of past centuries. Instead of reading about the global economy in all its intricacy, for example, imagine if you could hold a working model of it, or get inside it and have the model surround you.

As Victor sees it, human progress will eventually depend on such tools, ones that let us explore complex systems and concepts with our hands as well as our minds. (Victor’s big on the power of thinking with our hands.)

Medium, date indeterminate

He sees himself less as a designer/developer/engineer than as a researcher of computer-augmented creativity, much like his mentor Alan Kay (who pioneered graphical user interfaces and object-oriented programming) and his hero Douglas Engelbart (of “The Mother of All Demos” fame).

Victor declares that “the power to understand and predict the quantities of the world should not be restricted to those with a freakish knack for manipulating abstract symbols.”

What would “post-paper” thoughts look like? Victor admits he has no idea. He just has a conviction about the medium that will enable them. “The important thing isn’t thinking about computers or programming as they are today, but thinking about moving from a static medium like marks on paper to a dynamic medium with computational responsiveness infused into it, that can actually participate in the thinking process,” he says.

… how the media in which we choose to represent our ideas shape (and too often, limit) what ideas we can have.




Brave New World (Utopia) and “1984” (Dystopia)

Source: The Toll Online, Apr 2017

In both Brave New World and 1984, common themes are addressed, including government, orthodoxy, social hierarchy, economics, love, sex, and power. Both books portray propaganda as a necessary tool of government to shape the collective minds of the citizenry within each respective society and towards the specific goals of the state; to wit, stability and continuity.

In Brave New World, the “Bureaux of Propaganda” shared a building with the “College of Emotional Engineering” and all media outlets, including radio, television, and newspapers. Much of the brainwashing of the citizens in Huxley’s world included messaging to stay within their genetically predetermined castes or to encourage the daily use of the drug, Soma, in order to anesthetize emotional agitation:


A gramme in time saves nine

A gramme is better than a damn

One cubic centimetre cures ten gloomy sentiments

When the individual feels, the community reels.

The “Ministry of Truth”, in 1984, also known as “minitrue” in Newspeak, served as the propaganda machine for Big Brother and the INGSOC regime. Although its main purpose was to rewrite history in order to realign it with Party doctrine and make the Party look infallible, the Ministry of Truth also promoted war hysteria in order to unite the citizens of Oceania while broadcasting simple messages designed to discourage any self-determination or autonomous thought.

Many might consider Brave New World to be a utopian dream. In the context of individual autonomy, however, as well as the pursuit of truth, the opportunity for personal self-actualization, the dilemma of ethical considerations and the governmental dispensation of immoral law, Huxley’s vision of the future removes the lid of a veritable Pandora’s Box of questions. In reality, the societal structure as delineated in Brave New World would greatly resemble what could be called a “prison of pleasure” and, perhaps, even a “penitentiary of profligate practicality”.

Under the same philosophical critique, Orwell’s nation-state of Oceania in 1984 would likewise be considered a bona fide dystopian “prison of fear”.

As a matter of fact, both societies portray prisons of man’s own making, formed by governments following their own directions toward their respective future destinations. To say it another way: The road to hell is actually paved with bad intentions.

Both power structures in Brave New World and 1984 chose to diminish individual rights in order to achieve societal stability. To the governments of both super-states, their citizens were considered as mere “means to an end”; namely, the continuation of power.

The Differences Between Male and Female Brains

Source: NYMag, Apr 2017

… three of the key findings from the paper Ritchie and his colleagues just posted:

(1) Yes, there do appear to be many differences between male and female brains, but there’s also tons of overlap. The researchers examined all sorts of potential sites of male/female differences, and found many such differences. But with many of these brain areas, there’s a lot of variation, and a large range of sizes for which it could be safely said that a given brain could be either stereotypically male or female, as a graph in the article shows.

The most noteworthy difference was that the men in the sample simply had larger brains in general, which isn’t surprising because men are larger than women, in general. They also tended to have denser brains and more white matter. At the level of individual structures, there were also statistically significant differences, some of them more pronounced than others.

(2) These differences could, in the long run, help explain and provide treatment for diseases that tend to hit one sex harder than the other. “As has previously been argued,” the researchers write, “providing a clear characterisation of neurobiological sex differences is a step towards understanding patterns of differential susceptibility to neurodevelopmental disorders such as autism spectrum disorder, a variety of psychiatric conditions, and neurodegenerative disorders such as Alzheimer’s Disease.”

There is, at this point, solid evidence of certain robust differences between adult male and female brains, but it’s just too early to know exactly what those differences mean — or why there’s also so much overlap.

Implicit Association Test (IAT)

Source: NYMag, Jan 2017

Almost two decades after its introduction, the implicit association test has failed to deliver on its lofty promises.

… which purports to offer a quick, easy way to measure how implicitly biased individual people are.

Unfortunately, none of that is true.

A pile of scholarly work, some of it published in top psychology journals and most of it ignored by the media, suggests that the IAT falls far short of the quality-control standards normally expected of psychological instruments. The IAT, this research suggests, is a noisy, unreliable measure that correlates far too weakly with any real-world outcomes to be used to predict individuals’ behavior; even the test’s creators have now admitted as much. The history of the test suggests it was released to the public and excitedly publicized long before it had been fully validated in the rigorous, careful way normally demanded by the field of psychology.

There’s an entire field of psychology, psychometrics, dedicated to the creation and validation of psychological instruments, and instruments are judged based on whether they exceed certain broadly agreed-upon statistical benchmarks.

The most important benchmarks pertain to a test’s reliability — that is, the extent to which the test has a reasonably low amount of measurement error (every test has some) — and to its validity, or the extent to which it is measuring what it claims to be measuring. A good psychological instrument needs both.

Take the concept of test-retest reliability, which measures the extent to which a given instrument will produce similar results if you take it, wait a bit, and then take it again. Different instruments have different test-retest reliabilities.

Test-retest reliability is expressed with a variable known as r, which ranges from 0 to 1. To gloss over some of the gory statistical details, r = 1 means that if a given test is administered multiple times to the same group of people, it will rank them in exactly the same order every time.

Hypothetically, if the IAT had a test-retest reliability of r = 1, and you administered the test to ten people over and over and over, they’d be placed in the same order, least to most implicitly biased, every time. At the other end of the spectrum, when r = 0, that means the ranking shifts every time the test is administered, completely at random. The person ranked most biased after the first test would, after the second test, be equally likely to appear in any of the ten available slots. Overall, the closer you get to r = 0, the closer the instrument in question is to, in effect, a random-number generator rather than a remotely useful means of measuring whatever it is you’re trying to measure.
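The two extremes described above can be made concrete with a small calculation. This is a minimal sketch, assuming r is computed as a Pearson correlation between the scores from two sittings, which is one common way to estimate test-retest reliability. The ten scores and the helper name `pearson_r` are hypothetical, not data from any IAT study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for ten people on the first sitting,
# listed from least to most biased.
first = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# A retest that puts everyone in exactly the same order.
stable_retest = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# A retest in which the same ten scores land on different people,
# with almost no relation to the first sitting.
scrambled_retest = [6, 2, 9, 4, 10, 1, 7, 3, 8, 5]

r_stable = pearson_r(first, stable_retest)
r_scrambled = pearson_r(first, scrambled_retest)

print(f"stable retest:    r = {r_stable:.2f}")
print(f"scrambled retest: r = {r_scrambled:.2f}")
```

The stable retest gives r = 1, while the scrambled one gives an r close to 0, so knowing someone's first score tells you almost nothing about their second.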

What constitutes an acceptable level of test-retest reliability? It depends a lot on context, but, generally speaking, researchers are comfortable if a given instrument hits r = .8 or so. The IAT’s architects have reported that overall, when you lump together the IAT’s many different varieties, from race to disability to gender, it has a test-retest reliability of about r = .55. By the normal standards of psychology, this puts these IATs well below the threshold of being useful in most practical, real-world settings.

The individual results that have been published, though, suggest the race IAT’s test-retest reliability is far too low for it to be safe to use in real-world settings. In a 2007 chapter on the IAT, for example, Kristin Lane, Banaji, Nosek, and Greenwald included a table (Table 3.2) running down the test-retest reliabilities for the race IAT that had been published to that point: r = .32 in a study consisting of four race IAT sessions conducted with two weeks between each; r = .65 in a study in which two tests were conducted 24 hours apart; and r = .39 in a study in which the two tests were conducted during the same session (but in which one used names and the other used pictures).

What all these numbers mean is that there doesn’t appear to be any published evidence that the race IAT has test-retest reliability that is close to acceptable for real-world evaluation. If you take the test today, and then take it again tomorrow — or even in just a few hours — there’s a solid chance you’ll get a very different result. That’s extremely problematic given that in the wild, whether on Project Implicit or in diversity-training sessions, test-takers are administered the test once, given their results, and then told what those results say about them and their propensity to commit biased acts.

In statistical terms, the architects of the IAT claimed, for a long time, that there is a meaningful correlation between two variables: Someone’s IAT score (call it x) and how implicitly biased they act in intergroup settings (call it y). Generally speaking, researchers measure the extent to which two variables are correlated by examining how much of the variation in one variable, y, is explained by changes in the other, x. The more two variables are correlated in this manner, the more meaningful a connection might exist between them.

When you use meta-analyses to examine the question of whether IAT scores predict discriminatory behavior accurately enough for the test to be useful in real-world settings, the answer is: No. Race IAT scores are weak predictors of discriminatory behavior.

The most IAT-friendly numbers, published in a 2009 meta-analysis lead-authored by Greenwald, which found fairly unimpressive correlations (race IAT scores accounted for about 5.5 percent of the variation in discriminatory behavior in lab settings, and other intergroup IAT scores accounted for about 4 percent of the variance in discriminatory behavior in lab settings), were based on some fairly questionable methodological decisions on the part of the authors.
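To put those percentages on the same scale as the r values discussed earlier, a rough conversion helps. This is a minimal sketch, assuming each "share of variation explained" is a squared Pearson correlation (r squared), which is the usual reading for a single predictor. The article does not state the r values themselves, so the results below are derived, not quoted.

```python
from math import sqrt

# Shares of variance in discriminatory behavior explained by IAT scores,
# as quoted above from the 2009 meta-analysis.
variance_explained = {
    "race IAT": 0.055,
    "other intergroup IATs": 0.04,
}

# Assumption: each share is r squared, so the implied correlation
# is its square root.
implied_r = {name: sqrt(share) for name, share in variance_explained.items()}

for name, r in implied_r.items():
    share = variance_explained[name]
    print(f"{name}: r^2 = {share:.1%}, implied r = {r:.2f}")
```

Under that assumption the implied correlations are roughly 0.23 and 0.20, which leaves about 95 percent of the variation in behavior unexplained by IAT scores.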

The second, more important point to emerge from this years-long meta-analytic melee is that both critics and proponents of the IAT now agree that the statistical evidence is simply too lacking for the test to be used to predict individual behavior.

The psychometric issues with race and ethnicity IATs, Greenwald, Banaji, and Nosek wrote in one of their responses to the Oswald team’s work, “render them problematic to use to classify persons as likely to engage in discrimination.” In that same paper, they noted that “attempts to diagnostically use such measures for individuals risk undesirably high rates of erroneous classifications.” In other words: You can’t use the IAT to tell individuals how likely they are to commit acts of implicit bias.

The point is that the key experts involved in IAT research no longer claim that the IAT can be used to predict individual behavior. In this sense, the IAT has simply failed to deliver on a promise it has been making since its inception — that it can reveal otherwise hidden propensities to commit acts of racial bias. There’s no evidence it can.

The scientific truth is that we don’t know exactly how big a role implicit bias plays in reinforcing the racial hierarchy, relative to countless other factors. We do know that after almost 20 years and millions of dollars’ worth of IAT research, the test has a markedly unimpressive track record relative to the attention and acclaim it has garnered. Leading IAT researchers haven’t produced interventions that can reduce racism or blunt its impact. They haven’t told a clear, credible story of how implicit bias, as measured by the IAT, affects the real world. They have flip-flopped on important, baseline questions about what their test is or isn’t measuring.

And because the IAT and the study of implicit bias have become so tightly coupled, the test’s weaknesses have caused collateral damage to public and academic understanding of the broader concept itself. As Mitchell and Tetlock argue in their book chapter, it is “difficult to find a psychological construct that is so popular yet so misunderstood and lacking in theoretical and practical payoff” as implicit bias. They make a strong case that this is in large part due to problems with the IAT.