Source: Boston Review, Sep 2019
“We live in an era that presumes Big Data to be the solution to all our problems,” he says, “but I hope with this book to convince you that data are profoundly dumb.” Data may help us predict what will happen—so well, in fact, that computers can drive cars and beat humans at very sophisticated games of strategy, from chess and Go to Jeopardy!—but even today’s most sophisticated techniques of statistical machine learning can’t make the data tell us why.
For Pearl, the missing ingredient is a “model of reality,” which crucially depends on causes. Modern machines, he contends against a chorus of enthusiasts, are nothing like our minds.
Causation really cannot be reduced to correlation, even in large data sets, Pearl came to see. Throwing more computational resources at the problem, as Pearl did in his early work (on “Bayes nets,” which apply Thomas Bayes’s basic rule for updating probabilities in light of new evidence to large sets of interconnected data), will never yield a solution. In short, you will never get causal information out without beginning by putting causal hypotheses in.
He developed simple but powerful techniques using what he calls “causal graphs” to answer questions about causation, or to determine when such questions cannot be answered from the data at all.
The main innovation that Pearl is advertising, the use of causal hypotheses, is couched not so much in algebra-laden statistics as in visually intuitive pictures: “directed graphs” that illustrate possible causal structures, with arrows pointing from postulated causes to effects. A good deal of the book’s argument can be grasped simply by attending to these diagrams and the various paths through them.
Consider two basic building blocks of such graphs. If two arrows emerge from a single node, then we have a “common-causal fork,” which can produce statistical correlations between properties that are not, themselves, causally related (such as car color and accident rate on the reckless-drivers-tend-to-like-the-color-red hypothesis). In this scenario, A may cause both B and C, but B and C are not causally related.
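The fork can be illustrated with a small simulation. The sketch below is only an illustration, not anything from Pearl's book: the probabilities and the names `simulate_fork` and `pearson` are invented for the example. Recklessness (A) raises the chance of both owning a red car (B) and having an accident (C), while B and C never influence each other. Car color and accidents come out correlated in the whole population, yet the correlation should shrink toward zero once drivers are compared only within the same recklessness group.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_fork(n=50000, seed=0):
    """Draw (reckless, red_car, accident) triples from a common-cause fork."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        reckless = rng.random() < 0.3
        # Both effects depend only on recklessness, never on each other.
        # All probabilities here are made up for illustration.
        red_car = rng.random() < (0.6 if reckless else 0.2)
        accident = rng.random() < (0.4 if reckless else 0.1)
        rows.append((int(reckless), int(red_car), int(accident)))
    return rows


rows = simulate_fork()

# Correlation between car color and accidents, ignoring the common cause.
overall = pearson([r[1] for r in rows], [r[2] for r in rows])

# The same correlation computed separately for careful and reckless drivers,
# which is what "controlling for" the common cause amounts to here.
within = {
    a: pearson(
        [r[1] for r in rows if r[0] == a],
        [r[2] for r in rows if r[0] == a],
    )
    for a in (0, 1)
}

print(f"red car vs. accident, all drivers: {overall:.3f}")
for a, rho in within.items():
    print(f"red car vs. accident, reckless={a}: {rho:.3f}")
```

Splitting the sample by the value of the common cause is the simplest form of controlling for it; with continuous variables one would typically use regression or matching instead.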
On the other hand, if two different arrows go into the same node then we have a “collider,” and that raises an entirely different set of methodological issues. In this case, A and B may jointly cause C, but A and B are not causally related. The distinction between these two structures has important consequences for causal reasoning. While controlling for a common cause can eliminate misleading correlations, for example, controlling for a collider can create them. As Pearl shows, the general analytic approach, given a certain causal model, is to trace the paths that connect nodes, blocking “back door” paths through common causes by controlling for them while leaving paths through colliders uncontrolled.
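The claim that controlling for a collider can create a correlation is easy to check numerically. The following is a minimal sketch under invented assumptions: A and B are independent standard normal variables, C is their sum plus noise, and "controlling for C" is modeled as keeping only the cases where C exceeds an arbitrary threshold. The names `simulate_collider` and `pearson` are hypothetical helpers, not part of any library.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_collider(n=50000, seed=1):
    """Draw (a, b, c) triples where a and b independently cause c."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = rng.gauss(0, 1)  # drawn independently of a
        c = a + b + rng.gauss(0, 0.5)  # collider: both arrows point into c
        rows.append((a, b, c))
    return rows


samples = simulate_collider()

# In the full sample, a and b are unrelated by construction.
rho_all = pearson([r[0] for r in samples], [r[1] for r in samples])

# Conditioning on the collider: keep only cases with a large value of c.
# The threshold 1.0 is arbitrary.
selected = [r for r in samples if r[2] > 1.0]
rho_selected = pearson([r[0] for r in selected], [r[1] for r in selected])

print(f"a vs. b, full sample:    {rho_all:.3f}")
print(f"a vs. b, given c > 1.0:  {rho_selected:.3f}")
```

Among the selected cases, a low value of A has to be offset by a high value of B for C to clear the threshold, so the two independent causes appear negatively correlated. This is the opposite of the fork, where conditioning removes a spurious correlation.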
The method of causal graphs allows us to test the hypotheses, both by themselves and against each other, by appeal to the data; it does not tell us which hypotheses to test.
(“We collect data only after we posit the causal model,” Pearl insists, “after we state the scientific query we wish to answer. . . . This contrasts with the traditional statistical approach . . . which does not even have a causal model.”)
Sometimes the data may refute a theory. Sometimes we find that none of the data we have at hand can decide between a pair of competing causal hypotheses, but new data we could acquire would allow us to do so. And sometimes we find that no data at all can serve to distinguish the hypotheses.
Why care about causes? One reason is pure scientific curiosity: we want to understand the world, and part of that requires figuring out its hidden causal structure. But just as important, we are not mere passive observers of the world: we are also agents. We want to know how to effectively intervene in the world to prevent disaster and promote well-being. Good intentions alone are not enough.
We also need insight into how the springs and forces of nature are interconnected. So ultimately, the why of the world must be deciphered if we are to understand the how of successful action.