Engineers who’ve calculated whether their results are “statistically significant” have probably drawn the wrong conclusions about their work. In fact, problems with misapplied classical statistics are so widespread that the subject shouldn’t be taught to undergrads.
So says William M. Briggs, adjunct professor of statistical science at Cornell University and a statistical researcher. Briggs isn’t alone. He says a growing number of statisticians are concluding that frequentist statistics — those that draw conclusions from the frequency or proportion of data — should be kept out of the hands of nonexperts. Standard deviation, t-tests, p-values, and even regression lines are concepts Briggs would keep away from all but Ph.D. students.
Briggs’ argument for such a radical stance is that most nonexperts misapply these ideas and often use them to leap to bad conclusions. “The technical definition of a p-value is so difficult to remember that people just don’t keep it in mind. Even the Wikipedia page on p-value has a couple of small errors,” Briggs says. “People treat a p-value as a magical thing: If you get a p-value less than a magic number then your hypothesis is true. People don’t actually say it is 100% true, but they behave as though it is.”
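That technical definition is worth pinning down, since it is exactly what Briggs says people forget: a p-value is the probability, assuming the null hypothesis is true, of seeing data at least as extreme as what was actually observed. It is not the probability that a hypothesis is true. A minimal simulation (my own illustration, not an example from Briggs) makes the distinction concrete with a coin that really is fair:

```python
import random

random.seed(1)  # fixed seed so the simulation is reproducible

def p_value_coin(observed_heads, flips=100, trials=100_000):
    """Two-sided p-value, by simulation, for the null hypothesis
    'the coin is fair' after seeing `observed_heads` in `flips` tosses.

    We repeatedly flip a genuinely fair coin and count how often the
    result is at least as far from 50/50 as the observed one.
    """
    extreme = abs(observed_heads - flips / 2)
    count = 0
    for _ in range(trials):
        heads = sum(random.random() < 0.5 for _ in range(flips))
        if abs(heads - flips / 2) >= extreme:
            count += 1
    return count / trials

# 60 heads in 100 flips gives p ≈ 0.06 — hovering near the "magic"
# 0.05 threshold, even though the simulated coin is fair by construction.
print(p_value_coin(60))
```

The point of the sketch is the one Briggs makes: a number near the conventional cutoff says something about the data under an assumed model, not that any hypothesis is “true.”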
Briggs worries most about the false sense of security that nonstatisticians get from calculating hard numbers. Even peer-reviewed papers from cancer researchers, he says, have been known to draw questionable conclusions based on frequentist statistics. “P-values can be, and are, used to prove anything and everything. The sole limitation is the imagination of the researcher,” he says. “To the civilian, the small p-value says that statistical significance has been found, and this, in turn, says that his hypothesis is not just probable, but true.”
Briggs says that the better way of teaching probability concepts to nonexperts is with Bayesian statistics. The Bayesian approach is a bit like the way a sports bettor might decide where to put money on a given Pistons-Celtics game. Thoughtful gamblers would make a baseline prediction using the evidence at hand, such as the two teams’ win-loss records. They would then update the prediction based on new information or more detailed insights about each team as it became available.
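The bettor’s reasoning can be sketched in a few lines of code. This is my own illustration of the update step, not anything from Briggs, and it assumes the simplest possible model (a Beta-Binomial, where a team’s record stands in for prior evidence and new games update it):

```python
def update_belief(prior_wins, prior_losses, new_wins, new_losses):
    """Update a Beta(prior_wins, prior_losses) belief about a team's
    win probability after observing additional results.

    The Beta-Binomial update is just addition: new wins and losses
    are folded into the running tallies, and the expected win
    probability is the updated fraction of wins.
    """
    post_wins = prior_wins + new_wins
    post_losses = prior_losses + new_losses
    expected_win_prob = post_wins / (post_wins + post_losses)
    return post_wins, post_losses, expected_win_prob

# Baseline prediction from the evidence at hand: a hypothetical
# 30-20 win-loss record gives a 0.60 estimate.
a, b, p = update_belief(30, 20, 0, 0)
print(f"baseline estimate: {p:.2f}")

# New information arrives — say the team wins 4 of its next 5 games —
# and the belief shifts accordingly, to about 0.62.
a, b, p = update_belief(a, b, 4, 1)
print(f"updated estimate: {p:.2f}")
```

The records and numbers are invented; what matters is the shape of the process — start from a baseline, then revise it as evidence accumulates — which is exactly the bettor’s habit described above.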
Briggs admits this process sounds a lot mushier than figuring standard errors and binomial coefficients. “Bayesian statistics work, but they demand a lot more of the user. They don’t spit out one single, comforting number the way frequentist statistics do. But in a lot of real-life situations, there is no hard number there, particularly when the evidence and data are so complicated that they can’t be quantified,” he says.
Nevertheless, there is only a slim chance a Bayesian revolution will sweep through statistics classrooms. The problem is one of inertia. “Most statistics classes are taught by nonstatisticians. They can’t teach Bayesian statistics because a lot of them have never heard of it,” says Briggs. Even worse, “Peer-review journal editors still want to see p-values in the papers they publish.”