Bad Science | The Z Blog

Even the most disconnected people know the gag about science announcing something and then the next day announcing the opposite. This is most common with food and diet, where every day brings a new scare. If you follow the soft sciences, then you know that most of what passes for academic research in some fields is complete nonsense that is easily refuted. This story in the New York Times goes into detail about the origin of what we have come to call junk science.

My first Raw Data column, published in January, was about the controversy over irreproducibility — experiments whose outcomes cannot be verified independently by another lab. Featured in the piece was a study by Dr. John P.A. Ioannidis that has been a source of contention since it appeared in 2005. It was called “Why Most Published Research Findings Are False.”

All scientific results are, of course, subject to revision and refutation by later experiments. The problem comes when these replications don’t occur and the information keeps spreading unchecked.

Dr. Ioannidis’s analysis took into account several factors — things like noisy data, a small sample size or relatively lenient standards for deciding if a finding is statistically significant. His model could be applied to any area of science that met his criteria. But most attention to the reproducibility problem has been in the life sciences, particularly in medical laboratory research and epidemiology. Based on the number of papers in major journals, Dr. Ioannidis estimates that the field accounts for some 50 percent of published research.
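To see how those factors interact, here is a rough sketch of the simplest form of the positive predictive value calculation from the 2005 paper, leaving out its bias term. The function name, parameter names and default values are my own illustrative choices, not anything from the Times column.

```python
def positive_predictive_value(prior_odds, alpha=0.05, power=0.8):
    """Chance that a 'statistically significant' finding is actually true.

    prior_odds: ratio of true to false hypotheses being tested in a field
    alpha:      significance threshold (false positive rate)
    power:      probability of detecting a real effect; small samples
                and noisy data push this down
    Assumption: no bias term, one study per hypothesis.
    """
    true_positives = power * prior_odds
    false_positives = alpha
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Well-powered study of a plausible hypothesis (1:1 odds).
    print(positive_predictive_value(1.0))
    # Underpowered study of a long shot (1:10 odds, 20 percent power).
    print(positive_predictive_value(0.1, power=0.2))
    # Same long shot with a more lenient significance threshold.
    print(positive_predictive_value(0.1, alpha=0.10, power=0.2))
```

Under these assumptions, a small, noisy study of an unlikely hypothesis is more likely to produce a false finding than a true one, and loosening the significance threshold makes it worse.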

The small sample size is a favorite of the health rackets. I wish I had saved it, but my all-time favorite was a study on milk using eight Norwegian dairy farmers. I forget the details, but it had something to do with heart disease and dairy consumption. The executive summary made the claim that dairy causes heart disease. They assumed the nitwit reporters would not bother to read the study.

Another area of concern has been the social sciences, including psychology, which make up about 25 percent of publications. Together that constitutes most of scientific research. The remaining slice is physical science — everything from geology and climatology to cosmology and particle physics. These fields have not received the same kind of scrutiny as the others. Is that because they are less prone to the problems Dr. Ioannidis described?

Faye Flam, a science writer with a degree in geophysics, made that argument in a critique of my column in Knight Journalism Tracker, and I responded on my own blog, Fire in the Mind. Since then I’ve been thinking more about the matter, and I asked Dr. Ioannidis for his view.

“Physical sciences have a stronger tradition of some solid practices that improve reproducibility,” he replied in an email. Collaborative research, for example, is customary in physics, including large consortiums of experimenters like the teams that announced the discovery of the Higgs particle. “This certainly increases the transparency, reliability and cross-checking of proposed research findings,” he wrote.

He also mentioned more stringent statistical standards in particle physics — like the five sigma measure I mentioned in my second column — as well as sociological factors: “There seems to be a higher community standard for ‘shaming’ reputations if people step out and make claims that are subsequently refuted.” Cold fusion was a notorious example. He also saw less of an aversion to publishing negative experimental results — that is, failed replications.
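For a sense of how much stricter five sigma is than the usual threshold, here is a quick sketch that converts a sigma level into a tail probability. It assumes a standard normal distribution; the function names are mine, chosen for illustration.

```python
import math


def one_sided_p(sigma):
    """Probability of a standard normal value at least `sigma` above the mean."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))


def two_sided_p(sigma):
    """Probability of a standard normal value at least `sigma` from the mean, either side."""
    return math.erfc(sigma / math.sqrt(2))


if __name__ == "__main__":
    # The common p < 0.05 cutoff sits near 1.96 sigma (two-sided).
    print(two_sided_p(1.96))
    # The particle physics discovery standard.
    print(one_sided_p(5.0))
```

On that assumption, a five sigma result corresponds to a one-sided chance of roughly 1 in 3.5 million of being a fluke, against 1 in 20 for the threshold common elsewhere.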

Another factor, as Ms. Flam suggests, is how constrained a field is in generating plausible hypotheses to test. Almost anything might be suspected of causing cancer, but physicists are unlikely to propose conjectures that violate quantum mechanics or general relativity. But I’m not sure the difference is always that stark. Here is how I put it in my blog post:

“What about the delicate and exquisitely controlled experiments that occur in laboratories? Are hypotheses involving intracellular enzyme pathways and the effects of microRNA on protein regulation so much less constrained than, say, solid-state physics and materials science?”

Everyone is being polite here. The difference is that social science is not science. Physics and chemistry are science. Science relies on math to validate itself. Long before humans walked the earth, arithmetic was true. Two plus two was true at the dawn of time and will be true into the future. Social sciences rely on statistics, which they use to calculate probabilities. There’s nothing wrong with that, but it is not science.

You can calculate the probability of getting blackjack on the next deal. You cannot prove you will get blackjack. When you get into areas with loads of hard-to-quantify variables, statistics loses much of its value. The response to that is the creation of simplified models and studies that have no connection to reality. The result provides statistically useful results, but the difference between statistically useful and practically useful is often so large as to make them mutually exclusive.
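To put a number on the blackjack example, here is a small sketch. It assumes a freshly shuffled shoe of standard 52-card decks and counts only a two-card natural (an ace plus any ten-valued card); the function name and the `decks` parameter are my own.

```python
from fractions import Fraction
from math import comb


def blackjack_probability(decks=1):
    """Exact chance that the first two cards dealt are an ace and a ten-valued card."""
    aces = 4 * decks
    tens = 16 * decks          # tens, jacks, queens and kings
    total_cards = 52 * decks
    favorable_pairs = aces * tens
    all_pairs = comb(total_cards, 2)
    return Fraction(favorable_pairs, all_pairs)


if __name__ == "__main__":
    p = blackjack_probability()
    print(p, float(p))
```

With one deck this works out to 32/663, a little under 5 percent. The arithmetic is exact, but it only tells you how often to expect blackjack over many deals, not what the next hand will be.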

Dr. Ioannidis said he was struck by an “arrogant dismissal” by some physical scientists of the suggestion that their field might be anything less than pristine. We won’t really know, he said, unless there are empirical studies like the recent ones in medical science.

“I have no doubt that false positives occur in all of these fields,” he concluded, “and occasionally they may be a major problem.”

I’ll be looking further into this matter for a future column and would welcome comments from scientists about the situation in their own domain.

Science is not immune from mischief. This story from last week shows how broken the peer review system is these days. Peer review is intended to weed out the junk science from the legitimate science. Experts in the field review your work, critique your methods, challenge your assumptions and look at your data. If computer-generated gibberish is passing through the system, it means no one is looking at this stuff. Peer review is useless if there are no peers and no review.