Bad Data In, Bad Data Out (BDI, BDO)
It's elementary, my dear Watson!
Attributed to Sherlock Holmes
As I work on my new book, Noise!! Until the 2016 Election Results Are
Known, The Health Debate Is Mostly Noise, two articles came to
my attention:
One, “The Big Data Future Has Arrived” (WSJ, February 24), by Michael
S. Malone, a prolific writer on social
issues, in which he argues that powerful computers, ubiquitous sensors, and the Web will transform our lives by
making the connections between man and
machine more personal, productive, and empowering.
Two, “Will Feeding Watson $3 Billion Worth of
Healthcare Data Improve Its Decisions?” by
Ross Koppel, PhD, and Frank Meissner, MD, in The Health Care Blog, February 24.
Koppel is a senior fellow at the Leonard Davis School of Economics
(Wharton) and Meissner is a cardiologist
in El Paso, Texas. The two ask whether IBM’s purchase of
Truven Health Analytics, along with payer and patient
data from the Cleveland Clinic’s Explorys
and from Phytel, a software company, will
improve health care.
Watson, IBM’s
computer system, is designed to analyze
health care data, creating artificial intelligence that supplements human intelligence and improves health care
outcomes.
The two authors’ central questions are:
Will flawed data from payers,
physicians, and patients,
each inaccurate and biased in its own way, be a misleading or a wise guide to future
care?
Will this data, in their words, produce “digital flatulence” or “digital
decisiveness”?
Will Watson’s $3 billion diet of
undigested data produce more noise than knowledge?
Their answer,
like the data, is ambiguous, because:
Medicine is an art rather than a science,
and data collection is full of ambiguities.
Because clinicians are often rushed and confronted
with time constraints, unfriendly
EHR interfaces, and a byzantine list of
68,000 codes to pick from, the EHR output
is imprecise and flawed.
Hospitals
often enter EHR data calculated to maximize DRG revenues, so that data is
biased towards procedures, such as
expensive cardiac workups, even though the diagnosis of coronary artery disease
is, at best, ambiguous. “It’s rare,” they
say, “to find a patient admitted to a hospital with chest pain who is not
admitted as anything other than Acute Coronary Syndrome (ACS), rather than a
less expensive diagnosis.”
Patients
are often not forthcoming about their lives for reasons of embarrassment or privacy
concerns. Patients also have understandable, primarily economic
reasons to deceive about their health insurance. They may be using the name of
a friend or relative who has health insurance; they may have a spouse’s or
ex-spouse’s insurance; or they may wish not to have certain procedures or
conditions shown on their insurance records.
As I write in my book,
there are other confusing noise pollution
factors as well:
· The Noise over the clash of cultures between health care
proponents and followers.
· The Noise between President Obama’s ideology and its economic
consequences.
· The Noise of physician
demoralization, shortages, and passive
resistance.
· The Noise of middle class discontent over broken promises of
lower costs and keeping your doctor and health plan.
· The Noise over Negative Forecasts predicting ObamaCare repeal.
Because of these multiple sources of confusing background
Noise, and because of the bias inherent in the
payer, physician, and patient sources of data, Watson’s quest for enlightenment through sheer
data may be a case of BDI, BDO (Bad Data In, Bad Data Out).
Still, as Koppel
and Meissner argue, IBM’s quest for the data
Holy Grail is worth a try. Why not give it a shot?