Every conversation with a patient is an exercise in the analysis of “big data.” The patient’s appearance, changes in mood and expression, and eye contact are data points. The illness narrative is rich in semiotics: pacing, timing, nuances of speech, and dialect are influenced by context, background, and insight, which in turn reflect religion, education, literacy, numeracy, life experiences, and peer input. All this is tempered by personal philosophy and by personality traits such as recalcitrance, resilience, and tolerance. Taking a history, by itself, generates a wealth of data, but that is just the start.
Add into the mix physical findings of variable reliability, laboratory markers of variable specificity, and imaging bits and bytes, and you have “big data.” Then you mine these data for the relative probabilities of the potential causes of a complaint, and on that basis you begin to weigh the numerous options for care. So armed, the physician next needs to factor in the benefits and harms of multiple treatments, derived from study populations that never perfectly reflect the situation of the individual in the chair next to us, our patient. This is the information necessary to empower our patient to make rational choices from the menu of options. That is clinical medicine. That is what we do many times a day, to the best of our ability and to the limits of our stamina.
Take that, Watson. You need a lot more
than 90 servers and megawatts of electricity to manage our bedside
rounds. You need to contend with the gloriously complicated and
idiosyncratic fabric of human existence. Poets might be a match, but
Watson is not.
Watson is doomed not just by its limited technical capacity compared with our cognitive birthright. Even if
Watson could grow its server brain to match ours, it won’t be able to
find measurable quantities for the independent variables captured during a patient encounter, nor measure the role of the personal values that temper that
patient’s choice. Life does not have independent and dependent
variables; the things that matter to us are on both sides of a
regression model. Watson would need rules to work around this statistical tangle, and there are none that generalize. Somehow, our brains have a measuring
instrument that no data query can find or measure and that we innately
understand but can’t fully communicate. Our brains also seem to understand statistics intuitively; they know that the variations around the regression lines (the residuals) mean more to us than the models
themselves. Sure, if there is something discrete to know, a simple,
measurable deterministic item, or an answer to a game show question,
Watson will kick most, and maybe all, of our butts. But what if what is important to us is neither deterministic nor discrete? What if life is
more importantly measured in “when” than “if”? And what if the “when,
and how we feel about the when” are intertwined? What if medical life is
not even measured in outcomes but, instead, in relationships that foster
peaceful moments? In this reality, Watson will be lost.
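To make the residuals point concrete, here is a minimal sketch in Python, with simulated numbers of our own invention (the variable names and figures are illustrative assumptions, not data from any study): a regression recovers the population trend cleanly, while the spread of the residuals shows how far any one patient can sit from that line.

import numpy as np

# Hypothetical, simulated data: a predictor and an outcome that share a
# real population-level trend but have large person-to-person scatter.
rng = np.random.default_rng(0)
n = 200
predictor = rng.uniform(0, 10, n)
outcome = 2.0 * predictor + rng.normal(0, 8, n)

# Ordinary least-squares fit of the population trend (degree-1 polynomial).
slope, intercept = np.polyfit(predictor, outcome, 1)
residuals = outcome - (slope * predictor + intercept)

print(f"population slope: {slope:.2f}")       # the tidy, model-level answer
print(f"residual SD: {residuals.std():.2f}")  # the individual-level scatter

The slope is the kind of clean, population-level answer a machine can report; the residual spread is what the individual patient actually lives with, and no number of servers shrinks it.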
Watson is doomed on yet another level, beyond the dearth of “code friendly,” meaningful measures of humanity. It is doomed precisely because it is capable of reading the “World’s Literature.” Our desire to improve the care of individuals is being buried in reams of codependent, biased, unrestricted, marketed, poorly studied information riddled with false-positive and false-negative associations, information that sees the light of Watson’s day because Watson can read every report published in the nearly 20,000 biomedical journals. A “60 Minutes”
report on AI reveled in Watson’s prowess at searching the literature. We can’t substantiate one particular claim quoted in the report, that 8,000 research reports are published daily, and we bet the person quoted can’t either. But that is Watson’s problem. Watson fails to
recognize that it is more important to know what we should not read than to be able to read it all. There is just too much dubious information being foisted on unsuspecting readers, whether the readers have eyes or algorithms.
Science is the glue that holds medical
care together, but it is far from a perfect adhesive. We have both served
long tenures on the editorial boards of leading general and specialty
clinical journals. We have many an anecdote about the rocky relationship
between medical care and the science that informs it. An anecdote from
Dr. McNutt serves as a particularly disconcerting object lesson. He
commented on a paper submitted for publication, a paper that he argued should be rejected because it was a Phase 2 study. The study was
not fatally flawed by design, just premature, as many Phase 2 studies
fail to be replicated after better-designed Phase 3 studies are
performed. Science is about accuracy and redundancy and timelessness and
process, not expediency. Despite his arguments, the paper was published and became highly cited. Sure enough, a better-designed Phase 3 study rejected the hypothesis supported by the Phase 2 study, vindicating Dr. McNutt on this occasion. But that is not the point. The point is that
Watson knows of both studies. You need to know only one of them. How did
Watson handle the irreproducible nature of the studies and their
contrary insights? One might wonder if the negative study was cited as
often as the positive, premature study. Watson would know.
Are we being too tough on AI? We are not
writing about Watson’s specific program but, instead, using it as a
metaphor for big data analytics and messy regression models. It is not
clear whether Watson has been tested in a range of clinical situations where inherent uncertainty prevails. No pertinent randomized trials turn up when “Watson artificial intelligence” is entered into PubMed. There are attempts to match
patients to clinical studies, but no outcome studies. This is important
since that 60 Minutes episode told of a patient who was treated after a “recommendation” from Watson. We assume that the treatment met ethical standards for a Phase 1 study and that the patient was fully informed. We are left to assume, also, that the information found by AI was reliable and adequately tested. After all, this Watson-compliant yet unfortunate patient succumbed to an “infection” several months after receiving the treatment. We
worry about the validity of the information spewed by the algorithm and
how on earth the researchers planned to learn anything about the
efficacy of the proposed intervention from treating their patient.
Science requires universal aims and adequate comparisons. In our view,
any AI solution for any patient should be subjected to stringent,
publicly available scientific testing. AI, to us, is in dire need of
Phase 1 testing.
Science can be better. Watson will not advance science; scientific inquiry will. Better designs for clinical
care and insights from scientific data need to be developed
and implemented. We do not need massive amounts of data, just small
amounts gathered in thoughtfully planned studies. And with better
science, we will not need AI. Instead of banking, or breaking the bank,
on AI, we should use our remarkable brains to learn by rigorous
scientific inquiry and introduce valid scientific insights into the “big
data” dialogue we call the patient’s “history” and do so in the service
of what we call “patient care.” Watson and other systems may do a wonderful job of determining what books we buy, and, from a medical perspective, they might be able to pick a particular antibiotic for a known infection, given the deterministic nature of that task. But treating infection, as an example, is the small-data part of what we do; we help sick people, and for that big-data task Watson will, in our view, not be sufficiently insightful.