
The Impact of AI Errors on Health Care Use

Yves here. I firmly oppose the integration of AI into medicine, particularly in my own care, and I am grateful to be in a country where AI adoption has been gradual. IM Doc’s accounts of AI’s frequent blunders at something as straightforward as taking patient notes are alarming. These inaccuracies are often buried in technical jargon, making it nearly impossible for anyone other than the original physician to spot them.

I suspect many individuals have sub-clinical conditions that get dismissed as not significant enough to warrant attention. Yet even minor issues can compound over time. I once pointed out to the lead surgeon who operated on my hips that, while my orthopedic anomalies seemed inconsequential viewed individually, collectively they placed me well outside the standard range. He was taken aback and immediately agreed, indicating he had not previously considered my structural issues in that light.

Furthermore, I have long-standing endocrine test results that defy expectations, along with an inability to find pain relief through opiates. This has led me to distrust AI in addressing what my research has shown to be peculiarities in my health. I suspect a notable percentage of the population—ranging from 5% to even 15%—exhibits unusual traits that may not lend themselves well to AI-guided care.

This discussion touches on another significant drawback of AI treatments: inherent errors stemming from categorization issues. Many medical conditions share overlapping symptoms, such as scleroderma and dermatomyositis, both severe autoimmune diseases impacting the skin. Those of us who have been patients or practitioners can certainly recount instances of misdiagnosis where errors and delays considerably affected outcomes. A physician from earlier times might have been more responsive to a patient’s concerns when initial assessments of “no significant problem” began to seem incorrect, especially as the patient’s condition worsened.

The article suggests an ideal scenario in which a physician receives AI recommendations and then reviews them. However, this might turn out to be the worst possible arrangement. The Lancet reported that doctors using AI assistance during colonoscopies actually became less adept at identifying potentially dangerous polyps independently. Moreover, ongoing studies reveal that frequent use of applications like ChatGPT can alter brain activity, as recorded through EEGs. For instance, one study indicated that, “Brain connectivity systematically decreased with the amount of external support.”

By Carlos Gershenson, Professor of Innovation, Binghamton University, State University of New York. Originally published at The Conversation

In the last decade, the success of AI has generated unbridled enthusiasm and bold claims—even though users frequently experience various errors attributed to AI. For instance, an AI-powered digital assistant might misinterpret spoken language in humorous ways, while a chatbot could fabricate information, or, in my own experience, an AI navigation tool could even direct drivers through a cornfield without recognizing its mistakes.

Many users tolerate these errors because the technology often enhances efficiency in certain tasks. However, an increasing number of advocates are promoting the use of AI—sometimes with minimal human oversight—in high-stakes fields like healthcare. A bill introduced in the U.S. House of Representatives in early 2025 aims to allow AI systems to autonomously prescribe medications. Since its introduction, health researchers and legislators have vigorously debated the feasibility and advisability of such prescribing.

How such prescribing would function if this or similar legislation is enacted remains uncertain. However, it raises critical questions about the acceptable margin for errors in AI tools and the potential repercussions should these tools lead to adverse outcomes, including fatalities.

As a researcher specializing in complex systems, I examine the interactions of various system components that yield unpredictable results. My research seeks to investigate the boundaries of science, specifically regarding AI.

Over the past 25 years, I have contributed to projects on traffic light coordination, bureaucratic improvements, and tax evasion detection. Although these systems can often be highly effective, perfection is a distant goal.

Nobody – and Nothing, Not Even AI – Is Perfect

As Alan Turing, recognized as the father of computer science, once expressed, “If a machine is expected to be infallible, it cannot also be intelligent.” Learning, a fundamental component of intelligence, typically occurs through mistakes. I observe this tension between intelligence and infallibility recurring in my research.

In a study published in July 2025, my colleagues and I demonstrated that perfectly sorting certain datasets into distinct groups may be impossible. In essence, some datasets inherently carry a minimum level of error because their categories overlap. For numerous datasets—and classification is central to many AI systems—no algorithm, however sophisticated, can eliminate those errors.

For instance, a model trained on a dataset of thousands of dogs that records only their age, weight, and height could easily distinguish Chihuahuas from Great Danes. However, it would struggle to tell an Alaskan malamute from a Doberman pinscher, because individual dogs of the two breeds can share the same age, weight, and height.
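To make that concrete, here is a minimal sketch with invented breed statistics (not data from the study): two synthetic “breeds” whose weight and height distributions overlap, so even a reasonable classifier tops out well below perfect accuracy no matter how much data it is given.

```python
# Minimal sketch with made-up numbers, not data from the study: two
# synthetic "breeds" whose weight (kg) and height (cm) distributions overlap.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5000
breed_a = rng.normal(loc=[38, 63], scale=[6, 4], size=(n, 2))  # hypothetical
breed_b = rng.normal(loc=[40, 68], scale=[6, 4], size=(n, 2))  # hypothetical

X = np.vstack([breed_a, breed_b])
y = np.array([0] * n + [1] * n)

# Because the two clouds of points overlap, no model can reach 100% accuracy;
# the overlap itself sets a floor on the error rate.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")  # roughly 0.75 here
```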

How cleanly a dataset can be split into its categories is called its classifiability, and my students and I began studying it in 2021. Using data on more than half a million students at the Universidad Nacional Autónoma de México from 2008 to 2020, we set out to answer a seemingly straightforward question: Could an AI algorithm predict which students would graduate on time—within three, four, or five years, depending on their major?

We evaluated several well-known algorithms used for classification in AI and even created our own. No algorithm proved flawless; the top-performing ones—including one specifically designed for this purpose—achieved only an 80% accuracy rate, indicating that at least one in five students was misclassified. We discovered that many students were indistinguishable in terms of grades, age, gender, socioeconomic status, and other features—yet some would graduate on time while others would not. Under such circumstances, no algorithm could yield perfect predictions.
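For readers curious what such a benchmark looks like in practice, here is a rough sketch under assumptions of my own: the file name and columns are hypothetical placeholders, not the UNAM data, and the models are standard off-the-shelf classifiers compared by cross-validation.

```python
# Rough sketch of a classifier comparison; "students.csv" and its columns
# are hypothetical placeholders, not the dataset used in the study.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("students.csv")
X = df[["gpa", "age", "gender_code", "ses_index"]]   # assumed numeric features
y = df["graduated_on_time"]                          # 1 = on time, 0 = not

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200),
    "k-nearest neighbors": KNeighborsClassifier(),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {acc:.1%}")
# When many students share the same features but different outcomes,
# every model runs into the same ceiling (about 80% in our study).
```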

One might assume that more data would improve predictability, but the returns usually diminish: to gain just one more percentage point of accuracy, we might need 100 times more data. We would never have enough students to meaningfully boost our model’s performance.
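As a rough illustration of that scaling, assume the learning curve is logarithmic in the amount of data (an assumption made for the sake of the arithmetic, not a result from the study):

```python
# Illustrative arithmetic only: assume accuracy grows logarithmically with
# dataset size, a(n) = a0 + k * log10(n). The constants below are invented.
import math

a0, k = 0.70, 0.005            # hypothetical baseline accuracy and slope
n = 500_000                    # roughly the size of our student dataset

def accuracy(num_examples):
    return a0 + k * math.log10(num_examples)

target = accuracy(n) + 0.01    # one more percentage point of accuracy
# Invert a0 + k * log10(m) = target  =>  m = 10 ** ((target - a0) / k)
m = 10 ** ((target - a0) / k)
print(f"data needed: about {m / n:.0f}x the original")   # -> 100x
```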

Additionally, many unexpected life events—like unemployment, death, or pregnancy—might occur after students’ first year at university, impacting their chances of timely graduation. Consequently, even with an infinite student pool, our predictions would still yield errors.

The Limits of Prediction

In general terms, complexity is what constrains prediction. The term complexity originates from the Latin plexus, meaning intertwined. The components of a complex system are interconnected, and it is the interactions among those elements that shape their behavior and outcomes.

Therefore, examining system elements in isolation may lead to misleading conclusions about both those elements and the system as a whole.

Consider a car traveling through a city. In principle, you could extrapolate its position from its current speed, but in real traffic its speed depends on interactions with the surrounding vehicles. Because those interactions unfold in real time and cannot be anticipated, accurate predictions of the car’s path are possible only a few minutes into the future.


Not With My Health

The same principles apply to the prescribing of medications. Various conditions or diseases can present similar symptoms, and individuals with the same condition may display different symptoms. For example, a fever may stem from either a respiratory or a gastrointestinal illness. Moreover, a cold might lead to a cough, but not always.

This reality means that healthcare datasets are characterized by significant symptom overlap, which hinders the possibility of achieving error-free AI applications.

Certainly, humans make mistakes as well. But when AI misdiagnoses a patient—an inevitability—accountability falls into a legal gray area. Who is responsible if a patient is harmed: the pharmaceutical company, the software developer, the insurer, or the pharmacy?

In numerous situations, neither humans nor machines are the optimal solution for a specific task. “Centaurs,” or “hybrid intelligence”—the combination of human and machine capabilities—often outperform either party individually. A physician could leverage AI to suggest potential medications for various patients based on their medical history, physiological profiles, and genetic makeup. Researchers are actively exploring this concept within the realm of precision medicine.

However, common sense, along with the precautionary principle, indicates that it may be premature for AI to autonomously prescribe medications without human oversight. Given that mistakes may be inherent to the technology, it is reasonable to argue that human supervision is essential whenever patient health is on the line.
