Why the Type of Symptom Checker Healthcare Providers Offer Could Be a Life or Death Decision for Patients

By Jason Maude, CEO and Co-Founder of Isabel Healthcare

Over the past 2-3 years there has been a proliferation of symptom checkers, all claiming to give personalised results based on the patient’s symptoms, and many healthcare providers are now integrating symptom checkers into their digital patient platforms. However, whilst at first glance they seem quite similar, the interfaces mask big differences in what they attempt to do and how they work.

I purposely do not use the vague and rather meaningless term ‘Artificial Intelligence (AI)’ as it does not distinguish between the different types of AI and their differing capabilities.

Symptom checkers fall into two distinct camps: the first is essentially rules-based (e.g. chatbots) and the second is machine learning-based.

Here, I explain why machine learning must be the future of symptom checkers, and why this is not just some esoteric argument for AI enthusiasts, but could have life or death implications for patients.

The problem with relying on rules

‘Rules-based’ essentially means classical programming, where the developer defines the rules. Often, these tools will have a chatbot interface and, while the developers may call them ‘AI’, whatever that now means, the basic underpinning is still a rules-based system. For many applications being rules-based is not a problem, but a symptom checker must cope with the huge variety of ways in which patients present and the sheer number of diagnoses that exist. The inherent limitations of the technology raise a number of potentially dangerous issues for patients.

  1. The inaccuracy of a ‘probability-ranked list’ of diagnoses

The first concern is what rules-based symptom checkers are designed to do, which is to provide the patient with the diagnosis, or at least a list ranked by clinical probability. In effect, they are trying to replace or replicate the clinician’s role.

This claim to be able to produce a probability-ranked list stems from the fact that one of the key rules defined by the developer is the probability of a symptom being caused by a particular disease. There is little published evidence on these relationships, so instead developers employ teams of clinicians to provide their expert opinion. However, these clinicians won’t always have the same opinion, so each ‘probability’ could be a compromise between 3-4 experts. This may be fine for one symptom, but if a patient has multiple symptoms these probabilities are multiplied, thereby compounding the errors. In the end, this significantly degrades the tool’s accuracy.
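The compounding effect described above can be illustrated with a little arithmetic. This is only a sketch under stated assumptions: the 20% error per estimate is an invented figure, every estimate is assumed to be off in the same direction (a worst case), and the function name is hypothetical. It does not describe how any particular symptom checker combines its probabilities.

```python
def compounded_relative_error(per_estimate_error: float, n_symptoms: int) -> float:
    """Worst-case relative error of a product of n estimates.

    Assumes each estimate is too high by the same relative error, so the
    product is off by (1 + e) ** n - 1. Illustrative arithmetic only.
    """
    return (1.0 + per_estimate_error) ** n_symptoms - 1.0


# Hypothetical scenario: each expert-agreed 'probability' is 20% too high.
for n in (1, 2, 4, 6):
    error = compounded_relative_error(0.20, n)
    print(f"{n} symptom(s): product off by {error:.0%}")
```

Under these assumptions, an error that looks tolerable for a single symptom grows quickly: by four symptoms the multiplied figure is already more than double the true value.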

A rules-based symptom checker requires thousands of symptom-to-disease relationships to be manually defined. This is why most of them cover only a few hundred symptoms and common diagnoses. As there are over 10,000 diseases, there is a good chance that a patient with a less common disease will find that the system does not cover it, and may not even include their symptoms.

  2. The concept of a ‘most important symptom’

If you have tried any rules-based symptom checkers, you will notice that they often start with the seemingly innocuous question ‘what is your most important symptom?’, often asked by a ‘friendly’ chatbot. As a rules-based system is built using standardised decision trees (e.g. for headache, chest pain or abdominal pain), its first job is to determine which one to follow.

This is the first and most dangerous step they take. For example, if a patient has jaw pain, breathlessness, nausea and is sweating, how would they know which is the most important symptom? Nausea may well be top of their list as it’s the one that’s bothering them most. But unlike doctors, the patient isn’t equipped to recognise the combination of these symptoms as signs of an impending heart attack.

Once the patient has made this decision, the tool follows the decision tree for that symptom. This is where the chatbot comes into play, as it often needs to get the user to answer 30-50 questions, taking several minutes. Many patients do not complete this process, and the ones that do persevere under the assumption that more questions mean the final results will be more accurate. However, if you enter the same 3 or 4 symptoms into this type of symptom checker in a different order, you often get a different list of diagnoses!

The real danger here for the patient is that their whole clinical presentation has effectively been skewed, from the four symptoms they actually have to just the one they declared as their ‘most important’. Their real diagnosis could therefore be radically different to the ‘most likely’ one given. So the poor patient who selected nausea as their most important symptom may well not be directed to where they belong: Emergency Care.
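The order dependence described in this section can be shown with a deliberately simplified sketch. Everything in it is an assumption made for illustration: the tree contents are invented placeholders rather than clinical content, the names are hypothetical, and a real rules-based tool would ask many follow-up questions instead of returning a fixed list.

```python
from typing import Dict, List

# Toy decision trees, one per 'most important symptom'. The entries are
# invented placeholders for illustration only, not clinical content.
DECISION_TREES: Dict[str, List[str]] = {
    "nausea": ["stomach bug", "food poisoning"],
    "jaw pain": ["dental problem", "heart attack"],
}


def rules_based_suggestions(symptoms: List[str]) -> List[str]:
    """Follow only the tree for the first-declared symptom (assumed behaviour)."""
    if not symptoms:
        return []
    most_important = symptoms[0]
    # The remaining symptoms never influence which tree is chosen.
    return DECISION_TREES.get(most_important, [])


# The same four symptoms, declared in two different orders.
order_a = ["nausea", "jaw pain", "breathlessness", "sweating"]
order_b = ["jaw pain", "nausea", "breathlessness", "sweating"]

print(rules_based_suggestions(order_a))
print(rules_based_suggestions(order_b))
```

In this toy version the patient who leads with nausea never reaches the branch that mentions a heart attack, even though their symptoms are identical to those of the patient who leads with jaw pain.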

  3. The triage system

Finally, when the list of diagnoses appears, each diagnosis often has a label explaining whether it can be treated at home, should be dealt with by a GP or should be treated as an emergency. Again, this seems perfectly innocuous until you realise that the symptom checker’s final advice on where to attend actually depends entirely on the patient choosing the correct diagnosis from the list, effectively diagnosing themselves.

With all of these issues, the obvious question is why rules-based symptom checkers are used at all. This mainly boils down to the greater transparency of these systems: you can trace why the system reached the decision it did, which means that if it gave the wrong advice you could at least understand why. Another factor is the view that many patients can’t describe what is wrong with them very well, so may benefit from being taken through a series of questions to draw out the details. Whether or not doctors will actually look through 50 answers is another matter.

Why choose machine learning?

Although the technology is much more advanced, there are far fewer machine learning-based symptom checkers available. This is most likely because it takes many years to build, test and refine the database of medical information used to ‘train’ them. Most applications of machine learning in healthcare are therefore for analysing images, for example scans of specific tumours. These systems have been shown to be quite accurate. [1]

One fundamental difference between a machine learning symptom checker and a rules-based one is what it aims to provide. Rather than a single diagnosis or list of probability-ranked diagnoses, the machine learning tool aims to provide a list of possible conditions – a list of suggestions which the user can then research and consider. It is important to understand this key difference because, for the reasons discussed above, it is simply not safe for a symptom checker to try to tell a patient what diagnosis they are likely to have, as it may give a false sense of reassurance and lead to something serious being missed.

What makes machine learning symptom checkers safer to use?

They are trained using a vast medical database of disease presentations, described in natural language, extracted from various evidence-based resources

This captures the various ways that diseases present – both the typical and atypical. An advantage of this structure is that it’s more scalable, as the developer does not have to define the rules, so it can handle the more than 10,000 diseases that exist.

The system uses software that can understand the meaning and context of patients’ natural language

This both enables the user to enter their query in normal everyday language instead of medical terms, and enables the symptom checker to match the user’s query to the information in the database.

The system filters the matches based on the patient’s age, location and gender

This ensures the suggested conditions are as relevant as possible for the age, gender and location of the user.
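A minimal sketch of what such a filtering step might look like is given below. The data model is an assumption: the class, field names, default ranges and placeholder conditions are all hypothetical, and the source does not describe how the filter is actually implemented.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Candidate:
    """A matched condition with hypothetical applicability metadata."""
    name: str
    min_age: int = 0
    max_age: int = 150
    genders: Optional[FrozenSet[str]] = None  # None means no restriction
    regions: Optional[FrozenSet[str]] = None  # None means no restriction


def filter_candidates(candidates: List[Candidate], age: int,
                      gender: str, region: str) -> List[Candidate]:
    """Keep only candidates that are relevant for this user's demographics."""
    return [
        c for c in candidates
        if c.min_age <= age <= c.max_age
        and (c.genders is None or gender in c.genders)
        and (c.regions is None or region in c.regions)
    ]


# Placeholder conditions; names and restrictions are invented for the example.
matches = [
    Candidate("condition_a"),
    Candidate("condition_b", min_age=0, max_age=12),
    Candidate("condition_c", genders=frozenset({"female"})),
    Candidate("condition_d", regions=frozenset({"region_x"})),
]

relevant = filter_candidates(matches, age=40, gender="male", region="region_y")
print([c.name for c in relevant])
```

The design choice here is that filtering happens after matching, so the same database of presentations can serve every user while the suggestions shown stay relevant to the individual.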

Safer triage

The triage functionality of these tools is based on both the list of suggested conditions for the symptoms entered and the answers to a short list of questions about the onset, severity and duration of symptoms. The triage advice is therefore based on the user’s overall clinical picture rather than the single diagnosis they selected.
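The contrast between the two triage approaches can be sketched as follows. This is an illustration only: the source does not say how the suggested conditions and the onset, severity and duration answers are combined, so the function names, urgency levels and thresholds below are invented assumptions and are not clinical guidance.

```python
from enum import IntEnum
from typing import Dict


class Urgency(IntEnum):
    HOME = 0
    GP = 1
    EMERGENCY = 2


def triage_from_selected(levels: Dict[str, Urgency], selected: str) -> Urgency:
    """Advice depends entirely on the one diagnosis the patient picks."""
    return levels[selected]


def triage_from_overall_picture(levels: Dict[str, Urgency], sudden_onset: bool,
                                severity: int, duration_days: int) -> Urgency:
    """Combine the whole list of suggestions with the follow-up answers.

    The thresholds are hypothetical and chosen only to make the example run.
    """
    most_urgent = max(levels.values(), default=Urgency.HOME)
    concerning = sudden_onset or severity >= 7 or duration_days >= 14
    if most_urgent == Urgency.EMERGENCY and concerning:
        return Urgency.EMERGENCY
    if most_urgent >= Urgency.GP or concerning:
        return Urgency.GP
    return Urgency.HOME


# Placeholder suggestions for one set of symptoms.
suggested = {"condition_a": Urgency.HOME, "condition_b": Urgency.EMERGENCY}

print(triage_from_selected(suggested, "condition_a").name)
print(triage_from_overall_picture(suggested, sudden_onset=True,
                                  severity=8, duration_days=0).name)
```

In the first function a patient who picks the milder-looking condition is told to stay at home; in the second, the advice does not hinge on that choice.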

Not all symptom checkers are equal, so it is crucial that healthcare providers and patients understand their fundamental underpinnings and what they can and cannot do, particularly in terms of identifying unusual or serious conditions requiring urgent medical attention. It really could be a life or death decision.

References

[1] Nicola Davis. AI equal with human experts in medical diagnosis, study finds. The Guardian, 24 September 2019. https://www.theguardian.com/technology/2019/sep/24/ai-equal-with-human-experts-in-medical-diagnosis-study-finds [accessed October 2019]

About the author

Jason Maude is the CEO and Co-Founder of Isabel Healthcare. Isabel Healthcare is a UK-based company with an ethical mission to improve the quality of clinical decision making across the globe. It provides the Isabel Symptom Checker, which uses this type of sophisticated machine learning, and is based on its clinical reasoning tool for healthcare professionals (the Isabel DDx Generator). The symptom checker is designed to empower patients to research possible causes of their symptoms so they can have a more productive conversation with their healthcare professional, and work with them towards the right diagnosis as quickly as possible.