How to understand medical decisions

Blogs & discussion

Welcome to my blog



By HUW LLEWELYN, May 29 2021 11:08PM

The concepts of 'sensitivity' and 'specificity' were used for radar during World War II. If the 'receiver-operator' turned the detection knob one way, it increased sensitivity but decreased specificity, so that unwanted objects (e.g. sea birds) tended to be detected. If it was turned the other way, it increased specificity and reduced the risk of detecting unwanted objects, but it decreased sensitivity, with the risk of missing smaller enemy aircraft.

It was assumed by some that this is how diagnosis works too. The 'sensitivity' of a finding with respect to a diagnosis is the frequency with which that finding occurs in those with the diagnosis, e.g. the frequency of localised right lower quadrant (LRLQ) pain in patients with appendicitis (say 50/100). However, the 'specificity' is the frequency with which the finding is absent in those WITHOUT the diagnosis. In turn, the false positive rate is '1 minus the specificity': the frequency with which a finding, e.g. LRLQ pain, occurs in people WITHOUT appendicitis. It is also assumed that if a finding occurs equally frequently in those with and without a diagnosis, then it is useless when used alone or in combination with other findings.
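The definitions above can be put into a minimal Python sketch. The function names are my own, and the counts for those without appendicitis are illustrative assumptions (only the 50/100 for those with appendicitis is taken from the text).

```python
def sensitivity(with_dx_finding_present, with_dx_finding_absent):
    """Frequency of the finding among those WITH the diagnosis."""
    return with_dx_finding_present / (with_dx_finding_present + with_dx_finding_absent)


def specificity(without_dx_finding_absent, without_dx_finding_present):
    """Frequency of ABSENCE of the finding among those WITHOUT the diagnosis."""
    total = without_dx_finding_absent + without_dx_finding_present
    return without_dx_finding_absent / total


def false_positive_rate(without_dx_finding_absent, without_dx_finding_present):
    """'1 minus the specificity': frequency of the finding among those WITHOUT the diagnosis."""
    return 1 - specificity(without_dx_finding_absent, without_dx_finding_present)


# LRLQ pain in 50 of 100 patients with appendicitis (from the text)
sens = sensitivity(50, 50)
# Assumed for illustration: LRLQ pain in 50 of 100 people without appendicitis
spec = specificity(50, 50)
fpr = false_positive_rate(50, 50)
likelihood_ratio = sens / fpr

print(sens, spec, fpr, likelihood_ratio)
```

With these counts the finding occurs equally often with and without the diagnosis, so the likelihood ratio is 1.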

There is a big problem with 'specificity' and the 'false positive rate' (i.e. '1 minus the specificity'). It is the issue of who we should regard as 'those without a diagnosis', e.g. who should we regard as those 'without appendicitis'? Is it those in a ward, in a whole hospital or in the whole community? In other words, these values depend on the population in which 'those without' the diagnosis or finding were counted. There is also another problem in that 'those without a diagnosis' will include patients with other diagnoses. To understand this, look at Figure 1 below and then read on.
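A small sketch may make the population problem concrete. All the counts below are hypothetical: the same 50 people with another cause of LRLQ pain are counted first among 100 people without appendicitis on a ward, and then among 10,000 people without appendicitis in a community.

```python
def specificity_in(groups_without_diagnosis):
    """Specificity in a population WITHOUT the diagnosis.

    groups_without_diagnosis: list of (number_of_people, fraction_with_finding).
    """
    total = sum(n for n, _ in groups_without_diagnosis)
    finding_absent = sum(n * (1 - frac) for n, frac in groups_without_diagnosis)
    return finding_absent / total


# Hypothetical ward: 50 people who all have the finding, 50 who never have it
ward = [(50, 1.0), (50, 0.0)]
# Hypothetical community: the same 50 people, diluted by 9,950 without the finding
community = [(50, 1.0), (9950, 0.0)]

spec_ward = specificity_in(ward)
spec_community = specificity_in(community)

print(spec_ward, spec_community)
```

Nothing about the finding or the patients has changed between the two calculations; only the choice of who counts as 'those without the diagnosis' differs, yet the specificity moves from 0.5 to about 0.995.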

If localised right lower quadrant (LRLQ) pain occurs in 50% of those with appendicitis and 50% of those without appendicitis, the likelihood ratio is 1 and it seems unhelpful. Similarly, if guarding occurs in 50% of those with appendicitis and 50% of those without appendicitis, the likelihood ratio is 1 and guarding also seems to be unhelpful. These two findings will also seem unhelpful if used in combination as the combined likelihood ratio assuming statistical independence is 1 x 1 = 1.

However, assume that 50% of those without appendicitis had ‘non-specific abdominal pain’ (NSAP) and all these patients with NSAP had LRLQ pain, the others without appendicitis or NSAP never having LRLQ pain. Also assume that of those without appendicitis who had NSAP, none had guarding (see Figure 1).

This means that if a patient has LRLQ pain, he or she must have appendicitis or NSAP. If the patient also has guarding then, as this never occurs in NSAP but often occurs in appendicitis, the diagnosis must be appendicitis. So despite all the likelihood ratios being 1 (and the findings apparently being useless), the combination of LRLQ pain and guarding predicts appendicitis with certainty (showing that they are very useful indeed). This is how reasoning by elimination with LRLQ pain and guarding works. It is an important method of reasoning in medicine and is different to applying Bayes rule, sensitivities and specificities.
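The example can be checked with a short Python sketch. The group sizes (100 each of appendicitis, NSAP and 'other', so that NSAP makes up 50% of those without appendicitis) are my assumption, as is independence of the two findings within each group. The 100% frequency of guarding in the 'other' group is not stated in the text but follows from guarding occurring in 50% of those without appendicitis and never in NSAP.

```python
# Frequencies of each finding within each group (group sizes are assumed)
groups = {
    "appendicitis": {"n": 100, "lrlq": 0.5, "guarding": 0.5},
    "NSAP":         {"n": 100, "lrlq": 1.0, "guarding": 0.0},
    "other":        {"n": 100, "lrlq": 0.0, "guarding": 1.0},
}


def freq_without_appendicitis(finding):
    """Frequency of a finding pooled over everyone WITHOUT appendicitis."""
    rest = [g for name, g in groups.items() if name != "appendicitis"]
    return sum(g["n"] * g[finding] for g in rest) / sum(g["n"] for g in rest)


# Conventional likelihood ratios: appendicitis versus 'not appendicitis'
lr_lrlq = groups["appendicitis"]["lrlq"] / freq_without_appendicitis("lrlq")
lr_guarding = groups["appendicitis"]["guarding"] / freq_without_appendicitis("guarding")
naive_combined_lr = lr_lrlq * lr_guarding  # assumes statistical independence

# Reasoning by elimination: expected number with BOTH findings in each group
both = {name: g["n"] * g["lrlq"] * g["guarding"] for name, g in groups.items()}
prob_appendicitis_given_both = both["appendicitis"] / sum(both.values())

print(lr_lrlq, lr_guarding, naive_combined_lr)
print(both, prob_appendicitis_given_both)
```

Both likelihood ratios and their product come out as 1, yet every patient with both findings is in the appendicitis group, because each finding eliminates one of the two alternative groups.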

This is a serious issue because 'sensitivity' and 'specificity' currently play a central role in deciding whether the results of new tests are going to be useful for diagnosis and therefore whether use of that test should be allowed. However, from this example, it can be seen that 'specificity' is unreliable and that in reasoning by elimination, only the 'sensitivities' (and false negative rates) are used: the frequency of patients with a positive or negative finding in those with a specified diagnosis (e.g. the frequency of guarding in those with appendicitis (50%) and NSAP (0%)).

Figure 1

By HUW LLEWELYN, May 28 2021 10:21PM

Tests are usually assessed by estimating their sensitivities and specificities. These epidemiological indices give an indication of how well a single test will perform during population screening. The discriminating power of a combination of test results is also estimated with these epidemiological indices by assuming statistical independence between the likelihoods of the individual findings in the combination occurring in the presence and in the absence of the single diagnosis. (This assumption of statistical independence is usually false and leads to over-estimation of discriminating power.)

The resulting likelihood ratio based on statistical independence is applied to an individual by combining it with the subjective prior probability of a single diagnosis for that individual. Despite a tendency to overestimate probabilities, this Bayesian approach is regarded by many as the standard way in which diagnostic test results should be used to arrive at diagnostic probabilities. However, this tendency to overestimate probabilities might be reduced by considering more than one diagnostic possibility and then normalising the probabilities by ensuring that they add up to one, thus reducing the risk of confirmation bias and over-diagnosis.
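As a minimal sketch of the Bayesian calculation described above, the odds form of Bayes rule can be written as follows. The prior of 0.2 and the likelihood ratios of 4 are made-up numbers, and the function name is my own.

```python
def posterior_probability(prior, likelihood_ratios):
    """Combine a prior probability with likelihood ratios using the odds form of Bayes rule.

    Multiplying the ratios together assumes statistical independence
    between the findings, which is usually false in practice.
    """
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)


prior = 0.2  # hypothetical subjective prior probability of the single diagnosis
after_one_finding = posterior_probability(prior, [4.0])
after_two_findings = posterior_probability(prior, [4.0, 4.0])

print(after_one_finding, after_two_findings)
```

If the two findings are in fact correlated, the second one adds less information than its likelihood ratio suggests, which is one way in which the independence assumption can overestimate the final probability.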

The differential diagnostic process

During clinical practice, a patient’s presenting complaint (or screening test result) is usually interpreted by considering a list of possible diagnoses, each with an estimated probability based on past experience. Another finding is then looked for that occurs commonly in one or more of these possibilities but less commonly in one or more of the others. When the result of such an investigation becomes known, the probabilities of each of the diagnoses in the original list are updated. Alternatively, if a new test result has a shorter list of possible causes (another measure of a ‘good’ test), that list could be considered instead.

In this differential diagnostic setting, the calculations are based on the ratios of likelihood between pairs of diagnoses (analogous to ‘Bayes factors’ when testing stochastic hypotheses). This reasoning process with multiple diagnoses is based on a derivation of the extended form of Bayes rule and a dependence assumption (see Chapter 13 of the Oxford Handbook of Clinical Diagnosis - accessed via this link). As more evidence becomes available and some diagnostic possibilities become improbable, only a few (or one) in the original list may remain probable. The diagnostician will then try to confirm one of these probable diagnoses by demonstrating the presence of one of its 'sufficient' diagnostic criteria.
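The updating of a list of diagnoses can be sketched generically in Python. This is a plain application of the extended form of Bayes rule with normalisation, not the derivation with a dependence assumption given in the Handbook, and all the probabilities below are hypothetical.

```python
def update_differential(priors, finding_frequencies):
    """Update the probability of each diagnosis in a list after one new finding.

    priors: probability of each diagnosis before the finding (summing to 1).
    finding_frequencies: frequency of the finding in patients with each diagnosis.
    """
    unnormalised = {dx: priors[dx] * finding_frequencies[dx] for dx in priors}
    total = sum(unnormalised.values())
    return {dx: value / total for dx, value in unnormalised.items()}


# Hypothetical differential diagnosis and finding frequencies
priors = {"diagnosis A": 0.4, "diagnosis B": 0.4, "diagnosis C": 0.2}
finding_frequencies = {"diagnosis A": 0.5, "diagnosis B": 0.1, "diagnosis C": 0.2}

posteriors = update_differential(priors, finding_frequencies)

# The ratio between a pair of diagnoses changes by their ratio of likelihoods
ratio_a_to_b = posteriors["diagnosis A"] / posteriors["diagnosis B"]

print(posteriors, ratio_a_to_b)
```

Only the frequency of the finding within each named diagnosis is used; no pooled group of 'those without the diagnosis' is needed, and the normalised probabilities always add up to one.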

Diagnostic criteria

A diagnostic criterion may be ‘sufficient’, ‘necessary’ or both, the latter being ‘definitive’. A sufficient criterion (e.g. a positive PCR) by convention implies that the diagnosis is confirmed. A necessary criterion implies that if its finding is absent then the diagnosis cannot be confirmed. In practice there will be many sufficient criteria that provide a choice of how the diagnosis is confirmed. Rarely, if ever, can a single test’s result or even a combination be definitive (i.e. both sufficient and necessary). However, necessary criteria can be constructed in a circular way from all the recognised sufficient criteria so that absence of all the sufficient criteria excludes the diagnosis. Although use of the diagnosis is excluded, this does not exclude the possibility that the underlying disease is present.
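The logic of sufficient criteria, and of the necessary criterion constructed from them, can be sketched as set operations. The second criterion below uses placeholder finding names and is purely hypothetical.

```python
# Each sufficient criterion is a set of findings that must ALL be present
sufficient_criteria = [
    frozenset({"positive PCR"}),            # example mentioned in the text
    frozenset({"finding X", "finding Y"}),  # hypothetical combination
]


def diagnosis_confirmed(findings):
    """True if at least one sufficient criterion is fully present."""
    return any(criterion <= findings for criterion in sufficient_criteria)


def diagnosis_excluded(findings):
    """The constructed necessary criterion: no sufficient criterion is met."""
    return not diagnosis_confirmed(findings)


patient_1 = {"positive PCR", "fever"}
patient_2 = {"finding X"}  # only part of a sufficient criterion

print(diagnosis_confirmed(patient_1), diagnosis_excluded(patient_2))
```

Because exclusion is defined as the absence of every recognised sufficient criterion, the two functions are exact complements. This is what makes the combined rule definitive for use of the diagnosis, although it says nothing about whether the underlying disease is present.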

The relationship between RCTs and diagnostic criteria

The purpose of a diagnosis is to suggest actions to help the patient (such as giving advice about what is going to happen with or without intervention). If an intervention has to be justified with the result of a randomised controlled trial (RCT), then the entry criteria for that trial, generated from biomedical hypotheses, have to be present for the intervention to be offered. In order to ensure that all those with the diagnosis will be offered the treatment, the entry criterion for the RCT could be used as one of the sufficient criteria for the diagnosis. Alternatively it should be ensured that those showing the entry criterion for the RCT are a subset of those with a sufficient criterion for the diagnosis. In some cases, the entry criteria for an RCT may exclude patients in danger of mild adverse effects (e.g. the elderly with co-morbidities). If the danger from an illness exceeded that from adverse effects (e.g. during decision analysis), then the trial exclusion factor would not be applicable. The sufficient criteria for a diagnosis would therefore need to be widened so as not to exclude those who might benefit from its suggested treatments.

Comparing different tests for use as diagnostic criteria

In order to assess the usefulness of tests as diagnostic criteria, the effect on trial outcomes of using different tests or different test result ranges could be compared. This could mean having to repeat RCTs when the efficacy of a treatment has already been established. However, an alternative approach could be used by initially randomising subjects to different tests instead of to treatment and control (see the preprint accessed by this link).

By HUW LLEWELYN, May 10 2021 09:13PM

It is important to remember that it is not possible to identify all those and only those with a disease. The best that can be done is to assume or postulate that the disease is present based on some agreed criterion - this is what a diagnosis means. A failure to make this distinction between disease and diagnosis and to agree diagnostic criteria will lead to endless confusion about diagnostic tests (as in the case of Covid-19 when trying to assess the sensitivity and specificity of a test result without agreed diagnostic criteria). It is only if we have clear diagnostic criteria that we can estimate probabilities of 'diagnoses' from other symptoms, signs and test results.

Sufficient and necessary criteria

A ‘sufficient criterion’ is a finding or combination of findings that justifies adopting a diagnosis as a hypothesis and acting on it so that it becomes a working diagnosis. There may be many such sufficient criteria but they may not account for all patients with the disease (as in Covid-19). From biological theories, they may be more likely to cover the severe forms of the disease or at least when symptoms are more prominent. The very early or mild forms cause more difficulty. New sufficient diagnostic criteria may be created all the time to cover more patients with a disease but they may never cover them all.

A ‘necessary’ criterion is a finding that occurs in all patients in whom using a diagnosis is justified but may also happen in many other diagnoses too. The absence of such a criterion means that the use of the diagnosis is ruled out. (Other diagnoses that the necessary finding covers may be ruled out too.) However, if it is considered ‘necessary’ for at least one sufficient diagnostic criterion established already to be present, then the absence of all the known sufficient criteria rules out use of the diagnosis. This rule would then become a definitive diagnostic criterion because it identifies all those and only those in whom use of the diagnosis is justified. (However, it is not a definitive disease criterion that rules out hidden disease that can spread to others, as in Covid-19.)

Arriving at diagnostic criteria

Diagnostic criteria can be arrived at from theoretical reasoning, shared experience, or agreements between experts. The entry criterion for a randomized controlled trial (RCT) can be arrived at in the same way. If an RCT suggests benefit, its entry criterion can be adopted as another sufficient diagnostic criterion. A test or scoring system that predicts an important outcome can also be incorporated as a sufficient criterion. These would be the criteria that predict benefit more accurately. In order that patients are considered for such interventions, the set of patients shown to benefit in an RCT should be a subset of those with at least one of the sufficient diagnostic criteria for a diagnosis.

Simplifying diagnostic criteria

Diagnostic criteria that share findings can be combined to form broad criteria that will include patients who benefit for different reasons. In other words, many sufficient criteria (e.g. Type 1 Diabetes Mellitus, Type 2 Diabetes Mellitus, etc.) can become subsets of a broader sufficient diagnostic criterion (e.g. Diabetes Mellitus). If a particular treatment or demanding test is suggested by such a broad criterion, the decision to go ahead may also have to be supported by the original evidence of benefit for that particular intervention.

Oxford Handbook of Clinical Diagnosis

The Oxford Handbook of Clinical Diagnosis (access Chapter 1 from this link) outlines many of the broad sufficient diagnostic criteria. They are presented under the headings ‘Confirmed by’ (or ‘Affirmed by’). The book explains how diagnoses are arrived at by navigating between possible diagnostic criteria when a patient seeks help. This will involve using limited information to estimate the probability with which various diagnostic criteria will be satisfied.

By HUW LLEWELYN, May 10 2021 06:07PM

A ‘disease’ is what a person experiences and what others may observe. Many of its features may be shared with other diseases and many may be hidden, especially in its early stages (e.g. as in Covid-19). It may therefore be impossible to identify and attach a name to all those and only those people with a disease. In order to do this we would need to have an impossibly perfect test that can be performed on a perfect random sample or on everyone in the population. The best that can be done therefore is to assume that someone has a disease based on limited information and to consider the consequences of that assumption. This is what is meant by a diagnosis. It is a hypothesis applied to an individual. If it is acted upon, then it is called a working diagnosis or working hypothesis.

Diagnosis and imagination

A diagnosis is therefore not simply a group of observable phenomena or the 'disease name' given to the uncomfortable experiences of an individual but also a process of recognizing the latter and its implications. The term is derived from the ancient Greek word ‘diagignoskein’: ‘dia’ (between) and ‘gignoskein’ (recognize). A ‘diagnosis’ is also the title of various tentative predictions or hypotheses. These predictions may be experienced mentally as imagining what may be happening now (such as a virus multiplying inside cells), imagining what may have happened in the past (such as inhaling virus-laden droplets) and imagining what may happen in future with and without various interventions (e.g. death or survival after intensive care). It also involves imagining a degree of severity.

Decisions arising from a diagnosis

The initial diagnosis may suggest a variety of outcomes such as spread of disease to others, a mild illness with spontaneous recovery or severe illness requiring hospitalization and various treatments. In order to decide the best way forward, more information is usually needed in the form of immediate tests or waiting to see how things progress. This will include assessing the current degree of severity and its rate of change with or without various interventions. When considering treatments, the information required might be the presence of findings shown to predict that patients in a randomized control trial fared better on treatment than on a control such as a placebo.

Provisional and final diagnoses

If the progress of the illness differs from what is suggested by the diagnosis then another diagnosis might be considered and different treatments tried. This may happen many times during the diagnostic process. However, if there is a predicted or satisfactory outcome with no reason to question the diagnosis further, it becomes ‘final’. This does not mean it is confirmed or true. It simply means that it continues to be assumed to be ‘true’ and becomes a ‘theory’.

Pattern recognition

An experienced diagnostician may do all this silently as a process of pattern recognition and a ‘gut feeling’ as to what should be done. This is also what usually happens in the ordinary situations, predictions and decisions of people's day-to-day lives. It is a challenge to make some of these thought processes transparent, especially the diagnostic thought process (see Chapter 1 of the Oxford Handbook of Clinical Diagnosis for more details; it can be accessed via this link).
