Probabilities in Medicine and More
Many things in life are governed by probabilities. In the context of medicine, they can be especially deceiving.
Thank you for reading our work! If you haven’t yet subscribed, please subscribe below:
As Nominal News grows larger, we will be able to make this a full-time project, and provide more content. Please consider supporting us and sharing this article in various social media!
If you would like to suggest a topic for us to cover, please leave it in the comment section.
This article was inspired by two events – one within my family and one in the news. Daniel Ellsberg passed away on June 16, 2023. He is most famously known for releasing the Pentagon Papers, which revealed the actual reasons for US involvement in the Vietnam War. What might come as a surprise, however, is that Ellsberg was also an economist, known for the Ellsberg Paradox – a paradox about people making inconsistent decisions when faced with uncertain probabilities.
From a personal perspective, one of my relatives has had several concerning medical symptoms for the past three months. These symptoms immediately had everyone worried about the worst potential outcomes. Thinking in probabilities became key in determining a diagnosis.
Probabilities – Quick Introduction
Probability is the numerical formulation of the likelihood of an event. Generally, nearly any event in life has a probability of occurring. In the very simple case of a coin flip, we usually assume the probability of heads or tails to be 50%, or 1 in 2. However, without realizing it, we are making a lot of implicit assumptions about this outcome. First, we’re assuming that the coin is fair, which is a reasonable assumption. We’re also, however, assuming that when flipping the coin, the side that is facing up does not matter. Interestingly, researchers have looked into this and found that it’s not true. It turns out that there is a 51% chance (and potentially up to 55%-60%) that a coin lands on the same side that was facing up when it was flipped!
Therefore, next time you see someone flipping a coin, if you see them holding the coin heads face up, you should pick heads since the probability is at least 51% it will land on heads. This choice is actually made based on a conditional probability. Conditional probability is the probability of the event occurring (landing on heads) given some initial facts (the coin is showing heads before the flip). The probability of the coin landing on heads, given it is being tossed with heads facing up, is 51%. Similarly, the probability of heads, given it is being tossed with tails facing up, is 49%.
It is worth noting that the unconditional probability of heads or tails in a coin flip is still 50%. This is because before the coin flip, whether it is facing heads or tails is random with 50% probability. But once you see the initial condition – which way the coin is facing - you will no longer rely on unconditional probabilities (50% of it landing on heads or tails), but rather on the conditional probability (51% depending on which way the coin is facing prior to the flip).
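The two conditional probabilities average back out to the unconditional 50% – a quick sketch of this law-of-total-probability arithmetic, using the 51%/49% figures above:

```python
# The starting face is itself random (50/50), so averaging the two
# conditional probabilities recovers the unconditional 50%.
p_heads_up = 0.5                 # probability the coin starts heads up
p_heads_given_heads_up = 0.51    # lands heads, given it started heads up
p_heads_given_tails_up = 0.49    # lands heads, given it started tails up

p_heads = (p_heads_up * p_heads_given_heads_up
           + (1 - p_heads_up) * p_heads_given_tails_up)
print(p_heads)
```

Once you observe the starting face, you drop the 50% and use the relevant conditional probability instead.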
Nearly everything we do in life is actually a conditional probability because we already have some information about the event we are attempting to evaluate.
Building from above, one of the most important concepts in probability is Bayes’ Rule. Bayes’ Rule tells us how to update our probabilities about certain events after we receive additional information. It is commonly used in medicine, where doctors typically want to determine the probability that an individual has a disease given certain symptoms. Formulaically, Bayes’ Rule tells us how to compute the probability of event A (having the disease) given B (having symptoms), known as the “posterior”. It is the probability of event B (having symptoms) given event A (having the disease), times the probability of event A (the fraction of individuals in the population who have the disease, called the “prior”), divided by the probability of event B (the fraction of individuals in the population who have symptoms). The formula, in mathematical notation, where the vertical line simply stands for “given”, is shown below:

P(A | B) = P(B | A) × P(A) / P(B)
The main conceptual takeaway from the above formula is that the general probability of events A and B matter. That is, how often the disease and symptoms occur in the population impacts how likely it is a person has the disease given the symptoms. Let’s illustrate with an example.
Is it Covid?
During the recent Covid pandemic, multiple at-home Covid tests were developed, which people relied upon to figure out whether they had Covid or some other illness like the cold or the flu. Usually, people would take the test when they had symptoms. Let’s assume you had Covid symptoms and took the Abbott BinaxNOW at-home test. The BinaxNOW test had a sensitivity of 84% (although possibly as low as 64%). Sensitivity means that if you have Covid, there is an 84% chance the test will come out positive. Suppose the test came out negative for Covid. Before going to the next section – should you assume you don’t have Covid?
You are trying to estimate whether you have Covid given that the test came out negative. To visualize this, let’s start with a group of 1,000 people who have symptoms. At the height of the pandemic, most people with respiratory symptoms were assumed to have Covid rather than the flu or a cold. Suppose 90% of people with symptoms have Covid. That means in our group, 900 people have Covid. Of those 900, 84% would test positive for Covid on the Binax test. For simplicity, let’s assume the sensitivity is actually 80%, meaning 720 people with Covid would test positive and 180 people with Covid would test negative. Now, what about the 100 people who don’t have Covid? This is where specificity comes in – the probability that, if you don’t have the disease, the test comes out negative. The specificity of the Binax test is high, at 99%. That means of the 100 who don’t have Covid, 99 will test negative, while 1 person will test positive.
Given you tested negative, what is the probability you don’t have Covid? Out of the 1,000 people, 180 + 99 = 279 will test negative on the Binax test. Since 180 of them actually have Covid, the probability of having Covid given that you have symptoms and tested negative is 180/279, or about 64%! That means you are still well over 50% likely to have Covid. And if the sensitivity was actually lower (that is, the test more often says you don’t have Covid when you actually do), as stated above, then the probability of having Covid exceeds 70%! Basically, a negative test is not very useful in informing your decision on whether you have Covid. Before the test, you would have assumed you had Covid with 90% probability; after a negative test, your probability only drops to 64% or more, depending on the true sensitivity of the test. This is why the advice was to isolate and stay at home if you had any Covid symptoms.
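As a sanity check on the arithmetic above, here is the same calculation in Python, using the counts from the text (1,000 symptomatic people, a 90% prior, 80% sensitivity, 99% specificity):

```python
# Bayes' Rule via counting: of 1,000 symptomatic people, how many of
# those who test negative actually have Covid?
population = 1000
with_covid = int(population * 0.90)       # 900 have Covid (the prior)
without_covid = population - with_covid   # 100 do not

sensitivity = 0.80   # P(positive | Covid)
specificity = 0.99   # P(negative | no Covid)

false_negatives = with_covid * (1 - sensitivity)   # 180 sick, testing negative
true_negatives = without_covid * specificity       # 99 healthy, testing negative

p_covid_given_negative = false_negatives / (false_negatives + true_negatives)
print(f"{p_covid_given_negative:.1%}")  # 64.5%
```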
Note that the above numbers depend heavily on how prevalent we believe Covid was. If the percentage of people with symptoms who have Covid was 10% instead of 90%, you can compute that the probability of having Covid after a negative test would have been only about 2%. The overall prevalence of the disease impacts how likely it is that we have it. However, we don’t always know the prevalence of a disease, and we must take a guess at what we believe it is. Within Bayes’ Rule, this guess is the prior.
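The role of the prior is easy to see if we wrap the calculation in a function and vary only the prevalence (the 80%/99% sensitivity and specificity figures come from the example above; the 50% row is purely illustrative):

```python
def p_disease_given_negative(prior, sensitivity=0.80, specificity=0.99):
    """Posterior probability of disease after one negative test (Bayes' Rule)."""
    false_neg = prior * (1 - sensitivity)   # diseased but testing negative
    true_neg = (1 - prior) * specificity    # healthy and testing negative
    return false_neg / (false_neg + true_neg)

# The same negative test means very different things under different priors.
for prior in (0.90, 0.50, 0.10):
    print(f"prior {prior:.0%} -> posterior {p_disease_given_negative(prior):.1%}")
```

With a 90% prior the posterior stays near 64%; with a 10% prior the same negative result pushes it down to about 2%.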
“Priors” are our best estimate of the probability of an event (having Covid) given all our previous knowledge (or prior knowledge, where the term prior comes from). When you take the Covid test and get a result, using Bayes’ Rule, you update your prior to a posterior (i.e. the probability you have Covid given all our previous information and the test result). Why are we focusing on priors?
Priors are very often implicitly used by doctors to determine whether you have a particular disease. The doctor usually starts with their own “prior” about what condition an individual might have, based on observables and the symptoms a patient mentions. Typical observables are age, sex, current ailments and family medical history. Based on these, the doctor then chooses a test that would confirm or eliminate the most likely disease – that is, a test that would significantly move the posterior probability of that disease.
In the case of my relative, the doctor focused on age and blood test results, and decided to test for certain cancers, viruses, and rheumatological diseases, as these were the most likely. It is worth noting that most likely does not mean highly probable – just more probable than any other disease. All the tests came back negative. So the doctor updated their posterior probability of what it could be and prescribed a gastrointestinal endoscopy to test for internal bleeding. That test also came out negative, and the doctor prescribed a further gastrointestinal scan.
However, this is where we decided to see if there is any additional information that could help with the diagnosis. After a bit of research of all the symptoms, we separately found out that another family member had a rheumatological disease that cannot be identified via blood tests. Moreover, that disease appeared to match all the symptoms my relative has been experiencing, which included certain pain and fever. As it is a non-life-threatening disease that requires a certain genetic profile, it quickly became the most likely cause of all the symptoms.
Did the doctor make a mistake in not initially considering this disease? No – the doctor didn’t have all the information about the family history (which is why, when visiting doctors, it is important to have a full family history of diseases and conditions). Without that information as part of their prior, the doctor could not suggest this rheumatological disease – it would have been probabilistically unlikely, as it is very rare (only 0.2% of the population has it). However, even if we had told the doctor from the very beginning that a family member had this rheumatological condition, the doctor’s course of action would probably have been the same. Why? Because the disease is so rare, other conditions such as viruses or internal bleeding were still far more likely.
Let’s return to the Covid example. Given that a negative test might not tell us whether we have Covid, some people would suggest simply doing a second Binax test. If two tests come out negative, the thinking goes, then we definitely do not have Covid.
This reasoning makes another important implicit assumption: independence. Independence means that the likelihood of an event occurring does not change when another event occurs. For example, if you flip a coin and it comes out heads, this should not influence the probability of the outcome of the next coin flip. In the context of the Binax test, independence would imply that the sensitivity and specificity of the test do not change based on the prior test you took. But this is actually a very strong assumption. Since the second Binax test is done on the same person, and administered by the same person, there are many reasons why the second result could be related to the first. For example, we do not know why the Binax test came out negative if you in fact have Covid. If it’s due to some characteristic of the individual (how their Covid developed, their genes, etc.), maybe the Binax test will always come out negative for that individual. Then if the first test came out negative, the second test is useless, since it will make the same mistake as the first! In this case we have perfect dependence (the opposite of independence). Another reason the tests might not be independent is that the same person is administering them to themselves, meaning they might make the same testing mistake twice. Again, the independence assumption fails.
In the best case, if your two Binax tests are truly independent and both come out negative, applying Bayes’ Rule a second time – with the 64% posterior from the first test as the new prior – still leaves you with roughly a 27% probability of having Covid. Only after three independent negative tests would the probability of having Covid drop to about 7%.
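Under the (strong) independence assumption, each negative test multiplies the odds of having Covid by the same likelihood ratio, (1 − sensitivity)/specificity. A sketch of this sequential update, using the 90% prior and the 80%/99% test figures from the example:

```python
# Sequential Bayes updates in odds form: each independent negative test
# multiplies the odds of Covid by (1 - sensitivity) / specificity.
sensitivity, specificity = 0.80, 0.99
lr_negative = (1 - sensitivity) / specificity   # ~0.202 per negative test

odds = 0.90 / 0.10   # prior odds of Covid: 9 to 1
for n in range(1, 4):
    odds *= lr_negative
    probability = odds / (1 + odds)
    print(f"after {n} negative test(s): {probability:.0%}")
```

The posterior falls from about 65% after one negative test to roughly 27% after two and about 7% after three – much more slowly than intuition suggests.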
The independence assumption is often forgotten, even by experts. One recent example was the 2016 US presidential election. Using polling data, forecasters predicted that Hillary Clinton had a high probability of winning the election. The reason was that they assumed polling errors, such as poor sampling or lack of responses, were independent of each other. Thus, if a poll in one state was wrong, they assumed this would have no impact on the probability of a poll in another state being wrong. But this was a strong assumption. Polling errors were correlated between states – if one state’s poll had an error, other states’ polls were also likely to have errors. This is why Hillary Clinton’s chances of winning were heavily overstated: even though she was leading in many state-by-state polls, if one of them was wrong, many of them were probably wrong.
There are many topics in probability that people commonly miscalculate. In memory of Daniel Ellsberg, we will discuss his most famous contribution to economics, the Ellsberg Paradox. In economics, we are interested in understanding how individuals make decisions; this field is referred to as decision theory. Economics typically models behavior using expected utility theory. Under this theory, individuals make decisions to maximize their own utility in expectation. The reason it is in expectation is that the outcomes of their decisions are uncertain. For example, if you decide to switch jobs, you don’t know exactly what will happen, but based on your own expectations, you expect the job change to maximize your utility. Daniel Ellsberg studied whether individuals adhere to expected utility theory in an experimental setting.
Ellsberg conducted the following experiment: suppose you have an urn with 90 balls. 30 of the balls are red, while the remaining 60 are black or yellow, in unknown proportion. The participant of the experiment can choose one of two gambles:
Gamble A: You receive $100 if you draw a red ball.
Gamble B: You receive $100 if you draw a black ball.
Which one would you choose?
Now the participant is given another choice of gamble with the same urn as before:
Gamble C: You receive $100 if you draw a red or yellow ball.
Gamble D: You receive $100 if you draw a black or yellow ball.
Which one did you choose?
Expected utility theory predicts that you should choose either Gambles A and C or Gambles B and D. The reason is that in both cases, the participant must make the same judgment – whether there are more red balls or black balls in the urn. If the individual thinks there are more red than black balls (i.e. there are fewer than 30 black balls), they should choose Gamble A and Gamble C.
However, people strictly prefer Gamble A and Gamble D! This violates expected utility theory! The reason people prefer these Gambles has been attributed to ambiguity aversion – people do not like not knowing the probabilities. In Gamble A, you know the probability of winning is 30/90 or 33%. In Gamble D, you know the probability of winning is 60/90 or 66%. In the other two gambles, the probability is uncertain.
If you believe there are 40 black balls, then:
Gamble B has a probability of 40/90 (44%) chance of winning vs Gamble A probability of 33%;
Gamble C has a probability of (30+20)/90 (55%) chance of winning vs Gamble D having a 66% chance.
So you should choose Gamble B and Gamble D if you think there are 40 black balls. If you think there are fewer black balls than red balls, it can be computed that you should choose Gamble A and Gamble C.
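This consistency requirement is easy to verify by brute force. A small sketch that enumerates every possible belief about the number of black balls (the gamble labels follow the text):

```python
# Win probabilities for the four Ellsberg gambles (30 red balls,
# 60 black-or-yellow in unknown proportion), given a belief about
# how many of the 60 are black.
def gamble_win_probabilities(black):
    yellow = 60 - black
    return {
        "A": 30 / 90,              # red
        "B": black / 90,           # black
        "C": (30 + yellow) / 90,   # red or yellow
        "D": 60 / 90,              # black or yellow
    }

# For every possible belief, preferring A to B implies preferring C to D
# (and vice versa) -- so picking A and D together is inconsistent with
# expected utility theory, no matter what you believe about the urn.
for black in range(61):
    p = gamble_win_probabilities(black)
    assert (p["A"] > p["B"]) == (p["C"] > p["D"])
print("A-over-B always coincides with C-over-D")
```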
Economists have found that individuals very often violate expected utility theory – that is, they do not make decisions that maximize their utility based on their expectations. The Ellsberg Paradox is a famous example of this.
Probabilities are complex – the human mind is not designed to think in probabilities. However, understanding them and their caveats is important, because they enable us to make better decisions. This is especially the case regarding medical questions due to the stress involved. It’s important to understand that positive or negative test results do not necessarily mean a diagnosis is certain. A lot of worry and stress, although understandable, is probabilistically unjustified.
Thinking in terms of probabilities and their caveats can also help us understand data and their implications. In the 2016 presidential election polling example mentioned above, accounting for the lack of independence – the correlation of polling errors across states – could have resulted in better forecasts. Similarly, Bayes’ Rule offers a good framework for thinking through our own biases and opinions on a variety of issues: we start off with a ‘prior’ and update our beliefs based on new evidence into a ‘posterior’. Using this type of framework can significantly improve our policy making.
Interesting Reads from the Week
Tweet/Paper: New research shows that increasing IRS spend by $1 on audits of high income earners generates $12 of tax revenue. This is why cutting IRS spending is a bad policy.
Tweet: In line with our Recession Talk article, total income in the tech sector has dropped. The tech sector is in a recession.
Tweet/Research: New research shows hospital mergers increase costs for users significantly. One hospital merger increased consumer costs by $204mln, which is more than the entire budget of the anti-trust department ($136mln).
Article: Claudia Sahm goes into detail of how states are currently faring economically and how that informs us about the national likelihood of recession.
Cover photo by lil artsty.
If you enjoyed this article, you may also enjoy the following ones from Nominal News:
To Compete or Non-Compete (April 30, 2023) – why non-compete clauses do not solve any issues, but only create costs.
Early Child Investment - Child Tax Credit (January 15, 2023) – the benefits of the expanded child tax credit to society.
When Free Trade isn’t Free (February 20, 2023) – how trade issues extend beyond tariffs and quotas and into regulations.