Problems With Lurgies

statistics
Author: Hamed Bastan-Hagh

Published: August 29, 2023

The puzzle in the August edition of Significance magazine is about testing for diseases. Usually a puzzle like this is something you see when you’re learning about Bayes’ theorem (given that a person tests positive for a disease, what is the probability that they actually have the disease?).

In this case we are given three bits of important info about an outbreak of lurgies[1] among a group of knights:

  1. The test has a sensitivity (recall) of 70%.
  2. The test has a specificity of 80%.
  3. Of those who tested positive, 28% actually had lurgies.

[Figure: A knight, seemingly not infected.]

Our goal is to find out what proportion of the knights were actually infected.

We could start messing around with Bayes’ theorem: our three bits of information correspond to \(\operatorname{P}(T | L)\), \(\operatorname{P}(T^c | L^c)\), and \(\operatorname{P}(L | T)\) respectively (where \(L\) is having lurgies and \(T\) is testing positive). But that’s all a bit too much work.

Here we want to solve for \(p\), the proportion of knights who have lurgies. To do that we just need a confusion matrix and some arithmetic.

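Info items 1 and 2 give us the True Positive and True Negative rates for the test. Each row of the confusion matrix has to sum to 100%, so they also give us the False Negative rate (30%) and the False Positive rate (20%). This is enough info to complete the confusion matrix.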

Code
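library(gt)  # table-building functions used below

# Rows are the actual status, columns are the test result.
# Each row holds the rates for that status, so rows sum to 100%.
data.frame(
  Pos = c(0.7, 0.2), Neg = c(0.3, 0.8), row.names = c("Pos", "Neg")
) |>
  gt(rownames_to_stub = TRUE) |>
  tab_stubhead("Actual") |>
  tab_spanner("Test", columns = c("Pos", "Neg")) |>
  tab_header("Lurgies Confusion Matrix") |>
  fmt_percent(decimals = 0) |>
  cols_align("center") |>
  tab_style(cell_text(align = "center"), cells_stubhead())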
Lurgies Confusion Matrix

              Test
Actual    Pos    Neg
Pos       70%    30%
Neg       20%    80%

In terms of \(p\), we can write down the proportion of knights who tested positive and had lurgies (which is \(p \times (\text{True Positive Rate}) = p \times 0.7\)) and the proportion who tested positive without having lurgies (\((1 - p) \times (\text{False Positive Rate}) = (1 - p) \times 0.2\)). We also know that only 28% of those who tested positive were sick. Now we just use that to solve for \(p\).

\[ \begin{align*} \frac{0.7 \times p}{(0.7 \times p) + ((1 - p) \times 0.2)} &= 0.28 \\ \frac{0.7 \times p}{0.28} &= 0.5 p + 0.2 \\ 2.5 p &= 0.5 p + 0.2 \\ 2p &= 0.2 \\ p &= 0.1 \end{align*} \]
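
The algebra above can also be checked numerically. The following is a minimal sketch in base R, not part of the original puzzle solution: the names `sens`, `fpr`, `ppv` and `ppv_gap` are my own, and the closed form comes from rearranging the same equation for \(p\).

```r
# Assumed names (not from the original post): sens, fpr, ppv, ppv_gap.
sens <- 0.7   # sensitivity, P(T | L)
fpr  <- 0.2   # false positive rate, 1 - specificity
ppv  <- 0.28  # share of positive tests that are true positives, P(L | T)

# Difference between the PPV implied by a prevalence p and the observed PPV.
ppv_gap <- function(p) {
  sens * p / (sens * p + fpr * (1 - p)) - ppv
}

# Find the root numerically on [0, 1]; tol is tightened from the default.
p_numeric <- uniroot(ppv_gap, interval = c(0, 1), tol = 1e-10)$root

# Closed form obtained by rearranging the same equation for p.
p_closed <- ppv * fpr / (sens - ppv * (sens - fpr))

c(numeric = p_numeric, closed_form = p_closed)
```

Both values should agree with the hand calculation, up to floating-point and solver tolerance.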

So 10% of the knights actually had lurgies at the beginning. The easiest way to check this (and often the most intuitive way to do those Bayes’ theorem problems) is with natural frequencies.
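
As a rough sketch of that natural-frequency check, assume a hypothetical group of 1,000 knights (the group size is my assumption, chosen only to give round numbers) and count heads instead of multiplying probabilities:

```r
# Hypothetical group size, chosen only so the counts come out round.
n_knights <- 1000
p <- 0.1  # prevalence found above

infected <- n_knights * p         # 100 knights have lurgies
healthy  <- n_knights - infected  # 900 knights do not

true_pos  <- infected * 0.7       # 70 infected knights test positive
false_pos <- healthy * 0.2        # 180 healthy knights test positive

# Share of positive tests that belong to infected knights: 70 / 250.
ppv_check <- true_pos / (true_pos + false_pos)
round(ppv_check, 2)
```

Of the 250 knights who test positive, 70 are infected, which is the 28% given in the puzzle.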

Footnotes

  1. As far as I’m concerned having an illness is “having lurgies”, not “having the lurgy”. According to a New York Times quiz, now paywalled, this is common usage in West London and around Brighton.