How to Prepare for the USMLE: Biostatistics

January 4th: the first day of studying, and it’s headlong into Biostatistics. Armed with my hand-written notes from 3rd Term, First Aid for the USMLE, Kaplan Lecture Notes: Biostatistics (bought the whole set off of someone for $200, completely unmarked), and the High Yield Biostatistics book, I begin Operation Overkill.

I begin by settling into the local medical school library and then lazily looking through my own notes to regain some familiarity, which takes about an hour. I open up the First Aid and carefully read over every concept that they stress, adding Generic Post It Notes to each page to hold extra mnemonics or figures that help me remember what is important. After this, it’s on to the Kaplan Lecture Notes (each source is slowly increasing the level of detail). This takes much longer, maybe three hours to absorb everything with some understanding (finding mistakes along the way, see below). Having started at 9am, I’m now finishing up around 3 o’clock. I don’t feel solid, but I feel competent.

I begin the High Yield Biostatistics. This is the highest level of detail I’ve seen and also the easiest read. I’m thrilled to find diagrams and tables that are much better than the ones I found in the First Aid or Kaplan Notes and they feel like tiny treasures. I love that this author not only offers clear explanations of similar but different terms and concepts, but he then spends some time highlighting why their differences are important. His analogies are amazing the way magic is amazing, and I’m thinking about writing Dr. Glaser an email to thank him (he supplies his email address). The questions at the end of each section are appropriately difficult and after reading the material several times over earlier in the day, this is sort of pleasurable. Biostatistics, pleasurable? Well, yes.

After an hour spent eating dinner (sack lunch swallowed over notes in a hurry), I finished the book, including all of the questions in each section, around 9pm. And now, after playing on the computer and writing this, I’m going to go through the things I may forget from each source and combine them into one or two pages of notes that I will review in a week’s time and again in the week before the actual test.

**Warning: not high yield to continue reading**

So what errors did I find in these books?

First Aid for the USMLE

Though technically tomorrow’s material, I found a description of an Advance Directive on page 70 (2007 Ed.) that I believe is flawed. I think they’re describing a Do Not Resuscitate Order, or DNR. Changing “withhold or withdraw” to “withhold or provide” would probably solve this.

Living Will — patient directs physician to withhold or withdraw life-sustaining treatment if the patient develops a terminal disease or enters a persistent vegetative state

I prefer this definition from Prudential:

Living Will: A document which specifies the life-prolonging measures an individual wants and does not want taken on his/her behalf in the event of a terminal illness. Living wills are often used in conjunction with a healthcare power of attorney, which appoints someone to make healthcare decisions on your behalf.

Kaplan Lecture Notes: Behavioral Science

On Page 7, there is a statement that I do not agree with. It states that:

point of optimum sensitivity = point of optimum negative predictive value; point of optimum specificity = point of optimum positive predictive value

This is incomplete, and I need an example to demonstrate it. In the usual square (Fig 1) you have true and false positive results (TP and FP) and true and false negative results (TN and FN). Our shorthand for these is A (TP), B (FP), C (FN), and D (TN). Without going into further detail, Specificity is calculated as D/(B+D) while positive predictive value (PPV) is calculated as A/(A+B). Sensitivity is calculated as A/(A+C) and negative predictive value (NPV) is calculated as D/(C+D). So if Kaplan is incorrect, let’s see if we can demonstrate it.

Assume a population of 100 people, split perfectly down the middle. 50 have the disease, 50 are disease free. We would like to see if a company’s new test can help diagnose this disease. The new device doesn’t work and the results are poor:

  • Specificity = 1/50 = 2%
  • Sensitivity = 25/50 = 50%
  • PPV = 25/74 = 34%
  • NPV = 1/26 = 4%

Specificity is low, Sensitivity is low, and the PPV and NPV are also low. Well, it turns out we weren’t using the device correctly, and we run the experiment again.

  • Specificity = 1/50 = 2%
  • Sensitivity = 49/50 = 98%
  • PPV = 49/98 = 50%
  • NPV = 1/2 = 50%

According to Kaplan, the rise that we see in Sensitivity should be accompanied by a rise in NPV, and we see this. But without any change in Specificity, we also see a rise in PPV. My point, after all of that, is that Kaplan’s statement is incomplete because it doesn’t take into account the effect that Specificity has on NPV and the effect that Sensitivity has on PPV.
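The arithmetic for the two runs can be checked with a short Python sketch. The cell counts below are assumptions inferred from the stated Sensitivity and Specificity with 50 diseased and 50 disease-free people, and the function name `metrics` is purely illustrative.

```python
def metrics(tp, fp, fn, tn):
    """Return the four measures from the 2x2 cells A (TP), B (FP), C (FN), D (TN)."""
    return {
        "sensitivity": tp / (tp + fn),  # A / (A + C)
        "specificity": tn / (fp + tn),  # D / (B + D)
        "ppv": tp / (tp + fp),          # A / (A + B)
        "npv": tn / (fn + tn),          # D / (C + D)
    }

# Assumed counts: 50 diseased (TP + FN) and 50 disease-free (FP + TN) in both runs.
first_run = metrics(tp=25, fp=49, fn=25, tn=1)   # Sensitivity 25/50, Specificity 1/50
second_run = metrics(tp=49, fp=49, fn=1, tn=1)   # Sensitivity 49/50, Specificity 1/50

for name, run in [("first run", first_run), ("second run", second_run)]:
    print(name, {key: round(value, 2) for key, value in run.items()})
```

Specificity is identical in both runs, yet PPV moves from about 34% to 50%, which is the point being made against the Kaplan statement.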

Dr. Glaser in HY Biostatistics takes it further:

Whereas the sensitivity and specificity of a test depend only on the characteristics of the test itself, predictive values vary according to the prevalence (or underlying probability) of the disease. Thus, predictive values cannot be determined without prior knowledge of the prevalence of the test’s characteristics and of the setting in which it is being used.

Long story short: increasing the prevalence of a disease increases the PPV of a test and decreases the NPV of that test, without changing the Sensitivity or Specificity at all. So as I hope you can see, the sentence from Kaplan falls quite short of the truth and would be harmful to just memorize and use come test day.
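To see the prevalence effect in numbers, here is a minimal Python sketch. The 90% Sensitivity and Specificity and the three prevalence values are hypothetical figures chosen for illustration, not taken from any of the books, and `predictive_values` is an illustrative name.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence              # diseased, test positive
    fn = (1 - sensitivity) * prevalence        # diseased, test negative
    tn = specificity * (1 - prevalence)        # healthy, test negative
    fp = (1 - specificity) * (1 - prevalence)  # healthy, test positive
    return tp / (tp + fp), tn / (tn + fn)

# Same hypothetical test (90% / 90%) applied in three different settings.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

As prevalence climbs, PPV goes up and NPV goes down, even though the test itself never changes.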



3 Responses to How to Prepare for the USMLE: Biostatistics

  1. marty ethics major says:

    They meant withhold, as in never give in the first place, or withdraw, as in take away from someone who needs them to live. IN ETHICS THOSE ARE TWO EXTREMELY DIFFERENT SCENARIOS.

  2. exdoctor says:

    I feel like the point of optimum sensitivity = TP/(TP+FN), which has FN = 0; the point of optimum negative predictive value = TN/(TN+FN), which assumes FN = 0 as well. So both the point of optimum sensitivity and the point of optimum negative predictive value = 100%. You can’t really assume a population number and calculate as you did above.

  3. Tony Glaser says:

    Thanks for your positive comments about my book. A 4th edition is in preparation, with lots of expansions and improvements for the future expansions in the USMLE Biostatistics, Epidemiology and Population Health material. If you or any other readers have suggestions on improvements, I would be delighted to hear from you. Tony Glaser, MD, PhD – author High Yield Biostatistics.
