168 Comments
typhoonjim:

Frequentism is impressive, but now! Bayes leads! Priors, for everyone!

Brandon Hendrickson:

Agreed! In the "Bigfoot" class, we actually go into the Bayesian leap (from a sort of implicit pre-frequentism), and talk about the perils and promises of treating internal confidences as countable items in the world.

Mikhail Samin:

Yudkowsky has an intro to the theorem, which is more intuitive than anything mentioned here: https://www.lesswrong.com/w/bayes-rule?lens=high-speed-intro-to-bayes-s-rule

Brandon Hendrickson:

I'll watch it!

Christopher Wintergreen:

If this works for you, including on occasions where the ratios are less pleasant, that's great, really. I don't think it would be particularly intuitive for the large majority of 12-year-olds or younger. Unless we're talking specifically about the frequency representation section being better than the rectangles, in which case I agree, though it's only really going to work on a screen and for sensitivity/specificity examples - it's time-consuming on a whiteboard - so starting there and then switching to the rectangles might be the best way.

Timothy Johnson:

I think your example ignores the possibility that some people are both NFL players and math teachers: https://en.wikipedia.org/wiki/John_Urschel?wprov=sfla1

Brandon Hendrickson:

Oh my god — he's literally one of the only NFL players I knew about, how did I forget about him here?

Deiseach:

You also apply the "how many NFL players versus maths teachers" question to the entire country, but it very much depends on your local area. If, for example, you are walking past the Etihad Campus (I was going to say Carrington, but Man U has not been doing the best recently), and a ball comes flying past you on a perfect trajectory, it is indeed way more likely to have been kicked by a professional footballer than a maths teacher.

Or the local former county hurling star who drives an oil delivery truck in my town, for another example.

For the ghost example, I think there might be biases at work there. I'm presuming you take the 100% bit as "ghosts are real" and the 1% bit as "nobody has really seen a ghost" and multiply it out to get the 9% likelihood that ghosts are real, so you should think it's *not* a ghost, there's another explanation.

But what if you switch it the other way round? 100% is "I hope ghosts are not real because if they were it would upturn my entire comfortable materialist worldview" versus "1% is all the records of people throughout history swearing they saw ghosts". Then you get the 9% likelihood of ghosts are not real, so you *should* think it's a ghost.

I am a mathematically illiterate idiot, tell me where I'm wrong here.

Victor:

You start with the objective phenomena you are trying to explain. In this case, people see ghosts. Ok, so what's the likely total number of ghosts in this country? Then, what's the most likely alternative explanation? People being mistaken by what they see? Ok, what's the likely total number of people who have made a mistake like that in the country? Diagram it from there.

Seth B:

I think you'll be interested in Frank Ryan as well:

https://en.wikipedia.org/wiki/Frank_Ryan_(American_football)

Steve Sailer:

Frank Ryan quarterbacked the Cleveland Browns to the NFL championship in 1964. The next year he earned his mathematics Ph.D. from Rice U. for his dissertation "A Characterization of the Set of Asymptotic Values of a Function Holomorphic in the Unit Disc."

Interestingly, his teammates did not consider Ryan a particularly smart decisionmaker on the gridiron. Instead, they were impressed by how brave he was at holding on to the ball until the last instant.

Seth B:

In the postgame celebrations after the Browns won the Championship in 1964, a reporter asked Ryan how much his mathematical background affected his decisions as a quarterback. His jubilant response was something to the effect of "not one goddamn bit!"

I'm pretty sure I read that story in one of Steven Krantz's two volumes of "Mathematical Apocrypha".

Michael Weissman:

Nice article! Though I think the fuzzy bigfoot/god/fantasy examples make poorer teaching tools than the other ones. Most kids like problems with $ in them.

It's a bit ironic that Scott chose this topic at this time, given how badly he screwed up his own Bayesian reasoning on Covid origins.

(https://michaelweissman.substack.com/p/open-letter-to-scott-alexander)

So it might be good to do a two-layer approach. First, simple straight one-stage Bayes problems like your football example. Second, some two-stage hierarchical problem (e.g. strong evidence provided by untrustworthy cops, good betting odds offered by someone who might not pay off, etc. ) just to get a handle on typical real-world problems.

Brandon Hendrickson:

Yeah, I agree that multi-stage problems would be good. My hope is that we can bring that into the Sea monsters class this summer.

Christopher Wintergreen:

I recently ran an exit ticket scenario: kid on the way home from school on the bus, wants PB sandwich, predicts the probability. Sees no car in driveway, updates probability. Sees empty PB jar, updates. Hears car returning, updates. Sees shopping bags, updates.

I wish now I'd done something social, the teens do love romance.
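The exit-ticket scenario can be written out in the odds form of Bayes' rule, where each clue multiplies the current odds by a likelihood ratio. A minimal sketch - the prior and every ratio here are invented for illustration, only the update rule itself is Bayes:

```python
def update(p, likelihood_ratio):
    """One Bayesian update in odds form: posterior odds = prior odds * LR."""
    odds = p / (1 - p) * likelihood_ratio
    return odds / (1 + odds)

p = 0.6  # invented prior: a PB sandwich seems likely today
for clue, lr in [("no car in driveway", 0.8),
                 ("empty PB jar", 0.1),
                 ("car returning", 2.0),
                 ("shopping bags", 4.0)]:
    p = update(p, lr)
    print(f"after '{clue}': p = {p:.2f}")
```

Because the odds form composes by multiplication, the order of the clues doesn't matter: 1.5 × 0.8 × 0.1 × 2 × 4 gives final odds of 0.96, i.e. about a 49% chance of the sandwich.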

Sniffnoy:

Huh, when you started mentioning the actual Reverend Thomas Bayes, I thought you were going to say something about applying probability to games. I mean, a lot of the early work on probability came out of the study of gambling, right? Plenty of video games that kids would play involve chance to some extent. I don't really have a concrete idea here, but that's where I thought you were going with this...

Brandon Hendrickson:

My experience in education is that matching content to kids' specific interests struggles to scale. What fraction of the class is interested in a particular game? What fraction finds the game off-putting? I've found that this sort of approach really can work (and work well), but only with an ungodly amount of effort from the teacher. (This could, however, be something that AI can do better with.) That's why I find it more productive to find the deeper issues that virtually all kids care about (e.g "what the heck kind of world are we living in?!").

Kimmo Merikivi:

Certainly, if I were teaching kids in this manner, I'd definitely steer away from specific games and would instead use an "idealized game" that uses common elements and tropes, but one that isn't any real-world game. Say, one could talk about "a tabletop game" with "knights and dragons" that uses "20-sided dice", and then present math problems about dice, but I wouldn't want to say it's D&D 5e. Or Minecraft, Fortnite, Roblox, or whatever kids these days are playing. I don't think a generic example would alienate anyone who wasn't already alienated by talk of a tabletop RPG, period, but a specific example might alienate anti-fans, purists who are annoyed about details being wrong, those who dislike the global consumerist culture exemplified by games played by hundreds of millions, etc.

Now that I think about it, at least the older kids probably can explain how e.g. the combat resolution mechanics work in a game they are playing, so why not ask the audience?

Victor:

Probably easier to just teach a simple game to the class, as an in-class activity. If it's fun enough, they will all quickly gain mastery, and you can proceed from there.

Brandon Hendrickson:

My hunch — and I have no actual experience here! — is that choosing an abstract game would miss both the motivational benefits of doing something worldview-y/mysterious like cryptids, and those of doing something individually exciting like a particular game. We nerds, of course, are disproportionately interested in the abstractions of how games in general work, but we should check before assuming that most kids share that. (But again, this is an empirical matter, and I have no direct evidence for this.)

Anna Eplin:

I love this! I’ve been trying to introduce the basic mechanics of Bayesian reasoning to my kids (who are still pretty young) by drawing up and down arrows on a page—up for things that increase the likelihood of something being true, down for things that decrease it. They do pretty well with that. But I’ve been needing help with taking it deeper, so I’m very glad for all this input. (And I’m quite interested in your summer camps too—what a cool idea!)

Brandon Hendrickson:

A delight! Reach out if you'd like to chat — I'd be fascinated to see how Bayes could be extended to even younger kids. (How old are yours?)

Nate:

Kier(an) Egan hey? Suspicious name 🤔

This is not a coincidence because nothing is ever a coincidence

Brandon Hendrickson:

This keeps coming up, and (as a big fan of the show) I really want to have some jokes ready about this. (Something something "SEVERing" a child's sense of self?) I avail myself to all of you for any ideas.

Paul Goodman:

Makes me wonder if someone writing the show had a really bad Egan-inspired schooling experience or something.

Phil H:

More than the specifics of the box diagrams, the practice of repeatedly and consistently using a diagram - any kind of thinking tool - would be very valuable to kids. I think Brandon has a problem with his summer school model, which is that he’s not necessarily getting the same kids each year. When he gets his school going (or when the next crazed president appoints him education grand wizard), he’ll be able to encourage this kind of habitual use of clarifying tools.

Honestly, though, I’m still not convinced that this tool is better than any traditional tool. The essay is precisely this: a tool for thinking in depth about things. If anyone thinks that schooling before rationalism wasn’t concerned with improving students’ thinking, then they’ve missed the point.

Victor:

The innovation here isn't the use of a diagram to teach something. It's what the diagram is being used to teach: Bayes Theorem. Apparently no one thought of that before.

Eric Rasmusen:

I've done it for years with my MBA students. I made up a little web Python app, in fact, though I see that it's broken now. To get the idea, see

https://rasmusen.org/cgi-bin/bayesbox.htm

where the underlying code is

https://rasmusen.org/cgi-bin/bayesbox.py

Sorry it's broken-- I haven't tried it for a few years.
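Since the app is down, here's a rough stand-in for what a "Bayes box" computes - a reconstruction from the name, not Rasmusen's actual code: the four joint probabilities of hypothesis × evidence, with the posterior read off the positive-evidence column.

```python
def bayes_box(p_h, p_e_given_h, p_e_given_not_h):
    """Return the four cells of the 2x2 'Bayes box' and P(H|E)."""
    box = {
        ("H", "E"):         p_h * p_e_given_h,
        ("H", "not E"):     p_h * (1 - p_e_given_h),
        ("not H", "E"):     (1 - p_h) * p_e_given_not_h,
        ("not H", "not E"): (1 - p_h) * (1 - p_e_given_not_h),
    }
    # Posterior: H's share of the worlds where the evidence occurred
    posterior = box["H", "E"] / (box["H", "E"] + box["not H", "E"])
    return box, posterior

# Example: 1% prior, evidence 8x likelier under H than under not-H
box, posterior = bayes_box(0.01, 0.8, 0.1)
print(f"P(H|E) = {posterior:.3f}")
```

The four cells always sum to 1, which is what makes the box a useful sanity check when drawn to scale.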

Brandon Hendrickson:

I'd actually be fascinated to know where Bayes has popped up in people's K–12 experiences. I know it comes up in high school statistics classes, but I'm sure that some enterprising teachers of younger kids have done it. If anyone knows any, I'd love the connection.

Totient Function:

Surely this isn't true? I'm not sure I've ever seen an introduction to Bayes' Theorem that didn't have a diagram floating about somewhere, although to be fair I haven't seen that many - I mostly avoid popular articles and explainers etc. (I pretty much never find it helpful to fool myself into thinking I've understood something without actually having worked through a formal treatment, but then that's why I do math; I'm probably atypical in this regard). But anyway, I certainly remember schematic diagrams not unlike these ones being commonplace as an intuitive aid (although I may be remembering my own drawings as having been present in the material? It's been a while...).

Edit: to be clear, this is not meant as a criticism of the body of the post! Just this specific claim.

Brandon Hendrickson:

>> "More than the specifics of the box diagrams, the practice of repeatedly and consistently using a diagram - any kind of thinking tool - would be very valuable to kids."

Hard agree on this. This is where I get frustrated about the disconnection of the American curriculum — most of what's learned in one year (tools, but also content) is abandoned for future years. Instead of accumulating knowledge, we flush it down the toilet.

>> "I think Brandon has a problem with his summer school model, which is that he’s not necessarily getting the same kids each year."

This is definitely a challenge. We try to meet it by telling people they need to watch the recordings of previous years in order to take the new ones. In the end, though, that might mean we have a very small group for "Ghosts". Still, it'll be worth it, if we can then share the full model we've developed.

thefance:

I sympathize with this as well. I was always amazed that nobody remembered stuff beyond a month or two, and everyone (teachers, students, parents alike) considered this fine and normal.

Yaniv:

One question that fascinates a lot of people, kids and adults, is “can we do magic?” For example, can a coin or a die or a card give us valid answers? Can you feel what someone else is thinking? Can you tell when someone is watching you? Does good luck follow bad luck? Does a lucky shirt really work? Is there information about the world hidden in dreams? And the granddaddy, do coincidences have meaning?

Victor:

Careful, though, there are parents who believe in some form of magic. Not as many as those who believe in God, but enough to warrant caution.

Brandon Hendrickson:

Yeah! That's a lot of what I want to cover in "Psychic Powers", in year 4.

James Gauvreau:

//for the majority of our species’ existence, most people probably haven’t been able to count to ten.//

This sounds improbable enough that we must be interpreting this sentence differently. Could you elaborate?

Doug S.:

There are hunter-gatherer languages in which it is literally impossible to count to ten because they lack words for numbers greater than two; any quantity greater than two is referred to by a single word that would translate into English as "many."

Deiseach:

I have to wonder about that, I'm sure hunter-gatherers are perfectly well able to see "there are four of us in our band and six of them in their band, if push comes to shove they're likely to drive us away" or "there are lots of berries on this bush but not so many on that bush, we should pick from this bush" or "the berries are nearly all picked, time to move on to a new territory" even if there is no formal term in the language for counting.

You know what's coming next: I looked it up.

https://thelanguagecloset.com/2025/03/01/the-number-systems-of-hunter-gatherer-languages/

"Many of the Aboriginal Australian languages, for instance, usually have words for ‘one’ and ‘two’, but only a subset of these have words for ‘three’, and even fewer have ‘four’ and beyond. The Pirahã language too is well known for its supposed lack of precise numerals at all, yet, its speakers still demonstrate numeral literacy.

...Furthermore, ‘numbers’ in this study were defined as “spoken, normed expressions that are used to denote the exact number of objects for an open class of objects in an open class of situations with the whole speech community in question”. In essence, cardinal numbers like generic ‘one’, ‘two’, and ‘three’. This definition presents a broader set of problems, as some languages use counting terms for a specific type of object, which can differ from those of other objects. And so, the authors decided to include number systems that pertain to precise numerical values, loosening up the Hammarström definition by a bit.

...Before reading this paper, I thought that from the naïve observation of number systems and languages based on type of subsistence, one would be quick to point out that there is a heavy cultural association between the two factors. However, how such systems arose are not really understood. Perhaps some might have stemmed from how we count, like using the ‘hand’ to express 5, and ‘person’ to express 20 going by digit tallying. Others might have pointed out the need for trade, record keeping and verbal reporting because of the shift to agriculture, but even these have their own counterexamples (see Iñupiaq).

...But the idea that the lack of high numerals in these languages translates to poorer numeracy skills is rather inaccurate. Studies like this conducted on Aboriginal Australian children, who speak languages which generally lack numbers beyond 4 or 5, have shown that their numeracy skills are generally no different from those of English-speaking children. Other studies have also shown how speakers of these languages deal with larger numbers as well, such as the use of tally systems, or tally marks, to record the number of occurrences of a certain desired observation."

So it seems that it is entirely possible for a hunter-gatherer tribe to speak a language that doesn't have formal terms for numbers beyond four, but if they want to express higher numbers than that, they use terms like "there are two hands of people there" (which would mean ten people) or the like.

Brandon Hendrickson:

I'm honored to get the full Deiseach fact-finding treatment! (You know you've made it when...) I had heard of some anthropological pushback to the "counting was rare amongst oral-language cultures", but I think that in the past I've been wary of taking that seriously because of the bias amongst anthropologists against admitting that modern literate cultures bring any advantages. But I'll switch now to being more unsure of this. Lemme know if you (or anyone else reading this) finds anything determinative here.

Deiseach:

You are very gracious about this, thank you!

I do tend to be a little sceptical of "and the Potitoo tribe only count to two by using the right and left halves of their bodies", because most cultures do have at least "some/lots/many" distinctions.

So I think that if anthropologists come away with "they only have words for numbers up to three", they may well be missing "but the Potitoo count 'four' by using the term for 'same number as how many arms and legs a person has' and 'five' by 'how many toes on your foot'," and so forth.

Victor:

We know that the ability to comprehend the concept of "most" precedes the linguistic ability to describe the quantities involved: https://www.tandfonline.com/doi/pdf/10.1080/15475440801922099?casa_token=PXwPhTnyzrwAAAAA:1LEPbDnCDPD2wVI8EIBmlUWY2TJWRb0KZ3WfmPmKfk52xNgmFk1AmFlHv425p5bShuii8GWffHF-WA

Kenny Easwaran:

Right - my understanding is that there is some intuitive logarithmic conception of quantity that applies indifferently to counts and masses, and that this is probably what most people are thinking when they use the word “most”. The claim is that it’s the precise use of numbers to add and subtract and multiply that depends on the technology of having a memorized list of counting words (and also the technologies of tally marks and numeral systems like Roman and Arabic).

MT:

It is grimly funny that this article starts off its introduction of Bayes theorem by putting forward in a huge black box an "equation" which conspicuously lacks an equals sign. The equals sign is the entire content of the theorem! The equals sign is telling you the precise mathematical relationship between P(E|H) and some combination of P(E), P(H), and P(H|E). The rest of the piece is honestly fine, but it's not a great look to start off an article about demystifying Bayesian reasoning with a pretty profound lack of clarity on the actual, mathematical relationship of its definition.

Pascal Bercker:

You probably mean: P(H | E) = some combination of P(E), P(H), and P(E | H).
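For reference, the statement both comments are circling, with its equals sign restored:

```latex
P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E)},
\qquad
P(E) = P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)
```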

David Abbott:

The math teachers/NFL players example is poorly chosen. No way there are 1.5 million math teachers in the country. Also, NFL players include linemen, linebackers, receivers and running backs who have less than a 100% chance of being able to throw a perfect spiral. There are also plenty of college quarterbacks who can throw perfect spirals. And some high schoolers! And semi pros.

Picking an example with this many holes in it makes you come across as an educated fool and makes Bayes’ theorem seem like an abstraction with little real world connection. Even worse, the regions of the square weren’t congruent with your numbers.

Victor:

Do you think this would matter to very many middle schoolers? Would they even understand what you are talking about?

David Abbott:

yes

Eric Rasmusen:

We put subjects in siloes too much. I like it best when the math examples I give my 7th graders also teach them something else-- heights of mountains, price of gasoline, etc. So we should strive to be realistic.

ALL AMERICAN BREAKFAST:

Yes, it would be a huge distraction. That’s the part they will immediately understand and have the intellectual resources to challenge.

Brandon Hendrickson:

>> "Picking an example with this many holes in it makes you come across as an educated fool"

Alas, a description I fear fits me well.

>> "and makes Bayes’ theorem seem like an abstraction with little real world connection."

A more important charge! I invite anyone who'd like to use this to improve the numbers. Honestly, though, coming up with weirdo situations and questionable-but-easy-to-work-with numbers is a hoary old tradition in math teaching (and my feeling is ESPECIALLY in Bayesian probability), so I'd be surprised if the average student has that take.

>> "Even worse, the regions of the square weren’t congruent with your numbers."

True! And I meant to write a "not to scale" note below it. (The trouble with these numbers is that they're too extreme to draw to scale. You're right that for this, making it to scale is really helpful.)

David Abbott:

Thank you for your thoughtful response. I’ve taken courses in statistics; I could apply Bayes’ theorem given internet access or a crib sheet, but I can’t write it out from memory. I could reason my way to something close to it with a pen and paper and 15 minutes. The main thing I got from it is “base rates are huge.” Just teaching that would be a huge step.

I’ll try to think of a better example.

David Abbott:

Here’s how to improve the example. Make the question “was the spiral thrown by an NFL quarterback?” Say there are 70 NFL quarterbacks and 330 million non-quarterbacks, 1% of whom can throw a perfect spiral. That will yield a dramatic result, showing the importance of base rates.

Want something more graphic? You are at a football training camp with 123 players. All three quarterbacks can throw perfect spirals. 15% of the other players can. That could easily become a scaled graphic.
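Both scenarios check out; here is the arithmetic spelled out (the 70, 1%, and 15% figures are Abbott's illustrative guesses, not measurements):

```python
# Scenario 1: was the perfect spiral thrown by an NFL quarterback?
nfl_qbs = 70                     # assume every NFL QB can throw a perfect spiral
non_qbs = 330_000_000
p_spiral_non_qb = 0.01           # guess: 1% of everyone else can too

p_qb = nfl_qbs / (nfl_qbs + non_qbs * p_spiral_non_qb)
print(f"P(NFL QB | spiral) = {p_qb:.6f}")   # base rates swamp the skill signal

# Scenario 2: training camp, 123 players; all 3 QBs can throw one, 15% of the rest can
p_qb_camp = 3 / (3 + 120 * 0.15)
print(f"P(QB | spiral, at camp) = {p_qb_camp:.3f}")
```

The camp version gives a posterior near 1 in 7, small enough to surprise but large enough to draw to scale, which is what makes it the better classroom graphic.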

Christopher Wintergreen:

In a blog post you have to have concrete numbers to do up an example. In a classroom, I'd encourage the kids to call BS on the numbers and have them give their own estimates. I agree they might actually care - and if they want to think more deeply about specific location, do a quick calculation of maths teachers in an area, estimate how many NFL players can really throw a spiral, realise that some players have a non-100% chance of doing so, etc., then, uh, great, right? They are connecting the maths to the real world, which is good - looking for holes, aiming for true accuracy.

David Abbott:

Once you add in that texture, you are better off coding a simulation than using a formula.

Christopher Wintergreen:

The kids are going to get way more out of coming up with the ideas and trying to factor them in than having a perfect simulation to play with. It's the difference between trying to perfect the tool and trying to optimise the process for learning.

I mean, it's possible at the moment for a kid to build a simulation and tell it to factor in various different things. So that might be the next step after thinking through a few as a group.

bbqturtle:

I’ve never understood bayes and this box helps me a ton! But, I could use a few more examples. Could you share one for the Bigfoot and mammogram concepts? I think if I see it created a few more times I would get there.

bbqturtle:

This is dense and illegible and the links don’t work :(

cromulent:

3blue1brown's one video on it is very good. It focuses on the mammogram example in detail.

https://www.youtube.com/watch?v=lG4VkPoG3ko
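For anyone who wants the arithmetic without the video: the standard illustrative mammogram numbers (roughly the ones Gigerenzer popularized; the video's exact figures may differ) work out like this, in frequency form:

```python
# Out of 10,000 women screened (illustrative numbers only):
women = 10_000
with_cancer = int(women * 0.01)                       # 1% prevalence -> 100 women
true_positives = int(with_cancer * 0.8)               # 80% sensitivity -> 80 positives
false_positives = int((women - with_cancer) * 0.096)  # 9.6% false-positive rate -> 950

p_cancer_given_positive = true_positives / (true_positives + false_positives)
print(f"P(cancer | positive test) = {p_cancer_given_positive:.1%}")
```

The punchline is that the posterior lands under 10%: most positive tests come from the much larger cancer-free group.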

bbqturtle:

Thanks for the comment! I guess what I meant is that I find Yudkowsky extremely difficult to read, and I find OP very easy. So just a few more examples BY OP would be a huge value add for me

Brandon Hendrickson:

I'm at the LessOnline meetup right now (weird! intense! sometimes relaxing! super recommended!) so I won't be able to, but I agree with @cromulent — watching that 3blue1brown video I linked to will do it. (Though note, as someone did above, that all his examples are frequentist — they're about real quantities in the world, rather than subjective priors. If I had this post to do over again, I definitely would have included some of those.)

Anteros:

'Look ON my works....'

The misquote is glaring because it so obviously butchers the meter (ten syllables per line) that it can only be made by those who know nothing about the poem, or even poetry in general.

And that's my pedantry quota for the week :)

Phil H:

You’re right that the poem reads “look on”, but having an extra syllable in a line certainly doesn’t wreck the meter. Ozymandias contains two lines with eleven syllables, and many of the lines strongly and deliberately challenge the iambic meter (e.g. L12, “Nothing beside remains”). English meter is a flexible and generous tool!

Anteros:

Fair point, well made.

thefance:

I would assume "Ozymandias" is intended to be voiced with only 4 syllables (not 5) to keep the meter consistent. I.e. "OZ y MAN dias", instead of "OZ y MAN di AS".

also, I don't see an issue with line 12?

Phil H:

The last syllable of Ozymandias wouldn’t be stressed, no matter how many syllables it has. The question is whether there is an extra unstressed syllable in between the stresses of MAN and KING. I agree that you could read it OZ-y-MAN-di-as or OZ-y-MAN-dyas; but the meter of the poem is not a clue which reading you should prefer. Because the English iambic pentameter is not and has never been a strict set of rules. It has always been OK to slip in an extra syllable here and there.

L12 starts with a trochee, very deliberately placed to interrupt the flow of the meter and throw stress on the “nothing”.

thefance:

yeah, i mean, i guess that all sounds good and reasonable. that said, I think im gonna go on pretending that the meter is 100% consistent.

Unobserved Observer:

I also always make that mistake. I don't know anything about poetry, but the one line in isolation sounds better to my ears with "upon".

thefance:

It's cuz the original is iambic, but colloquial english tends to put the stress on the first syllable of a word (or at least that's the impression I get) (though obviously there's exceptions). Which lends to a more trochaic sound. So there's a natural conflict between how the original is written, vs how normies wanna say it by default.

Brandon Hendrickson:

(1) You're entirely right, (2) I threw in that joke at 1:30am before hitting "send" and actually thought "should I make sure I'm getting this quote right? Nah, even if I'm wrong, no one will notice!", and (3) the resulting pedantic argument about this is one of the most rationalist conversations I've ever seen and I am SO happy my mistake sparked it!

Brian Thurbon:

To make this type of thinking truly relevant to a "normie" middle schooler, it should be presented as something they can use to address an issue they care deeply about and face every day: navigating social relationships.

"Given the fact that my friend said something mean to me, what's the likelihood that they don't want to be friends anymore vs the likelihood that they're having a bad day?"

"Given the fact that my crush laughed at my joke, what's the likelihood that they like me too?"

These problems are messy. Base rates aren't always straightforward. But they're vital. Almost all of us are hard wired to care what others think.

I guarantee you'll get engagement with questions like these, and if the ultimate goal is less about getting people to do the math and more about examining and adjusting assumptions, this is the direction to go.

Unobserved Observer:

Those 2 questions are really great applications. Useful in showing how much of that sort of reasoning we do intuitively as well.

Victor:

Unfortunately, it's also likely to turn the class into a therapy session. The drama of ongoing relationships is going to totally overwhelm the actual lesson. Best to leave that sort of thing to individual instruction, where it can be handled in a sensitive manner.

Brandon Hendrickson:

I think you're pointing to a true danger here, but as a teacher, I wouldn't say that it can only be done 1-on-1. A wise teacher can exercise judgement as to when to speak of personal, emotional things. (Though on the other side of this, see my response to the original commenter, above.)

Victor:

There are also privacy issues involved, and children can't give informed consent. They will be talking about people who are not in the room, and feelings are likely to be hurt. I'm not saying that relationships can never be discussed, but make it about relationships in general, not actual ones involving specific people. Best leave that sort of thing to a safer space.

Tom!:

Thank you for this.

Retsam:

To me, this feels somewhat dangerous. Applying Bayes to social dynamics can be useful, but it has to be done carefully - it's very easy to get into a headspace where you don't take any nice thing anyone says at face value because it's drowned out by a high prior on "they secretly hate me"; the sort of prior that's very common in middle-schoolers.

Obviously, I'm not suggesting the idea that "person says X but means Y" is a foreign concept to middle schoolers, but I think explicitly giving them the tools to quantify it and teaching them that it's "rational" is potentially throwing gasoline onto the dumpster fire which is teenage emotions.

Honestly, even as a pretty emotionally stable adult, I try to be careful not to approach social dynamics with this kind of mindset: I recognize that it's "rational" and that people don't always mean what they say... but I also think that generally taking people at face value in day-to-day life is a social good, and the emotional highs of middle school are the last place I'd want to be explicitly undermining it.

Expand full comment
Brandon Hendrickson's avatar

I really like this; thanks for suggesting it. I'll only push back on two niggling points: (1) I don't think that practical social advice is more legitimate (or engaging) for middle school students than existential/worldview questions, and that there's a long and intellectually deleterious historical tradition of trying to make education more "obviously" practical (the book on this is Diane Ravitch's Left Back) that's worth being aware of (though your example would not, by itself, cross into that error). (2) In my experience, it's difficult for students to take seriously the wisdom of teachers to speak to the personal realms (friendship, love) of their lives. However, when that authority has been earned, then yes, this is an excellent thing to do! Thanks again for it.

Expand full comment
Melvin's avatar

I feel like it's time to de-emphasise the Bayes in Bayesian statistics.

Bayes' Theorem is a perfectly sensible piece of statistics, but somehow the online discourse around it has picked up shades of religious fervour. A young teenager googling about Bayes' Theorem quickly winds up in an online environment which feels more like a cult recruitment session than a statistics lesson.

I don't know how it became this way.

Expand full comment
Michael Weissman's avatar

True. Very unfortunate since Bayes gives a practical approach to familiar decision problems.

Expand full comment
Victor's avatar

It has taken on some of the characteristics of a tribal marker.

Expand full comment
Brandon Hendrickson's avatar

>> "somehow the online discourse around it has picked up shades of religious fervour."

Agreed — which was part of the fun in writing this!

>> "I feel like it's time to de-emphasise the Bayes in Bayesian statistics."

Is the only reason here because it could be construed as a gateway drug (ha!) to the online rationalist community?

Of course, I don't think that's so very terrible of a thing — though I feel like even if I believed that it was, another option would be to "rescue" it from the rationalists. If anyone wants to popularize it in a different context, I say go for it!

Expand full comment
demost_'s avatar

To put my fact nazi hat on, "the longest thing anyone has submitted for an ACX contest" wasn't your post, but an essay of 90 pages (not even a book review) submitted to last year's contest.

You probably get the prize for longest finalist, though, and a special prize for managing to write such a long entry that is engaging for the whole read.

Expand full comment
thefance's avatar

I can't help but ask: what was the 90-page essay about?

Expand full comment
demost_'s avatar

Honestly, I don't know. I couldn't make any sense of it. I read the first 20 pages and gave up. (It was the only review which I didn't complete, which says a lot about my OCD and about the entry.) You can read it here under "Sadly, porn", though the author reveals in the end that they haven't even read that book.

https://docs.google.com/document/d/1GYQw3pgvhi7hqOVR-Ql629Q_8thbyHe8sSRy5voyt30/edit?tab=t.0#heading=h.cdezdtonc8cn

Expand full comment
thefance's avatar

lol. I asked 1123581321 to read the "Sadly, Porn" review maybe a week ago, since he said he's a big TLP fan. He's still working on it. :^)

If you wanna hear my take, you're cordially invited to the last open thread, where I'm still debating him about the value of moldbug's Cathedral idea, and how it relates to "bureaucracy creep" (which is what's causing the narcissism epidemic TLP and Lasch kept going on about). Though I acknowledge a strong possibility that you're just gonna think I've lost my marbles as well.

top level comment by Richard Ngo, which kicks off a discussion about what this mysterious "Cathedral" is:

https://www.astralcodexten.com/p/open-thread-381/comment/116451395

Expand full comment
demost_'s avatar

Hahahaha, really? This is not a coincidence, because nothing is ever a coincidence. I wish 1123581321 good luck making sense of it, because I didn't manage.

I find it fair to debate how much value Moldbug's Cathedral idea has, but I don't have much to contribute. I joined SSC after the time when Moldbug had interesting ideas (as far as I can tell), and so I don't have a strong opinion about it.

Expand full comment
thefance's avatar

I must admit, I had an inkling of which review it was when you said it was especially long.

I think you misunderstood me though. I'm saying you might get something out of reading the thread, assuming you were still wondering about the thesis of the "Sadly Porn" review. Because I read the entire review, and I felt like I understood it pretty well, and I think it intersects with how moldbug understands modern history. Although moldbug was terrible at communicating his ideas imho. For a variety of reasons. Consequently, a lot of people seem to think he's just being edgy and doing the "reversed stupidity is intelligence" fallacy.

It's a "blindmen and the elephant" problem. Where lots of people catch glimpses of what I like to call "The Blob". But nobody seems to understand it in its entirety. moldbug is attacking it from a political angle, whereas TLP is attacking it from a psychological angle.

Expand full comment
demost_'s avatar

Thanks, I see. At least from the thread I found out that TLP stands for The Last Psychiatrist. This is interesting, I do recognize that Scott has mentioned this name a few times. But I still lack a lot of context. I have never read anything by either The Last Psychiatrist (except for 20% of a book review) or Moldbug, and I think that all the discussions around them took place before I joined the party. I wouldn't even vaguely know what topics they stand for, except for Scott's recent article on Moldbug.

Expand full comment
Brandon Hendrickson's avatar

NOOOOOOOOOOOOO

Expand full comment
Jonathan Moregård's avatar

Feedback: when I first saw the graphical version, it looked like a big 'L'. It took a little bit of time for my brain to snap into "adjacent bar chart". There might be benefits to doing examples where the odds are so different (100 vs 1) due to the dichotomy of it, but I wanted to name my brief confusion.

This is not a problem in the video, with the way they introduce the graphic

Expand full comment
Brandon Hendrickson's avatar

I appreciate that; I'll pay attention to it when I introduce this in the future.

Expand full comment
TT's avatar

I wrote an interactive bayes square a while back. I was going to extend it to include numbers and stuff, but never got around to it. Check it out:

https://tristantrim.github.io/bayesquares/

Expand full comment
Eric Rasmusen's avatar

I wrote up some Python and HTML a few years back, trying to learn how to create a web app. It's broken now. Maybe you'd like to improve it.

https://rasmusen.org/cgi-bin/bayesbox.htm

where the underlying code is

https://rasmusen.org/cgi-bin/bayesbox.py

Expand full comment
Dan's avatar

The NFL player example has "Monty Hall" / "Tuesday Boy" structure, where the categories you chose are contingent on the information you had. You telling us that it's either an NFL player or a math teacher, after we'd been previously considering "NFL player", strongly points to "math teacher".

If you'd recognized the thrower as a high school student, you presumably would've told us that the person in the bush is either an NFL player or a high school student. If the "NFL player" guess was wrong, you're telling us "either an NFL player or [the correct answer]".

Expand full comment
David Joshua Sartor's avatar

Yes!

Expand full comment
Cinna the Poet's avatar

Much of what you're talking about is done in the introductory critical thinking text Reason Better by David Manley.

https://tophat.com/catalog/humanities/philosophy/full-course/reason-better-an-interdisciplinary-guide-to-critical-thinking-david-manley/3425/

Expand full comment
David Manley's avatar

Thanks for the shout-out!

One difference: I use an easier-to-intuit version of the ODDS formulation.

First you ask "how much more likely is this evidence given H than not-H". That gives you a number, the Bayes factor. (I call it the "strength factor", as in a value representing the strength of the evidence.) Then it's simply:

prior odds in H × strength factor = new odds in H

I have found that for students this is far easier to do in one's head than the usual equation, or even trying to represent it pictorially.
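That two-step update is short enough to sketch in a few lines of Python. The prior odds and strength factor below are made-up numbers, purely for illustration:

```python
def update_odds(prior_odds, strength_factor):
    # posterior odds = prior odds * Bayes factor ("strength factor")
    return prior_odds * strength_factor

def odds_to_probability(odds):
    # convert odds of a:1 into a probability
    return odds / (1 + odds)

# Illustrative numbers: prior odds of 1:9 (a 10% probability), and
# evidence that is 6 times more likely given H than given not-H.
posterior_odds = update_odds(1 / 9, 6)      # 2:3 odds in favor
print(odds_to_probability(posterior_odds))  # about 0.4
```

Doing the same thing with the usual six-value formulation takes far more mental bookkeeping, which is the point.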

See also:

https://www.clearerthinking.org/tools/the-question-of-evidence

https://arbital.com/p/bayes_waterfall_diagram/

https://arbital.com/p/bayes_rule_odds/

Expand full comment
Kenny Easwaran's avatar

Hi David!

I’ve long thought that it’s unfortunate that probability is so much easier to axiomatize than odds, because I think that odds (or perhaps even log odds) give us much better intuitions for how extreme the top and bottom ends of the scale are, and that 90% probability isn’t extreme at all for many of the examples that people use.

Expand full comment
David Manley's avatar

hey Kenny! yes! that and also:

- it’s much easier to remember the rule “prior odds times strength factor” than any of the alternatives

- it’s much easier to mentally perform two separate mathematical steps in sequence (or three if you count converting one value to odds), than it is to mentally plug in six values and then try to hold them in your head as you are doing calculations.

- the first of the two steps gives you the Bayes factor or “strength factor”—how much more likely is E given H than given not H?—which (i) cashes out the relevant notion of evidence by defining whether E counts as evidence at all for H, and (ii) is very intuitive as a way to think about evidence strength.

Even outside of updating, reflecting on how much more likely this observation was given various hypotheses reminds us of various ways this could have occurred even if our preferred hypothesis were false— what would I have expected the world to look like if I were wrong? This is one aspect of “considering the opposite” that has been shown to counteract confirmation bias.

Expand full comment
David Manley's avatar

(on the last point: of course the other variations also ask you to assess P(E|H) and P(E|~H), but then those values get lost in a big equation. This version highlights that their ratio is what constitutes the strength of evidence and it's what operates on your prior to update you.)

Expand full comment
Anonymous Dude's avatar

One of the things I've realized in my old...well, middle age is that being good at math means I'm not very good at explaining it. "Well, duh, you just add that to both sides and you can get a quadratic equation! That's obvious, right?"

So, from my very limited point of view...well done, sir.

Expand full comment
Mike G's avatar

Good stuff! Random: Does it matter that ~5% of NFL players are quarterbacks? I wonder what the spiral is like for the typical o-line and d-line guy.

Expand full comment
Deiseach's avatar

I had a very wet-blanket response to this planned out in my head, but it was too negative. So I'll just be *mildly* Debbie Downer instead of full-on Eeyore.

"Consider, instead, just how irrelevant, useless, and impractical many of the things were that you threw yourself into when you were that age. Running a D&D campaign? Modding Half-Life? Learning Photoshop? Designing a fictional language?"

Looking it up, I see that American middle schools are in the age ranges 11-14 or 12-15. So let's take 12-15.

This statement has me (metaphorically) laughing-crying while I bang my head off the desk. The schools I was involved with definitely did not have kids doing these kinds of things. Instead, for example, we had 12 year olds already smoking cigarettes. This ties in with what you later say:

"A lot of the kids who I work with (not a representative sample: they skew toward the gifted and hyperactive sides of the spectrums) "

And I think it shows. Your model works great for the five schools where the kids are bright and interested in learning and thinking. Great, now how do we apply it to the other ninety-five schools where it's a struggle between "I cannot wait to get out of here" and "I'm only doing the bare minimum I need to do". You can draw all the pretty boxes you like, they do not and will not care.

Because those kids are not interested in nerdy topics or Bayes or anything other than sex'n'drugs'n' rock and roll, as it were. Currently where I'm working it's pre-school kids, and for some of them I already see that they come out of homes and a culture that is, to be blunt, welfare and scamming. We get them for three hours a day and it's the equivalent of "you have a pot plant which you carefully nurture with the right plant food and water it on a schedule and make sure to transplant it to bigger pots as it grows", then they go home for the other twenty-one hours of the day and it's "yoink the plant out of the pot, dump it on the bare concrete, it might get pissed on by dogs, it might get trampled underfoot, if it's lucky it'll put some roots down in a crack" until the morning when it's scooped up and put back in the pot for three hours.

(This, by the way, is why I have no patience for the lamentations of "schooooool is a prison for kids!" Yeah well going free-range is no picnic, either).

How are you going to inculcate a love of learning, or interest in knowledge, or getting them to think, when they're immersed in home and social lives of soap operas and following celebrity gossip?

I wish this was the way of education. I wish it could be scaled up for all schools. But I greatly fear it's going to apply to the few schools with the bright kids from supportive homes.

Expand full comment
Victor's avatar

Consider the possibility that there are significant numbers of children who are interested in sex, drugs, celebrities, *and* nerdy questions regarding the nature of reality.

Student: "My girlfriend believes in Bigfoot, what do I say to her?"

Me: "You're 14!!"

Expand full comment
Spinozan Squid's avatar

There's a big part of my experience that has a problem with these types of high-minded and noble theories of education.

My brother was an above average intelligence kid in a blue collar suburban school district. He struggled with math heavily starting at around Algebra 2. His issue? He never completely mastered one-digit multiplication: he would sometimes forget what 8x6 was, because he had kind of a checked out fourth grade math teacher. He never completely mastered nor completely memorized the quadratic formula. Little gaps like these that compounded over the years creating massive 'skill lag'. By the time he was in Algebra 2, there were so many fundamentals that he had to 'consciously load' to solve each problem, that he would just freeze up.

I feel like Bayes' Theorem is a great example of a 'skill lag' problem. It seems very simple for educated, intelligent people who read blogs and have degrees and careers that built a natural proficiency with numbers. But many people lack this type of numerical proficiency. I think students could conceptually understand Bayes' Theorem by itself, and I think they could perform the act of doing the manual calculations by itself. However, when you combine the two (conceptually parsing the problem enough to know what to calculate and calculating that thing), I think this creates too much to 'consciously load' for many students, and just like my brother in Algebra 2, many normal students would freeze up. If someone actually added Bayes' Theorem to Algebra curriculums, I think a common classroom dynamic would repeat: the gifted kids (who are likely to develop these skills anyway) would enjoy it and thrive, the above average intelligence grinders would muddle through enough to perform adequately on the tests, and everyone else would struggle. Math is boringly more about fundamentals than most people who already have them like to admit.

Expand full comment
Victor's avatar

That's a school-support issue. No teacher can realistically accommodate 100% of all students regardless of need. In your brother's case, he should have been referred to a one-on-one tutoring program until he caught up.

Expand full comment
Eric Rasmusen's avatar

I think your brother would have had a better time with Bayes Theorem than with multiplication. You can do Bayes Theorem with pictures. He could see the size of the rectangles and get an idea of the answer (2% vs. 20% vs 80%) even tho he couldn't calculate the exact number. But the big idea with Bayes Theorem is really qualitative, or at least ordinal.

Expand full comment
Skittle's avatar

> I think your brother would have had a better time with Bayes Theorem than with multiplication. You can do Bayes Theorem with pictures. He could see the size of the rectangles and get an idea of the answer (2% vs. 20% vs 80%) even tho he couldn't calculate the exact number.

You can do the exact same thing with multiplication. I don’t know about American schools, but it’s a big part of how it’s (ideally) taught in England at the moment. Kids make arrays until they are sick of them!

They still have to memorise their times tables, but I hope it helps with seeing that 7 x 7 is only one away from 6 x 8.

Expand full comment
Phil H's avatar

"Math is boringly more about fundamentals than most people...like to admit."

Amen to this. I sometimes teach maths to young kids in China, where most parents are fully invested in the Asian=good at maths trope. Quite often, when I tell them that my course for seven year olds starts with quite a lot of counting, parents bridle and ask if I'll be quickly moving onto grade-appropriate material for their little Jimothy, who is quite advanced, actually...

As I'm in marketing mode, I make soothing noises at them until they start paying money, and then go right on with teaching lots and lots of counting. It's always valuable. (Follow up: times tables board games.)

Expand full comment
Don P.'s avatar

I've begun noticing that when most people report unashamedly that they're "bad at math", they don't even actually mean "mathematics"; they mean they can't multiply two 3-digit numbers with pencil and paper.

Expand full comment
Kenny Easwaran's avatar

What I’ve found is that giving people a different angle about why to care about what you’re learning gives them a new chance to master those basics, because those basics become relevant to a new thing they care about.

Expand full comment
Unobserved Observer's avatar

Couple of thoughts. They're somewhat critical, so I'll just preface by saying that I sympathize with what you're wishing for here! I am a bit skeptical though.

1. I don't know anything about Kieran Egan's work, but I don't think you need anything you mentioned from him here in order to motivate the techniques you're proposing, and it makes it needlessly esoteric. Some things (like visuals) are going to naturally work better for humans than other things, and I don't think it's because visuals are more "embodied" or narrative-like. The same goes for making it vital. Things we're interested in, particularly things we can interact with in some way, will motivate us more than abstract knowledge. That motivation is essential in getting us to put work into difficult things like understanding equations. Especially for children.

(Making it visual the way you're doing it here is a good idea because it's directly downstream of 2 big heuristics. The first is that visual aids do just seem to be helpful regardless of the content. Possibly there's an advantage in freezing things in time and seeing everything at once (equations also have that advantage though); a good visual will also make the connection between parts of an idea more obvious.

The second heuristic is that making things more concrete is better. An equation consisting of only variables is of the most abstract things we've got. There's beauty in that, but I doubt most kids can appreciate that beauty. Making it about specific numbers and showing how that they interact visually gives you something to hang on to while you're busy grasping the basic idea. I don't know if this is just me, but I think when teaching pretty much anything the concept should be expressed first with a concrete example and then the more abstract description/formula in terms of the specific example given.)

2. I just don't buy that this will work for the majority of kids. Even if you make it about something they're interested in, if it gets too abstract or difficult it seems to me that most kids would just decide that they don't care *that* much about whether Bigfoot exists. There'll always be some that are more naturally inclined towards this sort of thing, but I genuinely think that it's basically not going to happen for a lot of kids.

There's anecdotes about kids/teenagers putting a lot of time and effort into learning math or programming just because they were really interested in modding their favorite video game or something. I think that does work because there's a great feedback loop in learning for the sake of creating or doing something. The motivation there is much greater (and often more frequent) than the motivation you'd get for wanting to know how likely something is. Maybe if you applied it to a game that pretty obviously involves probability and then kids who are better at updating correctly will win more often? That's a pretty limited application though.

An additional related point; I don't know how kids would deal with a genuinely Bayesian philosophy (if you have anecdotes about this I'd be interested). The fact of the matter about whether the football thrower is a professional player or a math teacher is already out there. It's not really probability that we're calculating, it's degrees of belief. Particularly once the evidence gets more involved. Will most kids be able to grasp that?

3. > And all of this is toward the goal of helping lift them out of the Matrix, so they can see what they’re studying as imperfect, historically-contingent tools that they can, as autonomous agents, choose to use as they see fit.

This is kind of a nitpick, and I don't think it implies not teaching Bayes' theorem or rationality in general, but I wonder: should we even want this? It seems plausible to me that it's good for kids to have a really stable, necessarily simplified belief system until they're older and more emotionally ready to deal with the complexity of the real world.

4. > If that much human richness and potential can be pulled out of just one piece of the curriculum (albeit an important bit!), what could be achieved if we re-humanize the rest? What hidden vitality lies in poetry, or geography, or punctuation? With ancient history, or economics, or biology?

I'm not sure what you're suggesting here. Bayes is important because it's a general method. You're going to get a lot of richness because it has implications for how everything involving truth gets done. The other topics are individual subjects. They're great, and we can see how great they are by looking at the kids that are already enthusiastic about their favorite subjects, but I don't think you're going to get a paradigm shift there the way you would if everyone were more rational.

My ideas after reading the essay for how to teach these other subjects are a. make it visual (or use other cog-sci techniques), b. make it vital (i.e. interest/motivate). I agree that we should do those things, but haven't people been saying all of that for years? And if it's the case that those ideas are fairly widespread, doesn't that show that what we really need are better teachers or smaller class sizes or something? I have no doubt you'd be a much better than average teacher and that your camp teaches Bayes' theorem better than almost anywhere else. But isn't that really a function of time, resources and really smart, motivated teachers (and possibly a body of students who are more similar to each other than average) rather than some radical new idea? (You're going to tell me to read the full book review, aren't you?)

Expand full comment
Perelandra99's avatar

What’s the probability that ALL 526 recorded witnesses to Christ’s resurrection were mistaken or lying?

—That was actually the question Reverend Thomas Bayes was trying to answer, which he wrote in response to David Hume’s essays against the existence of miracles.

But aside from that, there’s the fatally subjective issue of choosing priors based on rates.

You can’t just pick “how many NFL players” are in the world as a base rate, but how many would be in your park. Do NFL players hang out throwing a football in public parks at all? What day of the week is it, and how many NFL players would be at the stadium or traveling to an away game on that day?

Then what about math teachers? How many math teachers would be in class that day? Where did you get 1% as percentage of math teachers who can throw a perfect spiral? How many math teachers did you test throwing a football to get that base rate?

All of the heavy lifting is in choosing these priors and base rates—the formula itself is trivial.

Expand full comment
The original Mr. X's avatar

>All of the heavy lifting is in choosing these priors and base rates—the formula itself is trivial.

Yeah, that's the problem with over-reliance on Bayes' Theorem -- it's fine when you have clear and objective base rates, but when you don't it mostly just becomes a way of laundering your own prejudices.

Expand full comment
Kenny Easwaran's avatar

I think there is value to making your own prejudices very visible, and clean, and nicely pressed. That makes it much easier to focus on them.

I know that many of the people in this online rationalist community seem committed to an objective version of bayesianism where there is a correct prior and a correct set of likelihoods, but I think all we can really justify is the subjective version - but that factorizing things into priors and likelihoods makes it easier to understand why different people are getting to different conclusions.

Expand full comment
The original Mr. X's avatar

>I think there is value to making your own prejudices very visible, and clean, and nicely pressed. That makes it much easier to focus on them.

Sure, but I haven't seen any evidence that learning Bayes' Theorem actually makes people more likely to do that. If anything, it seems to have the opposite effect -- because everything's now all maths-y, people think they're being scientific and objective when in reality they're being the complete opposite.

Expand full comment
Kenny Easwaran's avatar

Yes, I do think there are a lot of problems of this sort.

Expand full comment
Dan's avatar

Strongly agree. I'd prefer to see Bayes' theorem reserved as a serious mathematical tool. What I see in the wild (including most comments on ACX) is that the vast majority of its use is people trying to legitimize their own BS. People talk about their "priors" when they really just mean "assumptions made without adequate data". Too many members of the rationalist community have allowed poor use of Bayes theorem to lead them to absurd and faulty conclusions because they didn't question the validity of their inputs enough.

People seriously underestimate how much good data is necessary to make good mental models. Unless you're willing to do a Nate Silver-level amount of data gathering and analysis in a particular field, I would urge people to reserve Bayes for theoretical math problems.

Expand full comment
Don P.'s avatar

And the related issue that I see all the time in arguments is an implicit belief that 99% is the biggest possible amount of something that's not "all", and 1% the smallest that's not zero.

Expand full comment
Benjamin E Nachumi's avatar

Do the Monty Hall paradox next.

Expand full comment
Mahatsuko's avatar

Sure. Let's say that you choose Door A, and Monty responds by opening Door B to reveal a goat. Assuming all the standard rules for the game, you could use the following chart.

https://ibb.co/XfhTvbYy

The green rectangle is bigger than the red rectangle and blue rectangle, so you should choose Door C.
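If anyone prefers the arithmetic to the picture, the same chart follows from a direct Bayes calculation under the standard rules (uniform prior, Monty always opens a goat door other than yours):

```python
from fractions import Fraction

# You pick Door A; Monty opens Door B to reveal a goat.
# Likelihood of "Monty opens B" under each hypothesis:
#   car behind A: Monty picks B or C at random -> 1/2
#   car behind B: Monty never reveals the car  -> 0
#   car behind C: Monty must open B            -> 1
prior = Fraction(1, 3)
likelihood = {"A": Fraction(1, 2), "B": Fraction(0), "C": Fraction(1)}

joint = {door: prior * lk for door, lk in likelihood.items()}
total = sum(joint.values())
posterior = {door: j / total for door, j in joint.items()}

print(posterior)  # A: 1/3, B: 0, C: 2/3 -> switch to Door C
```

The 2/3 for Door C is exactly the green rectangle in the chart.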

Expand full comment
Benjamin E Nachumi's avatar

I just think of it this way: P(car in doors B or C) was 2/3, and remains 2/3 because cars do not quantum tunnel. So now after Monty opens door B revealing a goat, 2/3 of the time the car is behind door C.

Expand full comment
Mahatsuko's avatar

I agree that that is a more intuitive way to approach the problem, but it isn't Bayes' theorem, so a method of illustrating Bayes' theorem can't use it (unless I'm missing something).

Expand full comment
Benjamin E Nachumi's avatar

The intuitive argument in pictures looks a lot like yours, but I like it better with two diagrams for before/after the goat revelation.

Expand full comment
leopoldo blume's avatar

I'll wade right in:

It "smells" like your essay is dancing around an explanation of how you can use Bayesian probability to somehow prove the non-existence (or extreme unlikelihood) of a God, i.e. you have come up with a way to convince kids (or people in general) that theism is silly (presumably because you are an atheist and you think this will make society more rational and therefore better off in general).

Could you elaborate on the reasoning and probabilities you use to reach your conclusions about this? (assuming I'm not way off base - if I am I apologize in advance for the question - and you really do reach such conclusions).

Expand full comment
Victor's avatar

You had me at Bigfoot.

If I still taught in a public school, I would try to adopt a lot of this. If my children still attended school, I would send them to a class that used these methods.

It's not the answer to everything, of course. We need an approach for when you suspect someone is trying to deliberately deceive you, or for when someone is using an argument to undermine you. What's the rationalist approach to hate speech, and how do you diagram that?

Still, an appreciable advance. Kudos.

Expand full comment
Mo Diddly's avatar

Great essay, though I got a little rattled by the fact that the first equation is not actually an equation.

Expand full comment
David Dunn's avatar

Possibly worth noting that the 70's and 80's Republican tax program were driven by Art Laffer drawing what became known as the Laffer Curve on a napkin belonging to Jack Kemp, then showing it to a bunch of people https://www.washingtonpost.com/archive/lifestyle/1986/08/31/the-lies-of-taxes-are-upon-you/9d37fcf2-5c88-42ba-a6cd-3584dc14403d/

Expand full comment
Gilpish's avatar

Your diagram is not to scale and it detracts from the point you are making. The whole point of Bayes blocks is to make visual comparison easy, but the block representing the NFL players looks *bigger* than the block representing the math teachers who can throw well.

I'm guessing you did this because the numbers you used make it very hard to see the two relevant parts of the diagram, but you should really have chosen an example with numbers which work better visually. I think because we are talking about statistical reasoning and logic it's important for this stuff to be to scale. At the very least it should be visually obvious that the green block is smaller than the red block.

Here- I made a version of it to scale. (yes, the NFL players are on there, just a green line right at the edge of the block.)

https://i.imgur.com/ZcVbm0M.jpeg

and here's the same numbers as squares with the same areas (easier to compare sizes)

https://i.imgur.com/fUCi8SK.jpeg

and here is the updated bayes block after removing the math teachers who can't throw.

https://i.imgur.com/3Ntyzi6.jpeg
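For anyone who wants to check the proportions, here's a quick sketch with numbers drawn partly from this thread (the 1,500,000 math teachers and the 1% spiral rate) and partly assumed by me (the NFL roster count and its 90% spiral rate):

```python
# Assumed/illustrative population sizes and spiral rates.
nfl_total, nfl_can_throw = 1_700, 0.90            # my guesses
teachers_total, teachers_can_throw = 1_500_000, 0.01  # from the thread

nfl_block = nfl_total * nfl_can_throw                  # green block: 1,530
teacher_block = teachers_total * teachers_can_throw    # red block: 15,000

# Posterior probability the spiral-thrower is a math teacher.
p_teacher = teacher_block / (teacher_block + nfl_block)
print(f"{p_teacher:.0%}")  # ~91%: the green block is ~10x smaller
```

Which is exactly why an out-of-scale diagram is so misleading here: the visually "bigger" block is the one that should be a sliver.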

Expand full comment
Gilpish's avatar

to add to this, the greaterwrong article on arbital is a good example of a to-scale bayes box https://arbital.greaterwrong.com/p/bayes_rule/

Expand full comment
priorGuesstimator's avatar

I see that a lot of work was done here, but unfortunately this explanation still didn't "click" for me. I understand the theorem intellectually and can do the calculations, but even having this helpful diagram doesn't make me instinctively use Bayesian reasoning for those kinds of problems.

Maybe the problem is that I'm kinda far from NFL in general, so the choice of the example is not the best for me personally. But again, I try to imagine how I would explain this model of reasoning to my less math-inclined friends and relatives and I draw a blank.

Like, what numbers are chosen for the bottom line and what for the vertical bars? Why is there such a big empty space on the right in most of the examples? Can the right big rectangle be higher than the left thin one and what would that mean? Why is thin always on the left? Can and should you make a chain of those in case you have multiple steps and how would they look - fractal-ish, maybe? Why isn't the bottom line drawn thicker than the rest of the borders if we start with it as a prior?

Maybe there is a more detailed article somewhere with like 15-20 examples of different kinds? I've read Eliezer's original article and it didn't click then either.

Expand full comment
Eric Rasmusen's avatar

I don't think The Existence of God is too spicy. I could use it in the fundamentalist school where I teach, though maybe it is too spicy for public schools. My 7th graders would be very interested. 7th graders know what the big questions are, and they like to argue. So Pascal's Wager first, and then Bayes's Rule and Hume's Miracles Argument, would make a great series of class sessions. It isn't pro-God or anti-God: the arguments are not conclusive, just clever and useful. It can be taught in a balanced way.

Expand full comment
ike saunders's avatar

I think the annotations in these diagrams appear slightly misleading once you add the likelihoods. I'd guess that a majority of people, if shown one of the pictures in section 3 with no other context, would think that the red rectangle represents 1,500,000 teachers, not 15,000. Instead I think black arrows should be used to annotate the size of the priors, and coloured arrows the size of the likelihoods.

Expand full comment
Rafael Kaufmann's avatar

Just dropped by to say I also have a two-year-old who loves the Bayes for Babies book, also primarily because of all the BALLS

Expand full comment
Seth Benzell's avatar

Hi Brandon, great post. I'm also very interested in teaching people about Bayes. From two perspectives.

As a professor, teaching probability and statistics courses at the sophomore level, I put a big focus on Bayes that isn't there in most textbooks. I spend about a week on it in a semester class. My favorite lesson plan is the following: After the intro material, we use Bayes rule to unpack the case of Sally Clark https://en.wikipedia.org/wiki/Sally_Clark, a woman who was accused of murdering her two children. This was a case with 100% circumstantial evidence, and with some famous statistics errors, making it a good example.

After the students listen to a podcast discussing the case, we talk about how to fit it into the Bayesian framework. In small teams the students then discuss what a good prior would be (and I talk about the "prosecutor's fallacy" here), discuss the correct way to calculate the probability of two babies dying of SIDS if the mother was innocent (a doctor who testified incorrectly assumed SIDS deaths were statistically independent), as well as the probability of the two dead babies NOT being found if she did murder them (low, but always funny to discuss). The students put their guesses into this worksheet: https://www.dropbox.com/scl/fi/b6tuhb05hv63cnb55sye3/sally-clark-bayes-rule-exercise.xlsx?rlkey=ei1mi6jbde8uzcf5rjevm8jty&st=i50siwa0&dl=0, which automatically calculates the posterior based on their assumptions, so they can play with them.
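The worksheet's core calculation can be sketched in a few lines. Everything numeric below is a placeholder guess for illustration, not a figure from the worksheet or the trial:

```python
# A minimal sketch of a Sally Clark-style posterior (all numbers are
# hypothetical placeholders, NOT the worksheet's or the case's figures).
prior_murder = 1e-6             # prior that a given mother murders both infants
p_deaths_given_murder = 1.0     # two dead infants, given murder
p_deaths_given_innocent = 1e-5  # two infant deaths given innocence; note SIDS
                                # deaths in one family are NOT independent

posterior = (p_deaths_given_murder * prior_murder) / (
    p_deaths_given_murder * prior_murder
    + p_deaths_given_innocent * (1 - prior_murder)
)
print(round(posterior, 3))  # with these guesses, 0.091 - far from certain guilt
```

The point the exercise makes survives any reasonable choice of placeholders: a tiny P(evidence | innocent) does not imply a tiny P(innocent | evidence) unless the prior supports it.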

It's always a popular class, I think because the topic is so grisly and immediately engaging, and also because the students get to feel smarter than a British jury.

I also co-host a podcast called "Justified Posteriors" on Substack https://empiricrafting.substack.com/podcast that follows a Bayesian framework. Every other week we read a paper on economics and AI, explain our prior, and then see how much the paper changes our beliefs. I think too little scientific analysis takes this approach!

Anyway I salute you Bayes-king

Expand full comment
spandrel's avatar

I am often in the position of explaining why Bayes' theorem matters to my colleagues. I am a research methodologist who plans a lot of clinical studies with smart people from other fields, so they aren't the target audience of educators; instead, they are people who lean heavily on their frequentist experience, and who would generally prefer to ignore the problem of coming up with priors. So I came up with a modification of the Raven Paradox to make it clear why priors matter.

First, assume you take as a null hypothesis that “all ravens are blue”, and want to design a study to reject this hypothesis. You sample 100 ravens and find that none of them are blue; the frequentist conclusion is that the prob(ravens are blue) is 0% (95% CI = [0,0.04]). Not likely! So reject the null hypothesis.

But it’s hard to actually sample ravens! We can't find any, in fact. But we notice that logically, “all ravens are blue” is equivalent to “all non-blue things are not ravens”. So we decide since it’s hard to find ravens we’ll just test this null hypothesis instead, and sample 100 random objects that aren’t blue. We find that all of them are not ravens, and estimate that the prob(non blue things are not ravens given our data) is 100% (95% CI =[.96,1]). Do not reject – it’s almost surely true.

Thus, we have tested ‘equivalent’ null hypotheses but arrived at extremely different conclusions. What happened? We ignored our priors. If we incorporate accurate prior probabilities then we should arrive at very similar probabilities that the null hypotheses are rejected.

Usually I just stop there, because my audience gets it and we’re all busy people. But for the interested, here is how we’d work it out.

Approach 1.

Prob(ravens are blue|data) = P(H|E) = P(E|H)P(H)/P(E)

Note that P(E|H) = prob(observe no blue ravens in 100| all ravens are blue) = 0, so we don’t even need to know P(H) or P(E), only that P(E) > 0. Then

P(E|H)P(H)/P(E) = 0 * P(H)/P(E) = 0

We can reject the null hypothesis with certainty.

Approach 2.

Prob(non ravens are not blue|data) = P(H|E) = P(E|H)P(H)/P(E)

Here, P(E|H), the probability that we observed no ravens in our sample of non-blue things if our hypothesis is true, is 1. That's how we got the 100% estimate earlier. Now let's say the probability of our evidence P(E) – the probability that we draw a bunch of non-blue things and see no ravens – is 0.999999999, given what we know about the number of ravens vs. objects in the world; we might look at a million non-blue objects before we see a raven. Then

Prob(non blue things are not ravens|data) = 1*P(H)/0.999999999 ≈ 1.000000001*P(H)

Then our probability of rejecting the null hypothesis given the data depends almost entirely on our prior P(H) – how strongly we believe in our hypothesis. And it’s not much! Something very close to 0 in fact, say 10^-20. Which would bring Prob(all non blue things are not ravens| data) very close to 0 as well. Again, we reject the null hypothesis (though with perhaps slightly less certainty).

Thus, priors matter.
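The two approaches above can be checked numerically in a few lines (the prior and P(E) are the commenter's illustrative guesses, not measured quantities):

```python
# Approach 1: sample 100 ravens, none blue. P(E|H) = 0, so the numerator
# of Bayes' rule is 0 and the posterior is 0 regardless of P(H) or P(E).
p_H = 1e-20                # prior for "all ravens are blue" (illustrative)
posterior_1 = 0.0 * p_H    # = 0: reject with certainty

# Approach 2: sample 100 non-blue things, none of them ravens.
p_E_given_H = 1.0          # guaranteed if the (equivalent) hypothesis is true
p_E = 0.999999999          # ravens are vanishingly rare among non-blue things
posterior_2 = p_E_given_H * p_H / p_E   # barely above the prior: still ~1e-20

print(posterior_1, posterior_2)  # both effectively zero: consistent conclusions
```

With the prior included, the two "equivalent" hypotheses no longer give contradictory answers.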

Expand full comment
Moritz's avatar

I find Bayes theorem to be much more natural when thinking in odds, not probabilities. In your example:

- Prior odds of math teachers v NFL players: 1000:1 (i.e. 1,500,000 teachers vs. 1,500 players)

- Observation odds: 1:100 (i.e. 1% likelihood vs. 100%)

- Posterior odds? Easy - just multiply the two together. 1000:1 x 1:100 = 10:1 against the NFL player, i.e. about 9%.

Indeed Bayes’ theorem, when reformulated in odds language (the odds of an event E being p(E):p(not E)), simplifies to the straightforward multiplication above. I wonder why it’s not taught this way - maybe I’m overlooking something?
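As a sanity check, the odds arithmetic above in code, using the comment's numbers:

```python
# Odds-form Bayes: posterior odds = prior odds x likelihood ratio.
teachers, players = 1_500_000, 1_500   # prior odds 1000:1 (teachers : players)
lr_teacher, lr_player = 0.01, 1.0      # observation odds 1:100 (great throw)

post_teachers = teachers * lr_teacher  # 15,000
post_players = players * lr_player     # 1,500  -> 10:1 for teachers

# Convert the 10:1 odds to a probability for the NFL player.
p_player = post_players / (post_teachers + post_players)
print(round(p_player, 3))  # 0.091, i.e. about 9%
```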

Expand full comment
Ppau's avatar

Some people do explain it that way! 3Blue1brown among them

Expand full comment
kipling_sapling's avatar

> not because I think we should expect kids to sit down and do the math on their own, in their everyday lives. (Do any of us actually do that?)

I confess I'm a little confused. Indeed, I don't do that. But my impression from frequenting these communities was that "doing the math" in our everyday lives was precisely the goal for LWers -- and I always felt a bit inferior that I don't do that, and even though I was once an aspiring actuary, it would take quite a bit of mental effort to do it correctly for everyday situations.

So what IS the goal of teaching Bayes? "Developing intuition for everyday probabilities by working a lot of specific examples" seems like a reasonable answer, but I don't think I've seen people claim that, not when they're specifically *teaching Bayes* in rationalist spaces at least. If that's the goal, wouldn't it be good then to have students preregister their estimates for a probability, then do the math, then check the accuracy of their initial estimates, and track their progress over time? More than being able to do all the steps infallibly, being skilled at estimating the answer quickly (with System 1) would be the skill that actually translates to "improving intuition for everyday probabilities," wouldn't it?

As I mentioned earlier, I've felt inferior for years at the fact that I don't have a great intuition for how to apply Bayes to everyday life and would struggle to even set up the problem correctly to accurately calculate probabilities using it for everyday problems. I've often wished that the introductions I've read would have more practice problems, more exercises that demonstrated exactly how it works for a normal layperson going about their daily life. I agree with you that worldview issues are probably the best hook for most people -- do you have some practice problem bank for such questions? I think for me, the best way to build my muscles with this stuff would be to have a few hundred problems that are a mix of worldview issues, casino-game probabilities, the standard medical/insurance type problems, and fork-in-the-road-of-life-how-do-I-make-the-right-move questions. I realize that the "answers" to the problems would depend on the prior probabilities that *I* assign, but that's not an insurmountable problem. A web applet could take my prior probabilities and provide an objective answer I could check my calculation against, and could even integrate with an LLM that could let me know when I've entered a prior probability that it judges to be way out of whack (like if I've smuggled assumptions about the posterior probabilities into my prior numbers, a classic rookie mistake!)

I don't want to go back and rewrite my whole comment, but I'm realizing at this juncture that part of your thesis is that the *social* aspect is what develops the intuition. I think that's great, but I still think the preregistration AND having a wide variety of problems from different spheres would go far.

Expand full comment
Roman's Attic's avatar

Reading through this article, I'm realizing that being gay taught me rationality. Learning about denial and how never to experience it again taught me a lot of the scout mindset, and having crushes on people that *maybe seemed gay but were actually straight* taught me about Bayes' theorem and the low base probability.

Expand full comment
Roman's Attic's avatar

The best image I've seen representing Bayes' theorem is The Decision Lab's cartoon talking about the base rate fallacy (https://thedecisionlab.com/biases/base-rate-fallacy)

Expand full comment
TheKoopaKing's avatar

I think Bayes's theorem is useless outside of empirical fields where base rates are clearly measured. In the majority of those cases you are simply being told of something that will impact the probability outcome, and of course you will use some sort of multiplicative operation to calculate that outcome. But in most cases you are not really multiplying anything; you are vaguely reasoning about what impacts what. That's what "learning" should be geared toward: making models of the world with more things in them, not Bayes's theorem.

Expand full comment
Alcibiades's avatar

One small suggestion: when you are using visual aids, make sure they actually make sense visually. The areas you use to represent the NFL players and the math teachers are about the same size, but the math teacher area should be 10x larger. What you've shown would work well if the probability were 50%. It doesn't make sense for 9%, and instead is just another source of confusion.

Expand full comment
Eric Rasmusen's avatar

Psychologist Gerd Gigerenzer has devoted a lot of his work to showing that (a) even smart people like doctors don't get Bayes Rule, but (b) even dumb people like children can understand it if you teach them the right way.

Somebody has taught Bayes' Rule with Legos; I forget if it was him. Anyway, here is one example of his work:

See Cognition, 98, 2006, 287–308. www.elsevier.com/locate/cognit

Children Can Solve Bayesian Problems:

The Role of Representation in Mental Computation

Liqi Zhu

Institute of Psychology, Chinese Academy of Sciences, Beijing

Gerd Gigerenzer

Max Planck Institute for Human Development, Berlin

Abstract. Can children reason the Bayesian way? We argue that the answer to this question depends on how numbers are represented, because a representation can do part of the computation. We test, for the first time, whether Bayesian reasoning can be elicited in children by means of natural frequencies. We show that when information was presented to fourth, fifth, and sixth graders in terms of probabilities, their ability to estimate the Bayesian posterior probability was zero. Yet when the same information was presented in natural frequencies, Bayesian reasoning showed a steady increase from fourth to sixth grade, reaching an average level of 19%, 39%, and 53%, respectively, in two studies. Sixth graders’ performance with natural frequencies matched the performance of adults with probabilities. But this general increase was accompanied by striking individual differences. More than half of the sixth graders solved most or all problems, whereas one third could not solve a single one. An analysis of the children’s responses provides evidence for the use of three non-Bayesian strategies. These follow an overlapping wave model of development and continue to be observed in the minds of adults. More so than adults’ probabilistic reasoning, children’s reasoning depends on a proper representation of information.

Expand full comment
David Bahry's avatar

My favourite way is to

1) start with the verbal intuition, "How likely something is after seeing a new piece of evidence, depends both on how strong the new evidence is, and on how likely it already was to begin with. Think 'extraordinary claims require extraordinary evidence.' The lower something's prior plausibility, the stronger the new evidence would have to be to convince you."

2) give the odds form of Bayes' theorem: posterior odds = prior odds × Bayes factor. It's more transparent to intuition than the probability form. Explain how the terms line up with the verbal version, including BF as measuring the strength of the evidence.

3) Explain Bayes factors, including examples. "BF is the ratio of how strongly the hypothesis predicts the evidence, to how strongly we'd expect the evidence even if the theory's false. Imagine someone on trial for murder telling us he's innocent. That's only weak evidence, because he'd probably tell us he's innocent whether or not he was. The strongest kind of evidence would be something you'd expect to see if the hypothesis is true, *and* strongly expect not to see if the hypothesis is false."
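The update rule in step 2 is tiny in code. Numbers below are made up to match the murder-trial example:

```python
# posterior odds = prior odds x Bayes factor (BF measures evidence strength).
prior_odds_guilty = 1.0                 # even odds before the statement (made up)
p_claim_given_guilty = 0.95             # guilty defendants also claim innocence
p_claim_given_innocent = 0.99           # innocent defendants almost always do

bayes_factor = p_claim_given_guilty / p_claim_given_innocent  # ~0.96: near 1
posterior_odds_guilty = prior_odds_guilty * bayes_factor
# The statement barely moves the odds - weak evidence, as described above.
```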

Expand full comment
David Bahry's avatar

I see Eliezer also started teaching it with the odds form! https://www.lesswrong.com/w/bayes-rule?lens=high-speed-intro-to-bayes-s-rule

Expand full comment
sclmlw's avatar

This is great. One concern I'm glad you mentioned, but that sometimes feels like blasphemy to even bring up in these spaces, is this part:

> Bayesian reasoning can become confirmation bias on steroids. You have to be humble in your analysis, because there are SO MANY DIFFERENT WAYS IT CAN GO WRONG.

Take the football example you mentioned. There are so many wild guesses in that example, plus contrived certainties. (E.g. the guy who goes into the bushes knows it's one or the other with certainty for some reason? Real situations are rarely this binary or clearly known. Or the idea that you'd know the frequency distribution of NFL players and teachers? In my experience people just guess, which is worse than nothing.)

I feel like there's too much worship at the altar of Bayes. Mathematical equations are only as good as the assumptions they're based on. Bayes' theorem assumes you know 3 variables in order to get a probability for the fourth. I can't tell you how many times I've seen people whose only certainty is their own prior try to apply Bayes' theorem to the problem, guess wildly at the unknown variables, and come away with a reduced uncertainty that's totally unjustified.

I recommend ensuring your curriculum has these kinds of situations built in, so you can also teach students under which circumstances they should NOT reach for Bayes' theorem.

Expand full comment
NoodleIncident's avatar

In the NFL example of the box diagram, it’s drawn so out-of-scale that the visual intuition actually misleads you! Since the light blue box (1/1000) is drawn even wider than the dark blue box (1/100), the visual shows more-than-even odds of an NFL player.

I’m not sure if attempting to shade 1/1000 of the width would be helpful, either, but it was so distracting that I had to mention it anyway.

Expand full comment
Alan Thiesen's avatar

This post reminds me of a problem I have with Bayes' Law. Whenever I try to use it to update the probability of a possible future event, Bayes' Law is useless because the conditional probabilities on the right side of the equation are harder to estimate than the one on the left. What am I doing wrong?

I'll illustrate this in the discrete case with two alternatives. Let P(F) be the prior probability of a possible future event F. A recent event E that affects P(F) has occurred. To estimate P(F|E), I say:

P(F|E) = P(F)*P(E|F) / ( P(F)*P(E|F) + P(~F)*P(E|~F) )

But the conditional probabilities on the right-hand side are of the form

P(something that has already happened | a possible future event that might or might not happen).

It is difficult to understand what such a probability even means, let alone estimate it. The conditional probability on the left is easier to estimate, so Bayes’ Law is useless.

What am I doing wrong?

Expand full comment
thefance's avatar

this is, without a shadow of doubt, the #1 complaint about bayes stanning.

Expand full comment
David Wyman's avatar

Whatever our priors, constantly updating based on new information eventually works. Like the free market and gravity, you can interrupt it, game it, or overcome it (crony capitalism, upward force), but it doesn't go away.

Expand full comment
Mo Nastri's avatar

Unless our priors are trapped of course. Which is probably the case for the most important cognitive biases of all https://www.astralcodexten.com/p/trapped-priors-as-a-basic-problem

Expand full comment
Dylan Kane's avatar

I'm curious how you see Geary's distinction between biologically primary/secondary knowledge fitting in here (https://psycnet.apa.org/record/2008-16048-002).

In short, Geary argues that we've evolved to learn some things (biologically primary) like language, many motor skills, social conventions, basic physical intuitions. We learn these without any formal instruction through exposure and immersion. Then there are some things we haven't evolved to learn (biologically secondary). These are things like reading, solving systems of equations, chess, and Bayes' theorem. There are some bright people who can pick up this stuff without formal instruction but most people need some sort of structured learning to understand these ideas.

Does Geary's work fit into your view on education? Are you trying to harness those biologically primary modes to teach biologically secondary skills?

Expand full comment
Fabian's avatar

I don't get why Bayes alone should be the change for the world.

There are many other influences that make people irrational (all kinds of stress) before Bayes can even kick in to lift rationality to another level.

"Could a new kind of school make the world rational?"

wouldn't the rational thing be to put Bayes after other things like:

Nutrition, physical fitness, sleep (understanding the importance of sleep, sleep hygiene and routines), mental & emotional health, financial literacy, critical thinking and media literacy....

then Bayes...

then communication & relationships (active listening, conflict resolution, setting boundaries, navigating romantic relationships), life skills (first aid, emergency response, time management, ...)

and many other important things...

Expand full comment
thefance's avatar

The most common form is the equation. The 2nd most common form is the boxy rectangles. But when I first came across Bayes Theorem on LW, I invented my own visualization. Which, curiously, I have yet to see anyone else demonstrate.

Mentally, I have a 3D image of a building, where each floor represents either a circle or a Venn diagram of circles. The area of each circle represents an absolute quantity, and the area of the floor represents the entire population size. The multiplication operator (or division operator) represents a "telescoping" relationship between the circles from one floor to the next. In this frame, Bayes' theorem represents how one particular relationship between two circles gets shifted into a different relationship between two circles.

It's very difficult to explain in words without a diagram. But I think it's somewhat easier to follow if \omega (i.e. the entire sample) is made explicit, rather than hidden away for convenience. E.g. P(H) should be fully unrolled as P(H|\omega) because it makes the underlying concept of "multiplication qua telescoping" [e.g. P(E&H|\omega) = P(E|H) * P(H|\omega)] more intuitively obvious. Consider how, if you treat each probability as an actual *fraction* of two numbers, there's a sense in which the "H's" cancel out. I.e.

> P(E&H|\omega) = P(E|H) * P(H|\omega)

> P(E&H|\omega) = P(E|_) * P(_|\omega)

> P(E&H|\omega) = P(E&H|\omega)

has the same underlying logic as

> (A/C) = (A/B) * (B/C)

> (A/C) = (A/_) * (_/C)

> (A/C) = (A/C)

And once you grok this, Bayes' theorem itself becomes a natural and obvious conclusion. (I know if I say "becomes trivial", I'm going to be memed into the stratosphere. But this is how I feel.)
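The "fractions cancel" logic above can be checked with made-up counts for a finite sample space:

```python
# Treat each probability as a ratio of counts; the H's literally cancel.
n_omega, n_H, n_EH = 1000, 200, 50   # |omega|, |H|, |E and H| (made-up counts)

p_H_given_omega = n_H / n_omega      # B/C
p_E_given_H = n_EH / n_H             # A/B
p_EH_given_omega = n_EH / n_omega    # A/C

# (A/B) * (B/C) == A/C, i.e. P(E|H) * P(H|omega) == P(E&H|omega)
assert abs(p_E_given_H * p_H_given_omega - p_EH_given_omega) < 1e-12
```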

Expand full comment
Vadim's avatar
1dEdited

Places where Bayesianism contributed something to my life, from what I can easily recall:

· Just this general idea that your world model is a weighted mixture of hypotheses and you receive evidence and promote and demote hypotheses correspondingly.

· Connection between evidence and information, the idea that evidence can be trivially presented in bits if you use log odds ratio.

· Intuitive ideas about how evidence works in real life, like the conservation of expected evidence: if there are two outcomes of an experiment, and one should update you toward a hypothesis, the other should update you away from it. Moreover, they should balance each other out quantitatively, like if there is a strong probability of weak evidence in one direction, there should be a weak probability of strong evidence in the other direction — and your expectation of update before the actual experiment is 0.
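That last bullet can be verified in a few lines (all numbers are made up):

```python
# Conservation of expected evidence: the probability-weighted average of the
# two possible posteriors equals the prior, so the expected update is zero.
p_H = 0.3                             # prior (made up)
p_E_given_H, p_E_given_notH = 0.8, 0.1

p_E = p_E_given_H * p_H + p_E_given_notH * (1 - p_H)       # P(E) = 0.31
post_if_E = p_E_given_H * p_H / p_E                        # updates upward
post_if_notE = (1 - p_E_given_H) * p_H / (1 - p_E)         # updates downward

expected_posterior = p_E * post_if_E + (1 - p_E) * post_if_notE
assert abs(expected_posterior - p_H) < 1e-12               # no free updates
```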

In general, Bayesianism to me seemed like some kind of hard-to-describe mental discipline and a bunch of mental habits of someone who tries to keep a consistent and quantitative picture of the world in light of uncertainty as all kinds of evidence arrive and make it difficult. So, like, if you previously believed something with high probability, and now evidence very strongly updated you away from it, you'll try to take apart the entire reasoning sequence that led you to this belief in the first place. Not because of some theorem, but just because that's what it takes to have a consistent worldview.

Otherwise, interpreting the entirety of Bayesianism as "you know, just figure out some priors, and then slap some updates, and look, we have a shiny new number!" seems like the kind of mistake that could lead to rootclaim losing the lab leak debate. (Do I sound like I think the original post makes this mistake? Because I don't! I actually think the idea of letting kids figure out bigfoot claims with Bayes is a very cool idea. So I just wanted to describe what Bayesianism contributed to my life, and this basic mistake I've seen elsewhere seemed worth mentioning.)

Expand full comment
Sjoerd Dost's avatar

Are you familiar with Bret Victor's Dynamicland? Connecting modern concepts to 'the old tools' in a spectacular way, imo. See https://dynamicland.org/

I bet teaching kids about Bayes' theorem through cutting/pasting paper boxes with physical scissors and (re)combining them to see probability change would be a hit.

Expand full comment
Martin L Morgan's avatar

Thanks for the post. I enjoyed reading it.

Expand full comment
Sufeitzy's avatar

This is a poor example because the visual is an extremely bad representation of quantitative information - the "math teacher passers" area is not 100x as large as the football player area, but you depend on that ratio for visual intuition. Yet you display something like (P=Pro, M=Math Passers, m=math non-passers):

Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
PMMMMMMMMMM

= 50% chance (10 P / (10 P + 10 M))

Let me give you a related problem in logistics which is exasperating.

Perfect order reliability = (number of delivered orders with no damage & complete & met the request date & correct paperwork) / (total # orders).

I have adult professionals who then say: OK, we will calculate it as (% undamaged x % complete x % on-time x % good papers).

Ok, I say: in a shipment of four orders we have one of each defect. Then the immediate answer is

75% x 75% x 75% x 75% ≈ 32%

Then I say: no, you can't know. You have to know the rate of each combination of issues.

If each order has one defect, then the reliability is 0%. Then I make a 2x2 grid where each cell is an order and X marks a defective one:

XX
XX

If one order has four defects, then the reliability is 75% - it's pretty obvious R is 75%:

XR
RR

Visually the intuition is instant.

The combinatorics stump them - orders which are damaged, damaged and incomplete, damaged and incomplete and late, damaged and incomplete and late and bad paperwork, damaged and late, damaged and late and bad paperwork, damaged and bad paperwork, incomplete, incomplete and late, incomplete and late and bad paperwork, incomplete and bad paperwork, late, late and bad paperwork, bad paperwork etc.

From this example, and a grid of countable numbers (2x10), it becomes more obvious. In your visual aid the size of the math teacher area was not 100x the size of the football area; therefore the visual reasoning was very poor.
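A minimal sketch of the perfect-order arithmetic above (four orders, the two extreme cases; all numbers illustrative):

```python
# Each order carries a set of defects; an order is perfect iff the set is empty.
def reliability(orders):
    return sum(1 for defects in orders if not defects) / len(orders)

# Case 1: one defect per order -> 0% perfect orders.
spread = [{"damage"}, {"incomplete"}, {"late"}, {"paperwork"}]
# Case 2: all four defects stacked on one order -> 75% perfect orders.
stacked = [{"damage", "incomplete", "late", "paperwork"}, set(), set(), set()]

# Naive independence multiplies the four 75% marginals - wrong in both cases.
naive = 0.75 ** 4   # 0.316...

print(reliability(spread), reliability(stacked), round(naive, 3))  # 0.0 0.75 0.316
```

The marginal rates alone cannot recover the answer; you need the joint distribution of defects.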

Try

Pmmmmmmmmmm

PMMMMMMMMMM

You get likelihood more obviously as 2 P / 2P + 10M = 16.7%

If we were to use the proportions in the visual example you gave, it looks like the likelihood is 50%, not 9%, because the math passers and football players have the same areal size (P=Pro, M=Math passers, m=math non-passers):

Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
PMMMMMMMMMM

In teaching with visuals, you actually have to have visuals which work.

If you’re just starting out making visual displays for instruction, I recommend Tufte; he’s got reasonably good rules of thumb:

https://www.amazon.com/Visual-Display-Quantitative-Information/

Expand full comment
Kalimac's avatar

OK, so how do you use Bayes to estimate the existence of Bigfoot? Using the example given, is "Bigfoot exists" the NFL players or the math teachers, and what's the other one? Color me confused.

Expand full comment
Peregrine Journal's avatar

I recommend we start by nudging people toward probabilistic reasoning at all, by calling out common fallacies that result from avoiding it.

Unpacked more here:

https://peregrinejournal.substack.com/p/plausible-probable

Expand full comment
Wasteland Firebird's avatar

The math teacher problem is a good example. It makes a lot of sense! But I'm a smart guy and I still can't figure out how to translate that technique to "do ghosts exist" and "who should I vote for." I'd like to see a post with just a bunch of examples like these.

Expand full comment