168 Comments
typhoonjim:

Frequentism is impressive, but now! Bayes leads! Priors, for everyone!

Brandon Hendrickson:

Agreed! In the "Bigfoot" class, we actually go into the Bayesian leap (from a sort of implicit pre-frequentism), and talk about the perils and promises of treating internal confidences as countable items in the world.

Mikhail Samin:

Yudkowsky has an intro to the theorem, which is more intuitive than anything mentioned here: https://www.lesswrong.com/w/bayes-rule?lens=high-speed-intro-to-bayes-s-rule

Brandon Hendrickson:

I'll watch it!

Christopher Wintergreen:

If this works for you, including on occasions where the ratios are less pleasant, that's great, really. I don't think it would be particularly intuitive for the large majority of 12-year-olds or younger. Unless we're talking specifically about the frequency representation section being better than the rectangles, in which case I agree, though it's only really going to work on a screen and for sensitivity/specificity examples - it's time-consuming on a whiteboard - so starting there and then switching to the rectangles might be the best way.

Timothy Johnson:

I think your example ignores the possibility that some people are both NFL players and math teachers: https://en.wikipedia.org/wiki/John_Urschel?wprov=sfla1

Brandon Hendrickson:

Oh my god — he's literally one of the only NFL players I knew about, how did I forget about him here?

Deiseach:

You also apply the "how many NFL players versus maths teachers" question to the entire country, but it very much depends on your local area. If, for example, you are walking past the Etihad Campus (I was going to say Carrington, but Man U has not been doing the best recently), and a ball comes flying past you on a perfect trajectory, it is indeed way more likely to have been kicked by a professional footballer than a maths teacher.

Or the local former county hurling star who drives an oil delivery truck in my town, for another example.

For the ghost example, I think there might be biases at work there. I'm presuming you take the 100% bit as "ghosts are real" and the 1% bit as "nobody has really seen a ghost" and multiply it out to get the 9% likelihood that ghosts are real, so you should think it's *not* a ghost, there's another explanation.

But what if you switch it the other way round? 100% is "I hope ghosts are not real because if they were it would upturn my entire comfortable materialist worldview" versus "1% is all the records of people throughout history swearing they saw ghosts". Then you get the 9% likelihood of ghosts are not real, so you *should* think it's a ghost.

I am a mathematically illiterate idiot, tell me where I'm wrong here.

Victor:

You start with the objective phenomena you are trying to explain. In this case, people see ghosts. Ok, so what's the likely total number of ghosts in this country? Then, what's the most likely alternative explanation? People being mistaken by what they see? Ok, what's the likely total number of people who have made a mistake like that in the country? Diagram it from there.

Seth B:

I think you'll be interested in Frank Ryan as well:

https://en.wikipedia.org/wiki/Frank_Ryan_(American_football)

Steve Sailer:

Frank Ryan quarterbacked the Cleveland Browns to the NFL championship in 1964. The next year he earned his mathematics Ph.D. from Rice U. for his dissertation "A Characterization of the Set of Asymptotic Values of a Function Holomorphic in the Unit Disc."

Interestingly, his teammates did not consider Ryan a particularly smart decisionmaker on the gridiron. Instead, they were impressed by how brave he was at holding on to the ball until the last instant.

Seth B:

In the postgame celebrations after the Browns won the Championship in 1964, a reporter asked Ryan how much his mathematical background affected his decisions as a quarterback. His jubilant response was something to the effect of "not one goddamn bit!"

I'm pretty sure I read that story in one of Steven Krantz's two volumes of "Mathematical Apocrypha".

Michael Weissman:

Nice article! Though I think the fuzzy bigfoot/god/fantasy examples make poorer teaching tools than the other ones. Most kids like problems with $ in them.

It's a bit ironic that Scott chose this topic at this time, given how badly he screwed up his own Bayesian reasoning on Covid origins.

(https://michaelweissman.substack.com/p/open-letter-to-scott-alexander)

So it might be good to do a two-layer approach. First, simple straight one-stage Bayes problems like your football example. Second, some two-stage hierarchical problem (e.g. strong evidence provided by untrustworthy cops, good betting odds offered by someone who might not pay off, etc. ) just to get a handle on typical real-world problems.

Brandon Hendrickson:

Yeah, I agree that multi-stage problems would be good. My hope is that we can bring that into the Sea monsters class this summer.

Christopher Wintergreen:

I recently ran an exit ticket scenario: kid on the way home from school on the bus, wants PB sandwich, predicts the probability. Sees no car in driveway, updates probability. Sees empty PB jar, updates. Hears car returning, updates. Sees shopping bags, updates.

I wish now I'd done something social, the teens do love romance.
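The exit-ticket scenario can be written out in the odds form of Bayes' rule, where each clue multiplies the current odds by a likelihood ratio. A minimal sketch - the prior and every ratio here are invented for illustration, only the update rule itself is Bayes:

```python
def update(p, likelihood_ratio):
    """One Bayesian update in odds form: posterior odds = prior odds * LR."""
    odds = p / (1 - p) * likelihood_ratio
    return odds / (1 + odds)

p = 0.6  # invented prior: a PB sandwich seems likely today
for clue, lr in [("no car in driveway", 0.8),
                 ("empty PB jar", 0.1),
                 ("car returning", 2.0),
                 ("shopping bags", 4.0)]:
    p = update(p, lr)
    print(f"after '{clue}': p = {p:.2f}")
```

Because the odds form composes by multiplication, the order of the clues doesn't matter: 1.5 × 0.8 × 0.1 × 2 × 4 gives final odds of 0.96, i.e. about a 49% chance of the sandwich.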

Sniffnoy:

Huh, when you started mentioning the actual Reverend Thomas Bayes, I thought you were going to say something about applying probability to games. I mean, a lot of the early work on probability came out of the study of gambling, right? Plenty of video games that kids would play involve chance to some extent. I don't really have a concrete idea here, but that's where I thought you were going with this...

Brandon Hendrickson:

My experience in education is that matching content to kids' specific interests struggles to scale. What fraction of the class is interested in a particular game? What fraction finds the game off-putting? I've found that this sort of approach really can work (and work well), but only with an ungodly amount of effort from the teacher. (This could, however, be something that AI can do better with.) That's why I find it more productive to find the deeper issues that virtually all kids care about (e.g "what the heck kind of world are we living in?!").

Kimmo Merikivi:

Certainly, if I were teaching kids in this manner, I'd definitely steer away from specific games and would instead use an "idealized game" that uses common elements and tropes, but one that isn't any real-world game. Say, one could talk about "a tabletop game" with "knights and dragons" that uses "20-sided dice", and then present math problems about dice, but I wouldn't want to say it's D&D 5e. Or Minecraft, Fortnite, Roblox, or whatever kids these days are playing. I don't think a generic example would alienate anyone who wasn't already alienated by talk of a tabletop RPG, period, but a specific example might alienate anti-fans, purists who are annoyed about details being wrong, those who dislike the global consumerist culture exemplified by games played by hundreds of millions, etc.

Now that I think about it, at least the older kids probably can explain how e.g. the combat resolution mechanics work in a game they are playing, so why not ask the audience?

Victor:

Probably easier to just teach a simple game to the class, as an in-class activity. If it's fun enough, they will all quickly gain mastery, and you can proceed from there.

Brandon Hendrickson:

My hunch — and I have no actual experience here! — is that choosing an abstract game would miss both the motivational benefits of doing something worldview-y/mysterious like cryptids, and those of doing something individually exciting like a particular game. We nerds, of course, are disproportionately interested in the abstractions of how games in general work, but we should check before assuming that most kids share that. (But again, this is an empirical matter, and I have no direct evidence for this.)

Anna Eplin:

I love this! I’ve been trying to introduce the basic mechanics of Bayesian reasoning to my kids (who are still pretty young) by drawing up and down arrows on a page—up for things that increase the likelihood of something being true, down for things that decrease it. They do pretty well with that. But I’ve been needing help with taking it deeper, so I’m very glad for all this input. (And I’m quite interested in your summer camps too—what a cool idea!)

Brandon Hendrickson:

A delight! Reach out if you'd like to chat — I'd be fascinated to see how Bayes could be extended to even younger kids. (How old are yours?)

Nate:

Kier(an) Egan hey? Suspicious name 🤔

This is not a coincidence because nothing is ever a coincidence

Brandon Hendrickson:

This keeps coming up, and (as a big fan of the show) I really want to have some jokes ready about this. (Something something "SEVERing" a child's sense of self?) I avail myself to all of you for any ideas.

Paul Goodman:

Makes me wonder if someone writing the show had a really bad Egan-inspired schooling experience or something.

Phil H:

More than the specifics of the box diagrams, the practice of repeatedly and consistently using a diagram - any kind of thinking tool - would be very valuable to kids. I think Brandon has a problem with his summer school model, which is that he’s not necessarily getting the same kids each year. When he gets his school going (or when the next crazed president appoints him education grand wizard), he’ll be able to encourage this kind of habitual use of clarifying tools.

Honestly, though, I’m still not convinced that this tool is better than any traditional tool. The essay is precisely this: a tool for thinking in depth about things. If anyone thinks that schooling before rationalism wasn’t concerned with improving students’ thinking, then they’ve missed the point.

Victor:

The innovation here isn't the use of a diagram to teach something. It's what the diagram is being used to teach: Bayes Theorem. Apparently no one thought of that before.

Eric Rasmusen:

I've done it for years with my MBA students. I made up a little web Python app, in fact, though I see that it's broken now. To get the idea, see

https://rasmusen.org/cgi-bin/bayesbox.htm

where the underlying code is

https://rasmusen.org/cgi-bin/bayesbox.py

Sorry it's broken-- I haven't tried it for a few years.
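Since the app is down, here's a rough stand-in for what a "Bayes box" computes - a reconstruction from the name, not Rasmusen's actual code: the four joint probabilities of hypothesis × evidence, with the posterior read off the positive-evidence column.

```python
def bayes_box(p_h, p_e_given_h, p_e_given_not_h):
    """Return the four cells of the 2x2 'Bayes box' and P(H|E)."""
    box = {
        ("H", "E"):         p_h * p_e_given_h,
        ("H", "not E"):     p_h * (1 - p_e_given_h),
        ("not H", "E"):     (1 - p_h) * p_e_given_not_h,
        ("not H", "not E"): (1 - p_h) * (1 - p_e_given_not_h),
    }
    # Posterior: H's share of the worlds where the evidence occurred
    posterior = box["H", "E"] / (box["H", "E"] + box["not H", "E"])
    return box, posterior

# Example: 1% prior, evidence 8x likelier under H than under not-H
box, posterior = bayes_box(0.01, 0.8, 0.1)
print(f"P(H|E) = {posterior:.3f}")
```

The four cells always sum to 1, which is what makes the box a useful sanity check when drawn to scale.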

Brandon Hendrickson:

I'd actually be fascinated to know where Bayes has popped up in people's K–12 experiences. I know it comes up in high school statistics classes, but I'm sure that some enterprising teachers of younger kids have done it. If anyone knows any, I'd love the connection.

Totient Function:

Surely this isn't true? I'm not sure I've ever seen an introduction to Bayes' Theorem that didn't have a diagram floating about somewhere, although to be fair I haven't seen that many - I mostly avoid popular articles and explainers etc. (I pretty much never find it helpful to fool myself into thinking I've understood something without actually having worked through a formal treatment, but then that's why I do math; I'm probably atypical in this regard). But anyway, I certainly remember schematic diagrams not unlike these ones being commonplace as an intuitive aid (although I may be remembering my own drawings as having been present in the material? It's been a while...).

Edit: to be clear, this is not meant as a criticism of the body of the post! Just this specific claim.

Brandon Hendrickson:

>> "More than the specifics of the box diagrams, the practice of repeatedly and consistently using a diagram - any kind of thinking tool - would be very valuable to kids."

Hard agree on this. This is where I get frustrated about the disconnection of the American curriculum — most of what's learned in one year (tools, but also content) is abandoned for future years. Instead of accumulating knowledge, we flush it down the toilet.

>> "I think Brandon has a problem with his summer school model, which is that he’s not necessarily getting the same kids each year."

This is definitely a challenge. We try to meet it by telling people they need to watch the recordings of previous years in order to take the new ones. In the end, though, that might mean we have a very small group for "Ghosts". Still, it'll be worth it, if we can then share the full model we've developed.

thefance:

I sympathize with this as well. I was always amazed that nobody remembered stuff beyond a month or two, and everyone (teachers, students, parents alike) considered this fine and normal.

Yaniv:

One question that fascinates a lot of people, kids and adults, is “can we do magic?” For example, can a coin or a die or a card give us valid answers? Can you feel what someone else is thinking? Can you tell when someone is watching you? Does good luck follow bad luck? Does a lucky shirt really work? Is there information about the world hidden in dreams? And the granddaddy, do coincidences have meaning?

Victor:

Careful, though, there are parents who believe in some form of magic. Not as many as those who believe in God, but enough to warrant caution.

Brandon Hendrickson:

Yeah! That's a lot of what I want to cover in "Psychic Powers", in year 4.

James Gauvreau:

//for the majority of our species’ existence, most people probably haven’t been able to count to ten.//

This sounds improbable enough that we must be interpreting this sentence differently. Could you elaborate?

Doug S.:

There are hunter-gatherer languages in which it is literally impossible to count to ten because they lack words for numbers greater than two; any quantity greater than two is referred to by a single word that would translate into English as "many."

Deiseach:

I have to wonder about that, I'm sure hunter-gatherers are perfectly well able to see "there are four of us in our band and six of them in their band, if push comes to shove they're likely to drive us away" or "there are lots of berries on this bush but not so many on that bush, we should pick from this bush" or "the berries are nearly all picked, time to move on to a new territory" even if there is no formal term in the language for counting.

You know what's coming next: I looked it up.

https://thelanguagecloset.com/2025/03/01/the-number-systems-of-hunter-gatherer-languages/

"Many of the Aboriginal Australian languages, for instance, usually have words for ‘one’ and ‘two’, but only a subset of these have words for ‘three’, and even fewer have ‘four’ and beyond. The Pirahã language too is well known for its supposed lack of precise numerals at all, yet, its speakers still demonstrate numeral literacy.

...Furthermore, ‘numbers’ in this study were defined as “spoken, normed expressions that are used to denote the exact number of objects for an open class of objects in an open class of situations with the whole speech community in question”. In essence, cardinal numbers like generic ‘one’, ‘two’, and ‘three’. This definition presents a broader set of problems, as some languages use counting terms for a specific type of object, which can differ from those of other objects. And so, the authors decided to include number systems that pertain to precise numerical values, loosening up the Hammarström definition by a bit.

...Before reading this paper, I thought that from the naïve observation of number systems and languages based on type of subsistence, one would be quick to point out that there is a heavy cultural association between the two factors. However, how such systems arose are not really understood. Perhaps some might have stemmed from how we count, like using the ‘hand’ to express 5, and ‘person’ to express 20 going by digit tallying. Others might have pointed out the need for trade, record keeping and verbal reporting because of the shift to agriculture, but even these have their own counterexamples (see Iñupiaq).

...But the idea that the lack of high numerals in these languages translates to poorer numeracy skills is rather inaccurate. Studies like this conducted on Aboriginal Australian children, who speak languages which generally lack numbers beyond 4 or 5, have shown that their numeracy skills are generally no different from those of English-speaking children. Other studies have also shown how speakers of these languages deal with larger numbers as well, such as the use of tally systems, or tally marks, to record the number of occurrences of a certain desired observation."

So it seems that it is entirely possible for a hunter-gatherer tribe to speak a language that doesn't have formal terms for numbers beyond four, but if they want to express higher numbers than that, they use terms like "there are two hands of people there" (which would mean ten people) or the like.

Brandon Hendrickson:

I'm honored to get the full Deiseach fact-finding treatment! (You know you've made it when...) I had heard of some anthropological pushback to the "counting was rare amongst oral-language cultures", but I think that in the past I've been wary of taking that seriously because of the bias amongst anthropologists against admitting that modern literate cultures bring any advantages. But I'll switch now to being more unsure of this. Lemme know if you (or anyone else reading this) finds anything determinative here.

Deiseach:

You are very gracious about this, thank you!

I do tend to be a little sceptical of "and the Potitoo tribe only count to two by using the right and left halves of their bodies", because most cultures do have at least "some/lots/many" distinctions.

So I think that if anthropologists come away with "they only have words for numbers up to three", they may well be missing "but the Potitoo count 'four' by using the term for 'same number as how many arms and legs a person has' and 'five' by 'how many toes on your foot'," and so forth.

Victor:

We know that the ability to comprehend the concept of "most" precedes the linguistic ability to describe the quantities involved: https://www.tandfonline.com/doi/pdf/10.1080/15475440801922099?casa_token=PXwPhTnyzrwAAAAA:1LEPbDnCDPD2wVI8EIBmlUWY2TJWRb0KZ3WfmPmKfk52xNgmFk1AmFlHv425p5bShuii8GWffHF-WA

Kenny Easwaran:

Right - my understanding is that there is some intuitive logarithmic conception of quantity that applies indifferently to counts and masses, and that this is probably what most people are thinking when they use the word “most”. The claim is that it’s the precise use of numbers to add and subtract and multiply that depends on the technology of having a memorized list of counting words (and also the technologies of tally marks and numeral systems like Roman and Arabic).

MT:

It is grimly funny that this article starts off its introduction of Bayes theorem by putting forward in a huge black box an "equation" which conspicuously lacks an equals sign. The equals sign is the entire content of the theorem! The equals sign is telling you the precise mathematical relationship between P(E|H) and some combination of P(E), P(H), and P(H|E). The rest of the piece is honestly fine, but it's not a great look to start off an article about demystifying Bayesian reasoning with a pretty profound lack of clarity on the actual, mathematical relationship of its definition.

Pascal Bercker:

You probably mean: P(H | E) = some combination of P(E), P(H), and P(E | H).
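For reference, the statement both comments are circling, with its equals sign restored:

```latex
P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E)},
\qquad
P(E) = P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)
```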

David Abbott:

The math teachers/NFL players example is poorly chosen. No way there are 1.5 million math teachers in the country. Also, NFL players include linemen, linebackers, receivers and running backs who have less than a 100% chance of being able to throw a perfect spiral. There are also plenty of college quarterbacks who can throw perfect spirals. And some high schoolers! And semi pros.

Picking an example with this many holes in it makes you come across as an educated fool and makes Bayes’ theorem seem like an abstraction with little real world connection. Even worse, the regions of the square weren’t congruent with your numbers.

Victor:

Do you think this would matter to very many middle schoolers? Would they even understand what you are talking about?

David Abbott:

yes

Eric Rasmusen:

We put subjects in siloes too much. I like it best when the math examples I give my 7th graders also teach them something else-- heights of mountains, price of gasoline, etc. So we should strive to be realistic.

ALL AMERICAN BREAKFAST:

Yes, it would be a huge distraction. That’s the part they will immediately understand and have the intellectual resources to challenge.

Brandon Hendrickson:

>> "Picking an example with this many holes in it makes you come across as an educated fool"

Alas, a description I fear fits me well.

>> "and makes Bayes’ theorem seem like an abstraction with little real world connection."

A more important charge! I invite anyone who'd like to use this to improve the numbers. Honestly, though, coming up with weirdo situations and questionable-but-easy-to-work-with numbers is a hoary old tradition in math teaching (and my feeling is ESPECIALLY in Bayesian probability), so I'd be surprised if the average student has that take.

>> "Even worse, the regions of the square weren’t congruent with your numbers."

True! And I meant to write a "not to scale" note below it. (The trouble with these numbers is that they're too extreme to draw to scale. You're right that for this, making it to scale is really helpful.)

David Abbott:

Thank you for your thoughtful response. I’ve taken courses in statistics; I could apply Bayes’ theorem given internet access or a crib sheet, but I can’t write it out from memory. I could reason my way to something close to it with a pen and paper and 15 minutes. The main thing I got from it is “base rates are huge.” Just teaching that would be a huge step.

I’ll try to think of a better example.

David Abbott:

Here’s how to improve the example. Make the question “was the spiral thrown by an NFL quarterback?” Say there are 70 NFL quarterbacks and 330 million non-quarterbacks, 1% of whom can throw a perfect spiral. That will yield a dramatic result, showing the importance of base rates.

Want something more graphic? You are at a football training camp with 123 players. All three quarterbacks can throw perfect spirals. 15% of the other players can. That could easily become a scaled graphic.
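Both scenarios check out; here is the arithmetic spelled out (the 70, 1%, and 15% figures are Abbott's illustrative guesses, not measurements):

```python
# Scenario 1: was the perfect spiral thrown by an NFL quarterback?
nfl_qbs = 70                     # assume every NFL QB can throw a perfect spiral
non_qbs = 330_000_000
p_spiral_non_qb = 0.01           # guess: 1% of everyone else can too

p_qb = nfl_qbs / (nfl_qbs + non_qbs * p_spiral_non_qb)
print(f"P(NFL QB | spiral) = {p_qb:.6f}")   # base rates swamp the skill signal

# Scenario 2: training camp, 123 players; all 3 QBs can throw one, 15% of the rest can
p_qb_camp = 3 / (3 + 120 * 0.15)
print(f"P(QB | spiral, at camp) = {p_qb_camp:.3f}")
```

The camp version gives a posterior near 1 in 7, small enough to surprise but large enough to draw to scale, which is what makes it the better classroom graphic.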

Christopher Wintergreen:

In a blog post you have to have concrete numbers to do up an example. In a classroom, I'd encourage the kids to call BS on the numbers and have them give their own estimates. I agree they might actually care - and if they want to think more deeply about specific location, do a quick calculation of maths teachers in an area, estimate how many NFL players can really throw a spiral, realise that some players have a non-100% chance of doing so, etc., then, uh, great, right? They are connecting the maths to the real world, which is good - looking for holes, aiming for true accuracy.

David Abbott:

Once you add in that texture, you are better off coding a simulation than using a formula.

Christopher Wintergreen:

The kids are going to get way more out of coming up with the ideas and trying to factor them in than having a perfect simulation to play with. It's the difference between trying to perfect the tool and trying to optimise the process for learning.

I mean, it's possible at the moment for a kid to build a simulation and tell it to factor in various different things. So that might be the next step after thinking through a few as a group.

bbqturtle:

I’ve never understood bayes and this box helps me a ton! But, I could use a few more examples. Could you share one for the Bigfoot and mammogram concepts? I think if I see it created a few more times I would get there.

bbqturtle:

This is dense and illegible and the links don’t work :(

cromulent:

3blue1brown's one video on it is very good. It focuses on the mammogram example in detail.

https://www.youtube.com/watch?v=lG4VkPoG3ko
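For anyone who wants the arithmetic without the video: the standard illustrative mammogram numbers (roughly the ones Gigerenzer popularized; the video's exact figures may differ) work out like this, in frequency form:

```python
# Out of 10,000 women screened (illustrative numbers only):
women = 10_000
with_cancer = int(women * 0.01)                       # 1% prevalence -> 100 women
true_positives = int(with_cancer * 0.8)               # 80% sensitivity -> 80 positives
false_positives = int((women - with_cancer) * 0.096)  # 9.6% false-positive rate -> 950

p_cancer_given_positive = true_positives / (true_positives + false_positives)
print(f"P(cancer | positive test) = {p_cancer_given_positive:.1%}")
```

The punchline is that the posterior lands under 10%: most positive tests come from the much larger cancer-free group.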

bbqturtle:

Thanks for the comment! I guess what I meant is that I find Yudkowsky extremely difficult to read, and I find OP very easy. So just a few more examples BY OP would be a huge value add for me

Brandon Hendrickson:

I'm at the LessOnline meetup right now (weird! intense! sometimes relaxing! super recommended!) so I won't be able to, but I agree with @cromulent — watching that 3blue1brown video I linked to will do it. (Though note, as someone did above, that all his examples are frequentist — they're about real quantities in the world, rather than subjective priors. If I had this post to do over again, I definitely would have included some of those.)

Anteros:

'Look ON my works....'

The misquote is glaring because it so obviously butchers the meter (ten syllables per line) that it can only be made by those who know nothing about the poem, or even poetry in general.

And that's my pedantry quota for the week :)

Phil H:

You’re right that the poem reads “look on”, but having an extra syllable in a line certainly doesn’t wreck the meter. Ozymandias contains two lines with eleven syllables, and many of the lines strongly and deliberately challenge the iambic meter (e.g. L12, “Nothing beside remains”). English meter is a flexible and generous tool!

Anteros:

Fair point, well made.

thefance:

I would assume "Ozymandias" is intended to be voiced with only 4 syllables (not 5) to keep the meter consistent. I.e. "OZ y MAN dias", instead of "OZ y MAN di AS".

also, I don't see an issue with line 12?

Phil H:

The last syllable of Ozymandias wouldn’t be stressed, no matter how many syllables it has. The question is whether there is an extra unstressed syllable in between the stresses of MAN and KING. I agree that you could read it OZ-y-MAN-di-as or OZ-y-MAN-dyas; but the meter of the poem is not a clue which reading you should prefer. Because the English iambic pentameter is not and has never been a strict set of rules. It has always been OK to slip in an extra syllable here and there.

L12 starts with a trochee, very deliberately placed to interrupt the flow of the meter and throw stress on the “nothing”.

thefance:

yeah, i mean, i guess that all sounds good and reasonable. that said, I think im gonna go on pretending that the meter is 100% consistent.

Unobserved Observer:

I also always make that mistake. I don't know anything about poetry, but the one line in isolation sounds better to my ears with "upon".

thefance:

It's cuz the original is iambic, but colloquial english tends to put the stress on the first syllable of a word (or at least that's the impression I get) (though obviously there's exceptions). Which lends to a more trochaic sound. So there's a natural conflict between how the original is written, vs how normies wanna say it by default.

Brandon Hendrickson:

(1) You're entirely right, (2) I threw in that joke at 1:30am before hitting "send" and actually thought "should I make sure I'm getting this quote right? Nah, even if I'm wrong, no one will notice!", and (3) the resulting pedantic argument about this is one of the most rationalist conversations I've ever seen and I am SO happy my mistake sparked it!

Brian Thurbon:

To make this type of thinking truly relevant to a "normie" middle schooler, it should be presented as something they can use to address an issue they care deeply about and face every day: navigating social relationships.

"Given the fact that my friend said something mean to me, what's the likelihood that they don't want to be friends anymore vs the likelihood that they're having a bad day?"

"Given the fact that my crush laughed at my joke, what's the likelihood that they like me too?"

These problems are messy. Base rates aren't always straightforward. But they're vital. Almost all of us are hard wired to care what others think.

I guarantee you'll get engagement with questions like these, and if the ultimate goal is less about getting people to do the math and more about examining and adjusting assumptions, this is the direction to go.

Unobserved Observer:

Those 2 questions are really great applications. Useful in showing how much of that sort of reasoning we do intuitively as well.

Victor:

Unfortunately, it's also likely to turn the class into a therapy session. The drama of ongoing relationships is going to totally overwhelm the actual lesson. Best to leave that sort of thing to individual instruction, where it can be handled in a sensitive manner.

Brandon Hendrickson:

I think you're pointing to a true danger here, but as a teacher, I wouldn't say that it can only be done 1-on-1. A wise teacher can exercise judgement as to when to speak of personal, emotional things. (Though on the other side of this, see my response to the original commenter, above.)

Victor:

There are also privacy issues involved, and children can't give informed consent. They will be talking about people who are not in the room, and feelings are likely to be hurt. I'm not saying that relationships can never be discussed, but make it about relationships in general, not actual ones involving specific people. Best leave that sort of thing to a safer space.

Tom!:

Thank you for this.

Retsam:

To me, this feels somewhat dangerous. Applying Bayes to social dynamics can be useful, but it has to be done carefully - it's very easy to get into a headspace where you don't take any nice thing anyone says at face value because it's drowned out by a high prior on "they secretly hate me"; the sort of prior that's very common in middle-schoolers.

Obviously, I'm not suggesting the idea that "person says X but means Y" is a foreign concept to middle schoolers, but I think explicitly giving them the tools to quantify it and teaching them that it's "rational" is potentially throwing gasoline onto the dumpster fire which is teenage emotions.

Honestly, even as a pretty emotionally stable adult, I try to be careful not to approach social dynamics with this kind of mindset: I recognize that it's "rational" and that people don't always mean what they say... but I also think that generally taking people at face value in day-to-day life is a social good, and the emotional highs of middle school are the last place I'd want to be explicitly undermining it.

Expand full comment
Brandon Hendrickson's avatar

I really like this; thanks for suggesting it. I'll only push back on two niggling points: (1) I don't think that practical social advice is more legitimate (or engaging) for middle school students than existential/worldview questions, and that there's a long and intellectually deleterious historical tradition of trying to make education more "obviously" practical (the book on this is Diane Ravitch's Left Back) that's worth being aware of (though your example would not, by itself, cross into that error). (2) In my experience, it's difficult for students to take seriously the wisdom of teachers to speak to the personal realms (friendship, love) of their lives. However, when that authority has been earned, then yes, this is an excellent thing to do! Thanks again for it.

Expand full comment
Melvin's avatar

I feel like it's time to de-emphasise the Bayes in Bayesian statistics.

Bayes' Theorem is a perfectly sensible piece of statistics, but somehow the online discourse around it has picked up shades of religious fervour. A young teenager googling about Bayes' Theorem quickly winds up in an online environment which feels more like a cult recruitment session than a statistics lesson.

I don't know how it became this way.

Expand full comment
Michael Weissman's avatar

True. Very unfortunate since Bayes gives a practical approach to familiar decision problems.

Expand full comment
Victor's avatar

It has taken on some of the characteristics of a tribal marker.

Expand full comment
Brandon Hendrickson's avatar

>> "somehow the online discourse around it has picked up shades of religious fervour."

Agreed — which was part of the fun in writing this!

>> "I feel like it's time to de-emphasise the Bayes in Bayesian statistics."

Is the only reason here because it could be construed as a gateway drug (ha!) to the online rationalist community?

Of course, I don't think that's so very terrible of a thing — though I feel like even if I believed that it was, another option would be to "rescue" it from the rationalists. If anyone wants to popularize it in a different context, I say go for it!

Expand full comment
demost_'s avatar

To put my fact nazi hat on, "the longest thing anyone has submitted for an ACX contest" wasn't your post, but an essay of 90 pages (not even a book review) submitted to last year's contest.

You probably get the prize for longest finalist, though, and a special prize for managing to write such a long entry that is engaging for the whole read.

Expand full comment
thefance's avatar

I can't help but ask: what was the 90-page essay about?

Expand full comment
demost_'s avatar

Honestly, I don't know. I couldn't make any sense of it. I read the first 20 pages and gave up. (It was the only review which I didn't complete, which says a lot about my OCD and about the entry.) You can read it here under "Sadly, porn", though the author reveals in the end that they haven't even read that book.

https://docs.google.com/document/d/1GYQw3pgvhi7hqOVR-Ql629Q_8thbyHe8sSRy5voyt30/edit?tab=t.0#heading=h.cdezdtonc8cn

Expand full comment
thefance's avatar

lol. I asked 1123581321 to read the "Sadly, Porn" review maybe a week ago, since he said he's a big TLP fan. He's still working on it. :^)

If you wanna hear my take, you're cordially invited to the last open thread, where I'm still debating him about the value of moldbug's Cathedral idea, and how it relates to "bureaucracy creep" (which is what's causing the narcissism epidemic TLP and Lasch kept going on about). Though I acknowledge a strong possibility that you're just gonna think I've lost my marbles as well.

top level comment by Richard Ngo, which kicks off a discussion about what this mysterious "Cathedral" is:

https://www.astralcodexten.com/p/open-thread-381/comment/116451395

Expand full comment
demost_'s avatar

Hahahaha, really? This is not a coincidence, because nothing is ever a coincidence. I wish 1123581321 good luck making sense of it, because I didn't manage.

I find it fair to debate how much value Moldbug's Cathedral idea has, but I don't have much to contribute. I joined SSC after the time when Moldbug had interesting ideas (as far as I can tell), and so I don't have a strong opinion about it.

Expand full comment
thefance's avatar

I must admit, I had an inkling of which review it was when you said it was especially long.

I think you misunderstood me though. I'm saying you might get something out of reading the thread, assuming you were still wondering about the thesis of the "Sadly Porn" review. Because I read the entire review, and I felt like I understood it pretty well, and I think it intersects with how moldbug understands modern history. Although moldbug was terrible at communicating his ideas imho. For a variety of reasons. Consequently, a lot of people seem to think he's just being edgy and doing the "reversed stupidity is intelligence" fallacy.

It's a "blindmen and the elephant" problem. Where lots of people catch glimpses of what I like to call "The Blob". But nobody seems to understand it in its entirety. moldbug is attacking it from a political angle, whereas TLP is attacking it from a psychological angle.

Expand full comment
demost_'s avatar

Thanks, I see. At least from the thread I found out that TLP stands for The Last Psychiatrist. This is interesting, I do recognize that Scott has mentioned this name a few times. But I still lack a lot of context. I have never read anything by either The Last Psychiatrist (except for 20% of a book review) or Moldbug, and I think that all the discussions around them took place before I joined the party. I wouldn't even vaguely know what topics they stand for, except for Scott's recent article on Moldbug.

Expand full comment
Brandon Hendrickson's avatar

NOOOOOOOOOOOOO

Expand full comment
Jonathan Moregård's avatar

Feedback: when I first saw the graphical version, it looked like a big 'L'. It took a little bit of time for my brain to snap into "adjacent bar chart". There might be benefits to doing examples where the odds are so different (100 vs 1) due to the dichotomy of it, but I wanted to name my brief confusion.

This is not a problem in the video, with the way they introduce the graphic

Expand full comment
Brandon Hendrickson's avatar

I appreciate that; I'll pay attention to it when I introduce this in the future.

Expand full comment
TT's avatar

I wrote an interactive bayes square a while back. I was going to extend it to include numbers and stuff, but never got around to it. Check it out:

https://tristantrim.github.io/bayesquares/

Expand full comment
Eric Rasmusen's avatar

I wrote up some Python and HTML a few years back, trying to learn how to create a web app. It's broken now. Maybe you'd like to improve it.

https://rasmusen.org/cgi-bin/bayesbox.htm

where the underlying code is

https://rasmusen.org/cgi-bin/bayesbox.py

Expand full comment
Dan's avatar

The NFL player example has "Monty Hall" / "Tuesday Boy" structure, where the categories you chose are contingent on the information you had. You telling us that it's either an NFL player or a math teacher, after we'd been previously considering "NFL player", strongly points to "math teacher".

If you'd recognized the thrower as a high school student, you presumably would've told us that the person in the bush is either an NFL player or a high school student. If the "NFL player" guess was wrong, you're telling us "either an NFL player or [the correct answer]".

Expand full comment
David Joshua Sartor's avatar

Yes!

Expand full comment
Cinna the Poet's avatar

Much of what you're talking about is done in the introductory critical thinking text Reason Better by David Manley.

https://tophat.com/catalog/humanities/philosophy/full-course/reason-better-an-interdisciplinary-guide-to-critical-thinking-david-manley/3425/

Expand full comment
David Manley's avatar

Thanks for the shout-out!

One difference: I use an easier-to-intuit version of the ODDS formulation.

First you ask "how much more likely is this evidence given H than not-H". That gives you a number, the Bayes factor. (I call it the "strength factor", as in a value representing the strength of the evidence.) Then it's simply:

prior odds in H × strength factor = new odds in H

I have found that for students this is far easier to do in one's head than the usual equation, or even trying to represent it pictorially.
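That two-step update is short enough to sketch in a few lines of Python. The prior odds and strength factor below are made-up numbers, purely for illustration:

```python
def update_odds(prior_odds, strength_factor):
    # posterior odds = prior odds * Bayes factor ("strength factor")
    return prior_odds * strength_factor

def odds_to_probability(odds):
    # convert odds of a:1 into a probability
    return odds / (1 + odds)

# Illustrative numbers: prior odds of 1:9 (a 10% probability), and
# evidence that is 6 times more likely given H than given not-H.
posterior_odds = update_odds(1 / 9, 6)      # 2:3 odds in favor
print(odds_to_probability(posterior_odds))  # about 0.4
```

Doing the same thing with the usual six-value formulation takes far more mental bookkeeping, which is the point.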

See also:

https://www.clearerthinking.org/tools/the-question-of-evidence

https://arbital.com/p/bayes_waterfall_diagram/

https://arbital.com/p/bayes_rule_odds/

Expand full comment
Kenny Easwaran's avatar

Hi David!

I’ve long thought that it’s unfortunate that probability is so much easier to axiomatize than odds, because I think that odds (or perhaps even log odds) give us much better intuitions for how extreme the top and bottom ends of the scale are, and that 90% probability isn’t extreme at all for many of the examples that people use.

Expand full comment
David Manley's avatar

hey Kenny! yes! that and also:

- it’s much easier to remember the rule “prior odds times strength factor” than any of the alternatives

- it’s much easier to mentally perform two separate mathematical steps in sequence (or three if you count converting one value to odds), than it is to mentally plug in six values and then try to hold them in your head as you are doing calculations.

- the first of the two steps gives you the Bayes factor or “strength factor”—how much more likely is E given H than given not H?—which (i) cashes out the relevant notion of evidence by defining whether E counts as evidence at all for H, and (ii) is very intuitive as a way to think about evidence strength.

Even outside of updating, reflecting on how much more likely this observation was given various hypotheses reminds us of various ways this could have occurred even if our preferred hypothesis were false— what would I have expected the world to look like if I were wrong? This is one aspect of “considering the opposite” that has been shown to counteract confirmation bias.

Expand full comment
David Manley's avatar

(on the last point: of course the other variations also ask you to assess P(E|H) and P(E|~H), but then those values get lost in a big equation. This version highlights that their ratio is what constitutes the strength of evidence and it's what operates on your prior to update you.)

Expand full comment
Anonymous Dude's avatar

One of the things I've realized in my old...well, middle age is that being good at math means I'm not very good at explaining it. "Well, duh, you just add that to both sides and you can get a quadratic equation! That's obvious, right?"

So, from my very limited point of view...well done, sir.

Expand full comment
Mike G's avatar

Good stuff! Random: Does it matter that ~5% of NFL players are quarterbacks? I wonder what the spiral is like for the typical o-line and d-line guy.

Expand full comment
Deiseach's avatar

I had a very wet-blanket response to this planned out in my head, but it was too negative. So I'll just be *mildly* Debbie Downer instead of full-on Eeyore.

"Consider, instead, just how irrelevant, useless, and impractical many of the things were that you threw yourself into when you were that age. Running a D&D campaign? Modding Half-Life? Learning Photoshop? Designing a fictional language?"

Looking it up, I see that American middle schools are in the age ranges 11-14 or 12-15. So let's take 12-15.

This statement has me (metaphorically) laughing-crying while I bang my head off the desk. The schools I was involved with definitely did not have kids doing these kinds of things. Instead, for example, we had 12 year olds already smoking cigarettes. This ties in with what you later say:

"A lot of the kids who I work with (not a representative sample: they skew toward the gifted and hyperactive sides of the spectrums) "

And I think it shows. Your model works great for the five schools where the kids are bright and interested in learning and thinking. Great, now how do we apply it to the other ninety-five schools where it's a struggle between "I cannot wait to get out of here" and "I'm only doing the bare minimum I need to do". You can draw all the pretty boxes you like, they do not and will not care.

Because those kids are not interested in nerdy topics or Bayes or anything other than sex'n'drugs'n' rock and roll, as it were. Currently where I'm working it's pre-school kids, and for some of them I already see that they come out of homes and a culture that is, to be blunt, welfare and scamming. We get them for three hours a day and it's the equivalent of "you have a pot plant which you carefully nurture with the right plant food and water it on a schedule and make sure to transplant it to bigger pots as it grows", then they go home for the other twenty-one hours of the day and it's "yoink the plant out of the pot, dump it on the bare concrete, it might get pissed on by dogs, it might get trampled underfoot, if it's lucky it'll put some roots down in a crack" until the morning when it's scooped up and put back in the pot for three hours.

(This, by the way, is why I have no patience for the lamentations of "schooooool is a prison for kids!" Yeah well going free-range is no picnic, either).

How are you going to inculcate a love of learning, or interest in knowledge, or getting them to think, when they're immersed in home and social lives of soap operas and following celebrity gossip?

I wish this was the way of education. I wish it could be scaled up for all schools. But I greatly fear it's going to apply to the few schools with the bright kids from supportive homes.

Expand full comment
Victor's avatar

Consider the possibility that there are significant numbers of children who are interested in sex, drugs, celebrities, *and* nerdy questions regarding the nature of reality.

Student: "My girlfriend believes in Bigfoot, what do I say to her?"

Me: "You're 14!!"

Expand full comment
Spinozan Squid's avatar

There's a big part of my experience that has a problem with these types of high-minded and noble theories of education.

My brother was an above average intelligence kid in a blue collar suburban school district. He struggled with math heavily starting at around Algebra 2. His issue? He never completely mastered one-digit multiplication: he would sometimes forget what 8x6 was, because he had kind of a checked out fourth grade math teacher. He never completely mastered nor completely memorized the quadratic formula. Little gaps like these that compounded over the years creating massive 'skill lag'. By the time he was in Algebra 2, there were so many fundamentals that he had to 'consciously load' to solve each problem, that he would just freeze up.

I feel like Bayes' Theorem is a great example of a 'skill lag' problem. It seems very simple for educated, intelligent people who read blogs and have degrees and careers that built a natural proficiency with numbers. But many people lack this type of numerical proficiency. I think students could conceptually understand Bayes' Theorem by itself, and I think they could perform the act of doing the manual calculations by itself. However, when you combine the two (conceptually parsing the problem enough to know what to calculate and calculating that thing), I think this creates too much to 'consciously load' for many students, and just like my brother in Algebra 2, many normal students would freeze up. If someone actually added Bayes' Theorem to Algebra curriculums, I think a common classroom dynamic would repeat: the gifted kids (who are likely to develop these skills anyway) would enjoy it and thrive, the above average intelligence grinders would muddle through enough to perform adequately on the tests, and everyone else would struggle. Math is boringly more about fundamentals than most people who already have them like to admit.

Expand full comment
Victor's avatar

That's a school-support issue. No teacher can realistically accommodate 100% of all students regardless of need. In your brother's case, he should have been referred to a one-on-one tutoring program until he caught up.

Expand full comment
Eric Rasmusen's avatar

I think your brother would have had a better time with Bayes Theorem than with multiplication. You can do Bayes Theorem with pictures. He could see the size of the rectangles and get an idea of the answer (2% vs. 20% vs 80%) even tho he couldn't calculate the exact number. But the big idea with Bayes Theorem is really qualitative, or at least ordinal.

Expand full comment
Skittle's avatar

> I think your brother would have had a better time with Bayes Theorem than with multiplication. You can do Bayes Theorem with pictures. He could see the size of the rectangles and get an idea of the answer (2% vs. 20% vs 80%) even tho he couldn't calculate the exact number.

You can do the exact same thing with multiplication. I don’t know about American schools, but it’s a big part of how it’s (ideally) taught in England at the moment. Kids make arrays until they are sick of them!

They still have to memorise their times tables, but I hope it helps with seeing that 7 x 7 is only one away from 6 x 8.

Expand full comment
Phil H's avatar

"Math is boringly more about fundamentals than most people...like to admit."

Amen to this. I sometimes teach maths to young kids in China, where most parents are fully invested in the Asian=good at maths trope. Quite often, when I tell them that my course for seven year olds starts with quite a lot of counting, parents bridle and ask if I'll be quickly moving onto grade-appropriate material for their little Jimothy, who is quite advanced, actually...

As I'm in marketing mode, I make soothing noises at them until they start paying money, and then go right on with teaching lots and lots of counting. It's always valuable. (Follow up: times tables board games.)

Expand full comment
Don P.'s avatar

I've begun noticing that when most people report unashamedly that they're "bad at math", they don't even actually mean "mathematics"; they mean they can't multiply two 3-digit numbers with pencil and paper.

Expand full comment
Kenny Easwaran's avatar

What I’ve found is that giving people a different angle about why to care about what you’re learning gives them a new chance to master those basics, because those basics become relevant to a new thing they care about.

Expand full comment
Unobserved Observer's avatar

Couple of thoughts. They're somewhat critical, so I'll just preface by saying that I sympathize with what you're wishing for here! I am a bit skeptical though.

1. I don't know anything about Kieran Egan's work, but I don't think you need anything you mentioned from him here in order to motivate the techniques you're proposing, and it makes it needlessly esoteric. Some things (like visuals) are going to naturally work better for humans than other things, and I don't think it's because visuals are more "embodied" or narrative-like. The same goes for making it vital. Things we're interested in, particularly things we can interact with in some way, will motivate us more than abstract knowledge. That motivation is essential in getting us to put work into difficult things like understanding equations. Especially for children.

(Making it visual the way you're doing it here is a good idea because it's directly downstream of 2 big heuristics. The first is that visual aids do just seem to be helpful regardless of the content. Possibly there's an advantage in freezing things in time and seeing everything at once (equations also have that advantage though); a good visual will also make the connection between parts of an idea more obvious.

The second heuristic is that making things more concrete is better. An equation consisting of only variables is of the most abstract things we've got. There's beauty in that, but I doubt most kids can appreciate that beauty. Making it about specific numbers and showing how that they interact visually gives you something to hang on to while you're busy grasping the basic idea. I don't know if this is just me, but I think when teaching pretty much anything the concept should be expressed first with a concrete example and then the more abstract description/formula in terms of the specific example given.)

2. I just don't buy that this will work for the majority of kids. Even if you make it about something they're interested in, if it gets too abstract or difficult it seems to me that most kids would just decide that they don't care *that* much about whether Bigfoot exists. There'll always be some that are more naturally inclined towards this sort of thing, but I genuinely think that it's basically not going to happen for a lot of kids.

There's anecdotes about kids/teenagers putting a lot of time and effort into learning math or programming just because they were really interested in modding their favorite video game or something. I think that does work because there's a great feedback loop in learning for the sake of creating or doing something. The motivation there is much greater (and often more frequent) than the motivation you'd get for wanting to know how likely something is. Maybe if you applied it to a game that pretty obviously involves probability and then kids who are better at updating correctly will win more often? That's a pretty limited application though.

An additional related point; I don't know how kids would deal with a genuinely Bayesian philosophy (if you have anecdotes about this I'd be interested). The fact of the matter about whether the football thrower is a professional player or a math teacher is already out there. It's not really probability that we're calculating, it's degrees of belief. Particularly once the evidence gets more involved. Will most kids be able to grasp that?

3. > And all of this is toward the goal of helping lift them out of the Matrix, so they can see what they’re studying as imperfect, historically-contingent tools that they can, as autonomous agents, choose to use as they see fit.

This is kind of a nitpick, and I don't think it implies not teaching Bayes' theorem or rationality in general, but I wonder: should we even want this? It seems plausible to me that it's good for kids to have a really stable, necessarily simplified belief system until they're older and more emotionally ready to deal with the complexity of the real world.

4. > If that much human richness and potential can be pulled out of just one piece of the curriculum (albeit an important bit!), what could be achieved if we re-humanize the rest? What hidden vitality lies in poetry, or geography, or punctuation? With ancient history, or economics, or biology?

I'm not sure what you're suggesting here. Bayes is important because it's a general method. You're going to get a lot of richness because it has implications for how everything involving truth gets done. The other topics are individual subjects. They're great, and we can see how great they are by looking at the kids that are already enthusiastic about their favorite subjects, but I don't think you're going to get a paradigm shift there the way you would if everyone were more rational.

My ideas after reading the essay for how to teach these other subjects are a. make it visual (or use other cog-sci techniques), b. make it vital (i.e. interest/motivate). I agree that we should do those things, but haven't people been saying all of that for years? And if it's the case that those ideas are fairly widespread, doesn't that show that what we really need are better teachers or smaller class sizes or something? I have no doubt you'd be a much better than average teacher and that your camp teaches Bayes' theorem better than almost anywhere else. But isn't that really a function of time, resources and really smart, motivated teachers (and possibly a body of students who are more similar to each other than average) rather than some radical new idea? (You're going to tell me to read the full book review, aren't you?)

Expand full comment
Perelandra99's avatar

What’s the probability that ALL 526 recorded witnesses to Christ’s resurrection were mistaken or lying?

—That was actually the question Reverend Thomas Bayes was trying to answer, which he wrote in response to David Hume’s essays against the existence of miracles.

But aside from that, there’s the fatally subjective issue of choosing priors based on rates.

You can’t just pick “how many NFL players” are in the world as a base rate, but how many would be in your park. Do NFL players hang out throwing a football in public parks at all? What day of the week is it, and how many NFL players would be at the stadium or traveling to an away game on that day?

Then what about math teachers? How many math teachers would be in class that day? Where did you get 1% as percentage of math teachers who can throw a perfect spiral? How many math teachers did you test throwing a football to get that base rate?

All of the heavy lifting is in choosing these priors and base rates—the formula itself is trivial.

Expand full comment
The original Mr. X's avatar

>All of the heavy lifting is in choosing these priors and base rates—the formula itself is trivial.

Yeah, that's the problem with over-reliance on Bayes' Theorem -- it's fine when you have clear and objective base rates, but when you don't it mostly just becomes a way of laundering your own prejudices.

Expand full comment
Kenny Easwaran's avatar

I think there is value to making your own prejudices very visible, and clean, and nicely pressed. That makes it much easier to focus on them.

I know that many of the people in this online rationalist community seem committed to an objective version of bayesianism where there is a correct prior and a correct set of likelihoods, but I think all we can really justify is the subjective version - but that factorizing things into priors and likelihoods makes it easier to understand why different people are getting to different conclusions.

Expand full comment
The original Mr. X's avatar

>I think there is value to making your own prejudices very visible, and clean, and nicely pressed. That makes it much easier to focus on them.

Sure, but I haven't seen any evidence that learning Bayes' Theorem actually makes people more likely to do that. If anything, it seems to have the opposite effect -- because everything's now all maths-y, people think they're being scientific and objective when in reality they're being the complete opposite.

Expand full comment
Kenny Easwaran's avatar

Yes, I do think there are a lot of problems of this sort.

Expand full comment
Dan's avatar

Strongly agree. I'd prefer to see Bayes' theorem reserved as a serious mathematical tool. What I see in the wild (including most comments on ACX) is that the vast majority of its use is people trying to legitimize their own BS. People talk about their "priors" when they really just mean "assumptions made without adequate data". Too many members of the rationalist community have allowed poor use of Bayes theorem to lead them to absurd and faulty conclusions because they didn't question the validity of their inputs enough.

People seriously underestimate how much good data is necessary to make good mental models. Unless you're willing to do a Nate Silver-level amount of data gathering and analysis in a particular field, I would urge people to reserve Bayes for theoretical math problems.

Expand full comment
Don P.'s avatar

And the related issue that I see all the time in arguments is an implicit belief that 99% is the biggest possible amount of something that's not "all", and 1% the smallest that's not zero.

Expand full comment
Benjamin E Nachumi's avatar

Do the Monty Hall paradox next.

Expand full comment
Mahatsuko's avatar

Sure. Let's say that you choose Door A, and Monty responds by opening Door B to reveal a goat. Assuming all the standard rules for the game, you could use the following chart.

https://ibb.co/XfhTvbYy

The green rectangle is bigger than the red rectangle and blue rectangle, so you should choose Door C.
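If anyone prefers the arithmetic to the picture, the same chart follows from a direct Bayes calculation under the standard rules (uniform prior, Monty always opens a goat door other than yours):

```python
from fractions import Fraction

# You pick Door A; Monty opens Door B to reveal a goat.
# Likelihood of "Monty opens B" under each hypothesis:
#   car behind A: Monty picks B or C at random -> 1/2
#   car behind B: Monty never reveals the car  -> 0
#   car behind C: Monty must open B            -> 1
prior = Fraction(1, 3)
likelihood = {"A": Fraction(1, 2), "B": Fraction(0), "C": Fraction(1)}

joint = {door: prior * lk for door, lk in likelihood.items()}
total = sum(joint.values())
posterior = {door: j / total for door, j in joint.items()}

print(posterior)  # A: 1/3, B: 0, C: 2/3 -> switch to Door C
```

The 2/3 for Door C is exactly the green rectangle in the chart.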

Expand full comment
Benjamin E Nachumi's avatar

I just think of it this way: P(car in doors B or C) was 2/3, and remains 2/3 because cars do not quantum tunnel. So now after Monty opens door B revealing a goat, 2/3 of the time the car is behind door C.

Expand full comment
Mahatsuko's avatar

I agree that that is a more intuitive way to approach the problem, but it isn't Bayes' theorem, so a method of illustrating Bayes' theorem can't use it (unless I'm missing something).

Expand full comment
Benjamin E Nachumi's avatar

The intuitive argument in pictures looks a lot like yours, but I like it better with two diagrams for before/after the goat revelation.

Expand full comment
leopoldo blume's avatar

I'll wade right in:

It "smells" like your essay is dancing around an explanation of how you can use Bayesian probability to somehow prove the non-existence (or extreme unlikelihood) of a God, i.e. you have come up with a way to convince kids (or people in general) that theism is silly (presumably because you are an atheist and you think this will make society more rational and therefore better off in general).

Could you elaborate on the reasoning and probabilities you use to reach your conclusions about this? (assuming I'm not way off base - if I am I apologize in advance for the question - and you really do reach such conclusions).

Expand full comment
Victor's avatar

You had me at Bigfoot.

If I still taught in a public school, I would try to adopt a lot of this. If my children still attended school, I would send them to a class that used these methods.

It's not the answer to everything, of course. We need an approach for when you suspect someone is trying to deliberately deceive you, or for when someone is using an argument to undermine you. What's the rationalist approach to hate speech, and how do you diagram that?

Still, an appreciable advance. Kudos.

Expand full comment
Mo Diddly's avatar

Great essay, though I got a little rattled by the fact that the first equation is not actually an equation.

Expand full comment
David Dunn's avatar

Possibly worth noting that the 70's and 80's Republican tax program were driven by Art Laffer drawing what became known as the Laffer Curve on a napkin belonging to Jack Kemp, then showing it to a bunch of people https://www.washingtonpost.com/archive/lifestyle/1986/08/31/the-lies-of-taxes-are-upon-you/9d37fcf2-5c88-42ba-a6cd-3584dc14403d/

Expand full comment
Gilpish's avatar

Your diagram is not to scale and it detracts from the point you are making. The whole point of Bayes blocks is to make visual comparison easy, but the block representing the NFL players looks *bigger* than the block representing the math teachers who can throw well.

I'm guessing you did this because the numbers you used make it very hard to see the two relevant parts of the diagram, but you should really have chosen an example with numbers which work better visually. I think because we are talking about statistical reasoning and logic it's important for this stuff to be to scale. At the very least it should be visually obvious that the green block is smaller than the red block.

Here- I made a version of it to scale. (yes, the NFL players are on there, just a green line right at the edge of the block.)

https://i.imgur.com/ZcVbm0M.jpeg

and here's the same numbers as squares with the same areas (easier to compare sizes)

https://i.imgur.com/fUCi8SK.jpeg

and here is the updated bayes block after removing the math teachers who can't throw.

https://i.imgur.com/3Ntyzi6.jpeg
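For anyone who wants to check the proportions, here's a quick sketch with numbers drawn partly from this thread (the 1,500,000 math teachers and the 1% spiral rate) and partly assumed by me (the NFL roster count and its 90% spiral rate):

```python
# Assumed/illustrative population sizes and spiral rates.
nfl_total, nfl_can_throw = 1_700, 0.90            # my guesses
teachers_total, teachers_can_throw = 1_500_000, 0.01  # from the thread

nfl_block = nfl_total * nfl_can_throw                  # green block: 1,530
teacher_block = teachers_total * teachers_can_throw    # red block: 15,000

# Posterior probability the spiral-thrower is a math teacher.
p_teacher = teacher_block / (teacher_block + nfl_block)
print(f"{p_teacher:.0%}")  # ~91%: the green block is ~10x smaller
```

Which is exactly why an out-of-scale diagram is so misleading here: the visually "bigger" block is the one that should be a sliver.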

Expand full comment
Gilpish's avatar

to add to this, the greaterwrong article on arbital is a good example of a to-scale bayes box https://arbital.greaterwrong.com/p/bayes_rule/

Expand full comment
priorGuesstimator's avatar

I see that a lot of work was done here, but unfortunately this explanation still didn't "click" for me. I understand the theorem intellectually and can do the calculations, but even having this helpful diagram doesn't make me instinctively use Bayesian reasoning for those kinds of problems.

Maybe the problem is that I'm kinda far from NFL in general, so the choice of the example is not the best for me personally. But again, I try to imagine how I would explain this model of reasoning to my less math-inclined friends and relatives and I draw a blank.

Like, what numbers are chosen for the bottom line and what for the vertical bars? Why is there such a big empty space on the right in most of the examples? Can the right big rectangle be higher than the left thin one and what would that mean? Why is thin always on the left? Can and should you make a chain of those in case you have multiple steps and how would they look - fractal-ish, maybe? Why isn't the bottom line drawn thicker than the rest of the borders if we start with it as a prior?

Maybe there is a more detailed article somewhere with like 15-20 examples of different kinds? I've read Eliezer's original article and it didn't click then either.

Expand full comment
Eric Rasmusen's avatar

I don't think The Existence of God is too spicy. I could use it in the fundamentalist school where I teach, though maybe it is too spicy for public schools. My 7th graders would be very interested. 7th graders know what the big questions are, and they like to argue. So Pascal's Wager first, and then Bayes's Rule and Hume's Miracles Argument, would make a great series of class sessions. It isn't pro-God or anti-God: the arguments are not conclusive, just clever and useful. It can be taught in a balanced way.

Expand full comment
ike saunders's avatar

I think the annotations in these diagrams appear slightly misleading once you add the likelihoods. I'd guess that a majority of people, if shown one of the pictures in section 3 with no other context, would think that the red rectangle represents 1,500,000 teachers, not 15,000. Instead I think black arrows should be used to annotate the size of the priors, and coloured arrows the size of the likelihoods.

Expand full comment
Rafael Kaufmann's avatar

Just dropped by to say I also have a two-year-old who loves the Bayes for Babies book, also primarily because of all the BALLS

Expand full comment
Seth Benzell's avatar

Hi Brandon, great post. I'm also very interested in teaching people about Bayes. From two perspectives.

As a professor, teaching probability and statistics courses at the sophomore level, I put a big focus on Bayes that isn't there in most textbooks. I spend about a week on it in a semester class. My favorite lesson plan is the following: After the intro material, we use Bayes rule to unpack the case of Sally Clark https://en.wikipedia.org/wiki/Sally_Clark, a woman who was accused of murdering her two children. This was a case with 100% circumstantial evidence, and with some famous statistics errors, making it a good example.

After the students listen to a podcast discussing the case, we talk about how to fit it into the Bayesian framework. In small teams the students then discuss what a good prior would be (and I talk about the "prosecutor's fallacy" here), discuss the correct way to calculate the probability of two babies dying of SIDS if the mother was innocent (a doctor who testified incorrectly assumed SIDS deaths were statistically independent), as well as the probability of the two dead babies NOT being found if she did murder them (low, but always funny to discuss). The students put their guesses into this worksheet: https://www.dropbox.com/scl/fi/b6tuhb05hv63cnb55sye3/sally-clark-bayes-rule-exercise.xlsx?rlkey=ei1mi6jbde8uzcf5rjevm8jty&st=i50siwa0&dl=0, which automatically calculates the posterior based on their assumptions, so they can play with them.
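The worksheet's core calculation can be sketched in a few lines. Everything numeric below is a placeholder guess for illustration, not a figure from the worksheet or the trial:

```python
# A minimal sketch of a Sally Clark-style posterior (all numbers are
# hypothetical placeholders, NOT the worksheet's or the case's figures).
prior_murder = 1e-6             # prior that a given mother murders both infants
p_deaths_given_murder = 1.0     # two dead infants, given murder
p_deaths_given_innocent = 1e-5  # two infant deaths given innocence; note SIDS
                                # deaths in one family are NOT independent

posterior = (p_deaths_given_murder * prior_murder) / (
    p_deaths_given_murder * prior_murder
    + p_deaths_given_innocent * (1 - prior_murder)
)
print(round(posterior, 3))  # with these guesses, 0.091 - far from certain guilt
```

The point the exercise makes survives any reasonable choice of placeholders: a tiny P(evidence | innocent) does not imply a tiny P(innocent | evidence) unless the prior supports it.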

It's always a popular class, I think because the topic is so grisly and immediately engaging, and also because the students get to feel smarter than a British jury.

I also co-host a podcast called "Justified Posteriors" on Substack https://empiricrafting.substack.com/podcast that follows a Bayesian framework. Every other week we read a paper on economics and AI, explain our prior, and then see how much the paper changes our beliefs. I think too little scientific analysis takes this approach!

Anyway I salute you Bayes-king

Expand full comment
spandrel's avatar

I am often in the position of explaining why Bayes' theorem matters to my colleagues. I am a research methodologist who plans a lot of clinical studies with smart people from other fields, so they aren't the target audience of educators; instead, they are people who lean heavily on their frequentist experience, and who would generally prefer to ignore the problem of coming up with priors. So I came up with a modification of the Raven Paradox to make it clear why priors matter.

First, assume you take as a null hypothesis that “all ravens are blue”, and want to design a study to reject this hypothesis. You sample 100 ravens and find that none of them are blue; the frequentist conclusion is that the prob(ravens are blue) is 0% (95% CI = [0,0.04]). Not likely! So reject the null hypothesis.

But it’s hard to actually sample ravens! We can't find any, in fact. But we notice that logically, “all ravens are blue” is equivalent to “all non-blue things are not ravens”. So we decide since it’s hard to find ravens we’ll just test this null hypothesis instead, and sample 100 random objects that aren’t blue. We find that all of them are not ravens, and estimate that the prob(non blue things are not ravens given our data) is 100% (95% CI =[.96,1]). Do not reject – it’s almost surely true.

Thus, we have tested ‘equivalent’ null hypotheses but arrived at extremely different conclusions. What happened? We ignored our priors. If we incorporate accurate prior probabilities then we should arrive at very similar probabilities that the null hypotheses are rejected.

Usually I just stop there, because my audience gets it and we’re all busy people. But for the interested, here is how we’d work it out.

Approach 1.

Prob(ravens are blue|data) = P(H|E) = P(E|H)P(H)/P(E)

Note that P(E|H) = prob(observe no blue ravens in 100| all ravens are blue) = 0, so we don’t even need to know P(H) or P(E), only that P(E) > 0. Then

P(E|H)P(H)/P(E) = 0 * P(H)/P(E) = 0

We can reject the null hypothesis with certainty.

Approach 2.

Prob(non ravens are not blue|data) = P(H|E) = P(E|H)P(H)/P(E)

Here, P(E|H), the probability that we observed no ravens in our sample of non-blue things if our hypothesis is true, is 1. That's how we got the 100% estimate earlier. Now let's say the probability of our evidence P(E) – the probability that we draw a bunch of non-blue things and see no ravens – is 0.999999999, given what we know about the number of ravens vs. objects in the world; we might look at a million non-blue objects before we see a raven. Then

Prob(non blue things are not ravens|data) = 1*P(H)/0.999999999 ≈ 1.000000001*P(H)

Then our probability of rejecting the null hypothesis given the data depends almost entirely on our prior P(H) – how strongly we believe in our hypothesis. And it’s not much! Something very close to 0 in fact, say 10^-20. Which would bring Prob(all non blue things are not ravens| data) very close to 0 as well. Again, we reject the null hypothesis (though with perhaps slightly less certainty).

Thus, priors matter.
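The two approaches above can be checked numerically in a few lines (the prior and P(E) are the commenter's illustrative guesses, not measured quantities):

```python
# Approach 1: sample 100 ravens, none blue. P(E|H) = 0, so the numerator
# of Bayes' rule is 0 and the posterior is 0 regardless of P(H) or P(E).
p_H = 1e-20                # prior for "all ravens are blue" (illustrative)
posterior_1 = 0.0 * p_H    # = 0: reject with certainty

# Approach 2: sample 100 non-blue things, none of them ravens.
p_E_given_H = 1.0          # guaranteed if the (equivalent) hypothesis is true
p_E = 0.999999999          # ravens are vanishingly rare among non-blue things
posterior_2 = p_E_given_H * p_H / p_E   # barely above the prior: still ~1e-20

print(posterior_1, posterior_2)  # both effectively zero: consistent conclusions
```

With the prior included, the two "equivalent" hypotheses no longer give contradictory answers.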

Expand full comment
Moritz's avatar

I find Bayes theorem to be much more natural when thinking in odds, not probabilities. In your example:

- Prior odds of math teachers v NFL players: 1000:1 (i.e. 1,500,000 teachers vs. 1,500 players)

- Observation odds: 1:100 (i.e. 1% likelihood vs. 100%)

- Posterior odds? Easy - just multiply the two together. 1000:1 x 1:100 = 10:1 against the NFL player, i.e. about 9%.

Indeed Bayes’ theorem, when reformulated in odds language (the odds of an event E being p(E):p(not E)), simplifies to the straightforward multiplication above. I wonder why it’s not taught this way - maybe I’m overlooking something?
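As a sanity check, the odds arithmetic above in code, using the comment's numbers:

```python
# Odds-form Bayes: posterior odds = prior odds x likelihood ratio.
teachers, players = 1_500_000, 1_500   # prior odds 1000:1 (teachers : players)
lr_teacher, lr_player = 0.01, 1.0      # observation odds 1:100 (great throw)

post_teachers = teachers * lr_teacher  # 15,000
post_players = players * lr_player     # 1,500  -> 10:1 for teachers

# Convert the 10:1 odds to a probability for the NFL player.
p_player = post_players / (post_teachers + post_players)
print(round(p_player, 3))  # 0.091, i.e. about 9%
```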

Expand full comment
Ppau's avatar

Some people do explain it that way! 3Blue1brown among them

Expand full comment
kipling_sapling's avatar

> not because I think we should expect kids to sit down and do the math on their own, in their everyday lives. (Do any of us actually do that?)

I confess I'm a little confused. Indeed, I don't do that. But my impression from frequenting these communities was that "doing the math" in our everyday lives was precisely the goal for LWers -- and I always felt a bit inferior that I don't do that, and even though I was once an aspiring actuary, it would take quite a bit of mental effort to do it correctly for everyday situations.

So what IS the goal of teaching Bayes? "Developing intuition for everyday probabilities by working a lot of specific examples" seems like a reasonable answer, but I don't think I've seen people claim that, not when they're specifically *teaching Bayes* in rationalist spaces at least. If that's the goal, wouldn't it be good then to have students preregister their estimates for a probability, then do the math, then check the accuracy of their initial estimates, and track their progress over time? More than being able to do all the steps infallibly, being skilled at estimating the answer quickly (with System 1) would be the skill that actually translates to "improving intuition for everyday probabilities," wouldn't it?

As I mentioned earlier, I've felt inferior for years at the fact that I don't have a great intuition for how to apply Bayes to everyday life and would struggle to even set up the problem correctly to accurately calculate probabilities using it for everyday problems. I've often wished that the introductions I've read would have more practice problems, more exercises that demonstrated exactly how it works for a normal layperson going about their daily life. I agree with you that worldview issues are probably the best hook for most people -- do you have some practice problem bank for such questions? I think for me, the best way to build my muscles with this stuff would be to have a few hundred problems that are a mix of worldview issues, casino-game probabilities, the standard medical/insurance type problems, and fork-in-the-road-of-life-how-do-I-make-the-right-move questions. I realize that the "answers" to the problems would depend on the prior probabilities that *I* assign, but that's not an insurmountable problem. A web applet could take my prior probabilities and provide an objective answer I could check my calculation against, and could even integrate with an LLM that could let me know when I've entered a prior probability that it judges to be way out of whack (like if I've smuggled assumptions about the posterior probabilities into my prior numbers, a classic rookie mistake!)

I don't want to go back and rewrite my whole comment, but I'm realizing at this juncture that part of your thesis is that the *social* aspect is what develops the intuition. I think that's great, but I still think the preregistration AND having a wide variety of problems from different spheres would go far.

Expand full comment
Roman's Attic's avatar

Reading through this article, I'm realizing that being gay taught me rationality. Learning about denial and how never to experience it again taught me a lot of the scout mindset, and having crushes on people that *maybe seemed gay but were actually straight* taught me about Bayes' theorem and the low base probability.

Expand full comment
Roman's Attic's avatar

The best image I've seen representing Bayes' theorem is The Decision Lab's cartoon talking about the base rate fallacy (https://thedecisionlab.com/biases/base-rate-fallacy)

Expand full comment
TheKoopaKing's avatar

I think Bayes's theorem is useless outside of empirical fields where base rates are clearly measured. In the majority of those cases you are simply being told of something that will impact the probability outcome, and of course you will use some sort of multiplicative operation to calculate that outcome. But in most cases you are not really multiplying anything; you are vaguely reasoning about what impacts what. That's what "learning" should be geared toward: making models of the world with more things in them, not Bayes's theorem.

Expand full comment
Alcibiades's avatar

One small suggestion: when you are using visual aids, make sure they actually make sense visually. The areas you use to represent the NFL players and the math teachers are about the same size, but the math teacher area should be 10x larger. What you've shown would work well if the probability were 50%. It doesn't make sense for 9%, and instead is just another source of confusion.

Expand full comment
Eric Rasmusen's avatar

Psychologist Gerd Gigerenzer has devoted a lot of his work to showing that (a) even smart people like doctors don't get Bayes Rule, but (b) even dumb people like children can understand it if you teach them the right way.

Somebody has taught Bayes' Rule with Legos; I forget if it was him. Anyway, here is one example of his work:

See Cognition, 98, 2006, 287–308. www.elsevier.com/locate/cognit

Children Can Solve Bayesian Problems:

The Role of Representation in Mental Computation

Liqi Zhu

Institute of Psychology, Chinese Academy of Sciences, Beijing

Gerd Gigerenzer

Max Planck Institute for Human Development, Berlin

Abstract. Can children reason the Bayesian way? We argue that the answer to this question depends on how numbers are represented, because a representation can do part of the computation. We test, for the first time, whether Bayesian reasoning can be elicited in children by means of natural frequencies. We show that when information was presented to fourth, fifth, and sixth graders in terms of probabilities, their ability to estimate the Bayesian posterior probability was zero. Yet when the same information was presented in natural frequencies, Bayesian reasoning showed a steady increase from fourth to sixth grade, reaching an average level of 19%, 39%, and 53%, respectively, in two studies. Sixth graders’ performance with natural frequencies matched the performance of adults with probabilities. But this general increase was accompanied by striking individual differences. More than half of the sixth graders solved most or all problems, whereas one third could not solve a single one. An analysis of the children’s responses provides evidence for the use of three non-Bayesian strategies. These follow an overlapping wave model of development and continue to be observed in the minds of adults. More so than adults’ probabilistic reasoning, children’s reasoning depends on a proper representation of information.

Expand full comment
David Bahry's avatar

My favourite way is to

1) start with the verbal intuition, "How likely something is after seeing a new piece of evidence, depends both on how strong the new evidence is, and on how likely it already was to begin with. Think 'extraordinary claims require extraordinary evidence.' The lower something's prior plausibility, the stronger the new evidence would have to be to convince you."

2) give the odds form of Bayes' theorem: posterior odds = prior odds × Bayes factor. It's more transparent to intuition than the probability form. Explain how the terms line up with the verbal version, including BF as measuring the strength of the evidence.

3) Explain Bayes factors, including examples. "BF is the ratio of how strongly the hypothesis predicts the evidence, to how strongly we'd expect the evidence even if the theory's false. Imagine someone on trial for murder telling us he's innocent. That's only weak evidence, because he'd probably tell us he's innocent whether or not he was. The strongest kind of evidence would be something you'd expect to see if the hypothesis is true, *and* strongly expect not to see if the hypothesis is false."
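The update rule in step 2 is tiny in code. Numbers below are made up to match the murder-trial example:

```python
# posterior odds = prior odds x Bayes factor (BF measures evidence strength).
prior_odds_guilty = 1.0                 # even odds before the statement (made up)
p_claim_given_guilty = 0.95             # guilty defendants also claim innocence
p_claim_given_innocent = 0.99           # innocent defendants almost always do

bayes_factor = p_claim_given_guilty / p_claim_given_innocent  # ~0.96: near 1
posterior_odds_guilty = prior_odds_guilty * bayes_factor
# The statement barely moves the odds - weak evidence, as described above.
```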

Expand full comment
David Bahry's avatar

I see Eliezer also started teaching it with the odds form! https://www.lesswrong.com/w/bayes-rule?lens=high-speed-intro-to-bayes-s-rule

Expand full comment
sclmlw's avatar

This is great. One concern I'm glad you mentioned, but that sometimes feels like blasphemy to even bring up in these spaces, is this part:

> Bayesian reasoning can become confirmation bias on steroids. You have to be humble in your analysis, because there are SO MANY DIFFERENT WAYS IT CAN GO WRONG.

Take the football example you mentioned. There are so many wild guesses in that example, plus contrived certainties. (E.g. the guy who goes into the bushes knows it's one or the other with certainty for some reason? Real situations are rarely this binary or clearly known. Or the idea that you'd know the frequency distribution of NFL players and teachers? In my experience people just guess, which is worse than nothing.)

I feel like there's too much worship at the altar of Bayes. Mathematical equations are only as good as the assumptions they're based on. Bayes' theorem assumes you know 3 variables in order to get a probability for the fourth. I can't tell you how many times I've seen people whose only certainty is their own prior try to apply Bayes' theorem to the problem, guess wildly at the unknown variables, and come away with a reduced uncertainty that's totally unjustified.

I recommend ensuring your curriculum has these kinds of situations built in, so you can also teach students under which circumstances they should NOT reach for Bayes' theorem.

Expand full comment
NoodleIncident's avatar

In the NFL example of the box diagram, it’s drawn so out-of-scale that the visual intuition actually misleads you! Since the light blue box (1/1000) is drawn even wider than the dark blue box (1/100), the visual shows more-than-even odds of an NFL player.

I’m not sure if attempting to shade 1/1000 of the width would be helpful, either, but it was so distracting that I had to mention it anyway.

Expand full comment
Alan Thiesen's avatar

This post reminds me of a problem I have with Bayes' Law. Whenever I try to use it to update the probability of a possible future event, Bayes' Law is useless because the conditional probabilities on the right side of the equation are harder to estimate than the one on the left. What am I doing wrong?

I'll illustrate this in the discrete case with two alternatives. Let P(F) be the prior probability of a possible future event F. A recent event E that affects P(F) has occurred. To estimate P(F|E), I say:

P(F|E) = P(F)*P(E|F) / ( P(F)*P(E|F) + P(~F)*P(E|~F) )

But the conditional probabilities on the right-hand side are of the form

P(something that has already happened | a possible future event that might or might not happen).

It is difficult to understand what such a probability even means, let alone estimate it. The conditional probability on the left is easier to estimate, so Bayes’ Law is useless.

What am I doing wrong?

Expand full comment
thefance's avatar

this is, without a shadow of doubt, the #1 complaint about bayes stanning.

Expand full comment
David Wyman's avatar

Whatever our priors, constantly updating based on new information eventually works. Like the free market and gravity, you can interrupt it, game it, or overcome it (crony capitalism, upward force), but it doesn't go away.

Expand full comment
Mo Nastri's avatar

Unless our priors are trapped of course. Which is probably the case for the most important cognitive biases of all https://www.astralcodexten.com/p/trapped-priors-as-a-basic-problem

Expand full comment
Dylan Kane's avatar

I'm curious how you see Geary's distinction between biologically primary/secondary knowledge fitting in here (https://psycnet.apa.org/record/2008-16048-002).

In short, Geary argues that we've evolved to learn some things (biologically primary) like language, many motor skills, social conventions, basic physical intuitions. We learn these without any formal instruction through exposure and immersion. Then there are some things we haven't evolved to learn (biologically secondary). These are things like reading, solving systems of equations, chess, and Bayes' theorem. There are some bright people who can pick up this stuff without formal instruction but most people need some sort of structured learning to understand these ideas.

Does Geary's work fit into your view on education? Are you trying to harness those biologically primary modes to teach biologically secondary skills?

Expand full comment
Fabian's avatar

I don't get why Bayes alone should be the change for the world.

There are many other influences that make people irrational (all kinds of stress) before Bayes can even kick in to lift rationality to another level.

"Could a new kind of school make the world rational?"

wouldn't the rational thing be to put Bayes after other things like:

Nutrition, physical fitness, sleep (understanding the importance of sleep, sleep hygiene and routines), mental & emotional health, financial literacy, critical thinking and media literacy....

then Bayes...

then communication & relationships (active listening, conflict resolution, setting boundaries, navigating romantic relationships), life skills (first aid, emergency response, time management, ...)

and many other important things...

Expand full comment
thefance's avatar

The most common form is the equation. The 2nd most common form is the boxy rectangles. But when I first came across Bayes Theorem on LW, I invented my own visualization. Which, curiously, I have yet to see anyone else demonstrate.

Mentally, I have a 3D image of a building, where each floor represents either a circle or a Venn diagram of circles. The area of each circle represents an absolute quantity, and the area of the floor represents the entire population size. The multiplication operator (or division operator) represents a "telescoping" relationship between the circles from one floor to the next. In this frame, Bayes' theorem represents how one particular relationship between two circles gets shifted into a different relationship between two circles.

It's very difficult to explain in words without a diagram. But I think it's somewhat easier to follow if \omega (i.e. the entire sample) is made explicit, rather than hidden away for convenience. E.g. P(H) should be fully unrolled as P(H|\omega) because it makes the underlying concept of "multiplication qua telescoping" [e.g. P(E&H|\omega) = P(E|H) * P(H|\omega)] more intuitively obvious. Consider how, if you treat each probability as an actual *fraction* of two numbers, there's a sense in which the "H's" cancel out. I.e.

> P(E&H|\omega) = P(E|H) * P(H|\omega)

> P(E&H|\omega) = P(E|_) * P(_|\omega)

> P(E&H|\omega) = P(E&H|\omega)

has the same underlying logic as

> (A/C) = (A/B) * (B/C)

> (A/C) = (A/_) * (_/C)

> (A/C) = (A/C)

And once you grok this, Bayes' theorem itself becomes a natural and obvious conclusion. (I know if I say "becomes trivial", I'm going to be memed into the stratosphere. But this is how I feel.)
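The "fractions cancel" logic above can be checked with made-up counts for a finite sample space:

```python
# Treat each probability as a ratio of counts; the H's literally cancel.
n_omega, n_H, n_EH = 1000, 200, 50   # |omega|, |H|, |E and H| (made-up counts)

p_H_given_omega = n_H / n_omega      # B/C
p_E_given_H = n_EH / n_H             # A/B
p_EH_given_omega = n_EH / n_omega    # A/C

# (A/B) * (B/C) == A/C, i.e. P(E|H) * P(H|omega) == P(E&H|omega)
assert abs(p_E_given_H * p_H_given_omega - p_EH_given_omega) < 1e-12
```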

Expand full comment
Vadim's avatar
1dEdited

Places where Bayesianism contributed something to my life, from what I can easily recall:

· Just this general idea that your world model is a weighted mixture of hypotheses and you receive evidence and promote and demote hypotheses correspondingly.

· Connection between evidence and information, the idea that evidence can be trivially presented in bits if you use log odds ratio.

· Intuitive ideas about how evidence works in real life, like the conservation of expected evidence: if there are two outcomes of an experiment, and one should update you toward a hypothesis, the other should update you away from it. Moreover, they should balance each other out quantitatively, like if there is a strong probability of weak evidence in one direction, there should be a weak probability of strong evidence in the other direction — and your expectation of update before the actual experiment is 0.
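That last bullet can be verified in a few lines (all numbers are made up):

```python
# Conservation of expected evidence: the probability-weighted average of the
# two possible posteriors equals the prior, so the expected update is zero.
p_H = 0.3                             # prior (made up)
p_E_given_H, p_E_given_notH = 0.8, 0.1

p_E = p_E_given_H * p_H + p_E_given_notH * (1 - p_H)       # P(E) = 0.31
post_if_E = p_E_given_H * p_H / p_E                        # updates upward
post_if_notE = (1 - p_E_given_H) * p_H / (1 - p_E)         # updates downward

expected_posterior = p_E * post_if_E + (1 - p_E) * post_if_notE
assert abs(expected_posterior - p_H) < 1e-12               # no free updates
```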

In general, Bayesianism to me seemed like some kind of hard-to-describe mental discipline and a bunch of mental habits of someone who tries to keep a consistent and quantitative picture of the world in light of uncertainty as all kinds of evidence arrive and make it difficult. So, like, if you previously believed something with high probability, and now evidence very strongly updated you away from it, you'll try to take apart the entire reasoning sequence that led you to this belief in the first place. Not because of some theorem, but just because that's what it takes to have a consistent worldview.

Otherwise, interpreting the entirety of Bayesianism as "you know, just figure out some priors, and then slap some updates, and look, we have a shiny new number!" seems like the kind of mistake that could lead to rootclaim losing the lab leak debate. (Do I sound like I think the original post makes this mistake? Because I don't! I actually think the idea of letting kids figure out bigfoot claims with Bayes is a very cool idea. So I just wanted to describe what Bayesianism contributed to my life, and this basic mistake I've seen elsewhere seemed worth mentioning.)

Expand full comment
Sjoerd Dost's avatar

Are you familiar with Bret Victor's Dynamicland? Connecting modern concepts to 'the old tools' in a spectacular way, imo. See https://dynamicland.org/

I bet teaching kids about Bayes' theorem through cutting/pasting paper boxes with physical scissors and (re)combining them to see probability change would be a hit.

Expand full comment
Martin L Morgan's avatar

Thanks for the post. I enjoyed reading it.

Expand full comment
Sufeitzy's avatar

This is a poor example because the visual is an extremely bad representation of quantitative information - the "math teacher passers" area is not 100x as large as the football player area, but you depend on that ratio for visual intuition. Yet you display something like (P=Pro, M=Math Passers, m=math non-passers):

Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
PMMMMMMMMMM

= 50% chance (10 P / (10 P + 10 M))

Let me give you a related problem in logistics which is exasperating.

Perfect order reliability = (number of delivered orders with no damage & complete & met the request date & correct paperwork) / (total # orders).

I have adult professionals who then say: OK, we will calculate it as (% undamaged x % complete x % on-time x % good papers).

Ok, I say: in a shipment of four orders we have one of each defect. Then the immediate answer is

75% x 75% x 75% x 75% ≈ 32%

Then I say: no, you can't know. You have to know the rate of each combination of issues.

If each order has one defect, then the reliability is 0%. Then I make a 2x2 grid where each cell is an order and X marks a defective one:

XX
XX

If one order has four defects, then the reliability is 75% - it's pretty obvious R is 75%:

XR
RR

Visually the intuition is instant.

The combinatorics stump them - orders which are damaged, damaged and incomplete, damaged and incomplete and late, damaged and incomplete and late and bad paperwork, damaged and late, damaged and late and bad paperwork, damaged and bad paperwork, incomplete, incomplete and late, incomplete and late and bad paperwork, incomplete and bad paperwork, late, late and bad paperwork, bad paperwork etc.

From this example, and a grid of countable numbers (2x10), it becomes more obvious. In your visual aid the size of the math teacher area was not 100x the size of the football area; therefore the visual reasoning was very poor.
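A minimal sketch of the perfect-order arithmetic above (four orders, the two extreme cases; all numbers illustrative):

```python
# Each order carries a set of defects; an order is perfect iff the set is empty.
def reliability(orders):
    return sum(1 for defects in orders if not defects) / len(orders)

# Case 1: one defect per order -> 0% perfect orders.
spread = [{"damage"}, {"incomplete"}, {"late"}, {"paperwork"}]
# Case 2: all four defects stacked on one order -> 75% perfect orders.
stacked = [{"damage", "incomplete", "late", "paperwork"}, set(), set(), set()]

# Naive independence multiplies the four 75% marginals - wrong in both cases.
naive = 0.75 ** 4   # 0.316...

print(reliability(spread), reliability(stacked), round(naive, 3))  # 0.0 0.75 0.316
```

The marginal rates alone cannot recover the answer; you need the joint distribution of defects.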

Try

Pmmmmmmmmmm

PMMMMMMMMMM

You get likelihood more obviously as 2 P / 2P + 10M = 16.7%

If we were to use the proportions in the visual example you gave, it looks like the likelihood is 50%, not 9%, because the math passers and football players have the same areal size (P=Pro, M=Math passers, m=math non-passers):

Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
Pmmmmmmmmmm
PMMMMMMMMMM

In teaching with visuals, you actually have to have visuals which work.

If you’re just starting out making visual displays for instruction, I recommend Tufte; he’s got reasonably good rules of thumb:

https://www.amazon.com/Visual-Display-Quantitative-Information/

Expand full comment
Kalimac's avatar

OK, so how do you use Bayes to estimate the existence of Bigfoot? Using the example given, is "Bigfoot exists" the NFL players or the math teachers, and what's the other one? Color me confused.

Expand full comment
Peregrine Journal's avatar

I recommend we start by nudging people toward probabilistic reasoning at all, by calling out common fallacies that result from avoiding it.

Unpacked more here:

https://peregrinejournal.substack.com/p/plausible-probable

Expand full comment
Wasteland Firebird's avatar

The math teacher problem is a good example. It makes a lot of sense! But I'm a smart guy and I still can't figure out how to translate that technique to "do ghosts exist" and "who should I vote for." I'd like to see a post with just a bunch of examples like these.

Expand full comment