335 Comments

On the question of who is a great K-12 teacher, just ask the parents. Or the admin. They all just know who it is. It is the one everyone wants next year.

No need to spend resources coming up with a metric to assess which teacher is most deserving of a merit-based pay raise.


The problem with that is that it will often be the one that was really good ten years ago and is coasting on their reputation now.


That assumes involved parents who care, and who know the community well enough to know what teachers are better. It can break down in some cases.

On the other hand, it's a whole lot better than nothing.

author

A lot of research shows that student evaluation of teachers is biased (depends on things like instructor attractiveness, how much they inflate grades, etc.) and not related to knowledge gain (uncorrelated or negatively correlated with test score improvement). See e.g. https://www.aaup.org/article/student-evaluations-teaching-are-not-valid but I think there are dozens of studies that support this.

Given that, I wouldn't want to say parents are definitely good at this before testing it.

If parental opinion really is the best marker for teacher performance, a prediction market would learn to take that into account. If it was one among many good markers, a prediction market could learn how to weight it compared to the others.


Also cookies. Teachers who give out snacks within a couple of weeks before evaluations do better.


I was a teaching assistant for a computer science class at university. I was amazed by how many comments I got that were like "She has nice eyes" and "She is kind", all unrelated to my actual work and whether it helped them.

But that was college students giving anonymous feedback.

In a K-12 school though, in my experience, parents of older students at the school are a treasure to know as they can give your child great advice. Including, "This is a terrible school for sports. Very disorganized. Find another school if it matters to you."!!

These older students have to be similar to your child though...very similar in giftedness, similar in interests, similar in goals, for the advice to be useful.

Advice like: Do not join this club. Or, do not take this class here, just do a Coursera course over the summer.

As kids get older, they find such advice directly from older peers. Of course, they need to exercise good judgement in who they get advice from.


That reminds me of my ninth grade math teacher - fresh outta school, hot, cute smile and suddenly I was getting straight As in math. In eighth grade I had an uninterrupted streak of Fs after the first half of the semester, just for comparison.

Grades converted to Yank system for readers' convenience.


Given how fetching you found your teacher, I bet you had quite the yank system in place already.


GLOMAR on that!


This is valid feedback. I've had contact with otherwise good teachers who were assholes, and therefore they were quite bad teachers.


You would be shocked how many times, as a professor, I would have to call students into my office and tell them it wasn't appropriate to give your TA the feedback "you're really hot".

Then again, you probably wouldn't.


Student evals were always anonymous in my department. I'm surprised that the student information was available to do that.


Because these were TA evals, they weren't anonymous and just went through me to be aggregated.


Is research on such subjective questions really scientific though? I just don't think so. There are too many factors to consider. Too many unknowns, and also unknown unknowns.

The best advice on K-12 education I ever got, was from mom friends, or mom friends of mom friends who were (like me) very serious about an excellent education for their children, plus eager to share what they'd learned.

Most of all, they had kids who were similar in giftedness and interests.

I found some great approximations to schools that turned out to be very good matches. They were never perfect matches.

Homeschooling is ideal academically (and so customized your child is done in 3 hours and can then focus on a passion) but if you want to homeschool, begin EARLY! Otherwise kids seem to think of it as weird and refuse to do it.


I am not rejecting all science in K-12 education. In fact I really love the writings of Dr. Daniel Willingham, cognitive science professor, who focuses on K-12.


I wish there was an edit button for after I press "Post", for a few minutes!

That first line in my previous comment here should read :

I am not rejecting all scientific studies on how to do K-12 education well. In fact I really love the writings of Dr. Daniel Willingham, cognitive science professor, who focuses on K-12.


As for instructor attractiveness influencing teacher evaluations, economist Steven Landsburg addressed this in one of his books. Summary is on page 14 of this transcript: https://www.rogerdooley.com/wp-content/uploads/2018/12/EP247-BrainfluencePodcastTranscript.pdf

In brief: This effect is perfectly consistent with people *not* being influenced by how attractive their instructor is. Attractive people have a lot of career options. The ones who go into teaching despite this fact must be the ones who really love teaching. No surprise that they connect better with students.

This isn't central to the point SA was making. It's an interesting side note though, and it shows me how easy it is to misinterpret data!


Statistics is oversold to us anyway.

:)

I'm not talking about formal questionnaires resulting in evaluations but an informal chat, often, with other parents.

Nothing like it.

And yes, I do assume closely involved parents.


Or consider the fact that people want to please attractive people more, because they value their good opinion more highly, for the usual social reasons. Everyone wants to be friends with the good-looking man or woman.


Should we expect the same effect in every career except the very best-paying ones? I.e. should we predict that attractive people in most fields are also above-average competent?

(After correcting for the known correlation of attractiveness with family wealth, good health etc. etc., and for any direct effect of attractiveness on job performance in e.g. sales.)


I have the following generalized confusion regarding prediction markets:

Prediction markets are slightly less than zero sum games, right? The players win and lose against each other and the market itself takes a tiny rake?

If that's correct, it seems like the markets need a steady stream of consistent losers to operate and that seems unsustainable. This is not the stock or bond market; the Fed isn't constantly pumping in money such that you can be confident of some return just for playing. You have to actually win your bets to come out ahead. If you play the market and learn you are not good at predicting student/teacher outcomes, logically you will stop playing pretty quickly. Once the bad predictors are weeded out, and only the good predictors remain, they're in a slightly less than zero sum game, each trading off with each other and on the whole losing just a bit to the rake. This leaves them with no incentive to continue playing.

This bothers me as a problem with all prediction markets that are funded only by the bets of the players. I don't see how they can fund/sustain themselves outside the irrational confidence of bad predictors.
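To make the arithmetic concrete, here is a minimal sketch in Python. It assumes a simple parimutuel-style market where all stakes are pooled, the market keeps a fixed rake, and the winners split the rest in proportion to their stakes. Real prediction markets settle differently, and the player names, stakes, and 2% rake are made up for illustration.

```python
def settle_market(stakes, winners, rake=0.02):
    """Return each player's net result after the market takes its rake.

    stakes  -- dict of player -> amount bet (hypothetical players)
    winners -- set of players whose bet turned out correct
    rake    -- fraction of the pool kept by the market (assumed value)
    """
    pool = sum(stakes.values())
    payout_pool = pool * (1 - rake)  # what is left after the rake
    winning_stake = sum(stakes[p] for p in winners)
    net = {}
    for player, stake in stakes.items():
        if player in winners:
            # Winners share the payout pool in proportion to their stake.
            payout = payout_pool * stake / winning_stake
        else:
            payout = 0.0
        net[player] = payout - stake
    return net


stakes = {"alice": 100.0, "bob": 100.0, "carol": 50.0}
net = settle_market(stakes, winners={"alice"}, rake=0.02)
total_net = sum(net.values())
print(net)
print(round(total_net, 2))  # -5.0: the players as a group lose exactly the rake
```

Whoever wins, the players' results always sum to minus the rake, which is the "slightly less than zero sum" property described above.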


One source of funding is people who want to know the answer to a problem. They seed the market and cover the transaction costs that are eventually required.


Also they can't tell how well teachers know the material


Why split hairs over some kind of objective measure that may just end up getting gamed anyway? If the parents and kids are happy, I know a virtuous cycle will flow from there.


I don't mind subsidizing attractive teachers, but the grade inflation is a real problem... and ironically one that seems to have an easy solution -- just separate teaching from grading! Yet somehow there is a strong opposition against this, and I don't really understand why. I can only guess...

Perhaps teachers want to keep an option to reduce conflicts with students and parents by inflating grades? Powerful people want to keep an option to threaten teachers of their kids to inflate the grades? Or is this somehow systemic -- like, kids of the educated class get their grades inflated more than kids of the working class, which keeps everyone in the same class as their parents, so everyone who has the power to change the system is okay with how it works now?

When I was a teacher, I was confused. On one hand, the government provided some norms for grading, according to which, solving the problem successfully was merely B or C depending on whether you needed minor nudges; to receive an A you had to be able to solve the problem in multiple ways and discuss the relative advantages and disadvantages of the individual solutions. According to this norm, most students in the school would get a C, and maybe one student in each year would get an A.

But in reality, the majority of students got an A, and most of the rest got a B. I quickly learned that trying to give someone a C invites a conflict with parents (even in situations where according to the written norms, the student should get a D or E). Also, your colleagues would gently suggest that perhaps it is all your fault because perhaps you suck at teaching. It felt as if the grade reflected on the teacher, not the student. So I quickly learned to give my students only A's and B's, regardless of what the norms said. And everything was okay, because the incentives are asymmetrical -- no one is going to send lawyers in your direction for giving someone a *better* grade than (they think) they deserve.


Very good point about objective measures getting gamed.

Educating a child is a very complex problem. There is no one-size-fits-all solution.


Separating grading from teaching is standardized tests, which have their benefits but also their limitations. Most of my better HS teachers clearly marked certain lessons as for the standardized test and certain lessons as not for the standardized test. Removing their ability to adjust their curricula would not have improved those classes.

And of course there is the problem of subjective or imperfect answers. Removing the power over grading means removing the ability to correct mistakes.


Also, standardized tests are soooo gamed. Where I grew up, you can see it in the evolution of the state algebra test. Circa 1990 it was straightforward: some multiple choice, some short-answer problems, and a couple of word problems where you had to show your work, plus a pretty straightforward rubric converting this to a percentage score.

In the mid-2000s this collapsed: there was more and more multiple choice, and staple topics got lost in the exam, though somehow the word problems sometimes got longer. The conversion also became insane: a very low raw score, less than half, was "passing", but at the higher end you could lose 2-3 scaled points for 1 lost raw point.


Here my opinion is that the tests could be improved. I mean, "could" from technical perspective; but probably couldn't for political reasons.

If the problem is that certain lessons are marked as "for the test" and others are "just for your curiosity, which most of you don't have anyway", the solution is to include *all* the lessons in the test. Like, instead of saying "these 5 lessons are mandatory and these 5 are optional", you could go "these are 10 lessons, but you only need to succeed at 5 of them to get the certificate, although you will be marked as exceptional if you succeed in all 10 of them".

Now the students who only complete the 5 formerly mandatory lessons will pass, but better students are motivated to learn all lessons to get the best result, and even the worse students are motivated to somewhat pay attention to all lessons because any extra point obtained from the formerly optional lessons can cover for a point lost at the formerly mandatory lessons.
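As a minimal sketch of that rule in Python: the 10 lessons, the threshold of 5, and the labels "exceptional", "pass", and "fail" are illustrative assumptions, not any real certification scheme.

```python
def certificate_level(passed_lessons, total_lessons=10, pass_threshold=5):
    """Classify a student by how many distinct lessons they succeeded at.

    Any lesson counts toward the threshold, so a point lost on a formerly
    mandatory lesson can be recovered on a formerly optional one.
    """
    passed = len(set(passed_lessons))  # ignore duplicate entries
    if passed >= total_lessons:
        return "exceptional"
    if passed >= pass_threshold:
        return "pass"
    return "fail"


# Hypothetical students, with lessons numbered 1-10 (1-5 formerly mandatory).
only_mandatory = certificate_level([1, 2, 3, 4, 5])
mixed = certificate_level([1, 2, 3, 7, 9])  # missed two mandatory, made up two
everything = certificate_level(range(1, 11))
too_few = certificate_level([1, 2, 8])
print(only_mandatory, mixed, everything, too_few)
```

The second student is the interesting case: under the old split they would fail, while here the two formerly optional lessons cover for the two they missed.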

The political problem with this is the general opposition against "too smart" kids. My model is that powerful people typically have kids who are above average, but not too high above the average. Therefore they want to design a system where *their* kids get on the top. A system where all people are equal, would not satisfy this. But a system where all people are ranked precisely according to their skills would neither; they want to be at the very top, not merely at like 80th percentile. The optimal solution makes a cutoff exactly where your kids are -- so it clearly separates them from the average muggles, but puts them in the same group with all the actual geniuses. (If your kids have IQ 115, you want the highest category to be called "kids with IQ 115 and higher" without any internal distinction.) The current grading system is suspiciously similar to this, which probably is not a coincidence.


-"the solution is to include *all* the lessons in the test"

Uh, the teacher doesn't get to decide what's in a standardized test.


It's not grade inflation so much as grade compression. Differences are still important, but *now* they are the difference between a B- and an A+, whereas they used to be between an A and a C or D. It would not surprise me if in the not too distant future people start inventing A-+ and B+- grades.


Viliam's story is why teaching and grading are not separated. If your students aren't learning much and the outside testing shows that, you are screwed. But if you are doing your own testing, there are lots of things you can do. You can make the test easier. You can mark it more liberally. You can "scale" the test, giving everyone an extra 5 or 7 points. You can review the day before, hitting everything that will be on the test and nothing else. You can do that and tell the students, "This is what you have to know for the test tomorrow." The day before, you can give students a practice test that includes all the questions that will be on the real test, just rephrased and rearranged. Then go over the practice test. Tell them, "If you know everything on the practice test, you will do well tomorrow."

Thus, you can make sure that all but the truly stupid or uncaring will pass and you will not be called into the department head's office.


The outside testing does often expose poor teaching. This is why many elite private high schools shun AP exams.

It is easy to find bad things to say about these AP tests, but they force the school to do a certain relatively hard curriculum, and the results are quite revealing.

These schools would rather not expose to the world, how weak their students are in the subject matter.

That said, there is a certain element of having to teach to the test, especially in exams with subjective essay questions. But some high schools blow that way out of proportion in order to escape accountability.

I can privately share the names of 2 such schools. They charge an arm and a leg for a free.

You never see their students do well in any hard academic high school competitions such as the Olympiads, either.


If the test is good, you should teach to the test. The test is supposed to assess knowledge of the Standards and you're supposed to teach the Standards. That part is pretty straightforward.

Alas, it is extremely hard to come up with a test that actually tests long-term understanding, and that also takes a reasonable amount of time to administer and grade. As a former high school science teacher, I was constantly trying to do better when it came to my own "assessments". Which is to say that I felt I never got very close.


I agree. There is a whole range of abilities in a class though. How do you address that, when testing a specific concept?


As a college instructor, I'm very, very skeptical of research in this area. It's more or less a dogma - and has been for a while - that evals are useless. In the same way that all standardized testing, grades, blind grading, etc. are "useless." I've been half-convinced for a while now that the uncomfortable truths that evals reveal are the real problem.

The one caveat that I'll make is that there's possibly a preference for more "lenient" professors or classes, but even that could be a function of more skilled instructors having the ability to appear lenient while still maintaining standards.


Maybe most people are only interested in the credential college provides, not the learning. This was a key point made in the book "The Case Against Education".


This implies knowledge gain is the ultimate goal of teachers. One can make an argument that other things like character development (conscientiousness, determination) or instilling values are as important or more important a goal. I would shoot for a mixed basket of measurements that correlate with living a good life as your measurement against parent desires. When I think back on who my 'best' teachers were, my selection was not based on knowledge gain alone.


This is what my step-father (a doctor) says about choosing a doctor: Ask the nurses. They all know.


I agree with you. Of course, the issue is you can't turn that into an Objective Metric applicable Across Schools. I think you can't get around human discretion, and the necessity of principals, local knowledge, and local control in the school. As soon as you introduce a metric, it becomes gameable, and there are plenty of examples where it fails.

And also teachers can be good in different ways. One teacher might not have engaging lectures, but after all that work you KNOW the subject. Or they can be hardasses, running on discipline, and again you come out KNOWING the subject. Or you can have a teacher who is very positive and motivational; which can work well, and might work well with motivated students but not with students who DGAF.

You can have a teacher who is very good at teaching the gifted students: knows the breadth and depth of the subject. You can have a teacher who isn't great at advanced subjects, but he knows how to teach the slowest of the slow so that they actually pass the test.

Most teachers will have better and worse years, better and worse classes, better chemistry with a subject, with a curriculum (especially in an era of frequent curricular changes), with a class, with a specific student. It's really hard to quantify.


Sure, but the problem isn't picking out the *best* teacher in the entire school, the problem is ranking *all* the teachers, so you can tell Ms. A is 5% better than Ms. B, and so on, until you arrive at the point where you let people go -- which is where the difference is between "unacceptably bad" and "just barely acceptable." Judging *that* line is way, way harder than just picking a superstar. This is one of those cases where you can't say "at the extreme margin this problem is easy to solve! and therefore (by extrapolation) it ought not to be that hard down in the gray muddled middle." In fact, it gets much, much harder as you get to where you actually need it to work.


Teachers on the bubble tend to, like all people, sometimes do better and sometimes do worse. They have 9-10 months a year with students, so their good days and bad days can happen months apart even during the same year. A marginal teacher will be counseled, probably do better for a period of time (to the point that they shouldn't be fired) and then maybe droop for a while again. Given even legitimate teacher protections, it would be difficult and counterproductive to fire a teacher that sometimes isn't up to par, but often is. Some schools only hire rockstar teachers and have really high standards, but there are simply too many teachers and too many positions for that to be standard.

At most schools there aren't many, if any, teachers that are just so bad they should be fired outright. In the schools where those teachers filter to, they can't afford to fire bad teachers, because that's all they get. I know a state public school that pays their teachers about 60% of what the other local public schools do. They get what the other schools don't hire, and they still have really high turnover. They aren't going to fire people except in extreme cases.


I don't have a problem with that. In fact, I think it's a good example of a properly functioning free market in action. There *should* be really crappy schools where really crappy teachers can get hired, and people who have no better options can send their kids to get crappy educations -- because a crappy education is better than no education at all, just like a shitty used car that breaks down every 2 weeks is better than no car at all. If we insisted that no one could buy, or own and use, any car other than a $50,000 BMW i3 series in perfect condition, a heck of a lot of people who can now drive to work would have to walk, reducing the quality of their lives considerably.

The insistence that *everyone* get identically maximally-optimized education, regardless of his or her ability to pay, ability to make use of it, or interest, is as delusional and destructive and wasteful in education as it is in medicine.


The difference is that in education, the really shit schools still cost a lot of money (to the taxpayer, not the parents of the kids attending), and in fact often cost more than the good public schools (though obviously still less than elite private schools who filter on money)


Well, but that's because the government distorts the market, as it usually does. I think everyone *suspects* based on historical experience that if government did *not* intervene, then crappy neighborhoods would have crappy schools with much less money than good schools in good neighborhoods.

It is interesting, though, to speculate what effect the government distortion has on teaching careers. Surely a number of people who are bad at teaching nevertheless go into it because the money is better than their talent has a right to, provided they deploy their (meager) talents in crappy schools that are the beneficiaries of government attempts to shovel back the tide. If government did *not* intervene, and poor neighborhoods had poor schools that paid poorly, that option would disappear.

What would those people do in the *absence* of the distortion? Not go into teaching, surely, but then...what? What is related to poor teaching but pays better in a free market? Televangelism? Con artist? Cult leader?


Fantasy football, but with grade school teachers.


"How many minutes do you spend in Codex-like tool?" is a terrible way to phrase the question. Google already has AI-driven auto-completion in its internal code editor. I don't see how to measure its usage in minutes. (You could theoretically count _saved_ minutes, but it's not easy to estimate.)

Question to any programmer: How many minutes per week do you spend in the auto-complete of your editor of choice?


At least for me the main time saving from autocomplete comes from not needing to dive into code to find the right function name or other identifier. The speed of coding is highly irregular, and average WPM may not be representative of the time saved by using autocomplete.


My programming speed is mostly limited by figuring out what I want to say, not by typing speed. Auto-complete either quickly fills in something that I already know I want to say, in which case the time saved is related to my PEAK typing speed (vastly faster than my average), or it helps me identify a function name that I couldn't immediately have recalled from memory, in which case it mostly saves the time of looking up that information in a different tool (which has essentially nothing to do with the number of characters in the name).

Either way, characters auto-completed divided by my average typing speed would be an extremely weak estimate.
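A rough sketch of the difference in Python. Every number here (typing speeds, lookup time, the three completions) is a made-up assumption, and the two-case model is only the one described above, not a measured result.

```python
def naive_saved_seconds(chars_completed, avg_chars_per_sec):
    """The weak estimate: completed characters divided by average typing speed."""
    return chars_completed / avg_chars_per_sec


def modeled_saved_seconds(events, peak_chars_per_sec, lookup_seconds):
    """Estimate savings per completion using the two cases described above.

    events -- list of (chars, was_known) tuples; was_known is True when the
              programmer already knew what they wanted to type.
    """
    total = 0.0
    for chars, was_known in events:
        if was_known:
            # Known text would have been typed at peak speed.
            total += chars / peak_chars_per_sec
        else:
            # Unknown names save a lookup whose cost ignores name length.
            total += lookup_seconds
    return total


# Three hypothetical 20-character completions; one needed a lookup.
events = [(20, True), (20, True), (20, False)]
naive = naive_saved_seconds(60, avg_chars_per_sec=2.0)
modeled = modeled_saved_seconds(events, peak_chars_per_sec=10.0, lookup_seconds=45.0)
print(naive, modeled)  # 30.0 49.0
```

With these assumed numbers the naive estimate overstates the savings on known text (30 seconds versus 4) and misses the lookup entirely, so the two figures can diverge in either direction.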


Visual Studio also already has an AI-assisted auto-completion feature (IntelliCode) which is trained partially on global code repositories and partially learns based on your own code base.

There's no qualitative distinguisher between the current feature that I use every day and Codex. Codex is a bit more aggressive and capable, but it's the same tech.

So... do I expect to be using a more aggressive and capable version of IntelliCode in 5 years? Of course. I fully expect Microsoft to improve their product quite a bit!


Not exactly. IntelliSense orders the suggestions, but that's usually it. Codex (at least Copilot) suggests whole fragments of code.


IntelliCode, not IntelliSense. It's a newer experimental feature. Among other things, it will observe me making two similar transformations to properties (say, adding a notify implementation to each) and then propose applying the same transformation to all the rest of the properties. (And my notify implementation is idiosyncratic; it doesn't know what I'm implementing.)


I have access to GitHub Copilot. It's active all the time because that's how extensions work.

> Question to any programmer: How many minutes per week do you spend in the auto-complete of your editor of choice ?

Do you mean "choosing between different options of the auto-complete instead of actually writing code"? It's hard to say, and not a good metric, as you will spend less time in a good autocomplete. A good metric would be "what percentage of your code has been written by autocomplete, fully?" and "what percentage of your code has been written by autocomplete, partially?".
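Those two percentages are easy to compute if you have per-line provenance, which is the hard part. A minimal sketch in Python, assuming a hypothetical log that labels each line of code as "manual", "partial", or "full"; no editor is known to export exactly this.

```python
from collections import Counter


def autocomplete_share(line_origins):
    """Return the percentage of lines written fully and partially by autocomplete.

    line_origins -- list of labels, one per line of code: "manual",
                    "partial", or "full" (a hypothetical logging format).
    """
    total = len(line_origins)
    if total == 0:
        return {"full": 0.0, "partial": 0.0}
    counts = Counter(line_origins)
    return {
        "full": 100 * counts["full"] / total,
        "partial": 100 * counts["partial"] / total,
    }


# Ten made-up lines: two fully completed, two partially, six typed by hand.
origins = ["manual", "full", "partial", "manual", "full",
           "manual", "manual", "partial", "manual", "manual"]
share = autocomplete_share(origins)
print(share)  # {'full': 20.0, 'partial': 20.0}
```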


This comment uses autocomplete but it’s what I would have written anyway. For most code I’m sure it’s largely the same.


> If you treat “teachers whose students get high test scores = good”, then you’ll just promote teachers who work in rich areas, or who get lots of smart students, or some other confounder related to student selection effects.

Give the class to the teacher that will accept the lowest base salary. A class that will predictably have worse scores, granting a lower grades bonus, will grant a higher resulting base salary. Teachers that prefer teaching smart kids can accept a lower salary to do so. Teachers that prefer making money will go for the class where they can improve scores more than others.
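A minimal sketch of that assignment rule in Python. It assumes sealed bids, one class per teacher, and a greedy pass over the classes in name order, which is a simplification rather than an optimal matching. The teachers, classes, and salaries are made up.

```python
def assign_classes(bids):
    """Give each class to the available teacher asking the lowest base salary.

    bids -- dict of class name -> dict of teacher -> base salary asked.
    Returns a dict of class name -> (teacher, base salary).
    """
    assignments = {}
    taken = set()
    for class_name in sorted(bids):  # fixed order keeps the result deterministic
        offers = {t: s for t, s in bids[class_name].items() if t not in taken}
        if not offers:
            continue  # nobody left to teach this class
        # Lowest salary wins; ties are broken alphabetically by teacher name.
        teacher = min(offers, key=lambda t: (offers[t], t))
        assignments[class_name] = (teacher, offers[teacher])
        taken.add(teacher)
    return assignments


bids = {
    "honors": {"ann": 40000, "ben": 55000},
    "remedial": {"ann": 70000, "ben": 60000},
}
result = assign_classes(bids)
print(result)  # {'honors': ('ann', 40000), 'remedial': ('ben', 60000)}
```

In this made-up example the class with worse expected scores ends up with the higher base salary, which is the compensating effect the proposal relies on.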


This is basically what DC public schools try to do, except it's at the school level, not the class level. Potential bonuses are lower for better schools and higher for worse schools. At least this is what they were doing 10 years ago when a number of my friends were DC public school teachers. I don't know if it worked out.


This is basically already how it works, but on the school level. When I first started looking at teaching jobs in [big municipality] it looked like I could make ~$100k working in the big city, ~$40k working in a high-achieving suburb, or ~$50-60k working at a private school. I chose the private school option.


In my metro area, private school teachers make significantly less money, both because the schools rely on many teachers doing it to get reduced/free tuition for their own kids, and because they don't require the teaching certificates and degrees that the public schools do (but will take subject matter expertise).


https://www.gwern.net/CO2-Coin has some good ideas on a CO2 Blockchain. The piece contains some interesting tangents on prediction markets.


“and we should also replace all lower courts with prediction markets about what the Supreme Court would think”

We should replace the Circuit Courts of Appeals, though.

We should not replace courts of first instance, though. There is a need for some organisation to create a transcript of the evidence for the prediction market to work with - and also to produce decisions in the cases that aren't especially controversial. The Federal District Courts do this for the federal system and are necessary; it is their transcript of evidence that the higher courts rely upon when generating a verdict, and that is also what the prediction market participants need.


I'm a doctor, and I've suggested to colleagues, even wrote a concept paper for a small conference, that we use prediction markets for prognosis, diagnosis, and treatment of patients. It has gone over as one might expect.

author

I've also heard Eliezer expound on this at length. I disagreed with him at the beginning, and we had a long argument that involved creating lots of complicated epicycles that would be necessary to keep it fair, but I updated to believe that once you add all the necessary epicycles it really would work shockingly well.


I was gonna make a joke about the Supreme Court prediction algorithm replacing lower courts, and a Cleveland Clinic/MD Anderson/UCSF/Brigham and Women's prediction algorithm replacing... all doctors, until I realized that I wasn't joking.


Any chance it was a written disagreement posted somewhere public? My first thought would be that sufficient epicycles might very well do a good job patching the misaligned incentives, but that they tend to attenuate profit motive beyond the point of usefulness. Having a discrete proposal makes crunching the numbers tenable.


I expect that the response was negative, but maybe you could tell us more about it?


Unless you're proposing to donate all proceeds to e.g. the winners' chosen medical non-profits, I am not surprised that doctors refused to consider it.

Also, if a prognosis prediction market came to be embraced by doctors, legislators might seize on it as a way to ration e.g. Medicare/Medicaid (US) spending. What I mean is that legislators might see the prediction market as a way to implement healthcare budget cuts without taking as much responsibility for setting the rules of how the rationing will occur. That might slightly reduce the 'Death Panels' rhetoric the politicians face, at the cost of imposing more of it on doctors themselves.


Your phrasing in the second paragraph makes it sound like an objection, but I'm not sure why. Given that rationing is necessary as resources are not infinite, surely making those decisions based on more reliable predictions is a good thing?


That's a fine question. I am ignorant of healthcare funding, so I can't develop a reasoned opinion on the least undesirable form of rationing. In spite of my curiosity about how some sort of futarchy might actually work (starting with less contentious issues than medical care), let's suppose I do in fact oppose prognosis-market rationing. Here's one consideration:

Imagine a well-developed prognosis market. Congress cuts the healthcare budget by 10% (after years of automatic increases), with the cuts to be borne by those with poor prognoses according to the prognosis market.

Now, three patients of the same age with lung cancer:

-X never smoked; X has a 30% chance of survival with treatment.

-Y gave up smoking 40 years ago; Y has a 31% chance of survival with treatment.

-Z gave up smoking upon diagnosis this week; Z has a 32% chance of survival with treatment.

According to the prognosis market, rationing prioritizes Z's treatment. Is this desirable?

If yes, now imagine Z is lying about giving up smoking. In fact, Z continues to smoke. If this was known, the market would reduce Z's chance of survival with treatment to 25%. But it isn't known, so Z is still at 32%.

Most likely my thought experiment is recapitulating decades of debates among physicians (I think of alcoholic George Best's liver transplant), but probably not in the context of a market.
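
The three-patient example above can be written out as a tiny ranking exercise. This is only a sketch: the probabilities are the ones given in the comment, while the field names (`market`, `true`) and the function `priority_order` are made up for illustration. It shows how the priority order changes when the market price for Z rests on a false report.

```python
# Survival probabilities from the thought experiment above.
# "market" is what the prognosis market believes; "true" is what it would
# believe if Z's continued smoking were known. Both field names are hypothetical.
patients = {
    "X": {"market": 0.30, "true": 0.30},
    "Y": {"market": 0.31, "true": 0.31},
    "Z": {"market": 0.32, "true": 0.25},  # Z is secretly still smoking
}


def priority_order(patients, key):
    """Return patient names sorted from highest to lowest survival probability."""
    return sorted(patients, key=lambda name: patients[name][key], reverse=True)


by_market = priority_order(patients, "market")
by_truth = priority_order(patients, "true")

print("Priority using market prices:", by_market)
print("Priority using true prognosis:", by_truth)
```

Under the market's numbers Z is treated first; under the true numbers Z drops to last, which is the gap the lie exploits.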

I would say it's desirable (but fairly unlikely to happen in that direction in the real world).

In the UK, I believe the NHS does make rationing decisions on the basis of predicted QALY, though I don't know if predictions are made for individual patients or just broad demographics.

It's also a topic of contention for insurers and banks, who would *love* to set their prices based on more accurate predictions, and are always trying to gather everyone's data and catch & punish liars. They're opposed by the people most at risk (who want their burden carried by others), and people who care about privacy, and people who subscribe to some kinds of egalitarianism (you might be able to make more accurate predictions if you are allowed to care about race or class or gender, BUT...).

Well, like AI (= well-trained pattern recognition algorithm) it would do very well at reducing errors in quotidian diagnoses, but it would fail spectacularly at the margins, where Unusual Dangerous Condition X can easily be mistaken for Boring Mostly Harmless Condition Y. Given that the latter is where your professional pride kicks in, and where the jury awards the most damages in the malpractice suit, I can see why your experienced practitioner recoils.

For "how do you know how good a teacher is", the obvious answer to me is "use standardized tests and use a model to adjust for expected student performance".

The hard part isn't in constructing the model (you could use a Prediction Market, or you could just hire two college students to write R for a month), it is in constructing the standardized tests.

Both "what should be tested" and "does the test actually measure that" are Hard problems that I'm not sure the government can actually solve in the long term; it seems certain to me that most school districts aren't solving that well today.
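
For a sense of what "a model to adjust for expected student performance" might look like at its very simplest, here is a minimal sketch. It assumes one prior-year score per student and fits a single straight line across all students; the records, the teacher labels, and the function names `fit_line` and `value_added` are all invented for illustration. A real value-added model would control for far more than one prior score.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (teacher, prior_year_score, this_year_score).
records = [
    ("A", 50, 58), ("A", 70, 76), ("A", 90, 94),
    ("B", 50, 52), ("B", 70, 70), ("B", 90, 88),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def value_added(records):
    """Average (actual - expected) score per teacher, where 'expected' comes
    from a line fitted on every student's prior-year score."""
    xs = [prior for _, prior, _ in records]
    ys = [score for _, _, score in records]
    slope, intercept = fit_line(xs, ys)
    residuals = defaultdict(list)
    for teacher, prior, score in records:
        residuals[teacher].append(score - (intercept + slope * prior))
    return {teacher: mean(vals) for teacher, vals in residuals.items()}


scores = value_added(records)
for teacher, score in sorted(scores.items()):
    print(f"Teacher {teacher}: {score:+.1f} points vs. expectation")
```

The modelling step really is short. Everything that matters is hidden in whether `this_year_score` measures the right thing, which is the point made above.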

One of the problems with "standardized tests" is whether they should be SAT-like or AP-like. Note that most countries have graduation tests that are AP-like, ie they test knowledge imparted in class, rather than "scholastic aptitude", which is at least intended to be abstracted from what the student has actually learned.

In practice, measuring how much a teacher has contributed to a student's improvement on an AP-like test will be orders of magnitude easier than measuring the contribution to improvement on an aptitude test.

I would expect AP-like tests to create better incentives. "Teaching to the test", for an AP-like test, is likely to involve a significant amount of genuinely valuable teaching.

This is one reason I prefer AP-type testing for university entrance to SAT-type - it incentivises "learning to the test" resulting in learning something useful.

There is a problem if you want to examine people at the end of their high school (i.e. in May/June) and then allocate university places to them by the beginning of the next year (August/September): if students can't apply until they have their results, you have only a couple of months to resolve this.

There are two possible options: one is to apply before results and get offers conditional on achieving the necessary grades (the UK system); the other is to arrange to have an extended period between high school and university (e.g. change university years to start in January).

Note that the UK system requires a large "clearing" operation between grades being announced and term starting so that students who missed their conditional offers and universities with spare places can be matched up with each other. It also can turn grade inflation into a crisis; a mid-ranking university with 500 places might make 750 offers, knowing that only 400 or so will actually make the grade and it can then fill up in clearing from the students who missed grades at high-level universities. But if there is sudden grade inflation and 600 make the grade instead of 400, then the university has a crisis - they don't have enough academic staff, or physical space, or accommodation. This happened in 2020 when COVID made examinations impossible (groups of a hundred or more people gathered in a large, poorly-ventilated indoor room for 2-3 hours in May 2020; obviously not) and the government decided to allow class teachers to just make up a grade for each student (even if those were fair and not inflated grades, that still means that the kid who is sick on exam day, or has a panic attack, or whatever, doesn't get a lower grade; that's grade inflation too).
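
The over-offer arithmetic in that example is easy to check. A minimal sketch, using only the numbers quoted above (500 places, 750 offers, 400 versus 600 students making the grade); the function names are made up for illustration.

```python
PLACES = 500   # places at the hypothetical mid-ranking university
OFFERS = 750   # conditional offers it makes


def intake_gap(places, made_grade):
    """Positive: spare places to fill in clearing. Negative: students over capacity."""
    return places - made_grade


def share_of_offers(offers, made_grade):
    """Fraction of offer-holders who met their conditions."""
    return made_grade / offers


normal_gap = intake_gap(PLACES, 400)     # a typical year
inflated_gap = intake_gap(PLACES, 600)   # a year with sudden grade inflation

print(f"Typical year:  {share_of_offers(OFFERS, 400):.0%} make the grade, gap = {normal_gap}")
print(f"Inflated year: {share_of_offers(OFFERS, 600):.0%} make the grade, gap = {inflated_gap}")
```

A swing from roughly half to 80% of offer-holders qualifying turns 100 spare places into 100 students the university has no room for.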

Or one could just make the system more efficient and get it done in 2 months; here in Australia the final exams are in October, grades come out early December, and university starts end of February. Every student puts in their applications, university and major (max of 10 applications, I believe), then the universities take students in order of their marks. There are 3 rounds of this to allow for the slots freed up by students who say no to go to students who missed the cut-off, and by end of January it's basically all sorted.

The things that make this work are that every student has a single grade, standardised nationally, and that it's calculated as a percentile so the number of students with each precise grade is constant and known. It also helps that our top universities are big, not niche private ones with limited capacity.
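
The core of that process, "universities take students in order of their marks", can be sketched as a single pass over a ranked list. This is a deliberate simplification: it models one round only, ignores declined offers, and the student names, marks, courses, and capacities are all hypothetical.

```python
def allocate(students, capacity):
    """Place students in descending order of mark into their highest
    preference that still has room.

    students: list of (name, mark, ordered list of preferred courses)
    capacity: dict mapping course -> number of places
    Returns a dict mapping name -> course (or None if nothing was available).
    """
    remaining = dict(capacity)
    placement = {}
    for name, _mark, prefs in sorted(students, key=lambda s: s[1], reverse=True):
        placement[name] = None
        for course in prefs:
            if remaining.get(course, 0) > 0:
                remaining[course] -= 1
                placement[name] = course
                break
    return placement


students = [
    ("Ana", 99.0, ["Medicine", "Science"]),
    ("Ben", 95.5, ["Medicine", "Science"]),
    ("Cai", 90.0, ["Science", "Arts"]),
    ("Dee", 80.0, ["Science", "Arts"]),
]
capacity = {"Medicine": 1, "Science": 2, "Arts": 1}

result = allocate(students, capacity)
for name, course in result.items():
    print(name, "->", course)
```

Because every student has one nationally comparable mark, a single sort settles all the conflicts. The later rounds described above would amount to re-running this after removing students who declined their offers.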

You can solve the 2100 prediction problem with another kind of recursion - have them predict what the market 20 years from now will predict (conditional on it existing). And of course the hypothetical 2041 market will predict what the 2061 market will predict, and so on.

author

Yeah! I just thought of that too! I'm having trouble figuring out why that's different from a normal market (surely if you buy stocks now and sell them in 2040, that's much like having a prediction market on what stocks will be in 2040). I think the right answer is something like that you can amplify the price difference with a prediction market. That is, if the probability of nuclear destruction by 2100 is 30% now, and likely to be either 29% or 31% in 2040, then it's not worth people's time to invest in a normal market, but if you were predicting what the 2040ians will predict, you can have the people who correctly guess 31% get *all* the money of the people who incorrectly guessed 29%, and then they double their money instead of +1%ing it.
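
The difference between the two payoff structures can be put in numbers. A rough sketch, assuming a share bought at 0.30 and resold at 0.31 in the ordinary market, versus a winner-take-all pool where the correct side splits the incorrect side's stakes; the function names and the $100 pool sizes are assumptions for illustration.

```python
def resale_return(buy_price, sell_price):
    """Fractional profit from buying a share now and reselling it later."""
    return (sell_price - buy_price) / buy_price


def pooled_bet_return(winners_total, losers_total):
    """Fractional profit per dollar staked when the winning side splits the
    losing side's stakes in proportion to what each winner put in."""
    return losers_total / winners_total


# Ordinary market: price drifts from 30% to 31% by 2040.
normal = resale_return(0.30, 0.31)

# Pooled bet on "31% or 29%?" with equal money on each side (assumed $100 each).
pooled = pooled_bet_return(100, 100)

print(f"Resale in the normal market: {normal:+.1%}")
print(f"Winner-take-all pool:        {pooled:+.1%}")
```

On these numbers the resale earns a few percent while the pool doubles the winners' money, which is the amplification being described.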

That's just leverage, though, isn't it?

The normal market already contains this market; you just commit to selling your bet in 2040, so it doesn't really solve the problem.

Relatedly, if you were to learn 2 pieces of insider info that point in different directions, it can be profitable to move the market away from your best estimate, release one of the pieces, earn money, move the price again and release the second piece. You would end up earning more than if you only bought/sold once and released both pieces of evidence at once.

Just in case anyone is reading this and thinking this is a great idea (and disclaimer, nothing I say is investing, legal, or any other kind of advice), this is illegal in many jurisdictions.

I had a discussion about something similar in the comments before. My takeaway was that it would be gameable unless you were willing to commit to the full term.

If you had a crystal ball and knew there was a 0% chance of x by 2100, then you could bet big that 0% will be the consensus opinion in 2040, but if the consensus isn't correct in 2040 then you either have to play again or eat the loss.

The protection from gaming the system is that people who are wrong lose money. If you can't be proved wrong in intermediate markets then nothing stops manipulation. It will eventually be found out when the market bottoms out in 2100, but the problem was you didn't want to wait till then.

Regarding using prediction markets for long-term predictions. I discussed it with Polymarket's founders and they are aware of the problem, but choose to ignore it for now, by limiting PM to short-term predictions.

A solution for longer-term markets would be to make stakes in a currency that grows at the rate of the market. The guy from Polymarket mentioned Compound USDc, but in my eyes it would be much easier to just make stakes in ETH. Considering that you don't need to wait until the market is resolved and can sell your stake at any time, I don't see why it wouldn't work.

>Vitalik didn’t end with “and we should also replace all lower courts with prediction markets about what the Supreme Court would think”, but I’m not sure why not.

I'd disagree with this, even if we were good at executing it. I think the lower courts help shape SCOTUS decisions on newer/more controversial cases (on others, they do just apply precedent, and do ok at it). It's good for the problem to be fairly worked over by the time it reaches SCOTUS, and for the justices and clerks to have a number of perspectives to sort through.

A lot of my opinion on SCOTUS was shaped by a stage play, Arguendo, that simply staged oral arguments from a case about whether exotic dancing was first amendment protected speech. (I reviewed the show here: https://www.the-american-interest.com/2014/09/15/the-art-of-argument/).

The Justices seemed less focused on justice, or even on applying past precedent, than on crafting a narrow decision that would be minimally harmful when it came back to bite us all as Established Precedent. And thus, seeing how lower courts wrestle with the issue, what they think different decisions would imply down the line, etc. is helpful.

This is an interesting point. SCOTUS decisions are never really about the issue at hand; they're about influencing lower-court decisions on all the related cases. It's sort of like the CDC making a recommendation on face masks. According to the CDC's data, recommending face masks does not increase mask-wearing, but does increase violent attacks on non-mask-wearers. Since you're the CDC, broken bones are not under your purview, but those violent attacks do increase the probability of disease transmission, so the CDC decides against recommending masks.

Isn’t the best way to evaluate teachers to let kids choose their next teacher? Sure, some kids will opt for a soft option, but if that’s what they want, who are we to argue? I would bet most teenagers have a pretty good understanding of what’s in their best interest and can easily get feedback from the current pupils. If a teacher gets little or no uptake, it’s a pretty good signal that they are not providing what the market wants.

author

But that is evaluating with little skin in the game.

author

First, this is assuming really good networks where students learn from older students who the good teachers are. I felt like this *sometimes* happened at my school and sometimes didn't.

Second, I think everyone except a few really driven kids would go to whichever teacher promised the least homework.

Third, I think "gives little homework" is much easier for kids to assess and communicate than "teaches well" (especially compared to other teachers in the school that kid hasn't had) so this would amplify that.

Well, at least at my school there was lots of information on which teachers provided a good service (or, more crucially, which were bad teachers), but maybe this isn’t true of all schools. On the second point, isn’t there lots of evidence that homework is not helpful in kids learning? And finally, sure, hyperbolic discounting is a big problem with many kids, but I can’t believe forced participation in a class where they hate a teacher is a good answer to that. Maybe under this system teachers would have to think about how to motivate their kids and keep them interested.

Probably cookies, grade inflation, and watching lots of supposedly educational videos. I took a continuing ed course on nonverbal communication once. It involved watching a bunch of Hollywood movies on a VCR and then discussing the nonverbal communication in the movies.

Grade inflation and poor teaching are addressable via standardised testing, and giving out cookies to students seems like behaviour that is largely fine to universalise (and would probably be less alluring when common).

What if it happened to be the case that most teachers really do give too much homework?

>I think everyone except a few really driven kids would go to whichever teacher promised the least homework.

If you search, you can find that a lot of people do think too much homework at schools is a genuine problem. One example is https://www.theatlantic.com/education/archive/2021/04/homeroom-how-much-homework-too-much/618580/ .

One reason that it's a problem is probably because of precisely the dismissal you're making--students are the most informed about it and the most affected, but they can easily be dismissed because they're students.

If you think of education as an ordinary consumer good, where the consumer knows best what they want to get out of it, then this might work. But education is a good where the consumer by definition *doesn't* know many of the most important things about what they're going to get, even in addition to the point Scott mentions about students currently wanting less homework even though a few years from now they will want to have had good preparation for hard things.

Hmm, but we are happy to let college level folks decide on what and how they will learn, despite “not knowing” what the most important things are in the subject they are studying.

One thing I would suggest is that we look at how this problem is approached by people outside the formal school system, like people choosing a sports coach or a dance coach or a private language teacher. It seems to me that pupils there are well aware who's a good teacher on an overall basis (not just a soft one).

Finally, it really is disturbing that people seem to think we have to force kids to learn using a tough teacher that kids generally will hate. I can’t believe that is conducive to a good learning environment.

College students still usually have a set of course requirements they have to pass - there's a certain realm of control that makes sense, and certain realm that doesn't.

I agree that the comparisons you mention are good ones.

I don't think too many people think "we have to force kids to learn using a tough teacher that kids generally will hate" - I don't think the best teachers are almost ever ones that the students hate, but I also think they're not that often the ones students love most. Being popular with the students, and even making the students *feel* like they're getting a lot out of class, isn't always that strongly correlated with helping the students get a lot out of class.

They're not comparable. If you're in sports, you can see the results of hard work and/or better coaching almost immediately. Sure, your eye may be on Olympic gold, but you can measure how much better you're getting -- on a conveniently objective* scale -- by whatever competition is going on at your level right now. You win the juniors' age 8-9 state competition or whatever.

Education is completely different, because any objectively measurable payoff (e.g. your starting salary) is decades away from the moment of decision. Every test and grade and other metric is an attempt to create a proxy for the ultimate payoff, and every one suffers from the usual risk of gross inaccuracy attendant on crystal-ball gazing forward 10-20 years. We can't even use "what worked for the generation before you" (which is objectively measurable) because conditions change. Believe it or not, studying computer programming in the early 1970s was kind of a weird and risky career choice, but of course those who made it in fact made out like bandits, as it luckily turned out.

-------------------

* By educational standards. Actual athletes will roll their eyes at the notion that athletic competitions are purely objective, but by the standards of what we do in academia, they are much more so.

Adam Smith thought the right way to run a university was to have the professor directly paid by the students who chose to take his class.

Probably worked fine in the 1800s when people went to university because they were actually interested in learning things rather than acquiring credentials to say that they've learned things.

It still works great for other fields of endeavour where credentials don't exist (e.g. music lessons).

Come to think of it, it still works great in fields of endeavour where credentials do exist, such as driving lessons. You pay an instructor for driving lessons, and then you go take a test and pass or fail, and it all works out pretty reasonably. The key seems to be that the people who do the testing are totally separate from the people who do the teaching.

Yes, this external independent grading, as done in most of the world, is what I was thinking of.

I wish we could do this. One of my classes did the math for this just last week--each student individually pays more for a single class period than an adjunct professor earns for that same time (it was like $75 per student and $62 TOTAL for the adjunct for the same hour of instruction). Which is disgusting.

Sure, but we need to supplement it with a side-market in personal bonds. For example, it costs $100 to take my thermodynamics course. The student says, I can't afford that, but I can sell you (the professor) a bond for 0.1% of my salary in the year Now + 10. How much will you pay for that? The prof says $90. So the student gives him $10 in cash and the bond, and takes the class. If he does well and becomes rich, he pays off the bond and the professor has a good pension. If the student does poorly, because the professor misjudged either his instructional talent, the future market for the knowledge, or the student's quality, then the professor has to eat cat food in retirement.

Older established professors with reliable real-world results won't pay much for bonds, so most of the time their students will have to pay cash, which is appropriate. Young turks who think they know a better way will have to work for peanuts and accumulate a massive stack of bonds, and the more avant-garde they are, the bigger the leverage, which means they might get very rich indeed if they succeed, incentivizing experimentation at a certain level.

The students, too, can choose how much to spend now and how much of their future earnings to bet, which puts them in the right frame of mind for choosing instructors.
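
The thermodynamics example implies a break-even salary for the professor, which a few lines can make explicit. This is a sketch under strong assumptions: it ignores discounting over the ten years, inflation, and the chance the student never pays, and the function names and the $120,000 salary are hypothetical.

```python
COURSE_PRICE = 100      # dollars, from the example above
BOND_PRICE = 90         # what the professor agrees to pay for the bond
SALARY_SHARE = 0.001    # the bond pays 0.1% of salary in year Now + 10


def cash_owed(course_price, bond_price):
    """Cash the student still has to hand over after selling the bond."""
    return course_price - bond_price


def bond_payoff(salary, salary_share):
    """What the bond pays the professor, given the student's future salary."""
    return salary * salary_share


def breakeven_salary(bond_price, salary_share):
    """Future salary at which the bond pays back exactly what it cost."""
    return bond_price / salary_share


cash = cash_owed(COURSE_PRICE, BOND_PRICE)
breakeven = breakeven_salary(BOND_PRICE, SALARY_SHARE)
payoff = bond_payoff(120_000, SALARY_SHARE)  # hypothetical future salary

print(f"Student pays ${cash} in cash")
print(f"Professor breaks even if the student earns about ${breakeven:,.0f}")
print(f"At a $120,000 salary the bond pays ${payoff:,.0f}")
```

Before any discounting, the professor who pays $90 is betting that the student will earn around $90,000 or more in ten years.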

I like this a lot for university students, who are adults making their own choices. Schools for children face the issue that many decisions are made by the parent rather than the pupil.

Their individual incentives don't align with the collective incentive. If a teacher is known to give everyone As, the rational thing is to choose them, although on a collective level that makes the grades worthless.

I am thinking of a different school system. In the UK, for example, you take externally graded exams; the grade is not given by the teacher.

This might work well for larger schools, but it wouldn’t work at small ones. At my school pretty much each teacher always taught the same class. If you wanted to take Chemistry, you were going to take it with Teacher X, whether or not she was good at her job, because she was the only one who taught it. For required classes, like Freshman English, everyone took it from the same teacher unless you had some disability accommodations.

For long-term predictions like the chance of nuclear war by 2100, could you augment the Keynesian beauty pageant approach by “laddering” out 80 years, 10 years at a time? So basically, you’d be guessing today what you think the outcome of a Keynesian beauty pageant run in ten years would say. And that one would be trying to guess the one 20 years away, which would be trying to guess the one 30 years away, etc.

The benefit is that there is an answer within 10 years (not in 80 but not right away) but if you have insider information that you think will be public (or at least known by enough participants) in 10 years, you should factor that into your guess. You don’t have a perpetual liquid market; you just run the beauty pageant every 10 years.

Of course, I’m not sure how this is any better than having a shorter-term futures (or options) market on top of the underlying long-term question (e.g. what will the market price be for chance of nuclear war by 2100 in 2030). The underlying long-term may not directly get interest from knowledgeable forecasters (because they don’t care about a payout in 80 years) but the short-term one should and should incorporate information (or arguments) that will come to light in the next 10 years. Plus, the long-term market price should adjust towards what the ten-year out market predicts (barring new, real information entering it) because otherwise you could trade on the difference.

But wait, how is that different from someone buying (selling) the 80-year contract because they think it will be higher (lower) in ten years and they’ll be able to cash out then? If the contracts are tradable and the market liquid, you don’t have to wait for them to expire, you can profit based on the change you expect over any time horizon that matches your preferred holding time.

So is the approach only really relevant when you can’t provide a permanent liquid market?

I'm pretty sure there's a teacher shortage that's only gotten worse since the pandemic. Not sure why teacher merit pay is a question that anyone cares about, as opposed to just raising teacher pay across the board in order to address the shortage, and also improving working conditions to reduce burnout to ensure that it doesn't just happen again in a few years.

Here's a fun fact: experience makes people better at their jobs, and getting teachers to stick around for more than five years would do a lot more to improve performance across the board than questionable schemes to identify the best teachers.

How about thinking about ways to get teachers more training and professional development? More time to do lesson/unit planning? Did you know that teachers can write off school supplies that they buy for their classroom on their taxes? Seems nice, until you realize that it means that students whose teachers don't have disposable income aren't getting fully-stocked classrooms! Why does the richest country on earth have its overworked, underpaid teachers buying school supplies for their classrooms?

What's the point of merit pay? Do you know any teachers who have the potential to be great, but choose not to do their best because they aren't getting paid more than their peers? What do you think it is, exactly, that motivates teachers?

Sorry to come off as forceful here but merit pay is the kind of scheme that is just so completely divorced from the problems and realities of education that it makes me despair that we'll ever get serious solutions.

"Here's a fun fact: experience makes people better at their jobs, and getting teachers to stick around for more than five years would do a lot more to improve performance across the board than questionable schemes to identify the best teachers."

I think this is emphatically false. Most teachers do not improve after the first few years, the improvement from experience is much less than the difference between the best and the worst, and most teachers stick around for 5+ years anyway. So it would be far more fruitful to identify the high achieving teachers, keep them teaching and learn from them, while getting rid of the under-achieving teachers.

"Most teachers do not improve after the first few years"

Based on what criteria, and according to what source?

"most teachers stick around for 5+ years anyway"

Not nearly as many as in other professions. Point is, enough leave to contribute to the ongoing shortage of qualified teachers. See this for details: https://www.epi.org/publication/u-s-schools-struggle-to-hire-and-retain-teachers-the-second-report-in-the-perfect-storm-in-the-teacher-labor-market-series/

"it would be far more fruitful to identify the high achieving teachers, keep them teaching and learn from them, while getting rid of the under-achieving teachers"

You think it would be more fruitful to go from not having enough teachers to having even fewer teachers, on the theory that - what - "under-achieving teachers" are worse than no teachers at all? Do these underachievers somehow suck the information out of their students' minds like pedagogical vampires? Do we stick 70 or 80 kids in front of these "high achieving" teachers on the theory that they can handle it and anyway we're paying them extra? Or maybe you think that some kids just don't deserve teachers at all? I'm confused about what your theory is for how getting rid of teachers during a teacher shortage crisis during a broader labor market shortage adds up to better education.

The basic economic concept of supply and demand applies here: there is not enough supply to meet demand, so the buyer cannot be selective. Until you fix the supply problem, talk of teacher quality is less than meaningless - it's actively destructive to attempts to reform education. It perpetuates the myth that bad teachers are to blame for the education system's failures, which means people will devote their energy to fixing the wrong problem, and propose punitive solutions that do nothing but drive even more teachers out of the profession.

You're not willing to talk about improving teacher pay or conditions and I'll bet a dollar you're anti-union too, which is why you can't even acknowledge that there is a shortage. You're just promoting an agenda that is 100% ideological and 0% empirical. Which of course you have the right to do, but speaking as an educator, I really wish you wouldn't.

"You're not willing to talk about improving teacher pay or conditions and I'll bet a dollar you're anti-union too, which is why you can't even acknowledge that there is a shortage. You're just promoting an agenda that is 100% ideological and 0% empirical."

I think it should be clear from this exchange that it is exactly the other way around. It is you who are promoting an ideological agenda that is for your own personal benefit.

My interest is in the welfare of the children of this country and I have no personal stake, other than the success of my own children. I have no interest in politics, only policy. I base my opinions on empirical data. You seem to want to battle with strawmen only, so I will leave you to it.

"I base my opinions on empirical data."

Says the guy who won't provide or respond to any empirical data.

Like I said, I'm not going to engage someone who is clearly operating in bad faith.

Another way of improving supply would be to eliminate restrictions on who is allowed to teach. It varies from place to place, but I'm pretty sure that in many places I could not be employed to teach high school physics since although I have a doctorate in physics I have no education degrees or credentials at all. I'm even surer that I couldn't be employed to teach English although I have written multiple books, some of which got well reviewed, and know more poetry than most English teachers.

Oddly enough the teachers unions, which support higher salaries for teachers, do not generally support reducing restrictions on entering the profession.

To be fair though, I've known plenty of people who are experts in their fields, and who could not teach anyone to do anything. If they tried teaching a child to make PB&J sandwiches, that child would be scarred for life and his sandwich would explode.

IOW, the ability to teach well does not linearly correlate with expertise. It's a separate dimension.

The guy who taught me computer science at Stuyvesant High School in New York was an industry programmer - IIRC he'd worked at a big bank before he left to become a teacher. It was his calling. After a couple of decades at Stuy he moved on to Hunter College to start a teacher training program for CS teachers, and he's worked with the DOE to establish computer science programs throughout New York.

The idea of hiring great people from industry is a good one, and it's produced good results when implemented. The problem is - how many people are called to teach as a life's passion? Do you want to teach high school physics? Have you ever looked into it? I'm 100% sure you could find a teaching position for next year if you tried. Maybe not at a regular public school, but at a high-need school you could teach if you simultaneously enrolled in a teacher training program. At a private school - especially one abroad - you could get hired without even doing that.

You could absolutely become an English teacher next year especially if you're comfortable teaching EFL.

But the fact that you haven't even looked into this option enough to know that it is an option illustrates why eliminating restrictions isn't enough. The vast majority of people with established careers don't want to teach the subjects and grades that need teaching. Hell, I'm a teacher myself and I definitely wouldn't move to the US and teach in grade school there - not the way American teachers are treated and paid.

"Do you want to teach high school physics? Have you ever looked into it?"

I don't want to. I've taught a variety of things at the college level, and I spent a few summers as a counselor at a camp for gifted children teaching kids, which was fun.

There are a lot of people in the world with knowledge of one sort or another, and it only takes a small fraction of them to add a significant number of teachers. In early 19th c. England, before the establishment of the public school system, you had retired ship captains teaching kids, and presumably many other sorts of people. Certainly some people are better teachers than others but I don't think there is much evidence that education courses work to produce good teachers.

"I don't think there is much evidence that education courses work to produce good teachers"

Hmm, I wonder if there might be some way to check whether there is any evidence that education courses work to produce good teachers.

Eh, you're probably right, I guess there's just no way to know. Baseless speculation it is, then.

Expand full comment

I feel like debates about the supply of good teachers are a little like debates over the supply of adoptive parents: both obscure the actual problem in an effort to avoid grappling with some uncomfortable realities. In truth, there is no lack of people eager to adopt newborn white or Asian babies in perfect health with good genetic backgrounds and excellent prospects -- there are waiting lists for that kind of thing. But there is definitely a lack of people eager to adopt a black kid age 3 who was born to a crack addict and abandoned on the street, who in addition to suffering brain damage during birth was hideously abused in his first 2 years and suffers from a variety of irreparable physical and probably irreparable psychological problems. No doubt his future life could be improved from "awful" to "fair, not actually in prison or institutionalized" if some selfless couple were to throw themselves on this particular grenade, but not a lot of people are eager to earn karma points in the afterlife by doing so.

Similarly, there is really no lack of qualified talented people to teach interesting subjects to good students from sound families in schools (private or public) in good neighborhoods. But people look at schools in savagely dysfunctional neighborhoods, gang-infested, with broken and screwed-up families and terrible homelife, and an administration that is generally incompetent, if not actually corrupt and under some heavy-handed court supervision, and say "Gee, if we only got some talented and selfless people to throw themselves on these grenades, we could improve the prospects of students from 'awful' to 'not so bad, not actually all in jail, most can read and do the 3 times table, and so are immediately employable at Wal-Mart.'" And that is no doubt true, but it would take a serious amount of wealth transfer to pay enough people to commit to that discouraging task with the balance of their working lives.

Arguably something like Teach For America or whatever it was called tried to tackle this problem by giving young idealistic people a chance to do *some* teaching in squalid places, without being locked into it for an entire career, but my impression is that it's hard to move the needle with newbies who parachute in for a few years -- you really do need people who will dedicate years and years to the job, to learn how to do it effectively. But who wants to do that? It's like being an oncologist and only taking desperately ill Stage IV patients who almost always die quickly. You'd need the optimism and patience of a saint, to be sufficiently inspired by your 1 out of 100 victories that you can tolerate your 99 out of 100 failures.

Expand full comment

Yes, I agree with all of this. It's a well-known problem, and there is talk among educators about programs to incentivize more experienced teachers to teach in high-need schools and to redirect more resources in general to these struggling schools. But of course a lot of the problem is outside the control of education as an institution - schools don't have control over whether they are located in gang-infested neighborhoods or whatever. Society would need to address social inequality more broadly to solve these problems, rather than leaning on education as an instrument of solving social inequality.

Expand full comment

Agree, at least philosophically, that many non-credentialed teachers could be good high school teachers. They certainly do well enough in colleges.

That teachers unions do not "support reducing restrictions on entering the profession" makes sense, in that credentials and certification reduce competition for jobs and thus keep wages high. Unions should do what's best for their members, and districts (employers) should negotiate their way out of problematic asks.

Expand full comment

"Agree, at least philosophically, that many non-credentialed teachers could be good high school teachers."

I agree with this as well although my preferred solution would be making it easier for these individuals to get credentials, rather than relaxing requirements which are already too low. As I've discussed elsewhere, teacher education programs are often inadequate and new teachers often enter the field without enough teaching skills and knowledge to succeed, and leave the field within a few years as a result.

But as I've said the main problem is enticing these individuals to become teachers. You've claimed that wages are high, but I don't know a single non-teacher who currently wants to leave their field and become a teacher, which suggests that they are making the (correct) estimation that teaching does not offer enough money to compensate for the difficulty of the work. Only a few passionate people driven by a lifelong desire to teach will switch to teaching - and these people don't need relaxed requirements to do so. But if you want to recruit a lot of professionals from other industries to come and teach, teaching has to offer a competitive salary and benefits package.

Expand full comment

I have not claimed that wages are too high. I'm not sure who you are responding to. My father was a teacher and the treasurer of a teachers union for most of my life. I spent many months of my young life at union meetings and around protests.

I am strongly pro-union. I was pointing out that credentials increase wages. If we find that they do not provide value for that increase in wages, then districts should negotiate away the need for credentials in exchange for some other "asks" of the unions.

Expand full comment

One reason a teacher sticks around for a long time, in a school system where firing teachers is hard, is that he is unemployable elsewhere.

Expand full comment

Maybe the reason most teachers don't improve after the first few years is that they don't have any feedback to tell them what they're doing well or badly, and most of them would need incentives to dedicate effort to it, as opposed to just getting along from day to day, which is hard enough already.

Expand full comment

That's a fair point -- it's very difficult to get high quality feedback other than simply by experimenting on children as a teacher. In many districts a teacher gets feedback once or twice a year only, with several weeks' notice. These observations are high stakes, and teachers stress out about them, and then are sometimes yelled at by the principal. It's a pretty adversarial system in some schools. Giving teachers dedicated time to observe other classes, and having a norm of actually doing so, would benefit many teachers, especially in their first few years.

Expand full comment

Okay, I'll admit, I'm deeply confused. We have two competing claims here: One from an educator, who claims that teachers improve over time. Another from a rando, who claims that teachers don't improve over time.

Is there some reason why you've all defaulted to believing the rando?

Did you take a moment to think about it? To Google it?

Someone called me out on not providing evidence. The truth is, this claim is so damned obvious that I would never in a million years have thought I would have to. But, sure, here you go:

https://learningpolicyinstitute.org/product/does-teaching-experience-increase-teacher-effectiveness-review-research

"Based on their review of 30 studies published within the last 15 years that analyze the effect of teaching experience on student outcomes in the United States and met specific methodological criteria, the authors found that:

1. Teaching experience is positively associated with student achievement gains throughout a teacher’s career. Gains in teacher effectiveness associated with experience are most steep in teachers’ initial years, but continue to be significant as teachers reach the second, and often third, decades of their careers.

2. As teachers gain experience, their students not only learn more, as measured by standardized tests, they are also more likely to do better on other measures of success, such as school attendance.

3. Teachers’ effectiveness increases at a greater rate when they teach in a supportive and collegial working environment, and when they accumulate experience in the same grade level, subject, or district.

4. More-experienced teachers support greater student learning for their colleagues and the school as a whole, as well as for their own students."

Can we dispense with this discussion where we try to figure out an explanation for a problem that doesn't exist now?

Expand full comment

I think you just discovered this community's preference for, let's say, a certain type of solution over all other options.

Expand full comment

Yeah I mean I'm all for incentive-based or market-driven solutions, when they are in fact actual solutions to actual problems. And of course we can talk about what factors make teachers improve more or less (as this meta-analysis does) and focus our efforts on increasing the factors that make teachers improve more, and decreasing the factors that make them improve less.

But if I point out that teachers get better with experience, and several people immediately start speculating as to why teachers *don't* get better with experience, I have to wonder exactly what cognitive processes are at play here that cause people to apparently not be able to read sentences that disagree with their preconceived notions.

Like, it's one thing to disagree and argue the opposite perspective. But a complete unwillingness to even *acknowledge* that I've made a claim that doesn't fit their worldview... I find it baffling.

Expand full comment

There's a strong 'anti-school' contingent here, from people who probably, admittedly, didn't need school and had a broadly negative time at it (that sure fits my experience). But I don't think I'm a representative sample of the population, and that seems to be one of the mental blocks here.

Expand full comment

If I may provide an opinion, as a former teacher, I believe that the elephant in the room is that in elementary and high schools, the actually difficult part of a teacher's job is maintaining *discipline* in the classroom. Everything else is secondary, because it does not matter how good your lessons are if no one pays attention.

Sadly, this is the part the university does not prepare you for, as a future teacher. At university, all they teach you is how to teach students who pay attention and try to master the subject. Then you get your first job, and you find out that none of what you learned matters.

So, I suspect that most of the improvement teachers get during their job is learning how to maintain some level of order in the classroom. You either become better at it... or you quit and change profession. (Either option increases the average skill of the remaining teachers.) Only then can you use some of that stuff they taught you at university, if you are not burned out already. Also, doing the same thing over and over again allows you to optimize the process; for example, you can re-use a lot of materials you created during the previous years.

Expand full comment

Beware the man of one study. Googling reveals disagreement, for example: https://tntp.org/assets/documents/TNTP_FactSheet_TeacherExperience_2012.pdf "Teachers gradually reach a plateau after 3-5 years on the job. As one study put it, “there is little evidence that improvement continues after the first three years.” Another found that, on average, teachers with 20 years of experience are not much more effective than those with 5 years of experience."

By the way, check out the left graph of figure 1 of your own link, showing no improvement after 4 years (a slight decline after that).

One of the things I hate most is someone coming in and "Laying the hammer down" on everyone and being completely wrong.

Expand full comment

It's not one study, it's 30. You are delusional and I'm done arguing with you.

Expand full comment

Even if there is zero improvement over time there is still a huge cost to the churn of training and onboarding teachers who then quit, and have to be replaced with newly trained ones.

Expand full comment

We care about teacher quality for the same reason we care about doctor quality or home construction quality. It's not easy to tell it in the short-term, but in the long-term it can matter a lot.

Maybe teacher quality *doesn't* matter. Maybe we get rid of stupid credentials keeping people out and just throw bodies at it. And maybe that's right! Maybe teacher quality *doesn't* matter! It would be good to know and adjust our hiring appropriately.

Expand full comment

Teacher quality definitely does matter. The point is, the way to get quality teachers is to hire more teachers and retain more teachers, so that there's a large enough supply of teachers to promote the top ones and retrain or remove the bottom ones without compromising our ability to actually find teachers for all the kids who need them.

Let me put it this way: If there were 120 teachers for every 100 teaching jobs, then sure - obviously hire the top 100 and let the bottom 20 find some other career.

That's not the situation we're in. It's more like we have 80 teachers for every 100 teaching jobs, and people are suggesting that we hire the top 60. Well, what are we going to do with the 40 unfilled teaching positions?

I'll tell you what: increase teaching hours, increase class sizes, and remove electives with low enrollment, so that the existing teaching supply can be stretched to cover more students.

Never mind that none of those options are good for students - they just exacerbate the problems we already have.

And of course, increasing class sizes contributes to teacher burnout and pushes teachers to leave the profession, as detailed e.g. here: https://www.theatlantic.com/education/archive/2015/07/too-many-kids/397451/

That's a vicious cycle: place more demands on teachers, more teachers leave the profession, thus placing even more demands on the teachers who remain.

Obviously teacher quality matters. Empirically, teacher quality matters. But the precondition for selecting the best teachers is establishing a large enough pool of candidates that you can be selective - without sacrificing education quality elsewhere.

Expand full comment

Your points are so obvious that you don't even need to and should not have made them. Another thing I hate is people who come in blasting everyone for something that they say is obviously wrong, and then later admit that actually, they were obviously right BUT also we should [hire more teachers]. Come on. GTFO with that.

Expand full comment

I don't mind people making 'obvious' points. They aren't always obvious to everyone and boiling an argument down to the point it becomes obvious is a common method here.

I do mind a little telling someone they shouldn't have said what they said (although I will acknowledge your post wasn't the only one in this thread with a tone).

Expand full comment

You are correct sir and it makes me mad! - Freedom

Expand full comment

> so that there's a large enough supply of teachers to promote the top ones and retrain or remove the bottom ones

And right there is why we need some way of distinguishing teacher quality.

Expand full comment

But do we need it now, or only after there's a large enough supply of teachers?

Expand full comment

I don't think I've ever said we shouldn't have a way of distinguishing teacher quality. Somewhere else in the thread I pointed out that I rather like the Stronge system used by my school - https://www.strongeandassociates.com/evaluating.html - which looks at observations, student evaluations, student results, and a portfolio of documentation from the teacher.

I think the idea of only looking at student results and ignoring all the other types of evidence is probably inferior - because of Goodhart's law, and because there are tons of confounding factors. I'm not sure a prediction market could adequately compensate for either of these problems, and also it would introduce a costly layer of complexity, and also it would remove the benefits of a system like Stronge in terms of giving good feedback to teachers (that is, a teacher can look at their Stronge results and identify areas to work on improving, but the same cannot necessarily be said for a system that only looks at student results on standardized tests). Similarly looking only at student evaluations and ignoring all the other types of evidence opens you up to the biases described elsewhere (leniency, attractiveness, cookies). I think a blended evaluation system that looks at all available data is really the way to go.

But that said, these systems already exist. The fact that people outside of education carry on as if educators just have absolutely no idea how to judge teacher quality and we need outsiders to come in and make very obvious suggestions like "well have you tried looking at students' test scores" is, to put it mildly, insulting and unhelpful. I wish there was some acknowledgement that these are hard problems that smart people are working on, and some attempt to engage seriously with the solutions that have already been tried, before people push sweeping reform proposals that completely disregard what we already know and already do in the education field.

Expand full comment

I'm very concerned that student scores are a bad way to measure teachers. Measuring professionals by a single number tends not to work well. I wouldn't want to be measured by how many bugs I fixed.

But many teachers' unions have objected to the other, traditional ways of measuring professionals (which is to have a senior professional make a judgment call about who is better). And unions naturally want an objective number.

So "improvement of test scores" is often where we end up, even though no one likes it.

Expand full comment

It's fine to come off as forceful. The problem is that you've made numerous empirical claims without evidence to support them.

Expand full comment

Go ahead and specify which claim I've made that you doubt and we can discuss the evidence for it.

Expand full comment

> What do you think it is, exactly, that motivates teachers?

Sure, some teachers will go into teaching because it is their passion; but many will do so just to make ends meet, because they couldn't find more lucrative employment elsewhere. One solution would be to reform teaching into a more prestigious job -- much better pay in exchange for much higher skill. It's the difference between being head chef at a restaurant, and a burger-flipper at McDonalds.

Expand full comment

I think we need to decide what the point of education is before we decide how to measure it. I was reading just the other day about the Prussian education system and about how, in addition to the subversive socialist counterculture, there was a movement by middle class burghers to preserve their own educational norms. It was the best system in Germany measured by things like literacy, income of graduates, etc. It was normal for them to send orphans and homeless children through the system too so it wasn't selection effects. Yet it ultimately lost.

The three lines of attack against it (afaict) were that it was insufficiently national (whether in terms of socialist class struggle or conservative nationalism), that it was insufficiently professional, and that it focused too little on things like the cultivation of virtues or poetry. It was stereotyped as creating wealthy people who thought they were educated but only had a kind of shallow cosmopolitanism combined with an obsession with money.

But those aren't attacks on its efficacy. Those are attacks on its purpose. The attackers' response was that education was supposed to prepare you for a career. And the reply was that wasn't the point of education. That's a normative argument. And one that Germany ultimately decided against them. Instead, both the conservative nationalists and the liberal socialists wanted a national professional system.

Likewise, the original purpose of the American national educational system was to inculcate a sort of republican (as in the republic, not the party) virtue and civic nationalism. Before that it was to save your soul. I don't think those are its purpose today. But what is that purpose? I suspect you'd get ten different answers. The last time we had a concerted answer was No Child Left Behind which, more or less, said it was to pass tests proving you had baseline skills like literacy. If we agree on that we can make that happen. But do we agree on that? Even at the time I don't think we did.

Expand full comment

I do feel like, for better or worse, contemporary public discussion about education seems to focus on its role in preparing people for the workforce as its main point. Most people who work in education don't agree with this (and there's probably a lot of robust disagreement about what the primary purposes should be), but I do think it's what a lot of the public, including most relevant elected officials, care about.

Expand full comment

Is it? Why do we spend so much time on broad humanities or theoretical science subjects then? It seems to me the real answer is that public education prepares people to go to college, for better or worse.

Expand full comment

As I said, people who work in education don't agree with this statement of the value of education, but most of the public does, and this causes noticeable tension. (And in any case I'm thinking about education as the whole arc - thinking that the point of high school is to prepare people to go to college is quite compatible with thinking that the point of education broadly is to prepare people for the workforce, if you think that college education is an essential component of workforce preparation.)

Expand full comment

I suppose this is a definitional issue over what "public education" is. I was thinking specifically of publicly funded schools, not private colleges. But I don't think colleges generally see themselves as trade schools either. Trade schools see themselves as trade schools. Colleges tend to have some other purpose de jure if not de facto.

Expand full comment

I'm also thinking of publicly funded schools, like Texas A&M University, where I work. The university's mission statement doesn't particularly mention a workplace preparation orientation (https://www.tamu.edu/statements/mission.html) but much recent rhetoric by the higher administration, and most rhetoric by legislators of any party tends to do so.

Expand full comment

Employers frequently hire people with liberal arts degrees. Presumably they believe, accurately or not, that there are transferable skills, or that the degree provides a good signal of their abilities regardless of the actual content.

Expand full comment

True. And I think going to college is probably a good idea. But I do feel like the humanities is trying to have it both ways on this. Either the humanities imparts practical skills in which case it's fair to judge them on how practical they are or they deal with some ineffable part of the human experience beyond vulgar measurement in which case they should admit they're impractical. Not unworthy but impractical. Instead, we get both. On the one hand we're told that humanities imparts practical leadership skills and on the other that trying to insert direct leadership classes is vulgar.

Expand full comment

I think it's fair to level this criticism at a lot of the way that a lot of people in the humanities talk about this. But I also think that it's reasonable to say that a liberal arts education both does a decent job at developing certain skills that are broadly practical *and* is primarily aimed at something that is not exactly the same as this practical thing. (It could even be possible that it does better at this practical thing *because* it aims at something else, the way that playing a game is better at being fun if it gives you something else to focus on than the having of the fun - though I'd question anyone who claims that they have strong evidence that this is in fact the case for the humanities.)

Expand full comment

I think aiming at the thing is always correct. But I think your point is that leadership might be better taught by indirect methods. Things other than leadership classes. With which I agree. But it still gives you some metrics by which to measure the end result, ones they resist because (ime) certain professors consider insulation from practical concerns a perk of the job.

Expand full comment

They don't need to think anything about transferable skills.

The observed evidence is much more compatible with liberal arts degrees for eg management consulting (and most other education) as signalling.

Expand full comment

My friend who at the time was the principal for the local high school that the troubled kids ended up in told me that the true purpose of the American education system was to "groom a complacent workforce."

Expand full comment

"What should we teach students" is one of those questions that attracts so much political vitriol once you get into the details that it's almost impossible to analyze in detail.

I wrote a blog post on primary school education ( https://yevaud.substack.com/p/education-primary-school ) but haven't figured out anything useful to say about secondary schools beyond "prepare people for the next stage of their life".

Expand full comment

No doubt. But if that's the case we should accept the dysfunction is baked in as simply a flashpoint in our society.

Expand full comment

Seems to me that the problem with prediction markets is that they have finite liquidity and are subject to irrational/dishonest bets. Emotionally motivated voters can make money-losing bets, or intentional market manipulators can make bets to drive the market up or down.

In an efficient market, any $ of irrational bets would immediately be overwhelmed by opposite bets from smart investors, who would happily take the money. However, in real life there's a finite trading volume on any given bet, and if irrational bettors spend $$$ faster than smart-money bettors can capture it, the market remains irrational.

We've seen this in multiple US elections where the political prediction markets can behave in very irrational ways despite a small number of smart-money investors plundering as many bad bets as they can.

The problem of irrational money outweighing smart money gets even worse if you try to routinely use prediction markets to answer real-world questions. Let's say you use prediction markets to answer 10 salient questions, and you manage to draw a big enough trade dollar value to swamp any effect of irrational bets.

Since your markets are really practical and useful, you now expand them to 100 salient questions. Either you have to somehow attract 10x the trade volume, which could be difficult or impossible if your volume was already quite high... or each question only gets 1/10 as much trade volume as before. The more questions you add, the less confident you are in any one answer.
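
To put rough numbers on the dilution argument, here is a minimal toy sketch. Everything in it is an assumption made for illustration, not a model of any real exchange: the price is taken to be a money-weighted average of two beliefs, the pool of smart money is fixed and split evenly across questions, and each question attracts the same amount of irrational money. The function names and dollar figures are hypothetical.

```python
def toy_market_price(true_prob, noise_belief, noise_money, smart_money):
    """Money-weighted average of two beliefs (a toy pricing rule, not a real order book)."""
    total = noise_money + smart_money
    return (noise_money * noise_belief + smart_money * true_prob) / total


def price_per_question(n_questions,
                       total_smart_money=1000.0,       # assumed fixed pool of smart capital
                       noise_money_per_question=10.0,  # assumed irrational money per question
                       true_prob=0.5,
                       noise_belief=0.9):
    """Price of one question when the smart-money pool is split evenly across all questions."""
    smart_per_question = total_smart_money / n_questions
    return toy_market_price(true_prob, noise_belief,
                            noise_money_per_question, smart_per_question)


for n in (10, 100):
    price = price_per_question(n)
    print(f"{n:>3} questions: price={price:.3f}, error={abs(price - 0.5):.3f}")
```

Under these made-up numbers, going from 10 to 100 questions moves the price on each question from about 0.54 to 0.70 when the true probability is 0.5. The direction of the effect is the point; the magnitudes depend entirely on the assumed inputs.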

It feels like forking a cryptocurrency into many mutually-incompatible coins until each one is vulnerable to a 51% attack.

Expand full comment

That sounds like something that would self-correct pretty fast. If the irrational are giving money to the smart faster than they can receive it, shouldn't the smart soon have received enough money to dominate the market?

Expand full comment

You would think so, but US political betting markets did all sorts of weird sh*t during the 2020 election, including persistently having a 5-10% chance of Pence or some third party person magically becoming President (after Trump survived COVID and Biden won the election)

Expand full comment

Isn't that just the US being an immature market tho? I'd be quite surprised if you could find those kinds of odds in the Premier League market or some other mature prediction market. So I'd expect that to go away as smart money starts taking notice and advantage.

Expand full comment

Absolutely not true. UK markets always overrate the England team's chances of winning the World Cup and Euro competitions.

And I'm not measuring against my personal opinions, but against non-UK markets.

Premier League is generally OK because there are similar amounts of money getting put in by fans of both teams, so the smart money dominates the odds-setting, but that doesn't work for England in international competitions, and there isn't enough smart money to force the odds down.

Also, international arbitrage to make a profit is really hard; you have to pay out two sets of fees and you have currency risk. The result is that there can be a pretty substantial differential between UK prices and (say) German prices without the ability to make a profit on arbitrage - the market certainly has arbitrageurs.

Expand full comment

In the US, Vegas has the concept of the "public team", which is one where a lot of casual bettors will overrate the team, because the team has a lot more fans than most (e.g. the Dallas Cowboys).

This does apply to some extent in the Premier League, but the structure of that league is such that having more fans means having more money to spend on players which makes the team better, so it tends to work out reasonably well.

But there are a lot of people who like to bet on their favourite team.

And there are a lot of people who like to bet on their preferred outcome in all sorts of other events - as I said above, Trump fans bet a lot on him winning in December 2020, ie weeks after anyone who wasn't blinded by fandom knew he had lost.

Allowing people with more money to have a greater weight in the market (which is what happens if the market isn't capped like the US ones are) doesn't help; there are enough wealthy idiots that the smart money still can't keep up.

Expand full comment

Yeah and that's in existing betting markets where there is very little incentive to intentionally manipulate the outcome.

If you were using predictive markets for something like a Supreme Court decision, where the majority of market participants have a partisan stake in the outcome, it'll be totally inaccurate regardless of how high the market volume gets.

Expand full comment

Not just US. You could make a 5-10% return on Trump losing here in the UK, on a platform (betfair) with no limit on bet size, in a market where several tens of millions were at stake. Trump fans in the UK lost well over a million pounds between the media announcing he had lost and the electoral college voting; there just was not enough smart money to push the odds right down.

Expand full comment

I don't remember the odds being as good as 10% after the media announced. I had a look and thought that there was enough uncertainty (Biden has a health issue, recount demanded, etc.) that it wasn't worth it.

If I staked enough that winning would be significant at whatever the post-announcement odds were, then losing would be very significant.

Maybe that means I'm not the smart money, but Taleb would quibble with how smart it is to bet big for a small return because 'there is no way to lose'.

Expand full comment

yeah a lot of the arguments for prediction markets sound like the arguments for all free market solutions that assume infinite liquidity and capital.

Expand full comment

On the valuation of SpaceX - I'm not entirely sure, but could there be some distortion introduced by looking at the *median* valuation rather than some probabilistic average? I suppose that if the valuation can't go negative, then the mean must be at least half of the median, so there must still be some mismatch, but it might not be as extreme as you get by looking just at the median.
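
The "at least half of the median" bound can be written out in one line. This is a sketch under the stated assumption that the valuation $X$ is non-negative; the symbols $X$ and $m$ (a median of $X$, so that $\Pr(X \ge m) \ge 1/2$) are introduced here for illustration.

```latex
\mathbb{E}[X]
  \;\ge\; \mathbb{E}\!\left[ X \,\mathbf{1}\{X \ge m\} \right]
  \;\ge\; m \,\Pr(X \ge m)
  \;\ge\; \frac{m}{2}
```

The first step drops the non-negative contribution from outcomes below $m$, the second replaces $X$ by its lower bound $m$ on the remaining outcomes, and the third uses the definition of the median. The bound only limits how far the mean can sit below the median; it says nothing about how far above it the mean can be.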

Expand full comment

Given the size of the right tail of this distribution, I think the median is more appropriate than an average. If 1 in 500 bettors think SpaceX is a 100 trillion company, I don't care about that as much as if all 500 bettors raised their estimate by 0.2 trillion.
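
The arithmetic behind that comparison can be checked with a small sketch. The baseline valuation of 0.1 trillion is a hypothetical number chosen only for illustration; the 500 bettors and the single 100 trillion estimate come from the comment above.

```python
from statistics import mean, median

N_BETTORS = 500
BASELINE = 0.1   # hypothetical consensus valuation, in trillions of dollars
OUTLIER = 100.0  # the single extreme estimate, in trillions of dollars

consensus = [BASELINE] * N_BETTORS
with_outlier = [BASELINE] * (N_BETTORS - 1) + [OUTLIER]

# How much does one extreme bettor move each summary statistic?
mean_shift = mean(with_outlier) - mean(consensus)
median_shift = median(with_outlier) - median(consensus)

print(f"mean shift:   {mean_shift:.4f} trillion")
print(f"median shift: {median_shift:.4f} trillion")
```

With these inputs the single outlier moves the mean by roughly 0.2 trillion, about the same as every bettor raising their estimate by that amount, while the median does not move at all.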

Expand full comment

I don't think your moderation problem (really bad spammy posts) is as bad as you think. In principle there's no reason a sufficiently liquid betting market can't target really unlikely events. But you need to pay out at the odds of the event. E.g. Kalshi could handle this moderately well.

But really what you need to do is re-orient slightly. The reddit/metaculus approach of giving everyone internet points works surprisingly well, and doesn't have problems with these low-probability events being expensive -- because internet points are cheap. Add a ranking system for points holders and it gets much easier. Then if, e.g., >95% of bets were that you'd block something, you could temporarily block it, subject to review.

Expand full comment

"In some sense, the definition of probability is what a smart person who knows a certain amount of information should estimate"

That doesn't sound right. Probability would exist even if no people did.

Expand full comment

Arguably not. Because in reality, stuff either happens or it doesn't. An omniscient being who knew all the facts in the universe and had the ability to trace their chains of causation forward infinitely would know that the "probability" of any event was either 100% or 0%.

Thus, it is people's guesses based on imperfect information that define "probability" for purposes of human beings engaging with the world. For example, if I flip a coin, cover it with my hand and take a peek, the probability of heads is either 0% or 100%. To you, however, the probability is 50%. The probability of the coin flip outcome is not an objective fact; it depends on the information available to the observer.

Expand full comment

Quantum mechanics is inherently probabilistic per our current understanding.

Expand full comment

Perhaps. But "Per our current understanding" is sort of the same as Scott's formulation of "the best guess of smart people with the best information."

In any event, even assuming the universe has true randomness, it's hard to extrapolate that from the subatomic level to making a stock market pick. So in the domain of human experience "probability" is going to be a function of human knowledge and intelligence. (Sort of like how human experience occurs in the realm of Newtonian rules rather than Quantum Theory or Relativity Theory).

Whether God really "plays dice with the universe" is probably an unanswerable philosophical question in the end because we will never be able to account for the possibility of "unknown unknowns" in our understanding.

Expand full comment

Reality followed QM before humans formulated QM. Nothing about the nature of reality changed once humans realized reality was probabilistic, it was already probabilistic before.

Expand full comment

Yes, but quantum probabilities are complex numbers, not positive reals.

Expand full comment

You are thinking of amplitudes, not probabilities. Probabilities are the squared absolute values of amplitudes, and they are nonnegative reals.

Expand full comment

They need not be. That's just a convenient form of the math. But you can certainly formulate QM to deal entirely in real numbers, if you want. You just have to keep track of the probability density and phase separately.
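
For anyone following the amplitude/probability distinction, one standard textbook way to write the split (the notation is an assumption introduced here, not something from the thread) is to express the complex amplitude $\psi$ through a real density $\rho$ and a real phase $\theta$:

```latex
\psi = \sqrt{\rho}\, e^{i\theta}, \qquad P = |\psi|^2 = \rho \;\ge\; 0
```

The probability depends only on $\rho$, while interference between amplitudes is carried by $\theta$, which is why both have to be tracked.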

Expand full comment

I think this understates the nature of the issue, and gives a somewhat misleading impression. There are plenty of scenarios -- indeed, most of those involving macroscopic quantities of things -- where even classical dynamics leads to essentially unknowable outcomes. That is, you would have to be very close to truly omniscient -- knowing on the order of 10^23 facts instantaneously -- to be able to predict in any useful sense a fairly simple outcome. This is what is meant by "chaotic" or "ergodic" systems: systems in which individual trajectories rapidly become unpredictable without knowledge of the initial conditions that is absolutely perfect to the last decimal place.

So even if one agrees that probability sneaks in by virtue of "ignorance" in principle, the ignorance in question is of a sort that is not avoidable in any conceivable practical way, so it might as well be wired into the physics.
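
A toy illustration of that sensitivity, using the logistic map as a stand-in for a chaotic classical system (the map, the starting value 0.2, and the 1e-12 perturbation are all choices made for this sketch, not anything from the post):

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) and return the whole orbit."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit

# Two starting points that differ only in the twelfth decimal place.
a = logistic_orbit(0.2, 100)
b = logistic_orbit(0.2 + 1e-12, 100)

gaps = [abs(x - y) for x, y in zip(a, b)]
first_big = next((i for i, g in enumerate(gaps) if g > 0.1), None)

print(f"initial gap: {gaps[0]:.1e}")
print(f"largest gap over 100 steps: {max(gaps):.2f}")
print(f"gap first exceeds 0.1 at step: {first_big}")
```

The rule is fully deterministic, yet a difference far below any realistic measurement precision grows until the two trajectories have nothing to do with each other.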

Expand full comment

I think the problem comes from confusing two different types of probabilities: epistemic probabilities (what Scott is talking about) and propensities (what quantum mechanics, which you mention below, has). They are different concepts even though they are related.

Expand full comment

"Certain classes, races, and genders of students consistently produce higher VAM than others, and a teacher’s VAM can apparently predict their students’ past performance, which makes no sense unless there’s some kind of bias going on."

This was very intriguing. I wish you'd have done a deeper dive on these VAM issues. It seems to me any problems with this metric are simply due to a failure to define educational "value" correctly. And rigorously examining this definition should also trigger a highly useful debate about exactly what we are trying to accomplish with education and how much it is worth.

For example, is the "value added" metric defined solely in terms of the raw number of additional test questions answered? In that case, it would be easier to add "value" by just selecting a bunch of 99th percentile smart kids who will learn the material better and faster. But if they started and ended in the 99th percentile (and probably would have done so even if they skipped school and taught themselves), has any "value" really been added by the teacher?

But what if "value added" for an individual teacher is the ability to raise the percentile rank of her students? In that case, low-motivation and "underperforming" kids are the low-hanging fruit who have the most room for improvement. But maybe this "value" isn't being added by the teacher and is just an artifact of underperformers regressing (upward) to their mean ability.
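
The regression artifact is easy to reproduce in a toy simulation. The sketch below assumes a deliberately simple model (test score = stable ability + independent noise, both standard normal, and no teaching effect at all); the model and its parameters are illustrative assumptions, not estimates from any real data:

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N = 10_000
ability = [random.gauss(0, 1) for _ in range(N)]

# Each test score = stable ability + independent noise; the teacher adds nothing.
pre = [a + random.gauss(0, 1) for a in ability]
post = [a + random.gauss(0, 1) for a in ability]

# Select the bottom 10% on the pre-test: the apparent "low-hanging fruit".
cutoff = sorted(pre)[N // 10]
bottom = [i for i in range(N) if pre[i] < cutoff]

mean_pre = sum(pre[i] for i in bottom) / len(bottom)
mean_post = sum(post[i] for i in bottom) / len(bottom)
print(f"bottom decile: pre={mean_pre:.2f}, post={mean_post:.2f}, "
      f"apparent gain={mean_post - mean_pre:.2f}")
```

The selected group "improves" substantially on the second test even though nothing was taught, which is the kind of gain a percentile-rank metric would wrongly credit to the teacher.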

In any event, "teacher value added" should be graded by "degree of difficulty." For example, improving results for kids with IQs of 75-85 and issues with focus and impulsivity should count for more than simply presiding over smart kids being smart.

Finally, have educational researchers ever heard of Bill James? He already figured out how to do "value added" analysis for individual performers. According to Sabermetrics principles, once we correctly define what result we consider an educational "win" we should simply rank teachers like baseball players in the draft -- i.e., based on their expected "wins above replacement."

Expand full comment

Quick comment before I finish reading the post but this sounds confused:

> For example, one stable equilibrium is that the right answer is the obvious Schelling point so everyone tries to coordinate around that. But another stable equilibrium is that “one thousand” is a very round number, so everyone tries to coordinate around that.

Is there an obvious Schelling point I'm failing to think of? Zero years (until nuclear war destroys civilization)? Zero is an extremely focal number.

But the confusion is in the next sentence -- "But another stable equilibrium..." -- which goes on to describe exactly what a Schelling point is, so starting it with "but" doesn't make sense.

(Also "one thousand" isn't a coherent answer to "will nuclear war destroy civilization by the year 2100" so I guess there are layers of confusion here.)

The overall point is clear (and well taken!) to those of us who know all about Schelling points and Keynesian beauty contests but I suspect the less initiated are going to get lost there.

Expand full comment

Inb4 the entire prediction market gets taken over by Wallstreetbets traders who decide to rally the entire market around 420 years till nuclear annihilation.

Expand full comment

He's saying there are two possible Schelling points: 1) the correct answer, and 2) a nice round number like a thousand (or, as you say, zero). We want #1, so #2 is a failure mode.

(I agree about the incoherent answer; I assume he changed the question from "years to nuclear war" to "probability of nuclear war by 2100" and forgot to edit.)

Expand full comment

Forehead-smack. Thank you so much. I was misreading the phrase "the right answer is the obvious SP" as "the following is what the right answer is: the obvious SP" instead of "an obvious SP is the following: the objectively right answer"! I retract my imputation (except the probability vs years thing, as you say).

In conclusion, I am dumb and/or writing is hard. (I actually think part of Scott's brilliance as a writer is that small confusions like I had here are extremely rare with his writing.)

Expand full comment

I think citing Caplan as your authority on education -> income is a lot like citing Borjas as an authority on immigration. He's not a crank, but definitely represents a view that is disputed by other people who know how to run a regression.

Expand full comment

I would be curious to see all these regressions showing that education isn't about signalling.

Expand full comment

This is one of those things where most people agree on the form of the equation but disagree on the coefficients. This blog post (https://marginalrevolution.com/marginalrevolution/2021/10/the-credibility-revolution-1.html) has a rundown of the most famous example of an analysis finding that schooling has an impact on earnings through the accumulation of human capital.

If you are actually interested, most labor econ textbooks spend at least a chapter on it.

Expand full comment

I take it you're referring to Angrist and Krueger (1991), yes? I'm not sure this really gives a good response to Caplan's view.

First, if memory serves, Caplan's argument is that most although not all of education is signaling. The Angrist & Krueger paper claims to identify a causal effect of an extra year of high school education for male high-school dropouts. It's entirely possible that human capital effects dominate at this margin for these kinds of students. It's also totally consistent with most of education being signalling. (I'm not sure where I stand on this.)

Second, there are serious problems with the Angrist and Krueger paper. In short: their approach is to treat the quarter of the year in which one is born as a "natural experiment." Roughly speaking, this requires that people born in different quarters of the year are identical in their unobserved characteristics, differing only in their average level of education. It's a very clever idea and the paper has deservedly been influential. But this assumption is untenable: the quarter of the year in which a mother gives birth is correlated with her characteristics. As Buckles & Hungerman (2013) point out "the well-known relationship between season of birth and later outcomes is largely driven by differences in fertility patterns across socioeconomic groups, and not merely driven by natural phenomena or schooling laws that intervene after conception."

If you're interested in seeing some more details, I gave a general interest talk to an alumni group last year discussing Angrist & Krueger (1991) and some related papers:

https://youtu.be/NeAkMcgdWxA

Expand full comment

(1) Re: the Virginia governor election - I have no idea about the two candidates apart from one seems like a standard Democrat and one seems like a standard Republican. But going to the linked "Washington Post" story sends me on to a link to an explainer ad on Twitter from Governor McAuliffe, and I would vote for his opponent on this alone:

"Husband, father, former Governor, and proud dog dad to Trooper and Dolly. Now, running to be the next Governor of the Commonwealth of Virginia."

He's 64 years old, he's got five human kids, and he burbles on about being a furry. Oh, you didn't mean it *that* way, Terry? Then why pretend you are the equivalent of an adoptive father to non-human animals, and that they are the equivalent of your human children whom you do not put into your Twitter bio (are you afraid potential voters will see you have five kids and imagine you are some religious zealot conservative?)

Yes, I'm cranky about this. Even in a fun, just-joshing way - animals are not your babies, you are not their parent, it's not an equal relationship.

(2) Setting up automated bans that are triggered by how much money one person, or a group of people, are willing to co-ordinate on to get some comment or some commenter banned. Yes, I see no way at all that can be gamed!

Expand full comment

On the ban via prediction market proposal, to develop what I mean:

The *ideal* is "This is a terrible comment, Scott would ban it were he only aware of it" and person chips in a quid to register this prediction.

Least malicious practical effect: "This is a terrible comment, it should be banned, Scott would ban it" and a group of genuine, sincere people pool their money on a prediction to make sure this gets brought to Scott's (or the Amazing Automatic Ban-Bot's) attention so it will be banned.

Most malicious practical effect: "We don't like you/we don't like what you are saying/we're doing it for the lulz" and a group of people get together to pool their money on a prediction to get Ban-Bot to drop the banhammer. Since Ban-Bot is not actually reading the comment, just springs into automated action when the target sum of "more than $5 on one prediction" is triggered, there can be fun and jollity playing such japes on the unsuspecting.

We're *probably* a better community than to be griefers like that, but all it takes is one person (and since I'm getting flak for allegedly calling for censorship of free comment on another site, I'm going to be alert to how any such systems can be twisted away from their original ideal).

Expand full comment

I'm American, but I agree that calling your pets children is dumb and childish, and giving your dogs billing over your actual children is downright weird, especially for a guy who would presumably benefit from presenting himself as an upright family man.

Also, I knew this guy was the Republican the moment I saw that he named a dog Trooper.

Expand full comment

McAullife is the Democratic Party candidate.

Expand full comment

*McAuliffe

Expand full comment
author

The Republicans are the ones who name their troopers "Dog" - https://en.wikipedia.org/wiki/Dog_the_Bounty_Hunter

Expand full comment

For completeness, the gray tribe proceeds to develop a hybrid dog-trooper life form. Set to altruistic mode, the dog-trooper sniffs out buried landmines. Set to selfish mode, it buries new landmines. "Don't mind it. It's just marking its territory."

Expand full comment

Sorry, he's the Democratic candidate! He's a standard Irish-American Catholic Democrat politician, which is partly why this annoys me. They're your pets, not your grandkids. Who are you trying to appeal to, 20 year olds with a cat and an artisanal sandwich?

Expand full comment

How dare you impugn the deliciousness of Artisanal sandwiches.

Expand full comment

It's ok that you were wrong, but it sounds like you really need to re-evaluate your priors.

Expand full comment

Trooper and Dolly are dogs. McAuliffe's five children are not dogs. Dogs are usually comfortable with strangers' attention. Non-politician humans sometimes aren't. McAuliffe probably names his children from time to time in his public messaging, but he might choose not to do it on every opportunity.

Also, how do we know McAuliffe named his dogs? The wife and the aforementioned five children might have played a role.

Expand full comment

That's a good point. I'm just so used to politicians trotting out their kids in order to prove that they're regular people with families.

Expand full comment

Which does make me wonder about the nature of Modern Society where the politician doing the usual 'behold, I am a Regular Person' bit is now doing it with his dogs rather than his children (proud DOG dad of NAMED PETS).

Yea verily, I am Too Old For This.

Expand full comment

Well, this is all only my perception, but take me as the average idiot voter, and elections are often won or lost on perception by the voters.

He doesn't have to name out his kids (though their names are up there on the Wikipedia bio when you Google his name) but if he's going to do the whole "husband, father" bit, why the heck stick in "proud dog dad" to the named dogs? Why not "husband, father, ex-governor, owner of two dogs" (if you feel the good people of Virginia really, really need to know about your pets) and leave it at that?

Conversely, why not "proud father of five great kids"? Come on, impress us with your potent virility, O paterfamilias!

And as I've said, the term "dog dad/cat mom" just grates on my sensibilities. Maybe his social media team have told him he needs to Meme For The Younger Voters, but it does sound as if he's way prouder of his two non-human kiddies Trooper and Dolly (a son and a daughter, if I am not presuming their genders) than his biologically human children.

If I were the junior McAuliffes, right now I'd be wondering about the contents of Dad's will 😁

Expand full comment

What irritates me about McAuliffe is his misrepresentation of the argument over books in school. He claims his opponent wants to ban a book. But the bill McAuliffe vetoed in his previous term as Governor and is being attacked for didn't ban any books. It required a teacher who assigned a book with explicit sex (or, I think, violence) to tell the parents of the kids it was assigned to, with the parents having the option of requiring a book not having those characteristics to be offered to their kid as an alternative.

Expand full comment

I don't know what side of the fence TheHill comes down on (it's supposedly independent, so not tilted to one party or the other?) but their coverage indicates McAuliffe made a major boo-boo telling parents to butt the hell out of whatever schools deign to teach their kids, which isn't going to sit well with *any* parent be they Red, Blue, Purple or Green:

https://thehill.com/opinion/campaign/578885-education-blunder-igniting-suburban-parents-driving-mcauliffe-panic-in

"Education blunder igniting suburban parents driving McAuliffe panic in Virginia

In a Virginia gubernatorial debate on Sept. 29 with Republican opponent Glenn Youngkin, Democratic nominee and former governor Terry McAuliffe declared: “I don’t think parents should be telling schools what they should teach.”

Those 10 words – deserving of a top listing in the Hall of Fame of Political Blunders – may prove to be the turning point in a race in which McAuliffe was expected to cruise to victory, especially since Joe Biden won the blue state by more than 10 points on his way to the presidency in 2020.

Here’s how some media covered it:

CNN – “Virginia Republicans seize on parental rights and schools fight in final weeks of campaign"

Washington Examiner – "McAuliffe says parents shouldn’t tell schools what to teach, handing Youngkin a campaign ad"

Washington Post – "Is this Terry McAuliffe’s last hurrah?"

Recent polls show all the trends going in Youngkin's direction, particularly among parents who (at least in some instances) might have sat out this off-year election if not for McAuliffe telling them their input isn't welcome when it comes to the education of their own children.

NEW Virginia Governor's Race Poll:

(Parents of K-12 Children)

Terry McAuliffe: 39%

Glenn Youngkin: 56%

— Corey A. DeAngelis (@DeAngelisCorey) October 25, 2021"

Expand full comment

I've noticed that The Hill is pretty good at straddling the center, and will post op-eds in both directions while avoiding some of the worst examples of one-sided reporting. I don't follow them closely enough to tell for sure that they have no bias at all, but they're at least trying.

Expand full comment

I'll turn that N=1 into an N=2, FWIW. I live in the DC area, and TheHill strikes me the same way. It's cited often on CSPAN, which is about as close to utterly impartial as I think any radio or TV source can get. (TBF, CSPAN also replays all the major Sunday talk shows from sources I consider biased, but it airs all of them, and without commentary.)

Expand full comment

Well, and calling yourself the "parent" of your dog implies one of two almost equally repellent things:

(1) You treat your actual children as property.

(2) You give your animal agency in a world it did not design which it is completely incapable of exerting to its own benefit ("Who would like to eat nothing but bacon grease? You would? Good boy!"). So you are a poor steward of the animal's welfare.

I realize (as I'm sure you do) that it's just a cute affectation, but I don't like it either, because to my mind it implies that you are taking at least one of the roles -- parent, or animal husband -- less responsibly than you should.

It reminds me of a certain breed of humor in which a powerful CEO will josh that his cute secretary really runs the firm, or a husband will joke that his wife is the "real boss," or a judge will laugh that his clerk is the genuine decision-maker. All of these seem to me to actually demean the person supposedly being honored.

Expand full comment

Re: Decision making by market prediction: What stops a billionaire from putting their thumbs on the scale? There's been value shown in vote-manipulating Reddit posts by companies and governments. If we assigned a dollar value, this would become more expensive, but still a good return on investment for the people with this kind of capital.

Expand full comment
author

In principle, when everything is going right, you can't manipulate prediction markets.

Suppose there's an obvious spam post that I would obviously delete. A billionaire puts a million dollars on "Scott won't delete this".

That means that you could make a million dollars by betting against him. Even if you only have a few thousand dollars to spare on something like this, you would call up your friends and tell them there was free money, and they would call up their friends, and eventually there would be a million dollars on the other side. The billionaire would lose their million, and the moderation decision would be decided correctly.

Expand full comment

Without human review, a sufficiently large amount of money would break it. ACX commenters may not have enough money, or enough friends who can be convinced, to counter such a whale before the automated decision happens.

Though this seems like a theoretical issue: ACX's volume is low enough that it would be possible to review all cases with high activity.

One more failure mode: post spam and collect payouts from the subsidization pool. Likely also minor at ACX scale.

Anyway, the fundamental problem is that, as it stands, it is simply illegal due to dumb regulation.

Expand full comment

Lots of bloggers accept money to put obvious spam posts on their blogs; this is often called "AdSense". In this case, if you were bribable like Soulja, the billionaire could not only get advertising, he could also make a million dollars. Maybe he'd find a subtle way to bribe you so you didn't notice you were being bribed.

Expand full comment

Now I kinda want to see something like this implemented so I could start taking bets for how many days before 4chan finds a way to abuse it just for fun.

Expand full comment

> Keynesian Beauty Contests

You could also predict what people will predict in a subsequent round t years from now, e.g. "What will the prediction be when this same poll is run 3 years from now?", under the assumption that as you approach the terminal endpoint, the predicted values stochastically converge to the true ones.

Expand full comment

Scott,

I suppose you have forgotten this, but in 2015 there was a website called Omnilibrium that already implemented a very similar idea (the old SSC website had a link to it for a while). On Omnilibrium, though, instead of making predictions just for the moderator, the recommendation algorithm made predictions for each individual user. The website never acquired many active users and went inactive after about a year.

Expand full comment

<And you can’t do an open tournament, because then lots of stupid people would be in it and the challenge would be figuring out what stupid people would guess. >

You know, you could just ask me.

Expand full comment

For content moderation, you could have reddit-style up and down votes, and then Scott's votes are authoritative. So if Scott downvotes a comment, then everyone who upvoted it will be downweighted in terms of how much their votes counted for moderating other posts. And if someone is heavily downweighted, then anyone who has a similar voting pattern to them will be downweighted as well. This would work best if Scott's votes were invisible and no one knew about the algorithm so they couldn't try to game the system. It obviously works less well for something like Facebook where users don't trust the company's opinions.
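
A minimal sketch of the first half of that idea, assuming a simple multiplicative penalty (the function names, the `penalty=0.5` factor, and the example users are all hypothetical; the "similar voting pattern" propagation step is left out):

```python
def update_weights(weights, votes, moderator_votes, penalty=0.5):
    """Shrink the weight of every user whose vote disagrees with the moderator's.

    votes:           {comment_id: {user: +1 or -1}}
    moderator_votes: {comment_id: +1 or -1}
    Users without an entry in `weights` are treated as having weight 1.0.
    """
    new_weights = dict(weights)
    for comment_id, verdict in moderator_votes.items():
        for user, vote in votes.get(comment_id, {}).items():
            if vote != verdict:
                new_weights[user] = new_weights.get(user, 1.0) * penalty
    return new_weights


def score(comment_id, votes, weights):
    """Weighted vote total for one comment."""
    return sum(vote * weights.get(user, 1.0)
               for user, vote in votes.get(comment_id, {}).items())


votes = {
    "c1": {"alice": +1, "bob": +1, "carol": -1},  # the moderator downvotes this one
    "c2": {"alice": +1, "bob": +1, "carol": -1},  # the moderator never sees this one
}
weights = update_weights({}, votes, {"c1": -1})

print(weights)
print("c2 unweighted:", score("c2", votes, {}))
print("c2 weighted:  ", score("c2", votes, weights))
```

After one authoritative downvote on `c1`, the two users who upvoted it count for half as much, so an identical voting pattern on `c2` no longer nets a positive score.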

Expand full comment

reddit-style moderation encourages echo chambers. Non-substantial confirmation bias comments get sent to the top, and anything that doesn't conform gets sent to the bottom. That's why sorting by controversial is the best way to browse reddit these days.

It used to work because of reddiquette, but those days are long gone. Maybe it would work here because of the like-minded nature of the community, but it isn't a good solution in the long run.

Expand full comment

But as long as we trust Scott not to upvote non-substantive confirmation-bias comments, then the above solution should work.

Human nature will kick in, vacuous comments will be up voted to the top, Scott will down vote them for being vacuous and anyone who voted for it will have their vote down regulated.

Expand full comment

You're re-inventing Slashdot's meta-moderation. Which so far as I can tell kind of works, to be fair.

Expand full comment

Vitalik also had a neat sketch of a mechanism that would use smart contracts to disincentivize spam.

https://ethresear.ch/t/conditional-proof-of-stake-hashcash/1301

"The idea here is that we set up a smart contract mechanism where along with an email the recipient gets a secret key (the preimage of a hash) that allows them to delete some specified amount (eg. $0.5) of the sender’s money, but only if he wants to; we expect the recipient to not do this for legitimate messages, and so for legitimate senders the cost of the scheme is close to zero (basically, transaction fees plus occasionally losing $0.5 to malicious receivers)."
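
The hash-preimage part of that mechanism can be sketched without any blockchain at all. The toy below uses a plain Python class as a stand-in for the smart contract; the `Deposit` class, its method names, and the choice of SHA-256 are assumptions made for illustration, not details from the linked post:

```python
import hashlib
import secrets


class Deposit:
    """Toy stand-in for the contract: holds a stake locked to a hash commitment."""

    def __init__(self, amount, commitment):
        self.amount = amount
        self.commitment = commitment
        self.burned = False

    def burn(self, preimage):
        """Destroy the stake if the caller reveals the preimage of the commitment."""
        if not self.burned and hashlib.sha256(preimage).digest() == self.commitment:
            self.burned = True
            self.amount = 0.0
        return self.burned


# Sender: pick a secret, lock $0.50 against its hash, attach the secret to the email.
secret = secrets.token_bytes(32)
deposit = Deposit(0.50, hashlib.sha256(secret).digest())

# Recipient: a wrong key does nothing; the real key destroys the sender's stake.
print(deposit.burn(b"wrong key"), deposit.amount)
print(deposit.burn(secret), deposit.amount)
```

Only someone who received the email knows the secret, so only the recipient can punish the sender, and a legitimate sender expects to get the stake back untouched.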

Expand full comment

If we did that, only extremely important messages would ever get sent, since gas fees would send the cost of an email up to $50 or so.

Expand full comment

The Lincoln Project false-flag operation feels like a Too Good To Check playing out and actually swinging the election.

Act 1: Nazis support Youngkin! Youngkin sinks in the polls.

Act 2: Lauren Windsor of the Lincoln Project admits that the McAuliffe-supporting PAC organized the "Nazis" as a false-flag operation, but refuses to apologize, saying that it was fair since Republicans are basically Nazis anyway. McAuliffe sinks in the polls.

Act 3: Leaked documents reveal that the Lincoln Project is a false-flag operation by Youngkin supporters. Youngkin sinks in the polls.

Act 4: Turns out the Nazis were a false-flag operation by Democrats pretending to be Republicans pretending to be Democrats pretending to be Republicans. McAuliffe sinks in the polls.

Or does he? It depends on where people stop.

You can focus on the people whose minds will never be changed, and say that people will keep going until they find a conclusion they like and then stop. But median voters exist. You never hear from them, because they don't produce any of the national conversation, but they do consume it. Admittedly, there's no way to be sure that this actually did move the polls at all. But it could have.

Expand full comment

Missing Act 5: USA sinks. Juárez inhabitants enjoy the day at the new beach, since their customers are gone anyway.

Expand full comment

SpaceX can't be taken public if the plan is to build a Martian colony. An off-world colony is very far from a profit-maximizing idea, it's highly unlikely that public shareholders would be persuaded that SpaceX's profits should be invested into it.

Expand full comment

I am quite curious whether the colony is an actual project, and whether it is ever going to happen.

Expand full comment

The development of Starship seems like it would be overkill for any other goal. No idea about the 2nd question.

Expand full comment

Not sure what kind of answer you're looking for. We're talking about an Elon Musk project. It's his stated goal for creating SpaceX. Are you speculating that he never really wanted to achieve that?

Expand full comment

In keeping with the post topic, here's the Metaculus poll on the question https://www.metaculus.com/questions/349/will-spacex-land-people-on-mars-prior-to-2030/

They think it's 20% likely by 2030.

Expand full comment

There are light years of difference between "planting a flag on a planet, because it's there" and "colony on an inhospitable wasteland incapable of keeping an atmosphere". I'm a great supporter of the former and take the latter as a sign of not checking the facts or being way too optimistic.

Expand full comment

The goal is to make a company which people can buy a colony from.

Expand full comment

I just think it's too large of a project and too long of a vision to get the public markets to finance such a thing. There are many other lucrative things to do in space, and Colony-as-a-Service is not exactly low-hanging fruit. I think if SpaceX was a public company and they were reinvesting all their profits into Mars habitats, shareholders could pressure the board to focus more on something more lucrative with quicker payoffs, like asteroid mining. I think Elon wants to avoid that risk.

This is why the plan is to spin off Starlink as its own company, that SpaceX owns a large piece of. That way Starlink can become a stable, profit-generating endeavor and fund grander SpaceX projects.

Expand full comment

If the assignment of students to teachers is randomized, can't you just decide who the better teacher is by looking at their students' scores (or improvements) afterwards? The randomization should even out the demographic differences, as long as the number of randomly assigned students is large enough.

Expand full comment

Perhaps, but I think whatever value you would get out of teacher evaluation would be offset by the costs of making talented students put up with disruptive low-IQ kids.

And this would work within a school, not between schools. If one district is mostly white and another mostly black, then a randomized classroom isn't going to be much different to a non-randomized one.

Expand full comment

The proposed prediction market scheme also requires randomizing some students between the teachers we are trying to compare, and it doesn't work between schools either if we don't want to randomize children between schools. If the assignment of children to teachers/schools isn't randomized, then bettors can still bet based on demographic factors, and we can't distinguish between demographic factors and the teacher's quality.

Expand full comment

If the assignment of students to teachers is randomized you will see a lot of effort put into evading the system.

Expand full comment

For instance, students can change their minds about taking French or Spanish, or their parents can tell them to change their minds (as long as the student isn't 18 yet). Then, the necessary schedule change to accommodate the new language class enrolment also gets the student a new math or English class, with the desired teacher.

What's an example of how teachers might game the system?

Expand full comment

Teachers can just plain strike. There are teacher shortages basically everywhere in the West, so if they really hate a change, it isn't going to happen.

You'll also get plenty of genuine concerns about kids with special needs, kids that can't be put into the same class as their bully, etc. and it only really takes one valid exception before there's a system for overriding the randomness and then the use of that system will expand.

-"Probably you can prevent that by hiring one expert to make an educated guess outside of the beauty contest, and including that in the mix."

I don't see how one expert would make much of a difference; if there are significant advantages to using a Schelling point like $1000 then it seems they would overwhelm the small penalty for being far away from this expert. Though maybe it would psychologically make a difference.

A fundamental problem of teacher evaluation is sample size. An average K-12 teacher will have 30-90 students per year. Even if a prediction market manages to actually account for confounders, you're still not going to get reliable measurement with an n of 30.

But no system can account for one of the most important confounders: other students in the class. For example, there are those students who are a constant disruption, and being in a class with them makes learning much harder (regardless of who the teacher is); and often, how disruptive a student is depends on which other students (i.e., their friends) are in a class with them. The opposite is also true, where some students can learn much better by being in a class with certain other students. To evaluate how much a student learned from a teacher, you have to be able to control for how that student was affected by the other students in the class. But given the number of permutations of possible classes of students, no model will be able to actually account for this effect.

Prediction markets could probably still be a useful tool for evaluating questions with a large sample size, like the public-vs-private schools example. But they won't work for evaluating an individual teacher, and probably not even for an individual school. Maybe the answer to the question "what system can accurately and reliably measure the effectiveness of a teacher using only their students as inputs?" is "there isn't one".

Here's the question that should be asked instead: "how do we make education better?" Obviously, in general, having better teachers leads to better education. But has anyone actually tried to figure out how much a more accurate teacher evaluation system would improve educational outcomes? It's not at all self-evident that this is the highest-return change we could be pushing for in education.

Maybe a better approach towards improving teacher quality would be to just ask teachers, "what would make you a better teacher?" Teachers know a lot about teaching, so they're a good resource. If you ask teachers, they'll almost all say smaller class sizes. Smaller class sizes obviously cost more money, because it means hiring more teachers, so it's often not even considered. But it should be obvious that teachers can reach each student better if they can spend more time and effort focusing on each student individually. And beyond that, we should also expect that for a student, more individual attention from a decent teacher might actually be a lot better than no individual attention from a great teacher.

> Even if a prediction market manages to actually account for confounders, you're still not going to get reliable measurement with an n of 30.

It won't distinguish a 50th percentile teacher from a 60th percentile teacher, but I'm okay with that. As long as we can identify the zeroth percentile teachers who should be sacked and the 99th percentile teachers who deserve a bonus, it's reasonable to treat the great middle ground as all being basically the same, because they are.
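
To put rough numbers on this exchange, here is a minimal sketch comparing the noise in a single class average (the n of 30 quoted above) with the gaps between teachers at different percentiles. The spreads `STUDENT_SD` and `TEACHER_SD` are illustrative assumptions, not figures from the post or from any study, and teacher effects are assumed to be normally distributed.

```python
from math import sqrt
from statistics import NormalDist

STUDENT_SD = 1.0   # assumed spread of individual student gains (arbitrary units)
TEACHER_SD = 0.15  # assumed spread of true teacher effects, same units (illustrative)
CLASS_SIZE = 30    # the "n of 30" from the comment being quoted


def class_mean_standard_error(student_sd, n):
    """Standard error of the average gain of n independent students."""
    return student_sd / sqrt(n)


def teacher_effect_at_percentile(p, teacher_sd=TEACHER_SD):
    """True effect of a teacher at percentile p, assuming normal teacher effects."""
    return NormalDist(mu=0.0, sigma=teacher_sd).inv_cdf(p)


noise = class_mean_standard_error(STUDENT_SD, CLASS_SIZE)
gap_50_60 = teacher_effect_at_percentile(0.60) - teacher_effect_at_percentile(0.50)
gap_50_99 = teacher_effect_at_percentile(0.99) - teacher_effect_at_percentile(0.50)

print(f"noise in one class average:  {noise:.3f}")
print(f"50th vs 60th percentile gap: {gap_50_60:.3f}")
print(f"50th vs 99th percentile gap: {gap_50_99:.3f}")
```

Under these assumptions the noise in one class average (about 0.18) swamps the 50th-versus-60th gap (about 0.04), while the 99th-percentile teacher sits roughly two standard errors above the median (about 0.35). That is consistent with both comments: the middle is indistinguishable, and the extremes are only marginally detectable from a single year of one class.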

"As long as we can identify the zeroth percentile teachers who should be sacked and the 99th percentile teachers who deserve a bonus"

Okay, but we don't need a prediction market for that. Systems like Stronge that look at a combination of teacher results and portfolios, lesson observations, and student evaluations can already do this with a degree of accuracy that every teacher and administrator I've ever worked with finds at least satisfactory. You can look at the basics of this system here: https://www.strongeandassociates.com/evaluating.html

"there are those students who are a constant disruption, and being in a class with them makes learning much harder (regardless of who the teacher is); and often, how disruptive a student is depends on which other students (ie., their friends) are in a class with them"

Excellent point - classroom culture is huge, and there are often one or two students who take up a vastly disproportionate amount of a teacher's time and effort.

That said, there are methods to solve this kind of problem, and if schools and teachers had more resources, they would use more of these methods. I can go into detail on them if anyone is interested but generally it involves teaching students to behave from an early age (what used to be called "character education" and is now called "social-emotional learning") and having a responsive system of interventions in place run by a school counselor in conjunction with teachers.

"But has anyone actually tried to figure out how much a more accurate teacher evaluation system would improve educational outcomes? It's not at all self-evident that this is the highest-return change we could be pushing for in education."

Again, I totally agree. I mean, it's already pretty easy to identify the teachers who are clearly the worst. The reason we're stuck with them is that there's literally no one else to do their jobs. There are US states that have had to waive degree requirements to get teachers in and they still face shortages. It's not clear at all how changing how we evaluate teachers would improve anything.

"Teachers know a lot about teaching, so they're a good resource."

Why thank you for noticing.

As a teacher, here are some definite ways we could improve teaching:

1. Smaller class sizes (I'm at a cushy private school and my largest class ever was 23 - I'd go insane trying to teach 35 and I don't know how public school teachers do it)

2. Fewer teaching hours

3. More time and money for professional development and teacher training

4. More time for collaboration and mentorship

5. Support staff to help students who need extra help (I'm at a cushy private school so I have two counselors and an English language support specialist I can contact if I need help reaching a kid - I'd advocate for all teachers at all schools to have at least this level of support)

6. Later school start times (this improves academic outcomes across the board, especially as kids get older)

7. On that note, more electives, more arts classes, and more recess, because mentally healthy kids perform better.

8. Better social services in the community in general.

Notice what's not on the list:

1. Whatever new fad there is

2. Standardized tests

3. Fancy gadgets

4. Demands from politicians

And yet for some reason, people outside education are forever trying to push these non-solutions on us while ignoring decades of research about the things that demonstrably improve student outcomes. Makes me glad I teach at a cushy private school outside the US so I don't have to watch the education system completely fall apart from the inside while vultures pushing various for-profit schemes devour the remains of our failed schools.

I was under the impression that Keynesian Beauty Contests were a metaphor for a failure mode in real markets, including prediction markets.

If I know that Enron stock will ultimately crash to zero and they aren't profitable, or that Trump will not successfully execute a coup before 2020, BUT that there are a bunch of idiots who believe the opposite and will drive up the price, then my optimal strategy is to ride that bubble and try to exit before it bursts - NOT to fight it and give my true prediction. The market can stay irrational longer than you can stay solvent.

In the worst case, it's possible for most or even ALL of the investors driving up a stock's price to believe it's actually worthless and will not pay out, but nevertheless correctly believe that it would be irrational for any individual investor to act on this. Many "meme stocks" and minor crypto scams seem to work this way.

The Wikipedia page seems to agree:

A Keynesian beauty contest is a concept developed by John Maynard Keynes ... to explain price fluctuations in equity markets...

Keynes described the action of rational agents in a market using an analogy ...

I can see how this can mess with people making money on the value of the stock changing, but there's still dividends right?

Makes me wonder what the difference is between "money from dividends" vs "money from gains"

The difference is the magnitude, plain and simple. Dividends can be a good way to make a nice comfortable 5% real return (number very rough, not the point), but timing a meme bubble correctly can double your money in a couple of months. Even with real stocks priced somewhat reasonably, companies that grow a lot offer much higher potential long-run returns than companies that sit comfortably where they are and pay out steady dividends (though obviously the risk profiles are different)

"Also, if I were to play this prediction market, I could insider trade and steal all your money. I guess if you trust me enough to make me a moderator, maybe you also trust me enough not to do that?"

I think there are lots of people I would trust to moderate a forum or comments section, but very few that I would trust to not insider trade in that situation (probably including Scott). The incentive just seems too strong, and it also seems hard to have strong enough verification procedures to prevent insider trading.

I have no idea what you will ban. So will not bet. How many people will be involved in your betting pool?

>The main flaw I can come up with in five minutes of thinking about this: suppose there’s some obviously terrible post, like outright spam. Nobody would predict I don’t ban it, so how would there be any money to reward the people who correctly predict I will? Maybe there’s a 1% tax on all transactions, which goes to subsidizing every post with a slight presumption toward don’t-ban.

You would have to provide a marginal amount of liquidity (Say $10) per post.

The system should also allow participants to add liquidity.

In an obvious ban post, a sharp comment reader will just snatch the $20.

This is how Polymarket kind of works.
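
A minimal sketch of the subsidy idea above, assuming a simple parimutuel pool in which winners split everything pro rata. This is not Polymarket's actual mechanism; the function name `settle_post_market`, the side labels `"ban"` and `"keep"`, and the $10 default are all hypothetical.

```python
def settle_post_market(bets, outcome, subsidy=10.0):
    """Split a subsidised pool among the bettors who picked the right outcome.

    bets:    list of (bettor, side, amount) tuples, side is "ban" or "keep"
    outcome: the side that turned out to be correct
    subsidy: liquidity the site adds to every post's pool (assumed $10)
    """
    pool = subsidy + sum(amount for _, _, amount in bets)

    # Total stake each bettor placed on the winning side.
    winning_stakes = {}
    for bettor, side, amount in bets:
        if side == outcome:
            winning_stakes[bettor] = winning_stakes.get(bettor, 0.0) + amount

    total_winning = sum(winning_stakes.values())
    if total_winning == 0:
        return {}  # nobody was right; in this sketch the pool stays with the site

    # Each winner receives a share of the pool proportional to their stake.
    return {b: pool * s / total_winning for b, s in winning_stakes.items()}


# Obvious spam: nobody bets "keep", yet the first person to flag it still profits.
print(settle_post_market([("alice", "ban", 1.0)], "ban"))
```

With the subsidy in place, the lone bettor on an obvious ban gets back their $1 plus the $10 of seeded liquidity, which is the incentive the comment describes.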

> I have no source for this, someone told me about it at a meetup.

This might have been me? (Unless it was a recent meetup.) Anyway, there's some work under the name "Bayesian Truth Serum" which is interesting here.

I have a few questions about the teacher evaluation idea:

[1]

I'm confused about why the prediction market for evaluating teachers would be better than existing methods like VAM. Presumably, any bias in existing methods like VAM is because of non-random assignment of students to teachers. But if the prediction market is predicting things like "performance of Alice conditional on being assigned to Mr. Smith's class", and there's non-random assignment, won't the best strategy for predictors be to reproduce the bias? In other words, if there's some bias that makes Mr. Smith get assigned all the best students, then won't a predictor use the fact that Alice got assigned to Mr. Smith as evidence that Alice is a good student?

It seems to me that the only way out of this would be to make sure you were able to randomize the students to teachers. But if you had a way to effectively randomize students to teachers, then wouldn't you be able to do a properly controlled and unbiased study in the first place, and not need the prediction market? What information does the prediction market give you that the actual test scores don't give you? (Prediction markets might make sense if you have to make a decision now based on something that won't resolve until much later, but that's not the case here. We're perfectly fine waiting until we get the actual test scores before we finalize the evaluations.)
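
The worry in [1] can be illustrated with a small simulation. This is a sketch under made-up assumptions: two teachers (called Smith and Jones here) who add exactly nothing, student ability drawn from a standard normal, and a hypothetical assignment rule that tends to send stronger students to Smith. The function name `naive_teacher_gap` is invented for the example.

```python
import random
from statistics import mean


def naive_teacher_gap(n_students, randomized, seed=0):
    """Difference in average scores (Smith minus Jones) when both teachers add nothing."""
    rng = random.Random(seed)
    smith, jones = [], []
    for _ in range(n_students):
        ability = rng.gauss(0.0, 1.0)
        if randomized:
            to_smith = rng.random() < 0.5
        else:
            # Hypothetical biased assignment: stronger students tend to land with Smith.
            to_smith = ability + rng.gauss(0.0, 1.0) > 0.0
        # True teacher effect is zero for both; the score is ability plus test noise.
        score = ability + rng.gauss(0.0, 0.5)
        (smith if to_smith else jones).append(score)
    return mean(smith) - mean(jones)


print("biased assignment gap:    ", round(naive_teacher_gap(20000, randomized=False), 2))
print("randomized assignment gap:", round(naive_teacher_gap(20000, randomized=True), 2))
```

Under biased assignment the raw score gap is large even though neither teacher does anything, so a bettor predicting "Alice's score conditional on being in Smith's class" is rewarded for pricing in the assignment bias. Under randomization the gap shrinks toward zero, but at that point the raw test scores already carry the answer, which is the commenter's point.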

[2]

The proposal here seems to be that the teachers are evaluated on the *market prediction* of their students' performance, not the students' actual performance. Doesn't this create a perverse incentive for teachers to do things that look to market participants like they help, even if they don't actually help? Even if you accept that market participants will never be fooled (which seems optimistic; even in the real stock markets, companies do often try to fool analysts, e.g. https://www.bloomberg.com/opinion/articles/2021-05-04/under-armour-earnings-were-a-bit-misleading), you still have the problem that there *is no incentive* to do anything that's not visible to market participants, no matter how much you think it will help.

I doubt these markets could exist, but if a race of super-intelligent aliens showed up and the only technology they were willing to give us was ridiculously accurate prediction markets, you would create conditional markets on how well each student does with each teacher, and then assign the students based on that, getting rid of those that aren't helping any students.

If you need something tangible to be measured in the real world, it could be AP test scores.

What I mean is, how can you have a prediction market on something that you're not actually going to resolve? Presumably if there was a conditional market on "teacher X with student Y", predictors aren't going to be predicting "what will the performance of student Y be if we assign them to teacher X"; they're predicting "what will the performance of student Y be, conditional on the rest of the market predicting in such a way as to cause Y to be assigned to X?" So what's being predicted is actually somewhat self-referential.

That seems to me like a problem with a lot of these kinds of ideas, like Robin Hanson's "dump-the-CEO market": http://mason.gmu.edu/~rhanson/dumpceo.html. If I understand correctly, if I buy "stock conditional on CEO dumped", I'm betting that if the CEO is dumped the stock will rise, but the bet is only resolved if enough other people agree with me that the market ends up predicting the stock will be better if the CEO is dumped. Are there any sources that actually try to work out what the strategy is for these sorts of self-referential markets? I've read some of Hanson's papers, but I haven't seen him really address the self-reference/conditionality issue; he just sort of assumes that people will bet as though the decision would be made by some independent process.

I'm a conspiracy theory enthusiast, but I don't recall the specifics of the Dath Ilani conspiracy theory. Is it the one where superior people from a more rational reality come over here and infiltrate our society in order to cause evolutionary uplift? I thought that was just one of Eliezer's thought experiments; is he now claiming it to be genuine?

Could somebody please tell me more? I'm glad that prominent members of the rationalist community are finally getting into conspiracy theories a bit more because they're great fun and really brighten up people's boring days. And since spreading them is very low-cost and carries just great emotional benefits, obviously it's the height of Effective Altruism. ;-)

The highlight of the Steinhardt post: by 2025 "forecasters predicted 52% on MATH, when current accuracy is 7% (!)" for performance of AI on free-response math problems expressed in natural language. The example problems were generally too hard for me to do in my head. A Berkeley PhD got 75%.

If this is right, it seems to me we ought to expect very good Codex-style automatic programming too in roughly that timeframe, at least for the kind of problems you get in a coding interview, if not on a larger scale.

Note that there is currently a flaw in the title for the Metaculus question about Robin Hanson's Twitter poll. Scott correctly interpreted the community distribution, but many predictors on Metaculus were confused (as it was unclear whether you were predicting "minutes" or "hours").

What's going on with the 2024 US Presidential election market? https://www.betfair.com/exchange/plus/politics/market/1.176878927

Looks to me like they're greatly overestimating the chances of another Trump win, while also overestimating the chances of a Democrat who isn't Biden or Harris.

On teacher evaluation, why do we assume teachers have to be treated differently/more objectively than other professions? School administrators are far from perfect (like all bosses and indeed other humans). I struggle to think of other professions where we demand such a super-objective system and so highly distrust supervisors to evaluate employee performance. Will all supervisors do this well and fairly? No, but this is true for nearly every other profession. Even other public employees have annual performance plans that are approved by their supervisors and then rated according to the performance plans. Public employees typically have appeal rights for adverse actions (including poor evaluations). Although the systems are often sclerotic, public employees can be rated poorly and miss out on performance bonuses, promotions, etc.

I’m open to the argument that public employers have fewer incentives than private employers to value good performance, but this is an argument for holding school administrators, supervisors, school boards, etc. responsible for the success of the school not for taking teacher performance evaluation out of the hands of school administrators.

It depends on whether a) good vs average teachers actually make a substantial difference in how well students do over the medium to long term, b) teachers could actually do a better job if there were stronger incentives in place, and c) whether or not it's a zero sum game -- could we have higher quality teachers if they were compensated better and acknowledged more, or not?

As far as I can tell, teachers are already trying about as hard as they can, given the constraints of their situations, including things like needing to pick up kids as soon as school gets out (though there are absolutely some abusive people teaching who should definitely be fired, but that isn't really about test scores); there isn't that much room to produce significantly more excellent teachers, and they affect student experience a lot, but test scores not that much. So I'm in favor of more traditional evaluation methods, and accepting the current level of teacher quality. But it makes sense for someone who believes the opposite on all those points to be willing to invest a lot more in teacher evaluation.

I work in a very expensive private school. Nobody has tenure and people get laid off every year because their performance is not up to par. Meanwhile, in public schools, the administrators can't fire bad teachers. We don't need complex metrics in order to improve schools. Instead of giving these generous retirement packages, we should pay new teachers better. We should give more room for teacher autonomy and at the same time allow administrators to fire bad teachers. Then we should do away with the arduous teacher credentialing programs, as these are not correlated with better teacher performance.

"I work in a very expensive private school. Nobody has tenure and people get laid off every year because their performance is not up to par."

Same here, although we're small so we don't lose staff *every* year. But yes - I generally agree with what you say, to the extent that if public schools were funded at the level of private schools they'd be able to attract enough candidates to fire the bad ones - and we don't need complex metrics to figure out who those are (although my school uses Stronge, which is pretty complex).

I'd add, though, that there are ways to train teachers which result in better performance and better retention, and I think the public sector should invest in these types of programs, on the basis that it would seem to be more efficient to hire average teachers and then train them to be great than to keep cycling through teachers until they happen upon enough great ones by attrition.

> people get laid off every year because their performance is not up to par.

How is performance measured, though?

Performance is measured by how much the administration likes you. How else would you measure it? Kidding aside, teaching is a public act. Student performance on tests etc is one piece, but that is taken with a huge grain of salt. Do students think you are a good teacher, challenging them, using class time effectively? Even if they think a teacher is a jerk they tend to acknowledge if they are effective at teaching. Then teachers observe each other and look at others' curriculum and examples of student work. There are formal observations too, but the main metric is student engagement and success in learning. Tests try to measure this, but the test data needs to be coupled with human observation by people who know the students and context.

May I ask what the calibre of students in this school is? That is, are they all ranges of ability and psychological/behavioural problems, or does the school select for smart kids with no additional needs? I think it does make a difference to how you evaluate whether Teacher Smith is better than Teacher Jones if you have a class of thirty bright, attentive, well-adjusted kids versus a class of thirty kids of mixed ability with at least one trouble-maker.

Definitely all ranges and abilities. There is almost no such thing as an elementary school, or rarely even a high school with only smart well-adjusted kids. Only elite universities have the kind of selection criteria that would result in such an academically homogeneous group. Go to any rich suburb with educated parents and you will find that the kids of these parents are diverse in their abilities and not all super smart and engaged.

Step 1:

Produce a really bad comment.

Step 2:

Bet that it gets banned

Step 3:

Profit

But who will bet against you?

Maybe the trick is to make comments with ridiculously offensive hidden messages that are not obvious at first glance but which you can't unsee once they are pointed out.

I'm confused by that for the standard case. Where is the money coming from to pay out people who highlight bad comments?

>An upvote invests $1 (Vitalik says 1 ETH, but the post is from 2018 and maybe he didn’t expect that to be worth $4325) in a prediction of “Scott won’t ban this”. A downvote invests $1 in a prediction of “Scott will ban this”.

What if I think something is a terrible comment that shouldn't appear high up on the page, but I don't think it deserves banning?

I feel like you are getting your characterization of the Keynesian beauty contest as a solution to the prediction problem precisely backwards. While it might solve the problem in practice, it does not solve it in theory.

In theory, any prediction is an equilibrium - it works if we all coordinate on it - and no equilibrium is stable by any definition I am aware of - or perhaps all are (I think the Mertens stable set is the set of all equilibria; I'm asking a more knowledgeable friend). There is certainly not a prediction that is singled out by the formal description of the game, which is what I would call the theory. If the truth is a Schelling point - a dubious proposition if the truth is not well established - then that is a psychological observation, not a game theoretic one. In other words, if Keynesian beauty contests solve the problem, it is because of highly contingent factors. It might work in practice, not in theory.
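
The claim that any prediction is an equilibrium can be checked in a toy version of the game. The sketch below assumes a particular reward rule, namely that each player is paid for being close to the average of the other players' guesses; the post's exact scoring rule is not given in this thread, so `payoff` and `is_equilibrium` are hypothetical illustrations, and only a finite set of deviations is tested.

```python
from statistics import mean


def payoff(my_guess, other_guesses):
    """Reward is higher the closer my guess is to the average of everyone else's."""
    return -abs(my_guess - mean(other_guesses))


def is_equilibrium(common_guess, n_players=5,
                   deviations=(-100, -1, -0.01, 0.01, 1, 100)):
    """True if no single player gains by moving away from a guess shared by all."""
    others = [common_guess] * (n_players - 1)
    baseline = payoff(common_guess, others)
    return all(payoff(common_guess + d, others) <= baseline for d in deviations)


# Whatever number everyone coordinates on, deviating alone never pays.
for shared in (0, 37.5, 1000):
    print(shared, is_equilibrium(shared))
```

Nothing in the payoff function refers to the truth, so the formal game treats 0, 37.5 and 1000 identically. Any pull toward the true value has to come from outside the game, which is the psychological rather than game-theoretic point made above.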

On GPT Codex: The question becomes slightly confounded by the question of who is a programmer. If GPT Codex works, I'll use it: I can't program, but there's an app I'd like to make (am not working on because of lack of technical skills). The definition of programmer may just shift to "those who do the programming that GPT Codex can't do".

I'm a programmer, and I use ReSharper's Shift-Space autocomplete and Alt-Ins code generation all the time (kids, ask your grandparents about ancient programming tools!). If GPT Codex can give me better autocomplete accuracy, then I'd be all for it. I don't expect it to write entire programs for me, though.

I'm a software engineer on sabbatical with experience making software that gloms onto existing websites and an interest in prediction markets.

I'd be keen to whip up a prototype to use as an experiment on a post! Shoot me an email at jarred.filmer at gmail dot com if you'd like to give it a go 😊

Prediction markets don't work very well in a society as economically unequal as ours.

Side note on Moderation: Reddit's problem isn't just that it's vulnerable to cliques. It's that, well, for whatever reason, it's one of the ruder places I encounter. Downvoting trolls doesn't seem to have kicked in. Similarly, StackOverflow noted that regulars are pretty rude to newcomers and put in a new policy of asking everyone to be nice. It's still pretty challenging for timid newcomers working with languages they don't understand yet.

Not sure what epoch of StackOverflow you are referring to. The latest brouhaha a couple years ago was a disaster. The company handled it very badly and many old moderator hands turned in their badges or left the site entirely. As for being rude, it is unfortunately necessary, to discourage freeloaders. The amount of time regulars can spend on answering questions is limited, and nobody pays them to do it. They have to deter low-effort posting or else worthwhile questions will be inundated by "solve my assignment" and "my codes do not work". From my experience, rudeness to askers who make an effort and aren't being rude themselves is very rare. Much of the rudeness newcomers complained about apparently consisted in regulars commenting "you asked a bad, low-effort question, please read the FAQ on asking good questions first" on bad, low-effort questions. I'm not sure swaddling this core message in soothing verbiage would improve matters.

By the time students graduate and get accepted to college they have had 30 or more teachers. I don't see any way you could use grad rates to evaluate individual teachers. The average class size in a US elementary school is around 20 kids. Is that going to provide enough data to offset confounds like: this teacher got more difficult students, they have the kid with Down syndrome in their classroom, they got the smaller classroom by the noisy gym, they have fewer students than their colleagues?

I would look at how much gain in student achievement a "good teacher" gets, and then see how evident those gains would be in a sample of only 20. How easy would it be to offset those gains by giving them difficult students and other confounds?

Some teachers are martyrs and they like helping difficult kids, are willing to take one for the team and have the smaller classroom etc. Tying their pay to data disincentivizes this generous behavior. Instead you would be encouraged to look at your students, and think about who provides maximum potential gains in the metrics, and then focus on them.

We oughtn't throw more standardised testing at kids than is necessary. Yes, it's needed to ration access to higher education and for hiring but this proposal seems to be to force teachers to teach to the test at the expense of everything else.

Teaching to the test is fine, if the test is good enough. There's a reason why the idea of "backward design" is big in education.

I disagree to some extent. Even if the tests are as good as we can reasonably make them, then optimising too hard towards the test still gives you a suboptimal education.

My favourite teachers I remember from school were the ones who were clearly knowledgeable enough on their subjects to go off on interesting tangents when the opportunity called for it, and to give you a glimpse of the vast store of knowledge lying beyond the K-12 curriculum. I have very clear memories of one maths lesson where the teacher (a grumpy old curmudgeon) spent half the lesson talking about transfinite numbers and Cantor's diagonal argument. It had no relationship to whatever we were supposed to be studying, but it was my first glimpse of non-boring mathematics.

The story about #tweets is an example of why I'm a lot less bullish on prediction markets as a silverish bullet today than a couple years ago.

It's relatively easy to manipulate the outcome of narrow prediction markets, be that the [assassination of a particular individual](https://en.wikipedia.org/wiki/Assassination_market), the [chance of a streaker during the Super Bowl](https://www.insider.com/super-bowl-streaker-bet-on-himself-prop-bet-2021-2), or the number of tweets by an individual in the last two weeks.

> GPT Codex is an AI that auto-completes code for programmers. You can see a really amazing and/or rigged demo here:

I use GitHub Copilot (as I understood, it's basically the same model, it's GPT-3 tweaked with additional "RAM"). I have to say the demo is not rigged at all. Once I learned how to seed it, it produces very correct 10-15 lines of code I'd say 50% of the time, and mostly correct the other half.

I use it to write tests, and obviously nobody likes writing tests, but it became very fun now – I try to write the prompt so that Copilot creates a correct test case on the first go.

One of the most mind-blowing examples for me was when I wrote code for SVG image and I typed:

// output an image with old paper look

AND IT blasted out some distortion filter that did exactly that: it created a randomized outline around the image.

Of course that's cherry-picking and that could have been a copy-paste example from the internet, but still it's impressive how deep it can go.

It goes without saying that's a lifesaver for some file or JSON operations, for data manipulations, for-each loops etc.

Say I want to open each file from a directory, parse its content, and then output it as JSON into one file, where the keys are the filenames and the values are the file contents. Copilot doesn't even stutter and nails this every time.
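
For readers who have not seen the kind of boilerplate being described, a hand-written version of that task might look roughly like this. It is a sketch in Python, not Copilot output; the function name `directory_to_json` is made up, and "parsing" is reduced to reading each file as UTF-8 text.

```python
import json
from pathlib import Path


def directory_to_json(src_dir, out_file):
    """Write one JSON object mapping each filename in src_dir to its text content."""
    contents = {}
    for path in sorted(Path(src_dir).iterdir()):
        if path.is_file():  # skip subdirectories
            contents[path.name] = path.read_text(encoding="utf-8")
    Path(out_file).write_text(json.dumps(contents, indent=2), encoding="utf-8")
    return contents
```

Writing the output file somewhere outside `src_dir` avoids picking it up as an input on a second run.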

So I'd say the number is underestimated. As soon as people try it, and learn how to get value from it, it'll be irreplaceable.

P.S. Another fun story: I needed a URL example and put something like "use random URL for this test". Can you guess which URL it used, without opening? https://www.youtube.com/watch?v=dQw4w9WgXcQ

Github Copilot is very good at copy-pasting buggy and insecure code, making it *worse* than useless.

https://gist.github.com/virtadpt/98793f6b0474100f5bafc54e7b5e34a9

That's a nice critique, but I never use it without reading the code. In a way, if you blindly copy code from Stack Overflow, you're mostly open to the same issues.

It's true that Stack Overflow at least has votes and reputation, but the relevance of the copied block of code is still up to you.

Anecdotal evidence from the counterside: when I was using blockhash as a source of randomness in Solidity, Copilot added a comment saying:

// TODO: use more secure randomness source

> Vitalik didn’t end with “and we should also replace all lower courts with prediction markets about what the Supreme Court would think”, but I’m not sure why not.

I think this system allows you to risk money to get a ruling in your favour. With comments it's not that bad. I can bet a lot of money on my comment to not have it banned. And it sounds like if it doesn't get reviewed (because it got a good rating because I bet a lot on it not getting banned) then I get it back after a while.

With actual court cases it's a much bigger problem. Because the people involved in a case can gain more by winning the case than just the money they stake on the prediction market, they might want to bet at odds that are not exactly right. And the Supreme Court doesn't have much capacity to look at cases, so the vast majority of bets wouldn't resolve.

The situation with insider trading isn’t quite as bad as all that with Keynesian beauty contests. If you have some private but verifiable-if-made-public information, you can predict higher than the usual averages, then publish the info to whatever forum the predictors use, send it to news agencies, or whatever. You’ve bought up all the shares incorporating your information on the cheap, the other predictors see the information and update, and the shares you bought cheaply get the big payout. Even if broadcasting the information widely enough that the other predictors see it is costly, you can pay for it out of your expected profit and still come out ahead.

There is an issue where information that’s either hard to broadcast or hard to verify is less likely to be incorporated, but that seems at worst ambiguously bad. There will be some people with genuinely good evidence that’s just illegible for whatever reason, but there will also be people whose “evidence” is the product of a delusion or flight of fancy. I don’t know what sorts of relative magnitudes we’re looking at there, though, and that is the crucial variable.

Your paragraph about SpaceX is a bit incorrect because growth in market value is not the same thing as return. Suppose SpaceX is worth 100B and raises 100B: the market value of SpaceX doubles but the wealth of existing shareholders is unchanged.
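
The arithmetic behind that example can be written out as a small Python sketch. It assumes the new shares are issued at fair value with no fees, and the function name `post_raise_position` is only illustrative.

```python
def post_raise_position(pre_money: float, new_cash: float) -> tuple[float, float, float]:
    """Return (post-money value, existing holders' ownership share, value of their stake)."""
    # The company is now worth its old value plus the cash it just received.
    post_money = pre_money + new_cash
    # Existing holders are diluted to their pre-money fraction of the larger company.
    existing_share = pre_money / post_money
    existing_value = existing_share * post_money
    return post_money, existing_share, existing_value


# The comment's numbers: a company worth 100B raises 100B.
post, share, value = post_raise_position(100e9, 100e9)
print(f"post-money: {post:.0f}, existing share: {share:.0%}, existing value: {value:.0f}")
```

The market value doubles to 200B, but existing holders now own half of it, so their stake is still worth 100B: growth in market value without any return.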

About teacher pay, etc.: I have a related question. What made (public) school so horrible in the first place, and being a teacher so stressful and dangerous?

I don't know tons of schoolteachers, but in conversations with the half-dozen acquaintances who are, I've learned there are basically two things. First, there's chasing accreditations and test scores over giving autonomy and freedom of action to local administrators to use their judgment of what's best (a kind of de-localization). Second, in many schools, kids are just straight-up dangerous, from broken or destitute families with little concern for their education and future, and teachers have little recourse to restrain them. It seems like things were okay until the 90s, and then by the early 00s the freefall began in earnest.

I (weakly) believe that there is enough random variance in student performance over a year, and that student performance is robust enough to bad teachers, that building a model that predicts teacher performance might be impossible.

Well said. I have the same belief. You might be able to identify exceptional teachers and very lousy ones, but you don't need a model for that anyway.

I may have a tip on fusion predictions. On r/fusion, a Metaculus prediction was just posted with an average guess of 2041 for a fusion-only reactor to deliver 100 MW of net electricity.

Commonwealth Fusion Systems plans to build a high-field pilot plant in the early 2030s that I assume would be hooked up to the grid, with a projected output of 200+ MW.

They are currently building a smaller demonstration plant for 2025 to validate the physics. It is a tokamak, the design with the largest track record of experimental results, and it is expected to hit a plasma Q of 10.

Since the larger plant is apparently conservatively listed at a Q of 13 and a net Q of 3 while putting out those 200+ MW, it seems likely that this prediction will pan out about a year after the larger plant is built.

If we pay teachers based on the progress of their students, then students can easily blackmail their teachers. "If you won't let me play with my smartphone during your lessons, I will intentionally screw up the exam, and you can kiss your salary goodbye."

Like, get a C instead of an A, so that you punish your teacher, but don't ruin your own career. Especially if you can coordinate with half of the class; then it will totally seem like the teacher's fault.

It seems like if a student blackmails their teacher, it would be pretty easy for the teacher to invoke a rule that says "if a student blackmails a teacher, we won't include their data in the scoring and also they'll be expelled and fined".

"All the kids who got Cs were trying to blackmail me."

Plausible deniability. I am not blackmailing anyone... I am just *saying* that if I am not allowed to regularly check my social networks, I get uneasy... and when I get uneasy, I may have a *problem* focusing on your stupid math test... that's all I am saying. I am just commenting on a psychology of a teenager *in general*, and if you expel me for *that*, my parents' lawyer will want to talk to you.

Also, if I hate a teacher for any reason, I can coordinate with my classmates to punish him financially outside the school, leaving no proof.

Blackmailing a teacher is a public good. The effect of the student's vigilantism is dispersed among all the students, while he absorbs all the costs of getting a really poor grade. And you might even decide to exclude outliers from the measurement altogether. If so, only a coordinated effort involving multiple students could get around that. But coordinated blackmail leaves a much bigger paper trail and is harder to keep secret. And you may even decide to exclude outlier classes or outlier years altogether to get a better long-term average; you wouldn't want to base a teacher's salary on the performance on one single class in a single year anyway.

"you wouldn't want to base a teacher's salary on the performance on one single class in a single year anyway."

Teachers move grade levels, change schools, teach multiple subjects, and have students coming from different feeder schools. No two years are the same. How do you deal with these confounds when deciding which teachers have more merit?

What do you mean? None of that makes it trickier. If a teacher has 3 classes per semester you could just average the performance of those 3 classes. You could maybe do a 4-semester average (average out the results of past 4 semesters they taught, perhaps weight it towards more recent performance if you want), and then you throw out the top/bottom 5% outliers to get around the blackmailing problem.
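
A rough Python sketch of the scoring rule proposed above, under assumptions the comment leaves open: the trimming is applied to per-student score gains pooled across a teacher's classes within each semester, and the recency weights `(1, 2, 3, 4)` are an arbitrary choice. All names here are hypothetical.

```python
from statistics import mean


def trimmed_mean(values, trim_fraction=0.05):
    """Average after dropping the lowest and highest trim_fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number dropped at each end
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return mean(kept)


def teacher_score(semesters, weights=(1, 2, 3, 4), trim_fraction=0.05):
    """Recency-weighted average of per-semester trimmed means.

    semesters: list, oldest first, of lists of per-student score gains
    (all of the teacher's classes in that semester pooled together).
    """
    recent = semesters[-len(weights):]
    used_weights = weights[-len(recent):]  # most recent semester gets the largest weight
    semester_means = [trimmed_mean(s, trim_fraction) for s in recent]
    total = sum(w * m for w, m in zip(used_weights, semester_means))
    return total / sum(used_weights)
```

With 20 students per semester, one student at each extreme is dropped, so a single deliberate failure does not move the score. With very small classes `k` rounds down to zero and nothing is trimmed, so the rule offers no protection there.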

My point is that teachers tend to change subjects and teach multiple subjects. You might teach as many as 4 or 5 different courses in high school. Maybe you teach a 1-year algebra course, a 1-semester geometry course, and a 1-year business math course. Then the next year you have slightly different classes. So for many teachers you can't do a 4-semester (2-year) average on scores. And at a small school you could have 6 kids in one of your classes.

I would have thought that if the student is not paying attention in class but is playing on their phone, then they will fail the exams anyway. If they can skive off in class but still pass the exams with good results, then the teacher's performance has little effect and you're not really measuring "if Billy gets an A on the test, is that because Mr Brown is a great teacher or is it because Billy is smart (but lazy)?"

So before ever you start paying Mr Brown by how Billy scores on tests, you need to know is Billy really good at this one subject, is Billy generally smart, or is it down to Mr Brown that Billy is doing better than when he was taught by Mrs Jones?

School is populated by kids with very immature prefrontal cortices. Experimenting with power plays against the system for no rational gain sounds like typical adolescent behavior.

Scott, you are really going crazy with this prediction market stuff.

In an unsubsidized market, for every long-term winner there is a long-term loser. In your prediction market utopia, WHO WILL BE THE LONG-TERM LOSERS?

If you want prediction markets for comment moderation, or court decisions, or student performance, and if you're not subsidizing these substantially, you MUST answer this question.

WHO WILL BE THE LONG-TERM LOSERS?

No long-term losers means no long-term winners, which means the equilibrium will be "nobody bets in your prediction markets".

You can have many short-term losers who go away.

Also, in this case people may be happy with the moderation outcome and stay active despite zero income.

I do nothing about the technical details of prediction markets where actual money is involved. But, a question.

Can one make ownership of long-term prediction bets (or whatever name the instrument should have) transferable? A large enough correct prediction bet about an event predicted to happen 50 or 100 years in the future should be valuable property worth something in 50 or 100 years, similar to stocks or a house, and could be passed on to the next generation like stocks, a house, or other property.

People don't treat any other property as a joke, even though they are not going to be around 100 years later.

This won't solve the problem with predictions about events that involve the collapse of the financial system, when the prediction payout depends on the financial system existing. To that end, has anyone suggested something like the following:

Party A predicts that nuclear annihilation will happen within 20 years. Party B predicts that nuclear annihilation will happen only after 50 years. Rough sketch: they could create a two-part loan-like instrument, where the holder of part B agrees to pay the holder of part A x units of money per year for the first 20 years, on the condition that A pays y units of money per year for years 20 to 50. The amounts x and y are adjusted for the parties' respective confidence at the moment of signing the contract. Possibly involve collateral worth 50 - 20 = 30 years times y. Ownership of both parts of the instrument should be transferable, so that party B, if they expect not to be around in 20 years to collect winnings for reasons unrelated to nuclear armageddon, should be able to sell it to someone who expects to be around to collect the winnings.
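
To make the cash flows of this proposed instrument concrete, here is a small Python sketch. It ignores discounting, collateral, and default risk, and assumes payments simply stop after the last year in which anyone is around to pay. The function name and parameters are hypothetical.

```python
def net_to_party_a(x: float, y: float, last_year_paid: int,
                   switch_year: int = 20, final_year: int = 50) -> float:
    """Undiscounted net amount the holder of part A has received by last_year_paid.

    A receives x per year for years 1..switch_year, then pays y per year
    for years switch_year+1..final_year.
    """
    years_receiving = min(last_year_paid, switch_year)
    years_paying = max(0, min(last_year_paid, final_year) - switch_year)
    return x * years_receiving - y * years_paying


# With x = 3 and y = 2, A is ahead if the world ends early and breaks even at year 50.
for year in (10, 20, 35, 50):
    print(year, net_to_party_a(3, 2, year))
```

The ratio of x to y is where the two parties' confidence would show up: the more strongly B believes in survival past year 20, the larger an x B should accept for a given y.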

*I do nothing = I know nothing

You don't actually need to use real money for the prediction market system - bet something similar to Reddit karma instead.

You accumulate karma by making comments that don't get banned, or by betting correctly on comments that do get banned. You lose karma by losing these bets. There is also a counter of the number of comments posted, to compare against the current karma balance, as well as account age.

At a glance, you can distinguish between:

Lurkers who do a lot of your content moderation and are good at it.

Someone who makes a lot of rule-breaking comments.

Someone who makes a lot of comments and is also doing a lot of good moderation.

Next you need an algorithm that can weigh all of these factors when considering the sway of each individual vote: votes from users with a high karma-to-comment ratio are strongly favoured over low-karma votes.

The trick is that karma should be visible to the user, as an incentive to actually use the system. Another incentive could be participation in the comments of controversial posts, which would motivate a decent chunk of this blog's audience IMO.
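
One possible shape for the weighting algorithm, as a Python sketch. The comment does not specify a formula, so the ratio `karma / (1 + comments_posted)` is purely an assumption, account age is left out, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    karma: float
    comments_posted: int


def vote_weight(user: User) -> float:
    """Weight a vote by the user's karma-to-comment ratio (assumed formula)."""
    if user.karma <= 0:
        return 0.0  # users with no karma left get no say
    # The +1 avoids division by zero for pure lurkers.
    return user.karma / (1 + user.comments_posted)


def weighted_ban_score(votes) -> float:
    """votes: list of (User, bool) pairs, True meaning 'this comment should be banned'.

    Returns the weighted share of 'ban' votes, between 0 and 1.
    """
    total = sum(vote_weight(user) for user, _ in votes)
    if total == 0:
        return 0.0
    return sum(vote_weight(user) for user, ban in votes if ban) / total
```

Under this formula a lurker with 100 karma and no comments outweighs a prolific commenter with 5 karma by roughly a thousand to one, which may be more extreme than intended; a logarithm or a cap would soften it.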

Tying a person's pay to someone else's performance does not seem fair.

In which fields does this happen? I can think of sports coaching. Anywhere else?

Any field that requires teamwork would presumably have a fair bit of this. Also probably most management positions. Of course, that’s assuming that pay is in any way tied to performance, which is rarely the case.

Welcome to the manager's world, or more precisely everyone who works in a team and isn't entry-level.

Managers typically have power over their subordinates to enforce their will, though. Many teachers (especially in public schools) have almost no power to stop a student who decides to flunk a test for funsies.

> DARPA investigates how prediction markets do vs. expert surveys when guessing the results of social science studies. Answer: neither of them does well.

Given only 20-40% of social science studies survive replication, I'm not sure how reliable this result is.

Err, that is "survive an attempt at replication".

I got a prototype for a comment moderation prediction market working 😁! I've emailed you the details.

The major reason that merit pay won't work (even with good value-added metrics) is that there's very little variation to measure. In the United States today, within the same school system, students do not learn noticeably more or less from one teacher than another. Partly that is because they don't learn much at all; for most students, most of the time, things are learned, used for a test or project, and then forgotten, a process that begins the next hour and is largely complete in a year. But mostly it is because 1) most teachers are similarly good at presenting testable information, and 2) how much is remembered depends mostly on the student: how smart the student is and how interested.

It is certainly true that some teachers are more interesting. Some are nicer. Some have a better voice or personality. Some present things in a more sophisticated or challenging way. All that may make a big difference to how enjoyable a class is. But for the vast majority of students, it won't make much of a difference in how much is remembered or how much can be used after the class is over.

What would happen if we set a question on, for example, how much a stock or crypto would cost in 3 years? Would the live price closely follow the predicted one, or would the predicted one fluctuate along with the live price?

These are called forward markets, and they generally exist; if they're exchange-traded, you know them as futures. The correct answer is that if they're hedgeable (as they would be for a stock or a crypto), the forward price is simply the current price plus or minus the cost of carry. The simplest example is a crypto or stock that pays no dividends: the forward price is the current price plus the cost of borrowing cash to buy the underlying and hold it for the agreed period. I.e., if it's a 1-year contract, the rate of interest is 1%, and the underlying trades at 100, then the forward should trade at about 101.

It's an arbitrage condition, and there is no interesting information in the price (beyond the market-estimated cost of carry).
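
The cost-of-carry relation can be sketched in a few lines of Python. This is the standard textbook cash-and-carry argument with simple (non-compounded) interest and no transaction costs, not a model of any particular exchange; the function names are illustrative. Under this argument the financing cost is added to spot, so a 1% rate on an underlying trading at 100 gives a 1-year forward of about 101.

```python
def forward_price(spot: float, annual_rate: float, years: float,
                  dividend_yield: float = 0.0) -> float:
    """Fair forward price: spot plus financing cost, minus any income from holding."""
    return spot * (1 + (annual_rate - dividend_yield) * years)


def cash_and_carry_profit(quoted_forward: float, spot: float,
                          annual_rate: float, years: float) -> float:
    """Profit per unit from borrowing cash, buying spot, and selling the forward.

    A positive result means the quoted forward is too high; a negative result
    means the reverse trade (short spot, lend cash, buy the forward) pays instead.
    """
    return quoted_forward - forward_price(spot, annual_rate, years)


print(forward_price(100, 0.01, 1))
print(cash_and_carry_profit(103, 100, 0.01, 1))
```

Because either mispricing can be traded away, the forward tracks the live price mechanically rather than expressing an independent forecast.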

Better than prediction markets is the market. See driving schools. You only open one if you feel sure you'll be a good instructor/manager, and you take very good care that the instructors you hire do their job well. Otherwise your school is gone very soon indeed. Potential customers look at the reviews or ask around, so there's no big worry about not getting actual instruction in driving. Dath-Ilan probably has vouchers, though even poor Indians manage to just pay up for real instruction: https://www.economist.com/asia/2018/10/11/indian-states-are-struggling-to-lift-public-school-attendance (the first 9 sentences are free to read, and they're all you need to know).

Prediction markets don't exist because no one cares. The two big prediction markets in the world, i.e. financial markets and sports betting, exist because either they are economically useful (and, importantly, can be hedged by a market maker) or people care about the outcome being gambled on. That's also why the largest 'prediction market' event is US elections.

On the Cowen article asking why there aren't markets to bet on home price appreciation: futures markets for this literally exist and are available to be traded on the CME, but they have no liquidity because you can't hedge them and no one cares.

> What if you promote teachers whose students tend to gain many points on their (relative position in) test scores compared to last year? This is the idea behind value-added models [...] Various studies show this works much less well than you would think.

I mean, I immediately think this is a terrible idea, so I'm not sure how much worse it can get.

An average teacher can game the system in this model by just selecting students with (high test score variance and) low (read: below-mean) test scores, and relying on regression to the mean. (It's not _quite_ that simple, because most of the time a teacher can't directly select which students they are teaching... but there are indirect methods & correlations.) Of course, this would show up in correlations between teacher ratings and _past_ performance of students...

> [...] a teacher’s VAM can apparently predict their students’ past performance [...]

...oh.
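
The regression-to-the-mean effect described above is easy to reproduce in a toy simulation. The Python sketch below assumes each test score is a fixed student ability plus independent noise of equal size, and that the teacher has zero effect; all names and parameters are made up for illustration.

```python
import random
from statistics import mean


def gain_of_low_scorers(n_students: int = 10_000, bottom_fraction: float = 0.2,
                        seed: int = 0) -> float:
    """Mean year-over-year gain of last year's lowest scorers, with no teacher effect."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 10) for _ in range(n_students)]
    # Each year's score is true ability plus fresh, independent test-day noise.
    last_year = [a + rng.gauss(0, 10) for a in ability]
    this_year = [a + rng.gauss(0, 10) for a in ability]
    # "Select" the students who scored in the bottom fraction last year.
    cutoff = sorted(last_year)[int(n_students * bottom_fraction)]
    gains = [this_year[i] - last_year[i]
             for i in range(n_students) if last_year[i] < cutoff]
    return mean(gains)


print(gain_of_low_scorers())
```

The selected students show a clearly positive average gain even though nothing about the teaching changed, because part of last year's low score was bad luck that does not repeat.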
