Yeah, I was surprised that Scott didn’t mention the fact that they were teaching classes half-full of unmedicated ADHD kids. That has to change the dynamic.
Aye, teaching something really is the best way to learn it, as you end up having to examine it from a lot of different angles to get the ideas to stick in someone else's mind.
When Scott Young branded his method the “Feynman Technique” he was thinking of “A Different Box of Tools” from Surely You’re Joking Mr Feynman. Yet that story doesn’t have much of anything to do with the technique.
I guess it worked though, because I can't Google "Feynman" without seeing "Feynman Technique" everywhere.
It's not surprising if differences in personality, intelligence, and memory specifically mean that different people might have very different steady states. This is why those handwavy anti-intelligence arguments of "well, we could teach those kids calculus if we just tried really hard and for long enough!" don't work, because they require ignoring the existence of forgetting.
Yes, I think retention is the important part here. If Scott took a vocabulary test and didn't get 100% on it, it seems safe to presume that he _understood_ all of the words and simply _forgot_ some.
The "Why Do Test Scores Plateau" mentions this (including the research around spaced repetition), but in it Scott states he remains confused about the role of intelligence, noting that neither 'intelligent people are more intellectually curious and get reminded of things more' nor 'intelligent people have better memories' seems sufficient to explain the difference. I suggest intelligence—the ability to identify patterns and integrate knowledge into a coherent whole—serves not merely as a "network effect", but as _compression_. If you understand chess well, you are better at memorizing game configurations (because you can chunk them into underlying structure); if you understand math well, you are better at retaining formulae (because in a pinch you can just re-derive them); if you understand chemistry well, you are better at memorizing reactions (because you can be guided by general principles); if you understand etymology well, you are better at understanding words (because you can often deduce their meanings from their roots), and so on. The greater the intelligence, the fewer bits of knowledge have to be independently retained, and the greater the performance at equilibrium forgetting.
If this is so, increasing attention might help learning a _little_, since it increases the number of repetitions you're likely to catch and be reminded by, but not much, since if it doesn't also increase intelligence it doesn't improve compression.
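A toy model may make the equilibrium argument concrete. This is only a sketch under invented assumptions, not anything from Scott's post: a learner picks up `learned_per_day` independent items, forgets a fixed fraction `forget_fraction` of whatever they currently hold each day, and each retained item lets them reconstruct `facts_per_item` facts (the "compression" factor). All three names and all the numbers are hypothetical.

```python
def steady_state_items(learned_per_day, forget_fraction, days=5000):
    """Iterate held <- held + learned_per_day - forget_fraction * held."""
    held = 0.0
    for _ in range(days):
        # Gain a fixed number of items, lose a fixed fraction of what is held.
        held += learned_per_day - forget_fraction * held
    return held


def effective_knowledge(learned_per_day, forget_fraction, facts_per_item=1.0):
    """Facts usable at the plateau: retained items times a compression factor."""
    return facts_per_item * steady_state_items(learned_per_day, forget_fraction)


# Hypothetical learners: same forgetting, different intake or compression.
baseline = effective_knowledge(10, 0.01)
more_attention = effective_knowledge(12, 0.01)
better_compression = effective_knowledge(10, 0.01, facts_per_item=3.0)
```

In this sketch the plateau sits at `learned_per_day / forget_fraction`, so a 20% boost to intake lifts the plateau by 20%, while tripling the facts recoverable per retained item triples it. Whether real memory behaves anything like this is, of course, the open question.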
> if you understand math well, you are better at retaining formulae (because in a pinch you can just re-derive them)
This clicked for me--especially in the context of the vocabulary test discussion. I'm a terrible math student--in that I rarely get the right answer. But I'm considered "really good" at math--I understand the underlying structure and can re-derive equations, *which doesn't solve the problem of basic addition and subtraction errors.* Tests that merely check for the right answer don't catch that people like me know what we're doing, because those are about guessing the password rather than demonstrating actual knowledge.
> If this is so, increasing attention might help learning a _little_, since it increases the number of repetitions you're likely to catch and be reminded by, but not much, since if it doesn't also increase intelligence it doesn't improve compression.
Yes! We have yet to come up with good techniques for teaching compression and pattern identification/integration. We kind of just rely, as a species, on students' native software and hardware being up to snuff.
Personally, I think the closest we can currently get is figuring out a reliable method to teach curiosity. People seem to get better at compression when they seek out and synthesize information for themselves.
Yes, I don't see how not disrupting the class is fairly described as paying attention. They're clearly not the same thing at all. I wonder if Ritalin isn't so much helping kids pay attention as just helping them sit still.
Or d) these classes were disrupted and interrupted constantly by the students, and as a result everyone's learning was significantly impaired, so the drug appeared not to have a large effect on the students' learning when in fact the constant interruptions made it very difficult for anyone to learn.
IIRC there have been studies that suggested that a single disruptive student could impair the learning of everyone in the class, so having a class full of them is likely an enormous confounding variable.
Also, it's likely that the more other students act out, the more borderline students also act out; that is to say, if no one else acts out, the odds of you acting out in particular are much lower.
I've noticed this in group meetings at work, though there it's more about participation than "disruption" (sometimes it is disruption, too). When no one speaks up, the odds of any one person speaking up are very low. If at least one person speaks up, the odds of multiple people speaking up skyrocket. If a lot of people are speaking up, people will speak up far, far more.
Stimulants impact two distinct factors: focus and impulsivity. People with primarily inattentive symptoms really notice the impact on focus. People with both inattentive and hyperactive symptoms see both improvements. People around either group notice the same things the person taking the meds does.
While focus becomes more important as school/work tasks become more complex and detailed, don’t underestimate the value for a kid of behaving better in class. No one likes feeling like they are disturbing others, but can’t stop themselves.
Which is another problem here as well; given how disruptive these classes sound like they were, it's possible that the sheer number of interruptions made everyone's classes much less useful just because they were constantly getting interrupted.
These are opposite sides of the same coin. A kid who can't focus on the class is bored to tears. The disruption is an attempt to get out of the boring situation.
As a child psychiatrist, I think this is important to keep in mind when we prescribe stimulants. At the same time, we're mostly focused on function and actual learning is probably more of a secondary or tertiary goal by the time we're seeing them.
There was a Chinese study trying this but it had to be scrapped because the kids kept looking up the more exciting Russian history pages written by Zhemao.
About 12 hours after a dose the Concerta should be wearing off for the day. Lower performance wouldn't be surprising.
About the following days: from my own experience as someone with ADHD and having tried a few of the different medication options, including Ritalin and Concerta, the days after medication are usually still much better than the unmedicated condition.
Especially in the first month or two after starting treatment, taking meds on one day often provided ADHD relief for two days afterwards. I don't know what the mechanism is. Perhaps something about enabling better sleep by helping the circadian rhythm along?
Do you mean stimulants given at very low doses and either slow-release or repeat low low doses, as for ADHD? Or is there a class of non-addictive stimulants that I should know more about?
I mean stimulants given at the doses used for ADHD. According to https://astralcodexten.substack.com/p/know-your-amphetamines, doses used in drug abuse are around 25x the dose prescribed for ADHD (and furthermore abuse often involves snorting or injecting the drug).
In my experience as a teacher, paying attention to the content is good, but thinking deeply about something sparked by the content is even better. Surface level knowledge often doesn't get beyond the flash-card recall level. But deeply mulling over facts or drifting between them in a semiconscious search for patterns can create lasting updates to one's mental models even once the facts are forgotten. From the outside that can look like spacing out. I may or may not be able to ask the right questions to see if anything interesting happened on the inside.
On the other side of the desk, the dullest teacher trainings I get to sit through are often the best, as I can't help but think about how much better I could have taught it... which means I end up thinking quite deeply about that topic and experiencing various micro-epiphanies.
Also a teacher - definitely all of this. Getting a kid to hold still and play the part of focusing is good for the class because it's less disruptive and can generally help the student Get It Done, but medicating a whole class of kids won't make the lesson or the topic interesting enough for them to retain it.
That's at least part of the mechanism behind the effect of stimulants on concentration, though, in my experience. Maybe there's some sort of "short-term interest" vs "long-term ['actual'?] interest" difference — the former being what makes it possible to spend hours looking at ass pics, when on amphetamine, and the latter what makes assology ultimately less appealing than Bronze-Age history as an avocation.*
------------
*example may be, uh, somewhat idiosyncratic to Himaldr
There's a very strong relationship (I'm not going to back this up with studies or whatever) between 'play', 'creativity', and more importantly 'learning'. Gell-Mann has that famous method for discovery where you (after doing all the necessary research and failing over and over to break through a wall) read the last word of the daily newspaper page and try to solve it with that; Taleb has also spoken about how the introduction of uncertainty is basically mandatory for discovery. I would guess personally that however dopamine works, it has very long 'hangover' effects, i.e. something you repetitively experienced as pleasurable a while ago can dominate future interests in some low-level way. The kind of intense focus involved with ADHD medication doesn't appear in any 'discovery', i.e. perception of novelty, and I think that having both deep concentration and lateral thought is important. This may be pseudo-science, however, since I'm sure that it's the kind of thinking that 'dopamine-fasts' and that sort of thing are built upon & I'm suspicious of people who promote it.
Regardless, I do think there's something fundamental in something like 'how one grew up using/typically used their dopamine' -- the people I know with the strongest interests in highly intellectual and complicated fields are all relatively distant from internet/tv, even with a slight natural distaste for sugary things... Again this is extremely anecdotal and probably a pseudo-science (I don't even personally think the sugar stuff matters). But I just wanted to say that, even in these cases, I can think of examples where they are both effective/interested in a deep/complicated domain, but also --by their own testimony-- get distracted and procrastinate. Actually seeing them in 3rd person makes me doubt their self-diagnosis of laziness a lot, because (a) much of what they do when they are avoiding work is still research in their field, requiring reading of quite deep texts, (b) even when not doing more such (unrelated) research, their activities away from work require a baseline of concentration and interest unique to most people-- and especially unique to someone their age (young 20s). But this is someone who grew up without access to internet/tv until much later, and I think that the access to those things during formative years _does_ affect one.
So to summarize, I think you are correct that there is a distinction between short-term and long-term aims, but that the long-term aims tend to be formed over years (& more strongly formed in youth), and that certain kinds of environments favour certain kinds of 'searching' behaviours (since we seem to get a hit out of novelty itself, changing domains often is the naturally favoured direction when technology gives us freedom to do so, rather than remaining in one domain). It must be a cliche by now, or a 'boomer'-esque stereotype (I'm thinking of those awful newspaper strips), but I think there _is_ something significant in how environmental changes due to technology are very formative in one's ability to sustain interest. (One problem is that even thinking of this as an 'ability' is fairly sketchy (in a philosophical sense), and that might matter in our capacity to understand and 'treat' problems with this ability.)
Final thing, since this got way too long and Idk if any of it is interesting at all:
People often bring up the marshmallow test to try and argue the importance of class differences (I know the creator claimed he controlled for class later) on self-control, and typically argue like this:
-The children are from more stressful environments (lower class), and therefore gather resources immediately whilst they are still around, rather than wait.
To me this is a coherent view of how one adapts certain behaviours according to their environment, and it matters if this view is correct/neurologically feasible, because it entails that environments (especially stressful ones) are _coherently_ addressed by our biology, and it's only _later_ that this 'lack of self-control' becomes an issue; by then the habits may be strongly ingrained. I remember that in terms of brain structure people with ADHD are loosely similar to those who have dealt with addictions, or those who have been exposed to repeated stressors, etc... I think more specifically in terms of frontal lobe development.
My final addition to this conception would be that I _do_ think that certain technologies (internet especially, tv a little less, streaming services/high optionality tv a little more) can prevent people from developing better habits. I don't really think they can be causative in themselves without very very long-term usage, from a young age, and that itself would usually signify some underlying stressor one is trying to escape. We think about dopamine-associations with substances and how this leads to harmful 'addictions': I think this is an interesting time in history because (as a continuation of things like gambling) we are seeing a mass example of an 'activity' [cluster of possible activities] which in itself is addictive i.e. behaviourally degrading, that is --simply by allowing us certain powers-- it teaches us to favour superficial and high variability (I think Kissinger recently wrote about this, not that his 'take' is especially unique; as I say it's a cliche by now).
Perhaps it's just the apology at the end, but I felt compelled to state that this was an excellent comment, and broadly matches a lot of perceptions and ideas I've had. So, thanks for that.
I think it could be further extended by thinking about what causes the dopamine hits in different people. I suspect that getting slightly more out of the "searching" behaviors involved in say, learning to play an instrument, than in doing math problems in early age helps explain why one person devotes more of their time to music than quantitative stuff.
Thank you for writing this, it's helpful and encouraging to hear. I'm also not putting things out there because I'm totally certain of what I say, so the intention is to have people counter me or improve upon what I was saying-- that said it's great to hear that you found it helpful! I like the direction you indicate for extending it, although I expect that at some point 'aptitude' becomes a dominating factor in one's continuation of an activity. I.e. at some point we probably get a conflict between something (close to?) 'innate' and something more flexible (dopamine/motivation/self-control... the fact that these things are more 'flexible' is probably why they are the centre of so many moral issues-- although many morally based arguments reject that they are as flexible as we think).
In high school I could read a book cover to cover in a day, sometimes two, stopping just for meals. It wasn't anything high-brow, usually, just some random sci-fi/fantasy from the library - but the books drew me in so much I had no trouble focusing.
Right now, after the mass assault on my attention from internet clickbait, always-on IMs / discord / slack, etc - I struggle to read 50 pages in a sitting.
I'm slowly recovering that capacity through reducing information overload, but I don't think it's even about the formative years, the capacity for attention is adjusted in an ongoing process.
Although I still suspect that the formative years are more important for these kinds of effects-- that the consequences in development will be long term-- you're absolutely right that one can feel these negative effects at any point in one's life.
I have also been surprised a few times about the apparent ability of the brain to recover after degeneration (in anorexia, which is very severe in its effects upon brain structure, the observed effects are much less apparent on the road to recovery https://www.bath.ac.uk/announcements/largest-study-to-date-reveals-stark-changes-in-brain-structure-for-people-with-anorexia/), so I think that the most important thing is to remove the underlying stressors, and at that point the brain can recover with adequate nutrition.
In that vein, I have read many encouraging anecdotes about people quickly regaining (over the course of a couple weeks to a month) much of their focusing capacity (there was a good article about it which I cannot find right now); but the main problem was that in all of the accounts the people were physically incapable of accessing the internet, and that as soon as they could access it once again the problems returned. I personally have a hard time imagining a good measure to emulate this in modern life (excluding things like abandoning modern life & 'moving to the wild', I mean). I suppose blockers can work? Or perhaps exclusively using the internet at work?
[aside: It's probably simplistic to think of it straightforwardly as an addiction, since it has a lot to do with the fact that we tend to favour a number of very attention-superficial sites (twitter & other social media), whereas probably reading long articles would be fine. The optionality is the main aspect here; it represents something similar about addictive activities as gambling, where the 'hit' comes from a kind of emergent novelty.]
I'm reading a book about this right now, actually, called "Deep Work" by Cal Newport and it's really resonating with me. About attention and concentration and distraction.
Meth is actually a highly effective and legal medication for ADHD (very low doses, compared to recreational use, as you can imagine). The problem is, no one wants to be the doctor who prescribed meth for an 8 year old! Or even for a 38 year old.
As a user of ADHD meds I can say no medication will make content interesting. I find I'll only use it when I really need to get through something, and usually best if the content is familiar but uninteresting. Also very helpful when work is tedious, i.e. coding or spreadsheet crunching. That's where the chemical boost really helps, since the mind easily wanders. When the subject is inherently interesting or novel, ADHD meds almost hinder learning ability since the brain is somewhat 'in a rut' of concentration. The new neural connections are made less easily in a medicated state. This may vary for different/younger folks; I'm 31 now and started using ADHD meds at the age of 22, when I did find it useful in learning new things. I'm still of the view that giving kids ADHD meds (say below 20 y.o) is not appropriate except in very dire circumstances (dangerous or delinquent hyperactivity).
Definitely case by case, but in my experience (largely elementary sped situations, also diagnosed/on Adderall starting at 23), giving kids meds like this on the regular at a very young age can prevent them from learning to deal with their own nature. The medication replaces adapting to themselves and their environment. Me and my brother (very ADHD, teachers *begged* my parents to diagnose and medicate, they refused) both had to learn how to succeed in the school environment without the focus pull and it made success all the more rewarding. I have a better understanding of my meds as a *tool* than as something I *need,* a distinction I would not have been able to make if I'd been placed on them early.
That said, I've certainly seen cases where it seems like a good choice to medicate. Case by case, but I think it's important to err on the side of "teach kid to deal with it" over "give focus drug."
I take some issue with comparing ADHD etc to things like cancer and broken legs, which are only problems. But point taken.
Personally, I've seen a lot of kids who were *over* medicated. This makes me wary of medication in general and I think of it as a last resort. I recently worked with a student (8 years old) who talked *constantly* as you describe your son - he's also incredibly gifted, but exasperating to the adults around him who don't care about e.g. Roblox. School is difficult for him, so I'm glad things are going well for your son!
Coming from the school system I feel strongly that many cases of medicated kids could be UNmedicated with changes to the environment - kids don't have enough recess (45 minutes in 6.5 hours!), schoolwork is increasingly completed on computers, and very rarely are there the kinds of hands-on, constructive projects that would help more students care about and focus on the work (e.g., a third grade castle project I witnessed this year was just a drawing, whereas I distinctly remember having to build a model in my own elementary school experience).
The incentives for the school system are so poorly aligned and disconnected from each other that I have a hard time coming up with solutions to this that could actually be implemented. There's a lot of hard problems wrapped up together and ultimately everyone needs to make the decisions that work best for them, I just wish the world were better set up for the students for whom school is not a healthy environment.
You're one of the good ones. I was prescribed Ritalin in the 90s because a teacher pressed that I didn't pay enough attention. Treatment was stopped after some years when my parents didn't notice a difference.
Today, I would not describe myself as having ADHD/ADD. My experience informs my bias, which is that less rigid attention in children does not mean they will experience life-long ADHD; in part it is a symptom of simply being a child, or being of the day-dreaming persuasion. Not only is Ritalin over-prescribed; if we're to believe this research, it barely provides value. Whether or not they actually have ADHD might be a moot point.
I had trouble focusing for the entire duration of school, for two reasons: a) insomnia, b) boredom. At the time, sleeplessness was not seen as a genuine concern and no one was equipped to help me deal with it. That mostly remains true of GPs today (everyone's first line of defense), but they can refer you to specialists or therapists who know somewhat better. Fortunately there is plenty of helpful material out there if you seek it out and take the time to learn by yourself.
In Artificial Neural Networks, there's literally a variable called 'Learning Rate'. This is a number, you can set it to anything you want. And indeed, the higher you set it, the faster the ANN learns, all else being equal! But also, the higher you set it, the lower the maximum skill level the ANN can attain before it reaches its limit and plateaus.
I had the same thought, but I think we have to be careful here because it'll be easy to conflate concepts. When we talk about a student's learning rate I think we're likely to be referring to how much information we expect them to retain or how hard/fast we push them to demonstrate knowledge.
But AI learning rate is essentially how closely the machine learning function traces the curve it's following. A high-learning rate AI will make massive jumps, mapping huge regions of the learning space, but at very low detail. A low-learning rate AI will make smaller jumps, mapping more slowly, but also more precisely.
It's not the difference between covering one chapter and two chapters in a textbook, it's the difference between reading and retaining the same amount of content in a survey-level course and a graduate level one. The survey course will give me a much better high-level mapping, while the graduate course will give me a lot of info on one particular part of the map. If I've already got a good survey-level understanding of a subject, taking another survey course is unlikely to provide me with much value. I'll spend much of my time going "yeah, I know this." Meanwhile if I take a graduate level course on quantum physics, I'll converge towards understanding very slowly, because I don't have that high-level mapping.
I'm not sure if there's a brain chemistry/intelligence interpretation of all this or not. As an ADHD person, my best interpretation would be that I don't have a lot of control over my learning rate, sometimes going very deep on poorly understood topics (and then failing to retain anything), and usually going very broad because deep understanding is hard with low dopamine payoff. I suspect that the bottleneck Scott refers to is more analogous to an AI's computational resources or architecture than to learning rate.
No I'm aware - ANN learning rate is about update speed, AKA 'How Much You Change The Numbers Each Step', not the actual learning that results from these updates. I still think it's a relevant comparison, however. For myself, I feel like my learning rate is set rather low, so I learn slowly, but can eventually build up lots of detail, if I stick at it.
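The trade-off in this exchange (a big step size learns fast but plateaus worse) can be sketched without training a real network. What follows is a toy, not how any particular ANN behaves: it assumes stochastic gradient descent on the one-dimensional loss f(x) = x²/2, where each gradient is x plus zero-mean noise, and it tracks the *expected* loss exactly instead of sampling. The function name `expected_loss_curve` and the values of `x0` and `noise_std` are made up for illustration.

```python
def expected_loss_curve(lr, steps, x0=10.0, noise_std=1.0):
    """Expected loss E[x^2] / 2 per step for noisy SGD on f(x) = x^2 / 2."""
    second_moment = x0 ** 2          # E[x^2] at step 0 (x0 is known exactly)
    curve = [0.5 * second_moment]
    for _ in range(steps):
        # Update rule: x_next = (1 - lr) * x - lr * eps, eps independent of x,
        # so E[x_next^2] = (1 - lr)^2 * E[x^2] + lr^2 * noise_std^2.
        second_moment = (1 - lr) ** 2 * second_moment + (lr * noise_std) ** 2
        curve.append(0.5 * second_moment)
    return curve


fast = expected_loss_curve(lr=0.5, steps=2000)   # large 'Learning Rate'
slow = expected_loss_curve(lr=0.01, steps=2000)  # small 'Learning Rate'

print("after 10 steps:  ", fast[10], "vs", slow[10])
print("after 2000 steps:", fast[-1], "vs", slow[-1])
```

Under these assumptions the large step size is far ahead after ten updates but settles at a noticeably higher loss, because every noisy step keeps knocking it around; the small step size crawls at first and ends up much closer to the minimum. Real networks often get both benefits by starting high and decaying the rate over time.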
I've taught 4th grade for almost twenty years, and I've had students who I would have sworn were not paying the least attention and were not getting a thing I said...but who, it turns out, were getting a hell of a lot more than I thought they were and who wound up doing pretty well. It's not always easy to judge such things by the obvious, superficial indicators.
I have a memory from 2nd grade where everyone was at their desks learning cursive, and I was in the corner reading a book. Occasionally I'd glance up and memorize the pattern of the new letter the teacher was drawing on the board. I'm sure I also did some practicing (the kinesthetic stuff is important too), but... yeah, it usually would have looked like I wasn't paying attention at all.
AI was brought up at the end, but maybe not the most relevant example. Many neural nets employ "dropout", where 30% or so of the neurons are turned off at random. This seems to help the network develop resiliency, and not depend too much on any particular set of neurons. To extend the metaphor to its speculative extreme, one could imagine that with all neurons paying perfect attention to the task at hand, you would be better behaved for sure (no neurons to misbehave with), and you might even perform better on immediate tests, but you might not "learn" in a resilient way.
Dropout in NNs is mostly there to prevent overfitting the model; the closest analogy in teaching actual humans is probably rote learning (as opposed to conceptual or deep learning (I just love how the terminology gets mixed up when talking about humans and NNs))
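For anyone who hasn't seen it, dropout fits in a few lines. This is a minimal pure-Python sketch of the common "inverted dropout" variant, not any particular library's implementation; the function name `dropout` and its parameters are made up for illustration, and `p=0.3` just mirrors the 30% figure above.

```python
import random


def dropout(activations, p=0.3, training=True, rng=None):
    """Zero each activation with probability p; rescale survivors by 1 / (1 - p)."""
    if not training or p == 0.0:
        # At test time every neuron participates and nothing is rescaled.
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p
    # Rescaling keeps the expected value of each activation unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Hypothetical layer output: 10,000 neurons all firing at 1.0.
train_out = dropout([1.0] * 10_000, p=0.3, rng=random.Random(0))
test_out = dropout([1.0, 2.0, 3.0], p=0.3, training=False)

print("fraction silenced in training:", train_out.count(0.0) / len(train_out))
print("test-time output:", test_out)
```

The point for the analogy is that the random silencing only happens during training; at test time the whole network is "paying attention", which is why the usual framing is regularisation against overfitting and not inattention as such.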
I don't think the implied premise of attention in class = learn more material is rock solid to begin with. We don't know what makes kids actually learn better, or at least we don't apply it in schools; instead they get an overworked 27 year old's heuristic approach to what they think teaching is supposed to be.
By the time they are in this study I think the boys have just developed learning styles that fit their strengths and weaknesses, ability to catch random detail but short attention spans. Having been in that situation (and been on and off Ritalin) I am unsurprised that the boys who continued to use the learning style they were experienced in kept up with those who were medically given different strengths they weren't used to.
I'm curious what would happen if you tried this with non-ADHD students. I would expect a stronger pro-Ritalin effect as you'd be strengthening a preexisting strength rather than a weakness.
I suspect you're right in the extreme case: there are certainly people you couldn't teach quantum mechanics to even given 20 years.
But that type of extreme case shadows a *lot*. In practice, we don't really care if we can teach someone something over truly long periods -- if someone can't understand a fairly wide topic in 1 year of poorly focused instruction (a classroom of 20+ students as compared to, say, a private tutor), we have no interest in teaching it to them. Given how long it took me to understand (for example) partial differential equations, I'm confident we lose a lot of potential this way. In other words, we rarely make anything like a serious effort to optimize training our meat computers in any task or subject area. Not sure that's relevant to stimulant function in the classroom, but if we ever work out an educational method that *isn't* inefficiently (human) labor intensive, it means we could see fairly dramatic gains.
More on topic, to echo other people here: attention is not a substitute for interest. I have paid attention to a lot of classroom instruction that I knew I had no interest in and was going to forget immediately because nothing had flipped the switch in my brain saying "this is actually worth knowing". My impression is that there are chemicals that might have something similar to that effect (among others), but they're mostly illegal drugs.
Intuitively, I feel like the analogue of training data for a vocabulary word would be something like how many examples you've seen of the word being used. I don't think the *primary* way humans learn language is by memorizing formal definitions.
"a little more quantum physics every day for twenty years, and eventually expect them to know as much as a smart person would after getting a four-year degree."
I think this would work (and does work - relatively poorer learners in schools at the moment still learn a lot of stuff by the end of school). But there's a condition. I teach English to children, and one of the problems I find that some seem to have run into is coping strategies. We get kids coming in at about age 10, who have been having English classes for four years (they start in 1st grade) and have at some points performed reasonably well on school tests, but they can't identify English words like "I" or "and". They seem to have developed (highly sophisticated!) strategies for giving correct answers on tests and in class, despite not knowing any of the stuff that we call "English".
Those strategies sometimes block any further learning. The student's goal is usually nothing more or less than to pass the test; if they have a strategy that sometimes/consistently enables them to do this, they will actively resist taking on new information that might disrupt their strategy.
I guess that means I agree with Scott that school is badly aligned, i.e. for a large number of kids, it never becomes obvious that getting good at the school subjects is the best way to get through the day; and they aren't given compelling other reasons to want to get good at school subjects. And a little bit of concentration probably doesn't make a big difference.
But I feel like it should make a big difference over the longer term, because the kids will build up marginally more experience of having paid attention in class, heard something, used it in a test later, and gotten through the day a little easier.
What are some of these highly-sophisticated strategies? Unless the tests are very regular in a way that I should think it would be a goal to avoid (e.g. "C is usually the answer"), I can't really imagine what they could be.
Sure, it's a bit mind-boggling. They include things like: in class, swift mirror-repetition of the last two words the teacher says, which apparently in their large-class situation in public school is enough to make the teacher think they've answered the question and move on to the next kid, but can be achieved without any mental processing or understanding at all! Use of muttering, finely tuned to the ears of their English teachers, so they produce a sound-sludge that seems enough like an answer to get the teacher to move on. Waiting for and imitating classmates' whispered answers. Verbal parroting, so that sometimes, if you give the prompt "spring," they can respond with "summerautumnwinter," without necessarily knowing what they mean, or even that they are three words.
In writing, they sometimes learn to write whole lists of words in a particular order rather than learn what each word means. (The exams are indeed quite poorly designed, so this strategy gains marks.) Words may be learned in relation to specific prompt images, so a child may reproduce "summer" next to an image of the sun, without knowing or caring that the word means summer. And they may learn to reproduce quite long passages, several sentences long, without knowing what they mean (much like a list of words). These passages can be inserted into the writing part of an English test, and will still get you 4/6 or 6/8 marks, even if they're not on-topic, because these are primary school exams, and marked generously.
It's bad teaching and bad exams that allow these practices to flourish - but that's hardly uncommon!
Fascinating. Initially I assumed for some reason that the students you teach are mainly Spanish or Creole speakers. However, I've heard anecdotes about something similar in the context of standardized test prep, so now I'm wondering, are the students who employ the strategies you describe predominantly Asian?
As it happens, yeah. I live and work in southern China, so they're all Chinese students. I should stress, though, that talking about who the students are seems unfair, because these are so obviously inculcated by the school system. It's not because of who the students are, it's because of the system they're in. English teaching in Chinese public schools is uniquely bad because so many of the teachers can't speak English at all (some literally refuse to speak English in the classroom), and the textbooks they use are conversational English textbooks. (I actually quite like the public school textbooks, but they're wildly unsuitable for large classes with non-native teachers.)
Well, the reason I focused on who the students are is that about a decade ago I heard from a teacher in California that Chinese students getting test prep in hagwons were "gaming" standardized tests via some method that sounded really improbable, but that lines up with what you described.
At the time, I thought that if kids (and adults) really were pulling that off, then they must have some intellectual advantage others (and certainly I) don't, even if they're just learning and employing a successful stratagem. Calling it "gaming" seemed sort of, well, biased.
But the details you describe paint a fuller picture and make me reconsider.
Good test design can defeat these sorts of tactics.
I have noticed that some people write test questions in a particular way that is highly susceptible to gaming the answers, resulting in you being able to correctly answer questions even when you don't actually know the answer.
You can also defeat this in other ways, like asking more open-ended questions or making people do projects which require them to show comprehension.
Of course, people start complaining when you do this and half the students fail.
Thanks for writing this. I have seen similar behavior in college students, and never really knew what to make of it. Getting essay answers that I would now say read like they were written by an AI via "this word and that word usually go together, so I will do that." I had often wondered if they were just throwing together word salad in panic from half remembered information, or just really bad at writing. The notion that it is a perhaps unconsciously learned strategy makes a lot of sense, and starts to suggest solutions for working around it.
Yep, for us it's pretty much always the same thing: you have to go back and teach things that are much more basic than you think. For example, in my language learning context, when students struggle to learn vocab, it's because they don't know the speech sounds. I hear them spelling the word "but" by saying, "b-a-t - no, not that a, the other a." It doesn't matter how much work you put into the teaching of vocabulary when the students literally can't hear the difference between one vocabulary item and another. You have to regress the extra step.
I don't teach anyone of college age, but with middle school students, I've recently been noticing how difficult the language of questions can be. Sometimes they may know the material, but not be able to understand the question at all - and are unable to recognise that failure of understanding in themselves, or unable to admit it. So you have to do a bunch of diagnostic investigation before you can even get started.
Why am I reminded of when Feynman tried to teach in a Brazilian university and discovered that the students had only ever memorized passwords and didn't understand anything?
Yeah, I assume this is a universal phenomenon. I'm guilty of all of these shortcut approaches sometimes (watching me trying to copy-paste learn coding off the internet would make a grown man cry)! But what was surprising to me was to find that a whole chunk of a (reasonably "successful") education system could be run so as to be, for a significant number of kids, nothing more than a mechanism for forcing kids into intellectual bad habits. I'm still not fully on board with Scott's anti-school thing, but this is the kind of phenomenon that could push me that way.
One of my classes was 240 students, I was the only one involved on the staff side, and the administration wanted the marks back within 3 days of the test at the absolute latest, no excuses. That is why it was multiple choice. I conjecture that teachers would drop multiple choice in favour of intelligent questions all by themselves if the circumstances made it possible.
Very interesting! Thank you for taking the time to provide these examples.
I imagine these types of strategies take place much more than generally acknowledged, across all ages and domains. I saw similar things taking place even in university-level CS courses: students would attempt (and often succeed) to "game" the compiler / homework problems, and still get credit without understanding the syntax or semantics of the code they were submitting.
As a father of two currently homeschooled kids, I'm noticing a very similar phenomenon. The kids seem to treat learning tasks as some overly-complicated game, and view their main optimization task as making as much progress in it as possible while using as few mental resources as possible. Like, when using a language-learning app, they would skip listening to the instruction and try to guess the answer. That's despite the fact that I actively try to downplay the importance of getting results (in the form of correct answers); there's no score in the app, no penalty for wrong answers, etc. It's just that actively learning is harder than guessing, and when there's no internal incentive to learn (or the benefits are too distant to really register), the mind starts optimizing ruthlessly. The solution seems to be to try to align the most immediate incentives somehow, and hope that the more beneficial time preferences develop eventually as the child matures.
I'm just in the middle of making a homeschooling decision. Mine have been in school up till now, and seem to be a bit zombified by it, and I would love to pull them out and let them relax and develop a bit. But that kind of thing could make my relationship with them very tense, so I'm worried about it. Thanks for the salutary thought!
I agree that you could probably teach almost anyone quantum mechanics given enough time. I think the main obstacle is most likely low academic self-concept / self-sabotage.
The reason I think this is that I don't really think the variation in human intelligence is that large on an absolute scale, I think it's more like temperature (we feel a huge variation between 0C and 40C but temperature can go a lot lower). If you look at AI performance in domains where it's superintelligent (eg Go) you'll see average human intelligence is a pretty narrow band that AIs quickly fly through. I can't see any compelling reason why the threshold for being able to learn QM is neatly inside this narrow band.
To me a much more plausible explanation is low academic self-concept. I hypothesise that if I went to school in a class where people learned 2x as fast as me and I always fell short, I would become convinced I couldn't learn as much as them, so any attempt to teach me would result in negative thought spirals causing me to fail to learn. I saw this pattern often when I did private tutoring - I've found it's actually not hard to teach difficult concepts 1-on-1 to people with these deeply ingrained insecurities, but they find it psychologically difficult to be persistent and follow through (CBT might help?).
Could the answer lie in something like the lessons these kids were taking having been the same (i.e. using the same plan and resources, with teachers using the same strategies) and basically optimised for kids with ADHD? If so, perhaps the results just mean that Ritalin works *and* SEND teaching works (I've spent this year doing teacher training, and we keep being told that the ideal goal is for a kid with ADHD in your lesson to be able to learn as much as a neurotypical one, and it sounds like these lessons might just have reached that standard).
Re the Ritalin study, which showed that attention improved but learning not so much when ADHD kids got Ritalin, how about the idea that engagement is necessary? When taking a test, students are usually trying to succeed, so they’re engaged. Typically, classwork is boring, so they’re not. I recently read of a study that showed that 1:1 tutoring gave better learning results than either of two classroom paradigms. Could it just be that 1:1 interaction is so much more engaging than being in a classroom? Perhaps in a 1:1 situation, Ritalin would help ADHD kids learn better. Or in other engaging situations, like, say, chess or video games or something.
1:1 tutoring famously outperforms just about every educational intervention ever, offering about 2 s.d. improvement in outcomes, but it also obviously is extremely expensive to scale up so it remains underutilised.
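To put the "2 s.d." figure above in more familiar terms, here is a minimal sketch that converts a shift measured in standard deviations into a percentile. It assumes outcomes are normally distributed, which is an assumption of this sketch and not something the comment claims; the function name `percentile_after_shift` is made up for illustration.

```python
from statistics import NormalDist


def percentile_after_shift(start_percentile, shift_in_sd):
    """Percentile reached after moving up by shift_in_sd standard deviations.

    Assumes (hypothetically) that outcomes follow a normal distribution.
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    start_z = standard_normal.inv_cdf(start_percentile / 100)
    return 100 * standard_normal.cdf(start_z + shift_in_sd)


# A median student (50th percentile) who improves by 2 s.d.
print(round(percentile_after_shift(50, 2.0), 1))  # about 97.7
```

Under that assumption, a 2 s.d. gain would take an average student to roughly the top 2-3% of the original distribution.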
I think there are many factors that make 1:1 tutoring better, having done a bit of it as a student. Firstly, yes, it's much easier to keep 1 kid engaged than a class - there is often no framing that all 30 kids in a class would find interesting, so teachers have to settle for getting most of the kids at best, but with only 1 you can tailor your approach precisely *and in real time* - with only one student, you can watch their face to see what's clear and what's not, what's interesting and where precisely you lost them.
Secondly, you can repeat things exactly as much as needed - instead of, say, three repetitions for everything, you can do one or two for the things that the kid gets easily, and have enough time saved as a result to afford to spend 6 or 7 on the one thing they're truly stuck on.
You can also ask exactly what they do not understand and pinpoint the error they make, understand where their logic/understanding breaks, and try to correct it using counter-examples, alternative ways to do it correctly, analogies with something they like or can already do, and so on.
It requires one-to-one interaction because, unlike a class, it's a real conversation. It also requires the teacher to be really proficient in the subject he teaches, not merely able to follow the program and pass the exams himself (which is often the case).
That's my experience teaching math to friends who were much less gifted/interested in math than I was (the two usually go hand in hand: it's rare to be interested in something you are not doing better than most at, and vice versa).
And there is also a peer effect: being taught by a peer is a different psychological experience from being taught by an authority. Some students like it better, others need/crave the authority, and a class setting is authoritarian.
Now, doing that helped them pass the exam and really improved their math skills for that particular year. But from what those friends told me, it had no lasting effect. Probably because the only way to keep the skills is to use them, by obligation or interest. They had neither...
What you're pointing to is this: design instruction that mimics what a tutor does. For example, the students need to make active responses quite often; and the instruction needs to respond to the students' demonstrated understanding, or lack thereof; and the students need to have opportunities to ask questions that actually shape the direction of the instruction that follows. These kinds of things can be managed.
I don't think so. Not in a class of 15+ students. Maybe with 3-5, but with more than that the ambiance changes drastically, and nothing resembling a conversation takes place anymore... At least that's what I get from my very informal teaching attempts in school (one or two friends) and from being a teaching assistant during my PhD. Teaching two friends or a group of three students for a project is a very different experience (a much more pleasant one for me) than teaching a group of 15. That was awful, and the only way I made it bearable was to split the sessions into an ex cathedra part (awfully close to being on stage, which I despise, and I probably sucked hard at this part) and a return to informal one-to-one, with me moving around. But it means that instead of 2h of one-to-one tutoring, it was the equivalent of a 1h pre-recorded online course (only less smooth) + 10 minutes of tutoring. I am far from a good teacher (although I apparently am a good tutor, at least if I do not lose patience), but having experienced good ones (I think), I see no way to replicate one-to-one (or one-to-a-very-few) tutoring in a class of 20+ students. Except with the very time-inefficient approach of tutoring each student successively...
It's quite possible to do what I said. I do it. I think you're missing the point. Tutoring doesn't work because it "resembles a conversation." It works because it elicits frequent active responses from students, because the tutor responds to the input from those responses, and so on. A group can get a lot closer to the relevant features than it does. For example, you can give students individual whiteboards and have them all write questions for the teacher at once; you can pair up students and have them explain things to each other; and you can actually observe what they are doing and then shape your instruction accordingly. No one said it was easy, but it's possible.
Solving math problems is one thing. Following rules is another, related, thing. Learning is another, and doesn't really seem all that closely related to either. You treat it as strange that improving the first two doesn't seem to affect the third. But really, why should it?
You say "Concerta's clearly doing something". And you're right. But why jump to the conclusion that the thing it's doing is "making kids pay attention"? There are other things that could explain better student performance.
I am a math tutor, and I often find myself wishing that my students could become dumber at will. That when the time came to do arithmetic, they could shut off most of their brains and execute rules mechanically. Because frankly, most thoughts are simply obstacles or distractions when you're trying to do kid-level math. That's why a simple computer can out-calculate a brain that has vastly greater processing power.
My first thought, when I heard about the effects of Concerta, was that it might be doing something like that. Inflicting useful stupidity, desirable narrow-mindedness. Which might actually make people worse at learning, not better. A wandering mind is a problem when you're doing arithmetic, but I think it's a good quality in a student overall.
Your point seems very relevant for the kind of learning that enriches a person, but less so for the "learning the teacher's password" that this study examined, which I *would* naively expect to benefit from the kind of narrow-mindedness that you're thinking of
Possible, but learning math is often a much more intellectual process than doing it. It really does help to understand what you're doing and why, even if the actual task is just following memorized steps.
To be honest, I'm not entirely sure why it helps. But the students who "get it" always outperform the students who don't.
> My first thought, when I heard about the effects of Concerta, was that it might be doing something like that. Inflicting useful stupidity, desirable narrow-mindedness. Which might actually make people worse at learning, not better.
The experience of stimulant medications for me is very much this. In normal everyday ADHD life, the senses are wide open. I'm absorbing many things at a time. Lack of stimulus is unnerving. The key point here, though, is that I am actually absorbing those things. My brain is taking it all in and retaining what it feels is important.
Add stimulant medications and it's like looking through a telescope by comparison. Much, much less "wide open" than usual, and much more focused on one thing. That is VERY helpful for DOING. It's not especially helpful for LEARNING except where learning takes the form of doing, as in the case of doing homework.
I have very much not had that experience. Learning is much easier when you can focus on something long enough to actually learn about it. Or when other things that your brain feels are important, but which actually aren't, are not intruding.
This is entirely anecdotal evidence, but as a "gifted" child who got diagnosed with ADHD at age 27, I've always completely separated my ability to learn (processing+analyzing information) from my attention issues. I take an off-brand Concerta, and what it allows me to do is exert less effort to do things I otherwise get paralyzing executive dysfunction for. I.e. just about everything in my life that I'm not really interested in on a whim. Concerta is a tool that allows me to direct my attention better, and most importantly it helps massively with time regulation: so less "oh I can finish this task in 20 minutes because I couldn't make myself do it earlier" and more "ok I want to do this, I can do it at X hour, and allocate Y duration to it." This process is near impossible without the medication. As a child this translated to setting my alarm at 5am in the morning to fail to complete homework and ending up scribbling it in class before it was due.
As stated at the end of this piece, I feel like the major hurdle in these types of studies is defining significant variables that first qualify the ability to learn, separate from overall attention. But you very quickly get into extremely murky territory (and I'll admit I'm not super well read on this topic because I find filtering out my personal bias/experience extremely draining, a poor show of rationalism at work).
I share much of your experience. I was always a distractable kid, but I was never diagnosed with ADHD, due to a combination of success despite distraction, my parents' attitudes towards education and medication, and going to elementary school before Ritalin became the rage in my family's social cohort. My executive function was bad (probably not as bad as yours), but that weakness never caught up with me because while it might have taken me four hours to master an algebra concept that should have taken one hour, I had four hours to spend on it. College and law school were tough, but it was only as a young attorney, expected to work through tedious documents quickly, that the volume of work exceeded the available time. I got on stimulants at that point.
I don't take stimulants every day. When I do, I can't perceive a difference in the quality of my work. Some of that is surely that a legal brief cannot be measured the way that a page of twenty long-division problems can. But the biggest difference is simply time. On stimulants, I sit down and do the work. I don't take three bathroom and two coffee breaks every hour. For what I do, getting the writing done is what matters. There could be differences in the quality of my writing on and off stimulants, but the crucial thing is that the brief get finished and filed. A so-so brief can still win; a brief that never gets filed because I was writing on SlateStarCodex is a professional disaster.
> Something like this must be true if we assume that it takes a certain intelligence level to learn surgery - or quantum physics, or whatever. Otherwise you could get a very dumb person, keep teaching them a little more quantum physics every day for twenty years, and eventually expect them to know as much as a smart person would after getting a four-year degree. I’ve never heard of someone formally trying this, but I predict it wouldn’t work.
... Don't we all start off as very dumb people, and we keep learning a little more every day for many years until we can do the complicated stuff?
Kids with low IQ hit the same milestones as kids with high IQ, just slower. Why would this be different in adult life?
When I tutored in college and HS, and now train (“capacity build”) as an adult I do not find the last graph true at all. Basic algebra is just literally beyond some people. And more and more things as you move up the intellectual hierarchy. Now how much of that is focus/interest versus raw intellect is maybe hard to say.
Could you torture the non algebra understanders into understanding algebra with the right negative stimulus? Maybe? But copious positive stimulus definitely has no effect for some people, once you start getting into mid level HS work. And these are not people who are “morons” or whatever medicalized term you want to use. Just on the low end of the distribution.
Given how hard it is for some adults to uptake certain skills/concepts even with the literal gun of “you will be fired if you don’t perform better” pointed at their head, I don’t think the “just slower” model is correct.
> Basic algebra is just literally beyond some people. And more and more things as you move up the intellectual hierarchy. Now how much of that is focus/interest versus raw intellect is maybe hard to say.
Set theory is beyond most set theorists, which is why Skolem's Paradox is simply swept under the rug.
Surely there's a threshold IQ for mathematics and abstract conceptualization. The more complex and logically fraught, the higher the threshold. Algebra, which is minimally troublesome and merely involves using letters to stand in for variables, has a low threshold -- but a threshold nevertheless -- whereas quantum mechanics and set theory have very high thresholds. With enough guidance, rote memorization, and mechanical problem solving, people below the threshold might be able to "fake it" -- but they won't be able to develop a comprehensive understanding or apply the abstract concepts they're "learning" to their daily life.
Yeah, I was surprised that Scott didn’t mention the fact that they were teaching classes half-full of unmedicated ADHD kids. That has to change the dynamic.
Aye, teaching something really is the best way to learn it, as you end up having to examine it from a lot of different angles to get the ideas to stick in someone else's mind.
A more formalized technique of what you've mentioned you were taught is the Feynman technique.
https://fs.blog/feynman-technique/
I thought the Feynman technique was (according to Gell-Mann)
1) write down the problem
2) think very hard
3) write down the answer
That's the Feynman Algorithm, used for problem solving.
When Scott Young branded his method the “Feynman Technique” he was thinking of “A Different Box of Tools” from Surely You’re Joking Mr Feynman. Yet that story doesn’t have much of anything to do with the technique.
I guess it worked though, because I can't Google "Feynman" without seeing "Feynman Technique" everywhere.
TIL! thanks for sharing
FWIW, Wozniak even tries to use the forgetting curve to extrapolate an average 'total possible retention': https://supermemo.guru/wiki/How_much_knowledge_can_human_brain_hold It's not very high, well under a million. At that point, you're forgetting as much as you learn. (Fictionalized in Scott's story https://slatestarcodex.com/2017/11/09/ars-longa-vita-brevis/ )
It's not surprising if differences in personality, intelligence, and memory specifically mean that different people might have very different steady states. This is why those handwavy anti-intelligence arguments of "well, we could teach those kids calculus if we just tried really hard and for long enough!" don't work, because they require ignoring the existence of forgetting.
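The "steady state" idea above can be made concrete with a toy model. The sketch below assumes a learner picks up a fixed number of items per day and forgets a fixed fraction of everything known each day. This constant-fraction forgetting rule is a simplifying assumption of the sketch, not Wozniak's actual forgetting-curve model, and the function names and numbers are hypothetical.

```python
def simulate_retention(items_learned_per_day, daily_forget_fraction, days):
    """Toy model: each day, forget a fixed fraction of what is known,
    then add a fixed number of newly learned items."""
    known = 0.0
    for _ in range(days):
        known += items_learned_per_day - daily_forget_fraction * known
    return known


def steady_state(items_learned_per_day, daily_forget_fraction):
    """Level at which daily forgetting exactly cancels daily learning."""
    return items_learned_per_day / daily_forget_fraction


# Two hypothetical learners with the same daily input but different
# forgetting rates end up at very different plateaus.
for forget_fraction in (0.01, 0.02):
    plateau = steady_state(10, forget_fraction)
    after_20_years = simulate_retention(10, forget_fraction, days=20 * 365)
    print(forget_fraction, round(plateau), round(after_20_years))
```

In this toy model, extra years of teaching stop helping once the plateau is reached; only a higher learning rate or a lower forgetting rate moves it.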
Yes, I think retention is the important part here. If Scott took a vocabulary test and didn't get 100% on it, it seems safe to presume that he _understood_ all of the words and simply _forgot_ some.
The "Why Do Test Scores Plateau" mentions this (including the research around spaced repetition), but in it Scott states he remains confused about the role of intelligence, noting that neither 'intelligent people are more intellectually curious and get reminded of things more' nor 'intelligent people have better memories' seems sufficient to explain the difference. I suggest intelligence—the ability to identify patterns and integrate knowledge into a coherent whole—serves not merely as a "network effect", but as _compression_. If you understand chess well, you are better at memorizing game configurations (because you can chunk them into underlying structure); if you understand math well, you are better at retaining formulae (because in a pinch you can just re-derive them); if you understand chemistry well, you are better at memorizing reactions (because you can be guided by general principles); if you understand etymology well, you are better at understanding words (because you can often deduce their meanings from their roots), and so on. The greater the intelligence, the fewer bits of knowledge have to be independently retained, and the greater the performance at equilibrium forgetting.
If this is so, increasing attention might help learning a _little_, since it increases the number of repetitions you're likely to catch and be reminded by, but not much, since if it doesn't also increase intelligence it doesn't improve compression.
I think you are very much onto something here.
> if you understand math well, you are better at retaining formulae (because in a pinch you can just re-derive them)
This clicked for me--especially in the context of the vocabulary test discussion. I'm a terrible math student--in that I rarely get the right answer. But I'm considered "really good" at math--I understand the underlying structure and can re-derive equations, *which doesn't solve the problem of basic addition and subtraction errors.* Tests that merely check for the right answer don't catch that people like me know what we're doing, because those are about guessing the password rather than demonstrating actual knowledge.
> If this is so, increasing attention might help learning a _little_, since it increases the number of repetitions you're likely to catch and be reminded by, but not much, since if it doesn't also increase intelligence it doesn't improve compression.
Yes! We have yet to come up with good techniques for teaching compression and pattern identification/integration. We kind of just rely, as a species, on students' native software and hardware being up to snuff.
Personally, I think the closest we can currently get is figuring out a reliable method to teach curiosity. People seem to get better at compression when they seek out and synthesize information for themselves.
"as fairly describe as" seems a mistake?
Yes, I don't see how not disrupting the class is fairly described as paying attention. They're clearly not the same thing at all. I wonder if Ritalin isn't helping kids pay attention, it's just helping them to sit still better.
Or d) these classes were disrupted and interrupted constantly by the students, and as a result, everyone's learning was significantly impaired, resulting in the drug not appearing to have a large effect on the students' learning when in fact the constant interruptions made it very difficult for anyone to learn.
IIRC there have been studies that suggested that a single disruptive student could impair the learning of everyone in the class, so having a class full of them is likely an enormous confounding variable.
Also, it's likely that the more other students act out, the more borderline students also act out; that is to say, if no one else acts out, the odds of you acting out in particular are much lower.
I've noticed this in group meetings at work, though there the behavior is participation more than "disruption" (although it is sometimes that too). When no one speaks up, the odds of one person speaking up are very low. If at least one person speaks up, the odds of multiple speaking up skyrocket. If a lot of people are speaking up, people will speak up far, far more.
Stimulants impact two distinct factors: focus and impulsivity. People with primarily inattentive symptoms really notice the impact on focus. People with both inattentive and hyperactive symptoms see both improvements. People around either notice the same things the person taking the meds does.
While focus becomes more important as school/work tasks become more complex and detailed, don’t underestimate the value for a kid of behaving better in class. No one likes feeling like they are disturbing others, but can’t stop themselves.
Which is another problem here as well; given how disruptive these classes sound like they were, it's possible that the sheer number of interruptions made everyone's classes much less useful just because they were constantly getting interrupted.
These are opposite sides of the same coin. A kid who can't focus on the class is bored to tears. The disruption is an attempt to get out of the boring situation.
I meant that it was a typo.
As a child psychiatrist, I think this is important to keep in mind when we prescribe stimulants. At the same time, we're mostly focused on function and actual learning is probably more of a secondary or tertiary goal by the time we're seeing them.
What does function mean in this context?
Usually it's in the context of being able to function in school and home behaviorally and/or scholastically if they're meeting the criteria for ADHD.
They should’ve let the kids loose on Wikipedia instead and found a way to measure which group learned more
There was a Chinese study trying this but it had to be scrapped because the kids kept looking up the more exciting Russian history pages written by Zhemao.
Source? This sounds hilarious.
Didn’t find the Chinese study yet, but here’s one about the Zhemao thing: https://futurism.com/the-byte/fake-articles-wikipedia
This just seems weird. I would expect the dopamine spike from the medication to improve immediate recall in everyone.
I would also expect the kids to perform below baseline following the removal of the stimulant.
Immediately after? Or in the following days?
12-18 hours after final dose and following days.
About 12 hours after a dose the Concerta should be wearing off for the day. Lower performance wouldn't be surprising.
About the following days: from my own experience as someone with ADHD and having tried a few of the different medication options, including Ritalin and Concerta, the days after medication are usually still much better than the unmedicated condition.
Especially in the first month or two after starting treatment, taking meds on one day often provided ADHD relief for two days afterwards. I don't know what the mechanism is. Perhaps something about enabling better sleep by helping the circadian rhythm along?
Non-addictive stimulants don't give sharp stimuli-dependent dopamine spikes, they increase the amount of dopamine available overall.
Do you mean stimulants given at very low doses and either slow-release or repeated low doses, as for ADHD? Or is there a class of non-addictive stimulants that I should know more about?
I mean stimulants given at the doses used for ADHD. According to https://astralcodexten.substack.com/p/know-your-amphetamines, doses used in drug abuse are around 25x the dose prescribed for ADHD (and furthermore abuse often involves snorting or injecting the drug).
Cocaine, Ritalin and Amphetamines all increase tonic dopamine levels.
They’re all great nasal decongestants as well.
Selective attention for the win!
In my experience as a teacher, paying attention to the content is good, but thinking deeply about something sparked by the content is even better. Surface level knowledge often doesn't get beyond the flash-card recall level. But deeply mulling over facts or drifting between them in a semiconscious search for patterns can create lasting updates to one's mental models even once the facts are forgotten. From the outside that can look like spacing out. I may or may not be able to ask the right questions to see if anything interesting happened on the inside.
On the other side of the desk, the dullest teacher trainings I get to sit through are often the best, as I can't help but think about how much better I could have taught it... which means I end up thinking quite deeply about that topic and experiencing various micro-epiphanies.
Also a teacher - definitely all of this. Getting a kid to hold still and play the part of focusing is good for the class because it's less disruptive and can generally help the student Get It Done, but medicating a whole class of kids won't make the lesson or the topic interesting enough for them to retain it.
Sounds like we need a new set of medication, capable of making people find things *interesting* and not just force them to pay attention.
Education has been trying to hack this for years (ever?) from the teacher side of things. It's certainly a hard problem.
Unfortunately, making grade schoolers smoke weed is generally frowned upon.
Ritalin in the morning, marijuana in the afternoon.
That's at least part of the mechanism behind the effect of stimulants on concentration, though, in my experience. Maybe there's some sort of "short-term interest" vs "long-term ['actual'?] interest" difference — the former being what makes it possible to spend hours looking at ass pics, when on amphetamine, and the latter what makes assology ultimately less appealing than Bronze-Age history as an avocation.*
------------
*example may be, uh, somewhat idiosyncratic to Himaldr
Bronze Age history is awesome. They should teach more of it in school!
Agree
There's a very strong relationship (I'm not going to back this up with studies or whatever) between 'play', 'creativity', and more importantly 'learning'. Gell-Mann has that famous method for discovery where you (after doing all the necessary research and failing over and over to break through a wall) read the last word of the daily newspaper page and try to solve it with that-- Taleb has also spoken about how the introduction of uncertainty is basically mandatory for discovery. I would guess personally that however dopamine works, it has very long 'hangover' effects i.e. something you repeatedly experienced as pleasurable from a while ago can dominate future interests in some low-level way. The kind of intense focus involved with ADHD medication doesn't appear in any 'discovery' i.e. perception of novelty, and I think that both deep concentration and lateral thought are important. --This may be pseudo-science, however, since I'm sure that it's the kind of thinking that 'dopamine-fasts' and that sort of thing are built upon & I'm suspicious of people who promote it.
Regardless, I do think there's something fundamental in something like 'how one grew up using/typically used their dopamine' -- the people I know with the strongest interests in highly intellectual and complicated fields are all relatively distant from internet/tv, even with a slight natural distaste for sugary things... Again this is extremely anecdotal and probably a pseudo-science (I don't even personally think the sugar stuff matters). But I just wanted to say that, even in these cases, I can think of examples where they are both effective/interested in a deep/complicated domain, but also --by their own testimony-- get distracted and procrastinate. Actually seeing them in 3rd person makes me doubt their self-diagnosis of laziness a lot, because (a) much of what they do when they are avoiding work is still research in their field, requiring reading of quite deep texts, (b) even when not doing more such (unrelated) research, their activities away from work require a baseline of concentration and interest unusual for most people-- and especially unusual for someone their age (young 20s). But this is someone who grew up without access to internet/tv until much later, and I think that the access to those things during formative years _does_ affect one.
So to summarize, I think you are correct that there is a distinction between short-term and long-term aims, but that the long-term aims tend to be formed over years (& more strongly formed in youth), and that certain kinds of environments favour certain kinds of 'searching' behaviours (since we seem to get a hit out of novelty itself, changing domains often is the naturally favoured direction when technology gives us freedom to do so, rather than remaining in one domain). It must be a cliche by now, or a 'boomer'-esque stereotype (I'm thinking of those awful newspaper strips), but I think there _is_ something significant in how environmental changes due to technology are very formative in one's ability to sustain interest. (One problem is that even thinking of this as an 'ability' is fairly sketchy (in a philosophical sense), and that might matter in our capacity to understand and 'treat' problems with this ability.)
Final thing, since this got way too long and Idk if any of it is interesting at all:
People often bring up the marshmallow test to try and argue the importance of class differences (I know the creator claimed he controlled for class later) on self-control, and typically argue like this:
-The children are from more stressful environments (lower class), and therefore gather resources immediately whilst they are still around, rather than wait.
To me this is a coherent view of how one adapts certain behaviours according to their environment, and it matters if this view is correct/neurologically feasible, because it entails that environments (especially stressful environments) are _coherently_ addressed by our biology, and it's only _later_ that this 'lack of self-control' becomes an issue; by then the habits may be strongly ingrained. I remember that in terms of brain structure people with ADHD are loosely similar to those who have dealt with addictions, or those who have been exposed to repeated stressors, etc... I think more specifically in terms of frontal lobe development.
My final addition to this conception would be that I _do_ think that certain technologies (internet especially, tv a little less, streaming services/high optionality tv a little more) can prevent people from developing better habits. I don't really think they can be causative in themselves without very very long-term usage, from a young age, and that itself would usually signify some underlying stressor one is trying to escape. We think about dopamine-associations with substances and how this leads to harmful 'addictions': I think this is an interesting time in history because (as a continuation of things like gambling) we are seeing a mass example of an 'activity' [cluster of possible activities] which in itself is addictive i.e. behaviourally degrading, that is --simply by allowing us certain powers-- it teaches us to favour superficial and high variability (I think Kissinger recently wrote about this, not that his 'take' is especially unique; as I say it's a cliche by now).
Ok thanks sorry that got so long
Perhaps it's just the apology at the end, but I felt compelled to state that this was an excellent comment, and broadly matches a lot of perceptions and ideas I've had. So, thanks for that.
I think it could be further extended by thinking about what causes the dopamine hits in different people. I suspect that getting slightly more out of the "searching" behaviors involved in say, learning to play an instrument, than in doing math problems in early age helps explain why one person devotes more of their time to music than quantitative stuff.
Thank you for writing this, it's helpful and encouraging to hear. I'm also not putting things out there because I'm totally certain of what I say, so the intention is to have people counter me or improve upon what I was saying-- that said it's great to hear that you found it helpful! I like the direction you indicate for extending it, although I expect that at some point 'aptitude' becomes a dominating factor in one's continuation of an activity. I.e. at some point we probably get a conflict between something (close to?) 'innate' and something more flexible (dopamine/motivation/self-control... the fact that these things are more 'flexible' is probably why they are the centre of so many moral issues-- although many morally based arguments reject that they are as flexible as we think).
In high school I could read a book cover to cover in a day, sometimes two, stopping just for meals. It wasn't anything high-brow, usually, just some random sci-fi/fantasy from the library - but the books drew me in so much I had no trouble focusing.
Right now, after the mass assault on my attention from internet clickbait, always-on IMs / discord / slack, etc - I struggle to read 50 pages in a sitting.
I'm slowly recovering that capacity through reducing information overload, but I don't think it's even about the formative years, the capacity for attention is adjusted in an ongoing process.
Yeah, I have similar experiences.
Although I still suspect that the formative years are more important for these kinds of effects-- that the consequences in development will be long term-- you're absolutely right that one can feel these negative effects at any point in one's life.
I have also been surprised a few times about the apparent ability of the brain to recover after degeneration (in anorexia, which is very severe in its effects upon brain structure, the observed effects are much less apparent on the road to recovery https://www.bath.ac.uk/announcements/largest-study-to-date-reveals-stark-changes-in-brain-structure-for-people-with-anorexia/), so I think that the most important thing is to remove the underlying stressors, and at that point the brain can recover with adequate nutrition.
In that vein, I have read many encouraging anecdotes about people quickly regaining (over the course of a couple weeks to a month) much of their focusing capacity (there was a good article about it which I cannot find right now); but the main problem was that in all of the accounts the people were physically incapable of accessing the internet, and that as soon as they could access it once again the problems returned. I personally have a hard time imagining a good measure to emulate this in modern life (excluding things like abandoning modern life & 'moving to the wild', I mean). I suppose blockers can work? Or perhaps exclusively using the internet at work?
[aside: It's probably simplistic to think of it straightforwardly as an addiction, since it has a lot to do with the fact that we tend to favour a number of very attention-superficial sites (twitter & other social media), whereas probably reading long articles would be fine. The optionality is the main aspect here; it represents something similar about addictive activities as gambling, where the 'hit' comes from a kind of emergent novelty.]
I'm reading a book about this right now, actually, called "Deep Work" by Cal Newport and it's really resonating with me. About attention and concentration and distraction.
Well, there's always meth.
Meth is actually a highly effective and legal medication for ADHD (very low doses, compared to recreational use, as you can imagine). The problem is, no one wants to be the doctor who prescribed meth for an 8 year old! Or even for a 38 year old.
Legal in the US??
As Desoxyn®, yes.
That's called LSD
As a user of ADHD meds I can say no medication will make content interesting. I find I'll only use it when I really need to get through something, and usually best if the content is familiar but uninteresting. Also very helpful when work is tedious, e.g. coding or spreadsheet crunching. That's where the chemical boost really helps, since the mind easily wanders. When the subject is inherently interesting or novel, ADHD meds almost hinder learning ability since the brain is somewhat 'in a rut' of concentration. The new neural connections are made less easily in a medicated state. This may vary for different/younger folks, I'm 31 now and started using ADHD meds at the age of 22 where I did find it useful in learning new things. I'm still of the view that giving kids ADHD meds (say below 20 y.o) is not appropriate except in very dire circumstances (dangerous or delinquent hyperactivity).
Definitely case by case, but in my experience (largely elementary sped situations, also diagnosed/on Adderall starting at 23), giving kids meds like this on the regular at a very young age can prevent them from learning to deal with their own nature. The medication replaces adapting to themselves and their environment. Me and my brother (very ADHD, teachers *begged* my parents to diagnose and medicate, they refused) both had to learn how to succeed in the school environment without the focus pull and it made success all the more rewarding. I have a better understanding of my meds as a *tool* than as something I *need,* a distinction I would not have been able to make if I'd been placed on them early.
That said, I've certainly seen cases where it seems like a good choice to medicate. Case by case, but I think it's important to err on the side of "teach kid to deal with it" over "give focus drug."
I take some issue with comparing ADHD etc to things like cancer and broken legs, which are only problems. But point taken.
Personally, I've seen a lot of kids who were *over* medicated. This makes me wary of medication in general and I think of it as a last resort. I recently worked with a student (8 years old) who talked *constantly* as you describe your son - he's also incredibly gifted, but exasperating to the adults around him who don't care about eg Roblox. School is difficult for him, so I'm glad things are going well for your son!
Coming from the school system I feel strongly that many cases of medicated kids could be UNmedicated with changes to the environment - kids don't have enough recess (45 minutes in 6.5 hours!), schoolwork is increasingly completed on computers, and very rarely are there the kinds of hands-on, constructive projects that would help more students care about and focus on the work (ie, a third grade castle project I witnessed this year was just a drawing, whereas I distinctly remember having to build a model in my own elementary school experience).
The incentives for the school system are so poorly aligned and disconnected from each other that I have a hard time coming up with solutions to this that could actually be implemented. There's a lot of hard problems wrapped up together and ultimately everyone needs to make the decisions that work best for them, I just wish the world were better set up for the students for whom school is not a healthy environment.
This makes absolute sense.
You're one of the good ones. I was prescribed Ritalin in the 90s because a teacher pressed that I didn't pay enough attention. Treatment was stopped after some years when my parents didn't notice a difference.
Today, I would not describe myself as having ADHD/ADD. My experience informs my bias which is that less rigid attention in children does not necessitate that they experience life-long ADHD; in part there is a symptom of simply being a child, or being of the day-dreaming persuasion. Not only is it over-prescribed, if we're to believe this research it barely provides value. Whether or not they actually have ADHD might be a moot point.
I had trouble focusing for the entire duration of school, for two reasons: a) insomnia, b) boredom. At the time, sleeplessness was not seen as a genuine concern and no one was equipped to help me deal with it. That mostly remains true of GPs today (everyone's first line of defense), but they can refer you to specialists or therapists who know somewhat better. Fortunately there is plenty of helpful material out there if you seek it out and take the time to learn by yourself.
In Artificial Neural Networks, there's literally a variable called 'Learning Rate'. This is a number, you can set it to anything you want. And indeed, the higher you set it, the faster the ANN learns, all else being equal! But also, the higher you set it, the lower the maximum skill level the ANN can attain before it reaches its limit and plateaus.
And then there are learning rate schedules, too.
I had the same thought, but I think we have to be careful here because it'll be easy to conflate concepts. When we talk about a student's learning rate I think we're likely to be referring to how much information we expect them to retain or how hard/fast we push them to demonstrate knowledge.
But AI learning rate is essentially how closely the machine learning function traces the curve it's following. A high-learning rate AI will make massive jumps, mapping huge regions of the learning space, but at very low detail. A low-learning rate AI will make smaller jumps, mapping more slowly, but also more precisely.
It's not the difference between covering one chapter and two chapters in a textbook, it's the difference between reading and retaining the same amount of content in a survey-level course and a graduate level one. The survey course will give me a much better high-level mapping, while the graduate course will give me a lot of info on one particular part of the map. If I've already got a good survey-level understanding of a subject, taking another survey course is unlikely to provide me with much value. I'll spend much of my time going "yeah, I know this." Meanwhile if I take a graduate level course on quantum physics, I'll converge towards understanding very slowly, because I don't have that high-level mapping.
I'm not sure if there's a brain chemistry/intelligence interpretation of all this or not. As an ADHD person, my best interpretation would be that I don't have a lot of control over my learning rate, sometimes going very deep on poorly understood topics (and then failing to retain anything), and usually going very broad because deep understanding is hard with low dopamine payoff. I suspect that the bottleneck Scott refers to is more analogous to an AI's computational resources or architecture than to learning rate.
No I'm aware - ANN learning rate is about update speed, AKA 'How Much You Change The Numbers Each Step', not the actual learning that results from these updates. I still think it's a relevant comparison, however. For myself, I feel like my learning rate is set rather low, so I learn slowly, but can eventually build up lots of detail, if I stick at it.
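For anyone who wants to see the trade-off discussed in this subthread, here is a minimal sketch, not taken from any commenter's setup. It assumes a toy one-parameter problem (minimise 0.5 * w^2) with artificial noise added to each gradient as a stand-in for mini-batch noise; `sgd_losses`, `summarize`, and all the numbers are hypothetical choices made for illustration. Under those assumptions, the high learning rate gets close to the answer much sooner, but then keeps bouncing around it and settles at a worse average loss than the low learning rate eventually reaches.

```python
import random


def sgd_losses(learning_rate, steps=3000, start=10.0, noise_sd=1.0, seed=0):
    """Run noisy gradient descent on loss(w) = 0.5 * w**2 and record the loss.

    The true gradient is w; Gaussian noise is added to imitate the
    randomness of mini-batch training (an assumption of this toy model).
    """
    rng = random.Random(seed)
    w = start
    losses = []
    for _ in range(steps):
        noisy_gradient = w + rng.gauss(0.0, noise_sd)
        # The learning rate only controls how far each update moves w.
        w -= learning_rate * noisy_gradient
        losses.append(0.5 * w * w)
    return losses


def summarize(losses, early_step=20, tail=1000):
    """Return (loss after `early_step` updates, mean loss over the last `tail` updates)."""
    early = losses[early_step - 1]
    plateau = sum(losses[-tail:]) / tail
    return early, plateau


fast_early, fast_plateau = summarize(sgd_losses(learning_rate=0.5))
slow_early, slow_plateau = summarize(sgd_losses(learning_rate=0.01))

print(f"high learning rate: early loss {fast_early:.4f}, plateau {fast_plateau:.4f}")
print(f"low learning rate:  early loss {slow_early:.4f}, plateau {slow_plateau:.4f}")
```

In practice, learning rate schedules try to get both benefits by starting with large steps and shrinking them over time.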
I've taught 4th grade for almost twenty years, and I've had students who I would have sworn were not paying the least attention and were not getting a thing I said...but who, it turns out, were getting a hell of a lot more than I thought they were and who wound up doing pretty well. It's not always easy to judge such things by the obvious, superficial indicators.
Maybe those were just the smarter kids?
I have a memory from 2nd grade where everyone was at their desks learning cursive, and I was in the corner reading a book. Occasionally I'd glance up and memorize the pattern of the new letter the teacher was drawing on the board. I'm sure I also did some practicing (the kinesthetic stuff is important too), but... yeah, it usually would have looked like I wasn't paying attention at all.
There were lots of times I'd look like I was sleeping but was actually listening carefully. (Other times I actually was sleeping.)
AI was brought up at the end, but maybe not the most relevant example. Many neural nets employ "dropout", where 30% or so of the neurons are turned off at random. This seems to help the network develop resiliency, and not depend too much on any particular set of neurons. To extend the metaphor to its speculative extreme, one could imagine that with all neurons paying perfect attention to the task at hand, you would be better behaved for sure (no neurons to misbehave with), and you might even perform better on immediate tests, but you might not "learn" in a resilient way.
https://www.lesswrong.com/posts/fg9fXrHpeaDD6pEPL/truly-part-of-you
Relevant.
Dropout in NNs is mostly there to prevent overfitting the model; the closest analogy in teaching actual humans is probably rote learning (as opposed to conceptual or deep learning (I just love how the terminology gets mixed up when talking about humans and NNs))
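For readers unfamiliar with the mechanism being discussed, below is a rough sketch of "inverted" dropout, one common way to implement it. It is written in plain Python rather than with any particular framework, and the function name `dropout` and its parameters are hypothetical. During training each activation is zeroed with probability `drop_prob` and the survivors are scaled up so the expected total stays the same; at test time nothing is dropped.

```python
import random


def dropout(activations, drop_prob=0.3, training=True, rng=None):
    """Inverted dropout over a list of activations (illustrative sketch).

    Assumes 0 <= drop_prob < 1. Each value is zeroed with probability
    drop_prob; kept values are divided by the keep probability so the
    expected value of every unit is unchanged.
    """
    if not training or drop_prob == 0.0:
        # At evaluation time every neuron participates.
        return list(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)
hidden = [1.0] * 10000
dropped = dropout(hidden, drop_prob=0.3, rng=rng)

fraction_zeroed = dropped.count(0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(f"fraction zeroed: {fraction_zeroed:.3f}, mean activation: {mean_after:.3f}")
```

Because a different random subset is silenced on every training step, no single unit can be relied upon, which is the regularising effect both comments above are pointing at.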
I don't think the implied premise of attention in class = learn more material is rock solid to begin with. We don't know what makes kids actually learn better, or at least we don't apply it in schools; instead they get an overworked 27 year old's heuristic approach to what they think teaching is supposed to be.
By the time they are in this study I think the boys have just developed learning styles that fit their strengths and weaknesses: an ability to catch random detail, but short attention spans. Having been in that situation (and been on and off Ritalin) I am unsurprised that the boys who continued to use the learning style they were experienced in kept up with those who were medically given different strengths they weren't used to.
I'm curious what would happen if you tried this with non-ADHD students. I would expect a stronger pro-Ritalin effect as you'd be strengthening a preexisting strength rather than a weakness.
Right. As an ADHD person, this study simply seems to be asking the wrong question.
For me as a kid, the problem was *never* being able to hear, understand, and retain what the teacher was saying.
I suspect you're right in the extreme case: there are certainly people you couldn't teach quantum mechanics to even given 20 years.
But that type of extreme case shadows a *lot*. In practice, we don't really care if we can teach someone something over truly long periods -- if someone can't understand a fairly wide topic in 1 year of poorly focused instruction (a classroom of 20+ students as compared to, say, a private tutor), we have no interest in teaching it to them. Given how long it took me to understand (for example) partial differential equations, I'm confident we lose a lot of potential this way. In other words, we rarely make anything like a serious effort to optimize training our meat computers in any task or subject area. Not sure that's relevant to stimulant function in the classroom, but if we ever work out an educational method that *isn't* inefficiently (human) labor intensive, it means we could see fairly dramatic gains.
More on topic, to echo other people here: attention is not a substitute for interest. I have paid attention to a lot of classroom instruction that I knew I had no interest in and was going to forget immediately because nothing had flipped the switch in my brain saying "this is actually worth knowing". My impression is that there are chemicals that might have something similar to that effect (among others), but they're mostly illegal drugs.
Intuitively, I feel like the analogue of training data for a vocabulary word would be something like how many examples you've seen of the word being used. I don't think the *primary* way humans learn language is by memorizing formal definitions.
"a little more quantum physics every day for twenty years, and eventually expect them to know as much as a smart person would after getting a four-year degree."
I think this would work (and does work - relatively poorer learners in schools at the moment still learn a lot of stuff by the end of school). But there's a condition. I teach English to children, and one of the problems I find that some seem to have run into is coping strategies. We get kids coming in at about age 10, who have been having English classes for four years (they start in 1st grade) and have at some points performed reasonably well on school tests, but they can't identify English words like "I" or "and". They seem to have developed (highly sophisticated!) strategies for giving correct answers on tests and in class, despite not knowing any of the stuff that we call "English".
Those strategies sometimes block any further learning. The student's goal is usually nothing more or less than to pass the test; if they have a strategy that sometimes/consistently enables them to do this, they will actively resist taking on new information that might disrupt their strategy.
I guess that means I agree with Scott that school is badly aligned, i.e. for a large number of kids, it never becomes obvious that getting good at the school subjects is the best way to get through the day; and they aren't given compelling other reasons to want to get good at school subjects. And a little bit of concentration probably doesn't make a big difference.
But I feel like it should make a big difference over the longer term, because the kids will build up marginally more experience of having paid attention in class, heard something, used it in a test later, and gotten through the day a little easier.
What are some of these highly-sophisticated strategies? Unless the tests are very regular in a way that I should think it would be a goal to avoid (e.g. "C is usually the answer"), I can't really imagine what they could be.
Sure, it's a bit mind-boggling. They include things like: in class, swift mirror-repetition of the last two words the teacher says, which apparently in their large-class situation in public school is enough to make the teacher think they've answered the question and move onto the next kid, but can be achieved without any mental processing or understanding at all! Use of muttering, finely tuned to the ears of their English teachers, so they produce a sound-sludge that seems enough like an answer to get the teacher to move on. Waiting for and imitating classmates' whispered answers. Verbal parroting, so that sometimes, if you give the prompt "spring," they can respond with "summerautumnwinter," without necessarily knowing what they mean, or even that they are three words.
In writing, they sometimes learn to write whole lists of words in a particular order rather than learn what each word means. (The exams are indeed quite poorly designed, so this strategy gains marks.) Words may be learned in relation to specific prompt images, so a child may reproduce "summer" next to an image of the sun, without knowing or caring that the word means summer. And they may learn to reproduce quite long passages, several sentences long, without knowing what they mean (much like a list of words). These passages can be inserted into the writing part of an English test, and will still get you 4/6 or 6/8 marks, even if they're not on-topic, because these are primary school exams, and marked generously.
It's bad teaching and bad exams that allow these practices to flourish - but that's hardly uncommon!
Fascinating. Initially I assumed for some reason that the students you teach are mainly Spanish or Creole speakers. However, I've heard anecdotes about something similar in the context of standardized test prep, so now I'm wondering, are the students who employ the strategies you describe predominantly Asian?
As it happens, yeah. I live and work in southern China, so they're all Chinese students. I should stress, though, that talking about who the students are seems unfair, because these are so obviously inculcated by the school system. It's not because of who the students are, it's because of the system they're in. English teaching in Chinese public schools is uniquely bad because so many of the teachers can't speak English at all (some literally refuse to speak English in the classroom), and the textbooks they use are conversational English textbooks. (I actually quite like the public school textbooks, but they're wildly unsuitable for large classes with non-native teachers.)
Well, the reason I focused on who the students are is that about a decade ago I heard from a teacher in California that Chinese students getting test prep in hagwons were "gaming" standardized tests via some method that sounded really improbable, but that lines up with what you described.
At the time, I thought that if kids (and adults) really were pulling that off, then they must have some intellectual advantage others (and certainly I) don't, even if they're just learning and employing a successful stratagem. Calling it "gaming" seemed sort of, well, biased.
But the details you describe paint a fuller picture and make me reconsider.
Yeah, that kinda makes sense. Perhaps techniques that are survival skills in the Chinese system are like superpowers in other systems.
Good test design can defeat these sorts of tactics.
I have noticed that some people write test questions in a particular way that is highly susceptible to gaming the answers, resulting in you being able to correctly answer questions even when you don't actually know the answer.
You can also defeat this in other ways, like asking more open-ended questions or making people do projects which require them to show comprehension.
Of course, people start complaining when you do this and half the students fail.
Thanks for writing this. I have seen similar behavior in college students, and never really knew what to make of it. Getting essay answers that I would now say read like they were written by an AI via "this word and that word usually go together, so I will do that." I had often wondered if they were just throwing together word salad in panic from half remembered information, or just really bad at writing. The notion that it is a perhaps unconsciously learned strategy makes a lot of sense, and starts to suggest solutions for working around it.
Yep, for us it's pretty much always the same thing: you have to go back and teach things that are much more basic than you think. For example, in my language learning context, when students struggle to learn vocab, it's because they don't know the speech sounds. I hear them spelling the word "but" by saying, "b-a-t - no, not that a, the other a." It doesn't matter how much work you put into the teaching of vocabulary when the students literally can't hear the difference between one vocabulary item and another. You have to regress the extra step.
I don't teach anyone of college age, but with middle school students, I've recently been noticing how difficult the language of questions can be. Sometimes they may know the material, but not be able to understand the question at all - and are unable to recognise that failure of understanding in themselves, or unable to admit it. So you have to do a bunch of diagnostic investigation before you can even get started.
Why am I reminded of when Feynman tried to teach in a Brazilian university and discovered that the students had only ever memorized passwords and didn't understand anything?
Yeah, I assume this is a universal phenomenon. I'm guilty of all of these shortcut approaches sometimes (watching me trying to copy-paste learn coding off the internet would make a grown man cry)! But what was surprising to me was to find that a whole chunk of a (reasonably "successful") education system could be run so as to be, for a significant number of kids, nothing more than a mechanism for forcing kids into intellectual bad habits. I'm still not fully on board with Scott's anti-school thing, but this is the kind of phenomenon that could push me that way.
If the whole multiple-choice exam method were thrown out and replaced with intelligent questions, that kind of thing would tend to disappear.
One of my classes was 240 students, I was the only one involved on the staff side, and the administration wanted the marks back within 3 days of the test at the absolute latest, no excuses. That is why it was multiple choice. I conjecture that teachers would drop multiple choice in favour of intelligent questions all by themselves if the circumstances made it possible.
Very interesting! Thank you for taking the time to provide these examples.
I imagine these types of strategies take place much more than generally acknowledged, across all ages and domains. I saw similar things taking place even in university-level CS courses: students would attempt (and often succeed) to "game" the compiler / homework problems, and still get credit without understanding the syntax or semantics of the code they were submitting.
As a father of two currently homeschooled kids, I'm noticing a very similar phenomenon. The kids seem to treat learning tasks as some overly-complicated game, and view their main optimization task as making as much progress in it as possible while using as few mental resources as possible. Like, when using a language-learning app, they would skip listening to the instruction and try to guess the answer. That's despite the fact that I actively try to downplay the importance of getting results (in the form of correct answers); there's no score in the app, no penalty for wrong answers, etc. It's just that actively learning is harder than guessing, and when there's no internal incentive to learn (or the benefits are too distant to really register), the mind starts optimizing ruthlessly. The solution seems to be to try to align the most immediate incentives somehow, and hope that the more beneficial time preferences develop eventually as the child matures.
I'm just in the middle of making a homeschooling decision. Mine have been in school up till now, and seem to be a bit zombified by it, and I would love to pull them out and let them relax and develop a bit. But that kind of thing could make my relationship with them very tense, so I'm worried about it. Thanks for the salutary thought!
I agree that you could probably teach almost anyone quantum mechanics given enough time. I think the main obstacle is most likely low academic self-concept / self-sabotage.
The reason I think this is that I don't really think the variation in human intelligence is that large on an absolute scale, I think it's more like temperature (we feel a huge variation between 0C and 40C but temperature can go a lot lower). If you look at AI performance in domains where it's superintelligent (eg Go) you'll see average human intelligence is a pretty narrow band that AIs quickly fly through. I can't see any compelling reason why the threshold for being able to learn QM is neatly inside this narrow band.
To me a much more plausible explanation is low academic self-concept. I hypothesise that if I went to school in a class where people learned 2x as fast as me and I always fell short, I would become convinced I couldn't learn as much as them, so any attempt to teach me would result in negative thought spirals causing me to fail to learn. I saw this pattern often when I did private tutoring - I've found it's actually not hard to teach difficult concepts 1-on-1 to people with these deeply ingrained insecurities, but they find it psychologically difficult to be persistent and follow through (CBT might help?).
Could the answer lie in something like the lessons these kids were taking having been the same (i.e. using the same plan and resources, with teachers using the same strategies) and basically optimised for kids with ADHD? If so, perhaps the results just mean that Ritalin works *and* SEND teaching works (I've spent this year doing teacher training, and we keep being told that the ideal goal is for a kid with ADHD in your lesson to be able to learn as much as a neurotypical one, and it sounds like these lessons might just have reached that standard).
Re the Ritalin study, which showed that attention improved but learning not so much when ADHD kids got Ritalin, how about the idea that engagement is necessary? When taking a test, students are usually trying to succeed, so they’re engaged. Typically, classwork is boring, so they’re not. I recently read of a study that showed that 1:1 tutoring gave better learning results than either of two classroom paradigms. Could it just be that 1:1 interaction is so much more engaging than being in a classroom? Perhaps in a 1:1 situation, Ritalin would help ADHD kids learn better. Or in other engaging situations, like, say, chess or video games or something.
1:1 tutoring famously outperforms just about every educational intervention ever, offering about 2 s.d. improvement in outcomes, but it also obviously is extremely expensive to scale up so it remains underutilised.
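To put the "about 2 s.d." figure in more familiar terms, here is a rough sketch of the percentile arithmetic. It assumes outcomes are normally distributed, which the comment above does not claim, and the function name `percentile_after_shift` is made up purely for illustration.

```python
from statistics import NormalDist


def percentile_after_shift(effect_size_sd: float, start_percentile: float = 50.0) -> float:
    """Percentile reached after improving by `effect_size_sd` standard deviations.

    Assumption (not from the thread): outcomes follow a normal distribution.
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Convert the starting percentile to a z-score, shift it, convert back.
    start_z = standard_normal.inv_cdf(start_percentile / 100)
    return 100 * standard_normal.cdf(start_z + effect_size_sd)


# An average (50th percentile) student who improves by 2 s.d.
print(round(percentile_after_shift(2.0), 1))
```

Under that normality assumption, a 2 s.d. gain would move an average student to roughly the 98th percentile of the untutored distribution.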
I think there are many factors that make 1:1 tutoring better, having done a bit of it as a student. Firstly, yes, it's much easier to keep 1 kid engaged than a class - there is often no framing that all 30 kids in a class would find interesting, so teachers have to settle for getting most of the kids at best, but with only 1 you can tailor your approach precisely *and in real time* - with only one student, you can watch their face to see what's clear and what's not, what's interesting and where precisely you lost them.
Secondly, you can repeat things exactly as much as needed - instead of, say, three repetitions for everything, you can do one or two for the things that the kid gets easily, and have enough time saved as a result to afford to spend 6 or 7 on the one thing they're truly stuck on.
Yup. But then the question I was trying to ask was whether Ritalin would help ADHD kids learn more in a 1:1 learning situation.
You can also ask exactly what they do not understand and pinpoint the error they make, understand where their logic/understanding breaks, and try to correct it using counter-examples, alternative ways to do it correctly, analogies with something they like or can already do, and so on.
It requires one-to-one interaction because, unlike a class, it's a real conversation. It also requires the teacher to be really proficient in the subject he teaches, not only able to follow the program and pass the exams himself (which is often the case).
That's my experience teaching math to friends who were much less gifted/interested in math than I was (the two usually go hand in hand: it's rare to be interested in something you are not doing better than most, and vice versa).
And there is also a peer effect: being taught by a peer is a different psychological experience than being taught by an authority. Some students like it better, others need/crave the authority, and a class setting is authoritarian.
Now doing that helped them pass the exam and really improved their math skills, for that particular year. But from what those friends told me, it had no lasting effect. Probably because the only way to keep the skills is to use them, by obligation or interest. They had neither...
What you're pointing to is this: design instruction that mimics what a tutor does. For example, the students need to make active responses quite often; and the instruction needs to respond to the students' demonstrated understanding, or lack thereof; and the students need to have opportunities to ask questions that actually shape the direction of the instruction that follows. These kinds of things can be managed.
I don't think so. Not in a class of 15+ students. Maybe with 3-5, but with more than that the ambiance changes drastically, and nothing resembling a conversation takes place anymore. At least that's what I get from my very informal teaching attempts in school (one or two friends) and from being a teaching assistant during my PhD. Teaching 2 friends or a group of 3 students for a project is a very different experience (a much more pleasant one for me) than teaching a group of 15. That was awful, and the only way I made it bearable was to split the sessions into an ex cathedra part (awfully close to being on stage, which I despise, and I probably sucked hard at this part) and a return to informal one-to-one, with me moving around. But it meant that instead of 2h of one-to-one tutoring, each student got the equivalent of a 1h online pre-recorded course (only less smooth) + 10 minutes of tutoring. I am far from a good teacher (although I apparently am a good tutor, at least if I do not lose patience), but having experienced good ones (I think), I see no way to replicate one-to-one (or one-to-a-very-few) tutoring in a class of 20+ students. Except by the very time-inefficient method of tutoring each student in turn...
It's quite possible to do what I said. I do it. I think you're missing the point. Tutoring doesn't work because it "resembles a conversation." It works because it elicits frequent active responses from students, because the tutor responds to the input from those responses, and so on. A group can get a lot closer to the relevant features than it does. For example, you can give students individual whiteboards and have them all write questions for the teacher at once; you can pair up students and have them explain things to each other; and you can actually observe what they are doing and then shape your instruction accordingly. No one said it was easy, but it's possible.
My high school physics class used the whiteboard thing.
He considered it shameful if any of his students got a 4 on the AP test. You were supposed to get a 5.
Most of us did.
You could do this even more easily with modern digital tools.
That explains a lot. Seriously.
Hang on, I'm not following the logic here.
Solving math problems is one thing. Following rules is another, related, thing. Learning is another, and doesn't really seem all that closely related to either. You treat it as strange that improving the first two doesn't seem to affect the third. But really, why should it?
You say "Concerta's clearly doing something". And you're right. But why jump to the conclusion that the thing it's doing is "making kids pay attention"? There are other things that could explain better student performance.
I am a math tutor, and I often find myself wishing that my students could become dumber at will. That when the time came to do arithmetic, they could shut off most of their brains and execute rules mechanically. Because frankly, most thoughts are simply obstacles or distractions when you're trying to do kid-level math. That's why a simple computer can out-calculate a brain that has vastly greater processing power.
My first thought, when I heard about the effects of Concerta, was that it might be doing something like that. Inflicting useful stupidity, desirable narrow-mindedness. Which might actually make people worse at learning, not better. A wandering mind is a problem when you're doing arithmetic, but I think it's a good quality in a student overall.
Your point seems very relevant for the kind of learning that enriches a person, but less so for the "learning the teacher's password" that this study examined, which I *would* naively expect to benefit from the kind of narrow-mindedness that you're thinking of.
Possible, but learning math is often a much more intellectual process than doing it. It really does help to understand what you're doing and why, even if the actual task is just following memorized steps.
To be honest, I'm not entirely sure why it helps. But the students who "get it" always outperform the students who don't.
Wouldn't this show up on the study's metrics just fine?
> My first thought, when I heard about the effects of Concerta, was that it might be doing something like that. Inflicting useful stupidity, desirable narrow-mindedness. Which might actually make people worse at learning, not better.
The experience of stimulant medications for me is very much this. In normal everyday ADHD life, the senses are wide open. I'm absorbing many things at a time. Lack of stimulus is unnerving. The key point here, though, is that I am actually absorbing those things. My brain is taking it all in and retaining what it feels is important.
Add stimulant medications and it's like looking through a telescope by comparison. Much, much less "wide open" than usual, and much more focused on one thing. That is VERY helpful for DOING. It's not especially helpful for LEARNING except where learning takes the form of doing, as in the case of doing homework.
I have very much not had that experience. Learning is much easier when you can focus on something long enough to actually learn about it. Or when other things, which your brain feels are important but which actually aren't, are not intruding.
I think I know what you mean, but dumber kids aren't better at arithmetic. Rigidly simulating an abstract rule-based system IS a kind of intelligence.
This is entirely anecdotal evidence, but as a "gifted" child who got diagnosed with ADHD at age 27, I've always completely separated my ability to learn (processing+analyzing information) from my attention issues. I take an off-brand Concerta, and what it allows me to do is exert less effort to do things I otherwise get paralyzing executive dysfunction for, i.e. just about everything in my life that I'm not really interested in on a whim. Concerta is a tool that allows me to direct my attention better, and most importantly it helps massively with time regulation: so less "oh I can finish this task in 20 minutes because I couldn't make myself do it earlier" and more "ok I want to do this, I can do it at X hour, and allocate Y duration to it." This process is near impossible without the medication. As a child this translated to setting my alarm for 5am, only to fail to complete homework and end up scribbling it in class before it was due.
As stated at the end of this piece, I feel like the major hurdle in these types of studies is defining significant variables that first qualify the ability to learn, separate from overall attention. But you very quickly get into extremely murky territory (and I'll admit I'm not super well read on this topic because I find filtering out my personal bias/experience extremely draining, a poor show of rationalism at work).
I share much of your experience. I was always a distractable kid, but I was never diagnosed with ADHD, due to a combination of success despite distraction, my parents' attitudes towards education and medication, and going to elementary school before Ritalin became the rage in my family's social cohort. My executive function was bad (probably not as bad as yours), but that weakness never caught up with me because while it might have taken me four hours to master an algebra concept that should have taken one hour, I had four hours to spend on it. College and law school were tough, but it was only as a young attorney, expected to work through tedious documents quickly, that the volume of work exceeded the available time. I got on stimulants at that point.
I don't take stimulants every day. When I do, I can't perceive a difference in the quality of my work. Some of that is surely that a legal brief cannot be measured the way that a page of twenty long-division problems can. But the biggest difference is simply time. On stimulants, I sit down and do the work. I don't take three bathroom and two coffee breaks every hour. For what I do, getting the writing done is what matters. There could be differences in the quality of my writing on and off stimulants, but the crucial thing is that the brief get finished and filed. A so-so brief can still win; a brief that never gets filed because I was writing on SlateStarCodex is a professional disaster.
Could it be that Ritalin is somewhat dumbing down the brain's ability to process new information, which counteracts the increase in focus?
> Something like this must be true if we assume that it takes a certain intelligence level to learn surgery - or quantum physics, or whatever. Otherwise you could get a very dumb person, keep teaching them a little more quantum physics every day for twenty years, and eventually expect them to know as much as a smart person would after getting a four-year degree. I’ve never heard of someone formally trying this, but I predict it wouldn’t work.
... Don't we all start off as very dumb people, and we keep learning a little more every day for many years until we can do the complicated stuff?
Kids with low IQ hit the same milestones as kids with high IQ, just slower. Why would this be different in adult life?
When I tutored in college and HS, and now train (“capacity build”) as an adult I do not find the last graph true at all. Basic algebra is just literally beyond some people. And more and more things as you move up the intellectual hierarchy. Now how much of that is focus/interest versus raw intellect is maybe hard to say.
Could you torture the non algebra understanders into understanding algebra with the right negative stimulus? Maybe? But copious positive stimulus definitely has no effect for some people, once you start getting into mid level HS work. And these are not people who are “morons” or whatever medicalized term you want to use. Just on the low end of the distribution.
Given how hard it is for some adults to uptake certain skills/concepts even with the literal gun of “you will be fired if you don’t perform better” pointed at their head, I don’t think the “just slower” model is correct.
> Basic algebra is just literally beyond some people. And more and more things as you move up the intellectual hierarchy. Now how much of that is focus/interest versus raw intellect is maybe hard to say.
Set theory is beyond most set theorists, which is why Skolem's Paradox is simply swept under the rug.
Surely there's a threshold IQ for mathematics and abstract conceptualization. The more complex and logically fraught, the higher the threshold. Algebra, which is minimally troublesome and merely involves using letters to stand in for variables, has a low threshold -- but a threshold nevertheless -- whereas quantum mechanics and set theory have very high thresholds. With enough guidance, rote memorization, and mechanical problem solving, people below the threshold might be able to "fake it" -- but they won't be able to develop a comprehensive understanding or apply the abstract concepts they're "learning" to their daily life.