I feel that is incredibly dependent on the definition of language. Chomsky's definition seems incredibly rigid, as is common in foundational research. "A structured system of communication that consists of grammar and vocabulary" is true of various bird songs. The objection then becomes that grammar is more than just structure but a structure of a certain complexity defined to separate human language from other animals, which seems very post-hoc to me.
>The objection then becomes that grammar is more than just structure but a structure of a certain complexity defined to separate human language from other animals, which seems very post-hoc to me.
You have a good point, but there are fairly obvious measures of the complexity of human language which I'd be surprised to see any other animal's language reach.
For instance, human sentences are sufficiently varied that, barring quotes, repeated sentences (except exceedingly short ones, "Damn!" is common) are rare. Does any other animal match this? This doesn't require any specific properties of the language's syntax.
The simplest and most distinctive features of language:
A very large fraction of utterances (e.g. sentences) are novel combinations of existing elements
The patterns by which these elements are combined are recursive (e.g., any two sentences can be combined with “and”, noun phrases can combine with a preposition to form a modifier within a bigger noun phrase).
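To make the recursion point concrete, here's a toy sketch of my own in Python (not anyone's actual grammar): a single recursive rule along the lines of "any two sentences can be joined with 'and'" already means there is no longest possible sentence, only practical limits on memory and patience.

```python
import random

# Toy fragment: three canned sentences plus one recursive rule.
BASE_SENTENCES = ["the bird sang", "the dog barked", "it rained"]

def sentence(depth=0, max_depth=3):
    # Recursive step: the rule's own output feeds back in as input.
    if depth < max_depth and random.random() < 0.5:
        return sentence(depth + 1, max_depth) + " and " + sentence(depth + 1, max_depth)
    return random.choice(BASE_SENTENCES)

print(sentence())  # e.g. "it rained and the bird sang and the dog barked"
```

The max_depth cap is just there so the toy example terminates; in the grammar itself nothing imposes an upper bound.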
Does anyone know what the Chomsky followers would say about the recent findings that non-human animals (a large number of them) clearly have methods of communication?
I'm not sure what important recent findings you're referring to; the fact of non-human communication is ancient knowledge. What Chomskyans would say is that human language is evolutionarily discontinuous (it is not an outgrowth of the same phenomenon in other species) because there are categorical differences. Nonhuman communication systems are syntactically basic (certainly non-recursive), goal-orientated, lack reference, conceptual labelling, symbol arbitrariness and so on. Claims to the contrary based on e.g. training apes to sign have always anthropomorphised the data and there is not wide acceptance that any species has demonstrated human-like capabilities.
Confirmed by a Chomskyan (me!). Also, note that language is not a tool for communication, that function is secondary and forced (see the quote in the beginning of the article), it is originally a tool for structuring thought.
"language is not a tool for communication, that function is secondary and forced (see the quote in the beginning of the article), it is originally a tool for structuring thought"
Can you summarize, for a complete lay person in this area, what the evidence is to support this claim?
Multiple homonymies, synonymies, and other inconsistencies between the likely syntax (and semantics, which is read off syntax in a not-quite-trivial way) of sentences and their observed morphophonologies.
Yeah, got too jargony there for a second (although that itself is a good illustration!). Like, take the "I saw a girl with a telescope" example above. It has a structural ambiguity (do you first join "girl" with "with a telescope" or "a girl" with "saw") _and_ a lexical ambiguity between instrumental and comitative (in Russian, for instance, these meanings are expressed differently). You can also show each of the homonymies separately. We think hierarchically and only then try to push that hierarchy into a linear sequence (a sentence); and, to add insult to injury, our lexical system is also rife with various meaning switches (metonymies, metaphors, and so on): "I saw a mouse" is ambiguous between a computer gadget and a small animal, and it persists despite us usually not noticing a similarity between the two (although it is, of course, originally, a metonymy on visual similarity).
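For what it's worth, the structural half of that telescope ambiguity can be written down as two different hierarchies over the same word string. A rough sketch, with nested tuples standing in for constituents (my own simplification, nothing theory-specific):

```python
# "I saw a girl with a telescope": one linear string, two hierarchies.
attach_to_noun = ("saw", ("a girl", "with a telescope"))   # the girl has the telescope
attach_to_verb = (("saw", "a girl"), "with a telescope")   # the seeing used the telescope
print(attach_to_noun)
print(attach_to_verb)
```

The instrumental/comitative ambiguity is then a separate, lexical layer on top of either structure.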
There's an often-found claim that most homonymies are not a problem in real speech (the one @jumpingjacksplash begins with below), but there remain quite enough.
Wait, the argument is that when I say "witch/which" I'd never get confused about whether I mean "sorceress" or "that," but someone listening to me might? That's faintly interesting, but it's a feature of toy systems which aren't remotely for the purpose of thought.
For example, imagine two ships communicating with flag signals, but with a system where letters and concepts use the same flags and you have to guess what the other person means. The crew of ship one try to write "halt," but the flags for H=Denmark, A=Collision, L=Rock and T=West. The crew of ship two thinks a Danish ship has hit a rock west of them and sails off in that direction.
The real evolutionary pattern of language must be something like: 1. simple sounds to indicate things; 2. more complex sounds to convey relations that can be thought non-linguistically (e.g. this rock is on top of that rock); 3. the possibility of having thoughts that couldn't be expressed non-linguistically due to thinking in language (e.g. "language is ontologically prior to thought"); 4. the utility of such language-only thoughts in allowing complex social co-ordination (e.g. "Obey the king of Uruk or face the wrath of Marduk"). This then just comes back to Wittgenstein's point that sentences of type 3 are basically the equivalent of "colourless green ideas sleep furiously" so far as external grounding is concerned, although I'd modify that to having powerful impacts on social co-ordination. Hence they stuck around and developed as opposed to being sanded off by evolution.
That's the evidence that's easiest to explain in non-technical terms (not that the comment you initially replied to succeeded at it). But yeah, the main thrust of the argument is along the lines of "we see a bunch of people hammering nails with microscopes, but the microscopes' internal structure suggests they weren't made for hammering".
(I originally came across it in Scott's review of "Origin of Consciousness in the Breakdown of the Bicameral Mind")
The person who wrote that claims that they used language to communicate but not to think. I realize it's one, self-reported Reddit post, so not exactly high-quality data, but I wonder if there are other claims like this, and if they have been investigated by linguists?
There are other claims like this, and they are true at some level and false at another. We often indeed don't think in language in the sense of "build a sentence, up to and including how it sounds/what the articulatory commands are" (the acoustic/articulatory debate is its own bag of worms), but the idea is that to formulate _complex_ thoughts specifically, we use a method to chain thoughts which is so similar to syntax it probably _is_ syntax, even if it never gets sent to the systems of articulation ("is never sent to spell-out", in jargon-y terms). Note how using language was tremendously helpful for the guy, because it helped organize thought.
I think I read that post differently than you. He seems to be claiming that he didn't have complex thoughts before he started thinking in articulated language, e.g.:
"I can only describe my past life as ...."Mindless"..."empty"....."soul-less".... As weird as this sounds, I'm not even sure what I was, If i was even human, because I was barely even conscious. I felt like I was just reacting to the immediate environment and wasn't able to think anything outside of it."
So we seem to have a case where language was used exclusively for communication, and not at all for thought. Doesn't this contradict the claim that language is primarily used for thought, with communication being secondary?
Imagine you have a microscope and use it as a hammer. Doesn't this contradict the claim that the primary use of microscope, as evidenced by its structure, is for something else than hammering in nails?
> I can only describe my past life as ...."Mindless"..."empty"....."soul-less".... As weird as this sounds, I'm not even sure what I was, If i was even human, because I was barely even conscious.
Huh, maybe p-zombies aren't such an outlandish idea after all.
Also, I wonder if this kind of testimony might be useful in studying how animals experience the world. Obviously there are many more differences between humans and other kinds of animal than just thinking with language, but still, this might be the closest we can get to answering the question, "What's it like to be a bat?"
An interesting question, but, to be fair, other explanations are available. (Is she able to imagine raising her hand without actually raising it, I wonder?)
>We often indeed don't think in language in the sense of "build a sentence, up to and including how it sounds/what the articulatory commands are" (the acoustic/articulatory debate is its own bag of worms), but the idea is that to formulate complex thoughts specifically, we use a method to chain thoughts which is so similar to syntax it probably is syntax, even if it never gets sent to the systems of articulation ("is never sent to spell-out", in jargon-y terms).
At the very least, some thinking can also be geometrical rather than syntactic. Visualizing a part dangling from a flexible support and envisioning where its center of gravity will be and how it will be oriented is a complex operation, but it certainly doesn't feel linguistic to me.
This reminds me of the other day when I was rolling out a pie crust. The instructions said to roll it out to the size of a sheet of notepaper. (In fairness to the recipe writer, I had misread, and this step was not the final step; rather, she wanted us to roll this shape prior to doing several folds for flakiness.)
But the point is, I kept trying to make this shape, which was effectively impossible for me because I was making only a small pie in an 8-inch pan and so didn't have enough dough to make a notebook-paper-sized rectangle. Poor reasoning on my part.
Throughout this process my brain vexingly kept flashing in my head a *picture* of the round tin, as if to say, you’re on the wrong track, old girl. Can’t emphasize enough this was a picture only, that I had to keep suppressing: “She said notebook paper! She said rectangle!”
It's _complicated_, but I am not sure that when laypeople (not physicists who learned advanced ways of modeling such things through, well, language) do it, it is _complex_ in the sense of explicitly manipulating multiple parts of the system.
Inner mental experience is far richer than that - this is a massively under-researched and under-discussed topic in our society. The best detailed research into it is by Hurlburt using a technique called Descriptive Experience Sampling.
One result he found is that most people often think in rich unsymbolized concepts - I do this myself, and I often don't *have* words for the concepts. I have to invent language, or spend a long time crafting it to turn the thoughts into words. This to me makes it pretty clear my underlying mind isn't primarily linguistic
(I use inner speech a lot too, and it is very useful for certain kinds of thought. It is but one modality of thought.)
Similarly, lots of people think mainly with images, or mainly with emotions. Others are mainly paying detailed attention to sensations, or have no focussed inner mental experience at all.
There's a phenomenon Hurlburt found which the complex chained thoughts you describe remind me of, where people can have "wordless speech". This has the cadence and rhythm of a sentence, but with concepts in the awareness instead of linguistic words.
Lots going on to investigate here, and I think linguistics would hugely benefit from studying Hurlburt's research.
> This has the cadence and rhythm of a sentence, but with concepts in the awareness instead of linguistic words.
is indeed prima facie evidence for, not against, "to formulate _complex_ thoughts specifically, we use a method to chain thoughts which is so similar to syntax it probably _is_ syntax, even if it never gets sent to the systems of articulation". It is just a specific case where you don't send it to the systems of articulation because the specific terminal nodes correspond to concepts which don't have a phonological word to them. (Compare a situation when you're bilingual and know a word for the concept you need in one language but not in the other.)
Second,
> Similarly, lots of people think mainly with images, or mainly with emotions. Others are mainly paying detailed attention to sensations, or have no focussed inner mental experience at all.
Indeed, there are many non-linguistic ways of thought; the question is, though, whether any of them are _complex_, chaining ways. Like, images seem to just be, well, non-compositional: an image of a house, cognitively, is not images of windows imposed over an image of a wall combined with an image of a roof (as evidenced by the default image of a window probably differing from the parts of the default image of a house that correspond to the house's windows).
What is the Chomskyan response to Evelina Fedorenko's work that suggests that even when the language systems in adult brains are damaged, their other cognitive functions remain intact? For example, people with global aphasia can still engage in complex causal reasoning tasks such as solving arithmetic problems and reasoning about others' intentions.
From my own experience, I mostly think without using language (i.e. talking to myself). It's only when I have to break down complex tasks into logical subunits that I begin to talk to myself about what I'm doing, but that's mostly a placeholder narrative — e.g. "this is the screw that I need to insert here, after I get this thingy lined up with the thingamajig." Also, when I need to write, I'll start thinking in language. So, I'll grant you, language is necessary for structuring thought that leads to communication with others, but I don't think it's integral to reasoning. I'm pretty sure I could get the screw into the thingy and thingamajig without overtly describing the process to myself.
Aphasias are a prime proof of the linguistic module's separateness, so this is a very important question indeed. The problem is, there are at least six different kinds of aphasias (Luria's classification), and none of them looks like "break the syntactic module". Broca's aphasia (in Luria's terminology, efferent motor aphasia) is pretty clearly "broken PF" (syntax-phonology interface), Wernicke's aphasia (don't remember the Luria name for it) is pretty clearly "broken LF", some others are way more specific (there is, for instance, an aphasia that specifically targets multivalent concepts like "under").
Name-like calls are not sentences with grammatical structure. They don’t imply recursion or an infinitude of possible utterances with a finite vocabulary.
They may communicate, but they don't seem to do language well. All the ape-language studies from the 70s and 80s failed miserably (the most famous of these research projects, Koko the gorilla, made a good mascot but terrible science, particularly as the researcher refused to share or release unedited video of Koko communicating). It seems like even smart animals, like chimps, can't grok language the way humans can.
Thank you, this is very informative. I'm thinking about results from whales and elephants. Do you have thoughts on whether this counts as "language" from a Chomskian view?
I don't know Chomsky well enough to comment. My only insight is that laymen (such as myself) often think that language is simpler and easier than it is, so much so that we are impressed by chimps that seem like they can do sign language, or elephants that seem like they have names for each other. There really does seem to be a huge gulf in capability between humans and all other animals when it comes to language.
>There really does seem to be a huge gulf in capability between humans and all other animals when it comes to language.
Agreed. From the point of view of resolving how language actually evolved, it is a pity that all of our fellow Hominini except the chimps and the bonobos are extinct. At least the hyoid bones can fossilize, consonants and vowels, not so much... I'm skeptical that we will ever really know.
Having a name for each elephant that they use to get each other’s attention is more sophisticated than a finite list of alarm calls, but it does not mean that there is productive recombination of utterances to express novel meanings. One of the major features of language is that you can (largely) understand novel expressions that you have never heard before, and the limitations on your understanding aren’t due to the novelty, but due to inherent uncertainties of context and intention.
They can't grok human language the way humans can. I don't understand why this is sufficient to decide that they can't grok language? Surely one would need to study how they communicate with each other to decide that?
We need to study the ways they communicate to decide that. There are many animals whose communication systems have been studied sufficiently to be sure that they don’t have the complex structure of human language. This is true for dogs, cats, primates, many birds, and many other animals. I think there are some animals whose communication system is complex enough that we know we haven’t yet understood it, so that we can’t be sure it’s not like human language. This is true for some whales, elephants, some birds, and some cephalopods. None of these has been shown to be as complex as human language, but they haven’t been shown not to be either.
One interesting note: apparently Ken Hale, the late, well-respected field linguist who did a lot of work on the Australian language Walpiri among many other things, uncontroversially described Walpiri and some other Australian indigenous languages as lacking clausal embedding in the 70s. This Language Log blog post (http://itre.cis.upenn.edu/~myl/languagelog/archives/004592.html) muses on various possible reasons this basically went unnoticed whereas Everett immediately got into an academic flamewar, but the main reason seems to have basically been that recursion hadn't been proposed as a linguistic universal until the introduction of Chomsky's Minimalist Program in 1995; earlier Chomskyan models of universal grammar did not include it.
So the idea that this may not even be unique to Pirahã seems like an important counterpoint to some claims I've seen that Pirahã in fact probably does have syntactic recursion and the real problem is that its sole expert isn't describing the language correctly. I've seen Everett comment somewhere that a problem may be that field linguists are essentially trained to fit grammar into a certain framework and so may be trying to force a square peg into a round theoretical hole (as he himself says he did in his original PhD thesis description of Pirahã), thus failing to properly document the true linguistic diversity that really exists in the world.
Disclaimer that I don't have any expert understanding of the technical issue here, just a longstanding hobbyist fascination with linguistics. I acknowledge that my amateur understanding may be causing me to miss something important here. But overall my reaction to the Chomskyan team response to this (Everett is probably wrong, but even if he's right it doesn't actually matter for the theory) strikes me similarly to how it strikes you: it feels like dragging a major hypothesis into unfalsifiable territory, where it gets hard to understand what definitive claims it's actually making about the nature of human language at all.
The claim that all human languages have sentences containing more than one word isn't vacuously true; it's profoundly true. Just compare this fact with the norm in the communications of other species. No equivalent of "the [color] bird" is known in how any other species communicates. That is Merge.
Communicating in something composed of parts with their own semantic effect (like, "dog" in "A dog came" has a meaning but "g" doesn't; these parts are normally called morphemes but layman explanations often default to "words" because it's simpler) is _not_ trivial, most animal communication systems don't have morphemes. Of those that do, seemingly none except humans and apes trained by humans are in the primate tree.
Interestingly, no! We know the answer from ontogenesis (child development). There are three prolonged stages in humans: one-word, two-word (the involved words are short and probably analyzed as monomorphemic by the child, so ignore the word/morpheme angle), and then the so-called "grammatical explosion" where the child rapidly builds more and more complex structures. So, to answer the question in a pithy way, two doesn't make recursion (you can combine, but you can't use the combined thing as the input to the next combination, which is where recursion kicks in), but three does.
And see the dolphinspeak comment above on the linearly recursive vs. hierarchically recursive difference.
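A minimal sketch of that two-vs-three point, with Merge idealized as nothing more than a binary operation whose output can re-enter as input (my own simplification, not a serious model):

```python
def merge(a, b):
    # Idealized Merge: combine two syntactic objects into a new constituent.
    return (a, b)

two_word = merge("red", "bird")                   # combination alone: not yet recursion
three_word = merge("the", merge("red", "bird"))   # the combined object re-enters as input
print(three_word)                                 # ('the', ('red', 'bird'))
```

Two-word stages only ever show the first pattern; the "grammatical explosion" is when outputs start feeding back in.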
There’s a difference between obligatorily one-word utterances and a possibility of two-word utterances. What I think is actually the interesting difference though is the difference between utterances with a fixed upper bound on length and utterances that can be in principle unbounded (which even Everett says is true of Pirahã).
LLMs' way of picking up language is interesting but certainly different from humans', both because humans need less material and because they show unexpected biases (for instance, non-conservative quantifiers are unlearnable for human children of the critical age but not for LLMs; you can trivially teach an LLM extraction from islands or binding violations; and so on and so forth).
"Mary wrote a letter to her" cannot mean "Mary wrote a letter to herself".
"Mary saw a snake in front of her" _can_ mean "Mary saw a snake in front of herself".
And, finally, "Mary knows that Penny saw herself" cannot mean "Mary knows that Penny saw her" (and vice versa, but the vice versa is uninteresting because it's the same as the first).
There are very peculiar and technical laws surrounding both when a reflexive _can_ appear and when it _must_ appear.
>"Mary wrote a letter to her" cannot mean "Mary wrote a letter to herself".
Is that strictly true? What about this: "Mary reflected on the woman that she hoped she would become. Mary wrote a letter to her."
Is there some technical linguistic sense in which she's not writing a letter to herself? Is metaphorical meaning considered out-of-scope for linguistics? If so, how is that rigorously defined?
There is a purely technical linguistic sense in which she isn't writing a letter to herself, yes. The antecedent, to use a technical term, of the pronoun "her" is "the woman that she hoped she would become", which is a different phrase than "Mary" and, again, in a technical sense, we know it not to be co-indexed with Mary because the first sentence is possible (cf. #"Mary reflected on her"). The first sentence establishes the two women as different objects for the purposes of binding theory.
To take a different example, note how "Joe Biden reflected on the president of the United States" is an infelicitous sentence if it happens now but would be felicitous if it happened four years ago.
(Note that nobody explicitly teaches children the laws, and, judging by CHILDES corpora, they don't get enough data to arrive at the very generalizations they tend to arrive at.)
That's what it seems like to me.
Thanks for your comments here; I think they perfectly complement the impression I had from other conversations with Chomskian linguists. The tone is typically toxic and in bad faith, overclaiming their (mostly theoretical) achievements, and lacking any sort of self-irony or concern for applicability. They also constantly cite their supreme leader more than anyone else, and he is the greatest example of this behavior. I mean, Darwin didn't refer to his predecessors as "mere collectors", and neither did Linnaeus, Mendeleev or other grand theorists; they typically expressed gratitude and admiration toward those who 'collected stuff' for them. I now think the success of this approach was less 'grammar generative' and more 'dissertation generative', because it opens an infinite field of theorizing. You'd probably expect a good theory to somehow predict how we learn languages, how we model them, what languages can and cannot exist, or how they evolve. On all of this Chomsky didn't provide much. I don't devalue their work entirely; that would be a very low bar, given that probably a majority of linguists still follow this paradigm. But now I think it did more harm than good, suppressing less dogmatic research and thinking.
See Pullum on how Everett has been treated https://youtu.be/06zefFkhuKI?feature=shared&t=1962
Thanks for that link. I never quite understood the distinctions (or non-distinctions) between self-embedding and MERGE, but the intro explains it rather well.
Great video. Thanks for sharing.
Why does AI research have so many citations? Particularly when compared with (other branches of?) Computer Science?
Yes, what is going on here? And not just AI research but Deep Learning in particular (Bengio, Hinton). Has this one area really captured a bigger share of researchers and citations than any field ever has?
Because there are so many AI researchers. I wonder how the numbers would look if normalized by the size of the respective research communities.
In some of these cases it’s hard to say what “the research community” is. I would have expected Marx to be higher, since the research community of all Soviet scientists working on any subject was larger than any particular modern research community.
I agree. One could parameterize the normalized citation indices by a notion of community.
Maybe conducting AI research is less expensive because it's primarily digital. In STEM or the social sciences, researchers have to interact with meatspace. In the humanities there are no resource constraints, so maybe artificial limits on the volume of published work are necessary to maintain prestige.
But humanities researchers publish *much* less than science and engineering researchers (especially when measured by counting discrete publications, as citation counts do, so that being cited by 10 articles of 6 pages each shows up more than being cited on every page of a 200 page book), and there are far fewer humanities researchers than engineering researchers.
In part, the publishing conventions in AI (and adjacent fields) lean towards producing a large number of small papers. That is, the number of papers per researcher is very high, and the number of researchers is very high.
Because the number of researchers has exploded in the past 10 years, driven by massive investment from industry, rather than academia. Big AI conferences now get more than 10,000 submissions a year. Other areas of computer science are primarily academic, and are at least two orders of magnitude smaller.
Hinton's been around for a long time, but if you look at citations prior to 2014 or so, when neural nets were still mostly an academic curiosity, the number will look much more reasonable.
I did a little experiment. I was browsing some of the papers that cite Chomsky's most cited book, so I picked up one paragraph from the abstract of one of the top results that Google Scholar produced, fed it to an AI chatbot (Grok) and asked for reference suggestions:
"Based on the provided abstract and considering the context of attachment theory, I can suggest several references that would be relevant to your academic paper titled 'Patterns of Attachment: A Psychological Study of the Strange Situation':"
It gave me 8 suggestions. Call me cynical, but I think AI researchers are actively using AI to write their papers. A lot.
Which raises the question: what percentage of the conclusions are hallucinations (which I prefer to call bullshit)?
Did it suggest any papers actually cited by the allegedly human author? Were they even real papers (hallucinated citations are a real issue with LLMs)?
5 out of 8 were actually cited by the human author of the book (it was a book). Another one was the original book.
because the papers are AI-generated? :-)
Because it is industry-backed, there is constant publishing. Watch "Two Minute Papers" on YouTube and you will see: there are always new techniques being published in this specific area, particularly by companies such as Nvidia or by universities backed by such companies.
Pirahã is pronounced pretty close to [piɾahã]. Portuguese spelling is rather straightforward.
There is a big difference between the small amount of language a baby hears when learning to speak and the large amount of language artificial intelligence needs. It is not clear that artificial intelligence disproves Chomsky's theories.
A counter-argument to this “poverty of stimulus” line of thinking is that a baby doesn’t get merely exposed to some words. It also gets a deluge of visual, auditory, sensory etc. information, all perfectly synced, and _then_ it also gets to actively explore and _then_ it also gets multi-faceted types of feedback from the environment and from fellow humans. ML theory suggests that each of these differences can matter quite a bit.
This, indeed, doesn’t disprove Chomsky- but it means the poverty of stimulus argument is far from decisive, as the stimulus is not all that poor.
I am guessing that the baby has most of the deep layers already hard-coded (by evolution), and only requires some equivalent of a LoRA to get up to speed in his native language. The LLM has to be taught from scratch.
Well that's Chomsky's theory. Language is hardcoded.
No, not language itself (that'd be the LoRA), but rather the facility to learn language (however that might function, which I suspect no one knows).
Also keep in mind that human language itself evolved to be easy to pick up by humans.
LLMs don't have that luxury: they need to learn a language that evolved not for ease of understanding by them, but by some aliens with completely different quirks in their underlying systems.
LLMs don't really have minds, though: they're just language-learning machines. That's what makes them ultimately unsuitable as the sole basis for AGIs.
I fixed my comment to remove the unnecessary mention of 'minds'. My point is independent of that word.
"Easy to pick up by humans" during a specific stage of brain development. Much more onerous to pick up after the first five years of life.
Much less onerous? An adult dropped into a foreign language community can pick up the language in several months, not several years.
Well, there's been a lot of handwaving and jumping up and down by behavioral geneticists about the FOXP2 gene and language acquisition in humans. And FOXP2 seems to play a role in song-bird song acquisition. Although FOXP2 may help with the ability to distinguish patterns in sound, it doesn't explain the emergence of language in the brain.
Doesn't the ability to learn language statistically also need to be literally hardcoded as the first step of making an LLM?
Good question. Can anyone answer this?
The human genome doesn't really have enough room to provide all that much hard coding. The entire thing is a couple of gigabytes, and most of it doesn't have anything to do with how particular synapses in the brain are wired. You've got basic reflexes, the learning system for training up the bulk of the mind mostly from scratch, and I'd guess not much else.
I suspect that human babies
1) Get a lot more training signal than just language. You've got to count all the visual information, audio information, etc. as well. This data is probably lower quality, but it still helps a lot.
2) Just have a more efficient architecture than transformers.
Since everyone's working on better architectures and better training procedures, I'd guess the AIs will catch up and then surpass the data efficiency of human brains not too long from now.
Presumably the human genome still contains some kind of a language learning bootstrapper, since human babies can easily learn at least one language, whereas babies of other ape species cannot.
I believe most human children can manage about three different languages, at least if the parents are motivated to teach them.
This is precisely the question, no? Whether humans have an innate facility for language, or whether we cobble one together out of other cognitive adaptations.
Definitely. No problem fitting that into a couple of gigabytes.
But I expect that to be coding for a particular kind of brain architecture/training signal/more neurons of a particular type. Because those things probably don't require many bytes of description.
Taking the example of LLMs, encoding the LLM architecture, loss function and optimiser takes very little space. Whereas pretrained weights for hard-coding a bunch of deep language rules would take a lot of space, if our LLMs are any judge.
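As a rough illustration of that asymmetry, here's a back-of-the-envelope parameter count for a hypothetical GPT-2-small-sized model (ignoring position embeddings, biases and layer norms, so treat the numbers as approximate): the whole architectural "blueprint" is a handful of numbers, while the learned weights run to hundreds of megabytes.

```python
# Hypothetical GPT-2-small-like config: the "blueprint" is these four numbers
# plus a short training recipe; the learned weights are what take up the space.
vocab, d_model, n_layers, d_ff = 50_257, 768, 12, 3_072

embeddings = vocab * d_model                              # token embedding matrix
per_layer = 4 * d_model * d_model + 2 * d_model * d_ff    # attention + feed-forward weights
total_params = embeddings + n_layers * per_layer

print(f"~{total_params / 1e6:.0f}M parameters, ~{total_params * 4 / 1e9:.1f} GB in float32")
# ~124M parameters, ~0.5 GB in float32
```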
Aren't you assuming a sort of 1 to 1 ratio between genotype and phenotype? Just because computer programming basically has to work that way most of the time (for the sake of our own sanity when going through code, if nothing else), I don't see why nature should have to. I'm not sure how the ratio works out in humans between pleiotropy, polygenic traits, regulatory elements, etc., but if the ratio worked out in favor of low genotype/high phenotype (which would make some sense for natural selection purposes) then a few gigabytes could actually hold quite a bit.
Information isn't infinitely compressible. If there are x different ways a brain can be (x different possible phenotypes), and if genes were the sole determinant of which of the x different brains you'd get, there'd need to be x different possible genome settings. Otherwise not all x phenotypes could be realised.
And the point here is that x can be at most 2^N, where N is the genome's size in bits (ca. 3 gigabytes' worth). And I think that's likely not enough description length to encode all the possible different pieces of deep language heuristics.
Now in reality, the genes don't actually need to encode the entire blueprint for the language faculties. They just need to encode some initial setup that will predictably become a language faculty once the neural network starts training. This requires much less description length. You just need to encode the right optimiser and the right sort of architecture.
This suffices to explain why humans can learn language and other animals mostly can't. It's not that the whole language faculty is hard coded, the setup to be able to produce that language faculty is. Same as how you're not going to get a good language model out of a 2015 era MLP trained with the methods of the time, while a modern transformer or mamba model does great at it.
It can also explain why language faculties usually seem to be encoded in similar brain regions in different humans, and work in roughly the same way. You don't actually need hard coding of the entire language processing apparatus in the genome to explain something like this. I can see how people might have thought that twenty years ago, but a big lesson mechanistic interpretability of modern neural networks has taught us is that there's a lot of convergence in what is learned. If you train a bunch of different networks on the same sort of task, even if you do it with different initialisations and different data and even somewhat different architectures, they will often still converge to implement almost identical solutions.
"(x different possible phenotypes), and if genes were the sole determinator of which of the x different brains you'd get, there'd need to be x different possible genome settings. Otherwise not all x phenotypes could be realised."
Not exactly, there is some variability introduced by the RNA transcription process, from what I can gather. Therefore, each combination of DNA could have more than one phenotype (though the difference is likely to be small). The big difference however is that DNA is quaternary storage not binary like our computers. I'm not going to do the math right now, but scientifically speaking, that's a metric fuck ton more information than it could otherwise hold.
'Not exactly, there is some variability introduced by the RNA transcription process, from what I can gather. Therefore, each combination of DNA could have more than one phenotype'
This variability can't be used to encode the information needed to specify a language faculty. Otherwise, organisms with the same DNA (humans) would sometimes come out having a language faculty, and sometimes come out not having a language faculty. For each bit that the transcription process contributes, the number of humans that possess the hard coded language abilities should be cut in half.
"The big difference however is that DNA is quaternary storage not binary like our computers."
This is already taken into account in the 'couple of gigabytes' number I gave. Humans have ca. 3 billion base pairs, each of which takes 2 bits to encode. That's ca. 0.75 gigabytes total; I misremembered it as a bit more. And that's not even taking any further compression into account, which would cut the actually available storage down even further. And then most of it is used to specify the rest of our bodies, rather than our brain initialization.
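Spelled out, under the usual assumptions (2 bits per base, no compression):

```python
base_pairs = 3_000_000_000      # ca. 3 billion base pairs in the human genome
bits = base_pairs * 2           # A/C/G/T = 4 possibilities = 2 bits per base pair
print(bits / 8 / 1e9, "GB")     # 0.75 GB, an upper bound before any compression
```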
I don't really understand why Chomsky is so opposed to it or why it would disprove him. Statistical learning could just statistically learn grammar as well as or better than an explicit encoding. I suspect he didn't understand machine learning techniques and made some ill-considered comments, and his pride won't let him back off of it.
Just rowing in here with my usual comment that we have no idea if humans are the only animals with language.
At some point, this becomes a matter of definition, but for most intents and purposes, we kinda do. (My go-to book for this is Zorina & Smirnova's "What speaking animals taught us", but it is in Russian.)
It just seems astoundingly unlikely that we are so unique ... but if you have a book with good arguments otherwise in a language I can read (just English, sorry!) then I'll gladly read it.
I mean, there's "Why Only Us?", obviously, but it is a bit too condescending and abstract to my taste. Rumbaugh and Savage-Rumbaugh (two different people!) have a number of works, but probably on the more technical side. And Hockett (1966) is a classic.
Oh, and "Language Instinct" by Pinker, of course. Can't believe I forgot that one.
We already know we're unique. We've built a giant civilization and all sorts of crazy technology. Our uniqueness including our language isn't that surprising.
That said, we also know we're not unique in terms of sheer brain power. I don't know what pilot whales and the like need all that brain power for, but it's at least plausible that they could use language.
Dolphins can at least understand basic syntax, in that they've been taught a language where changing word order can change the meaning.
We’re not. See, e.g., https://www.bbc.com/future/article/20240709-the-sperm-whale-phonetic-alphabet-revealed-by-ai
I’ve long suspected marine mammals are quite intelligent, and it’s a small leap of imagination to think they have a language. Water is an excellent sound conductor; natural underwater waveguides allow sound propagation for literally thousands of miles. Perfect environment to develop a language.
This is very far from a proof that whale communication is language. I think it’s absolutely right to remain open-minded about whether whale communication is or isn’t like human language in whatever sense you mean. There might be a few bird species whose communication is possibly like human language.
But “using AI to identify a ‘phonetic inventory’ of codas” is very far from showing that this is language.
Did you read the whole piece? The researchers, for example, describe how “[…] the sperm whale "phonemes" could be used in a combinatorial fashion, allowing the whales to construct a vast repertoire of distinct vocalisations. The existence of a combinatorial coding system, write the report authors, is a prerequisite for "duality of patterning" – a linguistic phenomenon thought to be unique to human language – in which meaningless elements combine to form meaningful words.”
They are pretty far along the road toward “yes whales have a language”. The help of machine learning pattern recognition is useful, but far from the only thing pointing toward it.
I hadn’t read the whole article. I refuse on principle to read BBC science articles, because they’re about as bad as they come with sensationalizing things. But the actual Nature article is interesting.
Here is the key chart, where they classify all 8000 or so codas in their database on the combinatorial dimensions of rhythm, tempo, and ornamentation: https://www.nature.com/articles/s41467-024-47221-8/figures/3
It’s interesting for sure, and seems like a breakthrough in interpreting this communication. But this is very far from saying that this is a language. It’s not possible to show anything significant about a claimed language if your database includes a total of 8000 phonemes (the equivalent of about 1500 words) across several dozen communicative exchanges.
Yeah I’m with you on science coverage, and not just BBC of course. This one seems reasonable enough.
One thing to note: I remember reading somewhere that the vocabulary of an English peasant in the Shakespearean era comprised only about 600 words, so the whales have at least the same order of magnitude vocabulary. Sure, we don’t know much about their language - yet - but that’s just a matter of time, IMHO.
Looking forward to someone dropping a hydrophone into the ocean and having the first dialog with a whale!
Note - it's not that they have 600 distinct "words" - it's that they have recordings of about "600 words" worth of conversation total. In any human language, about a hundred of those words would likely be repeats.
But also, I'm very skeptical of a claim that there have been adult humans with a vocabulary of about 600 words. I could *maybe* believe 6,000. Here's a list that claims to be the 1,000 most common words in English now: https://www.gonaturalenglish.com/1000-most-common-words-in-the-english-language/
I could imagine a peasant from a few hundred years ago might not know the following words from the first 500 on that list: school, state, student, country, American, company, program, government, million, national, business, issue, provide, political, include, community, president, real, information, office, party, research, education, policy, process, nation, college, experience, former, development, economic, military, relationship, federal.
But a peasant would surely know several dozen names for animals and plants that aren't on that list, as well as tools and actions to do with them, even ignoring whatever religious and family vocabulary they might have.
They can't mean phonemes in the same way that we mean phonemes when we talk about human languages, can they? Because if so, 8000 phonemes would be massive, right?
No, there are 21 distinct coda-types - they have recordings of 8000 coda-tokens. It's like they have recordings of a few dozen people each saying a few words to each other in a language like Japanese, that has about 20 distinct phonemes. Definitely not enough to prove it's a language, but maybe enough to get its phoneme inventory if they're right.
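If it helps, here's the type/token distinction in Python terms (the coda labels below are invented placeholders, not values from the actual dataset):

```python
from collections import Counter

# 6 recorded coda tokens, but only 3 distinct coda types.
recorded_codas = ["1+3", "5R1", "1+3", "4R2", "5R1", "1+3"]
inventory = Counter(recorded_codas)
print(len(recorded_codas), "tokens of", len(inventory), "types")
print(inventory.most_common())   # [('1+3', 3), ('5R1', 2), ('4R2', 1)]
```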
I feel that is incredibly dependent on the definition of language. Chomsky's definition seems incredibly rigid, as is common in foundational research. "A structured system of communication that consists of grammar and vocabulary" is true of various bird songs. The objection then becomes that grammar is more than just structure but a structure of a certain complexity, defined to separate human language from other animals, which seems very post-hoc to me.
>The objection then becomes that grammar is more than just structure but a structure of a certain complexity defined to separate human language from other animals, which seems very post-hoc to me.
You have a good point, but there are fairly obvious measures of the complexity of human language which I'd be surprised to see any other animal's language reach.
For instance, human sentences are sufficiently varied that, barring quotes, repeated sentences (except exceedingly short ones; "Damn!" is common) are rare. Does any other animal match this? This doesn't require any specific properties of the language's syntax.
The simplest and most distinctive features of language:
A very large fraction of utterances (e.g. sentences) are novel combinations of existing elements
The patterns by which these elements are combined are recursive (e.g., any two sentences can be combined with “and”, noun phrases can combine with a preposition to form a modifier within a bigger noun phrase).
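A toy sketch of both properties (the three-sentence lexicon and the single "and" rule are stand-ins I made up, not anyone's actual grammar):

```python
import random

# Finite lexicon + a rule that can re-apply to its own output
# = an unbounded supply of mostly novel sentences.
lexicon = ["the dog barked", "the cat slept", "it rained"]

def sentence(depth: int) -> str:
    # depth 0: pick a base sentence; otherwise conjoin two smaller ones.
    if depth == 0:
        return random.choice(lexicon)
    return sentence(depth - 1) + " and " + sentence(depth - 1)

print(sentence(2))   # e.g. "it rained and the cat slept and the dog barked and it rained"
```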
Does anyone know what the Chomsky followers would say about the recent findings that non-human animals (a large number of them) clearly have methods of communication?
I'm not sure what important recent findings you're referring to; the fact of non-human communication is ancient knowledge. What Chomskyans would say is that human language is evolutionarily discontinuous (it is not an outgrowth of the same phenomenon in other species) because there are categorical differences. Nonhuman communication systems are syntactically basic (certainly non-recursive) and goal-orientated, and they lack reference, conceptual labelling, symbol arbitrariness, and so on. Claims to the contrary based on e.g. training apes to sign have always anthropomorphised the data, and there is not wide acceptance that any species has demonstrated human-like capabilities.
Confirmed by a Chomskyan (me!). Also, note that language is not a tool for communication, that function is secondary and forced (see the quote in the beginning of the article), it is originally a tool for structuring thought.
"language is not a tool for communication, that function is secondary and forced (see the quote in the beginning of the article), it is originally a tool for structuring thought"
Can you summarize, for a complete lay person in this area, what the evidence is to support this claim?
Multiple homonymies, synonymies, and other inconsistencies between the likely syntax (and semantics, which is read off syntax in a not-quite-trivial way) of sentences and their observed morphophonologies.
Sorry...more lay-persony, please?
Yeah, got too jargony there for a second (although that itself is a good illustration!). Like, take the "I saw a girl with a telescope" example above. It has a structural ambiguity (do you first join "girl" with "with a telescope", or "a girl" with "saw"?) _and_ a lexical ambiguity between instrumental and comitative (in Russian, for instance, these meanings are expressed differently). You can also show each of the homonymies separately. We think hierarchically and only then try to push that hierarchy into a linear sequence (a sentence); and, to add insult to injury, our lexical system is also rife with various meaning switches (metonymies, metaphors, and so on): "I saw a mouse" is ambiguous between a computer gadget and a small animal, and the ambiguity persists despite us usually not noticing a similarity between the two (although it is, of course, originally a metonymy on visual similarity).
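To spell out the structural ambiguity, here are the two hierarchies as nested tuples (an informal bracketing for illustration, not any particular formalism):

```python
# "I saw a girl with a telescope": same string, two structures.
instrument_reading = ("I", (("saw", ("a", "girl")), ("with", ("a", "telescope"))))
# "with a telescope" attaches to the seeing: I used the telescope.

modifier_reading = ("I", ("saw", (("a", "girl"), ("with", ("a", "telescope")))))
# "with a telescope" attaches to the girl: she has the telescope.

# The linear sentence throws the hierarchy away, which is the point about
# pushing hierarchical thought into a linear sequence.
print(instrument_reading == modifier_reading)   # False
```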
There's an often-found claim that most homonymies are not a problem in real speech (the one @jumpingjacksplash begins with below), but there remain quite enough.
Wait, the argument is that I'd never get confused saying "witch/which" whether I mean "sorceress" or "that," but someone listening to me might? That's faintly interesting, but it's a feature of toy systems which aren't remotely for the purpose of thought.
For example, imagine two ships communicating with flag signals, but with a system where letters and concepts use the same flags and you have to guess what the other person means. The crew of ship one try to write "halt," but the flags for H=Denmark, A=Collision, L=Rock and T=West. The crew of ship two thinks a Danish ship has hit a rock west of them and sails off in that direction.
The real evolutionary pattern of language must be something like: 1. simple sounds to indicate things; 2. more complex sounds to convey relations that can be thought non-linguistically (e.g. this rock is on top of that rock); 3. the possibility of having thoughts that couldn't be expressed non-linguistically, due to thinking in language (e.g. "language is ontologically prior to thought"); 4. the utility of these language-only thoughts, through allowing complex social co-ordination (e.g. "Obey the king of Uruk or face the wrath of Marduk"). This then just comes back to Wittgenstein's point that sentences of type 3 are basically the equivalent of "colourless green ideas sleep furiously" so far as external grounding is concerned, although I'd modify that to having powerful impacts on social co-ordination. Hence they stuck around and developed as opposed to being sanded off by evolution.
That's the best evidence?
That's the evidence that's easiest to explain in non-technical terms (not that the comment you initially replied to succeeded at it). But yeah, the main thrust of the argument is along the lines of "we see a bunch of people hammering nails with microscopes, but the microscopes' internal structure suggests they weren't made for hammering".
How would you respond to this?
https://www.reddit.com/r/self/comments/3yrw2i/i_never_thought_with_language_until_now_this_is/
(I originally came across it in Scott's review of "Origin of Consciousness in the Breakdown of the Bicameral Mind")
The person who wrote that claims that they used language to communicate but not to think. I realize it's one, self-reported Reddit post, so not exactly high-quality data, but I wonder if there are other claims like this, and if they have been investigated by linguists?
There are other claims like this, and they are true at some level and false at another. We often indeed don't think in language in the sense of "build a sentence, up to and including how it sounds/what the articulatory commands are" (the acoustic/articulatory debate is its own bag of worms), but the idea is that to formulate _complex_ thoughts specifically, we use a method to chain thoughts which is so similar to syntax it probably _is_ syntax, even if it never gets sent to the systems of articulation ("is never sent to spell-out", in jargon-y terms). Note how using language was tremendously helpful for the guy, because it helped organize thought.
I think I read that post differently than you. He seems to be claiming that he didn't have complex thoughts before he started thinking in language in the articulatory sense, e.g.
"I can only describe my past life as ...."Mindless"..."empty"....."soul-less".... As weird as this sounds, I'm not even sure what I was, If i was even human, because I was barely even conscious. I felt like I was just reacting to the immediate environment and wasn't able to think anything outside of it."
So we seem to have a case where language was used exclusively for communication, and not at all for thought. Doesn't this contradict the claim that language is primarily used for thought, with communication being secondary?
Imagine you have a microscope and use it as a hammer. Doesn't this contradict the claim that the primary use of microscope, as evidenced by its structure, is for something else than hammering in nails?
> I can only describe my past life as ...."Mindless"..."empty"....."soul-less".... As weird as this sounds, I'm not even sure what I was, If i was even human, because I was barely even conscious.
Huh, maybe p-zombies aren't such an outlandish idea after all.
Also, I wonder if this kind of testimony might be useful in studying how animals experience the world. Obviously there are many more differences between humans and other kinds of animal than just thinking with language, but still, this might be the closest we can get to answering the question, "What's it like to be a bat?"
Also this woman: https://www.youtube.com/watch?v=u69YSh-cFXY
Not sure you can readily describe her as chaining thoughts together using something like syntax.
An interesting question, but, to be fair, other explanations are available. (Is she able to imagine raising her hand without raising her hand, I wonder?)
>We often indeed don't think in language in the sense of "build a sentence, up to and including how it sounds/what the articulatory commands are" (the acoustic/articulatory debate is its own bag of worms), but the idea is that to formulate complex thoughts specifically, we use a method to chain thoughts which is so similar to syntax it probably is syntax, even if it never gets sent to the systems of articulation ("is never sent to spell-out", in jargon-y terms).
At the very least, some thinking can also be geometrical rather than syntactic. Visualizing a part dangling from a flexible support and envisioning where its center of gravity will be and how it will be oriented is a complex operation, but it certainly doesn't feel linguistic to me.
This reminds me of the other day I was rolling out a pie crust. The instructions said to roll out the size of a sheet of notepaper. (In fairness to the recipe writer, I had misread and this step was not the final step; rather she wanted us to roll this shape prior to doing several folds for flakiness.)
But the point is, I kept trying to make this shape, which was hard, in fact impossible, for me because I was making only a small pie in an 8 inch pan and so didn't have the full amount of dough to make a notebook-paper sized rectangle. Poor reasoning on my part.
Throughout this process my brain vexingly kept flashing in my head a *picture* of the round tin, as if to say, you’re on the wrong track, old girl. Can’t emphasize enough this was a picture only, that I had to keep suppressing: “She said notebook paper! She said rectangle!”
It's _complicated_, but I am not sure that when laypeople (not physicists who learned advanced ways of modeling such things through, well, language) do it, it is _complex_ in the sense of explicitly manipulating multiple parts of the system.
Inner mental experience is far richer than that - this is a massively under-researched and under-discussed topic in our society. The best detailed research into it is by Hurlburt using a technique called Descriptive Experience Sampling.
I recommend this paper: https://hurlburt.faculty.unlv.edu/heavey-hurlburt-2008.pdf
One result he found is that most people often think in rich unsymbolized concepts - I do this myself, and I often don't *have* words for the concepts. I have to invent language, or spend a long time crafting it to turn the thoughts into words. This to me makes it pretty clear my underlying mind isn't primarily linguistic
(I use inner speech a lot too, and it is very useful for certain kinds of thought. It is but one modality of thought.)
Similarly, lots of people think mainly with images, or mainly with emotions. Others are mainly paying detailed attention to sensations, or have no focussed inner mental experience at all.
There's a phenomenon Hurlburt found which the complex chained thoughts you describe remind me of, where people can have "wordless speech". This has the cadence and rhythm of a sentence, but with concepts in the awareness instead of linguistic words.
Lots going on to investigate here, and I think linguistics would hugely benefit from studying Hurlburt's research.
So, there are several things to unpack here.
First,
> This has the cadence and rhythm of a sentence, but with concepts in the awareness instead of linguistic words.
is indeed prima facie evidence for, not against, "to formulate _complex_ thoughts specifically, we use a method to chain thoughts which is so similar to syntax it probably _is_ syntax, even if it never gets sent to the systems of articulation". It is just a specific case where you don't send it to the systems of articulation because the specific terminal nodes correspond to concepts which don't have a phonological word to them. (Compare a situation when you're bilingual and know a word for the concept you need in one language but not in the other.)
Second,
> Similarly, lots of people think mainly with images, or mainly with emotions. Others are mainly paying detailed attention to sensations, or have no focussed inner mental experience at all.
Indeed, there are many non-linguistic ways of thought; the question is, though, whether any of them are _complex_, chaining ways. Like, images seem to just be, well, non-compositional: an image of a house, cognitively, is not images of windows superimposed on an image of a wall combined with an image of a roof (as evidenced by the default image of a window probably differing from the parts of the default image of a house that correspond to the house's windows).
Comparison to aphantasia in the comments, while overreaching in saying it is literally the same (it isn't), is interesting.
Do Chomskyans equate "language" and "abstract thought" when they say things like this? Seems like some of this conflict could be definitional.
Some of the conflict is definitely definitional, but not all of it.
What is the Chomskyan response to Evelina Fedorenko's work that suggests that even when the language systems in adult brains are damaged, their other cognitive functions remain intact? For example, people with global aphasia can still engage in complex causal reasoning tasks such as solving arithmetic problems and reasoning about others' intentions.
From my own experience, I mostly think without using language (i.e. talking to myself). It's only when I have to break down complex tasks into logical subunits that I begin to talk to myself about what I'm doing, but that's mostly a placeholder narrative (e.g. "this is the screw that I need to insert here, after I get this thingy lined up with the thingamajig"). Also, when I need to write, I'll start thinking in language. So, I'll grant you, language is necessary for structuring thought that leads to communication with others, but I don't think it's integral to reasoning. I'm pretty sure I could get the screw into the thingy and thingamajig without overtly describing the process to myself.
Aphasias are a prime proof of the linguistic module's separateness, so this is a very important question indeed. The problem is, there are at least six different kinds of aphasia (Luria's classification), and none of them looks like "break the syntactic module". Broca's aphasia (in Luria's terminology, efferent motor aphasia) is pretty clearly "broken PF" (the syntax-phonology interface), Wernicke's aphasia (I don't remember the Luria name for it) is pretty clearly "broken LF", and some others are way more specific (there is, for instance, an aphasia that specifically targets multivalent concepts like "under").
Maybe he's referring to this?
https://www.nature.com/articles/s41559-024-02420-w.epdf
Yes, and the whales. Thank you.
Name-like calls are not sentences with grammatical structure. They don’t imply recursion or an infinitude of possible utterances with a finite vocabulary.
They may communicate, but they don't seem to do language well. All the ape-language studies from the 70s and 80s failed miserably (the most famous of these research projects, Koko the gorilla, made a good mascot but terrible science, particularly as the researcher refused to share or release unedited video of Koko communicating). It seems like even smart animals, like chimps, can't grok language the way humans can.
https://www.psychologytoday.com/us/blog/the-origin-words/201910/why-chimpanzees-cant-learn-language-1
Thank you, this is very informative. I'm thinking about results from whales and elephants. Do you have thoughts on whether this counts as "language" from a Chomskian view?
Here's a link from Ryan L about the elephants: https://www.nature.com/articles/s41559-024-02420-w.epdf
There's a lab at MIT(?) working on the whale translation.
I don't know Chomsky well enough to comment. My only insight is that laymen (such as myself) often think that language is simpler and easier than it is, so much so that we are impressed by chimps that seem like they can do sign language, or elephants that seem like they have names for each other. There really does seem to be a huge gulf in capability between humans and all other animals when it comes to language.
>There really does seem to be a huge gulf in capability between humans and all other animals when it comes to language.
Agreed. From the point of view of resolving how language actually evolved, it is a pity that all of our fellow Hominini except the chimps and the bonobos are extinct. At least the hyoid bones can fossilize, consonants and vowels, not so much... I'm skeptical that we will ever really know.
Yeah, that is broadly correct.
It doesn't. Symbolic thinking may be one of the many prerequisites, but it doesn't have the, you know, grammar prerequisite.
Having a name for each elephant that they use to get each other’s attention is more sophisticated than a finite list of alarm calls, but it does not mean that there is productive recombination of utterances to express novel meanings. One of the major features of language is that you can (largely) understand novel expressions that you have never heard before, and the limitations on your understanding aren’t due to the novelty, but due to inherent uncertainties of context and intention.
They can't grok human language the way humans can. I don't understand why this is sufficient to decide that they can't grok language? Surely one would need to study how they communicate with each other to decide that?
We need to study the ways they communicate to decide that. There are many animals whose communication systems have been studied sufficiently to be sure that they don’t have the complex structure of human language. This is true for dogs, cats, primates, many birds, and many other animals. I think there are some animals whose communication system is complex enough that we know we haven’t yet understood it, so that we can’t be sure it’s not like human language. This is true for some whales, elephants, some birds, and some cephalopods. None of these has been shown to be as complex as human language, but they haven’t been shown not to be either.
One interesting note: apparently Ken Hale, the late, well-respected field linguist who did a lot of work on the Australian language Walpiri among many other things, uncontroversially described Walpiri and some other Australian indigenous languages as lacking clausal embedding in the 70s. This Language Log blog post (http://itre.cis.upenn.edu/~myl/languagelog/archives/004592.html) muses on various possible reasons why this basically went unnoticed whereas Everett immediately got into an academic flamewar, but the main reason seems to have basically been that recursion hadn't been proposed as a linguistic universal until the introduction of Chomsky's Minimalist Program in 1995; earlier Chomskyan models of universal grammar did not include it.
So the idea that this may not even be unique to Pirahã seems like an important counterpoint to some claims I've seen that Pirahã in fact probably does have syntactic recursion and that the real problem is that its sole expert isn't describing the language correctly. I've seen Everett comment somewhere that a problem may be that field linguists are essentially trained to fit grammar into a certain framework and so may be trying to force a square peg into a round theoretical hole (as he himself says he did in his original PhD thesis description of Pirahã), thus failing to properly document the true linguistic diversity that really exists in the world.
Disclaimer that I don't have any expert understanding of the technical issue here, just a longstanding hobbyist fascination with linguistics, and I acknowledge that my amateur understanding may be causing me to miss something important. But overall, the Chomskyan team's response to this (Everett is probably wrong, but even if he's right it doesn't actually matter for the theory) strikes me the same way it strikes you: it feels like dragging a major hypothesis into unfalsifiable territory, where it gets hard to understand what definitive claims it's actually making about the nature of human language at all.
Uh-huh, the so-called non-configurational language hypothesis. See https://www.ling.upenn.edu/~jlegate/main.pdf.
The claim that all human languages have sentences containing more than one word isn’t vacuously true, it’s profoundly true. Just compare this fact with the norm in the communications of other species. No equivalent of “the [color] bird” is known in how any other species communicates. That, is Merge.
The claim _is_ profoundly true, but note the dolphin comment above.
What's profound about it?
Communicating in something composed of parts with their own semantic effect (like, "dog" in "A dog came" has a meaning but "g" doesn't; these parts are normally called morphemes but layman explanations often default to "words" because it's simpler) is _not_ trivial, most animal communication systems don't have morphemes. Of those that do, seemingly none except humans and apes trained by humans are in the primate tree.
Does, as the author of the article suggest, this mean that any language that has more than one word thereby has recursion?
Interestingly, no! We know the answer from ontogenesis (child development). There are three prolonged stages in humans: one-word, two-word (the involved words are short and probably analyzed as monomorphemic by the child, so ignore the word/morpheme angle), and then the so-called "grammatical explosion", where the child rapidly builds more and more complex structures. So, to answer the question in a pithy way, two doesn't make recursion (you can combine, but you can't use the combined thing as the input to the next combination, which is where recursion kicks in), but three does.
And see the dolphinspeak comment above on the linearly recursive vs. hierarchically recursive difference.
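A very rough sketch of the "two doesn't make recursion, three does" point (the mini-lexicon and the tuple encoding are just illustrative assumptions on my part):

```python
lexicon = {"mommy", "want", "juice"}

def combine_two_word_stage(a: str, b: str) -> tuple:
    # At the two-word stage both inputs must be lexical items, so the
    # output can never be fed back in: structures top out at two words.
    assert a in lexicon and b in lexicon
    return (a, b)

def combine_recursive(a, b) -> tuple:
    # Once a combined object is itself a legal input, depth is unbounded.
    return (a, b)

step1 = combine_recursive("want", "juice")
step2 = combine_recursive("mommy", step1)
print(step2)   # ('mommy', ('want', 'juice'))
```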
Okay, thanks for clarifying.
Prairie dogs communicate both the color and species of intruders.
You’re confusing a syntactic claim with a semantic claim. Which is funny in context.
There’s a difference between obligatorily one-word utterances and a possibility of two-word utterances. What I think is actually the interesting difference though is the difference between utterances with a fixed upper bound on length and utterances that can be in principle unbounded (which even Everett says is true of Pirahã).
Say what??
This has to be one of the most hilarious pots we've seen.
Watch out for Jesus! His Chomsky rating is off the charts.
Pots? No, I meant posts. Oops.
You can edit posts, you know.
I've done editing in the past. I just can't remember how to do it.
Hit the "..." at the bottom right of your comment.
Thanks. It's always something simple, isn't it.
LLMs pick up language in super interesting ways, for example (for any of my Hinglish speakers out there), try asking ChatGPT:
"Kya aap mere se aise baat kar sakte ho?"
The answer I got:
"Bilkul, main aapse Hindi mein baat kar sakta hoon. Aapko kis cheez ki madad chahiye?"
LLMs' way of picking up language is interesting but certainly different from humans', both because humans need less material and because the learning biases differ (for instance, non-conservative quantifiers are unlearnable for human children at the critical age but not for LLMs; you can trivially teach an LLM extraction from islands or binding violations; and so on and so forth).
Could you explain binding violations?
"Mary saw her" cannot mean "Mary saw herself".
"Mary wrote a letter to her" cannot mean "Mary wrote a letter to herself".
"Mary saw a snake in front of her" _can_ mean "Mary saw a snake in front of herself".
And, finally, "Mary knows that Penny saw herself" cannot mean "Mary knows that Penny saw her" (and vice versa, but the vice versa is uninteresting because it's the same as the first).
There are very peculiar and technical laws surrounding both when a reflexive _can_ appear and when it _must_ appear.
Oh, super interesting. Thanks!
>"Mary wrote a letter to her" cannot mean "Mary wrote a letter to herself".
Is that strictly true? What about this: "Mary reflected on the woman that she hoped she would become. Mary wrote a letter to her."
Is there some technical linguistic sense in which she's not writing a letter to herself? Is metaphorical meaning considered out-of-scope for linguistics? If so, how is that rigorously defined?
There is a purely technical linguistic sense in which she isn't writing a letter to herself, yes. The antecedent, to use a technical term, of the pronoun "her" is "the woman that she hoped she would become", which is a different phrase from "Mary", and, again, in a technical sense, we know it not to be co-indexed with "Mary" because the first sentence is possible (cf. #"Mary reflected on her"). The first sentence establishes the two women as different objects for the purposes of binding theory.
This may not be all that rigorous and obligatory. Consider Earendil's poem:
In panoply of ancient kings
In chained rings *he armoured him*.
Fairly clearly, "he" and "him" both refer to Earendil. And the author knew something about the rules of the English language :)
To take a different example, note how "Joe Biden reflected on the president of the United States" is an infelicitous sentence if it happens now but would be felicitous if it happened four years ago.
(Note that nobody explicitly teaches children the laws, and, judging by CHILDES corpora, they don't get enough data to arrive at the very generalizations they tend to arrive at.)
Oh wow, ChatGPT does not realize that this is written using the Latin script. After directly asking it, it says it's written in Devanagari.
Is that what you meant with this example? I do not speak Hindi