There seems to be a growing field of what you could call “auxiliary AI” for safety/interpretability/control. Auxiliary in that we don’t know how the main ML model functions, so we tack on an extra ML model to help interpret the main one. The monosemanticity example was a good case, but there are tons of these now, in a bunch of different fields. Seems a bit silly to me, because you’re ostensibly just moving the goalposts: if we don’t know how the auxiliary AI is doing what it’s doing, then how do we know it’s actually performing the interpretability/control function we want it to? You can make all sorts of arguments about the structure of a network and how it enforces certain desirable behaviors/patterns/structures in the working of the model, but at the end of the day, if it’s still just a giant inscrutable tensor of weights, are we really getting anywhere? Seems to me we should either earnestly abandon our efforts at creating thinking digital structures we can understand, or abandon the neural network paradigm, but it doesn’t seem like you can do both. Maybe I’m just not as deep in the lit as I should be, tho.
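A toy sketch of the kind of auxiliary model I mean, a sparse-autoencoder-style probe like the monosemanticity work uses (sizes and random untrained weights here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "main model": 100 samples of a 16-dim hidden-layer activation.
acts = rng.normal(size=(100, 16))

# The auxiliary model: an (untrained, random) sparse autoencoder that
# maps activations into a wider, sparsely active feature basis and back.
W_enc = rng.normal(size=(16, 64)) * 0.1
W_dec = rng.normal(size=(64, 16)) * 0.1

features = np.maximum(acts @ W_enc, 0.0)  # ReLU keeps features sparse-ish
recon = features @ W_dec                  # reconstruction of activations

# The catch: we judge the auxiliary model only by proxies like this,
# while W_enc/W_dec remain just another inscrutable tensor of weights.
mse = float(np.mean((recon - acts) ** 2))
print(mse >= 0.0)
```

The reconstruction error is exactly the kind of proxy metric in question: it tells you the probe fits, not that its features mean what you hope.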

Expand full comment

Hi! For those interested in Scott's piece on AI monosemanticity, I wrote a piece about the implications of AI interpretability for biology and drug discovery.


Thanks to Scott for inspiration.

Expand full comment

I am a big fan of the "Thrive-Survive Theory of Politics" by Scott Alexander (https://slatestarcodex.com/2013/03/04/a-thrivesurvive-theory-of-the-political-spectrum/). One thing I always found interesting/incongruous with this theory is that it claims that (material) security leads people to feel secure, and thus become more likely to identify as leftist and adopt their outlooks on life (i.e. post-materialism or "globalism"), while those who are more concerned with "surviving" will be more likely to be right-wing and adopt their outlook (i.e. being more traditional and more harsh towards human failure). I feel that this theory certainly makes sense on a global level - people in poorer countries are more likely to be in a "survive" mindset and thus less likely to be progressive politically than those in wealthier countries. Even inside the US it makes sense, with right-wing "Red States" tending to be poorer than left-wing "Blue States". But there is a major flaw in this theory IMO. Namely that the USA, which is the richest major economy in the world (https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(PPP)_per_capita), is definitely (on average) more towards the "survive" end of the spectrum than most other rich Western countries, including all other Anglo countries (as can be seen in the WVS map: https://www.worldvaluessurvey.org/WVSContents.jsp?CMSID=Findings). But since the US is richer than almost all other Western countries, wouldn't one expect the US to be more towards the "self-expression" side than the other Anglosphere countries and/or Western European countries?

I guess one argument is that Western European countries have a better social safety net than the US, which allows people to focus more on "thrive" attitudes politically...but even then, why didn't the US develop a social safety net like those Western European countries in the first place? Is it just because of racial politics? Also, might this change if the US becomes more progressive politically overall? Especially if Western European countries become more focused on "survival" politically because of the energy/geopolitical situation in Europe? Anyway, I would like to know others' views on this here...

Expand full comment

OC ACXLW OpenAI Lessons and Enlightened Sex Scandals

Hello Folks!

We are excited to announce the 50th Orange County ACX/LW meetup, happening this Saturday and most Saturdays thereafter.

Host: Michael Michalchik

Email: michaelmichalchik@gmail.com (For questions or requests)

Location: 1970 Port Laurent Place

(949) 375-2045

Date: Saturday, Dec 2, 2023

Time: 2 PM


Conversation Starters:


The Unsettling Lesson of the OpenAI Mess




Is Enlightenment Compatible with Sex Scandals? by Scott Alexander




Walk & Talk: We usually have an hour-long walk and talk after the meeting starts. Two mini-malls with hot takeout food are readily accessible nearby. Search for Gelson's or Pavilions in the zip code 92660.

Share a Surprise: Tell the group about something unexpected that changed your perspective on the universe.

Future Direction Ideas: Contribute ideas for the group's future direction, including topics, meeting types, activities, etc.

Expand full comment

How good was the average life of the Middle Ages peasant compared to the average life of the middle-class American today? I'd guess half as good, but with less variance. Probably about as many miserable people as a percentage.

Expand full comment

What are the best things and worst things free market capitalism has given us?

I'd say the best things are bringing billions out of poverty. The worst things are Taylor Swift and superhero movies.

Expand full comment

I came across this while reading Unsong:

> Still other texts say the Messiah will come in a generation that is both the most righteous and the most wicked. I don’t even know what to think of that one.

I do! Because I came across this while reading the perennialists:

> The traditional world was essentially good but contained much evil, while the modern world is essentially evil but contains much good.

You can definitely get away with claiming the modern world is essentially evil if you judge by the standards of the traditional world, but then, much good is done! So we truly are the most righteous and most wicked generation already.

Expand full comment

I thought of looking at what the Arabic Wikipedia says about the events of October 7 and I'm deeply impressed. The page is https://ar.wikipedia.org/wiki/عملية_طوفان_الأقصى and then use auto-translation if you don't know Arabic (I don't).

There's no mention at all of any civilian deaths in Israel. Really, none whatsoever. Apparently what happened was, Hamas fighters broke through the fence, entered various Israeli cities, army bases and "settlements" near Gaza and clashed with the Israeli army there. They captured lots of Israeli soldiers and officers. Nothing is said about any civilian hostages. The music festival is not mentioned.

Expand full comment

Hi, I am a rabbi and have written something on the matter of "God".

It is written for people who care strongly about "God" in one way or another.

If you fall into that category I hope that you enjoy this piece.

Be blessed.


Expand full comment

A theory about why AI is unlikely to be an existential threat to humanity, though it could still be a very serious threat-- I imagine things getting bad-- billions dead as an AI sucks up more and more resources. However, it's going to be tricky for the AI to figure out what resources (possibly including people) it needs to allot to keep itself going, and possibly it will need to prevent and repair sabotage.

This is definitely a good enough premise for science fiction, and possibly reasonable for the real world.

Expand full comment

I have noticed something curious about my career in software development: every time I switch jobs, it is of course for higher pay, but the company I jump ship to is always operationally sloppier than the previous one. Now I am at my highest compensation ever, and this org is hosting a bunch of their code repositories out of a laptop sitting on a shelf at the office, which, not gonna lie, traumatized me a little when I finally saw the laptop myself. It's definitely some kind of zenith of sloppiness. Hell, I had heard they were hosted out of the office, but I thought it was out of a Mac mini or something (though is that better than a laptop? It seems better for some reason).

There are a lot of devs here, have your careers followed a similar trajectory? Also, do you have any amusing stories about operational sloppiness like that laptop server?

Expand full comment

Is there a list of current bans?

Expand full comment
Nov 29, 2023·edited Nov 29, 2023

I'm considering my current economic / social situation, and I think I am in a position to start pivoting to an altruistic career.

My only issue is - what career? I have looked at 80000 hours, but the sort of thing I've been doing up to now is not very conducive to their list of top global problems. I actually really like what I've been doing up to now, and I would like to do something similar. I don't particularly want to earn to give - while I like what I do, I hate working in a for-profit environment.

About me: I have a Bachelor's of Engineering. Skills I have picked up through work have been corrosion prevention, asset maintenance, and pressure equipment management. I know how to work in a big heavy industrial environment (and would in fact prefer it).

Thoughts: I think I would be suited to a role in infrastructure development and maintenance, particularly in chemical plants, nuclear power plants, waste or water treatment plants, or fuel farms. I have done some research in the past and didn't like it, and I think I am a poor fit for research specifically (personality / ways of working issue). I think my ideal role would be an engineer, advisor or project manager for some kind of big infrastructure project, so if anyone knows of someone building a desal plant or sanitary sewers as a charitable thing, I am very interested.

Other than that, I would be interested to hear about any projects, opportunities, or general direction for somewhere I can apply my skills.

I am looking to get out of my current industry in the next 2 - 5 years, so "sit tight and accumulate career capital" isn't really an option. My current industry is fossil fuels.

Edit: if it matters, I am ethnically Chinese, I'm baseline conversationally fluent in Mandarin (which I can improve with some effort) and a bit less fluent in Cantonese. I don't have close family ties in Mainland China that I'm aware of. I'm also willing to work there.

Expand full comment

These comment sections are sickening.

They're full of people who hysterically freak out if you say anything that remotely resembles white nationalism, and who then turn around and espouse violent, genocidal jewish ethnonationalist rhetoric. This is literally like some far-right caricature of jews, but it proves itself to be true over and over again. And then you have jews like Bryan Caplan promoting demented open borders policies for every western country, but then conveniently finding a justification for saying that the ethnostate that aligns with his ethnicity is the sole exception to open borders being good. You literally cannot make this shit up.

Expand full comment

Some Gazans blaming Hamas for the Israeli attack. No guarantee they will continue to do so.


Expand full comment

This is Chesterton's fence: https://media.thegospelcoalition.org/static-blogs/trevin-wax/files/2016/09/Trevin-Wax-at-Chestertons-home.jpg

That is, the fence around the house where G. K. Chesterton lived. It's more of a hedge but I guess it's all right.

Ironically, some articles claim the house was under threat of demolition. Luckily it's still standing https://maps.app.goo.gl/dHM7QRXRhsGHifnV7.

Expand full comment

I am writing a long short story set in the near future (circa GPT-6.5) that features an AI tutor. As a non-technical person fascinated by recent advances in AI, I came up with the following phrase to describe its mind: “the ever-expanding matrix of data points, meaningless in isolation but each quivering with its own unique probabilistic relation to every single other, that served it for a mind.” Question for those who know a lot more than me: does this seem to you to evoke something (potentially?) real? How could it be improved without becoming more technical/verbose?

Expand full comment

Okay, this is funny:


Yeah, there are lots of wanna-be gangstas in European cities and yes, we do have inner-city and 'urban' youth who come from areas of high crime and deprivation, even in Ireland, but it's never not funny to me that they faithfully copy American hip-hop/rap culture trying to be what they're not (i.e. East or West Coast).

It's even funnier when, as in the linked images, it's guys from nice places.

Expand full comment

Any thoughts about getting good at basic navigation in a city? I had an acupuncture appointment in an unfamiliar part of the city, and I spent a lot of time being lost. (It seemed like a lot of time, actually about an hour and a half.) Some of this may be aging, but I've always been bad at that sort of thing.

Some of it may have been bad directions, some of it may have been failing to follow good directions.

There was possibly some wrong-headedness involved. "Go across the bridge", they say. I can't believe it's this big elevated four-lane highway with no obvious way for a pedestrian to get onto it. Surely there must be a little footbridge across the river in the park. No, it was the big bridge-- there was a not-easy-to-see pedestrian access from where I was.

I suspect I need better skills using the maps on my phone, and possibly I should carry a paper mapbook.

I'm not sure I reliably stay conscious of orientation when I turn a corner.

I'm especially interested in accounts of getting better at orientation starting from being bad at it, but if you've been good at orientation all your life and you feel a need to tell me, go ahead, I can't stop you.

If you've actually taught people to be good at orientation (not just have ideas about how you might be able to teach it), I'm interested.

Some might be interested in where I was. Being interested in that sort of thing probably has something to do with being good at orientation. I'm in Philadelphia. I was starting from 10th and Mountain St. (South Philadelphia). I was going to West Philadelphia Community Acupuncture at 4636 Woodland Avenue.

I caught a bus to 33rd and Dickinson. So far so good. It was cold, and it was taking a while for the next bus to show up, so I thought I could walk it. It was only a mile or so. I found myself at a park involving the Schuylkill River. There was a giant FedEx building with inconvenient fences to get past before I could get to the big bridge. Somehow, I was at 49th and Paschall St. I'll note that this is an area with few shops and few people on the street to ask for directions.

It turns out there's a trolley that goes from Center City to near the acupuncture place, which should solve the specific problem in the future, but I'm still interested in what might solve the general problem.

Expand full comment

Did taking Gaza and the West Bank after the 6 Day War make sense for Israel? I'm sure it made sense at the time to have borders that looked more defensible, but does having borders which include regions of hostile people-- people who have been persistently provoked-- give a military advantage?

Expand full comment
Nov 28, 2023·edited Nov 28, 2023

Tennyson is one of my favorite poets. Nobody (including myself!) reads his longer "epic" poems these days, and probably hasn't for many decades past. But as a rule of thumb, if a Tennyson poem fits on one page then it is probably pretty good. Examples include "Break, Break, Break", a sad poem about the death of his friend Arthur Hallam, and possibly his best known poem, "The Charge of the Light Brigade", recited quite well here, with extracts from the 1968 film:


Here's a short quote from his poem Ulysses, which should be encouraging for old timers who worry that they're washed up and may never achieve anything more!

"Old age hath yet his honour and his toil;

Death closes all: but something ere the end,

Some work of noble note, may yet be done"

Alfred Tennyson

Expand full comment

I think P(Doom) from AI is very low, < .01 over the next thousand years. But even if I thought it were around .05, or twice that, I would be in favor of going full speed ahead with AI.

There are those who consider human flourishing to be this amazingly great thing. If so, it is because humans in the past have been brave, bold, intrepid. It is because they plunged themselves into mysterious domains without fear.

The argument for being extra cautious about AI is an argument not that humans are precious but that they aren’t any longer worth a shit. That they have become a race of cowards.

Expand full comment

Does anyone know why Kalamazoo and Numazu are sister cities? Did they just pick each other due to the name similarity, or was that coincidence?

Expand full comment

Has anyone done the research about kidney donation for ≈fifty-year-olds? Is there a significant difference in benefits (to the recipient) or risks (to the donor) versus the case of a younger donor?

Expand full comment

I've been reading an excellent book called "The Origins of Music", ed. Nils L. Wallin, Bjorn Merker, Steven Brown, MIT Press - a subject of great interest to me. There are three theories: (1) music evolved from language; (2) language evolved from music; (3) both diverged from a common precursor—referred to by Steven Brown as "musilanguage" and by Steven Mithen as "Hmmmmm" (Steven J. Mithen, "The Singing Neanderthals: The Origins of Music, Language, Mind, and Body", Harvard University Press). Steven Pinker called music auditory cheesecake and thinks it's a spandrel, so he's #2. Mithen's book and Brown's chapter make a good case for #3. I'm for #1, but I'm obviously biased because I'm a musician. Any of the learned boffins here have opinions?

Expand full comment

Here is how I've come to believe Global Warming will be resolved:

1. CO₂-emitting energy sources are already rapidly being replaced by cheaper renewables. It will take more decades than anyone would like to get to zero, but the path is clear.

2. What isn't widely understood is that this alone doesn't bring CO₂ levels down, since CO₂ is quite stable in the atmosphere. The solution to that will be carbon sequestration: you capture CO₂ from the air and pump it into underground caverns. The pumping part is well-established technology from the natural gas industry. Capturing is only getting started, and will also take an uncomfortable number of decades to work at scale.

3. While waiting for points 1 & 2 to pan out, seeding the stratosphere with SO₂ can bring temperatures down by at least 1 degree Celsius, probably more, for about 2 years. This is known because volcanoes regularly do exactly that. Someone just needs to do it, despite the "Precautionary Principle" crowd.

I don't think this is super original. All three pieces are well known. Has anyone seen a "real" essay describing something like this 3 step plan/prediction?

Expand full comment

I've posted similar before, but I'd like to share a bullet point justification of what I think the ultimate goal of human (or superintelligence) action should be, and see what this community thinks of it. Considering all the debate we have about the future of humanity, it makes sense to have some grounded justification for what an ideal outcome for human activity should be, if such a thing can even exist. Personally, I think this outcome I describe is superior because it is based on a system that properly bridges the is-ought gap (the idea that a moral imperative can't be derived from simple observation of how things are), while other systems are based on subjective opinion or tastes that have no absolute grounding. Anyways, here it is:

>Hume supposes the is-ought gap

>We need to transcend this gap if we want to be able to deduce any possible end goal for humanity as deriving from observations of existence itself

>What is able to do so is pleasure (defined as any emotion subjectively experienced as positive): we are able to experience pleasure as good in itself, and good by its own definition implies an "ought" in its own increase

>So the ultimate end is increasing the universal experience of pleasure as much as possible

>But how to do that?

>Focusing on minds that already exist is inefficient, because it is at least theoretically possible to use Von Neumann machines (machines that utilize surrounding matter to make copies of themselves) to produce more minds out of the dumb matter of the universe than ever would naturally exist

>So, the end goal becomes developing the technology to create Von Neumann machines that convert as much of the universe's matter into minds experiencing pleasure as possible

>These minds would be so constructed as to never bore of this pleasure, if you've ever had a moment where you've thought "I wish this moment could last forever," it would be such a thing actualized, pleasure feeling as good as it did at its first moment for near eternity

>While such an end for the universe, by its single-minded simplicity, might seem boring or depressing to contemplate, what matters is not the feeling an outcome produces in us to contemplate, but how it would actually feel to exist within that outcome, so while it may be a "boring" universe to an outside observer, it would be an objectively good universe for the beings making up the majority of its conscious existence

>This is where this theory differs from Yudkowsky's "fun theory," because "fun" is only one of a multitude of pleasurable states, and the important thing is how the system is experienced from the inside (maximally pleasurable), not whether it's a universe that's fun to imagine oneself living in

In summation, Utilitarianism is the only system able to transcend the is-ought gap, and the logical end of Utilitarianism is using self-replicating machines to create as many pleasure-experiencing minds as possible. Hopefully enough people can be convinced by this argument that this can become the end goal that either humans or superintelligences put their long term efforts towards.

Expand full comment

With the Christmas season coming up... Apparently a lot of adults buy Christmas presents for each other, but... why? With children it makes sense: they have predictable interests and you can't just give them cash because they can't use it. Also they'll usually just tell you exactly what they want. But with adults.... If you really want something, you can just buy it yourself. And with no good way to know what the other party wants, any present you buy for them is bound to disappoint. Giving cash or a gift card obviously doesn't make sense either, since you'll both end up with the same amount of money you started with.

So... what's the point? It just seems like a huge waste of time and energy.

Expand full comment

If you haven't seen "This Land Is Mine" yet, you should : https://www.youtube.com/watch?v=JfkXorjMcMI.

Here's the Artist Nina Paley dunking on copyright in the most hilarious (yet entirely fair and true) way possible : Copyright is Brain Damage | Nina Paley, https://www.youtube.com/watch?v=XO9FKQAxWZc.

Expand full comment

Hello all. I voluntarily deleted my account (Matthieu). The unexpected part is that it also deleted all my comments, with the ones with replies marked as deleted and the ones without leaving no trace. I realize this is bad for the blog archives' readability, and I apologize for it. I won't delete this temporary account, but I should be unable to use it again.

This is a recurring theme, because I was already this guy: https://slatestarcodex.com/2019/11/06/open-thread-140-25/#comment-818659 . My reasons for leaving now are somewhat similar to then, although less dramatic (rereading myself, I certainly feel generally better now than I did then). The added practical reason is that I have a f*ing PhD to finish.

Thanks to Scott and to all the interesting and likeable people here. I should be back at some point.

Expand full comment

I wrote a scifi Western about a gunslinger drone if you’re looking for a short cozy read today: https://solquy.substack.com/p/111123-the-gunslinger

Expand full comment

I have an interactive, international, interdenominational whatsapp chabura that you are welcome to join.

At some point inshallah it will be by invitation only, but as we're only 70 people as of now and the average SSC'er is likely to be a good fit, you are welcome to join and of course to leave or rejoin whenever you like.


Expand full comment

The recent New Yorker is essentially the AI issue (I linked to a profile in it on Hinton in a thread below) and has a piece called "A Coder on the Waning Days of the Craft". I'm curious to what extent other coders agree with his description of GPT-4's current coding abilities: https://www.newyorker.com/magazine/2023/11/20/a-coder-considers-the-waning-days-of-the-craft

Expand full comment

Trying to scope out who'd be interested in funding a research group on broader social, historical, and philosophical approaches to AI policy and design. I'm from an SWE and PhD academic history background, others interested are PhD stats, PhD moral philosophy, inter alia.

tl;dr academics and policy people want to model and think through and design AI socially and culturally, who is it _not_ rude to ask, is there anything we should be thinking about other than just trying to get funding for compute?

More info here:


Expand full comment

In celebration of a very ER doc Thanksgiving I wrote some satire about the cherished holiday tradition of dumping your old people at the emergency room. Nothing says “let’s be thankful” more than the annual Pop Drop and Yaya yeet (tm). Please don’t do this ;)


Expand full comment

I'm developing a browser-based idle game and as an amateur, it's been a real motherfucker trying to figure out coherent systems for some of the game mechanics such as the loot rewards, the questing system and so on.

Do people just trial and error this shit when they're on their own? ChatGPT helps a bit but is mostly good as a summarizer and organizer. It has occasionally given some good ideas for formulas and can help me make sense of the math needed for some of the things, but I am feeling quite a bit lost.
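For what it's worth, one pattern people seem to reach for constantly in this genre (a common convention, not from any particular resource) is geometric cost/reward scaling, where each formula is just a base times a growth rate raised to the level:

```python
def upgrade_cost(level: int, base: float = 10.0, growth: float = 1.15) -> float:
    """Cost of the next upgrade; geometric growth gates progression."""
    return base * growth ** level

def loot_value(enemy_level: int, base: float = 5.0, growth: float = 1.12) -> float:
    """Loot grows slightly slower than costs, so upgrades stay meaningful."""
    return base * growth ** enemy_level

# Sanity check: at any given level, costs outpace loot, forcing some grind.
print(upgrade_cost(10) > loot_value(10))  # True
```

The numbers are placeholders; the tuning knob is the gap between the two growth rates, which sets how long each tier of grind lasts.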

Are there resources, communities out there that were created for amateur/solo game devs that you know of?

Expand full comment

What this world needs is an easy-to-type markup language.


Why? Phones don't have room for lots of style buttons, and voice recognition is a serious security threat. I don't like the idea of machines listening in on me. (I'm slowly working on an easier-to-learn keyboard layout, but since making hardware requires a whole lot more capital than writing software, that one is on the back burner.)

Expand full comment

My takeaways from the Altman saga (from very much an outsider's perspective, based only on news reports and what I think I know about human nature):

- The corporate structure of OpenAI was designed so that AI safety would be the paramount concern; the board was meant to ensure that. Altman does not want to be constrained by that structure.

- The 90+% of OpenAI employees who sided with Altman agree with him. Furthermore, they apparently found it absurd for Toner to suggest that it would be consistent with the mission of OpenAI to shut down the company if it presented a danger in terms of existential risk, or AI risk more generally. But that statement by Toner is simply a logical consequence of *having* a mission in which AI safety is a paramount concern. So, practically speaking, OpenAI employees, with near unanimity, do not believe that AI safety should be the company's paramount concern.

- It's pretty easy to infer from the above that the point is never going to come where the AI industry voluntarily prioritizes AI safety over other concerns. At every given step between now and ASI it will be easy to rationalize a more or less full-speed-ahead approach to AI development: "if we don't do it the other guys will"; "we are working for the good of humanity"; "our intentions are good, therefore the consequences of our actions must be good." The monetary and professional rewards of being first in the AI race and the glory of building utopia will be powerful motivators for concocting those rationalizations.

I am 100% certain that Sam Altman does believe that AI safety risk is real and profound. It is also clear from profiles of him that he thinks he's building utopia. Designing utopia is akin to making himself something between a king and a god of all humanity. Of *course* he is going to treat safety concerns as secondary to that incredibly powerful ambition.

He is, I'm sure, very smart. He is also an incredibly shallow thinker, from what I've read about him - he has no sense of the weaknesses of utopian thinking, no sense of why the extraordinary power he is projecting for AI (and by extension for himself) is inherently dangerous. He is subject to the worst kind of hubris: the belief that not only is he competent to wield enormous power unchecked, but that he is doing so for the common good. He might also be the most responsible leader of an AI company outside of Anthropic.

I am not feeling great about things at the moment.

Expand full comment


We have enough fraud by humans, there's no need to automate it.

The next step is to submit papers with fake data sets to real journals to find out what gets noticed. I'm pessimistic.

"ChatGPT generates fake data set to support scientific hypothesis"

"In a paper published in JAMA Ophthalmology on 9 November, the authors used GPT-4 — the latest version of the large language model on which ChatGPT runs — paired with Advanced Data Analysis (ADA), a model that incorporates the programming language Python and can perform statistical analysis and create data visualizations. The AI-generated data compared the outcomes of two surgical procedures and indicated — wrongly — that one treatment is better than the other."

"At the request of Nature’s news team, Wilkinson and his colleague Zewen Lu assessed the fake data set using a screening protocol designed to check for authenticity.

This revealed a mismatch in many ‘participants’ between designated sex and the sex that would typically be expected from their name. Furthermore, no correlation was found between preoperative and postoperative measures of vision capacity and the eye-imaging test. Wilkinson and Lu also inspected the distribution of numbers in some of the columns in the data set to check for non-random patterns. The eye-imaging values passed this test, but some of the participants’ age values clustered in a way that would be extremely unusual in a genuine data set: there was a disproportionate number of participants whose age values ended with 7 or 8."
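The last-digit check they describe is easy to reproduce in miniature. This toy version (my own sketch, not the actual screening protocol) just counts terminal digits:

```python
from collections import Counter

def last_digit_skew(values):
    """Return (most common terminal digit, its share of the sample).
    Genuinely varied data should sit near 0.1 per digit; a big
    cluster on one or two digits suggests fabrication."""
    digits = [abs(int(v)) % 10 for v in values]
    top_digit, top_count = Counter(digits).most_common(1)[0]
    return top_digit, top_count / len(digits)

# Ages clustered on 7s and 8s (like the fake data set) vs. a flat sample.
fake_ages = [27, 37, 47, 57, 67, 28, 38, 48, 58, 68]
real_ages = [23, 31, 45, 52, 66, 70, 49, 58, 34, 61]
print(last_digit_skew(fake_ages))  # (7, 0.5)
print(last_digit_skew(real_ages))  # (1, 0.2)
```

With real samples you'd want a proper chi-square test against a uniform digit distribution rather than eyeballing the top share, but the idea is the same.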

I don't know how much of the screening protocol is automated.

Expand full comment

How could a fighter in the Middle Ages survive multiple swordfights over the course of their life? If you watch any melee weapons fights like HEMA or Dog Brothers, it's impossible not to get hit if you're in range - any given fight has each fighter hitting the other dozens of times. Any martial art that teaches realistic knife defense starts with the principle "you are absolutely going to get cut a few times". Swords are obviously lethal, and medical care at the time was obviously extremely poor - if you didn't die of a sword wound you'd probably die of infection later. It seems like even if you were unusually agile, athletic, and skilled, you could survive maybe two or three swordfights over the course of your life...? How could someone be a "warrior" for multiple years or decades?

Expand full comment

Thank you for the reference on Professor Daniel Kang's work. I've recently started the newsletter Provably Safe AI and published the second post today. It covers attacking and defending LLMs, as well as policy and standards around AI-relevant cybersecurity, so Prof. Kang's work will be a good follow-up to share next. I'm new to the topic (and to writing in this format) and would appreciate corrections and feedback: https://www.provablysafe.ai/p/2-provably-safe-against-ai


What do you think would happen if America stopped supporting Israel?


Well, I wrote a thing about five now-defunct Flash games and how to win them (with a bit of rambling at the start about the enshittification of free browser-based games, but you can skip that if you like).



I posted the other week about using tiny doses of antidepressants as a form of open label placebo and subsequently saw this article posted on Twitter https://academic.oup.com/book/54240/chapter/422452976?login=false

Has it been discussed here on ACX?


What are the best resources to learn how to implement LLMs in real life?

Context: I’m a surgeon based in the UK NHS in the field of orthopaedics. I think that an LLM-based chat bot would be really helpful for navigating the ‘National Joint Registry’, which is basically a huge database of medical implants, outcome data and survival figures.

I have no idea how to implement this.

Does anyone have any thoughts or guidance for me?


I've recently found the amount of antisemitism on Twitter anxiety inducing and pretty harmful to my life. Is there a good way to completely block the ability to access it from my phone? I can do it on my PC, but I haven't found a way to IP block from my Android phone


Has Scott written about “The Body Keeps the Score”? I’ve looked all over for informed reviews, but just find commentary by book critics and laymen.


I’m looking for some recommendations for sci-fi novels. Specifically, space colony type novels. Does anyone have any favorites they’d recommend from that sub-genre?


The other Scott blogs (among less CW-relevant stuff like his take on OpenAI) about Greta chanting "crush Zionism", linking the SWC as a source. "[She is] taking time away from saving civilization to ensure that half the world’s remaining Jews will be either dead or stateless in the civilization she saves."

I think I like that framing better than the one of the SWC, which simply calls that murderist...err ... antisemitic. Personally, I think Zionism has as many interpretations as feminism, some of which I find myself totally on board with (continued existence of Israel) and some I strongly oppose (settlements in the West Bank). That makes it a useless word for anyone aiming to communicate clearly, but the general vibe I get from "crush Zionism" is still one of calling for the destruction of Israel.

While one can oppose a state without being racist towards its inhabitants (for example, I would be on board with "crush the Aztec Empire"), I think that letting Israel continue to exist is actually a great idea. There may well be a statistical correlation between antisemitism and antizionism, but I think that the narrative of Ms Thunberg picking up an old diary of Adolf Hitler and getting infected with antisemitism is probably not what happened.





In previous Open Thread, it was mentioned twice that the "rationalist community" got its name wrong, because in the "rationalism/empiricism" philosophical dichotomy, they are taking the side of empiricism (although in reality they actually do a lot of armchair reasoning and little experimenting).

I repeat and develop my objection here (so that it does not get lost in the ocean of Palestine/Israel comments), because this is what people keep saying for years, and it is mistaken.

The "rationalism/empiricism" philosophical dichotomy is a false dilemma, because on one hand, human reasoning is influenced by all the evidence we have observed in the past, and on the other hand, all the evidence we observe is interpreted (starting way before the conscious level; for example, the visual cortex already translates the incoming "pixels" into lines and shapes, which is why optical illusions are so powerful even after we learn about them). So it is always an interplay of both. (Also, the dichotomy ignores the third major source of information, which is the instincts and cognitive biases we were born with. We are not born as "tabula rasa", to be shaped purely by the sensory inputs and/or reasoning from the first principles. Although some philosophers tried to shoehorn instinct into a broad definition of "reason".)

Even people traditionally associated with philosophical rationalism (Spinoza, Leibniz) often said things like: "in theory, we could derive everything from the first principles, but in practice, we usually need to use the empirical evidence" (not an exact quote). So if you tried to find someone who is "rationalist" in the sense of "opposing empiricism", you would probably need to go all the way back in history to Plato.

Then there is the political meaning of "rationalism", where using reason is put in opposition not to experimental data, but rather things like the divine right of kings or religious traditions. This is associated (among other things) with materialism, atheism, utilitarianism... and I think it is not a mistake to put the "rationalist community" in this group.

tl;dr -- you make it sound like words have a clear meaning in philosophy and ignorant Yudkowsky made a clearly wrong choice, but in reality it is complicated, some traditional thinkers would agree and some would not


What are good things your parents did with you when you were a kid? What do you wish they had done but didn't?

Background: my kids are 5+7 now and I feel I now have a few years where I can do all kinds of fun, weird and in the end net-positive things with them before they turn into teenagers. Examples (what I already do): bring them along to diverse encounters (the 7yo just spent 3 days with me and others cooking for 50 people), create time when they can just do whatever they want (current favorite: play outside in the mud without someone watching them), read many books together, have reasonable expectations for what they must do (clean their room once per week). So far we're doing good - I'm just looking for more and also weirder ideas!


The situation where OpenAI employees en masse signed a public petition supporting their boss in his acting against the OpenAI Charter reminds me of a document called "Anticharter" that people in socialist Czechoslovakia publicly signed to express their loyalty to the regime (and keep their jobs).


This is not a coincidence because nothing is ever a coincidence.


We continue re-reading old posts about Scott's adventures all over the world. This week, he goes to Japan and visits the sunken corpse-city of R'lyeh. Either that, or some funny shaped rocks. https://archive.ph/k7WOI https://pastebin.com/Hq8vsWER (index of all the old posts https://archive.ph/fCFQx)

Later, Scott came to believe that the "ancient city" is just a natural rock formation, but writing things down immediately after going there, he was much more on the fence, even leaning towards it being real. Today we have something that the Internet of 2007 did not: a beautiful video of that very same "city" https://www.youtube.com/watch?v=mnaWn5OPP3c, so that you can decide for yourself if it looks legit to you. (Spoilers: no, but it's very pretty anyway.)


If you were dictator of your country for a week and could force through one lasting change, what would it be?

Meta-changes that modify the political/governmental apparatus are encouraged but not required.


These are some of the thoughts that have come up as I bounce around the interwebs trying to find out what Q-star is exactly. Or even approximately --

Consideration #1: any system of so-called morality is nothing more than an attempt to codify sentimentality. But sentimentality works both ways: if you can love, then you can also hate.

Consideration #2: if Q-star is -- or is about to become -- an AGI, it would nevertheless still not be conscious.

Question: so which poses the greater X-risk, a conscious AGI or an unconscious one?


Summer Solstice! It's summer, people! (This is your semi-humorous reminder that the Southern Hemisphere exists; the end of the year gets a little irritating for me when everyone holds "winter" events and forgets us, with some even going as far as saying that if it is not regional winter you're not welcome to participate.)


I'm currently scoping out potential ideas and level of interest in an online EA+ community catered for neurodivergent people. If this sounds like something you'd join (or maybe even get involved in setting up), I'd love to get your input via this 10-15 minute survey. The survey is anonymous (even though it asks you to log in).



Is there agreement about flu shots in the ACX community?

Do most people think the benefits outweigh the costs?

I understand they guess at the flu strains a few months in advance -- is there a way to check on how good they are and take them once it is known if the predicted strains did take over? Or maybe the difference isn't enough to move the needle on cost/benefit?

I figure this has been hashed out by ACX or ACX-adjacent people before.


As to 2, good: a proper professor researching the actual current state of the AI art.


seems to be causing consternation in the doomer community but does make a rather powerful point about the PR failure of doomerism. I like my climate science (just as an example) to be from 2023 not 2014 and from climate scientists, not philosophers of climate science. And if it has to be 2014 science I would like it at least to be updated to account for huge developments like the unforeseen effectiveness of LLM.

Also, and this seems to be a sensitive topic, there are enough climate scientists around that I can afford to indulge my own distaste for racism by avoiding the output of climate scientists (if there are any) who are the authors of clearly racist posts on the Internet. You may call this an ad hominem position if you will, but it's the way I am wired. I might accept the output of a pure mathematician as being independent of their racism, but in a human-facing science, I want to know about it. And all science is potentially human-facing; nuclear physics is a great deal less abstract than it appeared to be 100 years ago.

Nov 27, 2023·edited Nov 27, 2023

To Scott, or any other psychiatrist here.

Wikipedia's list of mental disorders (https://en.wikipedia.org/wiki/List_of_mental_disorders) lists Body Integrity Identity Disorder under "obsessive-compulsive and related disorders". I have severe BIID and mild OCD, and those don't seem connected at all. Most notably, OCD is very clearly ego-dystonic, and BIID is very clearly ego-syntonic. Is there an actual connection? Where did Wikipedia get this from?


If an alien published the code for ASI on GitHub tonight, and if that code could run on approximately four A100s (which are expensive but consumer-accessible GPUs), what would you estimate as the probability that a paperclip maximizer (PCM) would cause the extinction of humanity by the end of the century?

I propose that any estimate greater than 5% is a miscarriage of the critical thinking principles of this community. While I personally find orthogonality and instrumental convergence to be persuasive, and while I readily admit that I can't explain why they wouldn't happen, these theories don't cause me to forecast instadeath from PCMs. No matter how persuasive PCMs are, I temper any expectation about how they work by the sheer fact that they're theoretical.

And theory by itself doesn't mean you can't forecast. We already have a playbook for that, which is called the inside view. But the inside view here requires making some connection with empiricism. Our epistemic status for PCMs is lower than what we had before making predictions about atmospheric ignition pre-Trinity test and lower than when we made predictions about spontaneous black hole formation during the start of the Large Hadron Collider.

We also have playbooks around agentic simulation, such as when Axelrod and Hamilton used game theoretic models to explain how cooperation might have evolved by animals. But in their case, they still relied on significantly more empirical touchpoints, such as examples from biology. PCMs, on the other hand, can't be inferred from anything in existence. The PCM theory is designed to not pay rent¹.
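For contrast, the Axelrod-style playbook mentioned above really is just a few lines of simulation. This sketch (the payoff values and strategies are the textbook defaults, not taken from any specific paper) shows the kind of model whose predictions can then be checked against biological examples:

```python
# Standard prisoner's dilemma payoffs with the usual ordering T > R > P > S.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Run an iterated game; each strategy sees only the opponent's history."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)
        move_b = strategy_b(hist_a)
        pa, pb = PAYOFF[(move_a, move_b)]
        score_a += pa
        score_b += pb
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

Tit-for-tat's mutual cooperation outscoring mutual defection over repeated rounds is what made the cooperation story testable against field observations; there is no analogous empirical touchpoint for a paperclip maximizer, which is exactly the asymmetry being argued here.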

Setting aside the discussion around PCMs, I find the other parts of the AI Extinction debate distracting. Questions about whether we will get to ASI in the first place, whether Moore's Law will end, whether warning shots are a thing, or whether humans have the capacity to build controls for anything are all red herrings. The AI Extinction conversation has evolved into a Shiri Scissor² because the subject of contention has reasonable—but polar—priors. Most of the text spilled online in the AI Extinction debate is about all these distracting preconditions, when we only get to forecasts of 50% or higher when we contemplate the last step, the possibility of an impassive, destructive PCM.

As a personal aside, I went through an update in formulating this comment. I now don't think that Eliezer actually believes there's a >90% chance of AI Extinction³. He hasn't gone on record with a specific number, and I'd bet his actual number is closer to Scott's (20%), maybe even lower. But Eliezer's behavior is consistent with someone who believes in a 10-20% chance of AI extinction. If I had the same estimate, I too would be grabbing everybody I know and shaking them, asking, "Why aren't you all worried about this? We're all going to die ... tonight!"

[1]: https://www.lesswrong.com/posts/a7n8GdKiAZRX86T5A/making-beliefs-pay-rent-in-anticipated-experiences

[2]: https://slatestarcodex.com/2018/10/30/sort-by-controversial/

[3]: https://www.astralcodexten.com/p/why-i-am-not-as-much-of-a-doomer


Philosophy Bear asks that I link a short survey he made up: https://docs.google.com/forms/d/e/1FAIpQLSdtuAE1MXUmYJ9ilLUOojXL6R3TTTCRy-kEcTivM5al5hCOzw/viewform


The last week or so some of my EA friends have become totally chagrined. My suggestion: rebrand like sociobiology did into ev psych.

Also, FIRST!
