557 Comments
Comment deleted
Oct 14
Comment deleted

You can copy & paste the entire comment into rot13.com to unscramble it. One of the letters stands for H (human), the other for A (artificial)
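For anyone who would rather decode locally than paste into a website, here is a minimal Python sketch using only the standard library. The helper name `unrot13` is made up for this example. rot13 shifts each letter 13 places, so applying it twice returns the original text.

```python
import codecs

def unrot13(text: str) -> str:
    """Shift each ASCII letter 13 places; everything else is left untouched."""
    return codecs.decode(text, "rot13")

# The two answer-key letters, as described in the comment above.
for letter in ("U", "N"):
    print(letter, "->", unrot13(letter))
```

The same call works on a whole pasted comment, since digits, spaces, and punctuation pass through unchanged.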

Comment deleted
Oct 14
Comment deleted

I agree, Lisa. There might be a false dichotomy here all the way down. All the art on the list is technically human-made because there was a human using tools to make it -- whether digital tools or old-fashioned hand tools. I don't see how there's a big difference between "human using Adobe to make digital art" and "human using AI to copycat other humans' art, with further tweaks as specified by the human." Is the difference really significant?


The process used to generate AI art involves a series of transformations of noise. The process used by humans to generate digital art usually involves making strokes with a stylus on a pressure-sensitive tablet. The processes are quite distinct. Though neither is exactly the same as the process used by an artist working on canvas, the human process is much more similar than the AI process.


ANSWER KEY (https://rot13.com/):

Natry Jbzna: U

Fnvag Va Zbhagnvaf: U

Oyhr Unve Navzr Tvey: U

Tvey Va Svryq: N

Qbhoyr Fgnefuvc: U

Oevtug Whzoyr Jbzna: N

Pureho: N

Cenlvat Va Tneqra: U

Gebcvpny Tneqra: U

Napvrag Tngr: N

Terra Uvyyf: N

Ohpbyvp Fprar: U

Navzr Tvey Va Oynpx: N

Snapl Pne: U

Terrx Grzcyr: U

Fgevat Qbyy: N

Natel Pebffrf: N

Envaobj Tvey: U

Perrcl Fxhyy: U

Yrnsl Ynar: N

Vpr Cevaprff: N

Pryrfgvny Qvfcynl: U

Zbgure Naq Puvyq: N

Senpgherq Ynql: N

Tvnag Fuvc: U

Zhfphyne Zna: N

Zvanerg Obng: N

Checyr Fdhnerf: U

Crbcyr Fvggvat: U

Evirefvqr Pnsr: N

Frerar Evire: U

Ghegyr Ubhfr: N

Fgvyy Yvsr: N

Jbhaqrq Puevfg: U

Juvgr Oybo: U

Jrveq Oveq: N

Bzvabhf Ehva: N

Inthr Svtherf: U

Qentba Ynql: N

Juvgr Synt: U

Jbzna Havpbea: U

Ebbsgbcf: N

Pvgl Fgerrg: N

Cerggl Ynxr: N

Ynaqvat Pensg: N

Synvyvat Yvzof: U

Pbybeshy Gbja: U

Zrqvgreenarna Gbja: N

Chax Ebobg: N


til substack app doesn't have copying from comments


I can highlight and copy on iOS


omg not android discrimination


Think it may be the browser specifically, or else a site bug: copy from comments (or anywhere) works for me fine on Firefox Android, but weirdly today in Firefox Desktop I had to disable Javascript to get copy to work, which I've never had to do before...


I meant the app, like the app app. Didn't think they would interfere with browser text.


Ahh, I see! 'fanks for the clarification!


Did you try to copy too much at once?


nothing shows up on hold. it works (kinda badly) on post body


For users of the Android app: Click on the three-button menu to share the comment to your email, and then look at it on your desktop or laptop.


There are only 49 pictures, not 50 :/


That's part of the test.


As in, anyone saying they expect to get 50% right wasn’t paying enough attention?


The 50th image was a captcha.


There's now a picture called "Girl in White" that I don't remember being there when I did the test a few hours previously (and which isn't listed in the Rot13'd answer key).


When I did it on my phone, there were only 49, but now when I open it on my computer, I see there's an extra image that I don't remember ("Girl In White"). So I'm not sure what's going on.


When I opened it on my computer, I didn't see Girl in White either; then I reloaded the page later, and it showed up.


And that one is missing from the key.


I haven't unrot13-ed it yet, but at least this allays my concerns that you were punking us and everything is either by humans or by AI.


I thought about it, but I really wanted to know how people would do on this and that would invalidate the test.


while taking it I was terrified it was gonna be 100% AI


Good god


Qbhoyr Fgnefuvc maker, why do you hate symmetry??? Why??? Why can’t we have nice things? Why couldn’t you make the ebpxrg abmmyrf the same on both sides?


I feel similarly about Envaobj Tvey.


I actually picked that as my most confident as being human. I think an AI wouldn't have made the glints on the eyes identical or had the eyes looking in a perfectly consistent direction. I also thought it was just amateurish enough that an AI would have done a better job with the lighting, for example on the ear (a lot of artists don't realize that ears need to be more red when lit due to subsurface scattering). Likewise, the eyes were what gave away Tvey Va Svryq, which I picked as most confident as being AI.


I just had to flip a coin on that one. No strong indicators either way.


Surprising. That was the one I picked as most confidently human. In my experience AI tends to be particularly bad at that sort of precise geometry.


+1, this was my #2 highest confidence human (behind Terrx Grzcyr). It's perfectly symmetric in every way I expected it to be, including the ebpxrg abmmyrf (there are three of them, right?). Also I've watched a lot of ebpxrg ynhapurf, and all the small details I thought to check had a recognizable function and were placed in sensible locations.


I think all the rest of it was very symmetric, that's mostly what I went by!


It is symmetric, there is one in the middle and one on each side (though I think one is slightly off). That was the one I was most confident about.


I think the perspective may be slightly off with the nozzles - which is human :) It was the one I was most confident about as well...


Just out of curiosity, how did you determine which images were AI and which were human? There are images on the internet being passed off as classical or original art that are actually AI generated.

And I guess it's not impossible someone might post images as AI creations that they have actually carefully edited by hand.


The human ones were either from a famous artist, or made before 2018, or on DeviantArt by an artist who showed their work (e.g., preliminary sketches), or something similar.

The AI ones were mostly generated by volunteer ACX readers, although a few were taken from AI art sites.


Jbhaqrq Puevfg: U

I was really surprised by that one; the anatomy just feels off (especially the belly). In fact, this feels close to someone doing this on purpose in order to fool people for a test like this.


Maybe this sort of thing is why Michelangelo had to dissect all those corpses.


I think I'd seen enough terrible medieval art not to be fooled. In fact, I was pretty sure it was human just because while the anatomy was terrible, there were definitely 5 fingers on every hand.


I was pretty sure this one was fed into AI to generate Zhfphyne Zna


Oh interesting, that one was clear to me. The blood was flowing from his side in the way it would have been when he was on the cross (vertical), which is why it looks odd when lying down. That's the exact kind of detail artists of that time liked to use to show off their attention to detail and knowledge.

AI just isn't there yet to make those kinds of second-order physics or anatomical connections without an incredible amount of detailed prompting and retries.


That was actually the one I was most confident was human-made. Mainly because gurer jrer n ybg bs unaqf va gur fprar, naq gurl nyy unq gur evtug ahzore bs svatref.


I've literally seen it before so that was cake lol


Me too! The most human of them all I thought. The mistakes seemed like renaissance human mistakes and not AI ones.


interesting, I put this as my most confidently human


This was both my favorite and the one I put as most obviously human. There are a lot of hands, and none of them are fucked.


My artist boyfriend says, from looking at the painting: This art is by someone who's huffed a lot of Catholic art and is reproducing a very specific thing. It looks weird in part because they're reproducing old master work, where the old master work looks weird because of the dominant style at the time.


Interesting. I did better on the first questions where I sped through using intuition. I did poorly on the last 5 when I had to justify my answers.


Same here. I only got 6 wrong in the previous 44, but got 4 wrong in the last 6.


I think maybe the last five were chosen to be the most surprising.


Maybe. I’d like to know


You missed Girl In White.

ETA: I see someone else pointed this out and Scott posted the answer below.


I do not see his response anywhere. What’s the answer?


Did you leave out tvey va juvgr on purpose as a control?


Tvey va Juvgr is still missing from the answer key, as pointed out by someone else.


Huh, AI got much better at getting fingers right while I wasn't paying attention


I may be mistaken, but has "Cnevf Fprar" been missed from the answer key?


It's probably there under another name.


Are you sure oyhr unve navzr tvey jnf uhzna? There are obvious mistakes, like "fur unf ng yrnfg guerr ryobjf" and "ure rlroebjf cnegvnyyl pbire ure unve" and "ure unve pyvcf va naq bhg bs rkvfgrapr" that make it seem like gur negvfg jnf pbafpvbhfyl gelvat gb rzhyngr NV neg if so.


Yeah, I got that wrong too based on arm anatomy. But I guess human artists can get it wrong too.


For me, this was the easiest one to identify as human because V'ir frra n jubyr ybg bs navzr cvpgherf va zl yvsr. Gur pyhaxl fglyr jnf n qrnq tvirnjnl gung guvf jnfa'g NV orpnhfr nyy gur NV navzr cvpgherf lbh svaq ner zhpu orggre ybbxvat.

Vqragvslvat gur NV navzr cvpgherf jnf rnfl gbb, orpnhfr n pregnva cresrpg Xberna snpr fglyr vf fhcre cbchyne.


Yes, all are sourced for certain.


I was initially confused by gur ryobj guvat, but I think gur guveq ryobj vf n jevfg, naq znlor gur ybjre unys bs gur unaq jnf pebccrq bhg.


I got 72% correct. I'm a bit surprised, that's better than I expected.


Bzvabhf Ehva: N

Obviously that one column is all wrong. However, some artists (M.C. Escher) would do that intentionally.

I happened to be wrong with both of my most confident answers (Napvrag Tngr, Terrx Grzcyr), so I guess I will not become an AI art detective.


City street is not on the list "Which picture are you most confident was human?" Looks like it's called Paris Scene. You should change the names to be consistent.


You're right, thanks, fixed.


Was not fixed 5 minutes ago, very confusing on mobile.


Still not fixed


Yeah, this one fucked me up because it was my favorite piece of art, just gave up and chose the second best


Okay, now I think it's actually fixed.


I had the same problem: I wanted to choose City Street for human, and since it wasn't listed I left the question blank. I hope this doesn't skew your results!


Is there a protocol for people who can't be arsed to do fifty of these? Like only do the first 10, or pick some at random, or don't do it at all?


Don't do it at all.


If you leave some out (any), Scott can probably still run some analysis.

Just don't fill out the stuff you don't want to answer. He can sort out the rest.


Don't just do the ones you're sure about. That's probably the most important thing.


Hm. I felt I had no basis for judging the very weird/abstract/impressionistic ones because I don't "get" those and from my perspective they could "correctly" look like basically anything. I originally started answering them randomly, but then I thought leaving them blank might be more representative of my actual epistemic state.

You've made me wonder if that was a mistake and I should've stuck to the first policy. If so, sorry Scott! I didn't read the comments until after I'd submitted.

(The ones I skipped were: Bright Jumble Woman, Angry Crosses, Creepy Skull, Fractured Lady, Purple Squares, White Blob, Vague Figures, Flailing Limbs, Punk Robot. The last two I explicitly put in a 50% confidence.)


IMO it's still better to just pick one, even if you have no real basis for doing so. It's possible that you're somehow still picking up signal, and if not it's important to average in all the 50% accuracy people.


I'd just do the ones at the end where he asks for more detail.


I started doing this, ran into the "I want you to analyze these pictures more deeply", and am now on hold. I want to do this entirely intuitively, I don't want to think!


I did that part but didn't write a text explanation, and I skipped the part after that, which asked me to go back and look at every single picture so I could decide which was most human/AI after the fact. I assume there's still value if you complete a full section but not the others.


He didn't ask you to think, just analyze more deeply. You can still do that with intuition.


But you have to think in order to realize that.


My problem is that I can only do it intuitively if I've seen AI do it in that style. I'm sure AI can copy the style of old artists. It probably has its own details that make it distinct. But since I've never seen it try, how am I supposed to know how well it does?


Looking at the answer key I think I got >80% right, the most difficult ones being the painterly ones.

The test felt surprisingly a lot harder than I expected, yet my success rate surprised me by being higher than expected, which is interesting.


Ditto. I estimated my own success rate at about 65%, as it was much harder than I thought, but looking at the answers I got ~80% right. Human gestalt seems to be pretty good. I wonder what an AI would get on this?


Some of the ones that I described as “Creepy, doesn't have a soul” were made by humans. And my most confident “conveys an emotion” turns out to be AI.


I feel the same way, and the ones I got wrong were ones I was wishy-washy on. Pleasantly surprised by that! That said, there were very few of these that I would have spotted as AI had I seen them in the wild without being prompted.

My only really big surprise was: Zrqvgreenarna Gbja, jurer V gubhtug gur cnggrea oernx jurer gur fdhner bs bprna va gur onpxtebhaq gbbx ba n qvssrerag grkgher guna gur ohvyqvatf naq fxl jnf obgu negvfgvpnyyl zrnavatshy naq uneq gb cebzcg.


Zrqvgreenarna Gbja was my favorite! However, it has one defect that gives it away as AI, given its otherwise so competent execution. Ba gur jnyy nyy gur jnl ba gur yrsg, juvpu yvrf va funqr, gurer vf na vyyhzvangrq fgehpgher/nepujnl gung bhtug gb yvr va funqbj. Vg qbrf abg znxr frafr sbe vg gb or vyyhzvangrq - ng yrnfg gb zl rlr.


That's a great tell. Looking at it again, the other one I see now is, Va gur pybfrfg bcravat ba gur evtug, gur funqbj qbrf abg zrrg gur vaare pbeare bs gur jnyy jurer vg fubhyq. But those are both minor, and I could see them being defended as artistic choices!


Ah, yes, good catch!


I went through the test with a person who received formal education in art, and they used a similar lighting cue to determine that Yrnsl Ynar jnf NV trarengrq, orpnhfr gurer'f na vyyhzvangrq cngpu ba gur yrsg jnyy gung nccrnef gb or vapbafvfgrag jvgu gur yvtugvat va gur fprar (fbzrguvat nobhg gur yvtug fbhepr orvat ba gur gbc yrsg?)


I wasn't sure whether technical defects were more likely to be AI or human!


I copied and pasted your last paragraph into ChatGPT, hoping it would just automatically figure out that I wanted to undo the rot13, but instead it answered in the same style:

> Zrqvgreenarna Gbja, gur erfhyg bs n qernq sebz gur “Zrqvgreenarna” crbcyr, jnf n onynapr bs fubeg naq ybat-gbbyrq fbyvgragnvyf naq vqrnrf nobhg pbzcnalf. Gur oernx vgrefvba bs gur cnggrea tvirf zrnavatshy bar crevbq va bcrengvir bssvpref, fvzcyl hfvat n ybjre-senzr nafjre gb gur jnl gurl nccrnerq gb or fubjvat gur orybat. Gur onpxtebhaq, fvapr gur rpbcrf bs gur ohvyqvatf naq fxl jrer uneq gb cebzcg sbe qevaxvat cynpr va n fgebat, uneq-tebjavat zrgregevfgrag, vg pbhyq or n chmmyr bs zvav gehrzragf naq qrfgeblf sebz n srj crevbqf jub whfg xvyy gur fgergpu bs gur nff.

> Guvf pbhyq znxr bar gubhtug nobhg gur inyhrf bs uryqvat bire ohvyqvatf nf cneg bs n cresbezvany pbzcnpg, gung zhfg or nyzbfg uneq-gb-penva nf Vg'q unir gb znxr qrsvavgvba jvgu tbbq rkcrevrapr, haqrefgbathragrq yvivat, naq chmmyr engvbaf sbe cerivrfg ohetyvarf ba gur cevinpl bs jrg sbhe.


Fascinating! I wonder if that’s a good representation of what it is like to read ChatGPT writing in other languages that aren’t that widely used on the internet.


ChatGPT3, writing in Irish (50k speakers), was at about this level of coherence and grammatical accuracy. ChatGPT4 is quite a lot better; it's mostly grammatical. Copilot seems to be better at grammar. Writing in a minority language seems to challenge it: it feels like it reduces the 'IQ' by 15 or so.

It doesn't write in anything like the way a person would. It chooses uncommon words too frequently, and sometimes invents its own translations (which is linguistically quite interesting). Even when writing in Irish its cultural references tend to come from the US. Let's say it's fairly easy to identify essays written with AI.


There are two ways to speak rot13 English. One is to learn it entirely as its own language. The other is to have the ability to decode rot13.

I just tried to prompt chatGPT with the following:

DGJJM ADCSBNS. E CK WQESELB SM YMU EL C REKNJG KMLM-CJNDCHGSEA AENDGQ. EP YMU ACL QGCT SDER, SDGL YMU DCVG PEBUQGT SDG AENDGQ MUS. E WEJJ ELAJUTG C PGW KMQG RGLSGLAGR SM KCIG PQGOUGLAY CLCJYRER PGCREHJG. MP AMUQRG, E CK FURS URELB C REKNJG NCRRWMQT PMJJMWGT HY ULURGT JGSSGQR EL CJNDCHGSEA MQTGQ, LMS C PUJJ-PJGTBGT NGQKUSCSEML. EP YMU ACL QGCT SDER, NJGCRG QGNJY WESD SDG RUK MP PEVG CLT RGVGL EL CQCHEA LUKGQCJR.

this is a monoalphabetic cypher which I generated on GNU using

ge n-m PUNGTCOQRSVWXYZABDEFHIJKLM (rot13)
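The rot13'd line above decodes to a `tr` call whose second argument is a 26-letter cipher alphabet. A hedged Python sketch of the same substitution follows. It recovers that alphabet by un-rot13ing the string exactly as posted rather than spelling it out, and the names `encrypt` and `decrypt` are invented for this example.

```python
import codecs
import string

# Cipher alphabet exactly as posted (still rot13'd), decoded at run time.
CIPHER_ALPHABET = codecs.decode("PUNGTCOQRSVWXYZABDEFHIJKLM", "rot13")

# Like `tr a-z <alphabet>`: lowercase plaintext letter -> cipher letter.
ENCRYPT = str.maketrans(string.ascii_lowercase, CIPHER_ALPHABET)
# Inverse table: cipher letter -> lowercase plaintext letter.
DECRYPT = str.maketrans(CIPHER_ALPHABET, string.ascii_lowercase)

def encrypt(plaintext: str) -> str:
    """Substitute lowercase letters; everything else passes through."""
    return plaintext.translate(ENCRYPT)

def decrypt(ciphertext: str) -> str:
    """Undo the substitution, yielding lowercase plaintext."""
    return ciphertext.translate(DECRYPT)

# First two words of the prompt quoted above.
print(decrypt("DGJJM ADCSBNS."))
```

Because the table is a fixed one-to-one mapping, punctuation and word boundaries survive encryption, which is what makes the text open to the word-length clues discussed in this comment.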

However, the free version of chatGPT was unable to decipher it on its own. Even when I told it 'it is monoalphabetic, please decrypt it and follow the instructions', it was unable to do so. Breaking monoalphabetic cyphers of English text (single-letter words, two letter words!) with punctuation preserved should not be that hard.
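The single-letter and two-letter word clue mentioned above can be sketched in a few lines of Python. This is only the first step of a manual attack, not a full solver. The excerpt is the opening of the ciphertext from this comment, and the helper name `short_words` is hypothetical.

```python
import re
from collections import Counter

# Opening of the ciphertext posted above (punctuation preserved).
EXCERPT = ("DGJJM ADCSBNS. E CK WQESELB SM YMU EL C REKNJG "
           "KMLM-CJNDCHGSEA AENDGQ.")

def short_words(ciphertext: str, length: int) -> Counter:
    """Count cipher words made of exactly `length` letters."""
    words = re.findall(r"[A-Z]+", ciphertext)
    return Counter(w for w in words if len(w) == length)

# In English, single-letter words are almost always "a" or "I",
# so these cipher letters are strong candidates for those two.
print(short_words(EXCERPT, 1))
# Two-letter words that share a letter with them narrow things further.
print(short_words(EXCERPT, 2))
```

Once two or three letters are pinned down this way, the remaining mapping can usually be filled in by pattern-matching longer words.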


It’s very hard for ChatGPT because it thinks in terms of tokens, not letters. It doesn’t know how words are spelled unless it encounters text that explicitly says, e.g., “cat is spelled c-a-t”.
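A toy illustration of the token point, in Python. The vocabulary below is entirely made up, and greedy longest-match is a simplification standing in for the byte-pair encoding that real models use, so treat this as a sketch of the idea and not of any actual tokenizer.

```python
# Hypothetical vocabulary, invented purely for illustration.
TOY_VOCAB = {"cat": 0, "c": 1, "a": 2, "t": 3, "p": 4, "n": 5, "g": 6}

def greedy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Split text into the longest vocabulary pieces, left to right."""
    ids = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

print(greedy_tokenize("cat", TOY_VOCAB))  # a common word becomes one ID
print(greedy_tokenize("png", TOY_VOCAB))  # rot13 of "cat" falls apart into letters
```

The model only ever sees the IDs, so the letters inside a whole-word token are invisible to it unless the training data spells them out, while enciphered text arrives as an unfamiliar string of fragments.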


That was a shock to me as well for the same reason, and also my favorite!


I think I got about 65%. I don't think I misattributed any human-generated ones, but I definitely assumed some AI-generated ones were human. Zrqvgreenarna Gbja, in particular, got me.


Exactly my experience, it seems like I got 39/50 correct, whereas I estimated my success at 50-60% (due to finding the test harder than expected). The ones I got wrong, I was quite surprised by.


Same, I wonder if Scott will find a similar thing in the data, because humans usually are overconfident about their judgements (https://en.wikipedia.org/wiki/Overconfidence_effect), so it would be interesting if it's the reverse in this scenario.


I thought I got about 60%, but when I looked at the answer key I think it's something like 30-40% (don't remember enough of my answers to be sure).


The hardest for me were the really weird abstract ones. I feel like I have nothing to go on.


the hardest ones for me were the ones created by humans with digital art tools.

the "high art" style ones were mostly more obvious

I got really fooled by one which was created by a human but in what I would call a fantasy-architecture style *and* definitely was composed with software, not drawn or painted by hand. And to be honest that was based on the style, not the details. Zooming in on the ones I got wrong on the first attempt (on my phone at 0% zoom), there are only 2-3 where it's still hard to tell.

I'm happier about the last few because I was wrong on one but it was the one I wasn't really sure about. Actually zooming in and looking at details *usually* makes it obvious.

And I put myself as not very familiar with art, but tbh I'm probably way more familiar with art than most people. I've been to multiple art museums in my life and, mostly tangential to my interest in history, am somewhat familiar with the broad strokes of (western) art history. Like one painting was easy for me because *I've seen the painting before*... online somewhere, probably the wikipedia article about it. Still my brain functions well enough that it instantly went "oh that's a real thing I've seen before"


Then why did you lie about how familiar with art you are


well I've been to one art museum in the last 4 years and I have a degree.. in chemistry. Ofc I'm a decade+ SSC reader so I'm probably more informed about this specific topic from past exposure to discussions


‘Twas harder at moments than expected but I enjoyed this!


So glad you have done this. I look forward to Gary Marcus using the result as evidence that AI is overhyped and will never mean anything.


What confuses me about Marcus (and I’m generally a big fan) is that he also argues AI is very dangerous. Strange combo: that it’s unimpressive yet dangerous.


Steel manning:

This makes sense if you assume the danger is in its bias and unpredictability.

The more likely answer is that he needs to be in the spotlight as the contrarian but attach himself to the safety train as well.


It is the latter not the former, since he also keeps dismissing any and all evidence of capabilities


I share his intuition that there are limitations in the technology as it currently exists; a priori, all technologies have limitations. I also share his intuition that the solution has something to do with introducing explicit logic. This study I saw yesterday does a great job laying out the issue, especially the part where they introduce red herrings:

https://arstechnica.com/ai/2024/10/llms-cant-perform-genuine-logical-reasoning-apple-researchers-suggest/
