It's safe to say OpenAI is in crisis right now. Between abruptly cancelling their TikTok clone for AI-generated videos (and consequently losing out on $1B from Disney) and shelving plans for an “erotic” version of ChatGPT, it seems our promised future of hyper-personalized, AI-generated content consumption is dead, or at least on life support.

Days after this tragic passing, I gather you here not to eulogize this app I never used and never wanted to use, but to reflect on what to me is the central question of the idea that brought it to life in the first place: Who the fuck would want that?

To answer this, let's think of something (relatively) smaller: the app that started the bubble, the chatbot. Who would want a chatbot? The first answer that comes to mind seems pretty obvious to me: managerial, C-suite types. It's the perfect tool for people whose job is ordering other people around! From sending e-mails to performing simple computing tasks like filling out basic documents, AI seems almost as if it were designed to perfectly suit their needs. Even the method of interaction, the chat box, mirrors the manager's main method of interaction with their employees: a Slack or Microsoft Teams message. It's easy to prove this thesis: one cursory look at LinkedIn — the one place where AI’s sycophantic, bullet-point heavy, buzzword-ridden, overly verbose prose has flourished — shows how much the acronym-obsessed class has felt seen by this technology.

It simply makes sense that these people would think everyone in the world would want unchallenging cut-to-measure entertainment at all times, created through a probabilistic system you operate one or two sentences at a time. It fits perfectly with how I see the corporate class. But there's another group of people whose embrace of AI chatbots, specifically “vibecoding”1, has baffled me, and I think their behavior serves as the key to understanding the idea of an AI media future: Productivity nerds.

Browse any subreddit for currently buzzy productivity software (Obsidian and Raycast come to mind, but even an e-book organization app had some drama around AI usage recently) and you’ll find a dozen posts every day about clearly “vibecoded” programs or extensions. As a productivity nerd myself, the premise of using chatbots to achieve these goals seems baffling. Anyone who has ever had the displeasure of either working customer service or texting an ex-lover is familiar with how easily the humble chat message lends itself to misinterpretation and signal loss. Typo’d words and short sentences arranged haphazardly into a small text box that covers at most 20% of your screen are about as lossy a communication medium as possible in the modern world, except for maybe faulty phone lines or 1-bar Wi-Fi. Yet people with the exact same strain of neuroticism as me seem to love using it2.

So I've been trying it. I can't go too far into the specifics, but here are the basics of my job at the moment: I have a lot of hours-long audio recordings that I need to transcribe; text and notes and spreadsheets with lots of information I need to properly format, categorize and visualize; and, on top of that, one or two thousand images and videos that I also need to categorize. It is maddening, often brainless work involving repeating the same tasks over and over again. But, hey, it's a junior job in my field, and that's a straight-up luxury in today's market. I’m not complaining.

My first issue in this job was transcribing audio. None of the tools I was using3 were good enough at handling audio in a few different dialects of Portuguese along with complex terms like brand names, and the one I had found that was pretty okay at it, Google's Gemini, was constantly throwing fits: rejecting uploads for exceeding file-size limits, not transcribing the whole thing, or, worst of all, summarizing the recording instead of transcribing it word for word. And this was using the “Google AI Pro” subscription I got for free with my university's email address!

In a last-ditch effort, I went to Google's “AI Studio”, described my use case and my needs to it, and told it to make me a website that'd do it for me. Surprisingly enough, it worked perfectly. Speaker attribution, complex dialects and technical terms, everything. I wanted to know how it worked, so I looked inside what it had coded for me, and what I saw drove me insane. It was just prompting Gemini through its API. The prompts were essentially the same as what I had spent literal hours doing on Gemini's website, except, when using the API, the tool had no limits that'd stop the response half-way. I was using the exact same Google account, there were no extra charges on my card, and my subscription covered it.

Gemini's command line interface.

So I started trying to find other workarounds. I downloaded Gemini's command line client. It is ugly as sin, harder to read and it glitches and eats up my computer's RAM for some godforsaken reason, but it does the things the web version of the exact same AI model, connected to the exact same account, refuses to do. What followed was a honeymoon period where I fell down the rabbit hole of AI programs and briefly embodied the lifestyle of a guy who calls Twitter “X” and loves going off about ‘agentic governance’ or whatever. I futzed around with it, it organized some folders, cleaned up a few spreadsheets, I learned to use “Skills” to do things like add events from a DM to my calendar… it was nice. Hell, I’ll even admit that doing these things, especially through a terminal, made me feel kind of cool.4

But as that excitement gave way to boredom (because, fundamentally, there just isn't much in my day-to-day life I want or need automated), I realized that the way I was using these tools was simply to do away with things that had annoyed me about using my computer, much like I had used Gemini to code a website to do away with the things that annoyed me about Gemini. This is, to me, the essence of “vibecoding” in nerd/tech enthusiast circles: remaking the computer in your image.

Browse the Twitter account for Raycast's newly announced “Glaze” app (a marketplace for vibecoded apps) and you’ll see people celebrating making apps for things like browsing a long list of emails or visualizing a large folder. I was listening to The Verge's David Pierce interviewing an extremely monotone Anthropic employee, and the one part where he smiled was him getting giddy with excitement about not having to navigate a horribly designed government website and just telling an AI to do it (he also followed it up with an anecdote about using it to file taxes. Don't do that, please, no matter how boring it is).

Glaze's website is drenched in try-hard vibes. It wants so badly to be cool, I love it.

These are all simple, often annoying tasks, and sure, you could just keep doing the boring thing, or, even better, learn how to actually code and spend time coding it, but why would you do that when all you want to do is make a fancy spreadsheet that's easier to filter? That is the ideology of the productivity nerd. Spend a little extra time hitting your head against the wall (talk to a kind of stupid robot) and get closer to your goal: the One Tool to Rule them All, which perfectly visualizes, organizes, and automates all the things you want done and lets you get on with doing your work as efficiently as possible.

This singular focus on a goal with disregard for the inner mechanics of a process is the fundamental belief of Artificial Intelligence as a Product. It is the consequence of decades of user interface design in service of the slogan “It Just Works”, meant to alienate the consumer from the very tool they're using. Productivity nerds, in our endless hunger for chasing away software pet peeves, found in AI a tool that achieves the goal of “Just Working” (or, at least, working enough), like a student would find in it a tool that achieves the goal of handing in an assignment, like an investor would find in it a tool that achieves the goal of having produced ‘content’ to be consumed on their ‘platform’. But computers run on code, schools (supposedly) run on knowledge, and culture runs on culture.

I can be very happy with my little terminal window that obeys my every command, much like how I’m very happy with a search engine seemingly effortlessly fetching the search results I asked for, but, no matter how cool I feel, I simply don't get how it operates, and it doesn't want me to. An app like Sora abstracted away the process of cultural production to merely having an idea and having it be executed, and that is an ideal only an extremely alienated consumer or investor could have. It failed (partly5) because people, in general, are conscious enough to think critically about the processes that lead to the creation of cultural objects — even if that understanding starts and ends at recognizing the appearance of “someone made this”, it undoubtedly exists. This is no longer the case with how we see technology and much of the rest of the infrastructure that rules our lives. Vibecoding feels great because we love to catch a vibe, but it fails because a vibe is simply not enough to go on. You can't easily maintain or expand vibecoded software, because you don't fully understand its machinations. Though the appeal of cool vibes is still undeniable, maintaining a perpetual culture machine failed long-term because the broader public understands the machinations of the culture industries enough to have some attachment to them. I hope it’ll stay that way.

1 “Vibecoding” here meaning coding mostly through AI prompting, not tools like code completion and the like. This is the kind of thing where even I, a non-programmer, can look at the folder structure on GitHub and immediately tell something's up.

2 Though, as with anything AI, there's also vocal opposition to these tools whenever they're mentioned and employed. This essay won't go into that because how these tools are received doesn't matter to me here. I’m interested in why a large number of individuals belonging to this demographic could love them, not in why others do not.

3 In case you're wondering what I tried: several locally run Whisper-Cpp models, including ones trained for Portuguese; my Google Pixel's built-in audio transcriber; and Riverside. The latter I gave up on because it was paid and not appreciably better than the rest.

4 This is maybe the most embarrassing thing I've ever admitted to online.

5 I've heard it argued that Sora could've also shut down due to prohibitive costs, or possibly even due to some kind of lawsuit. We don't know yet, so take this as my thesis on AI “content creation” as a whole.

Thank you for reading noReturn! It is now a biweekly newsletter (turns out writing is hard), releasing roughly every other Friday. Follow for more musings on everything Media™ related. Be it tech, film, games, music… whatever really.

