This week we talk about Studio Ghibli, Andrej Karpathy, and OpenAI.
We also discuss code abstraction, economic repercussions, and DOGE.
Recommended Book: How To Know a Person (https://amzn.to/4c9f7Ro) by David Brooks
Transcript
In late November of 2022, OpenAI released a demo version of a product they didn’t think would have much potential, because it was kind of buggy and not very impressive compared to the other things they were working on at the time. This product was a chatbot interface for a generative AI model they had been refining, called ChatGPT.
This was basically just a chatbot that users could interact with, as if they were texting another human being. And the results were good enough—both in the sense that the bot seemed kinda sorta human-like, and in the sense that it could generate convincing-seeming text on all sorts of subjects—that people went absolutely gaga over it, and the company went full-bore on this category of products, dropping an enterprise version in August of the following year and a search engine powered by the same general model in October of 2024. By 2025, upgraded versions of their core models were widely available, alongside paid, enhanced tiers for those who wanted higher-level processing behind the scenes: those upgraded tiers basically tapping a model with more feedstock, a larger training library, and more intensive and refined training, but also, in some cases, a model that thinks longer, that can reach out and use the internet to research stuff it doesn’t already know, and that, increasingly, can produce other media, like images and videos.
During that time, this industry has absolutely exploded, and while OpenAI is generally considered to be one of the top dogs in this space, still, they’ve got enthusiastic and well-funded competition from pretty much everyone in the big tech world, like Google and Amazon and Meta, while also facing upstart competitors like Anthropic and Perplexity, alongside burgeoning Chinese competitors, like Deepseek, and established Chinese tech giants like Tencent and Baidu.
It’s been somewhat boggling watching this space develop. While there’s a chance some of the valuations of AI-oriented companies are overblown, potentially leading to a correction or the popping of a valuation bubble at some point in the next few years, the underlying tech and its output really have been iterating rapidly: the state of the art in generative AI, in particular, is producing just staggeringly complex and convincing images, videos, audio, and text, and the lower-tier stuff, which is available to anyone who wants it, for free, is also valuable and usable for all sorts of purposes.
Just recently, at the tail-end of March 2025, OpenAI announced new multimodal capabilities for its GPT-4o language model, which basically means this model, which could previously only generate text, can now produce images, as well.
And the model has been lauded as a sort of sea change in the industry, allowing users to produce remarkable photorealistic images just by prompting the AI—telling it what you want, basically—including usually accurate, high-quality text within those images, something that has been a problem for most image models up till this point. It also boasts the capacity to adjust existing images in all sorts of ways.
Case-in-point, it’s possible to use this feature to take a photo of your family on vacation and have it rendered in the style of a Studio Ghibli cartoon; Studio Ghibli being the Japanese animation studio behind legendary films like My Neighbor Totoro, Spirited Away, and Princess Mononoke, among others.
This is partly the result of better capabilities in this model, compared to its precursors, but it’s also the result of OpenAI loosening its policies to allow folks to prompt these models in this way; previously they disallowed this sort of prompt, due to copyright concerns. And the implications here are interesting, as this suggests the company is now comfortable showing that their models have been trained on these films, which has all sorts of potential copyright implications, depending on how pending court cases turn out, but also that they’re no longer being as precious about potential scandals related to how their models are used.
It’s possible to apply all sorts of distinctive styles to existing images, then, including those of South Park and The Simpsons, but Studio Ghibli’s style has become a meme since this new capability was deployed, and users have applied it to images ranging from existing memes to their own self-portrait avatars, to things like the planes crashing into the Twin Towers on 9/11, JFK’s assassination, and famous mass shootings and other murders.
It’s also worth noting that the co-founder of Studio Ghibli, Hayao Miyazaki, has called AI-generated artwork “an insult to life itself.” That so many people are using this kind of AI-generated filter on these images is a jarring sort of celebration, then, as the person behind that style probably wouldn’t appreciate it;...