14 May 2024

Back when OpenAI announced “multimodal” ChatGPT, I felt their language was deliberately vague enough that it could have been several separate systems working together — e.g., a discrete image recognizer telling the LLM what’s in a picture.

They’ve finally confirmed that was the case, because *now* GPT-4o is a single, actually omnimodal neural network. And I find the idea that this works, and works so well and so fast, really impressive and terrifying (all over again).

> Major ChatGPT-4o update allows audio-video talks with an “emotional” AI chatbot
>
> New GPT-4o model can sing a bedtime story, detect facial expressions, read emotions.

Want to know when I post new content to my blog? It's as simple as registering for free with an RSS aggregator (Feedly, NewsBlur, Inoreader, …) and adding this blog to your feeds (or, if you want to subscribe to all my topics). We don't need newsletters, and we don't need Twitter; RSS still exists.

Legal information: This blog is hosted by OVH, 2 rue Kellermann, 59100 Roubaix, France.

Personal data about this blog's readers is neither used nor transmitted to third parties. Comment authors can request deletion of their comments by e-mail.

All contents © the author or quoted under fair use.