YouTuber builds Google’s staged Gemini demo in real-time with GPT-4 Vision

YouTuber “Greg Technology” has recreated Google’s discredited multimodal Gemini AI demo using OpenAI’s GPT-4 Vision, this time with voice and vision prompts handled in real time. The original Gemini demo video was criticized for being staged: it was not recorded in real time, and its voice interactions were dubbed in afterward. In response, Greg Technology released a video using GPT-4V in which he discusses a drawing, asks about emoticons, and has the AI identify a game. It’s not as polished as Google’s demo, of course, but it’s real-time and real. Greg has published his demo code on GitHub.

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
