Gemini 1.5 versus GPT-4o, Deepfake Detection, and AI Game Playing

Does Gemini 1.5 outperform GPT-4o? Can GPT-generated deepfakes be detected? How do businesses use these innovations? Find out here.

Welcome (back) to AI Minds, a newsletter about the brainy and sometimes zany world of AI, brought to you by the Deepgram editorial team.

In this edition:

  • 🤯 Why Gemini 1.5 is unlike any other AI!

  • 🎦 Webinar: The future of conversational agents with CEOs of Deepgram and Daily

  • ✈️ Previews of Copilot powered by GPT-4o

  • 🏆 GPT-4o vs. Gemini 1.5 Hackathon Competition and Demos!

  • 📲 New AI App Feature: Dover

  • 🔍 New Deepfake Detection Benchmark

  • 🎞️ Long Video Question-Answering Benchmark and Dataset

  • 💼 How Businesses today are Adopting AI: A Full Guide

  • ♟️ How our own AI Models learned to beat us at our own game

  • 🎙️ AI Minds Podcast with SuperWhisper’s Neil Chudleigh!

Thanks for letting us crash your inbox; let’s party. 🎉

Deepgram just released a brand-new text-to-speech model called Aura! Check it out here. 🥳

🎥 Why Gemini 1.5 is unlike any other AI…

In this video, Matthew Berman fully tests Google’s Gemini 1.5, pushing its massive context window to its limits. Does the model live up to the hype? Find out here!

💻 New Webinar: The Future of Conversational Agents

Discover the future of Voice AI agents with industry leaders Scott Stephenson, CEO of Deepgram, and Kwindla Hultman Kramer, CEO of Daily, in an upcoming panel conversation hosted by VUX World. RSVP today!

Where: Online

When: May 30th 9am PDT | 12pm EDT | 5pm BST | 6pm CEST

Sign up here!

🐝 Underrated GPT-4o and Gemini 1.5 Tweets

📲 New AI App Feature: Dover

Dover is a comprehensive suite of recruitment tools and services designed to streamline the hiring process, making it easier, faster, and more efficient. With over 1,200 companies relying on Dover's expertise, it has become a go-to resource for businesses looking to scale their teams with the right people.

🧑‍🔬 New Benchmarks: Deepfake detection and Long Video Q/A

CinePile: A Long Video Question Answering Dataset and Benchmark - This paper details an innovative approach for creating a question-answer dataset comprising 305,000 multiple-choice questions (MCQs) that cover various visual and multimodal aspects, including temporal comprehension, understanding human-object interactions, and reasoning about events or actions within a scene.

EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark - This paper introduces EVDA, a benchmark for evaluating continual learning methods in deepfake audio detection. EVDA includes classic datasets from the Anti-Spoofing Voice series, Chinese fake audio detection series, and newly generated deepfake audio from models like GPT-4 and GPT-4o.

🏇 How Businesses are Adopting AI & How our models beat us at our own game 

How Businesses are Adopting AI: Full Guide - Businesses can use generative AI for content creation, AI-powered sales and marketing platforms, workflow automation tools, AI-driven customer service chatbots, and AI-enhanced meeting assistants. Effective AI adoption requires high-quality data, AI expertise, ethical and regulatory compliance, and constant adaptation.

How our inventions beat us at our own games: AI Game Strategies - AI techniques used in game-playing include Monte Carlo Tree Search, genetic algorithms, neural networks, and reinforcement learning. AI game-playing agents control non-player characters (NPCs) and make strategic decisions to achieve goals. Challenges and limitations of AI in gaming include lack of creativity, ethical concerns, limited adaptability, and computational resource requirements.

🎙️ AI Minds Podcast! 

In this episode, we feature SuperWhisper and its creator, Neil Chudleigh. SuperWhisper is high-quality voice-to-text software for Mac and iOS that has revolutionized the way we create and digest content.

We find out how the software works around the technical and user-experience challenges of running models offline without overloading consumer devices. As the conversation works through the complexities of managing edge computing and the issues that can come with it, the episode explains how SuperWhisper provides a flexible platform for different professions and for individuals with varied workflows.