AI Minds #5: Demystifying AI Agents & Copilots

Everything you need to know about the nuanced world of AI Agents and Copilots.

Welcome (back) to AI Minds, a newsletter about the brainy and sometimes zany world of AI, brought to you by the Deepgram editorial team.

In each edition, we’ll highlight a special topic from the rapidly-evolving field of Language AI. We’ll also touch on timely issues the AI community is buzzing about. As a heads-up, you might want to brace yourself for the occasional spicy take or two along the way. 🌶️

Thanks for letting us crash your inbox; let’s party. 🎉

⚙️ Agents: Putting the “AI” in “Assist”

Are AI models supposed to be assistants? Tools? Replacements? Well, the fate of AI and its role in our society at large is yet to be determined. However, what we do know is that these models perform rather impressively at various tasks, with or without human help.

Here’s a quick-and-fun article we wrote on the topic.

LLM Agents: When Large Language Models Do Stuff For You by Brad Nikkel — How do crows, Minecraft, and AI agents all relate? Well, in a single word: tools. But okay, let’s elaborate. In this article, Brad discusses AI agents at length. And these agents’ relation to crows, Minecraft, and even The Sims goes far deeper than many would initially anticipate. Long story short, not only are AI agents fantastic tools for you to use, but they can also create tools themselves. Curious to see how? Check out the article!

🤖 Models, Agents, and Copilots: The Three AI Musketeers

Natural language processing (NLP) is a discipline in its own right, but it also serves as a conceptual umbrella for subdisciplines like natural language understanding (NLU) and natural language generation (NLG). The relationship between AI agents and copilots is similar. “AI agents” use an AI model to complete a task on behalf of their human user. It’s how that task-completion is, well, completed that distinguishes agents in general from a specific subset of agents that have come to be known as copilots.

Now, to be absolutely clear, it’s not like there’s a canonical definition of what separates “copilots” from “agents,” but here’s the conceptual framework that we’ll use going forward:

  • AI Models are mathematical and architectural representations of a certain knowledge task, like generating text outputs based on text inputs, or classifying images as hotdogs versus not-hotdogs. 🌭

  • Agents interact with an environment (be it an actual IRL physical space, or an ocean of pure text) autonomously to accomplish a knowledge task. Agents use Models to understand and interact with their environment.

  • Copilots are agents that are specifically built to work alongside a human counterpart in a supportive or advisory role. For example, the AI code-writers GitHub Copilot and StarCoder are considered copilots, since they can only write code based on an existing program that their human user has already written. Meanwhile, the virtual assistants Alexa and Siri are not considered copilots, since they don’t need human intervention to answer the question “What’s the weather today?” or to play Despacito. ☀️
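To make the framework above a little more concrete, here is a minimal Python sketch of the three roles. Everything in it is an assumption made for illustration: `toy_model`, `Agent`, and `Copilot` are hypothetical names, and the "model" is just a function that upper-cases text rather than a real AI model. The only point is the shape of the relationship: an agent acts on its own, while a copilot proposes and lets the human decide.

```python
from dataclasses import dataclass
from typing import Callable

# In this sketch a "model" is simply a function from text to text.
Model = Callable[[str], str]


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model: upper-cases the prompt."""
    return prompt.upper()


@dataclass
class Agent:
    """Uses a model to carry out a task without asking anyone first."""
    model: Model

    def run(self, task: str) -> str:
        return self.model(task)


@dataclass
class Copilot(Agent):
    """An agent that only proposes; the human decides whether to accept."""

    def suggest(self, human_draft: str, accept: Callable[[str], bool]) -> str:
        proposal = self.model(human_draft)
        # A rejected proposal leaves the human's draft untouched.
        return proposal if accept(proposal) else human_draft
```

In use, `Agent(toy_model).run("play music")` returns a result directly, whereas `Copilot(toy_model).suggest(draft, accept)` hands back the human's own draft whenever `accept` says no.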

🛩️ Who should be your (coding) copilot?

Ever since the advent of large language models (LLMs), the tech world has been keen to find out if AI models can themselves write clear, functioning code. And it turns out that many AI Copilots can! (To varying degrees of success.)

In the following video, ML Developer Advocate Jose Francisco tests four well-known AI programmers (GitHub Copilot, StarCoder, ChatGPT, and Gorilla) on their ability to write code. Essentially, he conducts a coding interview with them to test their ability to write pure functions and to use APIs. At the end, he gives his own personal recommendation on which AI tools (or combination of AI tools) to use to maximize productivity!

Or, if you’d rather read than watch this experiment in action, check out this article.

Nevertheless, some remain skeptical about copilots’ abilities to perform practically in the tech world, especially considering the impacts that markets have on emerging technology products.

🚀AI Agents 101

AI agents, what are they? Who are they? To put it plainly, they’re the coolest kids at the function. Often referred to as intelligent agents, AI agents are autonomous systems that perceive their surroundings and act to achieve specific objectives. These agents form the backbone of many AI systems, playing a pivotal role in the broader field of artificial intelligence. They have a thankless behind-the-scenes job, but they are making tidal waves in the artificial intelligence universe.

AI agents use sensors to gather data from their environment and then perform actions based on that data. Think of them as digital entities with the ability to "sense" and "act" in their digital environment.

So, what REALLY makes them the cool kids? Well, the primary goal of an AI agent is to process information, make decisions, and execute actions autonomously. They are designed to operate without constant human intervention, making them invaluable in scenarios where rapid, consistent, and objective responses are required. From chatbots that answer customer queries to robots that perform surgeries, the purpose of AI agents spans a vast range of applications.
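The sense, decide, act cycle described above can be sketched in a few lines of Python. This is a toy example under stated assumptions: the "environment" is a plain dictionary holding a temperature, the agent is a thermostat, and the function names (`sense`, `decide`, `act`, `run_agent`) and the target of 21 degrees are all made up for illustration.

```python
def sense(environment: dict) -> float:
    """Read the one thing this agent can perceive: the temperature."""
    return environment["temperature"]


def decide(temperature: float, target: float = 21.0) -> str:
    """Pick an action by comparing the reading with a target range."""
    if temperature < target - 1:
        return "heat"
    if temperature > target + 1:
        return "cool"
    return "idle"


def act(environment: dict, action: str) -> None:
    """Apply the chosen action to the environment."""
    change = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
    environment["temperature"] += change


def run_agent(environment: dict, steps: int = 10) -> list:
    """Repeat the sense -> decide -> act cycle with no human in the loop."""
    actions = []
    for _ in range(steps):
        action = decide(sense(environment))
        act(environment, action)
        actions.append(action)
    return actions
```

Starting from `{"temperature": 17.0}`, the agent heats until the reading is within one degree of the target and then idles. Real agents swap the hand-written `decide` rule for a model, but the loop around it looks much the same.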

What’s cool is that you can even build your own AI Agent as a fun coding exercise. We walk through one way to do so in Python here. While the agents built by companies will likely be more robust and scalable than the ones individuals build as pet-projects, the fact remains that these agents are at our fingertips, readily available for the masses to use.

An Agent’s Role in the AI Ecosystem: AI agents are not just standalone entities but integral components of the larger AI ecosystem. They contribute significantly to AI, especially in the realm of machine learning. Here's how:

  • Learning from Data: Unlike traditional software that follows explicit, step-by-step instructions, AI agents can learn from data. This ability to learn and adapt from experiences sets them apart and allows them to improve over time.

  • Decision Making: AI agents can evaluate vast amounts of data quickly and make decisions based on patterns and insights that might be imperceptible to humans.

  • Interactivity: AI agents can interact with users or other systems, providing dynamic responses based on the situation. This makes them ideal for customer service, gaming, and many other interactive applications.

As technology advances, AI agents' capabilities and roles are expected to grow, further cementing their importance in the world of artificial intelligence.

🤯 Not-So-Secret Agents: AI Skillsets and Capabilities

AI agents have evolved to perform a myriad of tasks, from simple data processing to complex decision-making.

  • 🔊 Voice Assistants: AI agents power voice assistants like Siri, Alexa, and Google Assistant. They can understand spoken language, answer queries, set reminders, and even control smart home devices.

  • 🤖 Chatbots: Businesses use AI-driven chatbots for customer support, sales, and even mental health counseling. These bots can understand user queries, provide relevant responses, and learn from interactions to improve over time.

  • 📈 Sentiment Analysis: AI agents can analyze text from product reviews, social media, or feedback forms to determine public sentiment, helping businesses understand customer feelings and improve products or services.

  • 🌍 Translation: AI-driven translation tools can instantly translate between multiple languages, breaking down communication barriers in our globalized world.

  • 💻️ Coding: As mentioned with the copilots above, certain agents can autocomplete human-written input to impressive extents. In fact, even if you write an in-line comment describing what you want a particular function to do, AI Copilots like StarCoder will fully write that function. (And the result will oftentimes be bug-free!)

  • ✏️ Content Generation: AI agents can generate articles, marketing copy, or even poetry. Tools like GPT-3 can produce human-like text based on prompts.

But alright, this bullet-point list is rather cursory. Let’s delve into the details with some fun examples, shall we?

🔈️ Voice Assistants: In today's digital age, voice assistants such as Siri, Alexa, and Google Assistant have become reliable side-kicks for many users. These AI agents are designed to comprehend spoken language, facilitating a variety of tasks such as answering questions, setting reminders, or even interfacing with smart home devices. Because voice assistants are both adaptable and intuitive, they make for a perfect hands-free partner in crime. Some developers have even gone so far as to build their own voice assistants as a fun, personal project.

💬 Chatbots: The business landscape has experienced a meaningful transformation with the integration of AI-driven chatbots. Whether it’s customer support, sales, or even mental health, these conversational bots are capable of understanding user queries and offering almost human replies. What's extra neat is their ability to learn from interactions, continually honing their ability to assist and inform. In the 1960s, we had a simple, pattern-matching chatbot named ELIZA, which acted as a digital therapist. In the decades since, chatbots have become far more robust. Now, from Houston, to Bard, to Llama-2-chat, we have a plethora of chatbots to choose from.
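To give a feel for how little machinery an ELIZA-style bot needs, here is a rough sketch using Python's built-in `re` module. The rules below are invented for this example and are not taken from ELIZA's original script; `RULES`, `DEFAULT_REPLY`, and `reply` are hypothetical names.

```python
import re

# Each rule pairs a pattern with a reply template. The rules are made up
# for illustration; they are not ELIZA's actual script.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bbecause\b", re.IGNORECASE), "Is that the real reason?"),
]
DEFAULT_REPLY = "Please tell me more."


def reply(message: str) -> str:
    """Return the reply for the first rule that matches the message."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            # Reuse the user's own words, minus any trailing punctuation.
            captured = match.group(1).rstrip(".!?") if match.groups() else ""
            return template.format(captured)
    return DEFAULT_REPLY
```

For example, `reply("I need a vacation.")` turns the user's words back into the question "Why do you need a vacation?". There is no understanding here, only pattern matching, which is exactly what separates this kind of bot from today's LLM-based chatbots.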

🥰 Sentiment Analysis: Extracting insights from text has become more efficient and accurate with the advent of sentiment analysis. By assessing public sentiment through product reviews, social media chit-chat, or feedback forms, AI agents offer businesses a lucid understanding of their customers' feelings. This, in turn, gives companies an opportunity to better their offerings so they resonate with their target audience. Curious to learn more? Check out this paper to see how Deep Learning has helped us break through the challenges of Natural Language Processing to further advance the accuracy and efficiency of Sentiment Analysis at large.
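As a rough illustration of the basic idea, here is a toy word-counting sentiment scorer in Python. To be clear about the assumptions: this is a lexicon-based approach, not the deep learning approach the paper above discusses, and the word lists and function names are made up for the example.

```python
import re

# Tiny, hand-picked word lists; a real lexicon would be far larger.
POSITIVE = {"love", "great", "excellent", "good", "happy"}
NEGATIVE = {"hate", "terrible", "awful", "bad", "broken"}


def sentiment_score(text: str) -> int:
    """Count positive words minus negative words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((word in POSITIVE) - (word in NEGATIVE) for word in words)


def label(text: str) -> str:
    """Map the score to a coarse positive / negative / neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A scorer like this misses sarcasm, negation ("not good"), and context, which is a large part of why trained models have taken over the task.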

🌐 Translation: In our increasingly interconnected world, the need for clear and immediate communication is paramount. AI-driven translation tools have risen to this challenge, offering instantaneous translations across numerous languages. By erasing linguistic barriers, these tools foster a deeper sense of global community and understanding. Today, companies like XRAI Glass offer real-time translation of spoken conversation—whether you need to translate a large lecture or a simple, one-on-one conversation. Furthermore, WIRED even made a video pitting human interpreters against AI, putting absolutely everyone to the test.

⌨️ Coding: While we compare and contrast the famous AI coders in the video mentioned above, they all live in the same neighborhood of “Impressive Feats of Technology.” Yes, we put them in competition with one another. However, they each can stand alone as their own incredible advancement in the world of generative AI.
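To picture the comment-to-function workflow, here is a hand-written Python illustration. It is not output from GitHub Copilot, StarCoder, or any other tool; the function body is simply the kind of completion one might expect after typing the comment and the signature.

```python
# The developer types only this comment and the function signature:
# Return True if `text` reads the same forwards and backwards,
# ignoring case, spaces, and punctuation.
def is_palindrome(text: str) -> bool:
    # The body below stands in for a copilot's suggestion
    # (written by hand for this example, not generated by a model).
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]
```

Whatever a copilot proposes, the human still reviews it, for instance by checking that `is_palindrome("A man, a plan, a canal: Panama")` is true and `is_palindrome("copilot")` is false.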

📸 Content Generation: Content creation has been revolutionized by AI's ability to generate diverse forms of text. Whether it's crafting informative articles, influential marketing copy, or even artistic poetry, AI tools, like GPT-3, have showcased an ability to produce text that closely mirrors human-like quality and nuance, expanding what's possible in the world of content and content production. At Deepgram, we’ve used AI Agents to assist in our content creation, especially on our YouTube channel. To see how, check out this video!

🖼️ The Red Carpet of AI Frameworks

In this section, we list the top AI Frameworks today and explain why they’re so incredible. So buckle up! We’ve got some mind-blowing tools to show you. 🤯 

🚗 AutoGPT

AutoGPT is an advanced autonomous AI assistant tailored for developers and tech enthusiasts. It goes beyond mere text generation, offering capabilities that simplify system design and development processes.

Features:

  • Intuitive Interface: Designed with both developers and end-users in mind, AutoGPT provides an interface that's easy to navigate, making complex tasks more manageable.

  • Adaptive Learning: AutoGPT can learn from user interactions, improving its responses and suggestions over time.

  • Integration Capabilities: It can be seamlessly integrated into various platforms, enhancing its utility across different tech ecosystems.

💬 MetaGPT

MetaGPT stands out as a collaborative AI tool, primarily aimed at the software development community. It's designed to assist developers in coding, debugging, and even brainstorming sessions, making the software development lifecycle smoother.

Features:

  • Real-time Collaboration: Developers can work in tandem with MetaGPT, receiving instant feedback and suggestions.

  • Code Optimization: MetaGPT can suggest code optimizations, ensuring efficient and clean code.

  • Error Detection: It can quickly detect and highlight potential errors, reducing the debugging time.

🦸‍♀️ SuperAGI

SuperAGI is an open-source framework that stands as a beacon for those interested in the development and deployment of autonomous agents. Its open-source nature means a community-driven approach, with contributions from developers worldwide.

Features:

  • Modular Design: SuperAGI offers a modular design, allowing developers to plug in different components as needed.

  • Community Support: Being open-source, it boasts a robust community that continuously contributes to its improvement.

  • Versatility: Designed to cater to various applications, from gaming agents to industrial automation, SuperAGI offers a wide range of tools and functionalities.

🕊️ Avoiding Skynet, Ex-Machina, and an AI doomsday

AI is powerful. And with great power comes great responsibility. While the dystopia illustrated in the Terminator series or even in Pixar’s Wall-E is not par for the course, there are indeed some considerations that developers, researchers, and innovators have to confront as our technology becomes more powerful.

So let’s be more specific. While AI agents are revolutionizing various sectors, from healthcare to finance, they also bring forth a myriad of challenges, ranging from ethical dilemmas to concerns about data misuse. Addressing these challenges is crucial for the sustainable and responsible growth of AI technologies. The following is a list of the frontier challenges in AI agent development.

  • Chatting Ethics in the AI World: AI agents are popping up everywhere, right? From helping doctors make calls on treatments to offering up financial tips or even weighing in on legal matters. With all this power, it's natural to wonder about the ethical side of things. Like, what if these AIs start picking up some of our not-so-great biases from the data they consume? It's not just about them doing their thing - we've got to know how those decisions are being made. This whole idea of "explainability"? It's not just tech jargon – it's imperative, especially when the choices they make can impact lives.

  • Diving into Data Privacy: Our AI buddies thrive on a boatload of data, especially those chatting away with NLP or crunching numbers with machine learning. Sometimes, this data gets… eh, a bit personal. But with great data comes great responsibility, right? As more info piles up, there's a risk that bad actors will see AI systems as a treasure chest to plunder. But hey, let's not forget about asking permission! Making sure we're all aware of how our data's being used is basically the biggest debate of all time, and AI isn’t an exception.

  • Treading Carefully with AI: AI technology, with all its brilliance, isn't without its darker side. While these tools can make our lives a lot easier, they could lead to unsavory problems in the wrong hands. For instance, the very “brains” that make AI awesome can be manipulated into powering stealthy cyberattacks, raising the stakes in digital security. It's all about balance and ensuring that the potential for misuse is constantly kept in check.

  • Automation in Doses: There's no denying that AI is taking the reins in many industries, helping businesses streamline their processes. However, leaning too heavily on automation comes with its own set of challenges. For starters, going overboard could mean fewer jobs in areas that become entirely AI-driven. Plus, let's face it: nothing can truly replace the warmth and understanding of a human touch in fields like healthcare. Ensuring that AI complements rather than replaces human roles is key to preserving both efficiency and empathy.

  • Scaling Up with AI: AI agents are in high demand, and as more folks want in on the action, it's crucial these tools can grow and flex with different needs. But here's the catch: powerful AI models are like the high-performance cars of the tech world, guzzling loads of computational resources. And, just like we need to keep up with the latest trends, AI has to stay on its toes, constantly tweaking itself based on fresh data, especially when the landscape is always shifting. Making sure our AI can both scale up and stay agile is the name of the game.

🧑‍💻 Other Bits and Bytes

  • 🦸 Nearly all of AI’s superpowers come from Transformers - Much like Sauron’s One Ring or the Infinity Stones, the Transformer architecture, which powers numerous AI models, contains powers that even its inventors couldn’t foresee. In fact, as Financial Times journalist Madhumita Murgia claims: Generative AI exists because of the Transformer. To see how, check out her article and the incredible visualization it contains. This article contains what is undoubtedly one of the best illustrations of Transformers and their inner workings: how they convert natural language and large amounts of text data into numbers that computers can inherently understand, parse, and play with.

  • 🌏️ What if there was a society of nothing but AI agents? - Sure, AI agents can come up with incredible outputs, from poetry to playwriting… but what happens when you create a society of AI agents, all interacting with one another? Well, in Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View, Zhang, Xu, and Deng work together to fabricate a society with a population of nothing but various LLMs. And after simulating various scenarios, they find that “LLM agents manifest human-like social behaviors, such as conformity or majority rule, mirroring foundational Social Psychology theories.” Want to see exactly how these AI agents talk to each other? Check out the paper!

  • 🧪 AI agents are also writing recipes for drugs - And now for the Chemistry perspective! Researchers from Carnegie Mellon University present Emergent autonomous scientific research capabilities of large language models. The results from this paper are indisputably impressive because of one simple fact: Chemistry is hard. Nevertheless, Boiko, MacKnight, and Gomes reveal that AI agents can indeed succeed in the world of chemical engineering (to a certain degree). The prompt given to the AI is simple: “Synthesize ibuprofen.” The result? Well, the AI very much came up with a recipe for ibuprofen! And aspirin. And aspartame. While there were some minor imperfections in a couple of these recipes, the fact remains that AI agents can search the internet and come up with methods for creating known (and perhaps new) chemical compounds!

  • 🔎 AI agents solve a murder mystery… - Ever wonder how an AI would fare in a murder mystery game? Well, look no further than this paper from UC Berkeley: Exploring the Intersection of Large Language Models and Agent-Based Modeling via Prompt Engineering. Here, Junprung uses a murder mystery game (alongside other techniques) to present simulations of believable proxies of human behavior. After all, modeling human behavior is extremely complex, even when trying to predict the reactions of people we know personally. Here, we see that through clever prompt engineering (which we’ll perhaps delve more deeply into in a future newsletter, wink wink 👀), a new way of modeling human behavior within a larger system may be emerging.
