Nov 3, 2024
AI Tools
Developing Rational AI Assistants: Understanding System One vs System Two
Summary
Building AI Agent Teams
Clearly define roles, goals, and backstories for AI agents to enable them to focus on the right tasks and provide relevant, accurate information in their outputs.
Utilize CrewAI's sequential process to create complex workflows where agents work together linearly, with one agent's output serving as input for the next.
Enhancing AI Capabilities
Integrate LangChain's built-in tools to give AI agents access to real-time data from sources such as Google searches, Reddit, and Wikipedia for more accurate and relevant information.
Develop custom tools to scrape specific data from sources like Reddit, allowing for greater control and flexibility over information gathering.
Practical Implementation
Consider using local models to avoid API call fees and maintain privacy, but be aware that only 1 out of 13 open-source models tested successfully completed the given task.
Timestamps
00:00 AI assistants can mimic slow conscious thinking to help with decision-making, with the ultimate goal of being able to process requests and offer rational solutions.
01:40 Use tree-of-thought prompting and CrewAI to build custom AI agents that can collaborate and solve complex problems, make them more intelligent with real-world data, and avoid paying API fees by running models locally.
04:18 Three AI agents with specific roles collaborate to understand product needs and reach a wider audience, creating a business plan for a new product idea.
07:30 CrewAI generated a business plan with 10 bullet points, 5 goals, and technology considerations; the next example uses real-time data tools and a Google scraper to produce a report on AI and machine learning innovation.
10:28 The speaker discusses improving their newsletter with custom tools and a pre-built tool called human-in-the-loop, creating a custom tool to scrape Reddit for information, and handling API exceptions while compiling the scraped data.
13:15 AI assistants can read and summarize posts in less than a minute, but they can be inconsistent and sometimes fail to follow instructions.
14:30 Using local AI models can save on expensive API calls; running them requires at least 8 GB of RAM for models with 7 billion parameters.
16:22 The speaker experimented with 13 AI models, finding OpenChat to be the best for generating a newsletter, although it didn't include data from the LocalLLaMA subreddit, and invites viewers to share their experiences with CrewAI.
Transcript
00:00 Have you ever found yourself on the verge of making a controversial purchase? Just as you're about to click on that buy button, an unexpected thought suddenly crosses your mind. Wait a minute, they look a little bit like Swiss cheese, don't they? No, no, no, no, no, they're absolutely beautiful, and Kanye West loves them. He wears them all the time. But if I like things that Kanye likes, is that really a good thing? Okay, I need to relax. Everything is fine, and buying these makes me a visionary, a trendsetter. Do these holes exist for ventilation purposes? Oh, okay, time for a break. I need to urgently de-stress from all this thinking with some Pringles. Wait, is this, like, really unhealthy? So that inner dialogue you just witnessed is what Daniel Kahneman, author of the book Thinking, Fast and Slow, calls System Two thinking. It's a slow, conscious type of thinking that requires deliberate effort and time. The opposite of that is System One, or fast thinking. System One is subconscious and automatic, for example when you effortlessly recognize a familiar face in a crowd. But why am I talking about this in a video about AI assistants? Well, in order to understand that, I have to mention an amazing YouTube video posted by Andrej Karpathy, a great engineer at OpenAI. In that video Andrej clarifies that right now all large language models are only capable of System One thinking. They're like auto-predict on steroids. None of the current LLMs can take, let's say, 40 minutes to process a request, think about a problem from various angles, and then offer a very rational solution to a complex problem. And this rational, or System Two, thinking is what we ultimately want from AI.
01:40 But some smart people found a way to work around this limitation. Actually, they came up with two different methods. The first and simpler way to simulate this type of rational thinking is with tree-of-thought prompting; you might have heard of it. This involves forcing the LLM to consider an issue from multiple perspectives, or from the perspectives of various experts. These experts then make a final decision together by respecting everyone's contribution. The second method utilizes platforms like CrewAI and agent systems. CrewAI allows anyone, literally anyone, even non-programmers, to build their own custom agents or experts that can collaborate with each other, thereby solving complex tasks. You can tap into any model that has an API, or run local models through Ollama, another very cool platform. In this video I want to show you how to assemble your own team of smart AI agents to solve tricky, complex problems. I'll also demonstrate how to make them even more intelligent by giving them access to real-world data like emails or Reddit conversations. And finally, I'll explain how to avoid paying fees to companies and exposing your private info by running models locally instead. Speaking of local models, I've actually made some really surprising discoveries, and I'm going to talk about them a little bit later. So let's build an agent team. I'll guide you through getting started in a way that's simple to follow along, even if you're not a programmer. In my first example, I'll set up three agents to analyze and refine my startup concept. Okay, so let's begin. First open VS Code and open a new terminal. I've already created and activated my virtual environment, and I recommend you do the same. Once that's done, you can install CrewAI by typing the install command in the terminal. The next step will be to import the necessary modules and packages, and you're going to need an OpenAI API key.
So in this case I'm going to need the standard os module, and I need to import Agent, Task, Process, and Crew from crewai. You can set the OpenAI API key as an environment variable. By default CrewAI is going to use GPT-4, and if you want to use GPT-3.5, you have to actually specify that. But I don't think that you're going to get amazing results with 3.5; I actually recommend using GPT-4. Now let's define three agents that are going to help me with my startup. There's no actual coding here.
04:18 This is just good old prompting. So let's instantiate three agents like this. Each agent must have a specific role, and I want one of my agents to be a market research expert, so I'm going to assign it this specific role. Also, each agent should have a clearly defined goal. In my case, I want this research expert agent to help me understand if there is a substantial need for my products and provide guidance on how to reach the widest possible target audience. And finally, I need a backstory for my agent, something that's going to additionally explain to the agent what this role is about. Lastly, you can set verbose to True, which will enable agents to create detailed outputs, and by setting this second parameter to True I'm allowing my agents to collaborate with each other. So I will save this agent as marketer, and I'm going to do the same for two other agents. So overall I'll have a marketer, a technologist, and a business development expert on my team of AI agents. Once this part is done, it's time to define tasks. Tasks are always specific results. In this case, it can be, let's say, a detailed business plan or a market analysis. Agents should be defined as blueprints, and they can be reused for different goals, but tasks should always be defined as specific results that you want to get in the end. Tasks should always have a description, something that describes what the task is about, and they should also always have an agent that's going to be assigned to every specific task. So in my case I want to have three specific tasks. My business idea is to create elegant-looking plugs for Crocs, so this iconic footwear looks less like Swiss cheese. I will assign the first task to the marketer agent, and this agent will analyze the potential demand for these super cool plugs and advise on how to reach the largest possible customer base.
Another task is going to be given to the technologist, and this agent will provide analysis and suggestions for how to make these plugs. The final task will be given to the business consultant, who's going to take into consideration everyone's reports and write a business plan. Now that I have defined all the agents and all the tasks, as a final step I'm going to instantiate the crew, or the team of agents. I'm going to include all the agents and tasks, and I'm going to define a process. The process defines how these agents work together, and right now it's only possible to have a sequential process, which means the output of the first agent will be the input for the second agent, and then that's going to be the input for the third agent. And now I'm going to make my crew work with this final line of code. I also want to see all the results printed in the console.
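The sequential process described above can be sketched in plain Python. This is a minimal illustration of the idea, not CrewAI's actual API: the Agent and Task classes, run_sequential, and fake_llm are all hypothetical names, and fake_llm stands in for a real model call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Agent:
    """A reusable blueprint: role, goal, and backstory (hypothetical class)."""
    role: str
    goal: str
    backstory: str


@dataclass
class Task:
    """A specific result to produce, always assigned to one agent."""
    description: str
    agent: Agent


def run_sequential(tasks: List[Task], llm: Callable[[str], str]) -> str:
    """Run tasks in order; each agent's output becomes the next agent's input."""
    previous_output = ""
    for task in tasks:
        prompt = (
            f"Role: {task.agent.role}\n"
            f"Goal: {task.agent.goal}\n"
            f"Backstory: {task.agent.backstory}\n"
            f"Task: {task.description}\n"
            f"Input from previous agent: {previous_output}"
        )
        previous_output = llm(prompt)
    return previous_output


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call: appends the current role to the chain so far."""
    lines = prompt.splitlines()
    role = lines[0][len("Role: "):]
    previous = lines[-1][len("Input from previous agent: "):]
    return f"{previous} > {role}" if previous else role


marketer = Agent("marketer", "assess demand", "knows consumer markets")
technologist = Agent("technologist", "plan production", "knows manufacturing")
consultant = Agent("business consultant", "write the plan", "knows startups")

tasks = [
    Task("Analyze demand for Crocs plugs", marketer),
    Task("Suggest how to make the plugs", technologist),
    Task("Write a business plan from the reports", consultant),
]

print(run_sequential(tasks, fake_llm))
```

Swapping fake_llm for a function that calls a real model would keep the same hand-off order: marketer first, technologist second, business consultant last.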
07:30 So that's the most basic possible example, and it's the best way to understand how CrewAI actually works. I expect these results to be far from impressive. I actually believe that the results are going to be just a little bit better than just asking ChatGPT to write a business plan, but let's see. Okay, so now I have the results. I have a business plan with 10 bullet points. I have five business goals and a time schedule, and so I should have 3D printing technology, injection molds, laser cutting, and machine learning algorithms to analyze customer preferences and predict future buying behavior. So I guess this agent really took my business idea very seriously, and I even have sustainable or recycled materials. That's great. So there you go. So how do you make a team of agents even smarter? Making agents smarter is very easy and straightforward with tools. By adding these tools you're giving agents access to real-world, real-time data, and there are two ways to go about this. The first and easier option is to add built-in tools that are part of LangChain. I will include a link to a complete list of LangChain tools, but some of my personal favorites are ElevenLabs text-to-speech, which generates the most realistic AI voices, and then there are tools that allow access to YouTube, all kinds of Google data, and Wikipedia. So now I'll change my crew, and in this next example I'll have three agents: a researcher, a technical writer, and a writing critic. Everyone will have their own task, but in the end I want to have a detailed report in the form of a blog or a newsletter about the latest AI and machine learning innovations. The blog must absolutely have 10 paragraphs.
It has to have the names of all the projects and tools written in bold, and every paragraph has to have a link to the project. I'll use LangChain's Google Serper tool, which will fetch Google search results, but first I'll sign up for a free API key through serper.dev. I'm going to include the link to all the code and all the prompts in the description box as usual. So let's begin by importing the necessary modules, and let's initialize the Serper API tool with the API key. I'll instantiate the tool, I'll name the tool Google scraper tool, and I'll give it a functionality, which is to execute search queries, along with a description to indicate the use case. As a last step before running the script, I should assign this tool to the agent that's going to run first. Once I run the script, I can see all the scraped data in blue letters. Green letters show the agent processing this information, and white letters are going to be the final output of each agent.
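The formatting requirements for the newsletter (10 paragraphs, project names in bold, a link in every paragraph) can be checked mechanically once an agent returns its text. What follows is a minimal sketch under stated assumptions: the output is Markdown-style text with paragraphs separated by blank lines, bold written as **name**, and links given as plain http(s) URLs. check_newsletter is a hypothetical helper, not part of CrewAI or LangChain.

```python
import re
from typing import List

# Assumed conventions: links are bare http(s) URLs, bold is Markdown **text**.
LINK_PATTERN = re.compile(r"https?://\S+")
BOLD_PATTERN = re.compile(r"\*\*[^*\n]+\*\*")


def check_newsletter(text: str, expected_paragraphs: int = 10) -> List[str]:
    """Return a list of format problems; an empty list means the format is fine."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    problems = []
    if len(paragraphs) != expected_paragraphs:
        problems.append(
            f"expected {expected_paragraphs} paragraphs, found {len(paragraphs)}"
        )
    for number, paragraph in enumerate(paragraphs, start=1):
        if not LINK_PATTERN.search(paragraph):
            problems.append(f"paragraph {number} has no link")
        if not BOLD_PATTERN.search(paragraph):
            problems.append(f"paragraph {number} has no bold project name")
    return problems


sample = "\n\n".join(
    f"**Project {i}** does something new. https://example.com/{i}"
    for i in range(1, 11)
)
print(check_newsletter(sample))
```

A check like this only covers the form of the newsletter; it says nothing about whether the projects mentioned are actually current.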
10:28 So this is what my newsletter looks like right now. I have 10 paragraphs as requested. Each paragraph has a link and around two to three sentences, so the format is fine. It's exactly what I was looking for. But there is a big problem: the quality of information in the newsletter is not really the best. None of these projects are really in the news at this moment, and my newsletter is only as good as the information that goes into it. So let's fix that. How do I improve the quality of the newsletter? Well, it's actually quite simple. I just need to find a better source of information, and that brings me to custom-made tools. But before I dive into that, it's worth mentioning that there is one more cool and very useful pre-built tool that people might overlook, and that is human-in-the-loop. This tool will ask you for input if it runs into conflicting information. Okay, so back to fixing the newsletter. My favorite way to get information is the LocalLLaMA subreddit. The community is amazing and they share an incredible amount of cool, exciting projects, and I just don't have enough time to sit and read through all of it. So instead, I'm going to write a custom tool that scrapes the latest 10 hot posts as well as five comments per post. There is a pre-built tool through LangChain, a Reddit scraper, but I don't really like using it. My own custom tool gives me a lot more control and flexibility. Here's a quick look at the code. So import praw, and tool from LangChain, and I'm going to create a new class that's called BrowserTools, which is how I'm going to create this custom tool. Then I'm going to need a decorator and a single-line docstring that describes what the tool is for. The scrape Reddit method starts by initializing the PRAW Reddit object with client ID, client secret, and user agent. It then selects the subreddit LocalLLaMA to scrape data from. Then the method iterates through the 12 hottest posts on the subreddit, extracting the post title, URL, and up to seven top-level comments.
It handles API exceptions by pausing the scraping process for 60 seconds before continuing, and the scraped data is compiled into a list of dictionaries, each containing details of a post and its comments, which is returned in the end. The rest of the code is the same, so I'm just going to copy it from the previous tool, with the exception that this time I'm going to assign a custom tool from the BrowserTools class. And this is the result that I'm getting with GPT-4. I'm just going to copy-paste the output into my Notion notebook so that you can see it better. I have to say that I'm more than pleased with the results.
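The scraping loop described above (take the hottest posts, keep the title, URL, and up to seven comments, pause for 60 seconds on an API error, return a list of dictionaries) can be sketched without touching Reddit at all. This is a hedged illustration, not the code from the video: FakePost, ApiError, and compile_posts are hypothetical names, and the real tool would read posts from a PRAW client instead. In this sketch a post that raises an error is skipped after the pause, which is one possible reading of "pausing before continuing".

```python
import time
from dataclasses import dataclass, field
from itertools import islice
from typing import Callable, Dict, Iterable, List


class ApiError(Exception):
    """Hypothetical stand-in for the exception a Reddit client would raise."""


@dataclass
class FakePost:
    """Hypothetical post object, used only to exercise the loop below."""
    title: str
    url: str
    comments: List[str] = field(default_factory=list)
    fails: bool = False

    def load_comments(self) -> List[str]:
        if self.fails:
            raise ApiError("rate limited")
        return list(self.comments)


def compile_posts(
    posts: Iterable[FakePost],
    max_posts: int = 12,
    max_comments: int = 7,
    pause_seconds: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, object]]:
    """Collect title, URL, and the first few comments for each post."""
    scraped = []
    for post in islice(posts, max_posts):
        try:
            comments = post.load_comments()[:max_comments]
        except ApiError:
            # Pause, then move on to the next post (the failing post is skipped).
            sleep(pause_seconds)
            continue
        scraped.append(
            {"title": post.title, "url": post.url, "comments": comments}
        )
    return scraped
```

Passing the sleep function as a parameter is a design choice for testability: a test can record the pauses instead of actually waiting a minute.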
13:15 It would take me at least an hour to read the latest posts on LocalLLaMA, then to summarize them and take notes, but CrewAI agents did all of this in less than a minute. This is a type of research that I need to do a few times a day, and this is also the first time that I managed to completely automate part of my work with agents. One thing that I noticed is that sometimes even GPT-4 doesn't really follow my instructions. There are no links to these projects in this output, and I asked for them, but when I ran the script yesterday the agent successfully included all the links. And these outputs were made on the same day, but they're formatted differently. So output varies, and agents can act a little bit flaky from time to time. I also tested Gemini Pro, which offers a free API key. You can request it through a link that I'm going to include in the description box. Essentially, you just need to import a special package from LangChain, you need to load Gemini with this line, and then you're going to need to assign this LLM to every agent. The Gemini output was a little bit underwhelming. The model didn't understand the task. Instead, it wrote a bunch of generic text from its training data, which is really unfortunate. So let me know if you run into different results.
14:30 I'm really curious. And now let's talk about price. I ran the script many times as part of my experiments, but on this particular day, the 11th of January, I remember that I ran the script four times, which means that I paid around 30 cents every time I ran it. So as you can tell, it adds up pretty quickly, and of course, this is GPT-4. How do you avoid paying for all these pricey API calls, and how do you keep your team of agents and conversations private? Yes, local models. So let's talk about that right now. I've tested 13 open-source models in total, and only one was able to understand the task and complete it in some sense. All the other models failed, which was a little bit surprising to me, because I expected a little bit more, I guess, from these local models. I'll reveal which ones performed the best and the worst, but first let me show you how to run local models through Ollama. The most important thing to keep in mind is that you should have at least 8 GB of RAM available to run models with 7 billion parameters, 16 GB for 13 billion, and 32 GB to run 33-billion-parameter models. Having said that, even though I have a laptop with 16 GB of RAM, I couldn't run Falcon, which only has 7 billion parameters, or Vicuna, with 13 billion parameters. Whenever I tried to run these two models, my laptop would freeze and crash, so that's something to keep in mind. If you have already installed Ollama and downloaded a specific model, you can very easily instruct CrewAI to use a local model instead of OpenAI with this line. Just import Ollama from LangChain and set the open-source model that you previously downloaded. Once you do that, you should also pass that model to all the agents, otherwise they're going to default to ChatGPT.
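The RAM rule of thumb from this part of the video can be written as a small lookup. This is a minimal sketch: the three figures come from the transcript, while treating them as upper bounds of size tiers (so a 10-billion-parameter model falls into the 16 GB tier) is an assumption, and min_ram_gb and fits_guideline are hypothetical helper names.

```python
from typing import Optional

# (max parameters in billions, minimum RAM in GB), as stated in the video.
RAM_GUIDELINE_GB = [(7, 8), (13, 16), (33, 32)]


def min_ram_gb(params_billions: float) -> Optional[int]:
    """Return the suggested minimum RAM, or None if the model is above 33B."""
    for max_params, ram_gb in RAM_GUIDELINE_GB:
        if params_billions <= max_params:
            return ram_gb
    return None  # The video gives no figure for larger models.


def fits_guideline(params_billions: float, available_ram_gb: float) -> bool:
    """True if the machine meets the guideline for a model of this size."""
    needed = min_ram_gb(params_billions)
    return needed is not None and available_ram_gb >= needed


print(fits_guideline(13, 16))
print(fits_guideline(33, 16))
```

Meeting the guideline is necessary rather than sufficient: as noted above, Falcon 7B and Vicuna 13B still crashed a 16 GB laptop.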
16:22 Among the 13 models that I experimented with, the worst performing ones were the Llama 2 series with 7 billion parameters, and another model that performed poorly was Phi-2, the smallest of all of them. Llama 2 was definitely struggling to produce any type of meaningful output, and Phi-2 was just losing it. It was painful to watch. The best performing model with 7 billion parameters, in my opinion, was OpenChat, which produced an output that sounds very newsletter-like. The only downside was that it didn't actually contain any data from the LocalLLaMA subreddit, which was the whole point. Obviously, the model didn't understand what the task is. Similarly, but with a lot more emojis, Mistral produced a generic but fine newsletter. This is basically Mistral's training data. None of these projects or companies were part of LocalLLaMA subreddit discussions, which means that the Mistral agents didn't understand what the task is. OpenHermes and Nous Hermes had a similar output. All of these outputs are the best attempts; there were even worse outputs. Since the results weren't really that great, I played with different prompts and variations of prompts, but that didn't really achieve anything. I also changed the Modelfile that comes with local models, played with parameters for each of the models, and added a system prompt that specifically references LocalLLaMA, but again, no improvement. My agents still didn't understand what the task is. So the only remaining idea I had was to run more models with 13 billion parameters, which is the upper limit for my laptop. So I first ran the Llama 13 billion chat and text models, not quantized but full-precision models. My assumption was that these models were going to be better at generating a newsletter because they're bigger models, but I was wrong. The output didn't look better than, let's say, OpenChat or Mistral, and the problem was still there: agents couldn't really understand what the task is.
So I ended up with a bunch of generic texts about self-driving cars as usual, again nothing even remotely similar to actual Reddit conversations on LocalLLaMA. So out of pure desperation I tried a regular Llama 13-billion-parameter model, a model that is not even fine-tuned for anything. My expectations were really low, but to my surprise, it was the only model that actually took into consideration the scraped data from the subreddit. It didn't really sound like a newsletter or a blog, but at least the names were there, together with some random free-floating thoughts, which I found a little bit surprising. So there you have it. You can find my notes about which local models to avoid and which ones were okay, together with all the code, on my GitHub, which I'll link down below. And I'm curious: have you tried CrewAI, and what were your experiences like? Thank you for watching, and see you in the next one.



