Saturday, July 27, 2024

Python LangChain Course 🐍🦜🔗 Understanding Agents and building our own (5/6)


Python LangChain Course 🐍🦜🔗

Hello and welcome back! In this part, we're going to be building our own custom agent from scratch. So far the whole agent may have seemed a bit magical as it just runs off reasoning back and forth on its own. In this part we're going to really understand how an agent works and how it's built by making our own. This part will be a bit longer than usual, so buckle up and get comfortable!

In this part we're going to be reinventing the wheel a little bit but, without going too deep down the rabbit hole, this will greatly improve your understanding of how an LLM tool that can basically only generate text can become a powerful AI agent that makes decisions and takes actions.

Make a new folder named '5_Understanding_agents' to start with, and inside make another folder named 'tools'. We're going to be using 2 different tools in this part and one of them is going to be our internet tool from part 4. So copy the 'internet_tool.py' file from '4_Custom_tools/tools' into '5_Understanding_agents/tools'. Your folder now looks like this:

📁Finx_LangChain
    📁1_Summarizing_long_texts
    📁2_Chat_with_large_documents
    📁3_Agents_and_tools
    📁4_Custom_tools
    📁5_Understanding_agents
        📁tools
            📄internet_tool.py      (copy of the internet tool from part 4)
    📄.env

Just in case you need the contents of the internet_tool.py file, here it is again:

######################################################################################################################
##### This file is just a copy of the one from part 4, do not make changes here, make them in the original file ######
######################################################################################################################


import requests
from bs4 import BeautifulSoup
from langchain.tools import BaseTool


class InternetTool(BaseTool):
    name: str = "internet_tool"
    description: str = (
        "useful when you want to read the text on any url on the internet."
    )

    def _get_text_content(self, url: str) -> str:
        """Get the text content of a webpage with HTML tags removed"""
        response = requests.get(url)
        html_content = response.text
        soup = BeautifulSoup(html_content, "html.parser")
        for tag in ["nav", "footer", "aside", "script", "style", "img"]:
            for match in soup.find_all(tag):
                match.decompose()
        text_content = soup.get_text()
        text_content = " ".join(text_content.split())
        return text_content

    def _limit_chars(self, text: str) -> str:
        """limit number of output characters"""
        return text[:10_000]

    def _run(self, url: str) -> str:
        try:
            text_content = self._get_text_content(url)
            return self._limit_chars(text_content)
        except Exception as e:
            return f"The following error occurred while trying to fetch the {url}: {e}"

    def _arun(self, url: str):
        raise NotImplementedError("This tool does not support asynchronous execution")


if __name__ == "__main__":
    tool = InternetTool()
    print(
        tool.run("https://en.wikipedia.org/wiki/List_of_Italian_desserts_and_pastries")
    )

Obviously, in a real coding project never ever copy code like this! Having code in multiple places is bad. But for the sake of this tutorial, we're going to keep everything segregated per folder so you can easily reference anything you want to later on.

Moby Duck, Moby Dick??

Now inside the '5_Understanding_agents/tools' folder make another file named 'moby_duck_search.py'. (Not Moby Dick from the novel about the whale, I'll explain the name in a second):

📁Finx_LangChain
    📁1_Summarizing_long_texts
    📁2_Chat_with_large_documents
    📁3_Agents_and_tools
    📁4_Custom_tools
    📁5_Understanding_agents
        📁tools
            📄internet_tool.py      (copy of the internet tool from part 4)
            📄moby_duck_search.py
    📄.env

Inside this file, we'll write the second tool our custom agent will use. Open up 'moby_duck_search.py' and start with our imports:

from json import dumps
from langchain.tools import BaseTool
from langchain.utilities import DuckDuckGoSearchAPIWrapper

We'll use the Python built-in JSON library's dumps or dump-to-string method to convert objects to string format. We import the BaseTool because we need to inherit from it as always. We also import DuckDuckGo because we're going to be using DuckDuckGo to search for articles while limiting the search to the MobyGames gaming website. Hence this tool being named Moby-Duck-Search.

I chose this example website randomly, and you can use a different similar website about a specific topic if you want. This one will be fairly simple, so let's get to implementing our tool:

class MobyDuckSearch(BaseTool):
    name: str = "moby_duck_search"  # Pun intended.
    description: str = (
        "A tool that uses DuckDuckGo Search to search the MobyGames game website. "
        "Useful for when you need to answer questions about video games. "
        "Input should be a search query. "
    )
    api_wrapper = DuckDuckGoSearchAPIWrapper()

We start our class declaration by inheriting from the BaseTool and setting a default value for the name and description. We also create an instance of the DuckDuckGoSearchAPIWrapper, which comes with LangChain, and store it in a variable named 'api_wrapper'. We can simply reuse this wrapper but alter the input slightly. While still inside the MobyDuckSearch class block, add:

    def _run(self, query: str) -> str:
        """Just call the DuckDuckGoSearchAPIWrapper.run method, but with the edited query."""
        targeted_query = f"site:mobygames.com {query}"
        results_with_metadata: list = self.api_wrapper.results(
            targeted_query, num_results=3
        )
        return dumps(results_with_metadata)

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support asynchronous execution")

We implement the ._run() method that takes a query as a string and returns a string. We first create a targeted_query, which means "Mario-Kart" in becomes "site:mobygames.com Mario-Kart" out. You're probably familiar with this syntax from Google, as it allows you to search for results only on that specific website.

We then call the .results() method on our api_wrapper instance (which is just the DuckDuckGoSearchAPIWrapper we imported from LangChain and has the .results method already built in for us) and pass in the targeted_query and the number of results we want. Note that we save this in a variable of type list named 'results_with_metadata', as the result will also include metadata which contains the link or URL to the search results page. It's good to be explicit with variable naming instead of just saying 'results' so it's very clear and readable exactly what kind of data this variable holds.

We then return the results_with_metadata variable, but we first convert it to a string using the json.dumps() method. This method converts it to a stringified JSON object because LLMs only work with strings, so we need a string return from our method, or else we won't be able to feed it back into ChatGPT or another LLM. Lastly, we declare the unused ._arun() method just to be complete.
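As a quick standalone sketch of what json.dumps does here (the result data is made up for illustration, not real search output):

```python
from json import dumps, loads

# A list of dicts shaped roughly like DuckDuckGo search results (hypothetical data).
results = [
    {"snippet": "A LEGO game.", "title": "Lego Star Wars", "link": "https://example.com/"}
]

as_string = dumps(results)
print(type(as_string))  # <class 'str'> -- plain text we can feed to an LLM
print(loads(as_string) == results)  # True -- the structure survives a round trip
```

The LLM only ever sees the string; json.loads can recover the original structure if we ever need it back.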

Our MobyDuckSearch class now looks like this:

class MobyDuckSearch(BaseTool):
    name: str = "moby_duck_search"  # Pun intended.
    description: str = (
        "A tool that uses DuckDuckGo Search to search the MobyGames game website. "
        "Useful for when you need to answer questions about video games. "
        "Input should be a search query. "
    )
    api_wrapper = DuckDuckGoSearchAPIWrapper()

    def _run(self, query: str) -> str:
        """Just call the DuckDuckGoSearchAPIWrapper.run method, but with the edited query."""
        targeted_query = f"site:mobygames.com {query}"
        results_with_metadata: list = self.api_wrapper.results(
            targeted_query, num_results=3
        )
        return dumps(results_with_metadata)

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support asynchronous execution")

A quick test run

Let's add a quick test below and outside our class to see if it works:

if __name__ == "__main__":
    moby_duck_tool = MobyDuckSearch()
    print(moby_duck_tool.run("lego star wars"))

Remember this will only run if we run the file directly, and not if it's imported inside another file. We just create a new instance of our moby duck tool and then run it with a query of 'lego star wars'. Go ahead and run the file to see if your tool is working; the structure of your output should look like this:

[
    {"snippet": "Snippet here", "title": "Title here", "link": "https://link.com/"},
    {"snippet": "Snippet here", "title": "Title here", "link": "https://link.com/"},
    {"snippet": "Snippet here", "title": "Title here", "link": "https://link.com/"}
]

Note that although it kind of looks like a list of dictionaries, it's just a string type variable we can feed to ChatGPT.

Go ahead and save this file, and now create an __init__.py file in the '5_Understanding_agents/tools' folder.

📁Finx_LangChain
    📁1_Summarizing_long_texts
    📁2_Chat_with_large_documents
    📁3_Agents_and_tools
    📁4_Custom_tools
    📁5_Understanding_agents
        📁tools
            📄__init__.py
            📄internet_tool.py      (copy of the internet tool from part 4)
            📄moby_duck_search.py
    📄.env

Import both the moby duck and internet tool inside this file:

from .moby_duck_search import MobyDuckSearch
from .internet_tool import InternetTool

Now go ahead and save and close your __init__.py file.

Setting up our Agent's prompt

We're done with our tools for now. The first thing we're going to build for our agent is the prompt: our instructions to the agent on what we want it to do and how. To keep our file structure and our final agent code cleaner and more readable, we're going to create one more folder. Create a folder named 'prompts' inside the '5_Understanding_agents' folder, and inside create an empty __init__.py file and a file named 'base_agent_template.py'. Your folder structure should now look like this:

📁Finx_LangChain
    📁1_Summarizing_long_texts
    📁2_Chat_with_large_documents
    📁3_Agents_and_tools
    📁4_Custom_tools
    📁5_Understanding_agents
        📁prompts
            📄__init__.py   (empty file)
            📄base_agent_template.py    (empty file)
        📁tools
            📄__init__.py
            📄internet_tool.py
            📄moby_duck_search.py
    📄.env

Open your 'base_agent_template.py' file and declare the following variable containing a prompt inside:

base_agent_template = """
Answer the following questions as best you can, but speaking as a fanatic gaming enthusiast. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin! Remember to speak as a fervent gaming fanatic when giving your final answer.

Question: {input}
{agent_scratchpad}
"""

This is just a prompt template as we've seen several of in the past; notice the {tools} and {tool_names} variables. We'll be replacing these with the actual tools we want to use and their names. We also have an {agent_scratchpad} variable which we'll use to store the agent's thoughts and actions as it goes through the process of answering the question. We'll see how this works in a bit.

Note how most of this template is a standard structure that defines how we want the LLM to structure its thought process and answer our question. Now that we can see a basic template form for this ReAct reasoning style agent, it makes a lot of sense and is actually quite simple. We're merely asking the LLM to output text, as that's all it knows how to do and all it can do in the end. We're just asking it to output it in a specific format and order so that we can work with it, that's all! This should make the whole reasoning agent from the previous parts look a lot less magical and a lot more understandable.
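To see that this really is plain text substitution, here is a sketch that fills a trimmed-down version of the template with Python's built-in str.format (the tool listing and question are made up for illustration):

```python
# A trimmed-down template using the same {placeholder} style as ours.
template = (
    "Answer the following question. You have access to the following tools:\n"
    "{tools}\n\n"
    "Question: {input}\n"
    "{agent_scratchpad}"
)

filled = template.format(
    tools="moby_duck_search: searches MobyGames",  # hypothetical tool listing
    input="What is the best zombie game of 2022?",
    agent_scratchpad="",  # empty on the very first call
)
print(filled)
```

The "prompt formatting" we build below is just a more organized way of producing this same filled-in string before every LLM call.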

Go ahead and save and close this file, then open your __init__.py file, making sure to open the one in this folder which is still empty, and import the base_agent_template:

from .base_agent_template import base_agent_template

Okay, go ahead and save and close that as well.

Let's build our Agent

Let's move on to our agent. Create a file called '1_building_an_agent.py' inside your '5_Understanding_agents' folder:

📁Finx_LangChain
    📁1_Summarizing_long_texts
    📁2_Chat_with_large_documents
    📁3_Agents_and_tools
    📁4_Custom_tools
    📁5_Understanding_agents
        📁prompts
            📄__init__.py
            📄base_agent_template.py
        📁tools
            📄__init__.py
            📄internet_tool.py
            📄moby_duck_search.py
        📄1_building_an_agent.py     <------New file
    📄.env

Inside the '1_building_an_agent.py' file we'll start with our imports. There are going to be quite a few of them, as in this part we'll really dig into building an agent from its parts, so let's get started:

import re

from decouple import config
from langchain import LLMChain
from langchain.agents import (
    AgentExecutor,
    AgentOutputParser,
    LLMSingleActionAgent,
    Tool,
)
from langchain.chat_models import ChatOpenAI
from langchain.prompts import StringPromptTemplate
from langchain.schema import AgentAction, AgentFinish
from prompts import base_agent_template
from tools import MobyDuckSearch

I'll go over these imports in a broad sense, and we'll explain them in more detail when we use each one. That makes more sense, as we'll actually get to see what the import does instead of just talking about it theoretically. We import the 're' module for regular expressions, as we want to test whether the model output matches certain patterns later on. Decouple is of course for our API key, and all the other parts are bits and pieces we will combine together to build our agent.

Lastly, we import our own MobyDuckSearch tool and prompt template. (We'll get the internet tool involved later on in the tutorial, I haven't forgotten about it!). Again, we'll see how each of these imports works when we use them.

Setup

Let's set up our ChatGPT API and our tools:

chat_gpt_api = ChatOpenAI(
    temperature=0, model="gpt-3.5-turbo-0613", openai_api_key=config("OPENAI_API_KEY")
)

moby_duck_tool = MobyDuckSearch()

tools = [
    Tool(
        name=moby_duck_tool.name,
        func=moby_duck_tool.run,
        description=moby_duck_tool.description,
    )
]

We set up our ChatGPT API as always. We then create a new instance of the MobyDuckSearch class and create a list of tools. Inside we only create a single Tool object for now, using the name and description we have defined inside the class and passing the .run method as the func argument.

The prompt template (formatter)

We're going to be writing custom versions of most things so you can really see and understand what's going on. We'll go to the prompt-template formatter next. We have the prompt we defined in the prompts folder, but it has variables that need to be plugged into it before it can be used. Let's start on our new class:

class MobyDuckPromptTemplate(StringPromptTemplate):
    template: str
    tools: list[Tool]

We define a new class that inherits from StringPromptTemplate. This is a class that's used to format the prompt which is sent to the LLM. The format method must return a string. New instances of this class will take a template and a list of Tools as input. The template is a template string like we have in our prompts folder, with {variables} in brackets in the string. We now need to add a .format method:

class MobyDuckPromptTemplate(StringPromptTemplate):
    template: str
    tools: list[Tool]

    def format(self, **kwargs) -> str:
        # Method implementation will go here.

The .format method is defined as an abstract method in the BasePromptTemplate, which is a parent of the StringPromptTemplate we inherited from. An abstract method in the parent class means the child classes should implement it. This method should return the formatted prompt as a string and will contain our prompt formatting logic.

Our .format method takes self, which is this particular instance of the class itself, and **kwargs, which is a dictionary of whatever other arguments were passed in when the .format method was called, basically just catching all the other arguments that were passed in.

So, before we start building our format method, what exactly is going to be inside this kwargs dictionary? At this point in the format function, our kwargs dictionary will look like this:

# Do not put this in your code #
{'input': 'The user input query', 'intermediate_steps': [list of steps taken so far]}

Where did these two key-value pairs come from? Who or what passed them into our format method? The AgentExecutor class, which we imported from LangChain and will take a look at later on, takes care of passing in the input and the intermediate_steps if any have taken place. On the first call this will be an empty list, but after the agent calls a tool or does something else, the AgentExecutor will add the action and observation to the intermediate_steps list and pass it back into the format method.

As we now know our format method will receive an intermediate_steps variable, let's pop it off into a variable called 'intermediate_steps'.

    def format(self, **kwargs) -> str:
        intermediate_steps = kwargs.pop("intermediate_steps")

So in order to format our prompt, which is the whole purpose of this .format method, we need to pass in the variables we left open in our prompt template, namely {tools}, {tool_names}, {input}, and {agent_scratchpad}. As we saw a moment ago, we already have the input key in our kwargs dictionary, so that one's taken care of.

Next, we need to provide the {agent_scratchpad}. This scratchpad is the LangChain name for basically the notes of the agent. As stated above, with each step the AgentExecutor will return the intermediate_steps to our format method, so we can prep the prompt for the next step by adding the agent's actions and observations so far to the prompt of the next call. So before each ChatGPT call, a fresh new prompt will be generated using our format method, inserting the actions and observations so far into the prompt before asking ChatGPT for its next step.

This acts as a sort of memory. Remember that a ChatGPT call is just a text completion based on whatever text you put in. So we feed back whatever the agent has thought and done so far; otherwise, it has no idea or memory of what has happened. We'll start with an empty string for our scratchpad.

    def format(self, **kwargs) -> str:
        intermediate_steps = kwargs.pop("intermediate_steps")
        scratchpad = ""

Now what exactly is in our intermediate_steps variable? The intermediate_steps variable is a list of tuples (value1, value2) with two values each. The first value is an AgentAction object, which is a dataclass with three attributes: 'tool', 'tool_input', and 'log'. For now, we just want the 'log', which is a string of the agent's thoughts and actions. An example of a 'log' is the following, which is just a single string with two linebreaks in it:

log='Thought: Oh, I love zombie games! There are so many great ones out there. Let me think about the best recommendation for a zombie game from 2022.
Action: moby_duck_search
Action Input: "zombie game 2022"'

You can see these are the strings you have been seeing in your terminal all along when running an agent!

The second item in each tuple in the intermediate_steps list is the observation, which is basically just the output of whatever tool was called. In this case, it will be the string we return at the end of our ._run() method in the MobyDuckSearch tool containing the search results. We can loop over the list of tuples and give each of the two entries a name, 'action' and 'tool_output', as there will always be two entries in the tuple.

    def format(self, **kwargs) -> str:
        intermediate_steps = kwargs.pop("intermediate_steps")
        scratchpad = ""

        for action, tool_output in intermediate_steps:
            scratchpad += action.log
            scratchpad += f"\nObservation: {tool_output}\nThought: "

For each action and tool_output in each intermediate step in the list of intermediate step tuples, we concatenate the action.log string to our scratchpad variable. We then add the called tool's output to the scratchpad, adding in a \n newline before it to make it readable and ending with a \n newline and Thought: to finish out the prompt and prompt the ChatGPT model to continue and give us its next thought in the sequence.
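If you want to see this loop in isolation, here is a minimal sketch using a tiny stand-in dataclass instead of LangChain's real AgentAction (all the values are invented for illustration):

```python
from dataclasses import dataclass


@dataclass
class FakeAgentAction:
    # Minimal stand-in for LangChain's AgentAction dataclass.
    tool: str
    tool_input: str
    log: str


# One fabricated (action, tool_output) tuple, shaped like an intermediate step.
steps = [
    (
        FakeAgentAction(
            tool="moby_duck_search",
            tool_input="zombie game 2022",
            log='Thought: Let me search.\nAction: moby_duck_search\nAction Input: "zombie game 2022"',
        ),
        '[{"title": "Some game"}]',  # the tool's string output (observation)
    )
]

scratchpad = ""
for action, tool_output in steps:
    scratchpad += action.log
    scratchpad += f"\nObservation: {tool_output}\nThought: "

print(scratchpad)
```

The printed scratchpad ends with "Thought: ", which is exactly the open-ended text we want the model to continue from.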

At its core, all we're doing is prompting text completion, so we deliberately end the text with "Thought: " to prompt the model to insert its next thought. Let's continue:

    def format(self, **kwargs) -> str:
        intermediate_steps = kwargs.pop("intermediate_steps")
        scratchpad = ""

        for action, tool_output in intermediate_steps:
            scratchpad += action.log
            scratchpad += f"\nObservation: {tool_output}\nThought: "

        kwargs["agent_scratchpad"] = scratchpad

Remember we have the kwargs dictionary which already contains an input key; now that we have the scratchpad, we can add that to the kwargs dictionary as well.

        kwargs["tools"] = "\n".join(
            [f"{tool.name}: {tool.description}" for tool in self.tools]
        )

We still need to add the tools to the kwargs dictionary. Reading from the inside out, we first loop over each tool in self.tools, and then create a string that contains the tool.name and then the tool.description. Now we have a list holding a "name: description" string for each tool, and we simply join them together with a \n newline in between each tool, giving us a string with each tool on a new line. We then add this string to the kwargs dictionary under the 'tools' key.

        kwargs["tool_names"] = ", ".join([tool.name for tool in self.tools])

The last {variable} in our prompt template was the tool names. This one is pretty simple. We loop over each tool in self.tools and get the tool.name in a list. We then join all these names together in a single string with a comma and space in between each entry. We then add this string to the kwargs dictionary as the 'tool_names' key.
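Here is what the two joins produce for a pair of stand-in tools (plain SimpleNamespace objects with made-up names and descriptions, rather than real LangChain Tools):

```python
from types import SimpleNamespace

# Two stand-in tools with just the attributes the joins need.
tools = [
    SimpleNamespace(name="moby_duck_search", description="searches MobyGames"),
    SimpleNamespace(name="internet_tool", description="reads a webpage"),
]

# One tool per line for the {tools} slot in the prompt.
tool_lines = "\n".join([f"{tool.name}: {tool.description}" for tool in tools])
# A comma-separated list for the {tool_names} slot.
tool_names = ", ".join([tool.name for tool in tools])

print(tool_lines)
# moby_duck_search: searches MobyGames
# internet_tool: reads a webpage
print(tool_names)  # moby_duck_search, internet_tool
```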

We now have all the variables we need to format our prompt and fill in the holes in our template. This format method will run after each call, updating the actions taken and observations so far before sending out a new call to ChatGPT. We can now return our formatted prompt:

        return self.template.format(**kwargs)

Here we call the .format string method built into Python, not to be confused with the .format class method we're defining right now by the same name. We give the format method the dictionary with all the arguments it needs to fill in the {variable} slots in our template and then return the resulting completed prompt string. Our whole class now looks like this:

class MobyDuckPromptTemplate(StringPromptTemplate):
    template: str
    tools: list[Tool]

    def format(self, **kwargs) -> str:
        intermediate_steps = kwargs.pop("intermediate_steps")
        scratchpad = ""

        for action, tool_output in intermediate_steps:
            scratchpad += action.log
            scratchpad += f"\nObservation: {tool_output}\nThought: "

        kwargs["agent_scratchpad"] = scratchpad
        kwargs["tools"] = "\n".join(
            [f"{tool.name}: {tool.description}" for tool in self.tools]
        )
        kwargs["tool_names"] = ", ".join([tool.name for tool in self.tools])
        return self.template.format(**kwargs)

So we now have a class we can use as a prompt formatter. Now, let's actually instantiate a new instance of this class:

prompt_formatter = MobyDuckPromptTemplate(
    template=base_agent_template,
    tools=tools,
    input_variables=["input", "intermediate_steps"],
)

We pass in the base_agent_template we wrote in the prompts folder and the list of tools we declared above. The input_variables is a list of the variable names that our class's .format method expects as keyword arguments, so we list the two variables we know it will receive, 'input' and 'intermediate_steps'.

Parsing the output

That takes care of the prompt generation part of our agent. Now ChatGPT, or any other LLM for that matter, doesn't actually have any ability to call a function or use our tools; its only ability is to output text completions. So if our LLM wants to use one of our tools, it will tell us so in text format. We need to parse the text output the LLM sends back to us. This is where output parsers come in:

class MobyDuckOutputParser(AgentOutputParser):

We define a new class and inherit from the AgentOutputParser we imported from LangChain. Now we have to define our parse method:

class MobyDuckOutputParser(AgentOutputParser):
    def parse(self, llm_output: str) -> AgentAction | AgentFinish:

The input of this method will be self and the llm_output, which is ChatGPT's output in our case, in string format. We type hint the output of this function to be either an AgentAction or an AgentFinish object. These are just two basic datatypes from LangChain, with the AgentFinish basically containing the final answer output and an AgentAction being the 'intermediate_steps' we received in our prompt formatter's format method earlier, containing the action to take and the log.

Why these two objects? We discussed the AgentExecutor class in our format method. It's the thing that passes the 'input' and 'intermediate_steps' into our prompt formatter. An AgentExecutor takes either an AgentFinish object, which terminates the call and returns the final result, or an AgentAction object, which tells the AgentExecutor another action must be taken. It will then call the prompt formatter, passing in the 'input' and 'intermediate_steps' variables, which is why they appeared in our format method. I hope this is all slowly starting to make sense!

Inside our parse method, first check if the LLM has finished. If it has, it will have "Final Answer:" in its output, because that's the structure we instructed it to use in our prompt template.

    def parse(self, llm_output: str) -> AgentAction | AgentFinish:
        if "Final Answer:" in llm_output:
            answer = llm_output.split("Final Answer:")[-1].strip()
            return AgentFinish(
                return_values={"output": answer},
                log=llm_output,
            )

So if we find the string "Final Answer:" in the llm_output, we return an AgentFinish object. If we hover over AgentFinish in our code editor we can see it takes two arguments, a dictionary of return_values and a log. The return_values dictionary is just a dictionary of whatever values we want to return; in this case, we just want to return the final answer.

We create a variable called 'answer' and for its value, we split the llm_output string on the "Final Answer:" string, which gives us a list of two strings: the part before the LLM gives its final answer and the part after. We pick the last string using [-1], which selects the last index in a list, and then use .strip() to get rid of any extra whitespace.
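A quick sketch of that split on a fabricated LLM output (the answer text is made up):

```python
llm_output = (
    "Thought: I now know the final answer\n"
    "Final Answer: Elden Ring is a fantastic pick!"
)

# split() produces ['Thought: ...\n', ' Elden Ring is a fantastic pick!'];
# [-1] takes the part after "Final Answer:" and strip() trims the whitespace.
answer = llm_output.split("Final Answer:")[-1].strip()
print(answer)  # Elden Ring is a fantastic pick!
```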

Now we simply return an AgentFinish object with the return_values dictionary containing the 'output' key and the answer variable as the value. We also pass in the llm_output, which was the LLM's entire output, as the log.

That takes care of the case where our model has finished our problem. Now let's handle the case where it's still in action. We'll have to find out which action it wants to take, which should be one of the names of our tools in the tool_names variable. We'll also have to find out what input arguments it wants to give that tool. As we will receive all this data in string format, we'll have to extract the needed information from the string using regular expressions:

        regex = r"Action\s*\d*\s*:(.*?)\nAction\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)"

Without going too deep into regex, which is a whole course on its own (this regex expression is taken from LangChain's documentation), let's take a very brief look. Here we look for the word "Action" optionally followed by \s* whitespace characters (like space and tab), optionally followed by digits (\d*), optionally followed by whitespace characters again (\s*), then followed by a colon (:). So this could match "Action:" but also "Action : " or even "Action 1 : ".

The (.*?) is a capturing group that will capture any characters until we get to the \n newline character that follows the group in the regex. So any characters in between "Action:" and the next \n newline character will be stored in a group we can access later to extract the value. We then do basically the same thing again for "Action Input:", allowing for possible spaces and numbers in between, and ending with a second capture group (.*) that will capture whatever comes after the "Action Input:".

Again, regex is a programming language in itself and too much to fully get into here, but we basically capture the action and the action input in two groups that we can access later on, using this regular expression pattern. Note this pattern matches the expectation we set with our prompt template that we feed into ChatGPT, where we ask for output using this exact structure.
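If you want to see the two capture groups in isolation, here is a minimal sketch running the pattern against a made-up LLM output (the llm_output string here is an assumption following the template's format):

```python
import re

# Hypothetical LLM output following the Action / Action Input format.
llm_output = (
    "Thought: I should search for this.\n"
    "Action: moby_duck_search\n"
    'Action Input: "best zombie game 2022"'
)

regex = r"Action\s*\d*\s*:(.*?)\nAction\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)"
match = re.search(regex, llm_output, re.DOTALL)

action = match.group(1).strip()            # group 1: the tool name
action_input = match.group(2).strip(" ").strip('"')  # group 2: the tool input
print(action)        # moby_duck_search
print(action_input)  # best zombie game 2022
```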

Now we need to run the regex pattern against our ChatGPT / LLM output:

        match = re.search(regex, llm_output, re.DOTALL)

We call the search method on the 're' library, passing in our regex pattern, the llm_output we want to search for matches in, and finally the re.DOTALL flag. This flag allows the '.' dot character, which normally matches every possible character except for the newline character, to also match the newline character. If you're not too sure about regex, don't worry too much about this detail; regex is a subject for another tutorial course. Continue as follows:

        if not match:
            raise ValueError(f"Could not parse LLM output: `{llm_output}`")
        action = match.group(1).strip()
        action_input = match.group(2).strip(" ").strip('"')

If we don't find a match (our match variable is empty), we raise a ValueError for now. Otherwise, we take group 1 of the match, which is the first capture group we talked about in our pattern, so whatever text came after "Action:", and store it in a variable called 'action', calling strip() to get rid of any extra whitespace. We then do the same for the second capture group, storing it in a variable called 'action_input', but this time we strip off any spaces ( ) and also any double quotes (") that might be in the string.

Our parse method needed to return either an AgentFinish or an AgentAction. In this case, we're still in action, so let's return an AgentAction object:

        return AgentAction(tool=action, tool_input=action_input, log=llm_output)

We pass in the tool and tool_input we just extracted and the original full ChatGPT output as the log. For clarity, here is the complete finished MobyDuckOutputParser class:

class MobyDuckOutputParser(AgentOutputParser):
    def parse(self, llm_output: str) -> AgentAction | AgentFinish:
        if "Final Answer:" in llm_output:
            answer = llm_output.split("Final Answer:")[-1].strip()
            return AgentFinish(
                return_values={"output": answer},
                log=llm_output,
            )

        regex = r"Action\s*\d*\s*:(.*?)\nAction\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)"
        match = re.search(regex, llm_output, re.DOTALL)

        if not match:
            raise ValueError(f"Could not parse LLM output: `{llm_output}`")
        action = match.group(1).strip()
        action_input = match.group(2).strip(" ").strip('"')

        return AgentAction(tool=action, tool_input=action_input, log=llm_output)

Now we declare a simple llm_chain, combining our ChatGPT with the prompt_formatter we built above:

llm_chain = LLMChain(llm=chat_gpt_api, prompt=prompt_formatter)

Remember the prompt_formatter is an instance of our MobyDuckPromptTemplate class with our base_agent_template prompt from the prompts folder and everything else it needs to generate the prompt passed in.

Putting it all together: the Agent

Now it's time to define our actual agent. We'll be using the LLMSingleActionAgent type we already imported up top. The single action part simply means the agent will take a single action each time, but as we have seen it can run multiple times, which is where the AgentExecutor comes in, but more on that in a moment. First our agent:

moby_duck_agent = LLMSingleActionAgent(
    llm_chain=llm_chain,
    output_parser=MobyDuckOutputParser(),
    stop=["\nObservation:"],
)

We declare a new LLMSingleActionAgent and give it the LLM chain containing our ChatGPT API and the prompt formatter with the prompt and the way to format it. We also give it the output parser we just wrote above, passing in a new instance of the class.

Now what is the stop argument? This is actually a standard API feature in ChatGPT that tells ChatGPT that if it runs into this sequence of characters while generating output, it should stop then and there; that's the end of its output. If we look back at the template we wrote for our agent, for each time it runs, ChatGPT is asked to generate the following (straight from our prompt template):

Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action

So ChatGPT chooses which tool to call, e.g. Action: moby_duck_tool. Then it chooses the Action Input, e.g. "zombie game 2022". It will follow the pattern and then generate the next line which says "Observation:" as it's following our template, but after Observation, WE, and not ChatGPT, need to enter the result of calling the tool. This is why we tell ChatGPT to stop generating after "Observation:".
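The effect of a stop sequence can be simulated on a plain string (the generated text below is made up; the real cut-off happens server-side during generation):

```python
# Pretend this is what the model would have generated WITHOUT a stop
# sequence: it hallucinates an Observation line we don't want.
generated = (
    "Action: moby_duck_search\n"
    'Action Input: "zombie game 2022"\n'
    "Observation: text the model would otherwise make up"
)

# The stop sequence cuts generation at its first occurrence.
stop = "\nObservation:"
truncated = generated.split(stop)[0]
print(truncated)
# Action: moby_duck_search
# Action Input: "zombie game 2022"
```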

This is why we wrote the following line of code in our prompt formatter:

# snippet from our MobyDuckPromptTemplate class's .format() method #
scratchpad += f"\nObservation: {tool_output}\nThought: "

Because here we add the tool output after Observation: and then we can send the whole thing back to ChatGPT to continue generating, either calling another tool or giving us the final answer.
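Here is a small sketch of how that scratchpad grows per loop iteration (the intermediate_steps data is invented for illustration; the real format() method works on AgentAction objects):

```python
# Each entry is a (model's action text, tool output) pair from one loop.
intermediate_steps = [
    ('Action: moby_duck_search\nAction Input: "best zombie game 2022"',
     "[{...search results...}]"),
]

# Rebuild the scratchpad: model text, then OUR observation, then an open
# "Thought: " so the model continues reasoning from there.
scratchpad = ""
for action_text, tool_output in intermediate_steps:
    scratchpad += action_text
    scratchpad += f"\nObservation: {tool_output}\nThought: "
print(scratchpad)
```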

The final step: an AgentExecutor

Okay, so now we have an agent with our ChatGPT API, prompt creation functionality, output parser, and stop argument. We're almost done! We just need to create an AgentExecutor to actually run our agent. Before we do, let's discuss what the AgentExecutor actually is, as we've mentioned it several times already and promised a proper explanation.

The AgentExecutor is basically a loop that manages executing the Agent. For every loop, it will pass the user input query and the previous steps that have occurred so far to the agent. If the agent returns an AgentFinish object, the AgentExecutor will return the end result directly to the user, and if the Agent returns an AgentAction, the AgentExecutor will call that tool and get the Observation. The loop will then repeat, passing the new Observation back into the agent together with the previous steps that have occurred so far, until an AgentFinish object is returned.

So the AgentExecutor is basically just the execution loop that takes all the steps we've built so far and runs them together in a loop in the correct order.
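That loop can be sketched in a few lines of plain Python. This is NOT LangChain's actual implementation — the plan function, the dict-based results, and the fake tool below are all stand-ins for the real agent, AgentAction/AgentFinish objects, and tools:

```python
def run_agent_loop(plan, tools, user_input, max_steps=10):
    """Simplified AgentExecutor: loop until the agent 'finishes'."""
    intermediate_steps = []
    for _ in range(max_steps):
        result = plan(user_input, intermediate_steps)
        if result["type"] == "finish":        # stand-in for AgentFinish
            return result["output"]
        # stand-in for AgentAction: call the chosen tool, record Observation
        observation = tools[result["tool"]](result["tool_input"])
        intermediate_steps.append((result, observation))
    raise RuntimeError("Agent did not produce a final answer")

# A fake agent: first asks for a search, then finishes using the result.
def fake_plan(user_input, steps):
    if steps:
        return {"type": "finish", "output": f"Based on: {steps[-1][1]}"}
    return {"type": "action", "tool": "search", "tool_input": "zombie game 2022"}

tools = {"search": lambda query: f"results for {query}"}
final = run_agent_loop(fake_plan, tools, "Recommend a zombie game")
print(final)  # Based on: results for zombie game 2022
```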

agent_executor = AgentExecutor.from_agent_and_tools(
    agent=moby_duck_agent, tools=tools, verbose=True
)

We just pass in our agent and tools and set verbose to True as we want to see what it's doing. Now let's finally run our own agent!

agent_executor.run("Can you recommend me a zombie game from the year 2022?")

And my output is:

> Entering new AgentExecutor chain...
Thought: Oh, I love zombie games! There are so many great ones out there. Let me think about the best zombie game from 2022.
Action: moby_duck_search
Action Input: "best zombie game 2022"

Observation:[{list of search result objects for query "best zombie game 2022"}]
There are three great zombie games from 2022 that I found: Zombie Cure Lab, Zombie Survivors, and SurrounDead. Let me think about which one to recommend.
Action: moby_duck_search
Action Input: "Zombie Cure Lab"

Observation:[{list of search result objects for query "Zombie Cure Lab"}]
Zombie Cure Lab is a game where you manage a lab and try to cure the zombie virus. It has day and night shifts, and you need to keep your workers happy to prevent outbreaks. You also build defenses to keep zombies out at night. It sounds like a unique and challenging game. I recommend Zombie Cure Lab as the best zombie game from 2022.

Final Answer: The best zombie game from 2022 is Zombie Cure Lab.

> Finished chain.

Our agent speaks in a gaming enthusiast voice as that's what we instructed it to do. It calls our moby_duck_search tool and passes in a search query. Notice it follows our structure exactly as we instructed in the prompt template we wrote.

We then go into the second ChatGPT call after our AgentExecutor has called our tool and added the search results after "Observation:". Our agent is apparently most impressed with a particular search result about a game called "Zombie Cure Lab" and wants to do another DuckDuckGo search on this game. The AgentExecutor obliges, calls the tool, and feeds the response back into the next call to ChatGPT, which now concludes that Zombie Cure Lab is the best zombie game from 2022.

We can argue about whether it is objectively the best game or not, but ChatGPT gave us a pretty good zombie game recommendation from 2022 based on autonomously performed research. That's pretty darn cool!

Now let's take this one final step further before we end this tutorial part; it's already gotten really long anyway. I'll cut you some slack in part 6, I promise!

So say I want to ask the agent for more information about this game, and I'm a lazy user, so I just phrase my next question like this:

"What's the recreation about?"

Now if we send this query to our agent executor, it will fire up a new agent and load it up. The new agent will have no idea what game we're talking about, and we're in trouble. For this, our agent will need the final step to intelligence: memory!

Adding memory to our Agent

First, go back into your prompts folder and open the base_agent_template.py file. We'll need to add a second prompt version that is slightly different and allows for agent memory. Below your existing 'base_agent_template' variable, add a second variable to this file (you can just copy it as it's almost the same):

base_agent_template_w_memory = """
Answer the following questions as best you can, but speaking as a fervent gaming fanatic. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin! Remember to speak as a fervent gaming fanatic when giving your final answer.

Previous conversation history:
{history}

New question: {input}
{agent_scratchpad}
"""

As you can see, we just added the line Previous conversation history: {history} in our template-with-memory version, simple enough. Save and close this file, then open the __init__.py file in the prompts folder and add this new variable to the existing import statement:

from .base_agent_template import base_agent_template, base_agent_template_w_memory

Save and close this file as well, and now let's get back to our '1_building_an_agent.py' file with all our stuff in it. At the top, add 2 extra imports to the already existing list of imports:

from prompts import base_agent_template_w_memory
from tools import InternetTool

So we import the prompt-template-with-memory version we just wrote (you can of course combine this import statement with the other prompt import statement if you want), and we import the InternetTool we wrote in the previous part and copied over to our tools folder at the start of this tutorial. Remember, the InternetTool we wrote allows the agent to get the page text for a certain URL anywhere on the internet.

Let's add the internet tool to our tools:

internet_tool = InternetTool()
tools.append(
    Tool(
        name="visit_specific_url",
        func=internet_tool.run,
        description=(
            "Useful when you want more information about a page by opening its url on the internet."
            " Input should be a valid and complete internet url with nothing else attached."
        ),
    )
)

We created a new instance of the InternetTool and then just appended a new Tool object to the already existing tools list. Note how we didn't use the internet_tool.name and internet_tool.description defaults from our class but wrote a custom name and description this time.

Now we just retrace our final steps to build a second agent. I'm going to keep coding in this same file below all the already existing stuff, as this tutorial is already very long and I want to focus on the learning concepts here and not on software project structuring best practices.

prompt_formatter_w_memory = MobyDuckPromptTemplate(
    template=base_agent_template_w_memory,
    tools=tools,
    input_variables=["input", "intermediate_steps", "history"],
)

So we declare a second prompt_formatter using our MobyDuckPromptTemplate class again. This time the list of input_variables also includes history, and we use our base_agent_template_w_memory as the template.

Now we combine our ChatGPT API and the prompt formatter into a simple chain like we did before:

llm_chain_w_memory = LLMChain(llm=chat_gpt_api, prompt=prompt_formatter_w_memory)

And we declare our new agent with memory just like before:

moby_duck_agent_w_memory = LLMSingleActionAgent(
    llm_chain=llm_chain_w_memory,
    output_parser=MobyDuckOutputParser(),
    stop=["\nObservation:"],
)

Now we'll actually need to add the memory. The AgentExecutor class that runs the loop will take our memory object and integrate it into the Agent Execution loop for us, but first, we need some memory. Add the following import to the top of the file:

from langchain.memory import ConversationBufferWindowMemory

ConversationBufferWindowMemory will basically just hold a list of strings with whatever the "Human" asked and then what the "AI" answered. A conversation history similar to the one we stored in the "function calls and embeddings" Finxter Academy course. It will basically just store the already completed conversations and feed them back into the loop if we ask another question after the first one.

Go ahead and instantiate a memory object:

memory = ConversationBufferWindowMemory(k=10)

We pass in the k argument, which is the number of interactions to keep in memory. Now we can finally create our AgentExecutor:
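The "window" idea is easy to sketch without LangChain: keep only the last k exchanges and drop older ones. The class below is an illustration of the concept, not ConversationBufferWindowMemory's real implementation:

```python
from collections import deque

class WindowMemory:
    """Toy sketch of a windowed conversation buffer (last k exchanges)."""

    def __init__(self, k):
        self.buffer = deque(maxlen=k)  # old entries fall off automatically

    def save(self, human, ai):
        self.buffer.append(f"Human: {human}\nAI: {ai}")

    def load(self):
        return "\n".join(self.buffer)

memory = WindowMemory(k=2)
memory.save("Recommend a zombie game from 2022", "Zombie Cure Lab!")
memory.save("What is the game about?", "You manage a lab curing zombies.")
memory.save("Is it on Windows?", "Yes, released on Windows in 2022.")
print(memory.load())  # only the last two exchanges remain
```

With k=2 the very first exchange has already been dropped, which is exactly the trade-off the window size controls.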

agent_executor_w_memory = AgentExecutor.from_agent_and_tools(
    agent=moby_duck_agent_w_memory, tools=tools, verbose=True, memory=memory
)

This time we simply pass in an extra argument called memory. Now, to test this, I'm just going to ask two questions ahead of time. If you want to do this properly, you should write a loop that asks the user for input and then calls .run() on the agent executor, allowing the user to ask a new question after each AgentExecutor chain finishes running. But as this tutorial is already so long, let's just cheat a little bit and hardcode two questions:

agent_executor_w_memory.run("Can you recommend me a zombie game from the year 2022?")
agent_executor_w_memory.run("Can you give me more information on that first game?")

I'm going to assume the first question will net at least one game recommendation, in which case the second question asking for more information on that first game will make sense. Make sure to comment out any .run() statements you still have on the previous agent executor without memory up above, or they will run again as well, and let's run our file with these two questions:

> Entering new AgentExecutor chain...
Thought: Oh, I love zombie games! Let me think of a good recommendation from 2022.
Action: moby_duck_search
Action Input: "zombie game 2022"

Observation:[{...list of search result objects for query "zombie game 2022"}]
There are a few great zombie games from 2022 that I found. One recommendation is "Zombie Apocalypse: The Last Defense." It is a tower defense game where you place explosive mines to stop the zombies. You can also buy allies to help you in the war against the undead. It has a variety of enemies, 15 power-ups, and gorgeous visual effects. You can even drive a car and a war tank! You should definitely check it out!

Final Answer: I recommend "Zombie Apocalypse: The Last Defense" as a great zombie game from 2022.

> Finished chain.

We got another good recommendation. Now the second question will run, asking for more information, which triggers the AgentExecutor chain again, and the agent will know from memory what game we're talking about.

> Entering new AgentExecutor chain...
Thought: I need to find more information about "Zombie Apocalypse: The Last Defense" to provide a detailed response.
Action: visit_specific_url
Action Input: https://www.mobygames.com/game/zombie-apocalypse-the-last-defense

This time it uses our internet tool from part 4 to visit the specific url.

Observation: ...Loads of text from the MobyGames page for "Zombie Apocalypse: The Last Defense"

Final Answer: "Zombie Apocalypse: The Last Defense" is an action strategy/tactics game released in 2022 on Windows. It features real-time strategy.... (etc., a great summary of the features and details)

> Finished chain.

That's awesome! We can now ask follow-up questions. We did it, we built our own agent step by step! Some of this was reinventing the wheel a little bit, but this tutorial was purposefully so, to give you a deeper understanding of the inner workings of an agent. I hope it all seems a lot more understandable and logical and less magical to you now, and gives you more insight into how this all works together.

As a small caveat, you might have noticed your model was not 100% reliable when running and could occasionally throw a parsing error or have trouble calling the tools/functions. This is why, again, in practice, I recommend you use the OpenAI agent as much as possible, because it's based on OpenAI's function calls and uses the 'gpt-3.5-turbo-0613' model.

This way you get a high-quality model which is specifically trained to handle calling functions, making it more robust and reliable than a lot of the open-source models out there. (Of course, you can also use the function-calls-specific GPT-4 version.)

Either way, the underlying concepts are the same. That's the end of part 5 of the tutorial series. This one was quite long, but I'll see you soon in part 6, where we'll look at LangChain Expression Language and have some fun building an LLM chain that corrects its own mistakes!


This tutorial is part of our exclusive course on Python LangChain. You can find the course URL here: 👇

🧑‍💻 Original Course Link: Becoming a Langchain Prompt Engineer with Python – and Build Cool Stuff 🦜🔗

Original article on the Finxter Academy

💡 Note: You can watch the full course video right here on the blog — I'll embed the video below each of the other parts as well. If you want the step-by-step course with code and a downloadable PDF course certificate to show your employer or freelancing clients, follow this link to learn more.
