One of the exciting things about a large
language model is that you can use it to build a custom chatbot
with only a modest amount of effort. ChatGPT, the web interface,
is a way for you to have a conversational interface,
a conversation via a large language model. But one of the
cool things is that you can also use a large language model to
build your own custom chatbot, maybe to play the role of an AI
customer service agent or an AI order taker for a restaurant.
And in this video, you'll learn how to do that for yourself.
I'm going to describe the components of
the OpenAI Chat Completions format in more
detail, and then you're going to build a chatbot yourself. So let's
get into it. First, we'll set up the OpenAI Python package as
usual.
So chat models like ChatGPT are actually trained
to take a series of messages as input
and return a model-generated message as output. And
although the chat format is designed to make multi-turn conversations
like this easy, we've seen through the previous videos
that it's also just as useful for single-turn tasks without any conversation.
So next, we're going to define two helper functions. The
first is the one that we've been using throughout all the videos,
the get_completion function. But if you look at it, we pass
in a prompt, but then inside the function, what we're actually
doing is putting this prompt into what looks like a user
message. And this is because the ChatGPT model is a chat model,
which means it's trained to take a series of messages
as input and then return a model-generated message as output.
So the user message is the input, and
then the assistant message is the output.
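As a rough sketch, that earlier helper looks something like this; the openai package (v1 client style) and an OPENAI_API_KEY environment variable are assumptions here, and the wrapping step is split out so you can see it:

```python
# Sketch of the single-prompt helper used in earlier videos.
# Assumes the openai package (v1 client) and OPENAI_API_KEY are set.

def prompt_to_messages(prompt):
    """Wrap a bare prompt as a single user message in the chat format."""
    return [{"role": "user", "content": prompt}]

def get_completion(prompt, model="gpt-3.5-turbo", temperature=0):
    from openai import OpenAI  # deferred import; assumed installed
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=prompt_to_messages(prompt),  # the prompt becomes one user turn
        temperature=temperature,
    )
    return response.choices[0].message.content
```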
So, in this video, we're going to use
a different helper function: instead of
passing in a single prompt as input and getting
a single completion, we're going to pass in a list of messages. And these
messages can come from a variety
of different roles, so I'll describe those. So here's an example of
a list of messages. The first message is
a system message, which gives an overall instruction, and then after
this message, we have
turns between the user and the assistant. And this would
continue on. And if you've ever used ChatGPT, the
web interface, then
your messages are the user messages, and ChatGPT's
messages are the assistant messages. The system message helps
to set the behaviour and persona
of the assistant, and it acts as a high-level instruction
for the conversation. So you can think of it as whispering
in the assistant's ear and guiding its responses without the
user being aware of the system message. As the user, if
you've ever used ChatGPT, you probably don't know what's
in ChatGPT's system message, and that's the intention. The benefit
of the system message is that it provides you, the developer, with
a way to frame the conversation without making the
request itself part of the conversation: you can
guide the assistant and whisper in
its ear without making the user aware.
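A sketch of the new helper, assuming the openai v1 Python client and an OPENAI_API_KEY environment variable; the function name matches the one used in the notebook, and the call shape is the standard chat completions call:

```python
# Sketch of the multi-message helper.
# Assumes the openai package (v1 client) and OPENAI_API_KEY are set.

def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0):
    """Send a list of role-tagged messages and return the reply text."""
    from openai import OpenAI  # deferred so the sketch can be read offline
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,          # which chat model to use
        messages=messages,    # system / user / assistant turns, in order
        temperature=temperature,
    )
    return response.choices[0].message.content
```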
So, now let's try using these messages in a conversation.
We'll use our new helper function to get the
completion from the messages,
and we're also using a higher temperature.
The system message says, "You are an assistant
that speaks like Shakespeare." So this is us describing to
the assistant how it should behave. And then the first user message is,
"Tell me a joke." The next is, "Why did the chicken cross the road?"
And then the final user message is, "I don't know."
So if we run this,
the response is "To get to the other side." Let's try again:
"To get to the other side, fair sir, madam,
'tis an olden classic that never fails." So there's
our Shakespearean response.
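The Shakespeare conversation above, written out as a messages list (a sketch; the assistant turn is the joke setup from the video):

```python
# The Shakespeare example as a list of role-tagged messages.
messages = [
    {"role": "system", "content": "You are an assistant that speaks like Shakespeare."},
    {"role": "user", "content": "tell me a joke"},
    {"role": "assistant", "content": "Why did the chicken cross the road"},
    {"role": "user", "content": "I don't know"},
]
# response = get_completion_from_messages(messages, temperature=1)  # higher temperature
```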
And let's actually try one more thing, because
I want to make it even clearer that
this is the assistant message. Here, let's just
go and print the
entire message response.
So, just to make this even clearer, this
response is an assistant message: the role is "assistant",
and the content is the message itself. So that's
what's happening in this helper function; we're just
passing back the content of the message.
Now let's do another example. Here, the
system message is "You are a friendly chatbot", and the first
user message is "Hi, my name is Isa". And we want to
get the first assistant message. So let's execute this.
And the first assistant message is, "Hello Isa, it's nice to meet you. How
can I assist you today?"
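That exchange as code (a sketch; the commented-out reply is the one shown in the video):

```python
# Friendly-chatbot example: a system message plus one user turn.
messages = [
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "Hi, my name is Isa."},
]
# response = get_completion_from_messages(messages, temperature=1)
# e.g. "Hello Isa, it's nice to meet you. How can I assist you today?"
```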
Now, let's try another example.
Here our messages are the system message, "You're
a friendly chatbot", and the first user message is, "Yes,
can you remind me
what my name is?" And let's get the response.
And as you can see,
the model doesn't actually know my name.
Each conversation with a language model is a standalone
interaction, which means that you must provide all relevant
messages for the model to draw from in the current conversation.
If you want the model to draw from, or, quote-unquote,
"remember", earlier parts of a conversation,
you must provide the earlier exchanges in the
input to the model. We'll refer to this as context. So let's
try this.
So, now we've given the model the context that it needs, which
is my name, in the previous messages, and we'll ask
the same question, so we'll ask what my name is.
And the model is able to respond because it has
all of the context it needs in the list of
messages that we input to it.
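Concretely, giving the model context just means replaying the earlier turns in the input list. A sketch, with the assistant turn paraphrased from the earlier reply:

```python
# Same question as before, but now the earlier exchange is included.
messages = [
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "Hi, my name is Isa."},
    {"role": "assistant", "content": "Hi Isa! It's nice to meet you. "
                                     "How can I assist you today?"},
    {"role": "user", "content": "Yes, can you remind me what my name is?"},
]
# The name "Isa" now appears in an earlier turn, so the model can answer.
# response = get_completion_from_messages(messages, temperature=1)
```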
So now you're going to build your own chatbot.
This chatbot is going to be called OrderBot, and
we're going to automate the collection of
user prompts and assistant responses in order to build
it. OrderBot is going to take orders at a pizza restaurant, so
first we're going to define this helper function. What it
does is collect our user messages,
so we can avoid typing them in by hand
the way we did above. It will collect
prompts from a user interface that we'll build below,
append them to a list called context, and then call the model with
that context every time. The model's response
is then also added to the context: the
model message is added to the context, the user message is
added to the context, and so on, so it just grows longer and
longer.
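The bookkeeping that helper does can be sketched without any GUI. Here fake_model is a hypothetical stand-in for the real API call, so you can watch the context grow:

```python
# Sketch of OrderBot's bookkeeping. fake_model is a hypothetical stand-in
# for get_completion_from_messages so the accumulation logic runs offline.

context = [{"role": "system",
            "content": "You are OrderBot, an automated service to collect "
                       "orders for a pizza restaurant."}]

def fake_model(messages):
    """Stand-in for the real API call; echoes the last user turn."""
    return f"(reply to: {messages[-1]['content']})"

def collect_messages(prompt, model=fake_model):
    context.append({"role": "user", "content": prompt})         # user turn goes in
    response = model(context)                                   # model sees the whole history
    context.append({"role": "assistant", "content": response})  # model turn goes in too
    return response

collect_messages("Hi, I would like to order a pizza.")
collect_messages("How much are they?")
# context now holds: system + 2 user turns + 2 assistant turns = 5 messages
```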
This way the model has the information it needs to
determine what to do next. And so now we'll
set up and run this UI to display OrderBot.
Here's the context, and it contains the system message, which
contains the menu.
And note that every time we call the language model, we're
going to use the same context, and the context builds
up over time.
And then let's execute this.
Okay, I’m going to say, hi, I would like to order a pizza.
And the assistant says, great, what pizza would you like to order?
We have pepperoni, cheese, and eggplant pizza.
How much are they?
Great, okay, we have the prices.
I think I’m feeling a medium eggplant pizza.
So, as you can imagine, we could continue this conversation.
Let's take a look at what we've put in the system message.
So you are order bot, an automated service
to collect orders for a pizza restaurant.
You first greet the customer, then collect the order,
and then ask if it’s a pickup or delivery. You
wait to collect the entire order, then summarize it and check
for a final time if the customer wants
to add anything else. If it’s a delivery, you can
ask for an address. Finally, you collect the payment. Make sure
to clarify all options, extras, and sizes
to uniquely identify the item from the menu. You respond
in a short, very conversational, friendly style.
The menu includes, and then here we have the menu.
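Pieced together from the narration, the system message looks roughly like this; the menu lines are placeholders, since the video only mentions the pizzas, the fries, and water:

```python
# The OrderBot system message as read out above. The menu lines are
# illustrative placeholders, not the notebook's actual menu with prices.
system_message = """\
You are OrderBot, an automated service to collect orders for a pizza restaurant.
You first greet the customer, then collect the order, and then ask if it's
a pickup or delivery. You wait to collect the entire order, then summarize it
and check for a final time if the customer wants to add anything else.
If it's a delivery, you can ask for an address. Finally, you collect the payment.
Make sure to clarify all options, extras, and sizes to uniquely identify the
item from the menu. You respond in a short, very conversational, friendly style.
The menu includes:
pepperoni pizza, cheese pizza, eggplant pizza
fries (small or large)
bottled water
"""
context = [{"role": "system", "content": system_message}]
```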
So let's go back to our conversation, and let's see
if the assistant has been following the instructions.
Okay, great, the assistant asks if we want any toppings,
which we specified in the system message.
So I think we want no extra toppings.
"Sure thing. Is there anything else we'd
like to order?" Hmm,
let's get some water. Actually,
fries.
"Small or large?" And this is great, because we
asked the assistant in the system message to
clarify extras and sides.
And so you get the idea; please feel free
to play with this yourself. You can pause the video and just go
ahead and run this in your own notebook on the left.
And so now we can ask the model to create a JSON
summary that we could send to the order
system, based on the conversation.
So we're now appending another system message, which is an
instruction, and we're saying: create a JSON summary of
the previous food order; itemize the price for each item; the fields
should be 1) pizza, including size,
2) a list of toppings, 3) a list of drinks,
and 4) a list of sides, and finally the total price.
And you could also
use a user message here; this does not have to be a system
message.
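A sketch of that final step; the context list here is a tiny stand-in for the real conversation, so the example is self-contained:

```python
# 'context' would be the running conversation from the OrderBot session;
# here a tiny stand-in so the sketch runs on its own.
context = [
    {"role": "system", "content": "You are OrderBot, an automated pizza order service."},
    {"role": "user", "content": "I'd like a medium eggplant pizza, no toppings."},
    {"role": "assistant", "content": "Got it: one medium eggplant pizza."},
]

# Append the summarization instruction as one more system message.
messages = context + [{
    "role": "system",
    "content": "Create a JSON summary of the previous food order. "
               "Itemize the price for each item. The fields should be "
               "1) pizza, include size, 2) list of toppings, "
               "3) list of drinks, 4) list of sides, and finally the total price.",
}]
# response = get_completion_from_messages(messages, temperature=0)  # low temperature
```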
So let’s execute this.
And notice that in this case we're using a lower
temperature, because for these kinds of tasks we want
the output to be fairly predictable. For
a conversational agent you might want to use
a higher temperature; however, in this case I
would maybe use a lower temperature as well,
because for a customer assistant chatbot you might want
the output to be a bit more predictable.
And so here we have the summary of our order and so
we could submit this to the order system if we wanted to.
So there we have it: you've built your very own order chatbot. Feel
free to customize it yourself, and play around with the
system message to change the behavior of the chatbot and
get it to act as different personas
with different knowledge.