GPT is a great technology when it comes to understanding the semantics and context of statements given by the user. With prompt engineering, we can make its answers more assertive by providing clear, effective instructions together with the context on which the answer should be based. It's one of the most important parts of developing custom applications using GPT, and as of today this is how we provide custom context to the model. You can find more about prompt engineering and best practices in this documentation from OpenAI.
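As a toy illustration of what "providing context" means in practice (the template, context, and question below are my own placeholders, not values used later in this blog), a context-grounded prompt can be assembled like this:

```python
# A minimal, hypothetical prompt template: the model is instructed to answer
# only from the supplied context, which is how we ground it in custom data.
def build_prompt(context: str, question: str) -> str:
    header = ('Answer the question as truthfully as possible using the '
              'provided context, and if the answer is not contained within '
              'the text below, say "I don\'t know."')
    return header + "\n\nContext:\n" + context + "\n\nQ: " + question + "\nA:"

prompt = build_prompt("JFK is an airport in New York.",
                      "What are the airports available in New York?")
print(prompt)
```

Later in this blog, the pipeline builds a prompt with this same overall shape, but fills the context in automatically from the most relevant embedded documents.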
Prerequisites
This blog is part of the blog series “Connecting the dots – Using SAP Business Technology Platform to make GPT a better product”. This is the second blog of this series. If you have not checked the first blog, “Connecting the dots – Using SAP Data Intelligence to generate embeddings with custom data in OpenAI”, I strongly advise you to do so before moving forward.
- AWS S3 bucket or another file storage system containing the embeddings file generated in the previous blog
- SAP Data Intelligence tenant
- Advanced Python knowledge
- SAP Data Intelligence knowledge
- Basic JavaScript knowledge
- REST API and Postman knowledge
Creating a REST API in SAP Data Intelligence
For this second blog, we will create a new pipeline. Before we add any complexity involving OpenAI and Python operators, it is important to understand how the OpenAPI Servlow operator works. This operator provides a server-side endpoint according to the OpenAPI specification (not to be confused with OpenAI) that we define in the operator. The best way to fully understand it, in my opinion, is to use the OpenAPI Greeter sample graph that SAP provides within the tenant. Below you can find an overview of the graph needed for this scenario:
In the OpenAPI Servlow configuration, insert the below path as the base path for it:
/openAI/
To specify the endpoints and methods available in our REST API, add the JSON below in the Swagger specification field:
{
    "swagger": "2.0",
    "info": {
        "contact": {
            "email": "vora@www.sap.com"
        },
        "description": "This is a vFlow greeter demo.",
        "license": {
            "name": "Apache 2.0",
            "url": "http://www.apache.org/licenses/LICENSE-2.0.html"
        },
        "termsOfService": "http://www.sap.com/vora/terms/",
        "title": "vFlow greeter demo API",
        "version": "1.1.0"
    },
    "schemes": [
        "http",
        "https"
    ],
    "basePath": "/openAI",
    "paths": {
        "/completion": {
            "post": {
                "consumes": [
                    "application/json"
                ],
                "description": "",
                "operationId": "completion",
                "parameters": [
                    {
                        "description": "A message",
                        "in": "body",
                        "name": "body",
                        "required": true,
                        "schema": {
                            "type": "string"
                        }
                    }
                ],
                "produces": [
                    "application/json"
                ],
                "responses": {
                    "200": {
                        "description": "",
                        "schema": {
                            "$ref": "#/definitions/echoReply"
                        }
                    }
                },
                "summary": ""
            }
        }
    }
}
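Since the operator only accepts a valid JSON specification, it can help to sanity-check the document locally before pasting it into the field. A quick sketch (the spec text below is an abbreviated stand-in for the full specification above, for illustration only):

```python
import json

# Abbreviated copy of the Swagger specification, kept just to the fields
# this check looks at -- not the full document from the blog.
spec_text = """
{"swagger": "2.0", "basePath": "/openAI",
 "paths": {"/completion": {"post": {"operationId": "completion"}}}}
"""

spec = json.loads(spec_text)  # raises ValueError if the JSON is malformed

# The base path must match the one configured on the operator, and each
# path/method pair in "paths" becomes a callable endpoint.
print(spec["basePath"], list(spec["paths"].keys()))
```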
While this pipeline is running, the operator listens for HTTPS requests. Once a request is received, it outputs a message containing the body and attributes of the request. The attributes of this output message are used to identify the request while the pipeline is executing, and they are required to send a reply to the client that made the request. Another requirement is that the JavaScript operator is the one that sends the reply to the client. More information about the OpenAPI Servlow operator can be found in the excellent blogs that Ian Henry and Jens Rannacher have written. Before adding the Python operator, first add the JavaScript operator and run some tests to check that you get a reply echoing the body. Use the JavaScript code below in the operator:
$.setPortCallback("input", onInput);

function isByteArray(data) {
    return (typeof data === 'object' && Array.isArray(data)
        && data.length > 0 && typeof data[0] === 'number');
}

function sendResponse(s, m, e) {
    if ($.output === null) {
        // invoke the callback directly
        $.sendResponse(s, m, e);
    } else {
        // let the subsequent operator decide what to do
        if (e !== null) {
            m.Attributes["message.response.error"] = e;
        }
        $.output(m);
    }
}

function onInput(ctx, s) {
    var msg = {};
    var inbody = s.Body;
    var inattributes = s.Attributes;
    // convert the body into a string if it is bytes
    if (isByteArray(inbody)) {
        inbody = String.fromCharCode.apply(null, inbody);
    }
    // prepare a response message
    msg.Attributes = {};
    for (var key in inattributes) {
        // only copy the headers that won't interfere with the receiving operators
        if (key.indexOf("openapi.header") < 0 || key === "openapi.header.x-request-key") {
            msg.Attributes[key] = inattributes[key];
        }
    }
    msg.Body = {"openAI": inbody};
    sendResponse(s, msg, null);
}
Now, we will add the Python operator between the OpenAPI Servlow and the JavaScript operators. Just as in the first blog, this Python operator uses the same Dockerfile and tags so that the necessary Python libraries are available within it.
Now we will add some magic to this pipeline again. The script below embeds the query sent by the user in the API request. We retrieve the CSV file that we embedded in the previous blog so that we can apply cosine similarity to identify the data that most relates to the user query. The most similar contexts are then added to the prompt sent to the completions API from OpenAI. Finally, after we receive a reply from the OpenAI completions API, the response body is added to the reply message and passed on to the JavaScript operator to be returned to the client.
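The similarity step relies on a handy property: OpenAI embeddings are normalized to length 1, so the dot product of two embeddings equals their cosine similarity, and we can skip dividing by the vector norms. A small sketch with made-up vectors (not real embeddings) shows the two values coincide:

```python
import numpy as np

# Two made-up, already unit-length vectors standing in for embeddings.
a = np.array([0.6, 0.8])
b = np.array([0.8, 0.6])

dot = float(np.dot(a, b))
cosine = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# For unit-length vectors the dot product and cosine similarity are equal.
print(dot, cosine)  # both 0.96
```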
import os
import json
from io import StringIO

import openai
import pandas as pd
import boto3
import tiktoken
import numpy as np

os.environ["OPENAI_API_KEY"] = 'YOUR_OPENAI_API_KEY'
openai.api_key = os.getenv("OPENAI_API_KEY")

session = boto3.Session(
    aws_access_key_id='<YOUR_S3_ACCESS_KEY>',
    aws_secret_access_key='<YOUR_S3_SECRET_ACCESS_KEY>',
)
bucket_session = session.client('s3')

def get_csv_document(bucket: str, key: str):
    return pd.read_csv(bucket_session.get_object(Bucket=bucket, Key=key).get("Body"))

def vector_similarity(x: list[float], y: list[float]) -> float:
    """
    Returns the similarity between two vectors.
    Because OpenAI embeddings are normalized to length 1, the cosine similarity
    is the same as the dot product.
    """
    return np.dot(np.array(x), np.array(y))

def order_document_sections_by_query_similarity(query: str, contexts: dict[str, np.array]) -> list[tuple[float, str]]:
    """
    Find the query embedding for the supplied query, and compare it against all of
    the pre-calculated document embeddings to find the most relevant sections.
    Return the list of document sections, sorted by relevance in descending order.
    """
    query_embedding = get_embedding(query)
    document_similarities = sorted([
        (vector_similarity(query_embedding, doc_embedding), doc_index)
        for doc_index, doc_embedding in contexts.items()
        if vector_similarity(query_embedding, doc_embedding) > 0.8
    ], reverse=True)
    return document_similarities

def get_embedding(text: str, model: str = "text-embedding-ada-002"):
    result = openai.Embedding.create(
        model=model,
        input=text
    )
    return result["data"][0]["embedding"]

def construct_prompt(question: str, context_embeddings: dict, df: pd.DataFrame) -> str:
    """
    Fetch the most relevant document sections and build the completion prompt.
    """
    most_relevant_document_sections = order_document_sections_by_query_similarity(question, context_embeddings)
    chosen_sections = []
    chosen_sections_len = 0
    chosen_sections_indexes = []
    MAX_SECTION_LEN = 500
    SEPARATOR = "\n* "
    ENCODING = "gpt2"  # encoding for text-davinci-003
    encoding = tiktoken.get_encoding(ENCODING)
    separator_len = len(encoding.encode(SEPARATOR))
    df = df.set_index("AIRPORT")
    for _, section_index in most_relevant_document_sections:
        # Add contexts until we run out of space.
        document_section = df.loc[section_index]
        chosen_sections_len += len(encoding.encode("Country: " + document_section.COUNTRY + " | City: " + document_section.CITY + " | Airport: " + section_index)) + separator_len
        if chosen_sections_len > MAX_SECTION_LEN:
            break
        chosen_sections.append(SEPARATOR + "Country: " + document_section.COUNTRY + " | City: " + document_section.CITY + " | Airport: " + section_index)
        chosen_sections_indexes.append(str(section_index))
    # Useful diagnostic information
    print(f"Selected {len(chosen_sections)} document sections:")
    print("\n".join(chosen_sections_indexes))
    header = """Answer the question as truthfully as possible using the provided context, and if the answer is not contained within the text below, say "I don't know."\n\nContext:\n"""
    return header + "".join(chosen_sections) + "\n\n Q: " + question + "\n A:"

def answer_query_with_context(
    query: str,
    df: pd.DataFrame,
    document_embeddings: dict[str, np.array],
    show_prompt: bool = False
) -> str:
    prompt = construct_prompt(
        query,
        document_embeddings,
        df
    )
    if show_prompt:
        print(prompt)
    COMPLETIONS_API_PARAMS = {
        # A lower temperature (e.g. 0.0) gives more predictable, factual answers.
        "temperature": 0.7,
        "max_tokens": 300,
        "model": "text-davinci-003",
    }
    response = openai.Completion.create(
        prompt=prompt,
        **COMPLETIONS_API_PARAMS
    )
    return response

def query_callback(input):
    inputDict = json.loads(input.body)
    airports = get_csv_document('<YOUR_S3_BUCKET>', '<YOUR_S3_OBJECT_WITH_DATA_NAME>')
    embedding = get_csv_document('<YOUR_S3_BUCKET>', '<YOUR_S3_OBJECT_WITH_EMBEDDINGS_NAME>')
    # It's important to include input.body within the output message, so that the
    # OpenAPI Servlow operator recognizes the request message and replies to the
    # correct client.
    message = {}
    answer = answer_query_with_context(str(inputDict["query"]), airports, embedding)
    message["body"] = input.body
    message["completion"] = answer
    input_message_attributes = input.attributes
    api.send("indexStr", api.Message(message, input_message_attributes))

api.set_port_callback(["inputStr"], query_callback)
Now we can run some tests using Postman. After starting the pipeline, and while it is running, you can use Postman to test the endpoint. Send a POST request to the URL below:
https://<YOUR_DI_HOST>/app/pipeline-modeler/openapi/service/<YOUR_OPENAPI_SERVLOW_BASE_PATH>/<YOUR_ENDPOINT_AS_DESCRIBED_IN_SWAGGER_SPECIFICATION>
This endpoint requires basic authentication. In the Authorization area of Postman, provide the tenant and the user that you use to log in to the SAP Data Intelligence tenant, in the format described below. You can verify this information by clicking on your profile in SAP Data Intelligence:
Add the JSON below to the body of the request; you can try different queries against the API:
{
"query":"What are the airports available in New York?"
}
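For a scripted alternative to Postman, the same request can be sketched in Python. The host, tenant, user, and password below are placeholders for your own values; the basic-auth user combines tenant and user name (here assumed in the `tenant\user` format, as shown in the blog's screenshot):

```python
import base64
import json

# Placeholder values -- substitute your own tenant details.
host = "<YOUR_DI_HOST>"
tenant, user, password = "default", "my-user", "my-password"

# Base path and endpoint as defined in the Swagger specification above.
url = f"https://{host}/app/pipeline-modeler/openapi/service/openAI/completion"

# Basic auth: base64 of "tenant\user:password".
token = base64.b64encode(f"{tenant}\\{user}:{password}".encode()).decode()
headers = {
    "Authorization": f"Basic {token}",
    "Content-Type": "application/json",
}
body = json.dumps({"query": "What are the airports available in New York?"})

# With the requests library installed, the call itself would be:
# requests.post(url, headers=headers, data=body)
```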
If everything is correct, you should get the response below, containing the two airport codes that are included in the context we provided in the prompt to the completions API after processing in the Python operator in SAP Data Intelligence:
For now, we were able to inject custom data context into our prompt to OpenAI's completions API. That is just the beginning of making OpenAI a better product using SAP Business Technology Platform services. The next step will be to provide OpenAI contextual information about tasks so that we can automate them… Stay tuned!