Qualtrics allows users to create surveys and generate reports without much programming effort. We have received many requests from customers who want to connect Qualtrics to their BTP setup, embed real-time feedback surveys in their application, and use a reporting tool to understand the survey results and make informed decisions in their projects. We implemented this scenario for an internal project at SAP, and customers and partners can use the setup below to make the most of the Qualtrics capabilities and aggregate all user feedback across various parameters. In our scenario we were monitoring the progress of our project.
Let's look at the architecture.
We have four components in the architecture:
- SAP BTP, SAP HANA Cloud
- A CAP Node.js application that exposes the Qualtrics table via an API (in our scenario the Node.js server acted as our application server)
- UI5/Work Zone as the frontend that calls the Qualtrics survey
- A Python application running on the Cloud Foundry platform that connects to the Qualtrics server, fetches the responses into the BTP HANA Cloud instance, and persists them as tables in an external schema. These tables are consumed by creating a user-provided service in your Node.js application server and are exposed to the UI.
1) How does it all work?
In your BTP UI5/Work Zone cards you can call the Qualtrics HTTP endpoint directly, or, if the Qualtrics endpoint is not accessible from BTP, use a destination service.
The UI5 application sends the responses collected from the user through the Qualtrics URL. Append the parameters on which you need to collect responses to the end of the Qualtrics survey URL; the aggregation and reporting on the responses is done based on these parameters.
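As a sketch, appending the parameters to the survey URL could look like the snippet below. The survey ID and the parameter names here are placeholders for illustration, not the values from our actual survey:

```javascript
// Sketch: build the Qualtrics survey URL with the parameters that will
// later drive aggregation and reporting. Survey ID and parameter names
// are placeholders - replace them with your own.
function buildSurveyUrl(baseUrl, params) {
  // Encode each key/value pair and append them as a query string
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `${baseUrl}?${query}`;
}

const surveyUrl = buildSurveyUrl(
  "https://fra1.qualtrics.com/jfe/form/SV_xxxxxxxx",
  { AccountID: "A123", ProjectID: "P456", PhaseID: "Design" }
);
// surveyUrl now carries the parameters that Qualtrics can capture as embedded data
```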
The BTP HANA Cloud instance connects with Qualtrics through an HTTP endpoint that is called from a Python script. We used OAuth for authentication, but for productive scenarios you should use passport or other methods to securely inject the client ID and credentials into the script.
2) Domain-based auth
In our current scenario we have a CAP application with a Node.js backend, and we have set up the roles in our HANA Cloud instance. We are not using xs-security.json to create role collections in BTP, because we would have to assign the roles manually and the collections get overwritten on each deployment. The security groups are also not propagated from the IAS instance back to our Cloud Foundry application.
Secondly, if you have partners, external suppliers, and employees using the same application, then based on the domain of the email address you can identify the user and direct them to the resources they are authorized to access. The code below runs on your CAP application and returns the logged-in user. Since Cloud Foundry uses an email address as the user's identity, it is important to be able to fetch the logged-in user in the CAP Node.js application.
const readJwt = function (req) {
  const authHeader = req.headers.authorization;
  if (authHeader) {
    // Strip the "Bearer " prefix to get the raw token
    const theJwtToken = authHeader.substring(7);
    if (theJwtToken) {
      // The payload is the second, base64-encoded segment of the JWT
      const jwtBase64Encoded = theJwtToken.split(".")[1];
      if (jwtBase64Encoded) {
        const jwtDecoded = Buffer.from(jwtBase64Encoded, "base64").toString("ascii");
        return JSON.parse(jwtDecoded);
      }
    }
  }
};
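Once the JWT payload is decoded, the email domain can drive the routing. Below is a minimal sketch of that idea; the domain names and the assumption that the decoded payload carries an `email` claim are placeholders for illustration:

```javascript
// Sketch: derive a user group from the email domain of the decoded JWT payload.
// The domain names below are placeholders for your own partner/supplier domains.
function resolveUserGroup(jwtPayload) {
  const email = jwtPayload && jwtPayload.email;
  if (!email) return "anonymous";
  const domain = email.substring(email.indexOf("@") + 1).toLowerCase();
  if (domain === "sap.com") return "employee";
  if (domain === "partner.example.com") return "partner";
  return "external";
}

// Example usage in a request handler:
// const jwtDecoded = readJwt(req);
// const group = resolveUserGroup(jwtDecoded); // direct the user accordingly
```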
3) Python application
I created a very simple Python Flask application. The application has one script file, QualtricsExportSurveyResponsesAPI.py, that needs to run every time a user fills in a feedback survey. You can also enhance the application by scheduling it to run through a job scheduler.
The application has the following structure:
My manifest.yml looks like this:
---
applications:
- name: qualtricssentiments
  routes:
  - route: sap-qualtricssentiments.cfapps.eu10.hana.ondemand.com
  path: ./
  memory: 128M
  buildpack: python_buildpack
  command: python QualtricsExportSurveyResponsesAPI.py
QualtricsExportSurveyResponsesAPI.py looks like this:
import sys
import requests
import zipfile, io
import getpass
import pandas as pd
import datetime
from dateutil.relativedelta import relativedelta
from os import listdir
### Define extraction parameters
## add your survey ID
surveyId = ""
dataCenter = 'fra1'
dataFormat = 'csv'
missingValues = -9
### Define today - 5 days in a Qualtrics-friendly format
today_time = datetime.datetime.now()
today_time = today_time + relativedelta(days=-5)
today_time = today_time.strftime("%Y-%m-%dT%H:%M:%SZ")
### enter secret and get bearer token; for productive scenarios use passport to inject the OAuth credentials securely instead of hardcoding them here
ClientSecret = '******'
ClientID = '*******'
baseUrl = "https://{0}.qualtrics.com/oauth2/token".format(dataCenter)
data = {'grant_type': 'client_credentials', 'scope': 'read:survey_responses'}
r = requests.post(baseUrl, auth=(ClientID, ClientSecret), data=data)
bearer = r.json()['access_token']
### initialize file extraction
requestCheckProgress = 0.0
progressStatus = "inProgress"
url = "https://{0}.qualtrics.com/API/v3/surveys/{1}/export-responses/".format(dataCenter, surveyId)
headers = {
"content-type": "application/json",
"authorization": "bearer " + bearer,
}
# Creating Data Export format
data = {
"format": dataFormat,
"seenUnansweredRecode": missingValues
# "startDate": today_time
}
downloadRequestResponse = requests.request("POST", url, json=data, headers=headers)
print(downloadRequestResponse.json())
try:
    progressId = downloadRequestResponse.json()["result"]["progressId"]
except KeyError:
    print(downloadRequestResponse.json())
    sys.exit(2)
isFile = None
### trigger export file creation and wait until status is 100%
while progressStatus != "complete" and progressStatus != "failed" and isFile is None:
    if isFile is None:
        print("file not ready")
    else:
        print("progressStatus=", progressStatus)
    requestCheckUrl = url + progressId
    requestCheckResponse = requests.request("GET", requestCheckUrl, headers=headers)
    try:
        isFile = requestCheckResponse.json()["result"]["fileId"]
    except KeyError:
        pass
    print(requestCheckResponse.json())
    requestCheckProgress = requestCheckResponse.json()["result"]["percentComplete"]
    print("Download is " + str(requestCheckProgress) + " complete")
    progressStatus = requestCheckResponse.json()["result"]["status"]
if progressStatus == "failed":
    raise Exception("export failed")
fileId = requestCheckResponse.json()["result"]["fileId"]
### download file + unzip into a temporary folder + read into dataframe
import tempfile
temp_dir = tempfile.TemporaryDirectory()
requestDownloadUrl = url + fileId + '/file'
requestDownload = requests.request("GET", requestDownloadUrl, headers=headers, stream=True)
zipfile.ZipFile(io.BytesIO(requestDownload.content)).extractall(temp_dir.name)
# skip the first two rows, which contain the header and column formats
df_responses = pd.read_csv(temp_dir.name + '/' + listdir(temp_dir.name)[0])
df_responses = df_responses.iloc[2:, :]
df_responses = df_responses[['ResponseId','Progress','Q1','Q18','Q8_1','Q8_2','Q8_3','Q2','Q17','AccountID','ProjectID','PhaseID']]
df_responses['Q1'] = df_responses['Q1'].fillna(-9)
df_responses['Q8_1'] = df_responses['Q8_1'].fillna(-9)
df_responses['Q8_2'] = df_responses['Q8_2'].fillna(-9)
df_responses['Q8_3'] = df_responses['Q8_3'].fillna(-9)
df_responses['Q2'] = df_responses['Q2'].fillna(-9)
df_responses['Q18'] = df_responses['Q18'].fillna('')
df_responses['Q17'] = df_responses['Q17'].fillna('')
df_responses['AccountID'] = df_responses['AccountID'].fillna('')
df_responses['ProjectID'] = df_responses['ProjectID'].fillna('')
df_responses['PhaseID'] = df_responses['PhaseID'].fillna('')
from hdbcli import dbapi
host = "*******"
port = "****"
username = '****'
password = '******'
connection = dbapi.connect(address=host, port=port, user=username, password=password)
table_name = '"DBADMIN"."TEST_QUALTRICS"'
records = df_responses.to_records(index=False)
print(records)
def prepare_insert_statement(df, destination):
    columns = df.columns
    columns_string = '"' + '", "'.join(columns) + '"'
    values_placeholder = ['?' for column in columns]
    values_placeholder_string = ', '.join(values_placeholder)
    statement = f'UPSERT {destination} ({columns_string.upper()}) VALUES ({values_placeholder_string}) WITH PRIMARY KEY'
    return statement
query_template = prepare_insert_statement(df_responses, table_name)
cursor = connection.cursor()
cursor.executemany(query_template, list(records))
cursor.close()
print("cursor closed")
connection.commit()
connection.close()
After you deploy this Python app on Cloud Foundry and run it, the app shows as crashed, but don't worry, it works!
If you have followed the tutorial, you should see the responses in the form of a HANA table like this:
4) Future enhancements for productive scenarios
The Python script can be scheduled to run regularly using the BTP Job Scheduling service. I would recommend having separate servers for the backend services of your application, which could run on a Java/Node.js server, with the Python application running on its own server (as we have done in this solution). You need to create a user-provided service to consume the artifacts from the Qualtrics schema in HANA Cloud from the HDI container that hosts your application data, and you need to create hdbroles so that the HDI runtime container user can access the external schema that holds the Qualtrics responses.
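As an illustration of the grant step, a common shape is an .hdbgrants file in the HDI project that references the user-provided service and assigns a role defined in the external schema to the container users. The service name `qualtrics-ups` and the role name `QUALTRICS_READER` below are assumptions for illustration, not names from our project:

```json
{
  "qualtrics-ups": {
    "object_owner": {
      "roles": ["QUALTRICS_READER"]
    },
    "application_user": {
      "roles": ["QUALTRICS_READER"]
    }
  }
}
```

Here `object_owner` covers design-time access (e.g. for synonyms pointing at the external tables) and `application_user` covers the runtime user of your application.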
Now the frontend can look like the screenshot below if you show the responses in the form of a chart in a Work Zone card:
Let me know if you find a better way of connecting to Qualtrics!
(Bonus points for enhancing the Python app 😉)