This is part of a series on exploring the SAP Analytics Cloud (SAC) data export API. When the series is complete, it will also be available as a single tutorial mission. The completed export API wrapper can be found on github.
We’ve taken a tour of SAC’s Model Export API and began constructing our Python API wrapper. We’ve marked out the foundation of the SACConnection class, made our initial connection to the tenant, acquired the OAuth token and read the provider list. Now we are going to handle model metadata and master data. Much like we used the SACProvider class to store the connection relevant metadata for our providers (models), we are going to create a class for managing metadata, master data and data of individual models. Specifically, we’ll want to differentiate different type of dimensions from one another and we’ll want to track their master data. We will want to track:
- Generic dimensions
- Date dimensions
- Measures
- Accounts
- Versions
We’ll create a dictionary for each of the different types of dimensions. The keys will be the dimension member ID and the values will be their descriptions. We’ll call this class ModelMetadata().
class ModelMetadata(object):
dimensions = {}
dateDimensions = {}
measures = []
accounts = {}
versions = {}
Unfortunately, while all of this information is available, it does not come in any nice, handy, prepackaged form. We’ll have to tease it out. There are two broad strategies that we can take, but first, we’ll take a quick tour of the OData EDMX document.
- In the OData EDMX document, there are a handful of EntityType elements. One of them is for defining the fact data. Its name attribute is “FactData”. Among others, is another for defining the master data and its name attribute is “MasterData”.
- Within the Fact Data EntityType element, there is a single Key element. Key has a number of child PropertyRef elements. The name attribute of each of these PropertyRef elements refers to the column name of a dimension. Measures do not show up in the Key element.
- Within the Fact Data EntityType element, there is a Property element for each column – measure or dimension – in the fact data. The name attribute is the name of the column. This information, combined with the Key element PropertyRef membership, tells us what our measures are.
- Within the Property element, there is one or more Annotation elements. Within the Fact Data EntityType element, they tend to occur once and are used for signifying the data type of the fact data column. In the Master Data EntityType element, they refer to the columns in the dimension definition for all the model’s dimensions and the Annotation elements annotation elements can occur multiple times.
- Each Annotation elements refers to a datatype. This is stored in a String child element. So the datatype of a column in the fact table is inside the relevant ../Annotation/String element for that column.
Here is the Fact Data EntityType element for the Best Run Juice demo model.
<EntityType Name="FactData">
<Key>
<PropertyRef Name="Account_BestRunJ_sold"/>
<PropertyRef Name="Store_3z2g5g06m4"/>
<PropertyRef Name="Location_4nm2e04531"/>
<PropertyRef Name="Product_3e315003an"/>
<PropertyRef Name="Sales_Manager__5w3m5d06b5"/>
<PropertyRef Name="Date_703i1904sd"/>
<PropertyRef Name="Version_BestRunJsold_V"/>
</Key>
<Property Name="Account_BestRunJ_sold" Type="Edm.String" MaxLength="256">
<Annotation Term="Integration.OriginalDataType">
<String>NVARCHAR</String>
</Annotation>
</Property>
<Property Name="Store_3z2g5g06m4" Type="Edm.String" MaxLength="256">
<Annotation Term="Integration.OriginalDataType">
<String>NVARCHAR</String>
</Annotation>
</Property>
<Property Name="Location_4nm2e04531" Type="Edm.String" MaxLength="256">
<Annotation Term="Integration.OriginalDataType">
<String>NVARCHAR</String>
</Annotation>
</Property>
<Property Name="Product_3e315003an" Type="Edm.String" MaxLength="256">
<Annotation Term="Integration.OriginalDataType">
<String>NVARCHAR</String>
</Annotation>
</Property>
<Property Name="Sales_Manager__5w3m5d06b5" Type="Edm.String" MaxLength="256">
<Annotation Term="Integration.OriginalDataType">
<String>NVARCHAR</String>
</Annotation>
</Property>
<Property Name="Date_703i1904sd" Type="Edm.String" MaxLength="6">
<Annotation Term="Integration.OriginalDataType">
<String>VARCHAR</String>
</Annotation>
</Property>
<Property Name="Version_BestRunJsold_V" Type="Edm.String" MaxLength="300">
<Annotation Term="Integration.OriginalDataType">
<String>NVARCHAR</String>
</Annotation>
</Property>
<Property Name="SignedData" Type="Edm.Decimal" MaxLength="32" Precision="31" Scale="7">
<Annotation Term="Integration.OriginalDataType">
<String>DECIMAL</String>
</Annotation>
</Property>
<Annotation Term="Integration.CDataID">
<String>sap.epm:BestRunJuice_SampleModel</String>
</Annotation>
</EntityType>
Strategies for determining dimension type
The Fact Data EntityType element is where we’ll start. Here, we can get a list of model dimensions. Unfortunately, the dimensions don’t give up what kind of dimension they are based on datatype. So after looping through Fact Data EntityType element, to determine which columns are measures and which are dimensions, you’ll do one of two things:
- After processing the Fact Data EntityType element, go to the Master Data EntityType element. Each dimension will have several Property elements, and each will have one or more Annotation elements. In the soup of ../Annotation/String elements, the dimension will tell us what it is. If it is a version dimension, at least one of the ../Annotation/String data types will be VERSION. If it is an account, then at least will one will be ACCOUNT_TYPE. If it is a date dimensions, then at least one will be DATE.
- A less computationally complex approach is to call the Masterdata endpoint and look in its response JSON. We have to call this endpoint anyway, to get the ID and Description pairs. ID and Description are not the only keys in the JSON entities. If it is a date, then there is a DATE key. If it is a version dimension, then there will be a VERSION key and accounts will have an accType key. This is the approach we will use.
Let’s now interrogate the OData EDMX document. Let’s call this method getModelMetadata(). Fetch the EDMX document, from the model’s $metadata endpoint. Parse the returned XML. This is why we imported minidom at the start of this coding project.
def getModelMetadata(self, providerID):
modelMetadata = ModelMetadata()
urlMetadata = self.urlProviderRoot + "/" + providerID + "/$metadata"
response = self.oauth.get(urlMetadata)
xmlData = minidom.parseString(response.text.encode("UTF-8"))
We want to select the Fact Data EntityType element. We’ll do this selection by opening a for loop for every EntityType element and then checking to see if the current one is named FactData.
for entityTypeElement in xmlData.getElementsByTagName("EntityType"):
nameAttribute = entityTypeElement.getAttribute("Name")
if nameAttribute.find("FactData") > -1:
# There will be more than one EntityType element, but only one named "FactData"
# all non-measure columns appear in the PropertyRef elements
In this element, the dimension columns are listed inside PropertyRef elements. Let’s find all of the dimension columns and list them in a list, called dimList. We can determine the kind of dimension later.
dimList = []
for propertyRefElement in entityTypeElement.getElementsByTagName("PropertyRef"):
prnAtt = propertyRefElement.getAttribute("Name")
dimList.append(prnAtt)
We’ll want to loop over the child Property elements and get the name of the Property element (i.e. the column name) of each column.
# Property elements include all columns
for propertyElement in entityTypeElement.getElementsByTagName("Property"):
prAtt = propertyElement.getAttribute("Name")
dataType = ""
Call the master data endpoint for each column and parse the response JSON.
urlCurrDimMetadata = self.urlProviderRoot + "/" + providerID + "/" + prAtt + "Master"
currDimResponse = self.oauth.get(urlCurrDimMetadata)
currDimResponseJson = json.loads(currDimResponse.text)
If the column is not in dimList, then it is a measure. If it is in dimList, then check the keys in the master data JSON entities, to see which kind of dimension it is.
if prAtt not in dimList:
# Measures and versions show up in the modelMetadata.measures list
if prAtt not in modelMetadata.dimensions:
modelMetadata.measures.append(prAtt)
else:
# modelMetadata.dateDimensions
# modelMetadata.accounts
# modelMetadata.versions
# modelMetadata.dimensions
isAccount = False
isVersion = False
isDate = False
mdMembers = {}
for cdMember in currDimResponseJson["value"]:
cmID = cdMember["ID"]
cmDesc = cdMember["Description"]
mdMembers[cmID] = cmDesc
if 'VERSION' in cdMember:
isVersion = True
elif "accType" in cdMember:
isAccount = True
elif "DATE" in cdMember:
isDate = True
if isAccount:
modelMetadata.accounts[prAtt] = mdMembers
elif isVersion:
modelMetadata.versions[prAtt] = mdMembers
elif isVersion:
modelMetadata.dateDimensions[prAtt] = mdMembers
else:
modelMetadata.dimensions[prAtt] = mdMembers
Store the model metadata and return
self.modelMetadata[providerID] = modelMetadata
return modelMetadata
The complete getModelMetadata() method.
def getModelMetadata(self, providerID):
try:
modelMetadata = ModelMetadata()
urlMetadata = self.urlProviderRoot + "/" + providerID + "/$metadata"
response = self.oauth.get(urlMetadata)
xmlData = minidom.parseString(response.text.encode("UTF-8"))
for entityTypeElement in xmlData.getElementsByTagName("EntityType"):
nameAttribute = entityTypeElement.getAttribute("Name")
if nameAttribute.find("FactData") > -1:
# There will be more than one EntityType element, but only one named "FactData"
# all non-measure columns appear in the PropertyRef elements
dimList = []
for propertyRefElement in entityTypeElement.getElementsByTagName("PropertyRef"):
prnAtt = propertyRefElement.getAttribute("Name")
dimList.append(prnAtt)
# Property elements include all columns
for propertyElement in entityTypeElement.getElementsByTagName("Property"):
prAtt = propertyElement.getAttribute("Name")
dataType = ""
# occurs oncce, so this little for loop will fetch us our only String grandchild of Property
for stringElement in propertyElement.getElementsByTagName("String"):
dataType = stringElement.childNodes[0].data
urlCurrDimMetadata = self.urlProviderRoot + "/" + providerID + "/" + prAtt + "Master"
currDimResponse = self.oauth.get(urlCurrDimMetadata)
currDimResponseJson = json.loads(currDimResponse.text)
# sort dimensions into the dimension dicts and measures into the measure list
if prAtt not in dimList:
# Measures and versions show up in the modelMetadata.measures list
if prAtt not in modelMetadata.dimensions:
modelMetadata.measures.append(prAtt)
else:
# modelMetadata.dateDimensions
# modelMetadata.accounts
# modelMetadata.versions
# modelMetadata.dimensions
isAccount = False
isVersion = False
isDate = False
mdMembers = {}
for cdMember in currDimResponseJson["value"]:
cmID = cdMember["ID"]
cmDesc = cdMember["Description"]
mdMembers[cmID] = cmDesc
if 'VERSION' in cdMember:
isVersion = True
elif "accType" in cdMember:
isAccount = True
elif "DATE" in cdMember:
isDate = True
if isAccount:
modelMetadata.accounts[prAtt] = mdMembers
elif isVersion:
modelMetadata.versions[prAtt] = mdMembers
elif isVersion:
modelMetadata.dateDimensions[prAtt] = mdMembers
else:
modelMetadata.dimensions[prAtt] = mdMembers
self.modelMetadata[providerID] = modelMetadata
dimList = list(modelMetadata.dimensions.keys())
self.addFilterProvider(providerID)
return modelMetadata
except Exception as e:
errorMsg = "Unknown error during token acquisition."
if e.status_code:
errorMsg = "%s Status code %s from server. %s" % (errorMsg, e.status_code, e.error)
raise RESTError(errorMsg)
else:
errorMsg = "%s %s" % (errorMsg, e.error)
raise Exception(errorMsg)
Next time, we’ll actually fetch the fact data.
Subscribe
Login
Please login to comment
0 Comments