Does everything have to be “real time”?
In my last blog post on connecting source and target systems to the CDP, I summarized how connectivity is achieved from the SAP CDP perspective. We saw that the Connector Library already contains a number of predefined connectors into both SAP and non-SAP systems. In addition, the Connector Studio enables the connection of systems for which no connector is available yet within the CDP, but which have a REST interface. The Connector Studio provides all the means to connect the APIs of these source and target systems without complex implementation. We also saw that connecting the source systems, from which the events come, poses challenges in terms of “real-time” integration, since not every source system is inherently ready to communicate its customer events directly to a consuming CDP system. Ideally, the source system has a push mechanism that immediately passes an incoming event to the API of the CDP. If this is not the case, the CDP has to “pull” these events. If you would like to read it again, you can find the blog post here.
Today I would like to take a closer look at the connection of a source system from an architectural perspective. We have already seen that there are several ways to connect the source system, and which of these options is the best depends on the use case.
Let’s start with the wishful thinking that the source system has a connector that automatically and immediately informs the SAP CDP when a user event occurs. For example, a customer orders something in the commerce system, which in turn reports this event directly, with a well-defined payload, to the CDP’s API. That would be nice, but in most cases it is just wishful thinking that a system vendor, especially outside of SAP, has an SAP CDP connector in their portfolio out of the box. Or can you imagine that a manufacturer of such a source system has an interest in providing such a connector explicitly for SAP and bearing the necessary implementation costs? At best, this will be a custom development that will certainly not find its way into the standard. OK, when integrating an SAP product, you might expect a direct connector on the source system side, but even there, it is still not common…
Send notifications or fire actions after events with webhooks
Then we come to a more realistic case: the source system can fire a webhook toward SAP CDP. First, what is a webhook anyway? Wikipedia has the following definition:
“A webhook in web development is a method of augmenting or altering the behavior of a web page or web application with custom callbacks. These callbacks may be maintained, modified, and managed by third-party users and developers who may not necessarily be affiliated with the originating website or application.” (source: Wikipedia)
That’s something like:
Source system: “Hey, CDP, I have something for you about the customer Joe User”
SAP CDP: “Ok, thanks, I’ll ask your REST API what’s new about Joe!”
SAP CDP: “Well, Source System REST API, I’m SAP CDP. Please give me Joe User’s latest customer activities”
Source system: “Hi SAP CDP. Confirmed, it’s you. Here are the details about Joe…”
This is exactly what webhooks are for: they deliver incoming events of any kind on the source system side and transmit them to the consuming target systems (such as a CDP) in near real-time. More precisely, there are two types of webhooks:
- those that inform the target system that there is a new event and that the target system should fetch this information via a secure (and possibly authenticated) channel. This is especially important if the data to be obtained by the target system is confidential and should only be retrieved over a secure channel. The above example about Joe User illustrates this.
- and those that fully describe the event in their payload and that can be directly processed by the target system. This means that all information about the event is already available in the payload of the webhook and no further request to the source system is required. For example, Qualtrics, Marketo or the Commerce Cloud (CCv2 2108) fire such webhooks, which can be consumed directly by the SAP CDP without any further request.
SAP Customer Data Platform supports this feature for both types — that is, we can include webhooks that contain all data, but also those that require enrichment.
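To make the two webhook types concrete, here is a minimal sketch in Python. It is not SAP CDP code: the `Webhook` shape, the field names, and the `fake_source_api` stand-in are all hypothetical and only illustrate the difference between a webhook that merely notifies and one that carries the full payload.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Webhook:
    """Hypothetical webhook shape; the field names are illustrative only."""
    event_type: str
    customer_id: str
    payload: Optional[dict] = None  # None means "notification only"


def resolve_event(hook: Webhook, fetch_details: Callable[[str], dict]) -> dict:
    """Return the full event, calling back to the source only when needed."""
    if hook.payload is not None:
        # Second type: the webhook already describes the event completely.
        details = hook.payload
    else:
        # First type: the receiver asks the source system for the details,
        # typically over a secure, authenticated channel.
        details = fetch_details(hook.customer_id)
    return {"type": hook.event_type, "customer": hook.customer_id, "details": details}


def fake_source_api(customer_id: str) -> dict:
    """Stand-in for the source system's REST API (assumption for this sketch)."""
    return {"latest_activity": "order placed", "customer": customer_id}


notification_only = Webhook(event_type="activity", customer_id="joe.user")
full_payload = Webhook(event_type="order", customer_id="joe.user",
                       payload={"order_id": "A-1"})

print(resolve_event(notification_only, fake_source_api))
print(resolve_event(full_payload, fake_source_api))
```

In the first call the receiver has to go back to the source system, as in the Joe User dialog above; in the second call the payload is used as delivered.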
Offline connection versus near-real-time interactions
Often, however, a source system is not even willing to fire a webhook, and merely provides the API through which a consuming system like the CDP can fetch the respective information. From the CDP’s point of view, however, this means that it must knock on the source system’s door at regular intervals and ask if there is anything new… be it for the customer Joe or Anna or Tanja or for all of them. And the source system then tells you that Tanja placed an order 10 minutes ago. OK, the CDP can also knock on the door every ten seconds and ask. That would move the integration of the system in question closer to a real-time mode. However, having a neighbor checking with you every ten seconds can be really annoying… in other words, the more often we check, the more we flood the network with requests that may return nothing new. Also, not all source systems allow a call every 10 seconds; some have very low rate limits, so the polling schedule should take the source system’s limits into account.
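The trade-off between polling frequency and rate limits can be put into a few lines of Python. This is only a sketch: the function name and the idea of expressing the limit as "calls per hour" are assumptions for illustration, not SAP CDP settings.

```python
def polling_interval_seconds(desired_seconds: float, max_calls_per_hour: int) -> float:
    """Return the shortest polling interval that respects the source's rate limit."""
    if max_calls_per_hour <= 0:
        raise ValueError("max_calls_per_hour must be positive")
    # The rate limit dictates the minimum spacing between two calls.
    min_allowed = 3600 / max_calls_per_hour
    return max(desired_seconds, min_allowed)


# We would like to ask every 10 seconds, but the source allows 60 calls per hour.
print(polling_interval_seconds(10, 60))
# A more generous source (1,000 calls per hour) lets us keep the 10 seconds.
print(polling_interval_seconds(10, 1000))
```

With a limit of 60 calls per hour, the wished-for 10 seconds turn into one call per minute, which is exactly the point where "real time" starts to slip away.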
Conversely, the CDP can also ask only once a day, but then the notion of “real time” is quite distant. In this case, the CDP simply assumes that at the end of the day, the source system will provide a file on a drive that compiles all the daily activities of the customers in the form of, for example, a CSV-formatted list that is fetched via batch-driven SFTP access. Such “offline connections” are not uncommon: sending groups of people (audiences, possibly with their activities) to email campaigns or ad campaigns does not require real-time connectivity. And data warehouses are usually connected in the same way because time criticality is not the decisive factor for their use cases. For them, for example, it is important to have a daily updated overall picture of all orders to deduce how business will develop in the coming weeks and months.
Put simply, anything that is connected to SAP CDP via batch-driven offline integration cannot be real-time, because you are asking the source system to collect data over a period of time. Accordingly, the concept of real time does not apply to batch, but batch is perfectly adequate for use cases that are “audience” driven.
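As an illustration of such a daily file, the following Python sketch groups the rows of a CSV export by customer. The column names and the sample rows are invented for this example, and the SFTP transfer itself is left out; the sketch starts where the file content is already available as text.

```python
import csv
import io
from typing import Dict, List

# Invented sample of a daily export; a real file would arrive via SFTP.
SAMPLE_EXPORT = """customer_id,activity,timestamp
joe.user,order,2024-01-15T09:30:00
tanja,order,2024-01-15T17:45:00
joe.user,login,2024-01-15T18:02:00
"""


def group_activities(csv_text: str) -> Dict[str, List[dict]]:
    """Group the rows of a daily activity export by customer."""
    grouped: Dict[str, List[dict]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped.setdefault(row["customer_id"], []).append(
            {"activity": row["activity"], "timestamp": row["timestamp"]}
        )
    return grouped


daily = group_activities(SAMPLE_EXPORT)
for customer, activities in daily.items():
    print(customer, len(activities))
```

All of Joe’s and Tanja’s activities of the day arrive together, hours after they happened, which is fine for audience-driven use cases and useless for immediate reactions.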
Offline or near real-time: What does an architecture look like now?
In terms of architecture, the use cases determine the conceptual path for integrating source systems.
There are use cases where the company, for example, wants to be informed immediately that its VIP customer Joe User has just made a very negative comment about the service and is well on the way to terminating his loyalty to the company. This information is so important that it cannot wait for a batch process to transfer it at some point. So, for near-real-time use cases, an event-driven architecture is used. For that, webhooks (or an event notification system) are preferred, which allow subsequent events to be triggered at SAP CDP. In most cases, SAP CDP receives a single event. But when there is a peak, let’s say 1,000 logins in a minute, SAP CDP will likely receive 1,000 separate webhook event notifications in that minute, depending on the source system. Fortunately, SAP CDP can handle such a high load and is prepared for “many notifications per second.” This is one of the strengths of SAP Customer Data Platform.
There are also some use cases where “near real-time” can mean 1 to 2 minutes rather than 1 to 2 seconds, such as sending emails or SMS, or other use cases where customers normally expect a short delay. These are different from use cases that need the data immediately (e.g. personalization, next best step, etc.). In these cases, even a scheduled batch query is acceptable, with a minimum interval of 1 minute.
On the other hand, we have seen that batch processes largely serve offline integrations. These serve use cases that call for high performance but do not have demanding integration requirements, where the real-time factor only plays a subordinate role. And finally, the term “offline” may be a bit misleading: batch processes that query source systems at intervals of several minutes already come close to a near-real-time customer experience.
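The decision described in this section can be summarized as a small rule of thumb. The following Python sketch is only an illustration: the 60-second boundary follows the one-minute minimum for scheduled queries mentioned above, while the one-hour boundary between scheduled polling and a daily file is an arbitrary assumption.

```python
def choose_integration(max_delay_seconds: float) -> str:
    """Pick an integration style from the delay a use case can tolerate."""
    if max_delay_seconds <= 0:
        raise ValueError("max_delay_seconds must be positive")
    if max_delay_seconds < 60:
        # Needs the data right away, e.g. personalization or next best step.
        return "webhook"
    if max_delay_seconds < 3600:
        # A short delay is fine, e.g. sending an email or SMS.
        return "scheduled_polling"
    # Audience- or reporting-driven use cases, e.g. a data warehouse.
    return "batch_file"


print(choose_integration(2))
print(choose_integration(120))
print(choose_integration(24 * 3600))
```

In practice the capabilities of the source system narrow this choice further: a system that cannot fire webhooks leaves only the second and third option.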
SAP CDP API and the “Direct applications”
It is not always just servers as source systems that deliver customer data; it can also be applications that access SAP CDP directly and deliver their user data there. This means that SAP CDP does not only request customer data from a source system on its own initiative; source systems can also contact the SAP CDP APIs on their own initiative and deliver their full payload there, if they are authorized to do so.
In SAP CDP, “Direct Applications” provide a “low-code/no-code” approach, which means that individual customer applications and touchpoints (mobile apps on smartphones or smart TVs) can also be integrated into SAP CDP. Two things are necessary for this: the app must successfully log on to SAP CDP, and it must be able to provide a unique ID of the user. If these apps can be uniquely assigned to a person, SAP CDP can store customer data and activity from each touchpoint using unique identifiers and identity resolution. For example, it is conceivable that a mobile app that has a unique device ID or phone number could send customer activity directly to SAP CDP for profile enrichment. Or Joe User’s smart TV could report when it reaches its 10,000th hour of operation, assuming he has mapped the TV’s serial number to his customer profile. SAP CDP has advanced identity resolution rules that can be configured to use any identifier(s) that contribute to identity resolution, including device IDs, email addresses, identity IDs, CRM IDs, or randomized email addresses.
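The idea behind matching an incoming activity to a profile via an identifier can be shown with a toy example in Python. This is not how SAP CDP implements identity resolution; the `ProfileStore` class, its methods, and the sample identifiers are hypothetical and only show why the TV’s serial number must be mapped to Joe’s profile first.

```python
from typing import Dict, List, Optional, Tuple

Identifier = Tuple[str, str]  # (kind, value), e.g. ("device_id", "123ABC")


class ProfileStore:
    """Toy profile store that resolves activities by known identifiers."""

    def __init__(self) -> None:
        self._profile_by_identifier: Dict[Identifier, str] = {}
        self.activities: Dict[str, List[dict]] = {}

    def link(self, profile_id: str, kind: str, value: str) -> None:
        """Register an identifier (email, device ID, ...) for a profile."""
        self._profile_by_identifier[(kind, value)] = profile_id

    def ingest(self, kind: str, value: str, activity: dict) -> Optional[str]:
        """Store the activity on the matching profile; None if nothing matches."""
        profile_id = self._profile_by_identifier.get((kind, value))
        if profile_id is None:
            return None
        self.activities.setdefault(profile_id, []).append(activity)
        return profile_id


store = ProfileStore()
store.link("joe.user", "email", "joe@example.com")
store.link("joe.user", "device_id", "123ABC")  # Joe mapped his TV's serial number

event = {"type": "runtime_threshold", "hours": 10000}
matched = store.ingest("device_id", "123ABC", event)
unmatched = store.ingest("device_id", "UNKNOWN", event)
print(matched, unmatched)
```

The event from the known device lands on Joe’s profile, while the event from an unknown device cannot be assigned to anyone.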
Which of these integration options fits into an architecture remains a question of the capabilities of the source system and the underlying use cases. In any case, the integration approaches described make it possible to exchange data at almost any point in time at which it is required.
Combine integrations with extensions
These are the options we have from the SAP CDP side to provide source system integration. Perhaps there will be more options over time, but at this stage these are the most important ones. Let’s move on to another aspect of integration that involves externally generated events: the enrichment of this data by third-party systems. For this, SAP CDP provides a mechanism of “extensions.”
Extensions are interception points in the processing of incoming customer activities. Let’s stay with the example of Joe User’s TV set. We receive information, from whatever source, that Joe’s TV with serial number “S/N: 123ABC” has exceeded the threshold of 10,000 hours of runtime. For the further processing of the event in the CDP, it would now be of great interest whether this event needs to be considered in connection with the warranty. For example, if Joe has a ten-year warranty on the TV and the TV manufacturer knows that the first units generally become defective after 10,000 hours, then this information may be helpful in offering Joe a new unit at a lower price instead of a repair under warranty, even before the defect actually occurs. The incoming information about the running time of the TV alone is not sufficient for this customer journey. Instead, when this information arrives, an external system is contacted via an extension, which adds the contract data (such as Joe’s warranty conditions) to this event, and only then is the event stored in SAP CDP. Extensions are the interfaces in the CDP process that listen for specific incoming events and, when such events occur, can first query external systems for additional information that enriches the events before they are further processed in the customer’s profiles and Activity Indicators.
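The enrichment step can be sketched in a few lines of Python. Again, this is an illustration and not the SAP CDP extension API: the function names, the event fields, and the `fake_contract_system` stand-in for the external system are assumptions made for this example.

```python
from typing import Callable


def warranty_extension(event: dict, lookup_warranty: Callable[[str], dict]) -> dict:
    """Enrich runtime-threshold events with contract data before they are stored."""
    if event.get("type") != "runtime_threshold":
        # The extension only listens for one specific kind of event.
        return event
    enriched = dict(event)  # leave the incoming event untouched
    enriched["warranty"] = lookup_warranty(event["serial_number"])
    return enriched


def fake_contract_system(serial_number: str) -> dict:
    """Stand-in for the external system that knows Joe's warranty conditions."""
    return {"serial_number": serial_number, "warranty_years": 10}


incoming = {"type": "runtime_threshold", "serial_number": "123ABC", "hours": 10000}
stored = warranty_extension(incoming, fake_contract_system)
other = warranty_extension({"type": "login"}, fake_contract_system)

print(stored)
print(other)
```

Only the enriched event, including the warranty data, would then be processed further; events the extension is not interested in pass through unchanged.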
In summary…
How a data source for customer information and customer activities should be integrated into the CDP always remains a question of the use case. The criticality of the incoming information may justify real-time integration, but in many cases near real-time to offline integrations are sufficient (Joe’s TV is unlikely to give up the ghost just a few seconds after the 10,000-hour threshold). From this perspective, the use cases for CDP will become increasingly complex and exciting — to the benefit of the customer.
(Credits to Gheerish for his excellent remarks and contributions to the discussion!)
More about SAP CDP:
Customer Data Platform: The Core Element for Customer Information
The Real 360° Customer View – SAP Customer Data Platform in Action
Data Governance and Compliance in CDP
Unlocking Value in CX Development: The Way to the Perfect Set of Use Cases