Why SDI ABAP for virtual access in SAP Data Warehouse Cloud should be avoided

In the past two years, I noticed one of the common mistakes when using SAP Data Warehouse Cloud is trying to use SDI ABAP connection for federation. Thus I am thinking of some simple examples to make everyone understand why it will not work.

First, it is strongly recommend everyone read SAP Data Warehouse Cloud : First Guidance: Data Integration for ABAP Source Systems. before starting the DWC project. However, the information there is so condense that it might take a while till you realize what it does mean exactly. One of the main points is Smart Data Integration (SDI) ABAP for virtual access in DWC should be avoided. as we know limitation of SDI ABAP: it supports SELECT, WHERE, TOP, or LIMIT but it does not support to push down joins and group by.

But what does it mean exactly to your source systems (like BW, S/4 HANA connect to DWC)? The key question in the end: How much data (No. of records , No.of columns and size)you need to transfer from the source systems to DWC to make your report work? If you do not model it in a proper way, it will be much larger than you could imagine.

How much data will be transferred to the cloud?

In this blog, three examples are demostrated.

	Business Scenario	Technique proofs
Example 1	Calculate how many times for each customer has booked the flight	Select from single table SBOOK, group by is not pushed down to the source. In a real case, it means one table of millions of records transferred to DWC and the result (No. of customers) are only several ten thousand.
Example 2	Calculate total of each airline company	Select from two tables SBOOK, SFLIGHT. Join is not pushed down to the source .In a real case, it means two huge tables of millions of records transferred to DWC to calculate around 10 to 20 records.
Example 3	Calculate total only one airline company	Select from two tables SBOOK, SFLIGHT. Filter is pushed down to the source to both tables, In a real case, this could help you reduce the records transferred back to DWC.
Example 1, 2 and 3		From all three examples, you will see in ABAP side, all the columns are also selected and read even they are not used. For some big tale like ACDOCA (more than 300 columns), this will also be a performance killer to the source system (S/4 or BW).

Architecture

The tables and views used in this blog could be referred as below picture. I use the Sflight mode, basically you could find this demo in each ABAP system those tables. And I am only using two tables Sbook and Sflight.

If you do not know how to configure the SDI ABAP connection. Here is SAP help document how to install, configure and connect Data Provision Agent to DWC.

each%20example%20in%20a%20summized%20architetury

Data Models for three examples

Here are the versions: DWC 2022.10.61, DP Agent 2.5.3.1. S/4 HANA: S/4H 2020.

In ABAP system transaction SE80, you could search package SAPBC_DATAMODEL. Then I check table SBOOK, it has 24 columns and 5 of them are keys.

Sbook%20Data%20Fields

Sbook Data Fields

Here is Sflight and it has 14 columns, 4 of them are keys.

Sflight

In order to make the examples are as simple as possible, we just use less than 10 records to illustrate the issue. But of course, if we are talking only about hundreds of records, replication or federation will not make a big difference regards performance. However, in a real business case, when it comes to hudred thousand to millions of records in the S/4 or BW systems. It will make a huge difference.

8 records in SFLIGHT

9 Records in SBooks

Tools used

Tools	Purpose
Database Explorer on HANA Cloud	Generate PlanViz, run SQL
ST05 Trace in ABAP on S/4 HANA	Track SQL trace triggered by Data Warehouse Cloud in the on premise system
ABAP Development Tool (Or similar Eclipse Tool) on your PC	Check the generated PlanViz Plan

Example 1 Calculate how many times each customer has booked the flight

First, let us have a look at the architecture. Sbook in S/4 HANA has 24 columns and 9 rows. Example 1 is only on this one table. It tries to get only 2 colmns and final result will be aggreaged to 3 rows.

data%20model%20example%201

data model example 1

Below are the definition of booking_customerID. It is simply a projection of SBOOK_Demo, which is the technique name I deployed for SBOOK as a remote table in DWC.

SELECT  "MANDT",  "CARRID",  "CONNID",  "FLDATE",  "BOOKID",  "CUSTOMID" FROM "SBOOK_Demo"

Then I tried to aggregate the data to count how many bookings each customer has.

select "CUSTOMID" , count ("BOOKID") as "No  of bookings" from "booking_customerID" group by "CUSTOMID"

In the data preview of the data model,

data%20preview%20in%20DWC

Data preview in DWC

you can see there is only 3 records. 3%20customers%20and%20their%20No.%20of%20bookings

3 customers and their No. of bookings

In an idea case, as this is from only one system and one table, you expect probably everything could be pushed down to the source for calculation then only 3 records are returned. So let us have an insight look via different tools.

Generate PlanViz in HANA Cloud for example 1

If you do not know what is Planviz, please check this blog The HANA PlanVisualizer (PlanViz) – Quick and Easy

To do this, you need to enough authorization to access the backend HANA Cloud system. You can check the database users and groups in the help document.

Open%20DB%20explorer

Open DB explorer in DWC

Here you can run similar SQL in DB explorer as the SQL generated by data preview. Pay attention that you need to add the data schema in DB Explorer (different than the SQL view you create in a space), which is a DWC space technique name.

generate%20PlanViz%20in%20DB%20Explorer

Generate PlanViz in DB Explorer

In above picture, you will see the result is the same as data preview. In case the query comes from SAC, you can trace the Ina query and use HANA built-in stored procedure SYS.EXECUTE_MDS to generate the PlanViz in HANA Cloud. I will not cover how to trace SAC in this blog. But you can follow steps in SAP note 2525191 – SAP Analytics Cloud Best Practice

Generate%20PlanViz

Generate PlanViz

Save Planviz locally, I tried to give a meaningful prefix of the name.

save%20Planviz

save Planviz

Now I am trying to open it locally on my PC with ABAP Development Tool in Eclipse. You can use your some Eclipse Tool like HANA Studio or Visual Studio Code (GitHub Repository)

here is how the PlanViz looks like

PlanViz%20of%20single%20table%20grouping

PlanViz of single table grouping

ST05 SQL Trace on S/4 HANA

Now let us have a look at what kind of SQL has been set to the ABAP system. Now you need to log into the S/4 system at the ABAP side. In case you do not know what is ST05 traces in ABAP, here is a classic blog (The SQL Trace (ST05): Quick and Easy) .

Run ST05 Performance Trace, Turn on the trace with filter. I am tracing the technique user of DWC and all servers. Important: ABAP SDI connect only on one applicaiton server. And it could be a different server than your current user log on. By default, you only trace current application server. If it is a different one than the DWC use, you can not get anything.

st05

st05 in ABAP

If you do not know which application server it use at DWC side, you could check below (In case you do not have authorization in DWC to check below, you need contact your admin) . Then in ABAP system, switch to this application server which you use to connect for DWC in below screen for traces.

Edit Connection in DWC

So the steps will be

Turn on ST05 trace in ABAP systems
Run data preview in DWC or similar SQL in DB explore.
Make sure you see the result and stop the ST05 trace in ABAP system
Check the ST05 Trace

You will now see a lot of entries in the traces but you can restrict to only see the SQL which access our table SBOOK or SFLIGHT.

Only%20check%20certain%20tables

Only check table SBOOK and SFLIGHT

Double click the SQL and you will see columns wise it is also not restricted. All 24 columns are read even the view only tries to read two columns.

SQL%20to%20read%20Sbook

SQL to read Sbook in ABAP

Here I can conclude the 1st example, group by and columns are not push down. Imagine if you use this for a very wide and big table. Even the final result are only small, the records returned could be big.

Example 2 Calculate total of each airline company

Here are the data models of example 2. We will see all the records and columns from two tables are transferred from S/4 HANA to DWC. The aggregations and joins cannot be pushed down, this will only happen in DWC after the records are transferred.

Data%20Model%20example%202

Data Model example 2

Definition of view flight_booking_basic, it is inner join and projection of two tables Sflight and Sbook.

SELECT
  "SBOOK_Demo"."MANDT",
  "SBOOK_Demo"."CARRID",
  "SBOOK_Demo"."CONNID",
  "SBOOK_Demo"."FLDATE",
  "SBOOK_Demo"."BOOKID",
  "SBOOK_Demo"."CUSTOMID",
  "SBOOK_Demo"."CUSTTYPE",
  "SBOOK_Demo"."CLASS",
  "SBOOK_Demo"."ORDER_DATE",
  "SBOOK_Demo"."COUNTER",
  "SBOOK_Demo"."AGENCYNUM",
  "SFLIGHT_Demo"."PRICE",
  "SFLIGHT_Demo"."CURRENCY",
  "SFLIGHT_Demo"."PLANETYPE",
  "SFLIGHT_Demo"."PAYMENTSUM"
FROM ("SBOOK_Demo" INNER JOIN "SFLIGHT_Demo" ON "SBOOK_Demo"."MANDT" = "SFLIGHT_Demo"."MANDT" AND "SBOOK_Demo"."CARRID" = "SFLIGHT_Demo"."CARRID" AND "SBOOK_Demo"."CONNID" = "SFLIGHT_Demo"."CONNID" AND "SBOOK_Demo"."FLDATE" = "SFLIGHT_Demo"."FLDATE")

Here is definition of view Total_Company, it calculate the total of all the airline companies.

In a real business case, calculation of earnings fits better to calculate LOCCURAM (price of booking in local currency airline). But just to demonstrate the technique aspect of joins and filters, I use PRICE in Sflight table to make it simple. (Date and currency conversions are not considered)

select "CARRID", sum("PRICE") as "Total" from "flight_booking_basic" GROUP BY "CARRID"

I will follow the same procudure for the traces as example 1

Turn on ST05 trace in ABAP systems
Run data preview of view “Total_company” in DWC or similar SQL in DB explorer.
Make sure you see the result and stop the ST05 trace in ABAP system
Check the ST05 Trace

Here is the result you see in data preview of DWC. 2 records with 2 columns.

Result%20of%20Total_Company

Result of Total_Company

Here you can see the planVIz, all the records from SFLIGHT and SBOOK are transferred back to Data Warehouse Cloud. Joins and Aggregations happen later in the DWC (HANA Cloud behind).

PlanViz example 2

In ABAP side from ST05 traces, you can see records (column: Sätze) selected and how many seconds (column: Dauer) it takes to read for each table.

Double click the SQL column, you will see unfortunately besides all records are scanned, all columns are also read.

SBOOK%20ABAP%20SQL

SBOOK ABAP SQL

SFLight%20ABAP%20SQL

SFLight ABAP SQL

Example 3 Calculate total only one airline company

In this example, I just run SQL command in DB Explore to add a filter (where condition) to view Total_Company to demonstrate you that the filter could be pushed down to the source systems and help reduce the records.

Here is how this SQL command and result looks like. I add CARRID as a filter in below SQL. It return in the end only one result.

Example%203%20SQL%20and%20result

Example 3 SQL and result

From the planViz, we can see the filter has been applied to the remote table scan. Instead of returning 8 and 9 records, here it returns 4 and 5 records after the filter happens in remote source.

PlanViz%20Example%203

PlanViz Example 3

In S/4 HANA, ABAP ST05 trace, we can again see the filter has been passed here.

Sflight%20with%20filter

Sflight with filter

Sbook%20with%20filter

Sbook with filter

Thus adding filters could help you reduce the records as much as possible. But it will also depends on the SQL cases, some complicated SQL filters can also not pushed down. But with above methods, you already learn how to verify in a real business scenario how it works for you.

Additional questions:

Another often coming questions are : shall I use the SDI HANA Adapter to connect to HANA DB directly because it has better push down? You can see from the help that select individual columns, join , agrgregations data via group by are supported.

First, You can again do some similar test. The differences are in HANA connection, you need to turn on HANA SQL traces instead of ABAP traces. For HANA connections, you can even have one more possiblity in addition to SDI for federation as below. Cloud Connector can be used both for federation (using Smart Data Access) or replication with data flows in DWC.

Secondly, even the SDI HANA adapter or SDA Cloud Connector has better push down, you can not also simply say that the HANA connection is better for a complete use cases. You need to evaluate your own cases based on different factors like security control, how to reuse authorizations, input parameters in S/4 HANA CDS views, limitations while generation HANA views from BW systems and many others.

Connect HANA on Premise to DWC

Conclusion:

SDI ABAP is for replication, not good for federation. If you are still not sure what to use, an assessement workshop/service together with SAP and customers are recommended.

Call for actions:

You can test your own data model with above methods. And test different senarios including

SDI ABAP using Data Provision Agent.

SDI HANA using Data Provision Agent.

SDA for HANA using Cloud Connector.

Your feedback is welcomed!

Tags: SAP Data Warehouse Cloud SAP HANA smart data integration

Sara Sampaio

Author Since: March 10, 2022

0 0 votes

Article Rating

0 Comments

Inline Feedbacks

View all comments

Business Scenario

Technique proofs