Federated Analytics is an architecture pattern in which SAP Data Warehouse Cloud distributes queries directly to the source systems. It helps build richer live analytics in SAP Analytics Cloud that combine SAP and non-SAP data, and it eliminates the need to replicate or duplicate data.
SAP discovery center mission explaining this architecture pattern
Related blog explaining Data Federation between SAP Data Warehouse Cloud and Amazon Athena
In this blog, we walk you through a recent validation of this architecture carried out at Decathlon, an SAP strategic customer, for one of their business use cases.
Customer: Decathlon, a large France-based sporting goods retailer.
Business Use Case and Motivation:
Decathlon’s challenges with their current analytics solution included volume limits on data import and the inability to bring data from their hyperscaler sources live into charts. Decathlon has large amounts of actual and forecasted sales data, split across several tables and stored in Apache Parquet format in their Amazon S3 data lakes. The expected outcome of the proposed solution was to eliminate the import challenges and bring the data in live, enabling better comparative analytics of forecasted and actual sales on a story dashboard.
The Solution Architecture:
This use case is a direct fit for the Federated Analytics architecture, in which SAP Data Warehouse Cloud, SAP Analytics Cloud and Amazon Athena are integrated to create the analytics solution end to end.
Decathlon’s Amazon Athena environment exposed their sports equipment forecasted sales data and their historical weekly actual sales data as tables and views. These are queried in real time directly against the Parquet files in their Amazon S3 data lakes, roughly 40 million rows in all for this validation.
Amazon Athena is connected to SAP Data Warehouse Cloud, where remote tables are modeled to look up the Athena data.
Analytical models built on the remote tables transform, aggregate and project the data queried directly from Amazon Athena. The analytics story created in their external SAP Analytics Cloud tenant consumes these analytical models to bring in the data for rich visualizations.
Solution diagram showing the data federation architecture between SAP Data Warehouse Cloud and Amazon Athena
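To make the comparative analytics concrete, here is a minimal sketch of the kind of forecast-versus-actual aggregation that an analytical model could push down to the source. It is not Decathlon's model: the table names, column names and rows are invented for illustration, and Python's built-in `sqlite3` stands in for Amazon Athena so that the example runs anywhere.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
# sqlite3 stands in for Amazon Athena here; the SQL is deliberately generic.
SETUP_SQL = """
CREATE TABLE actual_sales   (week TEXT, product TEXT, units INTEGER);
CREATE TABLE forecast_sales (week TEXT, product TEXT, units INTEGER);
INSERT INTO actual_sales VALUES
    ('2021-W01', 'tent', 120), ('2021-W01', 'bike', 80), ('2021-W02', 'tent', 100);
INSERT INTO forecast_sales VALUES
    ('2021-W01', 'tent', 110), ('2021-W01', 'bike', 90), ('2021-W02', 'tent', 105);
"""

# Weekly totals of forecasted and actual units, plus the difference.
COMPARE_SQL = """
SELECT f.week,
       SUM(f.units)                AS forecast_units,
       SUM(a.units)                AS actual_units,
       SUM(a.units) - SUM(f.units) AS variance
FROM forecast_sales AS f
JOIN actual_sales   AS a
  ON a.week = f.week AND a.product = f.product
GROUP BY f.week
ORDER BY f.week
"""


def compare_weekly(conn):
    """Return (week, forecast_units, actual_units, variance) rows."""
    return conn.execute(COMPARE_SQL).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SETUP_SQL)
rows = compare_weekly(conn)
for row in rows:
    print(row)
```

In a federated setup the aggregation runs at the source, so only the small aggregated result travels back to the dashboard rather than the underlying rows.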
Validation Exercise:
The validation was executed by Decathlon’s business users and analytics users in a new trial SAP Data Warehouse Cloud tenant. This was their first time working with SAP Data Warehouse Cloud.
With the help of our initial architectural guidance, support and information from SAP blogs and missions, the analytics developers and business users at Decathlon were able to execute these phases in the PoC:
- Configuration on Amazon S3 and Amazon Athena to identify views/tables that needed to be queried
- Security policy configurations to allow integration from SAP Data Warehouse Cloud to query Athena
- Establishing trust in SAP Data Warehouse Cloud by uploading the AWS CA certificates to SAP Data Warehouse Cloud
- Creating remote tables and analytical models in SAP Data Warehouse Cloud
- Configuring a live connection from SAP Analytics Cloud to SAP Data Warehouse Cloud
- Creating analytical dashboards in SAP Analytics Cloud
- Monitoring performance at SAP Data Warehouse Cloud and remote queries at Amazon Athena
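As an illustration of the security policy step above, the sketch below builds a read-oriented IAM policy document of the kind typically attached to the AWS user that a remote system uses to query Athena. This is an assumption-laden example, not the policy Decathlon used: the bucket names are hypothetical, the function name `athena_read_policy` is invented, and the exact permissions required by SAP Data Warehouse Cloud should be taken from the SAP and AWS documentation.

```python
import json


def athena_read_policy(data_bucket, results_bucket):
    """Build an IAM policy document (as a dict) for querying Athena.

    data_bucket:    S3 bucket holding the Parquet files (hypothetical name).
    results_bucket: S3 bucket where Athena writes query results (hypothetical).
    """
    def bucket_arns(name):
        # Bucket-level and object-level ARNs are both needed.
        return [f"arn:aws:s3:::{name}", f"arn:aws:s3:::{name}/*"]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Start, poll, read and cancel Athena queries.
                "Sid": "RunAthenaQueries",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:StopQueryExecution",
                ],
                "Resource": "*",
            },
            {   # Read table and view metadata from the Glue Data Catalog.
                "Sid": "ReadCatalog",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetDatabases",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetPartitions",
                ],
                "Resource": "*",
            },
            {   # Read-only access to the data lake bucket.
                "Sid": "ReadDataLake",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arns(data_bucket),
            },
            {   # Athena must also write its result files somewhere.
                "Sid": "WriteQueryResults",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:AbortMultipartUpload",
                ],
                "Resource": bucket_arns(results_bucket),
            },
        ],
    }


policy = athena_read_policy("example-data-lake", "example-athena-results")
print(json.dumps(policy, indent=2))
```

Keeping the data lake statement read-only and confining writes to the results bucket reflects the federated approach: the source data is queried in place and never modified. In practice the `"*"` resources would usually be narrowed to specific workgroups and catalog objects.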
Result:
Decathlon completed the entire end-to-end architecture validation, from data source integration planning to the finished analytics dashboard, in just 4 weeks. They then iterated over the next 2 weeks to fine-tune, monitor and make diagnostic observations.
The end-to-end SAP Analytics Cloud story shows several comparative sales analysis charts, all of them bringing in live data through SAP Data Warehouse Cloud’s analytical models, which federate queries directly to Amazon Athena in real time.
Monitoring Performance:
End-to-end performance of individual queries was diagnosed and optimized, starting with the remote query monitor tool in SAP Data Warehouse Cloud and tracing each query through to Amazon Athena. This helped in reviewing data quality and applying optimizations to improve performance.
At the end of the 6-week PoC, here is a direct quote from the business users at Decathlon:
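On the Athena side, one way to support this kind of tracing is to look at the statistics Athena records for each query. The sketch below is a hedged example rather than the procedure used in the PoC: it assumes you already have query execution records shaped like the `QueryExecution` object returned by Athena's `GetQueryExecution` API, and the helper names and sample values are invented for illustration.

```python
def summarize(query_execution):
    """Condense one Athena QueryExecution record into a few key figures."""
    stats = query_execution.get("Statistics", {})
    return {
        "query_id": query_execution.get("QueryExecutionId"),
        "state": query_execution.get("Status", {}).get("State"),
        # Data scanned largely determines Athena cost and runtime.
        "scanned_mb": round(stats.get("DataScannedInBytes", 0) / (1024 * 1024), 2),
        "engine_seconds": stats.get("EngineExecutionTimeInMillis", 0) / 1000,
        "queue_seconds": stats.get("QueryQueueTimeInMillis", 0) / 1000,
    }


def slowest(executions, top=3):
    """Return summaries of the slowest queries, longest engine time first."""
    summaries = [summarize(e) for e in executions]
    return sorted(summaries, key=lambda s: s["engine_seconds"], reverse=True)[:top]


# Invented sample records, shaped like Athena's QueryExecution object.
sample = [
    {
        "QueryExecutionId": "query-a",
        "Status": {"State": "SUCCEEDED"},
        "Statistics": {
            "DataScannedInBytes": 52428800,
            "EngineExecutionTimeInMillis": 4200,
            "QueryQueueTimeInMillis": 150,
        },
    },
    {
        "QueryExecutionId": "query-b",
        "Status": {"State": "SUCCEEDED"},
        "Statistics": {
            "DataScannedInBytes": 1048576,
            "EngineExecutionTimeInMillis": 900,
            "QueryQueueTimeInMillis": 50,
        },
    },
]

for summary in slowest(sample):
    print(summary)
```

Queries that scan far more data than their result needs are natural candidates for optimization, for example by partitioning the Parquet files or by selecting fewer columns in the view.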
“
SAP Data Warehouse Cloud enabled us to increase our time to market, from idea to final story, by removing time consuming steps in data preparation.
We were able to consume data where this data is located without duplicated the source of information.
The process was simple and straight forward without even having any training in DWC and with the help of few documents and wiki’s, we were able to create a robust pipeline of information that creates rapid value for our users.
We see in Data Warehouse Cloud an extension of our SAC initiative that goes beyond our expectation.
In the upcoming weeks, we are going to explore other data sources and increase our experience with DWC and SAC.
”
The customer has since planned to expand their validation exercises to other use cases that involve data from sources such as Amazon Redshift, SAP BW on HANA and Google BigQuery.
Art of the Possible:
The data federation architecture in SAP Data Warehouse Cloud can be leveraged to provide real-time access to external hyperscaler sources such as Amazon Athena, Amazon Redshift, Google BigQuery and Azure Data Explorer, and to combine that data with business-critical SAP data to deliver powerful insights without duplicating any data.
For step-by-step guidance on implementing use cases, follow the SAP Discovery Mission.