Introduction
The term Big Data is attributed to the American sociologist Dr. Charles Tilly, who used it in 1980 in his article “The old-new social history and the new old social history”, although in a context different from today’s. In 1997, two NASA researchers, Michael Cox and David Ellsworth, published the article “Application-controlled demand paging for out-of-core visualization”, in which the term was used for the first time in its current technological sense. Since then it has gained considerable prominence, boosted most recently by the COVID-19 pandemic, which led many people to work from home. Big Data is now a widely discussed technology topic. It refers to the large and complex volumes of data produced every day, at millisecond intervals, as information is captured and shared around the world. The process involves large corporations, social networks and, more recently, the application of Artificial Intelligence, together with the growing use of smartphones with easy Internet access in many countries (50 billion connected devices by 2020), which makes it possible to send and receive information at very high speed. This data is stored in structured, semi-structured and unstructured formats, often at a level of complexity that traditional databases and applications struggle to process, so specific technologies must be used and improved to process it properly.
Key Features of Big Data and SAP Technologies
According to Anurag Agrahari and Dharmaji Rao (2017), Big Data is characterized by 7 Vs: Volume, Velocity, Variety, Value, Variability, Visualization, and Veracity. Because of the large volume of information captured, the data collected during the acquisition phase needs to be stored in different locations. It is organized either in structured models, arranged in rows and columns, or in unstructured models, typically videos, images, comments, emails or audio common on social networks, which cannot be arranged in rows and columns. These types of storage are a major challenge for information technology professionals, because data that lacks the components needed for identification and interpretation is often difficult to access or retrieve. Nevertheless, this information shapes the future of business and attracts large companies that want to base their strategic decisions on Big Data and use it to guide their plans and services.
Given how the collected data is stored and the strategic appeal it offers, it is also necessary to understand the challenge the corporate world faces in finding professionals qualified to handle this type of information. Because Big Data goes beyond the traditional Information Technology concept of storage, the future of business depends on the willingness of companies to equip professionals with the right skills for this new Information Technology context. Here SAP professionals in particular play an important role, whether within consultancies or working directly with customers, compiling large amounts of data and providing information and data analysis mechanisms through new tools.
Among the new SAP tools available, BW/4HANA is considered the core of the SAP big data platform. It is a data warehouse package based on SAP HANA and serves as the on-premise data warehouse layer of SAP’s Business Technology Platform (BTP). SAP consultants use it to consolidate data from across the enterprise and provide a more consistent and simplified view as a source of new insights in real time. Its in-memory data architecture uses a distributed columnar format to handle large data sets. Because this format is held entirely in memory, the SAP professional can perform additional complex calculations, data-intensive functions and operations directly on the content in the database, using a rich SQL layer and patented indexing, without the need for aggregations that can be time-consuming and costly.
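The idea behind a columnar layout can be illustrated with a small sketch. The example below is not SAP HANA code and does not reflect how HANA is implemented internally; it is a toy Python illustration, with invented sales records and hypothetical function names, of why keeping each column in its own contiguous list lets a total be computed on the fly from only the columns involved, rather than from a pre-built aggregate table.

```python
from typing import Dict, List

# Hypothetical sales records, invented purely for illustration.
rows = [
    {"region": "North", "product": "A", "revenue": 120.0},
    {"region": "South", "product": "B", "revenue": 80.0},
    {"region": "North", "product": "B", "revenue": 200.0},
    {"region": "South", "product": "A", "revenue": 50.0},
]


def to_columns(records: List[dict]) -> Dict[str, list]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, list] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def revenue_by_region(columns: Dict[str, list]) -> Dict[str, float]:
    """Sum revenue per region, reading only the two columns involved."""
    totals: Dict[str, float] = {}
    for region, revenue in zip(columns["region"], columns["revenue"]):
        totals[region] = totals.get(region, 0.0) + revenue
    return totals


columns = to_columns(rows)
totals = revenue_by_region(columns)
print(totals)  # {'North': 320.0, 'South': 130.0}
```

In this sketch the "product" column is never read when computing the totals, which is the property that makes columnar scans attractive for analytical queries over wide tables.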
In addition to BW/4HANA, SAP HANA can also host data from other information sources such as Hadoop, extending its processing range. Hadoop provides flexible and spacious storage for data objects, regardless of their structure, that are too large to fit in memory and therefore require a pre-processing step before they can be analyzed. By connecting SAP HANA to Hadoop, the SAP consultant can run jobs on Hadoop, load the resulting information into HANA and provide a final, reliable analysis to the customer. Another technology, SAP BusinessObjects BI, can also bring together Hadoop and database sources to supply information to business applications. These technologies work together to provide complex, strategic analysis that can generate significant results for the company. In addition, a new processing approach called Intelligent Enterprise, in which artificial intelligence is used to automate systems through Machine Learning, uses Big Data as a source of information and is showing promise in the technology market.
Big Data management depends on systems powerful enough to process and analyze large amounts of complex and varied information in a meaningful way. In this sense, artificial intelligence has a reciprocal and fundamental relationship with Big Data, both in organizing it and in putting it to practical use for analysis. Machine Learning is enriched by Big Data and uses its algorithms to classify incoming data and identify patterns in that data as insights. The more robust the datasets analyzed, the greater the opportunity for the system to learn, evolve and continually adapt its processes and analytics, presenting these insights to inform and support business decisions and to automate processes.