ERGIN TUGANAY, PARTNER, HEAD OF INDUSTRY 4.0, April 21, 2021
The importance of data to business is constantly growing. As the amount of data increases, the accessibility of data services and a smooth user experience will become a lifeline for companies in the future. Therefore, the implementation of data platforms is increasingly moving to modern cloud-based and API-based architectures.
A typical data project issue is well described in “Gnomes”, the 17th episode of the second season of South Park, which aired in 1998 and has since gained cult status. In that episode, South Park heroes Stan, Kyle, Cartman, and Kenny, and their buddy Tweek, find that their underwear keeps disappearing. Eventually, the boys discover a community of small gnomes snatching their underpants with money and wealth in mind. When asked about the genius of their business plan, the underpants gnomes describe their idea as follows: Phase 1: collect underpants; Phase 2: ?; Phase 3: profit.
The absence of Phase 2 indicates that the gnomes did not have an actual plan for monetizing the sets of underwear they cunningly collected.
Incomplete business plan. Picture: Wikipedia (“Gnomes” South Park)
As with the South Park underpants gnomes, the lack of a Phase 2 has been a typical problem in data platform projects at many real companies. Specific challenges faced in data platform projects include:
Compared to many other fields of engineering, information systems are still at an early stage. For example, bridge construction, one of the oldest engineering fields, is known to date back to 850 B.C. By contrast, the first steps of digital information systems date back to 1935, when Alan Turing’s works were published at the University of Cambridge.
The Data Warehouse and Data Lake concepts related to Big Data and Data Platform issues are even younger within the information systems industry, almost in their infancy. For example, the term ‘Big Data’ did not appear in academic publications until 1997. Likewise, the first widely used products and services, such as Apache Hadoop, Amazon S3, Google Cloud Storage, and Azure Data Lake, did not become common until the 2010s.
One of the basic rules of engineering is that knowledge and precision increase through experience, which is to say, through trial and error. This is also reflected in the maturity of the data architectures implemented so far.
Bridge construction skills date back about 3,000 years, but the first steps in digital information systems were taken at the University of Cambridge only in 1935.
Although the concepts under the term Data Platform have barely reached their early teens, no company wants to stand still and wait for the industry to evolve. Indeed, several companies have set off with a strong “data is the new oil” agenda, intending to provide different user communities with a buffet of data on a familiar self-service principle.
This arms race among companies has been fueled by technology vendors’ bold marketing promises about the quick profits achievable with analytics based on AI and machine learning. Although service providers sometimes make somewhat unrealistic promises, the reality is undeniable: in the future, data and its exploitation will inevitably play an even greater role in improving the competitiveness of companies. For many companies, data and its skillful utilization are already standard practice.
Most of today’s data warehousing projects are still built on a centralized organization and a monolithic architecture. The key components of such an architecture are:
Typically, projects like this meet the very traditional DW, reporting, and analytics needs of organizations. However, the data platform architecture of the future must meet the needs of operational monitoring and real-time analytics as well. The need for real-time applications in production is a growing trend, especially in industrial companies. In the past, this need has been met by various applications from traditional automation vendors. In the future, however, application development in production plants will increasingly focus on modern hybrid and cloud-based data platforms:
“The data platform architectures of the future must meet the needs of not only reporting but also operational monitoring and real-time analytics.”
The examples mentioned above present challenges for traditional data warehousing architecture, which was shaped purely by DW and business analytics needs. As a result, new, more comprehensive data platform architectures and the technologies that support them are emerging, and some have already arrived.
One milestone of the new approach came in spring 2019, when Zhamak Dehghani published her views on the Data Mesh concept. In a Data Mesh architecture, organizations move from their monolithic data warehouses to logically distributed data services implemented according to microservices principles, just as modern application architectures have already moved from monolithic systems to microservices and APIs.
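To make the idea of logically distributed data services more concrete, the following is a minimal sketch and not a reference implementation of Data Mesh. It assumes each business domain publishes its own data product behind a small read interface and that consumers discover products through a shared catalog. The names `DataProduct`, `DataCatalog`, and the two example domains are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass(frozen=True)
class DataProduct:
    """A domain-owned data service: a name, an owning domain, a read interface."""
    name: str
    owner_domain: str
    read: Callable[[], List[Record]]


class DataCatalog:
    """Lets consumers find data products; it stores no data itself."""

    def __init__(self) -> None:
        self._products: Dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        if product.name in self._products:
            raise ValueError(f"duplicate data product: {product.name}")
        self._products[product.name] = product

    def get(self, name: str) -> DataProduct:
        return self._products[name]

    def by_domain(self, domain: str) -> List[str]:
        return sorted(
            p.name for p in self._products.values() if p.owner_domain == domain
        )


# Hypothetical domain teams, each serving its own data (hard-coded here).
def read_production_orders() -> List[Record]:
    return [
        {"order_id": 1, "line": "A", "units": 120},
        {"order_id": 2, "line": "B", "units": 80},
    ]


def read_maintenance_events() -> List[Record]:
    return [{"line": "A", "event": "bearing replaced"}]


catalog = DataCatalog()
catalog.register(
    DataProduct("production.orders", "production", read_production_orders)
)
catalog.register(
    DataProduct("maintenance.events", "maintenance", read_maintenance_events)
)

# A consumer reads from the owning domain's service, not a central warehouse.
total_units = sum(r["units"] for r in catalog.get("production.orders").read())
```

In this sketch the catalog only holds references to the services, so ownership of the data and its interface stays with each domain team.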
Gartner represents the same architectural mindset. For example, at the 2020 Gartner IT Symposium/Xpo, held not long after the publication of Dehghani’s article, almost every presentation was based on two themes: Composable Business Architecture and Packaged Business Capabilities.
PBC (Packaged Business Capability) has become the most popular theme based on Zhamak Dehghani’s views on the Data Mesh concept.
In Gartner’s view, all business applications can in the future be implemented as a collection of packaged digital business capabilities. These capabilities can then be recombined with each other repeatedly, which in turn creates entirely new capabilities.
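The recombination idea can be illustrated with a small sketch. It assumes, purely for illustration, that every packaged capability accepts and returns the same kind of payload, so that capabilities can be chained freely. The `compose` helper and the two order-related capabilities are hypothetical and do not come from Gartner’s material.

```python
from typing import Callable, Dict

Payload = Dict[str, int]
Capability = Callable[[Payload], Payload]


def compose(*capabilities: Capability) -> Capability:
    """Chain capabilities left to right into one new capability."""

    def combined(payload: Payload) -> Payload:
        for capability in capabilities:
            payload = capability(payload)
        return payload

    return combined


# Hypothetical packaged capabilities; amounts are integer cents.
def price_order(order: Payload) -> Payload:
    return {**order, "total_cents": order["units"] * order["unit_price_cents"]}


def apply_volume_discount(order: Payload) -> Payload:
    percent = 10 if order["units"] >= 100 else 0
    return {**order, "total_cents": order["total_cents"] * (100 - percent) // 100}


# A new capability built only by recombining the existing ones.
quote_order = compose(price_order, apply_volume_discount)
quote = quote_order({"units": 100, "unit_price_cents": 200})
```

Because each capability shares one interface, a new one such as `quote_order` is created by composition alone, without changing the parts it is built from.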
Both concepts, the Data Mesh presented by Zhamak Dehghani and Gartner’s Composable Business Architecture, are firmly based on modern application development and integration principles. A distributed architecture and an API-based ecosystem approach play a significant role in both. It can be said that future data platform projects will comply with the following principles:
Will the data warehouse based on centralized relational database technologies continue to be at the heart of corporate information management, and what are the respective roles of the data warehouse and, for example, the API platform that provides digital services? Today, both have their distinct place in the architecture, but we believe this distinction will narrow in many ways in the future.
No doubt, we will soon see more and more companies move from their centralized, monolithic relational data warehouses to distributed API architectures. Productized, API-enabled data services present clear and understandable business capabilities for various needs: from monthly management reporting to real-time dashboards for factory production lines and mobile app notifications for maintenance workers. And why not also for the underpants gnomes, as they tighten up the details of their important underwear business?
At Nortal, we specialize in the design and implementation of both modern application development and data platforms. We also consult on challenging business ideas to bring about their success.