Until recently, real-time access to mainframe data and information by distributed applications and cloud users has been difficult. Access rights have to be established; data has to be protected; data integrity has to be preserved – and these precautions have slowed access to real-time back-end data (in effect, a great big wall has stood between the distributed application/cloud worlds and back-end mainframe data). The workaround has been to copy that data (extract/transform/load – ETL) – and then query it separately. And, of course, ETL takes away real-time access to mainframe data.
With no easy way to share real-time mainframe data while keeping it secure, IBM long struggled to share vital, real-time mainframe information with the applications and cloud users who need it. That is, until recently. With the announcement of its Z Digital Integration Hub (zDIH), IBM can now provide access to back-end mainframe data, making it possible for distributed applications and cloud users to reach a subset of relevant information – enough data to perform queries in real time and generate various events based on that information.
New use cases for this real-time hub are now starting to evolve – and early results show that this new digital integration hub does indeed provide access to vital data in real time without compromising data security or data integrity. Interest in the hub is growing – and IBM has also shared that, due to increased cloud adoption, inquiry traffic is escalating and becoming more unpredictable. IBM tells us that it is seeing increased adoption of event-based architectures that feed on real-time data. And customers are realizing that cloud applications don’t really need all the raw data, but usually data subsets or composed aggregates.
A closer look
Batch and transactional data workloads still reside largely under mainframe auspices. And most business analysts still run their applications off-line using copies of mainframe data extracts – thus losing the important element of real-time information. To address this “offload” situation, IBM has created a hub (a data and application integration environment) that resides on an IBM System Z and places the needed subsets of production data – or information derived from production data – in main memory, where business analysts, downstream applications, and users in public, private, or hybrid clouds can easily access it. Using this hub, mainframe data becomes transparently accessible to cloud users, and that data does not have to be moved (it remains under mainframe control). IBM’s Z Digital Integration Hub lets users easily access relevant business information from their familiar cloud environments while the raw data remains rigidly protected by mainframe security controls.
The data subset created by the hub consists of production, batch, and transactional information that is computed, composed, and derived from data and applications residing on the mainframe – and the hub places that data into main memory. (This means that users don’t need to wade through the vast volumes of raw data produced by production applications. Instead, the data has been abstracted into core data contexts that allow a multitude of applications to use up-to-the-minute production information in real time.)
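To make this abstraction step concrete, here is a minimal sketch of reducing raw transactional records to a composed per-account aggregate. The schema, field names, and aggregation logic are hypothetical illustrations of the idea, not part of IBM's offering (the hub's actual integration applications are Java-based and far richer):

```python
from collections import defaultdict

# Hypothetical raw transaction records, as a production application might emit them.
raw_transactions = [
    {"account": "A-100", "amount": 250.00, "type": "credit"},
    {"account": "A-100", "amount": 75.50,  "type": "debit"},
    {"account": "B-200", "amount": 1200.00, "type": "credit"},
]

def compose_aggregates(transactions):
    """Reduce raw rows to the per-account summary a consuming app actually needs."""
    summary = defaultdict(lambda: {"count": 0, "net": 0.0})
    for tx in transactions:
        entry = summary[tx["account"]]
        entry["count"] += 1
        sign = 1 if tx["type"] == "credit" else -1
        entry["net"] += sign * tx["amount"]
    return dict(summary)

# Consumers see only this composed subset, never the full raw transaction stream.
aggregates = compose_aggregates(raw_transactions)
```

A cloud application querying the hub would receive something like the `aggregates` structure – a small, current business view – rather than the underlying transaction log.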
Now, with a curated subset of data available, the question becomes: how to present this data to cloud users? To do this, IBM takes an open-interface approach, using standards-based application program interfaces (APIs) to interface with zDIH instances optimized for individual core systems each time information is needed. Using these open interfaces (including REST, JDBC, ODBC, and Kafka-based events), cloud users (Amazon Web Services, OpenShift, Google Cloud, Oracle Cloud, IBM Cloud, Azure, etc.) can use the tools and programs already resident in their environments to leverage mainframe-resident information – without having to move that data.
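As a rough sketch of what the REST path might look like from a cloud consumer's side, the snippet below builds (but does not send) a query against a hub-exposed cache. The endpoint URL, resource name, and token handling are all assumptions for illustration – the real interface is defined by the zDIH instance configured for a given core system:

```python
import urllib.request

# Hypothetical zDIH REST endpoint; the actual URL and resource names would
# come from the hub configuration for the core system in question.
BASE_URL = "https://zdih.example.com/api/v1"

def build_query_request(resource, params):
    """Build a GET request against a hub-exposed cache (sending is omitted here)."""
    query = "&".join(f"{k}={v}" for k, v in sorted(params.items()))
    return urllib.request.Request(
        url=f"{BASE_URL}/{resource}?{query}",
        headers={
            "Accept": "application/json",
            "Authorization": "Bearer <token>",  # token acquisition not shown
        },
        method="GET",
    )

req = build_query_request("account-positions", {"account": "A-100"})
```

A JDBC or ODBC consumer would instead issue SQL against the hub's in-memory engine, and a Kafka consumer would subscribe to event topics – the point being that each of these is a standard client the cloud environment already has.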
An in-memory compute and SQL engine (GridGain for z/OS) manages this hub structure, residing at the mainframe level. Data at rest is virtualized at the mainframe level using IBM DVM for z/OS. Together, these two components – combined with Java-based applications that integrate with core systems of record to populate the intra-day caches – allow users to perform recurring pulls of data at small time intervals; to write application exits/events directly to zDIH or to a z/OS Logstream; and to use change capture of raw data to drive a recompute of business logic. In other words, the data is prepared and managed at the hub level (it is abstracted and placed into formats and data stores that can more efficiently serve consuming applications) – and users can then pull that composed information and massage it as needed to drive a variety of use cases.
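The recurring pull-and-recompute pattern described above can be sketched in a few lines. Everything here is illustrative: `fetch_raw_changes` stands in for the Java integration layer reading from the system of record, and the cache and loop are simplified placeholders for the hub's intra-day caches:

```python
import time

# Stand-in for a core-system read; in the real hub this is done by Java-based
# integration applications against the system of record. (Names hypothetical.)
def fetch_raw_changes():
    return [{"region": "EU", "orders": 3}, {"region": "US", "orders": 5}]

class IntradayCache:
    """Minimal sketch of an intra-day cache refreshed by recurring pulls."""
    def __init__(self):
        self.totals = {}

    def refresh(self):
        # Recompute the derived business view from the latest raw changes.
        for row in fetch_raw_changes():
            self.totals[row["region"]] = (
                self.totals.get(row["region"], 0) + row["orders"]
            )

def run_poll_loop(cache, interval_seconds, iterations):
    """Recurring pull at a small, fixed interval (bounded here for illustration)."""
    for _ in range(iterations):
        cache.refresh()
        time.sleep(interval_seconds)

cache = IntradayCache()
run_poll_loop(cache, interval_seconds=0, iterations=2)
```

In practice the cache would also be populated by pushed events (application exits or change capture) rather than polling alone, but the essential idea – keep a small derived view continuously current – is the same.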
When to, and when not to
Some situations lend themselves ideally to a zDIH deployment. These include situations where the information surfaced for business operations benefits from a subset of composed/aggregated data (as opposed to situations where raw data must be used); situations where hybrid cloud applications need real-time or more current information (and can’t afford to be bogged down waiting for an ETL process to deliver data for analysis); situations that require a more event-driven approach to information flow from the mainframe; and situations where a combination of batch and OLTP data is needed. Cases that require more efficient information sharing among various z/OS applications, and situations where composed information is re-used by multiple cloud-consuming applications, can also benefit from a zDIH deployment.
zDIH, however, would not be a good fit for situations where enterprises want to move all z/OS core systems data to the cloud or another environment (a typical ETL environment) – or when access to all core systems of record data for ad hoc query interaction is needed. In addition, cases such as those for AI/ML where more granular raw data is needed vs. composed/aggregated information are not well-suited for zDIH. Nor should zDIH be used by enterprises that tend to stream all data off the platform; or by enterprises that want to cache all data from a system-of-record in zDIH (as opposed to the curated information that zDIH creates and manages).
In short, zDIH is ideal for those enterprises that can benefit from access to a curated dataset of mainframe production data – and that need access to that information in real time.
With the vast acceptance of cloud architecture, it has become necessary to make mainframe production information accessible to cloud business applications and users – regardless of which cloud architecture or server architecture they are using. Only, it still doesn’t make sense to move all the raw data (for the reasons given earlier in this blog). Given that cloud users and applications need access to back-end production data – and given that it still doesn’t make sense to move that data – a new means of presenting relevant data subsets to cloud users had to be found.
With IBM’s recent “Z Digital Integration Hub” offering, the company now offers cloud users access to structured, secured data (not raw data) that still resides on the mainframe. This production data thus remains under mainframe control (security, management); it can be placed into main memory (with transparent spill to disk as needed); and it can be shared with external users through various interfaces. It represents a means to create a series of new, flexible, efficient information-flow applications that exploit data from existing systems of record to serve cloud environments – without requiring the workloads that use that data to run at the mainframe level, and without impact to production systems.
We still believe that mainframe production data should be kept under mainframe auspices whenever possible for security and integrity reasons. We also suggest that various workloads be run at the mainframe level to reduce costs and speed processing. But now, with the growing number of cloud users who need access to real-time back-end system-of-record information, IBM’s Z Digital Integration Hub offers a solid alternative to moving raw data off the mainframe. zDIH provides business analysts and cloud applications with access to essential mainframe production information in real time. It thus opens the door to a whole new series of applications that can benefit from access to system-of-record back-end data residing on the mainframe – while, at the same time, reducing the risks and costs associated with unnecessary data movement.