Three Data Management Dilemmas Financial Services Technology Leaders Must Resolve

“To unlock the value of our data, we must solve this paradox: We must make data easy to share across the organization while maintaining appropriate control over it,” reads a joint commentary by senior technology leaders at JPMorgan Chase, published in May 2021 on the AWS Big Data Blog.
With that statement, the data team at one of the world’s largest financial institutions described a dilemma familiar to technology leaders across the financial sector. Operating in a highly regulated industry, financial services organizations must maintain tight restrictions on data access, even as they acknowledge that the real value of their data lies in streamlined access across teams and the freedom to blend data across operational data stores.
Such data management dilemmas are to be expected, given the scale of data the financial services sector produces. The industry manages immense volumes of structurally diverse data: 82.6 million credit card transactions per hour, 35 million trades per day on the NASDAQ, and the activity of more than 50 million US e-banking customers. These figures are projected to grow sharply over the coming years. Leading financial services firms have built infrastructures to extract business value from this data deluge, including personalized customer experiences, targeted advertising, and sophisticated customer lifetime value analyses.
Through our work addressing data challenges for some of the world’s biggest financial institutions, we at CData have developed a unique viewpoint on what makes them successful. Here, we examine how these teams deliver transformative business results by resolving three common dilemmas of financial services data architecture.
Dilemma 1: Protracted time-to-value in a central source of truth
One of the most impactful architectural choices in extracting insights from raw data is the implementation of a central source of truth for analytics. Financial services firms are evaluating cloud data warehouse (CDW) technologies such as Snowflake, Databricks, and Azure Data Lake Storage (ADLS) as the foundational layer of their analytics and AI initiatives because of the scalability, high availability, and operational efficiency these platforms bring. However, centralizing data in these systems can involve protracted implementation timelines. According to Airbyte, it can take several months to a year or more (not to mention high data engineering costs) to start seeing value from data centralization.
The extended time-to-value of cloud data warehouse implementations is largely driven by the variety of data that financial services firms produce and consume: disparate file formats (ISO 20022 XML, EDI, JSON, CSV, TSV, and more), SaaS applications and their corresponding APIs, relational databases both on-premises and in private clouds, and message streaming systems such as Apache Kafka and IBM MQ. Beyond these internal sources, financial services firms also consume external APIs that provide market data on real-time stock prices, transactions, and social sentiment. NASDAQ’s Data Link API is one example of the diverse streams of critical external data that firms rely on to make high-quality decisions across the business. Given the rising costs and consumption-based pricing of such market data, managing this data effectively becomes critical. According to EY, creating a single “golden source” of truth can help manage the costs and complexity of market data consumption. Yet financial services data teams are overwhelmed by the vast and growing variety of internal and external sources that must be ingested into such a central source of truth.
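To make the external-API side of this concrete, the sketch below pulls a time series from a REST endpoint and reshapes it into uniform records ready for ingestion. It is a minimal illustration, not production code: the endpoint shape follows the public Nasdaq Data Link (formerly Quandl) convention, while the dataset code, environment variable, and error handling are our own assumptions.

```python
# Minimal sketch: pulling a market data time series from a REST endpoint
# and reshaping it into uniform records for ingestion. The endpoint shape
# follows the public Nasdaq Data Link (formerly Quandl) convention; the
# dataset code and env var below are illustrative assumptions.
import os
import requests

API_KEY = os.environ.get("MARKET_DATA_API_KEY", "demo-key")  # hypothetical env var
BASE_URL = "https://data.nasdaq.com/api/v3/datasets"

def fetch_time_series(dataset_code: str, limit: int = 100) -> list[dict]:
    """Fetch a time series and zip each row against its column names."""
    resp = requests.get(
        f"{BASE_URL}/{dataset_code}/data.json",
        params={"api_key": API_KEY, "limit": limit},
        timeout=30,
    )
    resp.raise_for_status()
    payload = resp.json()["dataset_data"]
    # Column-keyed dicts give downstream ingestion a uniform shape,
    # whatever layout the upstream dataset uses.
    return [dict(zip(payload["column_names"], row)) for row in payload["data"]]

if __name__ == "__main__":
    # "FRED/GDP" is a placeholder dataset code for illustration.
    for record in fetch_time_series("FRED/GDP", limit=5):
        print(record)
```

Multiply this by hundreds of sources, each with its own schema, authentication scheme, and rate limits, and the engineering burden becomes clear.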
The prospect of developing bespoke, custom-coded pipelines to ingest all this data into a central data warehouse deters many financial services data teams from investing in data centralization. Without a standardized means of ingesting data, teams fall back on an inefficient and expensive patchwork of manual processes and tools that slows the business down and hinders its ability to compete. Ultimately, data leaders face high costs and low return on cloud investment in the short term, with the promise of streamlined analytics, enterprise AI, and competitive advantage in the long run.
How can data leaders make this investment in a central source of truth more palatable for their firms? By standardizing and automating data ingestion to cut data engineering costs and shorten time-to-value.
CData customer NJM Insurance, for example, faced a nearly year-long timeline for implementing a central lakehouse in Databricks, including the ingestion of marketing data from more than 10 advertising channels. CData helped them automate that ingestion, with continuous updates via change data capture (CDC) keeping Databricks current. CData Sync was also deployed on NJM’s own infrastructure, so their data never passed through third-party servers in transit. Overall, the investment helped NJM implement the project 10 times faster at 66 percent lower overall cost.
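For readers curious about the mechanics, here is a minimal, self-contained sketch of timestamp-based change data capture, one common CDC strategy (log-based CDC, which dedicated tools typically use, reads the database transaction log instead). SQLite stands in for both the source system and the warehouse, and the table and column names are illustrative.

```python
# Timestamp-based CDC in miniature: track a high-water mark, copy only
# rows modified since the last sync, and upsert so reruns stay idempotent.
# SQLite stands in for both source and warehouse; names are illustrative.
import sqlite3

SCHEMA = ("CREATE TABLE ad_spend (id INTEGER PRIMARY KEY, channel TEXT, "
          "spend REAL, updated_at TEXT)")

def sync_changes(src, dst, last_sync):
    """Copy rows modified since last_sync; return the new high-water mark."""
    rows = src.execute(
        "SELECT id, channel, spend, updated_at FROM ad_spend "
        "WHERE updated_at > ? ORDER BY updated_at", (last_sync,)).fetchall()
    for row in rows:
        # Upsert keeps repeated syncs idempotent.
        dst.execute(
            "INSERT INTO ad_spend VALUES (?, ?, ?, ?) ON CONFLICT(id) "
            "DO UPDATE SET channel=excluded.channel, spend=excluded.spend, "
            "updated_at=excluded.updated_at", row)
    dst.commit()
    return rows[-1][3] if rows else last_sync

source, warehouse = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
source.execute(SCHEMA)
warehouse.execute(SCHEMA)
source.execute("INSERT INTO ad_spend VALUES (1, 'search', 50.0, '2024-01-02')")

# Each run picks up only what changed since the stored high-water mark.
mark = sync_changes(source, warehouse, "1970-01-01")
print(warehouse.execute("SELECT * FROM ad_spend").fetchall(), mark)
```

The point of automating this pattern across every source is that the incremental, idempotent sync logic is written once rather than rebuilt per pipeline.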
Financial services data leaders are weighing the benefits of data centralization against the cost and timeline of implementation. An automated data ingestion process resolves this concern and helps drive toward a more data-driven business.
Dilemma 2: Balancing modernization with information security
According to Redgate, 36 percent of all organizations host their production databases mostly or entirely in the cloud, but that figure falls to 29 percent in financial services. This is partly explained by the information security and data privacy headwinds that limit cloud adoption in this sector. As a result, data leaders must weigh the flexibility, high availability, and efficiency of the cloud against the obligation to meet security and data privacy standards. Sixty-seven percent of financial services organizations maintain a hybrid on-premises/cloud architecture for this reason.
Forward-looking organizations in this sector, however, choose to adopt trusted cloud technologies that support innovation without compromising security. They do this by creating a bridge between on-premises systems and the cloud that enables the best of both deployments. This secure on-premises/cloud connectivity layer gives an enterprise data architect the freedom to modernize strategically without worrying about compatibility with the rest of their tech stack.
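As a rough illustration of the idea, the sketch below defines a single query interface that any source, on-premises or cloud, can sit behind. The Source protocol and the in-memory SQLite stand-in are hypothetical; a real connectivity layer would wrap production drivers (for example, pyodbc for SQL Server or snowflake-connector-python for Snowflake).

```python
# A rough sketch of a connectivity layer: one query interface that any
# source, on-premises or cloud, can sit behind. The Source protocol and
# the SQLite stand-in are hypothetical, not a real product's API.
import sqlite3
from typing import Protocol

class Source(Protocol):
    def query(self, sql: str, params: tuple = ()) -> list[tuple]: ...

class SqliteSource:
    """Stands in for any driver-backed system behind the shared interface."""
    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn

    def query(self, sql: str, params: tuple = ()) -> list[tuple]:
        return self.conn.execute(sql, params).fetchall()

# Registering sources by logical name decouples consumers from deployment:
# swapping an on-prem database for a cloud warehouse changes only this map.
on_prem = sqlite3.connect(":memory:")
on_prem.execute("CREATE TABLE policies (id INTEGER PRIMARY KEY, premium REAL)")
on_prem.execute("INSERT INTO policies VALUES (1, 1200.0)")

sources: dict[str, Source] = {"policy_db": SqliteSource(on_prem)}
print(sources["policy_db"].query("SELECT * FROM policies"))  # [(1, 1200.0)]
```

Because consumers address sources by logical name rather than by driver, individual systems can move to the cloud one at a time without breaking anything downstream.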
Consider, for instance, another CData customer in the financial services sector: a global leader in life and health insurance. They faced strict infosec mandates that limited their ability to move fully to the cloud. Moreover, they had invested hundreds of developer hours in their on-premises SQL Server Analysis Services (SSAS) cubes, which would not be compatible with a cloud data warehouse.
Their team ultimately decided to modernize and adopt Snowflake as their central source of truth, and CData helped refactor their SSAS cubes for Snowflake. This lets them securely bridge sensitive on-premises analytics with the cloud, opening further opportunities to modernize without compromising information security.
Financial services organizations that seize opportunities to selectively modernize their technology stack, while ensuring data continuity across their architecture, will be best equipped to unlock the full business value of their data.
Dilemma 3: Balancing free data access with governance
A former CTO for Data and Analytics at JPMorgan Chase has said that AI/ML engineers spend “something like 60 percent to 70 percent” of their time just trying to access the right data. Most organizations across all verticals prioritize streamlined data access as a driver of better decisions and business outcomes, and financial services firms are no exception.
However, as we have established, financial data is held to a high bar of privacy. This brings us back to the dilemma raised by the JPMorgan Chase data team: Given that data democratization necessarily removes centralized hierarchies and barriers to access, how do we improve data access without compromising privacy standards?
This challenge is neatly addressed by an enterprise semantic layer: a central data platform that houses data products across an organization’s domains. The semantic layer supports central administration of role-based access control, PII masking, and data quality, while giving data consumers streamlined access to authorized data through a single access point. This combination of data democratization and central governance has gained favor among data teams at the largest financial institutions.
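The sketch below shows the core of centrally administered PII masking in miniature: the policy lives in one place, and every consumer receives data through it. The roles, field names, and masking rule are hypothetical examples, not a prescription.

```python
# Illustrative sketch of centralized, role-based PII masking in a semantic
# layer. Roles, fields, and the masking rule are hypothetical examples.
PII_FIELDS = {"ssn", "email", "account_number"}           # fields subject to masking
UNMASKED_ROLES = {"compliance_officer", "fraud_analyst"}  # roles cleared for raw PII

def mask(value: str) -> str:
    """Redact all but the last four characters of a sensitive value."""
    return "*" * max(len(value) - 4, 0) + value[-4:]

def serve_record(record: dict, role: str) -> dict:
    """Apply the masking policy once, centrally, for every consumer."""
    if role in UNMASKED_ROLES:
        return record
    return {k: (mask(str(v)) if k in PII_FIELDS else v)
            for k, v in record.items()}

customer = {"name": "A. Client", "ssn": "123-45-6789", "balance": 1042.55}
print(serve_record(customer, role="marketing_analyst"))
# {'name': 'A. Client', 'ssn': '*******6789', 'balance': 1042.55}
```

Because every consumer reads through the same access point, changing a policy here changes it everywhere, which is precisely the governance guarantee the semantic layer provides.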
CData customer PGGM is one of the world’s largest not-for-profit pension funds. As data volume and variety grew, its teams fell back on manual, bespoke cross-team data sharing, which did not scale. The lack of central data governance also bred confusion over data ownership and authorization, and data consumers spent large portions of their time on data preparation rather than analysis.
With help from CData, PGGM’s data architecture team achieved a dual goal: standardized, streamlined data access, which accelerated analytics timelines, and central governance and authorization management. Their Data Management leader, Marco van der Winden, commented: “We have significantly reduced time-to-value from weeks or months to hours or even minutes while ensuring data governance.”
A path forward
The financial services industry presents a unique set of data management challenges, and technology leaders in this space face difficult decisions with complex tradeoffs. Navigating those tradeoffs well and establishing a solid data foundation will determine organizational relevance and performance well into the future.
Explore CData connectivity solutions
CData offers a wide selection of products to solve your data connectivity needs. Choose from hundreds of connectors between any source and any app. Get started with free trials and tours.
Try them out