Financial institutions, like many other industries, are struggling with how best to leverage and extract value from big data. Enabling users to either “see the story” or “tell their story” is key to deriving value from data visualization tools, especially as data sets continue to grow.
With terabytes and petabytes of data flooding organizations, legacy architectures and infrastructures are overmatched when it comes to storing, managing and analyzing big data. IT teams are poorly equipped to handle the increasing requests for different types of data, specialized reports for tactical projects and ad hoc analysis. In traditional Business Intelligence (BI) solutions, IT presents slices of data that are easier to manage and analyze, or creates pre-designed templates that accept only certain types of data for mapping and graphing. These approaches miss the potential to capture the deeper meaning that enables proactive, or even predictive, decisions from big data.
Out of frustration and under pressure to deliver results, user groups are increasingly circumventing IT. They acquire applications or build custom applications without the knowledge of IT. Some go as far as acquiring and providing their own infrastructure to speed up data collection, processing and analysis. This rush-to-market creates data silos and potential GRC (governance, regulatory, compliance) risks.
Users who access cloud-based services, increasingly on devices they own, cannot understand why they face so many obstacles when trying to access enterprise data. Mashups with externally sourced data, such as social networks, market data sites or SaaS applications, are practically impossible unless users have the technical skills to integrate the various data sources on their own.
Steps to visualize big data success
Architecting from the users’ perspective with data visualization tools is important if management is to visualize big data success through better and faster insights that improve decision outcomes. An important advantage is how these tools change project delivery. Because they allow value to be visualized quickly through prototypes and test cases, models can be validated at low cost before algorithms are built into production environments. Visualization tools also provide a common language in which IT and business users can communicate.
To help change the perception of IT from an inhibiting cost center to a business enabler, IT needs to couple data strategy with business strategy. As such, IT must deliver data in a much more agile manner. The following tips can help IT become integral to how their organizations allow users to access big data effectively without compromising GRC mandates:
- Aim for context. The people who analyze the data need to have a deep understanding of the data sources, who will consume the data and what their goal is in interpreting the information. Without establishing context, visualization tools are less valuable.
- Plan for speed and scale. To properly enable visualization tools, organizations need to identify the data sources and decide where the data should reside. That decision should be driven by the sensitivity of the data. In a private cloud, the data must be classified and indexed for quick search and analysis. Whether in a private cloud or a public cloud environment, clustered architectures that leverage in-memory and parallel processing technologies are most effective today for exploring big data sets in real time.
- Ensure data quality. While the big data hype is centered on the volume, velocity and variety of data, organizations need to focus more on the validity, veracity and value of the data. Visualization tools and the insights they can enable are only as good as the quality and integrity of the data models they work with. Businesses need to incorporate data quality tools to make sure the data feeding the front end is as clean as possible.
- Show meaningful results. It is difficult to map points on a graph or chart for analysis when dealing with massive data sets with structured, semi-structured and unstructured data. One way to solve this challenge is to cluster data into a higher-level view where smaller groups of data are exposed. By grouping the data, a process called “binning”, users can visualize the data more efficiently.
- Handle outliers. Graphical representations of data using visualization tools can reveal trends and outliers much faster than tables that contain numbers and text. Humans are innately better at identifying trends or problems by “seeing” patterns. In most cases, outliers make up 5% or less of a data set. While that is small as a percentage, in very large data sets these outliers become difficult to navigate. Either remove the outliers from the data (and therefore from the visual presentation), or create a separate chart just for the outliers. Users can then draw conclusions from viewing the distribution of the data as well as the outliers. Isolating outliers can help reveal previously unseen risks or opportunities, such as fraud, changes in market sentiment or new leading indicators.
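As a rough illustration of the last two tips, the sketch below first separates outliers and then bins the remaining values. It is a minimal example under stated assumptions, not a recommended implementation: the interquartile-range (IQR) rule with a multiplier of 1.5 is one common outlier test chosen here for illustration, and the function names, the bin width and the sample amounts are all hypothetical.

```python
from collections import Counter
from statistics import quantiles


def bin_values(values, bin_width):
    """Count how many values fall into each fixed-width bin.

    Returns a dict mapping each bin's lower edge to its count.
    """
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def split_outliers(values, k=1.5):
    """Split values into (typical, outliers) using the IQR rule.

    Assumption: the IQR rule and k=1.5 are illustrative choices; the
    text above does not prescribe a specific outlier test.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    typical = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return typical, outliers


# Hypothetical transaction amounts, invented for illustration only.
amounts = [12, 15, 18, 22, 25, 27, 31, 33, 38, 41, 44, 47, 950, 1200]

# Chart the binned "typical" values and the outliers separately.
typical, outliers = split_outliers(amounts)
binned = bin_values(typical, bin_width=10)
```

With this sample, the two very large amounts end up in `outliers` and can be charted on their own, while `binned` summarizes the remaining twelve values as four bins of three. A production tool would pick the bin width and the outlier test to suit the data.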
Where visualization is heading
Data visualization is evolving beyond the traditional charts, graphs, heat maps, histograms and scatter plots used to represent numerical values that are then measured against one or more dimensions. With the trend toward hybrid enterprise data structures that combine traditional structured data, usually stored in a data warehouse, with unstructured data coming from a wide variety of sources, values can be measured against much broader dimensions.
As a result, you can expect to see greater intelligence in how these tools index results. Also expect to see enhanced dashboards with game-style graphics. Finally, you can expect to see more predictive capabilities that anticipate user data requests, with personalized memory caches to help with performance. This continues the trend toward self-service analytics, where users define the parameters of their own queries against an ever-increasing number of data sources.