2018 was a momentous year for data. Not only has the overall importance of data and information within organizations continued to grow, but we’ve also seen the continued rise of megatrends like IoT, big data – even too much data – and of course, machine learning. That’s along with the ongoing maturation of other, perhaps less known, but equally important data initiatives such as governance and integration in the cloud.
So what does the coming year have in store?
In many ways, the top trends of 2019 will largely be a continuation of what’s already been happening this year. But we’ll also see new and exciting developments take shape that will spur even more data sources and types, more demand for integration and cost optimization and even better analytics and insights for organizations.
Here are the top seven big data analytics trends for 2019:
1. IoT and the growth of digital twins
Even though the Internet of Things was on everyone’s lips in 2018, the buzz around the digitization of the world around us and its implications for data isn’t going away. The frenzied growth of IoT data – along with many organizations’ continued inability to handle or make sense of all that data with their traditional data warehouses – will be a major theme of 2019. Adding fuel to this ever-expanding fire is the ongoing growth of digital twins: digital replicas of physical objects, people, places and systems powered by real-time data collected by sensors. By some estimates there will be more than 20 billion connected sensors by 2020, powering potentially billions of digital twins. To capture the value of all that data, it needs to be integrated into a modern data platform using an automated data integration solution that handles data cleaning, de-duplication and unification of disparate and unstructured sources.
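To make the cleaning and de-duplication step concrete, here is a minimal sketch in Python. The record format, field names and the (sensor, timestamp) de-duplication key are invented for this illustration; a real integration tool would do far more.

```python
def clean_sensor_records(records):
    """Normalize, de-duplicate and unify raw sensor readings.
    (Toy example -- the field names here are hypothetical.)"""
    seen = set()
    cleaned = []
    for rec in records:
        # Normalize: strip whitespace, lowercase the sensor id.
        sensor_id = rec.get("sensor_id", "").strip().lower()
        value = rec.get("value")
        if not sensor_id or value is None:
            continue  # drop incomplete records
        key = (sensor_id, rec.get("timestamp"))
        if key in seen:
            continue  # de-duplicate on (sensor, timestamp)
        seen.add(key)
        cleaned.append({"sensor_id": sensor_id,
                        "timestamp": rec.get("timestamp"),
                        "value": float(value)})
    return cleaned

raw = [
    {"sensor_id": " T-101 ", "timestamp": "2019-01-01T00:00", "value": "21.5"},
    {"sensor_id": "t-101", "timestamp": "2019-01-01T00:00", "value": "21.5"},  # duplicate
    {"sensor_id": "t-102", "timestamp": "2019-01-01T00:00", "value": None},    # incomplete
]
print(clean_sensor_records(raw))
```

Three messy records collapse to one clean, unified row – the same kind of work an automated integration platform does at scale.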
2. Augmented analytics
In 2018, most qualitative insights are still teased out by data scientists or analysts after poring over reams of quantitative data. But with augmented analytics, systems use artificial intelligence and machine learning to suggest insights pre-emptively. Gartner says this will soon become a widespread feature of data preparation, management, analytics and business process management, leading to more citizen data scientists as barriers to entry come down – especially when combined with natural language processing, which makes possible interfaces that let users query their data using normal speech and phrases.
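The core idea – the system surfaces a finding before anyone asks – can be sketched in a few lines. This is only a toy stand-in for augmented analytics: it flags metrics that deviate strongly from the group mean, and the data, threshold and phrasing are all invented for the example.

```python
import statistics

def suggest_insights(metrics, threshold=1.5):
    """Pre-emptively flag metrics that deviate strongly from the group
    mean (a toy stand-in for augmented-analytics suggestions)."""
    values = list(metrics.values())
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    insights = []
    for name, value in metrics.items():
        # Flag anything more than `threshold` standard deviations out.
        if stdev and abs(value - mean) / stdev > threshold:
            insights.append(f"{name} is unusually {'high' if value > mean else 'low'}")
    return insights

monthly_sales = {"north": 100, "south": 105, "east": 98, "west": 300}
print(suggest_insights(monthly_sales))
```

Here the system volunteers that the "west" figure is an outlier without an analyst having to go looking for it; real augmented-analytics products layer far more sophisticated models, plus natural-language output, on top of the same pattern.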
3. The harnessing of dark data
Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes.” This kind of data is often recorded and stored for compliance purposes only, taking up a lot of storage without being monetized either directly or through analytics to gain a competitive advantage. But with organizations increasingly leaving no business intelligence-related stone unturned, we’re likely to see more emphasis placed on this as-yet relatively untapped resource, including the digitization of analog records and items (think everything from dusty old files to fossils sitting on museum shelves) and their integration into the data warehouse.
4. Cold storage and cloud cost optimization
Migrating your data warehouse to the cloud is almost always less expensive than an on-premises build, but that doesn’t mean cloud systems can’t be cost optimized even further. That’s why 2019 will see more organizations turning to cold data storage solutions such as Azure Cool Blob and Google’s Nearline and Coldline. And with good reason: Parking older and unused data in cold storage can save organizations as much as 50 percent on storage costs, thus freeing up cash to invest in data activities that can generate ROI instead of being a money drain.
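A back-of-the-envelope calculation shows where those savings come from. The per-terabyte prices below are invented for the illustration – they are not actual Azure or Google list prices – but the shape of the math holds for any hot/cold tiering.

```python
def monthly_storage_cost(hot_tb, cold_tb, hot_price=23.0, cold_price=10.0):
    """Illustrative monthly storage bill in dollars.
    Per-TB prices are hypothetical, not real cloud list prices."""
    return hot_tb * hot_price + cold_tb * cold_price

# Scenario: 100 TB total, of which 70 TB is old, rarely-touched data.
all_hot = monthly_storage_cost(100, 0)   # everything in the hot tier
tiered = monthly_storage_cost(30, 70)    # cold data moved to cold storage
savings = round(1 - tiered / all_hot, 2)
print(all_hot, tiered, savings)
```

With these made-up prices, tiering 70 percent of the data cuts the bill by roughly 40 percent – in the same ballpark as the savings cited above, and the more skewed your data toward "old and unused," the bigger the cut.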
5. Edge computing and analytics
Edge computing takes advantage of proximity by processing information as physically close to sensors and endpoints as possible, thus reducing latency and network traffic. Gartner predicts that edge computing and cloud computing will evolve into complementary models in 2019, with cloud services expanding to live not just on centralized servers, but also on distributed on-premises servers and even on the edge devices themselves. This should decrease not only latency, but also costs for organizations processing real-time data.
Some say that edge computing and analytics can also help increase security due to its decentralized approach, which localizes processing and reduces the need to send data over networks or to other processors. Others, however, note that the increased number of access points for hackers that these devices represent – not to mention that most edge devices lack IT security protocols – leaves organizations even more open to hacks. Either way, the explosion in edge computing and analytics means an even greater need for a flexible data warehouse that can integrate all your data types when it’s time to run analytics.
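The traffic-reduction argument is easy to see in miniature: instead of shipping every raw reading to the cloud, an edge device can summarize locally and send only compact aggregates. The window size and summary fields below are arbitrary choices for the sketch.

```python
def edge_aggregate(readings, window=5):
    """Summarize raw sensor readings locally (min/max/mean per window)
    so only compact summaries travel over the network."""
    summaries = []
    for i in range(0, len(readings), window):
        chunk = readings[i:i + window]
        summaries.append({
            "min": min(chunk),
            "max": max(chunk),
            "mean": sum(chunk) / len(chunk),
        })
    return summaries

raw = [21.0, 21.2, 20.8, 21.1, 21.4, 35.0, 21.3, 21.2, 21.0, 21.1]
print(edge_aggregate(raw))  # 10 raw readings -> 2 summaries
```

Ten readings become two summary records – an 80 percent reduction in payload in this toy case – while the anomalous 35.0 spike still survives in the second window’s max, ready for downstream analytics in the warehouse.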
6. Data storytelling and visualization
Data storytelling and visualization are already well established, but they will take the next step in 2019 as more organizations move their traditional and often siloed data warehouses to the cloud. An increase in the use of cloud-based data integration tools and platforms means a more unified approach to data, which in turn means more and more employees will have the ability to tell relevant, accurate stories with data using an organization’s single version of the truth.
And as organizations use even better and improved integration tools to solve their data silo problems, data storytelling will become more trusted by the C-suite as insights gleaned across the organization become more relevant to business outcomes.
7. DataOps
The concept of DataOps really started to emerge this year, and it will grow significantly in importance in 2019 as data pipelines become more complex and require even more integration and governance tools. DataOps applies Agile and DevOps methods to the entire data analytics lifecycle, from collection to preparation to analysis, employing automated testing and delivery for better data quality and analytics. DataOps promotes collaboration, quality and continuous improvement, and uses statistical process control to monitor the data pipeline to ensure constant, consistent quality.
After all, when experts predict that organizations will need to handle 1,000 data sources in their data warehouses, truly automated, always-on data integration will be the difference between delivering value and drowning.