Big Data in Review

By: Erik Nor | April 21, 2017

The landscape for Big Data continues to change, but not at the pace seen in previous years.  Highlighted here are a few points of interest from 2016 and predictions for 2017.  

2016-

  • Enterprise Adoption of Open Source Solutions
    • According to Gartner [1], 48 percent of companies invested in Big Data in 2016.  The number of companies continues to rise, and these adopters drive the features and requirements that Big Data solutions are expected to implement.
  • Pace of Releases Slowed
    • Rising adoption by enterprise businesses has caused the distributors to slow their release schedules.  Enterprises could not afford to upgrade at the previous pace of releases.  New features are expected to be secure, stable, and complete.  The distributions must respect these expectations if they are to be used by large companies.
  • Security Moves to the Forefront
    • This is a second effect of enterprise adoption.  Enterprises have established security practices and procedures.  For Hadoop to be used within these businesses, providers had to deliver solutions for the most common and standard security requirements.
  • In-Memory Analytics Increasingly Important
    • The desire for in-memory analytics continues to grow.  Spark is one of the most active Apache projects, following Ambari and Hadoop [2].  Hive has released an in-memory feature called LLAP (Live Long and Process) [3].  Kudu was promoted to a top-level Apache project, demonstrating a readiness for widespread adoption [4].

 2017-

  • Refocus on the Meaning of Big Data
    • While Big Data has always represented Volume, Velocity, and Variety, the focus has historically been on Volume.  As many adopters are beginning to realize, the volume of data typically available does not always necessitate the use of a Big Data solution.  What these adopters are finding, though, is that the functionality provided by some Big Data solutions is solving other problems within the enterprise related to Velocity and Variety.
  • Data Governance
    • This is a requirement driven by the enterprise.  There is a need to provide truth in data, data lineage, and very specific data security.  As 2017 unfolds, there will be an increasing number of solutions implementing some level of data governance.
  • Cloud, Virtualization, and Automation
    • An increasing number of companies are seeing the benefits of the Cloud.  The benefits include cost of implementation, dynamic sizing, and time to deployment.  The security and benefits of these solutions have been proven.  Automation will see related growth, as it will be used to magnify the benefits offered by a cloud deployment.
  • Push Towards Standardization
    • The increasing number of solutions for Big Data has led to diverging APIs, with custom features and interfaces being created to meet very specific requirements.  ODPi hopes to change that with a set of standard interfaces [5].  This will allow a consistent user experience when using solutions from mixed vendors.  No longer will an integrated stack be required.  Instead, the various layers of the solution can easily be exchanged.

 References

  1. https://www.gartner.com/newsroom/id/3466117
  2. https://projects.apache.org/statistics.html
  3. https://cwiki.apache.org/confluence/display/Hive/LLAP
  4. https://kudu.apache.org/2016/07/25/asf-graduation.html
  5. https://www.odpi.org/

About Erik Nor

Erik Nor is a Principal Consultant and Big Data Technology Leader at Moser Consulting. He's been working with Big Data and Hadoop for over four years and holds multiple development and admin certifications from Hortonworks and Cloudera.