Hadoop and Four Big Enterprise Software Vendors: Notes from Hadoop Summit 2014

Hadoop Summit 2014 in San Jose is the North American edition of the Hadoop Summit series, which is organised and run by Hortonworks. It was my latest stop on the big data conference circuit.

This year I have been doing the rounds of big data conferences in the US and Europe. I spoke at Hadoop Summit in Amsterdam, and Hortonworks has recently uploaded the video of my talk, which you can watch here:
http://youtu.be/739v1-v9Y9A

In February 2014, when I visited the Strata Conference in Santa Clara, I asked myself why every major software vendor was present except the bellwether of data, Oracle, a company that became big by selling technology to store large volumes of enterprise data. Strata is a conference on data-related topics organised by O’Reilly Media, which can safely be considered a vendor-neutral event organiser dedicated to spreading knowledge of technology. Major enterprise software vendors such as SAP, IBM, Teradata, SAS and Microsoft have adjusted their data platform strategy slides to acknowledge the advent of big data technology in the enterprise. Is Oracle relying on a strategy of waiting and then acquiring, as it did with PeopleSoft, Siebel and BEA Systems?

To my surprise, Oracle was present at Hadoop Summit 2014 in San Jose. This must have pleased the Hortonworks folks, who have always promoted Hadoop Summit as a community effort and who refrained from prominently displaying their logo, green livery and other marketing material during the summit, despite doing all the hard work of making the event a success.

The presence of four major enterprise software vendors, namely SAP, Microsoft, IBM and Oracle, as well as others such as Teradata, SAS, HP and Pivotal, at Hadoop Summit is a clear sign that Hadoop is fast becoming the common, central building block in data platform architecture. It is interesting to note how large software vendors have aligned their sales story with Hadoop by painting a picture of peaceful coexistence between their proprietary software and hardware and Hadoop.

In a presentation by John Schitka of SAP, I learnt that SAP HANA can complement Hadoop. He talked about a data virtualization layer that can seamlessly fetch data from Hadoop, HANA and Sybase without the end user ever noticing. For data stored on Hadoop, it can create a virtual table in HANA. It can also use HANA to cache part of the Hadoop data for faster access.
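
To make the idea concrete for myself, here is a toy Python sketch of what such a virtualization layer does in principle. It is not SAP's implementation and it uses no SAP API: the class names `VirtualTable` and `Catalog`, the `remote_reads` counter and the stand-in source functions are all hypothetical, invented only for illustration. The sketch assumes that a virtual table is just a name that forwards reads to a backing source, and that caching means keeping a local copy after the first remote read.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Tuple[str, int]


class VirtualTable:
    """Hypothetical virtual table: a name that forwards reads to a source."""

    def __init__(self, name: str, fetch: Callable[[], List[Row]],
                 cache: bool = False) -> None:
        self.name = name
        self._fetch = fetch              # function that reads the remote source
        self._cache_enabled = cache
        self._cached: Optional[List[Row]] = None
        self.remote_reads = 0            # how often the source was really hit

    def rows(self) -> List[Row]:
        # Serve from the local copy when caching is on and a copy exists.
        if self._cache_enabled and self._cached is not None:
            return self._cached
        self.remote_reads += 1
        data = self._fetch()
        if self._cache_enabled:
            self._cached = data
        return data


class Catalog:
    """Hypothetical catalog: one place to query, whatever the backing store."""

    def __init__(self) -> None:
        self._tables: Dict[str, VirtualTable] = {}

    def register(self, table: VirtualTable) -> None:
        self._tables[table.name] = table

    def query(self, name: str) -> List[Row]:
        # The caller never sees which system the rows came from.
        return self._tables[name].rows()


def read_clicks_from_hadoop() -> List[Row]:
    # Stand-in for a slow read from Hadoop (made-up data).
    return [("home", 120), ("pricing", 45)]


def read_orders_from_memory() -> List[Row]:
    # Stand-in for a fast in-memory store (made-up data).
    return [("order-1", 3), ("order-2", 1)]


if __name__ == "__main__":
    catalog = Catalog()
    clicks = VirtualTable("clicks", read_clicks_from_hadoop, cache=True)
    orders = VirtualTable("orders", read_orders_from_memory)
    catalog.register(clicks)
    catalog.register(orders)
    catalog.query("clicks")
    catalog.query("clicks")              # second read comes from the cache
    print(clicks.remote_reads)
```

The point of the sketch is only the shape of the design: the user queries one catalog, and whether a read goes to the remote store or to a local cached copy is decided behind that interface.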

Microsoft has become a bold adopter of Hadoop. I say ‘bold’ because Microsoft has not traditionally been associated with open source. Now it even has a team of Hadoop committers who regularly collaborate with the open source community. Microsoft’s cloud platform, Azure, offers Hadoop as a service, and Microsoft BI tools interoperate well with Hadoop.

IBM, with its InfoSphere BigInsights suite of products, embraced Hadoop long ago. It has also built an ecosystem of proprietary and open source tools to harness the power of Hadoop for big data analytics, and it sees Hadoop as complementary to its Netezza offering.

Oracle’s response to the developments in the Hadoop world has been subdued so far. At the Oracle booth at Hadoop Summit, they explained that Oracle Solaris hardware can be used to run Hadoop.

There are much better and cheaper options available for running Hadoop, so I wondered why we should bother running it on Solaris hardware.

An interesting hardware option for running Hadoop is available from a company called Quanta. They pack four compute nodes into a 2U chassis. I had a chance to look at their modular rack design and to see how easily the hardware components can be serviced.
http://www.quantaqct.com/en/01_product/01_list.php?mid=27&sid=149

These nodes can be configured with dense RAM or dense hard disks. Three of their server models have already earned certification from the newly launched Open Compute Certification Labs.
http://www.quantaqct.com/en/03_about/03_detail.php?nid=4&id=124

As Hadoop gets faster, more secure and ready for the enterprise, enterprise software vendors will have a tough time convincing enterprises to invest in their proprietary hardware and software unless they can deliver hard evidence that their products really do complement what is available in Hadoop and its open source ecosystem.

 