Introduction to Hadoop 2, with a simple tool for generating Hadoop 2 config files

Introduction to Hadoop 2: Core Hadoop 2 consists of the distributed filesystem HDFS and the compute framework YARN. HDFS can store anywhere from a few gigabytes to many petabytes of data. It is distributed in the sense that it uses a number of slave servers, ranging from 3 to a few thousand, to store and serve files. YARN is the compute framework for Hadoo ...


Thanks for attending the meetup last night…(Jan 24th 2013)

Dear Members, I must say that I was very happy last night to see the community's enthusiasm for the topic of the night, “Security in a Hadoop environment”. We had a great interactive session, and all of us learned about the challenges, issues and industry practices. We promise to put together another session on the same topic with more varied participants. One of our community memb ...


Making Hadoop Secure for Enterprises – An Insight into the Imperative

The Big Data Infra for Enterprises – Making Hadoop Secure for Enterprises session was probably the 4th or 5th BigDataCloud meetup I have attended, hosted by DJ and Jeeta Das. Time is a luxury, and if you log in to meetup.com, several meetups with names that generate significant interest crop up. I am sure all agree with me. However, attending a BigDataCloud meetup is time well invested. A few days ago I had spoken to J ...


Big Data Infra for Enterprises – Making Hadoop Secure for Enterprises

This meetup is sponsored by Zettaset. Security is the greatest challenge to the widespread adoption of Hadoop in enterprises. This meetup will discuss how such challenges are being met with various solutions and products in the industry today. Industry security experts will share their varied experiences. This is the first in the “Big Data Infra for Enterprises” series of Big Data Cl ...


Apache Hive & Pig – BI Developer

BI developers need to access, transform and load data sets. For performing these activities over big data sets in a Hadoop environment, Hive and Pig are extremely handy skills to have. In this one (1) day course, we will learn in depth about Hive and Pig’s architecture, design and development framework, including installation steps and performance tuning of MapReduce programs covering SessionLog Da ...


Morgan Stanley Takes On Big Data With Hadoop

When Morgan tried to do some portfolio analysis 18 months ago, it found that traditional databases and grid computing just wouldn’t scale to the very large volumes of data that its data scientists wanted to use. Gary Bhattacharjee, executive director of enterprise information management at the firm, had worked with Hadoop as early as 2008 and thought that it might provide a solution. So the IT department ho ...


Fraudsters are not Model Citizens

In my earlier post, I gave an overview of outlier detection techniques in the context of big data, and Hadoop specifically. As I mentioned, fraud detection essentially translates to outlier detection in data mining parlance. In this post, I will go over a distribution model based technique which has been implemented as two MapReduce jobs in beymani, available on GitHub. I will use the credit card transaction a ...


Which is less expensive: Amazon or self-hosted?

Amazon Web Services (AWS), as the trailblazing provider of Infrastructure as a Service (IaaS), has changed the dialog about computing infrastructure. Today, instead of simply assuming that you’ll be buying and operating your own servers, storage and networking, AWS is always an option to consider, and for many new businesses, it’s simply the default choice. I’m a huge fan of cloud computing in general and A ...


Puppet, Chef Ease Transition to Cloud Computing

Organizations as diverse as Northrop Grumman (NOC), Harvard University, Zynga, and the New York Stock Exchange (NYX) have filled job websites with requests for talented puppeteers and master chefs. A quick dig into the job listings reveals that these positions have nothing to do with office entertainment or gourmet meals. Instead, the companies want people who have mastered Puppet or Chef, competing softwar ...


Big Data Cloud August 11th Meetup – Hadoop powered Engines; 100+ Attendees; Corporate Sponsorships & Giveaways

BigDataCloud’s theme of “Hadoop Powered Predictions & Recommendations Engines” attracted over 100 people to the meetup last night, sponsored by LexisNexis & ThirdEyeCloud. Attendees thronged the LexisNexis booth to see its newly debuted HPCC Systems and learned how its Hadoop alternative can solve “Big Data” challenges in enterprises. The attendees also had a chance to “meet & gr ...


2013 © Big Data Cloud Inc. All Rights Reserved.

Hadoop and the Hadoop elephant logo are trademarks of the Apache Software Foundation.
