Google Cloud Dataflow Makes Big Data Offerings Even Bigger – Dj Das Concurs!

At the Hadoop Summit in Brussels, Google launched a beta version of Google Cloud Dataflow, a managed logic-processing service. The Internet giant also unveiled upgrades to its popu ...

Michael Stonebraker wins the A.M. Turing Prize!

We at Third Eye are thrilled that Michael Stonebraker has been awarded the A.M. Turing Award, computer science’s highest award. You can see the detailed article about it on the Fo ...

How to Debug Map-Reduce Code

It All Starts with HDFS: Everything in a Hadoop cluster is based on HDFS (Hadoop Distributed File System). Quite often HDFS problems will manifest themselves in various components ...

Comparative Analysis of Big Data Analytical Tools – Hive, Tez, Impala, SparkSQL & PrestoDB running on the Google Cloud

Big Data analytics is how businesses glean insights from data in order to take timely action. Businesses now have a plethora of technologies at their disposal to perform such ...

Recent performance improvements in Apache Spark

In this post, we look back and cover recent performance efforts in Spark. In a follow-up blog post next week, we will look forward and share with you our thoughts on the future evolution of Spark’s performance. 2014 was the most active year of Spark development to date, with major improvements across the entire engine. One particular area where it made great strides was performance: Spark set a new world re ...

How-to: Translate from MapReduce to Apache Spark (Part 2) | Cloudera Engineering Blog

The conclusion to this series covers Combiner-like aggregation functionality, counters, partitioning, and serialization. Apache Spark is rising in popularity as an alternative to MapReduce, in large part due to its expressive API for complex data processing. A few months ago, my colleague Sean Owen wrote a post describing how to translate functionality from MapReduce into Spark, and in this post, I’ll ext ...

Bulk Insert, Update and Delete in Hadoop Data Lake | Mawazo

A Hadoop Data Lake, unlike a traditional data warehouse, does not enforce schema on write and serves as a repository of data with different formats from various sources. If the data collected in a data lake is immutable, it simply accumulates in an append-only fashion and is easy to handle. Such data tends to be fact data, e.g., user behavior tracking data or sensor data. However, dimension data or master data ...

An introduction to Spark Streaming | Opensource.com

Apache Spark is an open source cluster computing framework. In contrast to Hadoop’s two-stage disk-based MapReduce paradigm, Spark’s in-memory primitives provide performance up to 100 times faster for certain applications. Spark Streaming was launched as a part of Spark 0.7, came out of alpha in Spark 0.9, and has been pretty stable from the beginning. It can be used in many near-real-time use cases, such ...

Cloud Wars: Can Microsoft Overtake Amazon? – Forbes

The cloud, a service that lets people and organizations rent computing services for a monthly fee, has turned into a battlefield of giant companies. And despite intense price-cutting by rivals such as Google Cloud Computing and Amazon’s AWS, we learned on April 23 that there is profit in the cloud. At the pinnacle of this $100 billion industry sits AWS, but Microsoft Azure is gro ...

Dice Report: Fastest-Growing Tech Skills – Dice News

It’s no surprise that tech pros skilled in data analytics and building apps continue to be in high demand by employers. That being said, some skills are more frequently requested than others, as demonstrated by Dice’s latest analysis of its online job postings. In the past few years, several tech skills have skyrocketed in terms of demand. Although some of these sought-after skills don’t enjoy the widesprea ...

Cloud Machine Learning Wars: Amazon vs IBM Watson vs Microsoft Azure

In two previous posts, I covered the emerging industry of cloud-based machine learning solutions. First, I covered Microsoft's Azure Machine Learning and IBM's Watson Analytics. Microsoft's Azure ML provides a graphical drag-and-drop interface for connecting preprogrammed components of a data science pipeline together. The service is similar to KNIME and seemed targeted for users who knew just enough to kno ...

‘Productization’ of Analytics | India Digital Review

The concept of using business data to improve an organization’s decision-making is not new. It all started with improvements in digital data storage and computing systems in the 1970s, and within a decade, software systems such as Decision Support Systems (DSS) and Executive Information Systems (EIS) came into vogue. But still, most of the work done during this initial phase centered around intelligent reporting and ...

White Paper: Big Data Belongs in the Cloud – Qubole

Big data is growing faster than ever before, and businesses are looking to take full advantage of it. In fact, the big data market is growing about six times faster than the IT market in general, making it an essential ingredient to success for many companies and industries in the world. It’s through big data analytics that businesses of any size are gaining tremendous benefits to their operations, giving t ...

Google Cloud Dataflow Makes Big Data Offerings Even Bigger – Dj Das Concurs!

At the Hadoop Summit in Brussels, Google launched a beta version of Google Cloud Dataflow, a managed logic-processing service. The Internet giant also unveiled upgrades to its popular BigQuery analytic platform, including new European zones. Google blogged extensively about it here. CRN covered the release with its own article here. CRN interviewed Dj Das for that article to get his thoughts specifically on ...

2015 © Big Data Cloud Inc. All Rights Reserved.

Hadoop and the Hadoop elephant logo are trademarks of the Apache Software Foundation.
