In-Memory Computing Summit 2016

The best minds of the In-Memory Computing industry will gather in San Francisco on May 23-24 for IMC Summit 2016 to network, learn and exchange ideas that will power the future of ...

Big Data Analytics at Facebook

This meetup will be “unconference” style, with various presentations to choose from. Please review the topics below and, upon registering, select your 2 preferred top ...

Data Warehousing With Google BigQuery

Data warehousing and the resulting business intelligence are the basic necessities of business today. And today’s technologies make it possible to have a sophisticated data wareho ...

Innovative Big Data Application Optimizes Lead Conversions, built on the Google Cloud Platform – CASE STUDY

In the era of Big Data, many enterprise executives are struggling with the sheer volume of available data and how to transform all that information into intelligence they can use t ...

Predictive policing: The future of law enforcement

As Dj Das, founder and CEO of Third Eye Consulting Services, sums it up, “For fighting crime and keeping every citizen safe, Microsoft has the most sophisticated cloud-based big da ...

Encrypt Data At-Rest and In-Flight on Amazon EMR with Security Configurations – AWS Big Data Blog

Customers running analytics, stream processing, machine learning, and ETL workloads on personally identifiable information, health information, and financial data have strict requirements for encryption of data at-rest and in-transit. The Apache Spark and Hadoop ecosystems lend themselves to these big data use cases, and customers have asked us to provide a quick and easy way to encrypt data at-rest and dat ...

Apache Impala (incubating) vs. Amazon Redshift: S3 Integration, Elasticity, Agility, and Cost-Performance Benefits on AWS – Cloudera Engineering Blog

As measured across multiple dimensions (see analysis below), Impala provides a better cloud-native experience than Redshift for a number of common use cases. Impala 2.6 brings read/write support on Amazon S3, which provides cloud capabilities such as direct querying of data from S3, elastic scaling of compute, and seamless data portability and flexibility that are unique amongst cloud-based analytic databas ...

Apache Kudu 1.0 is Released – Cloudera VISION

This week, the Apache Kudu team announced the release of Kudu 1.0. This release marks the one-year anniversary of Kudu’s public debut, and is the culmination of much hard work by a growing team of developers and community members. In this blog post, I’ll recap the original vision for Kudu, review our accomplishments over the last year, and share where I see the project going in the future. The Origins of Ku ...

Alarm Flooding Control with Event Clustering Using Spark Streaming | Mawazo

You show up at work in the morning and open your email to find 100 alarm emails in your inbox for the same error from an application running on some server within a short time window of 1 minute. You are off to a bad start, struggling to find other emails. I was motivated by this unpleasant experience to come up with a solution to stop the deluge of the same alarm emails in a small time window. When there ...

Building Deep Neural Networks in the Cloud with Azure GPU VMs, MXNet and Microsoft R Server | Cortana Intelligence and Machine Learning Blog

Deep learning has been behind several recent breakthroughs in machine learning applications. In the field of computer vision, novel approaches such as deep residual learning developed at Microsoft Research have helped reduce the top-5 classification error at the ImageNet competition by 47% in just one year. In the field of speech and machine translation, deep neural networks (DNNs) have already enabled mill ...

Apache Impala (Incubating) on Amazon: Performance and Cost Considerations for S3 vs. EBS – Cloudera Engineering Blog

The benchmark testing results detailed below can help you make an informed decision about AWS storage options for Impala. In a recent post, you learned how Impala 2.6 on S3 delivers cloud-native features unmatched by other analytic databases in the cloud. With support to read/write data from Amazon S3, Impala provides cloud capabilities such as direct querying of data from S3, elastic scaling of compute, an ...

Writing SQL on Streaming Data with Amazon Kinesis Analytics – Part 2 – AWS Big Data Blog

Amazon Kinesis Analytics allows you to easily write SQL on streaming data, providing a powerful way to build a stream processing application in minutes. The service allows you to connect to streaming data sources, process the data with sub-second latencies, and continuously emit results to downstream destinations for use in real-time alerts, dashboards, or further analysis. This post introduces you to th ...

Processing Billions of Events in Real-Time with Heron by Karthik Ramasamy, Twitter – YouTube

https://www.youtube.com/watch?list=PLGeM09tlguZQyemL0Y5CdpEFrBs-hGGM8&v=Ug4WigMc1ms

Twitter generates tens of billions of events per hour when users interact with it. Analyzing these events to surface relevant content and to derive insights in real time is a challenge. To address this, we developed Heron, a new real time distributed streaming engine. In this talk, we first describe the ...

The Netflix Tech Blog: Netflix Data Benchmark: Benchmarking Cloud Data Stores

The Netflix member experience is offered to 83+ million global members, and delivered using thousands of microservices. These services are owned by multiple teams, each having their own build and release lifecycles, generating a variety of data that is stored in different types of data store systems. The Cloud Database Engineering (CDE) team manages those data store systems, so we run benchmarks to validate ...

2015 © Big Data Cloud Inc. All Rights Reserved.

Hadoop, the Hadoop elephant logo, and Spark are trademarks of the Apache Software Foundation.
