In-Memory Computing Summit 2016

In-Memory Computing Summit 2016

The best minds of the In-Memory Computing industry will gather in San Francisco on May 23-24 for IMC Summit 2016 to network, learn and exchange ideas that will power the future of ...

Big Data Analytics at FaceBook

Big Data Analytics at FaceBook

This meetup will be an “unconference” style one and will have various presentations to choose from. Please review the topics below and upon registering, select your 2 preferred top ...

Data Warehousing With Google BigQuery

Data Warehousing With Google BigQuery

Data warehousing and the resulting business intelligence are the basic necessities of business today. And today’s technologies makes it possible to have a sophisticated data wareho ...

Innovative Big Data Application Optimizes Lead Conversions, built on the Google Cloud Platform – CASE STUDY

Innovative Big Data Application Optimizes Lead Conversions, built on the Google Cloud Platform – CASE STUDY

In the era of Big Data, many enterprise executives are struggling with the sheer volume of available data and how to transform all that information into intelligence they can use t ...

Predictive policing: The future of law enforcement

Predictive policing: The future of law enforcement

As Dj Das, founder and CEO of Third Eye Consulting Services, sums it up, “For fighting crime and keeping every citizen safe, Microsoft has the most sophisticated cloud-based big da ...

TensorForce: A TensorFlow library for applied reinforcement learning – reinforce.io

This blogpost will give an introduction to the architecture and ideas behind TensorForce, a new reinforcement learning API built on top of TensorFlow. This post is about a practical question: How can the applied reinforcement learning community move from collections of scripts and individual examples closer to an API for reinforcement learning (RL) — a ‘tf-learn’ or ‘skikit-learn’ for RL? Before discussing ...

Read more

When every drop counts: Schneider Electric transforms agriculture with the Internet of Things for sustainable farming – Transform

In the grassy Canterbury Plains of New Zealand, Craig Blackburn raises cattle and sheep in a line of work with a long tradition, in which he keeps a close eye on crops, land, weather and water. But Blackburn blends modern technology with his agricultural roots to manage the 990-acre Blackhills farm, a complex, bustling operation with 2,100 cattle and 800 sheep. The farm runs on irrigated water from the scen ...

Read more

Exactly-once Semantics is Possible: Here’s How Apache Kafka Does it

I’m thrilled that we have hit an exciting milestone the Kafka community has long been waiting for: we have introduced exactly-once semantics in Apache Kafka in the 0.11 release. In this post, I’d like to tell you what exactly-once semantics mean in Apache Kafka, why it is a hard problem, and how the new idempotence and transactions features in Kafka enable correct exactly-once stream processing using Kafka’ ...

Read more

Baidu employs the PaddlePaddle framework internally for prediction systems, along with Python to make training models and deriving predictions a snap Many of the latest machine learning and data science tools purport to be easy to work with compared to previous generations of such frameworks and libraries. Chinese search engine giant Baidu now has an open source project in the same vein: a machine learning ...

Read more

Azure Data Lake Store: a hyperscale distributed file service for big data analytics | the morning paper

Azure data lake store: a hyperscale distributed file service for big data analytics Douceur et al., SIGMOD’17 Today’s paper takes us inside Microsoft Azure’s distributed file service called the Azure Data Lake Store (ADLS). ADLS is the successor to an internal file system called Cosmos, and marries Cosmos semantics with HDFS, supporting both Cosmos and Hadoop workloads. Microsoft are in the process of migra ...

Read more

Amazon Kinesis vs. Apache Kafka For Big Data Analysis – Dataconomy

Data processing today is done in form of pipelines which include various steps like aggregation, sanitization, filtering and finally generating insights by applying various statistical models. Amazon Kinesis is a platform to build pipelines for streaming data at the scale of terabytes per hour. Parts of the Kinesis platform are a direct competitor to the Apache Kafka project for Big Data Analysis. The platf ...

Read more

Serverless Scaling for Ingesting, Aggregating, and Visualizing Apache Logs with Amazon Kinesis Firehose, AWS Lambda, and Amazon Elasticsearch Service | AWS Database Blog

In 2016, AWS introduced the EKK stack (Amazon Elasticsearch Service, Amazon Kinesis, and Kibana, an open source plugin from Elastic) as an alternative to ELK (Amazon Elasticsearch Service, the open source tool Logstash, and Kibana) for ingesting and visualizing Apache logs. One of the main features of the EKK stack is that the data transformation is handled via the Amazon Kinesis Firehose agent. In this pos ...

Read more

Running Streaming Jobs Once a Day For 10x Cost Savings – The Databricks Blog

This is the sixth post in a multi-part series about how you can perform complex streaming analytics using Apache Spark. Traditionally, when people think about streaming, terms such as “real-time,” “24/7,” or “always on” come to mind. You may have cases where data only arrives at fixed intervals. That is, data appears every hour or once a day. For these use cases, it is still beneficial to perform incrementa ...

Read more

Manage Query Workloads with Query Monitoring Rules in Amazon Redshift | AWS Big Data Blog

Data warehousing workloads are known for high variability due to seasonality, potentially expensive exploratory queries, and the varying skill levels of SQL developers. To obtain high performance in the face of highly variable workloads, Amazon Redshift workload management (WLM) enables you to flexibly manage priorities and resource usage. With WLM, short, fast-running queries don’t get stuck in queues behi ...

Read more

Google Spanner: Beginning of the End of the NoSQL World? – ACM SIGMOD Blog

Google has recently announced that its flagship wide-area database named Spanner has been made available on the Google Cloud. Google Spanner is the next generation globally-distributed database built inside Google and announced to the world through the paper published in OSDI 2012 [1]. This article explores the implication of Google Spanner, in particular to the NoSQL world. CAP Theorem: A Quick Recap The t ...

Read more

2015 © Big Data Cloud Inc. All Rights Reserved.

Hadoop and the Hadoop elephant logo, Sprark are trademarks of the Apache Software Foundation.

Scroll to top