Explore Customer Churn with Cramer Index

Classification problems involve predicting a response variable based on a set of feature variables for some entity. But there is another problem whose solution is a prerequisite for solving a classification problem: we may want to know which of the feature variables are most strongly correlated with the response variable. Once we have identified those, we may only want to use that subset of the fea ...
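
One common way to score the association between a categorical feature and a categorical response is Cramér's V, which is derived from the chi-squared statistic of their contingency table. The following is a minimal sketch, assuming that this is the statistic meant by "Cramer index" here; the function name `cramers_v` and the sample plan and churn values are hypothetical and only for illustration.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V between two equal-length sequences of categorical values."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (feature, response) cell
    row = Counter(xs)              # marginal counts for the feature
    col = Counter(ys)              # marginal counts for the response

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Normalize by sample size and the smaller table dimension minus one.
    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # a variable with a single value carries no association
    return math.sqrt(chi2 / (n * k))


# Hypothetical toy data: subscription plan versus churn outcome.
plan = ["basic", "basic", "premium", "premium"]
churned = ["yes", "yes", "no", "no"]
score = cramers_v(plan, churned)
```

Values near 0 suggest little association and values near 1 suggest a strong one, so computing the score for each feature against the response gives a simple ranking for feature selection.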

WANdisco Completes Acquisition of Pioneering Big Data Company AltoStor

Acquisition of Hadoop expertise will accelerate WANdisco’s product development in fast-growing Big Data market. WANdisco (LSE: WAND), a leading provider of global collaboration software to the software development industry, announced today it has completed the acquisition of Silicon Valley-based software company AltoStor. This acquisition underscores WANdisco’s drive into the Big Data solutions market, forec ...

The Hive « Supercharging Visualization with Data Mining – A Cyber Security Story (Pt.2)

Part 2: In the previous blog post I introduced the concept of integrating the four key analytics disciplines (user experience design, interactive visualization, a scalable data store, and data mining) to create security intelligence. The post was fairly theoretical. This post is about putting the theory from the last post into practice. We are going to continue with our banking example. Following is a quick an ...

It’s a lonely life for outliers

In this post, I am back to outliers and fraud analytics. In an earlier post, I gave an overview of outlier detection techniques that are being implemented with Hadoop in my open source project beymani. In another earlier post, I talked about a multivariate distribution-model-based implementation in beymani. We postulated that data points falling in low-frequency histogram bins are potentially outliers. In ...
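
The low-frequency-bin idea can be illustrated in a few lines. This is a single-variable sketch under assumed parameters, not the multivariate Hadoop implementation in beymani; the function name `low_frequency_outliers` and the parameters `bin_width` and `min_fraction` are hypothetical.

```python
from collections import Counter


def low_frequency_outliers(values, bin_width, min_fraction):
    """Return the values whose histogram bin holds less than min_fraction of the data."""
    # Assign each value to a fixed-width bin and count the bin populations.
    bins = Counter(int(v // bin_width) for v in values)
    n = len(values)
    # A value is flagged when its bin's share of all points is below the threshold.
    return [v for v in values if bins[int(v // bin_width)] / n < min_fraction]


# Hypothetical transaction amounts: most fall in one bin, one sits alone.
amounts = [10, 11, 12, 13, 14, 15, 16, 17, 18, 95]
flagged = low_frequency_outliers(amounts, bin_width=10, min_fraction=0.2)
```

The bin width and threshold control how aggressively points are flagged, so both would need tuning against real data.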

Fraudsters are not Model Citizens

In my earlier post, I gave an overview of outlier detection techniques in the big data, and specifically Hadoop, context. As I mentioned, fraud detection essentially translates to outlier detection in data mining parlance. In this post, I will go over a distribution-model-based technique which has been implemented as two MapReduce jobs in beymani, available on GitHub. I will use the credit card transaction a ...

IBM Taps i2 for Big Data Analytics Expertise

IBM has signed a definitive agreement to acquire i2, a maker of intelligence analytics tools for crime and fraud prevention. IBM has announced an agreement to acquire i2 to accelerate its business analytics initiatives and help clients in the public and private sectors address crime, fraud and security threats. Financial terms of the deal were not disclosed. i2, with more than 4,500 customers in 150 countri ...

New tools driving big data analytics, survey finds

New technologies are enabling companies to perform increasingly sophisticated data analytics on very large and very diverse data sets, an upcoming report from The Data Warehousing Institute (TDWI) shows. The report is based on responses from 325 IT managers, business users and consultants at small, medium and large companies. Slightly more than a third of the respondents said they are currently running some ...

5 real-world uses of big data

In the past year, big data has emerged as one of the most closely watched trends in IT. Organizations today are generating more data in a single day than the entire Internet generated as recently as 2000. The explosion of “big data”, much of it in complex and unstructured formats, has presented companies with a tremendous opportunity to leverage their data for better business insights through analyti ...

Big Data Cloud’s Monthly Meetup: June 3rd Topics & Speakers Announced

We have an excellent round of speakers & topics for Big Data Cloud’s Monthly June 3rd Meetup:

Session 1: Optimizing bursty Hadoop analysis demands for big data using AWS
Topics Covered: The tradeoffs of storage on S3, EBS, and HDFS; Optimizing bursty Hadoop analysis demands; Latencies, Prices, and stretchy clusters
From: 6:15 pm to 6:45 pm
Speaker: Paul Baclace
Bio: Paul is a veteran Software Engineer of 9+ s ...

IBM Ups Big Data Bet with New Software, $100 Million in Research

On the same day that IBM passed Microsoft in market cap, Big Blue showed how it will ride the growth of big data to continue its momentum. IBM announced a new $100 million investment for future data analytics along with new services and software aimed at helping improve data analysis and new services for IT professionals. The news, shared at an event at its Watson Research Center, highlights the work IBM ha ...

2013 © Big Data Cloud Inc. All Rights Reserved.

Hadoop and the Hadoop elephant logo are trademarks of the Apache Software Foundation.
