Big Data Analytics at Facebook
This meetup will follow an "unconference" format with several presentations to choose from. Please review the topics below and, when registering, select the 2 topics you would most like to hear discussed on the night of the event!
Event Schedule
6:30pm: Doors open
6:30pm – 7:30pm: Happy Hour & Networking
7:30pm – 7:45pm and 7:45pm – 8:30pm: 2 Presentations and Q&A
8:30pm: Happy Hour & Networking
Speaker & Talk Details
Using data at Facebook
Presenter – Alex Schultz
Alex will share how Facebook uses data to better serve its users. He spent 9 years on Facebook's Growth team before being appointed to run all of Data Analytics, and he will draw on that experience to tell stories about how data has shaped Facebook's overall direction and mission.
Data pipeline development, deployment, and management using Dataswarm
Presenter – Mike Starr
At Facebook, data is used to gain insights into existing products and to drive development of new products. To do this, engineers and analysts need to seamlessly process data across a variety of backend data stores. Dataswarm is a framework for writing data processing pipelines in Python. Using an extensible library of operations (e.g. executing queries, moving data, running scripts), developers programmatically define dependency graphs of tasks to be executed. Dataswarm takes care of the rest: distributed execution, scheduling, and dependency management. The talk will cover high-level design, example pipeline code, and plans for the future.
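The pipeline style described above can be sketched in plain Python. The `Task` and `Pipeline` classes and their API below are hypothetical stand-ins (Dataswarm's actual interface is internal to Facebook); the sketch only illustrates the core idea of declaring a dependency graph of tasks and letting a scheduler execute them in the right order.

```python
# Minimal sketch of a dependency-graph pipeline, in the spirit of the
# Dataswarm description above. Names and API here are illustrative only.
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

class Task:
    def __init__(self, name, action, deps=()):
        self.name = name
        self.action = action      # callable performing the work
        self.deps = tuple(deps)   # upstream Task objects

class Pipeline:
    def __init__(self, tasks):
        self.tasks = tasks

    def run(self):
        # Execute tasks in topological (dependency) order, as a
        # scheduler would -- minus distribution, retries, and logging.
        graph = {t: t.deps for t in self.tasks}
        order = []
        for task in TopologicalSorter(graph).static_order():
            task.action()
            order.append(task.name)
        return order

# Three hypothetical operations: query a source, move data, run a script.
extract = Task("extract", lambda: print("query source tables"))
move    = Task("move", lambda: print("copy data to warehouse"), deps=[extract])
report  = Task("report", lambda: print("run reporting script"), deps=[move])

# Declaration order doesn't matter; the graph determines execution order.
print(Pipeline([report, move, extract]).run())
```

A real framework would add distributed workers, scheduling, and failure handling on top of this declaration style, which is what makes the programmatic dependency-graph approach attractive.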
Data wins arguments – Importance of shipping good data
Presenter – Piotr Grabowski
In this presentation, Piotr Grabowski (Data Engineer at Facebook) will discuss the importance of shipping good data and how to think about data quality and accuracy in a fast-paced big data environment. He will review a few examples of data quality checks he has implemented at Facebook and reflect on lessons learned.
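To make the idea of a data quality check concrete, here is a small sketch of one common pattern: comparing a partition's row count against a trailing baseline before downstream consumers pick the data up. The function name, thresholds, and numbers are illustrative assumptions, not Facebook's actual checks.

```python
# Illustrative data quality check: flag a daily load whose row count
# deviates too far from the recent average. Thresholds are assumptions.
from statistics import mean

def row_count_check(history, today, tolerance=0.25):
    """Return True if today's row count is within `tolerance`
    (default 25%) of the trailing average, False otherwise."""
    baseline = mean(history)
    deviation = abs(today - baseline) / baseline
    return deviation <= tolerance

# Usage: seven days of partition row counts, then two candidate loads.
history = [1_020_000, 990_000, 1_005_000, 1_010_000, 998_000, 1_001_000, 995_000]
print(row_count_check(history, 1_012_000))  # normal day, passes
print(row_count_check(history, 400_000))    # dropped partition, fails
```

Checks like this are cheap to run after each pipeline stage and catch whole classes of upstream failures (empty partitions, double loads) before bad data ships.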
Data analysis tools at Facebook
Presenter – Janet Wiener
At Facebook, our data analysis systems store huge volumes of data, ranging from hundreds of terabytes in memory to hundreds of petabytes on disk. They grow by millions of events (inserts) per second and process tens of petabytes and hundreds of thousands of queries per day. We use these systems for troubleshooting, identifying trends, and making decisions. In this talk, I will describe our data systems ODS, Scuba, and Hive and how we use them.
Predicting the Value of a Social Game Install
Presenter – Parsa Bakhtary
Parsa Bakhtary, Games Product Analyst at Facebook, will explore the concept of customer lifetime value (LTV) in his talk "Predicting the Value of a Social Game Install". In simple terms, LTV is the dollar value of a customer relationship to a company; equivalently, it is an upper bound on what could be spent to acquire a new customer. For free-to-download games with in-app purchases, LTV is the total expected revenue from an install. Social/mobile game LTV varies widely with demographic factors such as country, age, and gender. The traditional LTV model for apps is based on average revenue per daily active user (ARPDAU), retention, and virality. This model has several drawbacks: it relies heavily on forecasting (for the ARPDAU and retention values), and it is tightly coupled to a particular app, so it varies a lot from one app to another.