Frank Kane’s Taming Big Data with Apache Spark and Python (pdf)

$5.00

Author: Frank Kane
Edition: 1
Edition Year: 2017
Format: PDF
Language: English
Number of Pages: 298
Publisher: Packt Publishing
ISBN: 9781787287945

Description

What you will learn

  • Find out how you can identify Big Data problems as Spark problems
  • Install and run Apache Spark on your computer or on a cluster
  • Analyze large data sets across many CPUs using Spark’s Resilient Distributed Datasets
  • Implement machine learning on Spark using the MLlib library
  • Process continuous streams of data in real time using the Spark Streaming module
  • Perform complex network analysis using Spark’s GraphX library
  • Use Amazon’s Elastic MapReduce service to run your Spark jobs on a cluster

About the Author

My name is Frank Kane. I spent nine years at Amazon and IMDb, wrangling millions of customer ratings and transactions to produce features such as personalized movie and product recommendations and “people who bought this also bought.” I wish we had had Apache Spark back then, during the years I spent trying to solve those problems. I hold 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, I left to start my own successful company, Sundog Software, which focuses on virtual reality environment technology and on teaching others about big data analysis.

Table of Contents

  1. Getting Started with Spark
  2. Spark Basics and Simple Examples
  3. Advanced Examples of Spark Programs
  4. Running Spark on a Cluster
  5. SparkSQL, DataFrames and Datasets
  6. Other Spark Technologies and Libraries
  7. Where to Go From Here? – Learning More About Spark and Data Science
