Scala and Spark for Big Data Analytics: Explore the concepts of functional programming, data streaming, and machine learning (pdf)

$14.00

Author: Md. Rezaul Karim, Sridhar Alla
Edition: 1
Edition Year: 2017
Format: PDF
ISBN: 9781785280849
Language: English
Number of Pages: 796
Publisher: Packt Publishing

Description

Scala has seen wide-scale adoption over the past few years, particularly in data science and analytics. Spark, which is built on Scala, has also gained recognition and is now widely used in production. This book is designed to help you leverage the power of Scala and Spark to make sense of big data.

Scala and Spark for Big Data Analytics begins by introducing you to Scala and helping you understand the object-oriented and functional programming concepts required for Spark application development. You’ll then move on to Spark and cover its basic abstractions, Resilient Distributed Datasets (RDDs) and DataFrames. This will help you develop scalable, fault-tolerant streaming applications by analyzing structured and unstructured data using Spark SQL, GraphX, and Spark Structured Streaming. In the sections that follow, you’ll explore advanced topics such as monitoring, configuration, debugging, testing, and deployment, which will further help you manage your data effectively.

After this, you’ll learn to use the SparkR and PySpark APIs to develop applications, and deploy Zeppelin for interactive data analytics. In the concluding chapters, you’ll use Alluxio to facilitate in-memory data processing.

By the end of this book, you’ll have a clear understanding of Spark and be able to perform full-stack data analytics regardless of the amount of data.

Key Features

  • Experience Scala’s sophisticated type system, combining functional programming and object-oriented concepts
  • Work on an array of applications, ranging from simple batch jobs to stream processing and machine learning
  • Perform large-scale data analysis by exploring both common and complex use cases

What you will learn

  • Get an in-depth understanding of Scala collection APIs
  • Work with RDD and DataFrame to learn Spark’s core abstractions
  • Analyse structured and unstructured data using Spark SQL and GraphX
  • Build scalable and fault-tolerant streaming applications using Spark Structured Streaming
  • Discover machine learning (ML) best practices for classification, regression, dimensionality reduction, and recommendation systems, and build predictive models with widely used algorithms in Spark MLlib and ML
  • Develop clustering models to group large volumes of data
  • Get to grips with tuning, debugging, and monitoring Spark applications
  • Deploy Spark applications on real clusters using the Standalone, Mesos, and YARN (Yet Another Resource Negotiator) cluster managers

Who this book is for

If you want to learn how to perform data analysis by harnessing the power of Spark, this is the book for you. Prior knowledge of Spark or Scala is not required. Programming experience, particularly with other Java Virtual Machine (JVM) languages, will help you grasp the concepts more easily.

Table of Contents

1. Introduction to Scala
2. Object-Oriented Scala
3. Functional Programming Concepts
4. Collection APIs
5. Tackle Big Data – Spark Comes to the Party
6. Start Working with Spark – REPL and RDDs
7. Special RDD Operations
8. Introduce a Little Structure – Spark SQL
9. Stream Me Up, Scotty – Spark Streaming
10. Everything is Connected – GraphX
11. Learning Machine Learning – Spark MLlib and Spark ML
12. My Name is Bayes, Naive Bayes
13. Time to Put Some Order – Cluster Your Data with Spark MLlib
14. Text Analytics Using Spark ML
15. Spark Tuning
16. Time to Go to ClusterLand – Deploying Spark on a Cluster
17. Testing and Debugging Spark
18. PySpark and SparkR

