High Performance Spark

Best Practices for Scaling and Optimizing Apache Spark

Nonfiction, Computers, Database Management, Data Processing, Application Software, Business Software
Author: Holden Karau, Rachel Warren
ISBN: 9781491943151
Publisher: O'Reilly Media
Publication: May 25, 2017
Imprint: O'Reilly Media
Language: English

Apache Spark is amazing when everything clicks. But if you haven’t seen the performance improvements you expected, or still don’t feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources.

Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you’ll also learn how to make it sing.

With this book, you’ll explore:

  • How Spark SQL’s new interfaces improve performance over Spark’s RDD data structure
  • The choice between data joins in Core Spark and Spark SQL
  • Techniques for getting the most out of standard RDD transformations
  • How to work around performance issues in Spark’s key/value pair paradigm
  • Writing high-performance Spark code without Scala or the JVM
  • How to test for functionality and performance when applying suggested improvements
  • Using Spark MLlib and Spark ML machine learning libraries
  • Spark’s Streaming components and external community packages