Learning Spark Lightning-Fast Big Data Analysis

Added on March 3, 2015 by Shishirkumar in Books
Torrent verified.

Learning Spark Lightning-Fast Big Data Analysis (Size: 7.9 MB)

		Learning Spark Lightning-Fast Big Data Analysis.pdf	7.82 MB
		Cover Image.jpg	76.83 KB

Description

About Book
Data in all domains is getting bigger. How can you work with it efficiently? This book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala.

Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You'll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.

Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell
Leverage Spark's powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib
Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm
Learn how to deploy interactive, batch, and streaming applications
Connect to data sources including HDFS, Hive, JSON, and S3
Master advanced topics like data partitioning and shared variables

The Web is getting faster, and the data it delivers is getting bigger. How can you handle everything efficiently? This book introduces Spark, an open source cluster computing system that makes data analytics fast to run and fast to write. You'll learn how to run programs faster, using primitives for in-memory cluster computing. With Spark, your job can load data into memory and query it repeatedly much quicker than with disk-based systems like Hadoop MapReduce.

Written by the developers of Spark, this book will have you up and running in no time. You'll learn how to express MapReduce jobs with just a few simple lines of Spark code, instead of spending extra time and effort working with Hadoop's raw Java API.

Quickly dive into Spark capabilities such as collect, count, reduce, and save
Use one programming paradigm instead of mixing and matching tools such as Hive, Hadoop, Mahout, and S4/Storm
Learn how to run interactive, iterative, and incremental analyses
Integrate with Scala to manipulate distributed datasets like local collections
Tackle partitioning issues, data locality, default hash partitioning, user-defined partitioners, and custom serialization
Use other languages by means of pipe() to achieve the equivalent of Hadoop streaming

Product Identifiers
ISBN-10 1449358624
ISBN-13 9781449358624

Key Details
Author Andy Konwinski, Holden Karau, Mark Hamstra, Matei Zaharia, Patrick Wendell
Number Of Pages 274 pages
Format Ebook
Publication Date 2015-02-27
Language English
Publisher O'Reilly Media, Incorporated

Additional Details
Copyright Date 2013

Target Audience
Group Scholarly & Professional
