www.tensupport.com



MapR: Developing Apache Spark Applications

Overview
This course enables developers to get started developing Big Data applications with Apache Spark. In the first part of the course, you will use Spark's interactive shell to load and inspect data. The course then describes the various modes for launching a Spark application. You will then go on to build and launch a standalone Spark application. The concepts are taught using scenarios that also form the basis of hands-on labs.

Duration
3 days

Who is the course for
Developers interested in designing and developing Spark applications.

Prerequisites
Attendees must have Java programming experience to do the exercises.
+ Basic to intermediate Linux knowledge, including the ability to use a text editor such as vi, and familiarity with basic command-line tools such as mv, cp, ssh, grep, and cd
+ Knowledge of application development principles
+ A Linux, Windows, or macOS computer with the MapR Sandbox installed (for the on-demand course)
+ Connection to a Hadoop cluster via SSH and web browser (for the ILT and VILT courses)

What you will learn
Included in this 3-day course are:
+ Access to a multi-node Amazon Web Services (AWS) cluster
+ Lab code
+ Slide guide (PDF)
+ Lab guide (PDF)

Course Outline

Day 1

Lesson 1 – Introduction to Apache Spark
+ Describe the features of Apache Spark
+ Advantages of Spark
+ How Spark fits in with Hadoop
+ How Spark fits in with the Big Data application stack
+ Define Apache Spark components

Lesson 2 – Load and Inspect Data in Spark
+ Describe different ways of getting data into Spark
+ Create and use Resilient Distributed Datasets (RDDs)
+ Apply transformations to RDDs
+ Use actions on RDDs: Lab: Load and inspect data in an RDD
+ Cache intermediate RDDs
+ Use Spark DataFrames for simple queries: Lab: Load and inspect data in DataFrames

Lesson 3 – Build a Simple Spark Application
+ Define the lifecycle of a Spark program
+ Define the function of SparkContext: Lab: Create the application
+ Define different ways to run a Spark application
+ Run your Spark application: Lab: Launch the application

Day 2

Lesson 4 – Work with Pair RDDs
+ Describe pair RDDs
+ Why use pair RDDs
+ Create pair RDDs
+ Apply transformations and actions to pair RDDs
+ Control partitioning across nodes
+ Change partitions
+ Determine the partitioner

Lesson 5 – Work with Spark DataFrames
+ Create Apache Spark DataFrames
+ Work with data in DataFrames
+ Create user-defined functions
+ Repartition DataFrames

Lesson 6 – Monitor a Spark Application
+ Describe the components of the Spark execution model
+ Use the Spark UI to monitor a Spark application
+ Debug and tune Spark applications

Day 3

Lesson 7 – Introduction to Apache Spark Data Pipelines
+ Identify components of the Apache Spark Unified Stack
+ Benefits of the Apache Spark Unified Stack over the Hadoop ecosystem
+ Describe data pipeline use cases

Lesson 8 – Create an Apache Spark Streaming Application
+ Spark Streaming architecture
+ Create DStreams
+ Create a simple Spark Streaming application: Lab: Create a Spark Streaming application
+ Apply DStream operations: Lab: Apply operations on DStreams
+ Use Spark SQL to query DStreams
+ Define window operations: Lab: Add windowing operations
+ Describe how DStreams are fault-tolerant

Lesson 9 – Use Apache Spark GraphX to Analyse Flight Data
+ Describe GraphX
+ Define a property graph: Lab: Create a property graph
+ Perform operations on graphs: Lab: Apply graph operations

Lesson 10 – Use Apache Spark MLlib to Predict Flight Delays
+ Describe Spark MLlib
+ Describe a generic classification workflow
+ Describe common terms for supervised learning
+ Use a decision tree for classification and regression
+ Lab: Create a decision tree model to predict flight delays on streaming data

Need more information?
Phone us: +44 (0)20 7205 2550
Email us: training@
Visit us: training ................
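As a quick taster of the ideas covered above: the transformation-versus-action model from Lesson 2 and the reduceByKey pattern from Lesson 4 can be sketched without a cluster, using plain Python lazy iterators in place of RDDs. This is an illustrative analogy only, not PySpark code, and the sample data is made up:

```python
from itertools import groupby

# A tiny word count, the canonical Spark example, expressed with plain
# Python. "Transformations" (the generator expressions) build lazy
# pipelines; nothing executes until an "action" (here, sorted/sum)
# consumes the iterator -- the same laziness RDDs exhibit.
lines = ["spark makes big data simple", "big data big wins"]

# Transformation stage: split lines into words, pair each word with a 1.
words = (word for line in lines for word in line.split())
pairs = ((word, 1) for word in words)

# reduceByKey analogue: group the pairs by key, then sum each group's counts.
counts = {
    key: sum(count for _, count in group)
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0])
}

print(counts["big"])    # 3
print(counts["spark"])  # 1
```

In real Spark the grouping and summing happen in parallel across partitions; the shape of the pipeline, however, is the same.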
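Similarly, the windowing operations from Lesson 8 can be illustrated without Spark. This sketch slides a fixed-size, count-based window over a finite list; real Spark Streaming windows are time-based and operate over DStream batches, so treat this purely as an analogy:

```python
from collections import deque

# A fixed-size sliding window over a stream of events, simulated with
# collections.deque. With maxlen set, old elements fall off the front
# automatically as new ones arrive -- loosely analogous to a windowed
# DStream forgetting batches that leave the window.
WINDOW_SIZE = 3

def windowed_sums(stream, size=WINDOW_SIZE):
    """Yield the sum of the most recent `size` elements after each new one."""
    window = deque(maxlen=size)
    for value in stream:
        window.append(value)
        yield sum(window)

print(list(windowed_sums([1, 2, 3, 4, 5])))  # [1, 3, 6, 9, 12]
```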