Hands-On Introduction to Apache Spark

Track: General Track

Session Number: 5005
Date: Tue, May 2nd, 2017
Time: 2:10 PM - 3:10 PM
Room: Room B

Description:

This hands-on session will provide participants with basic knowledge of Apache Spark. The session has two hands-on labs that teach participants how to start coding with Apache Spark in a Jupyter Notebook on the IBM Data Science Experience/IBM Bluemix cloud platform. Lab One, “Hello Spark,” shows you the basics of creating RDDs, pulling in data files, and running map, filter, and other basic transformation commands. Lab Two, “Spark SQL,” shows you the basics of creating DataFrames, running Spark SQL, creating tables, joining tables/DataFrames, and visualizing Spark SQL results with matplotlib.
What will attendees learn: Learn to code with Apache Spark using Jupyter.
Session Type: Podium Presentation

Session Code: HOL
Audience experience level: Beginner, Intermediate, Advanced
Presentation Category: Emerging Technology
Presentation Platform: Cross Platform
Audiences this presentation will apply to: Data Architects, Database Administrators
Technical areas this presentation will apply to: User Experiences