Hands-On Introduction to Apache Spark

Session Number: 4373
Track: Hands On Labs
Session Type: Podium Presentation
Primary Presenter: Carlo Appugliese [IBM]
Time: May 02, 2017 (02:10 PM - 04:45 PM)
Room: Castle A&B - Hands on Labs

Audience experience level: Beginner, Intermediate, Advanced

Abstract: This hands-on session gives participants a basic working knowledge of Apache Spark. Two hands-on labs teach participants how to start coding with Apache Spark in a Jupyter Notebook on the IBM Data Science Experience/IBM Bluemix cloud platform. Lab One, “Hello Spark,” covers the basics of creating RDDs, pulling in data files, and running map, filter, and other basic transformation commands. Lab Two, “Spark SQL,” covers the basics of creating DataFrames, running Spark SQL, creating tables, joining tables/DataFrames, and visualizing Spark SQL results with matplotlib.
What will attendees learn: Learn to code with Apache Spark using Jupyter.