The Time Is Now! Join the Data Tech Movement
Data scientists and data architects: Don't miss this opportunity to meet and learn from leaders in the industry at the Data Tech Summit. Deep technical content + valuable networking opportunities = success.
Looking for broader education for DB2 users? Learn more about the IDUG Tech Conference!

Day 1

Forrester’s View on Big Data, Analytics & Open Source

Mike Gualtieri, Vice President, Forrester

Monday, May 1st

Mike's research focuses on software technology, platforms, and practices that enable technology professionals to deliver prescient digital experiences and breakthrough operational efficiency. His key technology and platform coverage areas are big data and IoT strategy, Hadoop/Spark, predictive analytics, streaming analytics, prescriptive analytics, machine learning, data science, AI, and emerging technologies that make software faster and smarter.

Biography:

Mike has more than 25 years' experience in the industry helping firms design and develop mission-critical applications in eCommerce, insurance, banking, travel/hospitality, manufacturing, healthcare, and scientific research for organizations including NASA, eBay, Bank of America, Liberty Mutual, Nielsen, EMC, and others. He has written thousands of lines of code, managed development teams, and consulted with dozens of technology firms on product, marketing, and R&D strategy. He is a frequent and sought-after speaker at industry, corporate, educational, and technology events for his audience-designed, insightful, and energetic speeches.

Education:

Mike earned a B.S. in computer science and management from Worcester Polytechnic Institute. While a student, Mike was awarded three US patents for inventing an expert system used to train air traffic controllers around the world.

Winning with Machine Learning. Monetize the data behind your firewall

Daniel Hernandez, VP IBM Analytics, IBM

Monday, May 1st

Competitive pressures and the need to innovate are everywhere. Data is the new currency, and exploiting your data to gain insights is becoming core to every business. In this presentation, you will learn about IBM's point of view, how we help you progress on your journey toward cognitive self-service analytics with machine learning, and how to generate value today and tomorrow as your organization embraces hybrid cloud solutions.

Biography:

Daniel is currently the offering management leader for the IBM Analytics division and part of a talented team that reinvents analytics by focusing on business, building, operating, and enhancing integrated analytic solutions. Daniel always looks beyond the envelope, loves challenges, and grows remarkable teams that put the client first.

Big Data & Analytics in a Cognitive Era

Dr. Christine Ouyang
Distinguished Engineer; Strategic IBM Analytics CTO Office; Master Inventor; Member of Academy of Technology, IBM

Dr. Christine Ouyang is a Distinguished Engineer in the CTO Office of the IBM Analytics Group. She is an IBM Master Inventor and a member of the Academy of Technology. She leads a global team that builds long-term strategic partnerships with clients in areas such as Big Data, Predictive Analytics, Cognitive, Open Source and Cloud. She is an innovative thought leader in Strategic Partnership and Ecosystem Development and a T-shaped technical leader. Prior to her current role, she was at IBM Corporate Strategy, where she established strategic Collaborative Innovation Centers. She holds 70 US patents and has 90+ scientific publications. She has a Ph.D. in Electrical Engineering from the University of Texas at Austin.

Building a cognitive enterprise is a journey. This talk will first explain how analytics and cognitive are related to and different from each other. Next, it will discuss the three steps to achieve the ultimate Cognitive Enterprise: 1) Expand the aperture of data and access to data. Acquire all data (enterprise, third-party and public open data). Build data lake(s). Break data silos. Provide management and governance and hence access to data for all users; 2) Build a strong analytics and cognitive foundation. IBM has full capabilities to support our clients; and 3) Empower all personas with analytics and cognitive capabilities. Enable self-service and collaboration. Embed cognitive in the lifecycle of analytics.

Machine Learning With Spark

Emil Kotrc
Principal Software Engineer, CA Technologies

Emil Kotrc is a principal software engineer at CA Technologies who has worked in the Prague Technology Center since 2005. He started on resource management products for the mainframe and later moved into database management for DB2 for z/OS development. Before joining CA Technologies, Emil worked in an academic environment. Emil is an IBM Champion for 2015 and 2016.

Apache Spark defines itself as a fast and general engine for large-scale data processing, and it has become very popular since its inception. One of the key features of Spark is that it comes with a set of built-in libraries. One of these is MLlib, Spark's machine learning library. Machine learning is a well-established field of data science that studies algorithms for learning from and making predictions on data. MLlib brings these algorithms to Spark. In this presentation we will give a short overview of Spark and machine learning, and will show how you can benefit from these two worlds.

Data Without Governance Is a Liability: Data Lake Best Practices

David Stevens
IBM

David Stevens has a solid history of proven success with industry leaders including IBM, Princeton Softech, and Oracle Corporation. His roles have included VP of Digital Asset Management, VP of Strategic Services, Global Oracle Program Director, and owner/founder of a technology services company. As part of IBM’s Information Management WW COE, he was the Delivery Excellence Manager. In his current role, he is on the IBM Analytics Platform Competitive and Product Strategy team, supporting Data Lake, Information Integration & Governance, and Hosted Cloud Services.

Data enablement is a crucial capability for successfully turning data into actionable information; i.e., creating business value from data. The lack of this capability is a significant stressor to data management and analytics delivery across industries. Compounding the challenge, data professionals have to face this fact daily: data without governance is a material liability. This talk outlines some common assumptions and blind spots related to data lake implementation strategies which, once understood, can dramatically increase data enablement across an organization. Focus areas include components of the data lake, the value of governed data, and best practices for architecture, technology and deployment.

Feeding Data Hungry Machine Learning

Tim Willging
Rocket Software

Database Architect and Strategist Tim Willging joined the Rocket Software team in 2005 and was named a distinguished engineer in 2015. With over 25 years developing enterprise software tools, Tim has architected and authored several database products focusing on backup and recovery, cloning, administration, monitoring and change management. He has recently been focusing on solutions for advanced analytics on z/OS. He is currently based in Chicago, and is an alumnus of Northern Illinois University with an emphasis in theoretical computing.

The mainframe is now a viable platform for machine learning, and many companies are looking to embed in-transaction analytics. A myriad of new technologies have become available in the last few years that can help companies reliably and cheaply prepare and analyze data on z/OS. This presentation will cover how to leverage your modern mainframe for machine learning by providing it with easy access to data.

Performance Enterprise Architectures for Analytic Design Patterns

Dave Beulke
Dave Beulke & Associates

Dave Beulke is a system strategist, application architect, and performance expert specializing in Big Data, data warehouses, and high performance internet business solutions. He is an IBM Gold Consultant, Information Champion, President of DAMANCR, former President of International DB2 User Group, and frequent speaker at national and international conferences. His architectures, designs, and performance tuning techniques help organizations better leverage their information assets, saving millions in processing costs. Follow him on Twitter or connect through LinkedIn.

With today’s cloud, application appliance and columnar database alternatives, implementing another analytic data silo is quickly and easily done. Architecting enterprise analytical systems requires comprehensive data management and usage analysis to build the proper performance infrastructure design pattern. Drawing on experience designing, deploying and implementing successful 5 billion, 8 billion, and 22 billion row analytical database systems, this presentation will discuss the design patterns that provide sustainable performance for fast analytic answers. You will learn about enterprise analytics architectures and the special design patterns and techniques that maximize performance.


Day 2

R as a Weapon of Choice for Data Science

Emil Kotrc
Principal Software Engineer, CA Technologies

Emil Kotrc is a principal software engineer at CA Technologies who has worked in the Prague Technology Center since 2005. He started on resource management products for the mainframe and later moved into database management for DB2 for z/OS development. Before joining CA Technologies, Emil worked in an academic environment. Emil is an IBM Champion for 2015 and 2016.

R is a free software project for statistical computing and graphics. It is already well established in the data science world and is used by many researchers and statisticians around the world. R is also gaining popularity outside academia and is being integrated with many tools and platforms. You can use R to analyze your DB2 data, your data in Big Data platforms, or even your data in the cloud. This presentation will introduce the R project, show its main benefits, and show you how to use it to process your data across platforms and data sources. We will also discuss the integration of R with DB2 as a data source and with Spark as a processing engine.

Taking a Leap Forward; High-speed Analytics Running on Your IoT Devices Without Moving the Data!

Robert Neugebauer
IBM

Bob Neugebauer is a senior developer at IBM. He has deep experience in the SQL language and tuning, with over 15 years of experience in DB2 Query Optimization development. Beyond development, Bob is often called upon to assist with DB2 Proof of Concept and enablement efforts. He is the author of several patents related to database technology and has become a regular presenter at IDUG on topics such as query optimization and advanced DB2 analytics. Currently he is researching and developing new technology to bring advanced analytics to highly distributed datasets.

Imagine the potential if it were possible to run advanced analytics from Spark, RStudio, Python and SQL directly on thousands of lightweight IoT data sources. Imagine this was as easy to configure as downloading an app on your phone, and could process data hundreds of times faster than current technologies for "analytics at the edge". IBM® Queryplex can scale to millions of connected nodes, creating self-organizing, organically inspired constellations and using that structure to achieve dramatic processing acceleration. Plus, there are no more stale copies of your data created by replicating it into a central warehouse or big data cluster; the original data is accessed at its source.

Every now and then an idea comes along that changes what we thought technology could achieve. Too good to be true? Come and learn about the coming of IBM® Queryplex.

Make Data, Not War: Reconciling the NoSQL vs. SQL Debate

Chris Bienko
IBM

Chris Bienko is the co-author of several publications on modern database and analytics technologies, including Big Data Beyond the Hype and multiple IBM Redbooks. As part of IBM’s World-Wide Technical Sales organization, he supports enterprises in their adoption of the IBM Cloud Data Services portfolio. Over the better part of three years, he has navigated the cloud computing domain, enabling IBM customers and sellers to stay abreast of these rapidly evolving technologies.

The cloud marketplace and the developers building applications and services atop it require flexible deployment and platform options. They need the ability to provision databases across multiple vendors and platforms. These are the requirements of today's mobile application ecosystem.

This presentation will cover these fundamental challenges to modern application design and illustrate how NoSQL databases such as Cloudant mitigate the risks. In particular, we'll cover why NoSQL databases excel at offline application data access and at cross-data-center and device replication, and how these techniques nicely complement traditional RDBMS use cases.


Special Interest Group (SIG) for Data Architecture and Data Science

The SIG will cover a number of new technologies in the Data Architecture and Data Science world, including machine learning, Spark, analytics, Internet of Things (IoT), Hadoop, NoSQL and R.

Hands-on Lab – Apache Spark (Exclusively for Data Tech Summit Attendees)

This hands-on session will provide participants with a basic knowledge of Apache Spark. The session has two hands-on labs that teach participants how to start coding with Apache Spark in a Jupyter Notebook on the IBM Data Science Experience/IBM Bluemix cloud platform. Lab One, “Hello Spark,” shows you the basics of creating RDDs, pulling in data files, and running map, filter and other basic transformation commands. Lab Two, “Spark SQL,” shows you the basics of creating DataFrames, running Spark SQL, creating tables, joining tables/DataFrames, and visualizing Spark SQL results with matplotlib.
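
As a rough preview of the map and filter style that Lab One practices, the sketch below uses Python's built-in lazy `map` and `filter` on a small in-memory list. It is not Spark code and is not taken from the lab material: the sample lines and all variable names are made up for illustration, and an RDD would distribute the same steps across a cluster rather than run them in one process.

```python
# Plain-Python stand-in for the map/filter idea; no Spark installation is needed.
# The sample data below is hypothetical, not from the lab.
lines = ["spark is fast", "hello spark", "hello world", ""]

# "Transformations" are lazy: map() and filter() only build iterators here,
# nothing is computed yet.
non_empty = filter(lambda line: line.strip(), lines)      # drop blank lines
word_lists = map(str.split, non_empty)                    # split each line into words
with_spark = filter(lambda words: "spark" in words, word_lists)

# An "action" forces evaluation, loosely comparable to collecting an RDD's results.
result = list(with_spark)
print(result)
```

The point of the sketch is the separation between describing a pipeline (the lazy `map` and `filter` calls) and triggering it (the final `list` call), which is the same mental model the lab applies to Spark transformations.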