Check-in for workshops - please note that each workshop costs $150 and requires separate registration
In this tutorial we'll explore the concepts and motivations behind continuous applications, see how the Structured Streaming Python APIs in Apache Spark™ enable writing them, examine the programming model behind Structured Streaming, and look at the APIs that support it.
Data science and machine learning have been synonymous with languages like Python. Libraries like NumPy and Pandas have become the de facto standard when working with data. The DataFrame object provided by Pandas gives us the ability to work with the heterogeneous, unstructured data that is common in "real world" datasets.
New learners are often drawn to Python and Pandas by the many exciting models and insights the language makes possible, but are awestruck when faced with the initial learning curve.
This is a tutorial for beginners on using the Pandas library in Python for data manipulation. We will go from the basics of how to load and look at a dataset in Pandas for the first time, and begin the process of preparing data for analysis.
This workshop will have four or so Lego Mindstorms that we'll use to learn how to program a Mindstorm to use its camera to follow a racetrack made out of tape.
Check-in for workshops - please note that each workshop costs $150 and requires separate registration
Build up your Python skills by working on a short, focused project.
This fun workshop is a hands-on class teaching many core Python skills.
We practice test driven development to create and apply a unittest module.
Skills covered:
Logging module
Named tuples
Classes and special methods
Writing and using context managers
Exception handling
Conditional expressions
main sections
Assertions
getattr() and hasattr()
F-strings
Test driven development
The doctest, unittest and py.test modules
Code organization and documentation
This workshop will help you level-up your general Python skills from a data science perspective. It is designed for people who are already familiar with the basics of Python and interested in their application to data processing.
Beef up your Python skills by working on a small, focused project.
This fun workshop is a hands-on class teaching foundational Python skills.
We'll create a working web framework from scratch.
Skills covered:
HTTP headers and status codes
The dataclasses module and type declarations
Joining lists of strings
JSON encoding and decoding
Query string parsing
Dictionary comprehensions
How WSGI works (Web Server Gateway Interface)
String encoding and decoding
Making apps for the wsgiref module
REST APIs
Modules: logging, wsgiref, http, urllib.parse
Briefly show Flask and Bottle
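As a rough illustration of several skills in the list above (WSGI, query string parsing, JSON encoding, and string encoding), here is a minimal sketch of a WSGI application. It is not the workshop's framework; the names `app` and `call_app` and the `name` query parameter are hypothetical, chosen only for this example.

```python
import json
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A tiny WSGI application: a callable taking (environ, start_response)."""
    # Query string parsing: parse_qs returns a dict of lists.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]

    # JSON encoding, then string encoding: WSGI bodies must be bytes.
    body = json.dumps({"greeting": f"hello, {name}"}).encode("utf-8")

    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(query_string):
    """Hypothetical helper: call the app directly, without a real server."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal WSGI environ
    environ["QUERY_STRING"] = query_string

    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], json.loads(body.decode("utf-8"))
```

To serve it locally, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` should work; the port is arbitrary.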
In this session, we will cover how to create Deep Neural Networks using the PyTorch framework on a variety of examples. The material will range from beginner - understanding what is going on "under the hood", coding the layers of our networks, and implementing backpropagation - to more advanced material on RNNs, CNNs, LSTMs, & GANs.
Check-in | Breakfast at Fisher Atrium
Opening and Champion Sponsor's Remarks at Robertson Room
Innovations in machine learning are changing our perception of what is possible to do with a computer. But how will machine learning change the way we program, the tools we use, and the mix of tasks done by expert programmers, novice programmers, and non-programmers? This talk examines some possible futures.
In 2017, the CPython codebase was moved to GitHub from Mercurial, an effort that took more than three years of planning and lots of volunteer coordination. The move proved to be successful and well-appreciated. New contributors face fewer barriers when contributing to Python. Core developers are benefiting from personal assistants in the form of GitHub bots and automations. Can the workflow be even better? In this talk, we'll look into another problem in CPython's workflow: the issue tracker itself.
The acceptance of PEP 581 by the Python Steering Council means that another big workflow change is impending. Let's hear about some of the proposed plans for improving CPython's workflow, and learn how you can help and take part in this process.
Break
Curious what's coming next for Python? Well, Python 3.8 is being released in just a few short months. It already includes a number of new features such as assignment expressions, improved debugging, more string formatting features, and many smaller changes. Come learn about the big ones and some of the more interesting small ones too!
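For readers who have not seen them yet, the short sketch below shows two Python 3.8 features: assignment expressions (PEP 572) and the `=` specifier in f-strings. The function names are hypothetical and the example requires Python 3.8 or later; it is not taken from the talk.

```python
import re


def first_number(text):
    """Return the first integer found in text, or None."""
    # Assignment expression: bind the match and test it in one step.
    if (match := re.search(r"\d+", text)) is not None:
        return int(match.group())
    return None


def describe(value):
    """Format a value for quick print-style debugging."""
    # The "=" specifier echoes the expression text followed by its repr.
    return f"{value=}"
```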
Bridgette Powell
(DevOps, Testing, & Automation,
Web, IoT, & Hardware)
Robertson 2
30 mins
Without a unified analytics solution, it has been difficult to understand tool efficiency within our pipeline. Developers and artists need a way to proactively identify areas for improvement in their pipeline tools and production rigs. Visualizing statistical data allows users to quickly identify, correlate, and fix issues. It provides insight into complex systems which are otherwise black boxes.
In this talk, we present Koalas, a new open source project that was announced at the Spark + AI Summit in April. Koalas is a Python package that implements the pandas API on top of Apache Spark, to make the pandas API scalable to big data. Using Koalas, data scientists can make the transition from a single machine to a distributed environment without needing to learn a new framework.
Manish Sinha
(Python & Libraries,
Web, IoT, & Hardware)
Boardroom
30 mins
GraphQL has become the de facto successor to REST. How does this impact Django developers? In this talk, we will explore the state of Django web API libraries, focusing on Django REST Framework and Graphene.
Paul Everitt
Łukasz Langa
Barry Warsaw
Benjamin Peterson
Emily Morehouse-Valcarcel
(Python & Libraries)
Mya
45 mins
Elected as prescribed in PEP 8016, the Python Steering Council is a 5-person committee that assumes a mandate to maintain the quality and stability of the Python language and CPython interpreter, improve the contributor experience, formalize and maintain a relationship between the Python core team and the PSF, establish decision making processes for Python Enhancement Proposals, seek consensus among contributors and the Python core team, and resolve disputes in decision making about the language.
This session will be moderated by Paul Everitt (Python Software Foundation), who will introduce a discussion with members of the community: Barry Warsaw (Steering Council), Łukasz Langa (3.8 Release Manager), Emily Morehouse (Core Dev), and Benjamin Peterson (2.7 Release Manager).
Meredydd Luff
(Python & Libraries,
Web, IoT, & Hardware)
Robertson 2
45 mins
Programming for the Web requires 5 languages and 5+ frameworks. Wouldn’t it be easier if we could do it all in Python?
Meredydd will discuss how he built Anvil, a tool for building full-stack web apps with nothing but Python. Topics include compiling Python to JS, how autocompletion works, capability-based security, and why true accessibility means more than “usable by beginners”.
Ville Tuulos
Ravi Kiran Chirravuri
Savin Goyal
(ML, AI, & Data,
Python & Libraries,
Scale & Performance,
DevOps, Testing, & Automation)
Fisher West
45 mins
We will share our experiences on building Metaflow, a Python library that is used at Netflix to build and operate hundreds of machine learning applications. This talk is for you if you want to learn how to develop systems for big data and ML in Python.
Lunch
The talk provides an overview of the ad tech ecosystem, in particular how ad slots are dynamically populated by way of user behavioral targeting each time a webpage loads. It should be interesting for developers as well as data scientists who are looking to learn about the applications of machine learning in computational advertising.
Louise Grandjonc
(Python & Libraries,
Scale & Performance)
Robertson 2
30 mins
SQL can seem like an obscure but somehow useful language. In this talk we will look into things that SQL can do, sometimes more easily than using Python, and how to get it into your ORM, running in your application. During this talk we will use an application analysing the lyrics of my favorite teenage band to show fun examples of these SQL statements and how to integrate them into your code.
Wyatt Peterson
(DevOps, Testing, & Automation,
Python & Libraries)
Fisher West
30 mins
At NerdWallet we are heavy Terraform users, but we don't use HCL (the native Terraform language) to build our resource definitions -- we use Python! This talk will cover the motivation for using Python and details of the implementation of how Python is converted into Terraform definitions.
Gabriel Boorse
(DevOps, Testing, & Automation,
Scale & Performance)
Boardroom
30 mins
Puzzling over performance problems in production? Baffling backend bugs bending your brain? Unleash a plague of Locusts on your web app to devour performance problems, permanently! In this talk, you will learn how to leverage Locust for load testing RESTful services and more.
The strengths and weaknesses of Python lend themselves to a different style of object oriented programming. By accepting several constraints on how we design and implement classes, we make our code more robust, more testable, and easier to adapt to changing circumstances.
Edwin Jung
(DevOps, Testing, & Automation,
Python & Libraries)
Robertson 2
30 mins
Dependency Injection is a basic technique in other languages or frameworks, but less commonly seen in Python. For many developers, and especially those who have come to rely on patching, it is also un-intuitive. Superficially, it may be considered as simply "passing things in" via parameters, but this understanding is mistaken, and often falls apart when applied to realistic examples.
Understanding DI as a technique is a gateway to improving both your software design and testing, as well as clean architecture principles. This session will introduce fundamental DI concepts with basic examples, clear up some common misunderstandings, and draw a connection to clean design and testing. The content is aimed at a beginning to intermediate level.
Jigyasa Grover
(ML, AI, & Data,
Python & Libraries)
Fisher West
30 mins
The last couple of years have witnessed immense growth of Python in multifarious domains, especially AI/ML and the web, each one necessitating a different programming paradigm, from object oriented and functional to procedural and imperative. This talk reviews them all and helps you choose one for an efficient design solution!
Lusen Mendel
(People & Project Management)
Boardroom
30 mins
Everyone deserves to make the most of their career opportunities, but it can be difficult to ask for a raise or negotiate an offer. This talk will not only inspire you to stop holding back from advocating for yourself, but also give you a concrete mental model and specific statements you can use to build healthy employment relationships with recruiters and managers.
GDB is powerful, and can be extended with Python to do more than just one-off debugging. This talk will describe using Python with GDB to write tools that interact with running processes, highlighting GDB’s ability to call C functions and how this can be coupled with Python’s C-API to inject code without needing to stop the process.
Dustin Ingram
(DevOps, Testing, & Automation)
Robertson 2
30 mins
Most of us have installed a Python package, but do we know what it takes to make that work in a consistent, reliable way?
As the complexity and volume of data grow, data teams are optimizing their analytics workflows to support more complex logic, advanced transformations and customized visualizations that will be crucial in supporting machine learning and AI. Britton Stamper of Periscope Data by Sisense will share the impact data analysts are seeing from combining the strengths of SQL, Python and R in their workflows.
Snack
Too often, the people giving talks have already arrived at big fancy titles: Directors of Engineering, Chief Scientists, and Founders. But at some point, all of us will interview for that dream job, and most of us will suffer a coding interview or two or twenty. Fortunately, Python is an ideal language for many coding interviews. Come discover tools and tips for technical interviews with Python.
William Horton
(ML, AI, & Data,
Scale & Performance)
Robertson 2
45 mins
It’s 2019, and Moore’s Law is dead. CPU performance is plateauing, but GPUs provide a chance for continued hardware performance gains, if you can structure your programs to make good use of them. In this talk you will learn how to speed up your Python programs using Nvidia’s CUDA platform.
Liran Haimovitch
(Python & Libraries,
DevOps, Testing, & Automation)
Fisher West
45 mins
Have you ever wondered how Python debugging looks on the inside? During this session, we’ll share how debugging actually works in Python. We’ll discuss the differences between CPython and PyPy interpreters, explain the underlying debugging mechanism and show you how to utilize this knowledge at work and up your watercooler talk game.
Jyotika Singh
(ML, AI, & Data,
Python & Libraries)
Boardroom
25 mins
This talk covers the challenges of audio classification problems, focusing on cleaning audio and building features from it that can then be used to build a classification model, the features that work well with audio and speech data, and the open source libraries that are useful for this work.
Sean Farley
(Python & Libraries,
Scale & Performance)
Mya
30 mins
Learn how the open source Python module CuPy works as a drop-in replacement for NumPy to enable calculation using GPUs. We'll explain how engineers can speed up calculations by making their own kernels on the GPU. We'll also cover other projects incorporating CuPy to increase calculation speed.
Eva Sasson
(ML, AI, & Data)
Robertson 2
30 mins
Learn how to build a network web in Python to reflect conversations between people based on Slack conversations. Then, build a natural language processing model to evaluate what all those people are talking about, and which conversations determine who in the network carries “technical knowledge”.
Daniel Wallace
(Python & Libraries,
Web, IoT, & Hardware)
Fisher West
30 mins
APIs all have a defined structure; sometimes they almost look like a file tree of actions. Use that similarity, plus a plugin system, to model and organize your API.
Pytest is a widely-used, full-featured Python testing tool that helps you write better programs. Did you know that you can easily enhance and customize Pytest through the use of plugins? In this talk, you will learn all about some of the useful Pytest plugins that are available, and learn how to create your own plugins.
Lightning talks are 5-minute talk slots with or without slides on something you’re passionate about (no hiring or company pitches, please). Humor preferred, goat photos optional, learning value expected.
Please sign up to deliver a lightning talk now!
Bryan Siepert
(Web, IoT, & Hardware)
Robertson 2
45 mins
An introduction to Python on hardware from the perspective of writing device drivers for embedded systems.
Stu Stewart
(ML, AI, & Data,
Python & Libraries)
Fisher West
45 mins
Together we live code (in a RISE slideshow) a fully-connected neural net from scratch via numpy, initially training it using a for-loop to demonstrate core concepts, and finally codifying it as a Scikit-learn classifier with which one can fit & predict on one’s own data. To close, I walk through a toy example which logistic regression can’t properly classify, but which our NN can.
Check-in | Breakfast at Fisher Atrium
Opening Remarks
or "Strategies learned from coaching, teaching, and StackOverflow"
If you work with thousands of developers, ranging from the experienced to the aspirational, you can see what patterns of thought seem to confer success. Raymond shares what he’s seen that works best for developing problem solving skills, learning how to learn, how to get unstuck, and reliable strategies for managing complexity.
The talk includes live coding examples to make these ideas concrete.
Break
Sarah Schattschneider
(ML, AI, & Data)
Mya
30 mins
Heard of Apache Airflow? Do you work with Airflow or want to work with Airflow? Ever wonder how to better test Airflow? Have you considered all data workflow use cases for Airflow? Come be reminded of key concepts and then we will dive into Airflow’s value add, common use cases, and best practices. Some use cases: Extract Transform Load (ETL) jobs, snapshot databases, and ML feature extraction.
The talk will introduce people to EdgeDB and how it can be used in Python. EdgeDB is an excellent replacement for an ORM; it can drastically simplify code and improve performance. EdgeDB can act as a modern data layer, unlocking GraphQL and providing an asynchronous communication channel to your database.
I will present strategies to create effective data representations using a variety of Python libraries. This talk will cover effective data visualization principles, contextualized through Python examples. We will focus on commonly used techniques such as bar charts and pie charts, and will also discuss newer strategies such as small multiples and the use of animation and interaction for data exploration.
Make Sphinx your own, in 30 hilarious minutes. You'll laugh, you'll cry, and so will I.
Do you use Sphinx? Probably. Do you run a Sphinx? Possibly. Do you customize or extend your Sphinx? Sources say “no”. In this talk, we tap into the little-known power of Sphinx-as-a-Platform.
Michael Sullivan
(Scale & Performance,
DevOps, Testing, & Automation)
Mya
45 mins
Dropbox is a heavy user of the mypy type checker, recently passing four million lines of type-annotated Python code, with over half of that added in the last eighteen months. Type checking is helping find bugs, making code easier to understand, enabling refactors, and is an important aid to our ongoing Python 3 migration.
In this talk, we discuss how we got there. We’ll talk about what we tried in order to get our engineers to type annotate their code—what worked, what didn’t, and what our engineers had to say about it.
Additionally, we’ll discuss the performance problems we faced as the size of our checked codebase grew, and the techniques we employed to allow mypy—which is implemented in Python—to efficiently check (faster than a second, for most incremental checks) millions of lines of code, which culminated in mypyc, a new ahead-of-time compiler for type-annotated Python!
Emily Morehouse-Valcarcel
(People & Project Management)
Robertson 2
45 mins
Python Core Developers help shape and build the language we love. But what do core developers actually do, and how do you become one? Hear the story of how I went from an aspiring Core Developer to implementing one of the most controversial changes to the language.
Saishruthi Swaminathan
Gabriela de Queiroz
Karthik Muthuraman
(DevOps, Testing, & Automation,
ML, AI, & Data)
Fisher West
45 mins
Seamlessly serve state-of-the-art deep learning models as web microservices in minutes and create an application around it without having prior deep learning experience.
Adam Breindel
(ML, AI, & Data,
People & Project Management)
Boardroom
45 mins
This talk is less about technical success and failures; it instead focuses on where the tech meets squishy humans: API design, communication, empowering (or not) end users, sometimes even protecting open source from ourselves.
Specifically, we'll look at some well-oiled parts and some rusty frictiony bits that come between PySpark (as a computing ecosystem) and its main user community.
Lunch
This talk will arm you with some tools to design a library that "just works", but also has obvious escape hatches to handle corner cases. It covers several patterns for cleanly organizing related and overlapping functionality in a way that satisfies both humans and static analysis tools.
Vanessa Barreiros
(Python & Libraries,
Web, IoT, & Hardware)
Robertson 2
30 mins
In Django, we have a powerful tool called ORM to manipulate databases easily. For small queries, it can be quite simple, but what happens when you need to do tricks like nested queries or computed values? One of the answers is query expressions. In this talk, we'll learn how to power-up queries with them by walking through comparisons and examples with a dataset.
Vikram Bhat
(DevOps, Testing, & Automation)
Fisher West
30 mins
Unit testing is loved by so many that it is already a very popular concept. But I will try to make unit testing even simpler with pytest using monkey patching. Come to learn about unit testing with pytest using monkey patching.
Through a series of case studies, I will illustrate different types of algorithmic bias, debunk common misconceptions, and share steps towards addressing the problem.
Annie Cook
(Scale & Performance,
Python & Libraries)
Robertson 2
30 mins
Asyncio is the latest and greatest way to write concurrent code, but are the principles behind it really novel? At its core, asyncio is just an implementation of single-threaded, cooperative concurrency. That's a lot of buzzwords! Let's zoom in on single-threaded, cooperative concurrency more broadly, explore what that means, how it works, and why it's valuable.
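To make "single-threaded, cooperative concurrency" concrete, here is a minimal sketch that uses plain generators and a round-robin scheduler. It is not how asyncio is implemented and is not taken from the talk; `task`, `run_round_robin`, and `demo` are hypothetical names. Each task runs until it reaches a `yield`, at which point it voluntarily hands control back.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: do one unit of work, then yield control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntarily hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in one thread until all are finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)  # resume the task until its next yield
        except StopIteration:
            continue  # task finished; do not reschedule it
        queue.append(current)  # otherwise, put it at the back of the line


def demo():
    log = []
    run_round_robin([task("a", 2, log), task("b", 2, log)])
    return log
```

Calling `demo()` interleaves the two tasks one step at a time, because neither can be interrupted except at its own `yield`.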
R.Gabriel Esteves
(ML, AI, & Data,
Scale & Performance)
Fisher West
30 mins
The Python Machine Learning stack is notable for its depth, breadth and flexibility, often with components of equivalent functionality with different performance characteristics. This talk presents tools and techniques that can help inform decisions about which components to use so as to create deployments that meet the performance requirements of an application.
Julia Ma
(Python & Libraries)
Boardroom
30 mins
3D printers are becoming more and more accessible, but to create your own objects, you need to know how to generate 3D models. Get started by using OpenPyScad, a Python library that generates 3D-modeling source code.
This talk will go over the challenges I encountered while learning 3D modeling, including deciding on which tools to use and how to think about 3D objects in code.
Łukasz Langa
(Python & Libraries)
Mya
30 mins
What good is a code style if it's not internally consistent? What good is a linter when it slows you down? What if you could out-source your worries about code formatting, adopt a consistent style, and make your team faster all at the same time?
Come hear about Black: a new code style and a tool that allows you to format your Python code automatically. In the talk you'll learn not only what the style looks like but why it is the way it is. I will do my best to convince you not only that it's good but that it's good enough. You'll see how you can integrate it with your current workflow and how it speeds up your life while making your code prettier on average.
Lose your attachments, delegate the boring job of moving tokens around to satisfy the linter, and save time for more important matters. Guaranteed to increase the life expectancy of space bars and Enter keys on your new MacBook's keyboard.
David Lord
(Python & Libraries,
Web, IoT, & Hardware)
Robertson 2
30 mins
Browsers provide many ways to help keep your users and their data secure. In this talk, learn about what security features are available and how to enable them in Flask, Django, or other web applications. This talk is targeted at intermediate web developers, but should be useful for beginners as well.
Keeley Takimoto
(ML, AI, & Data)
Fisher West
30 mins
As the demand grows for data science in industry and research, so does the need for tools to conduct, document, and share analysis. Project Jupyter has stepped up to meet this need with "open-source software, open-standards, and services for interactive computing across dozens of programming languages." Jupyter is now used by everyone from individuals doing personal projects on GitHub to major universities and corporations, including Microsoft and Google.
This talk will give an introduction to two key products of Project Jupyter: the Jupyter Notebook document and the multi-user JupyterHub. We will then share examples of how Jupyter is being leveraged to do data science accessibly, collaboratively, and at scale at UC Berkeley.
APIs are important. They are how your packages relate to the rest of a program and to their users, and they are a fundamental defense against complexity. Discussions on how to write good APIs are long on opinion and short on concrete and unambiguous steps a developer can take. At the opposite end, I decided to focus on the very specific subject of API consistency and to provide a package that supports the writing of consistent APIs by factoring out their syntactic and semantic aspects. From argument naming and ordering, to initialization and checking, to documentation, autosig provides a way to concisely describe an API, minimizing repetition and the possibility of error, or just messiness. In this talk, we will introduce the package by way of two examples of increasing complexity, then discuss with the audience directions in which the package should go.
Snack
In this talk I will present a project (coined py2store) whose goal is to develop tools that allow developers to interact with a variety of data sources and sinks, local or remote, through a consistent and simple interface.
How simple? The simplest. Objects that feel like built-ins (dicts, lists, sets).
Stefan Krawczyk
(DevOps, Testing, & Automation,
Python & Libraries,
Scale & Performance)
Robertson 2
30 mins
At Stitch Fix most application logs are output in a structured JSON format for simpler debugging and downstream consumption.
In this talk we’ll cover in more detail why structured logs are useful and provide leverage, caveats to using them, and how simple it is to get one going with Python.
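As a rough idea of what "getting one going" might look like, the sketch below emits JSON log lines using only the standard library. It is an assumption-laden illustration, not Stitch Fix's implementation: `JsonFormatter`, `build_logger`, the logger name, and the `context` key passed via `extra` are all hypothetical.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: structured fields arrive via
        # extra={"context": {...}} and are merged into the payload.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("structured_demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def demo():
    stream = io.StringIO()
    logger = build_logger(stream)
    logger.info("order shipped", extra={"context": {"order_id": 17}})
    return json.loads(stream.getvalue())
```

Because every line is a JSON object, downstream consumers can filter on fields such as `order_id` rather than parsing free-form text.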
Kiva has provided an enormous amount of data transparency for over a decade. The data has powered economics studies and machine learning research. Attendees will be introduced to one of the largest publicly available data sets for microfinance.
Justina Petraityte
(ML, AI, & Data,
Python & Libraries)
Mya
45 mins
AI assistants are getting a great deal of attention from industry as well as research. However, the majority of assistants built to this day are still developed using a state machine and a set of rules. That doesn’t scale in production. In this talk, you will learn how to build AI assistants that go beyond FAQ interactions using machine learning and open source tools.
Mahmoud Hashemi
(DevOps, Testing, & Automation,
Python & Libraries,
Scale & Performance)
Robertson 2
45 mins
If you had to build a software application right now, how would you do it? First step, Python. But then what?
This talk looks at over 200 of the most-successful open-source Python applications to provide advice on building effective software to reach the masses. Architecture, testing, licensing, packaging and distribution, these projects hold lifetimes of work and wisdom, waiting to be learned!
Java is object oriented and Haskell is functional. How about Python? Is it really OO with free-standing functions and porous encapsulation? Python has lambdas and closures, but is it functional? Are these useful questions?
A better approach to learning programming languages is to focus on features, not paradigms. This delivers practical advice for choosing patterns and understanding idioms.
Python has seen a great emergence of statistical packages, algorithms, and implementations. However, even with the ease of practicing statistics and algorithms, there are still rules and constraints one must follow to obtain quality solutions. That is especially true of A/B testing, a statistical procedure for providing data-driven insights under uncertainty.
Lightning Talks - Feel free to submit a 5-minute lightning talk proposal by August 6.
Raffle, final remarks and conference close