I am a Software Engineer at Intel, working on improving the visibility of Intel hardware features through tools and runtimes, mostly those offered by the Microsoft stack. I graduated from the University of Waterloo with a PhD; my research centered on the development of adaptive systems using Machine Learning, metaprogramming, and GPGPU programming.
The Python Machine Learning stack is notable for its depth, breadth, and flexibility; it often offers components of equivalent functionality but with different performance characteristics. This talk presents tools and techniques that can help inform the choice of components, so as to create deployments that meet the performance requirements of an application.
Python is renowned for its popularity in Machine Learning (ML) development. It is also notorious for not being the fastest runtime around. This tension is typically resolved by taking advantage of the many high-quality native libraries available for a great variety of ML tasks. Yet improving the performance of a mixed Python/C++ workload is a sadly under-documented activity, and one that often has to be undertaken in anger, with limited time to learn specialized tools and techniques. Our team has tried to remedy this situation by connecting the features and functionality of Intel hardware and software tools with the very popular Visual Studio Code tooling from Microsoft. We believe this combination empowers Python ML application and library developers to create efficient workloads for a variety of Cloud and Edge deployments.