Bridgette Powell is a software engineer at Industrial Light & Magic, currently developing pipeline data tools and workflows for artists. Prior to working at ILM, Bridgette began her career in R&D for the feature animation industry. The creative collaboration between artists and engineers is what inspires her to work in film. Bridgette’s curiosity motivates her to reach out to production departments, learn their workflows, and develop tools to improve artist efficiency. Bridgette holds a B.S. in Computer Science from Georgia Tech and an M.S. in Computer Science from U.C. San Diego.
Without a unified analytics solution, it has been difficult to understand tool efficiency within our pipeline. Developers and artists need a way to proactively identify areas for improvement in their pipeline tools and production rigs. Visualizing statistical data allows users to quickly identify, correlate, and fix issues. It provides insight into complex systems which are otherwise black boxes.
Analytics is a framework developed at Industrial Light & Magic that offers real-time reporting on the efficiency of pipeline tools, scripts, applications, and production data. The system allows developers to instrument their pipeline tools and applications with tracking code in order to better understand metrics of interest such as application and function activity, execution time, memory usage, and failures. We show how highly customized Grafana components can provide a more targeted and intuitive dashboard for visual analysis. Thus, we can reveal tool usage patterns and identify poorly performing and underutilized applications and functions. We also use Analytics to measure production efficiency. For example, artists can follow the performance speed of CG creature animation rigs over time and easily correlate a slowdown to a particular check-in, to begin the investigation into why the slowdown occurred.
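To make the instrumentation idea concrete, the sketch below shows one common pattern for tracking code: a decorator that wraps a tool function and emits a JSON event with its execution time and success or failure. All names here (`track`, the event fields, the in-memory sink) are illustrative assumptions, not ILM's actual API; in production the event would be shipped to a backend rather than appended to a list.

```python
import functools
import json
import time


def track(metric_name, sink):
    """Hypothetical instrumentation decorator (illustrative, not ILM's API).

    Each call to the wrapped function emits one JSON event recording the
    function name, execution time, and success/failure status.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            event = {"metric": metric_name, "function": fn.__name__}
            try:
                result = fn(*args, **kwargs)
                event["status"] = "success"
                return result
            except Exception as exc:
                event["status"] = "failure"
                event["error"] = type(exc).__name__
                raise
            finally:
                # Record duration whether the call succeeded or failed.
                event["duration_ms"] = (time.monotonic() - start) * 1000.0
                sink(json.dumps(event))
        return wrapper
    return decorator


# Collect events in memory for demonstration; a real sink would post
# the JSON to the analytics backend.
events = []


@track("rig_load", events.append)
def load_rig(name):
    return f"loaded {name}"


load_rig("creature_v12")
```

Because the decorator records its event in a `finally` block, failed calls are captured alongside successful ones, which is what makes failure-rate metrics possible.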
The goal of the Analytics project is to provide a system to Engineers, CG Supervisors, Pipeline TDs, and Artists that supports:
1. Identifying and alerting on inefficiencies in our pipeline
2. Gathering usage metrics to focus development on high-use tools and to identify seldom-used tools that can be retired or need additional training/publicizing
3. Monitoring statistics on real-time production data
We designed a scalable backend to collect Analytics data. We built a Python API that is fast, lightweight, and asynchronous. Any tool can use the Analytics API to send data to be tracked over time for analysis. The API generates JSON data that is sent to Elasticsearch and can ultimately be visualized on custom Grafana dashboards (our frontend). Grafana is an open-source tool for visualizing and monitoring time-series data. On the frontend, the Grafana dashboards present the Analytics data so that it can be interpreted per use case. We maintain our own custom branch of Grafana, where we have iterated on the code to build custom plugins of our own. For example, we developed a custom plugin to visualize “sessions,” which are processes grouped within a single block of execution. This plugin uses a Gantt chart to show when each process starts and finishes on the same timeline, allowing users to visualize the relationships between the processes and see how they are scheduled.

We build Grafana from source into a Docker image and monitor the build using Jenkins. We deploy the service using an Ansible playbook, which starts a Docker container with our custom Grafana image. The workflows to create and modify dashboards are also supported by the deployment system via integration with our version management system.
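As a minimal sketch of the JSON-to-Elasticsearch leg of this pipeline, the snippet below batches events into the newline-delimited payload that Elasticsearch's bulk API expects (an action line followed by a document line per event). The index name and event fields are assumptions for illustration; the actual ILM schema is not described in this talk.

```python
import json


def to_bulk_body(index, events):
    """Format a batch of analytics events as an Elasticsearch bulk-API
    payload: alternating action and document lines, newline-delimited.
    Index name and event fields are illustrative assumptions.
    """
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Example events as the tracking API might emit them.
events = [
    {"tool": "rig_loader", "duration_ms": 412.5, "status": "success"},
    {"tool": "rig_loader", "duration_ms": 980.1, "status": "failure"},
]
body = to_bulk_body("analytics-events", events)
```

Batching events this way keeps the client-side API lightweight: instrumented tools fire events asynchronously, and a single bulk request carries many of them to the backend.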