Srivatsan Sridharan is an Engineering Manager at Yelp, leading efforts around Distributed Systems and Streaming Infrastructure. He cares about building teams rooted in trust and empathy, winning through collaboration, and following pythonic principles in his everyday life (EAFP += 1). In his free time, he writes fiction and performs improv comedy.
Making decisions around data infrastructure investments is never easy. We faced similar challenges at Yelp when we decided to replace our batch ETL system with our streaming Data Pipeline. In this session, we’ll discuss our decision-making process and share lessons learned along the way so that you can apply them when confronted with similar challenges.
In 2011, we wrote an extract-transform-load (ETL) system to move data to our then newly created Data Warehouse. The system worked very well until about 2015, when it failed to handle our rapidly increasing datasets and our growing business needs. We realized that our data infrastructure needed to change, and fast. This talk will walk you through the challenges we faced and provide a retrospective take on the lessons we learned along the way, covering questions like:
At what point should you seriously start thinking about the architecture of your data infrastructure?
How do you decide which technology investments are the right investments for you?
How do you get organizational buy-in on your new approach?
To open source or not to open source?
How do you balance rolling out the new system with continuing to manage the old one?