This post describes the peer review process for data science work developed at Square, drawing from both software code reviews and academic peer review traditions.
• Producers are responsible for identifying a reviewer at project start, sharing code and documents in a commentable format, and providing test cases for manual verification
• Reviewers must directly inspect and execute the code, spot-check representative examples, and validate results against alternative data sources or methods
• Tight reviewer pairs are discouraged to prevent silos; the process is designed to distribute knowledge and context across the team
• Sharing unreviewed work with stakeholders early is allowed, but viewers must be warned that results may change
• Most peer reviews should take about an hour, focusing on the critical areas most likely to contain errors rather than reviewing everything exhaustively
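To make the reviewer's validation step concrete, here is a minimal sketch of what "validate results against an alternative method" can look like in practice. All data and function names are hypothetical, not from the original post: the idea is simply that a reviewer recomputes a summary figure with an independent approach and checks that the two agree.

```python
# Hypothetical spot-check: validate an aggregate two independent ways.
# Data and function names are illustrative, not from the original post.

from collections import defaultdict

transactions = [
    {"month": "2024-01", "amount": 120.0},
    {"month": "2024-01", "amount": 80.0},
    {"month": "2024-02", "amount": 200.0},
]

def revenue_by_month(rows):
    """Producer's method: accumulate totals in a single pass."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["month"]] += row["amount"]
    return dict(totals)

def revenue_by_month_alt(rows):
    """Reviewer's alternative: recompute each month's total independently."""
    months = {row["month"] for row in rows}
    return {m: sum(r["amount"] for r in rows if r["month"] == m) for m in months}

primary = revenue_by_month(transactions)
alternative = revenue_by_month_alt(transactions)

# The spot-check: the two methods must agree month by month.
assert primary.keys() == alternative.keys()
for month in primary:
    assert abs(primary[month] - alternative[month]) < 1e-9
```

A disagreement here would not say which method is wrong, only that the reviewer and producer need to reconcile before the result ships — which is exactly the kind of targeted check that fits in an hour-long review.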