From Python3.8 to Python3.10: Our Journey Through a Memory Leak

2025-12-15

by Jay Patel

Tags: memory-leak, python

This post details Lyft's investigation into a memory leak and latency regression discovered while upgrading a Python service from 3.8 to 3.10.

• After the upgrade, gevent thread joins that normally completed in milliseconds began timing out at 30 seconds, causing 5xx errors downstream
• Memory usage rose steadily in every pod, and the team suspected a link to gevent/greenlet behavior under Python 3.10
• To investigate, Lyft used an internal tracemalloc-based memory profiler that is triggered by sending a USR2 signal to gunicorn worker processes
• The first profiling attempts failed silently: with gunicorn's preload=True (copy-on-write) option, worker processes never registered the USR2 signal handler, so kill -USR2 terminated the worker instead of triggering the profiler
• Disabling preload fixed the signal registration and allowed memory traces to be captured
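
To make the profiling approach above concrete, here is a minimal sketch of a tracemalloc profiler driven by USR2. It is not Lyft's internal tool, which the post does not show. The names `build_report`, `handle_usr2`, and `install_profiler_signal_handler`, the 25-frame traceback depth, and the choice to write to stderr are all assumptions made for illustration. It uses only the standard `signal` and `tracemalloc` modules and assumes a POSIX system where `SIGUSR2` exists.

```python
import signal
import sys
import tracemalloc

# Hypothetical module-level state: the snapshot taken on the previous report.
_previous_snapshot = None


def build_report(limit=10):
    """Return up to `limit` lines describing the largest allocation sites.

    The first call reports absolute sizes; later calls report the difference
    from the previous call, which is often the more useful view for a leak.
    Tracing must already be active, otherwise take_snapshot() raises.
    """
    global _previous_snapshot
    snapshot = tracemalloc.take_snapshot()
    if _previous_snapshot is None:
        stats = snapshot.statistics("lineno")
    else:
        stats = snapshot.compare_to(_previous_snapshot, "lineno")
    _previous_snapshot = snapshot
    return [str(stat) for stat in stats[:limit]]


def handle_usr2(signum, frame):
    """First signal starts tracing; each later signal writes a report to stderr."""
    if not tracemalloc.is_tracing():
        tracemalloc.start(25)  # assumed depth: keep up to 25 frames per traceback
        sys.stderr.write("tracemalloc started\n")
        return
    for line in build_report():
        sys.stderr.write(line + "\n")


def install_profiler_signal_handler():
    """Register the handler; must be called from the process's main thread."""
    signal.signal(signal.SIGUSR2, handle_usr2)
```

With something like this in place, `kill -USR2 <worker pid>` would start tracing on the first signal and print allocation growth on each later one. The registration has to happen inside the worker process itself. If no handler is installed there, the default action for USR2 terminates the process, which matches the failure described above.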