
Wonder how Uber scales responsible AI at a massive scale? Dive into Uber’s centralized Model Catalog, automated explainability, and governance built directly into the engineering workflow.

Explore how we optimized Petastorm to cut deep learning training time by 6× and eliminated hidden randomness at Uber.

Uber

Large-scale test migrations aren't impossible—they're automatable. Discover how Uber pulled off one of the largest JUnit migrations ever, transforming over 1.25 million lines of test code in just a few months.

Uber is setting a new standard for design systems by using the Figma Console MCP to shatter the manual documentation bottleneck. By letting AI agents pull directly from design data, weeks of spec writing turns into minutes of automated precision.

Learn how we rebuilt our Ads personalization model with Transformer-based sequential encoders and HeteroMoE to better capture user intent, boost targeting accuracy, and scale intelligent ads delivery across the platform.

Uber’s Payment Account Batch Processing system handles over 30 financial update operations per second for hot accounts with sub-second batching and strict consistency. Learn how we built it without using special hardware or software.


Uber’s 10PB, 16K-dataset Hive monolith for the Delivery business had huge limitations. See how we transformed it into a secure, scalable, decentralized platform with zero downtime and saved more than 1PB along the way. #BigData #DataSecurity

Discover how Uber built and automated a global rate-limiting system that protects millions of RPCs per second, improving reliability, reducing latency, and simplifying operations across our service mesh.

Uber processes trillions of Kafka messages per day on a push-based consumer proxy in real time. Read this blog to learn about the thinking behind open source uForwarder before applying it to your use cases.

Uber prioritizes a reliable data lake, which is distributed across on-premise and cloud environments. This multi-region setup presents challenges for ensuring reliable and timely data access due to limited network bandwidth and the need for seamless data availability, particularly for disaster recovery. Uber uses the Hive Sync service, which uses Apache HadoopⓇ Ditscp (Distributed Copy) for data replication. However, with Uber’s Data Lake exceeding 350 PB, Distcp’s limitations became apparent. This blog explores the optimizations made to Distcp to enhance its performance and meet Uber’s growing data replication and disaster recovery needs across its distributed infrastructure.