Building an Algorithms Tribe from Zero
Scaling decision logic to handle 500K+ daily orders.
Context
Getir was scaling from a regional delivery startup to a global operation across nine countries, processing over 500,000 orders per day. The company's core routing, ETA, and pricing systems were algorithm-driven, but there was no dedicated team owning them. The algorithms existed. Nobody could measure whether they worked.
I joined when the function didn't exist. Over three years (2021–2024), I built and led what became a 9-squad tribe of 35+ engineers and analysts responsible for every algorithm that touched order acceptance, courier assignment, delivery timing, and pricing.
The Problem
The algorithm team couldn't prove its own impact. Changes shipped, but there was no reliable way to isolate whether improvements came from the algorithm or from a dozen other variables: seasonal demand shifts, new store openings, courier headcount changes, weather. Without credible measurement, the team couldn't justify investment, couldn't prioritize effectively, and couldn't build confidence among the operations teams who lived with the consequences of every algorithm decision.
On the operations side, confidence in the algorithms was low. City leads routinely requested manual overrides for courier assignments. There was no shared language for discussing tradeoffs, no structured forums for raising concerns, and no documentation that non-technical stakeholders could reference to understand what the algorithms were optimizing for.
Approach
Measurement before features. The first hire wasn't an engineer. It was an analyst. Before building anything new, I needed to understand what existed and whether it worked. I formed a data team to isolate algorithm effects from everything else, using simulations, quasi-experimental frameworks, and on-site observation.
This was harder than it sounds. You can't A/B test routing at city scale, and you can't hold demand constant. We built lightweight simulation environments to compare solver configurations under controlled conditions, then cross-referenced with production patterns. The evidence wasn't airtight, but it was credible enough to drive prioritization decisions.
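Below is a minimal sketch of what such a simulation harness can look like, assuming a synthetic order stream and two stand-in dispatch policies; the generator, policies, and numbers are illustrative, not Getir's actual solver configurations.

```python
# Toy harness: compare two dispatch policies under the same synthetic demand.
# Everything here (order generator, policies, parameters) is illustrative only.
import random
from statistics import mean

def generate_orders(n=2000, city_km=10.0, seed=7):
    rng = random.Random(seed)
    return [
        {"pickup": (rng.uniform(0, city_km), rng.uniform(0, city_km)),
         "dropoff": (rng.uniform(0, city_km), rng.uniform(0, city_km))}
        for _ in range(n)
    ]

def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def simulate(orders, policy, couriers=50, speed_kmh=20.0):
    """Replay the same order stream under one policy; return mean delivery minutes."""
    rng = random.Random(42)  # identical courier starting positions for both policies
    pos = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(couriers)]
    free_at = [0.0] * couriers      # minute at which each courier becomes available
    times, clock = [], 0.0
    for o in orders:
        clock += 0.5                # ~2 orders arrive per minute
        idle = [i for i in range(couriers) if free_at[i] <= clock]
        if policy == "nearest_idle" and idle:
            c = min(idle, key=lambda i: dist(pos[i], o["pickup"]))
        else:                       # baseline: first courier to free up, ignore distance
            c = min(range(couriers), key=lambda i: free_at[i])
        start = max(free_at[c], clock)
        travel_min = (dist(pos[c], o["pickup"]) + dist(o["pickup"], o["dropoff"])) / speed_kmh * 60
        free_at[c], pos[c] = start + travel_min, o["dropoff"]
        times.append(free_at[c] - clock)
    return mean(times)

orders = generate_orders()
for policy in ("first_free", "nearest_idle"):
    print(f"{policy}: {simulate(orders, policy):.1f} min avg delivery time")
```

The point is not realism; it is that both policies see exactly the same demand, so the difference in the output metric is attributable to the policy rather than to demand, weather, or headcount.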
Hiring for simplification. I grew the team deliberately: 5 analysts first, then 7 product managers hired jointly with engineering leadership. I looked for people who could simplify complex problems rather than add layers of abstraction. In three years, the tribe reached 30–35 people across 9 squads covering routing, ETA, pricing, maps, workforce planning, and demand forecasting.
Tradeoff forums instead of escalations. Algorithm decisions aren't purely technical. Tighter ETAs increase conversion but stress operations. Cost optimization might hurt courier earnings. I ran tradeoff sessions with whoever was relevant: city ops leads, PMs, finance, couriers. The format was explicit: here's the tradeoff, here are the options, who bears the cost of each, what do we optimize for?
Example: car-couriers are more expensive per delivery. Pure cost optimization means they sit idle during slow periods. But idle couriers earn less and become dissatisfied. Is cost the main objective? Or do we utilize them despite higher unit cost to protect workforce stability? That's a business decision, not an algorithm decision. Making that distinction visible changed how the organization related to the algorithms team.
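One lightweight way to make such a choice explicit is to surface it as a single reviewable weight in the assignment objective rather than burying it in solver heuristics. The sketch below assumes purely hypothetical field names, costs, and weight values; it is not the actual model.

```python
# Illustrative only: an explicit "workforce stability" weight in courier scoring.
from dataclasses import dataclass

@dataclass
class Candidate:
    courier_id: str
    vehicle: str           # e.g. "moto" or "car"
    unit_cost: float       # cost of this courier handling the order
    idle_minutes: float    # how long this courier has been waiting for work

def score(c: Candidate, idle_weight: float) -> float:
    """Lower is better. idle_weight converts waiting time into an equivalent cost,
    so a long-idle car courier can win an order despite a higher per-delivery cost."""
    return c.unit_cost - idle_weight * c.idle_minutes

candidates = [
    Candidate("m-17", "moto", unit_cost=3.0, idle_minutes=4),
    Candidate("c-02", "car",  unit_cost=4.5, idle_minutes=45),
]

for idle_weight in (0.0, 0.05):   # 0.0 = pure cost; 0.05 = pay to protect earnings
    best = min(candidates, key=lambda c: score(c, idle_weight))
    print(f"idle_weight={idle_weight}: assign to {best.courier_id} ({best.vehicle})")
```

Framed this way, the tradeoff forum debates the value of the weight, which is a business question, while the team owns how the weight is applied, which is an engineering question.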
Cross-country rollout with local adaptation. Nine countries, each with different constraints: vehicle regulations, labor laws, demand patterns, map quality. I led knowledge-sharing sessions and built documentation that raised algorithm literacy across markets. Each country needed the same core logic with locally adapted constraint configurations, not a bespoke system per market.
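As an illustration of that principle, the sketch below keeps one solver code path and pushes market differences into configuration; the field names, values, and country set are hypothetical, not the real constraint model.

```python
# Hypothetical per-market constraint configuration: shared solver logic,
# locally adapted parameters (all field names and values are illustrative).
from dataclasses import dataclass

@dataclass(frozen=True)
class MarketConstraints:
    allowed_vehicles: tuple   # local vehicle regulations
    max_shift_hours: float    # labor law
    max_route_stops: int      # demand density / batching appetite
    map_provider: str         # map quality differs by market

MARKETS = {
    "TR": MarketConstraints(("moto", "car", "bike"), 11.0, 4, "internal"),
    "DE": MarketConstraints(("bike", "ebike"), 8.0, 3, "osm"),
    "US": MarketConstraints(("car",), 12.0, 5, "google"),
}

def build_solver(country: str) -> None:
    cfg = MARKETS[country]
    # The core assignment/routing logic is identical everywhere;
    # only these constraints are swapped in per market.
    print(f"solver[{country}]: vehicles={cfg.allowed_vehicles}, "
          f"shift<={cfg.max_shift_hours}h, stops<={cfg.max_route_stops}, maps={cfg.map_provider}")

for country in MARKETS:
    build_solver(country)
```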
No overrides. We never allowed operations teams to manually override algorithm decisions. If they saw a problem, the requirement was to prove it with data: here's the pattern, here's what the algorithm isn't accounting for. Then we fixed the root cause. Over time, this approach built durable trust between the algorithms team and operations.
Results
The tribe delivered measurable impact across every core metric:
- +33 percentage points on-time delivery (45% to 78%) and +40 NPS points through ETA model overhaul and operational alignment
- ~30% reduction in last-mile costs (~$15M monthly impact) by re-architecting order acceptance and assignment with a heuristic solver
- ~80% reduction in solver latency after migrating vehicle routing to Google OR-Tools at 500K+ daily orders (setup pattern sketched after this list)
- 2% monthly GMV lift (~$600K/month on a $30M base) from dynamic delivery fee and threshold pricing
- 20K+ incremental daily orders within six months of launching scheduled delivery across three business domains
- 100% customer experience SLA maintained through post-acquisition integration of Gorillas into core routing
- Three third-party channel integrations (Just Eat, Uber Eats, Grubhub) expanding order volume and market presence
- 82% average OKR attainment across 9 squads aligned to board-level targets
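For reference on the OR-Tools item above: the migration follows the library's standard vehicle-routing pattern, shown below on a toy travel-time matrix. The data, parameter choices, and one-second time limit are illustrative, not the production configuration.

```python
# Minimal OR-Tools vehicle-routing setup (toy data; not the production model).
from ortools.constraint_solver import pywrapcp, routing_enums_pb2

# Toy travel-time matrix: depot (node 0) plus four delivery points.
matrix = [
    [0, 9, 7, 3, 6],
    [9, 0, 4, 8, 5],
    [7, 4, 0, 6, 2],
    [3, 8, 6, 0, 7],
    [6, 5, 2, 7, 0],
]
num_vehicles, depot = 2, 0

manager = pywrapcp.RoutingIndexManager(len(matrix), num_vehicles, depot)
routing = pywrapcp.RoutingModel(manager)

def transit(from_index, to_index):
    # OR-Tools passes internal indices; map them back to matrix nodes.
    return matrix[manager.IndexToNode(from_index)][manager.IndexToNode(to_index)]

cb = routing.RegisterTransitCallback(transit)
routing.SetArcCostEvaluatorOfAllVehicles(cb)

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
params.local_search_metaheuristic = routing_enums_pb2.LocalSearchMetaheuristic.GUIDED_LOCAL_SEARCH
params.time_limit.FromSeconds(1)  # the latency budget is the point at high order volume

solution = routing.SolveWithParameters(params)
if solution:
    for v in range(num_vehicles):
        index, route = routing.Start(v), []
        while not routing.IsEnd(index):
            route.append(manager.IndexToNode(index))
            index = solution.Value(routing.NextVar(index))
        print(f"vehicle {v}: {route + [depot]}")
```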
What Transferred
Measurement capability comes before feature capability. If you can't prove impact, you can't justify investment, and you can't earn trust from the people who live with your system's decisions every day. The second thing: making tradeoffs explicit and collaborative is harder than optimizing in isolation, but it produces decisions that actually stick. These patterns apply to any team building data-driven systems in high-frequency, high-stakes operations.