## Cracking the Code: How Uber Masters ETA Calculation on a Massive Scale – Medium

### Predicting ETAs

Uber's main goal in predicting ETAs is reliability. The estimated time of arrival should closely match the actual arrival time, and that accuracy should hold consistently across different places and times of day.

The simplest approach to predicting an ETA is to use map data directly: compute the haversine distance (the great-circle distance between two points) and divide it by an assumed average speed. However, this method is not sufficient; there can be a significant gap between the predicted and actual ETA, since people don't travel in a straight line between two points.
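The naive baseline can be sketched as follows. This is a minimal illustration, not Uber's actual code; the 30 km/h average speed is an assumed placeholder value.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def naive_eta_minutes(lat1, lon1, lat2, lon2, avg_speed_kmh=30.0):
    """Straight-line ETA: haversine distance divided by an assumed average speed."""
    return haversine_km(lat1, lon1, lat2, lon2) / avg_speed_kmh * 60
```

For San Francisco to Oakland this yields a straight-line distance of roughly 13 km, ignoring the Bay in between, which is exactly the kind of error the extra layers below correct for.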

To address this issue, Uber has incorporated additional layers such as routing, traffic information, map matching, and machine learning algorithms to enhance the reliability of the predicted ETA.

Let's dive into these additional layers.

Problem statement: Build a large-scale system that computes the lowest-cost route from origin to destination with low latency.

To achieve this, they represent the physical map as a graph: every road intersection is a node, and each road segment is a directed edge.

To determine the ETA, they need to find the shortest path in this directed weighted graph. Dijkstra's algorithm is commonly used for this purpose, but its time complexity grows as O(n log n), where n is the number of road intersections (nodes) in the graph.
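For reference, the textbook priority-queue version of Dijkstra's algorithm looks like this (a generic sketch, not Uber's implementation; the adjacency-list format is an illustrative choice):

```python
import heapq

def dijkstra(graph, source, target):
    """Shortest-path cost in a weighted directed graph.

    graph: {node: [(neighbor, edge_weight), ...]}
    Returns float('inf') if the target is unreachable.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, already relaxed via a shorter path
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return float("inf")
```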

Considering the vast scale of Uber's operations (the San Francisco Bay Area alone has roughly half a million road intersections), running Dijkstra's algorithm over the full graph becomes impractical.

To address this issue, Uber partitions the graph and precomputes the best path within each partition.

At query time, it is then sufficient to search only along the boundaries of the graph partitions to recover the optimal path.

Picture a dense graph represented on a circular map.

To find the best path between two points in a circle, traditionally every single node in the circle needs to be traversed, resulting in a time complexity proportional to the area of the circle (π · r²).

However, by partitioning and precomputing, efficiency is improved. It becomes possible to find the best path by interacting only with the nodes on the circle's boundary, reducing the time complexity to the perimeter of the circle (2 · π · r).

In simpler terms, the number of nodes touched when finding the best path in the San Francisco Bay Area drops from roughly 500,000 to roughly 700, on the order of the square root of the node count.
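The square-root intuition checks out against the article's own numbers:

```python
import math

# Searching the interior touches ~n nodes; searching the boundary touches ~sqrt(n).
n_nodes = 500_000
boundary_nodes = round(math.sqrt(n_nodes))
print(boundary_nodes)  # → 707, consistent with the ~700 figure
```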

Once we have the route, we need to determine the travel time. To do that, we require traffic information.

Traffic conditions must be considered when determining the fastest route between two points.

Traffic depends on factors like time of day, weather, and the number of vehicles on the road.

They used traffic information to determine the edge weights of the graph, resulting in a more accurate ETA.

They integrated historical speed data with real-time speed information to improve the accuracy of traffic updates: the more traversal data available, the more precise the traffic estimates become.

Before moving forward, there were two critical questions that needed addressing:

1. Validity window of real-time speed: Too short a window means too little signal about current road conditions. Conversely, if it's too long, the data becomes outdated.

2. Integrating historical and real-time speeds: Striking a balance here is a bias-variance tradeoff. Prioritizing real-time data yields less bias but more variance; emphasizing historical data introduces more bias but reduces variance. The challenge lies in finding the optimal balance between the two.
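One simple way to picture this tradeoff is a convex blend of the two speed sources. This is an illustrative sketch, not Uber's actual formula; the `alpha` parameter and default value are assumptions for demonstration.

```python
def blended_speed(historical_kmh, realtime_kmh, alpha=0.7):
    """Convex blend of real-time and historical speed for a road segment.

    alpha near 1 trusts real-time data (less bias, more variance);
    alpha near 0 trusts historical data (more bias, less variance).
    """
    return alpha * realtime_kmh + (1 - alpha) * historical_kmh

def edge_travel_time_s(length_m, historical_kmh, realtime_kmh, alpha=0.7):
    """Edge weight in seconds, derived from the blended speed."""
    speed_ms = blended_speed(historical_kmh, realtime_kmh, alpha) / 3.6
    return length_m / speed_ms
```

Feeding these blended travel times back in as edge weights is how the traffic layer sharpens the routing layer's ETA.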

GPS signals can be unreliable and infrequent, especially when a vehicle enters a tunnel or an area with many tall buildings that reflect the signals.

Also, mobile GPS readings usually fall close to the street segments but not exactly on them, which makes it difficult to recover the exact street coordinates.

Map matching is like connecting the dots! Imagine you have red dots representing raw GPS signals.

Now, the goal is to figure out which road segments these dots belong to. That's where map matching comes in: it links those red dots to specific road segments.

The resulting blue dots show exactly where those GPS signals align with the road segments. It's like fitting the puzzle pieces together to see the actual path on the map.

They use the Kalman filter for map matching. It takes GPS signals and matches them to road segments.

In addition, they use the Viterbi algorithm, a dynamic programming approach, to find the most probable sequence of road segments.
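The Viterbi step treats map matching as a hidden Markov model: road segments are hidden states, GPS points are observations. A minimal sketch, assuming caller-supplied emission and transition probability functions (the function names and interfaces here are illustrative, not Uber's API):

```python
import math

def viterbi_map_match(observations, segments, emission_prob, transition_prob):
    """Most probable road-segment sequence for a series of GPS points.

    observations: list of GPS readings
    segments: candidate road-segment ids (the hidden states)
    emission_prob(obs, seg): P(observation | vehicle is on seg)
    transition_prob(a, b): P(moving from segment a to segment b)
    """
    # Log-probability of the best path ending in each segment, plus backpointers.
    best = {s: math.log(emission_prob(observations[0], s) + 1e-12) for s in segments}
    back = []
    for obs in observations[1:]:
        nxt, ptr = {}, {}
        for s in segments:
            prev, score = max(
                ((p, best[p] + math.log(transition_prob(p, s) + 1e-12)) for p in segments),
                key=lambda t: t[1],
            )
            nxt[s] = score + math.log(emission_prob(obs, s) + 1e-12)
            ptr[s] = prev
        best, back = nxt, back + [ptr]
    # Backtrack from the best final state to recover the full path.
    state = max(best, key=best.get)
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return list(reversed(path))
```

In practice the emission probability would come from the GPS-to-segment distance and the transition probability from road connectivity, which is what lets the algorithm snap a noisy trace onto a coherent route.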

Uber's initial aim was to provide reliable ETA information universally. Reliability has been discussed above; now, let's shift the focus to how Uber ensures availability everywhere.

Uber has observed that ETA predictions in India are less accurate than in North America due to systematic biases and inefficiencies. This is where machine learning (ML) can play a crucial role, by capturing variations in:

1. Regions
2. Time
3. Trip types
4. Driver behavior

By leveraging ML, Uber aims to narrow the gap between predicted ETAs and actual arrival times, thereby enhancing the overall reliability and user experience.

Let's define a few terms to better understand their decisions.

1. Linear Model: Definition: A linear model assumes a linear relationship between the input variables (features) and the output variable. It follows the equation (y = mx + b), where (y) is the output, (x) is the input, (m) is the slope, and (b) is the intercept. Example: Linear regression is a common linear model used for predicting a continuous outcome.

2. Non-linear Model: Definition: A non-linear model does not assume a linear relationship between the input and output variables. It may involve higher-order terms or complex mathematical functions to capture the patterns in the data. Example: Decision trees, neural networks, and support vector machines with non-linear kernels are examples of non-linear models.

3. Parametric Model: Definition: A parametric model makes assumptions about the functional form of the relationship between variables and has a fixed number of parameters. Once the model is trained, these parameters are fixed. Example: Linear regression is parametric since it assumes a linear relationship with fixed coefficients.

4. Non-parametric Model: Definition: A non-parametric model makes fewer assumptions about the functional form and the number of parameters in the model. It can adapt to the complexity of the data during training. Example: k-Nearest Neighbors (KNN) is a non-parametric algorithm, as it doesn't assume a specific functional form and adapts to the data during prediction based on the local neighbourhood of points.
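To make the non-parametric idea concrete, here is a toy KNN-style ETA predictor. This is purely illustrative of the model class, not Uber's production model; the feature encoding and `k` value are assumptions.

```python
def knn_predict_eta(history, query, k=3):
    """Non-parametric ETA estimate: average the k most similar past trips.

    history: list of (feature_vector, eta_minutes) pairs, e.g. features
             could encode region, time of day, trip type, etc.
    query: feature vector for the new trip
    """
    def dist(a, b):
        # Euclidean distance between feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    nearest = sorted(history, key=lambda item: dist(item[0], query))[:k]
    return sum(eta for _, eta in nearest) / len(nearest)
```

Note there is no fixed equation being fitted: the prediction is shaped entirely by whichever past trips happen to lie near the query, which is exactly what "non-parametric" means here.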

Since ETA is influenced by factors like location and time of day, and there is no predefined functional relationship between the variables, they opted for non-linear, non-parametric machine learning models.

In their own words: "With great (modelling) power comes great (reliability) responsibility!" So they maintain fallback ETAs to avoid downtime when the ML models fail.

They also monitor ETAs continuously to catch issues before they affect internal and external consumers.
