Location-private traffic flow prediction with lossless utility: An encrypted geometric recruitment framework
Abstract
Accurate traffic flow prediction at specific road segments is essential for optimizing signal control, mitigating congestion, and improving the efficiency of urban transportation systems. Mobile Crowdsensing (MCS) enables large-scale monitoring by collecting geotagged data from participating vehicles and aggregating them at a centralized server. However, most existing privacy-preserving solutions add noise to location data or aggregate it coarsely in space, which distorts spatiotemporal patterns and degrades the utility of prediction models, while still leaving users vulnerable to deanonymization and trajectory re-identification attacks. Approaches based on Differential Privacy (DP) offer formal guarantees by injecting calibrated noise into trajectories or model updates, but this perturbation is particularly harmful for short-horizon traffic flow prediction, where detailed local patterns are crucial. We instead shift from perturbation to encryption-based computation, aiming to preserve the utility of the prediction model while still enforcing strong location privacy. We propose a location-privacy-preserving traffic flow prediction framework that moves all location-sensitive operations into an encrypted recruitment protocol. A Paillier additively homomorphic encryption scheme supports geometric range queries, in particular point-in-rectangle tests for task regions, directly over encrypted coordinates. Service requesters encode task areas as encrypted rectangles, vehicles encrypt their current positions, and edge nodes assist in homomorphic operations. The crowdsensing server can thus decide whether a vehicle lies inside a task area or satisfies distance constraints without observing raw locations and without modifying the traffic measurements used for learning.
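To make the point-in-rectangle idea concrete, the following is a minimal Python sketch of a toy Paillier scheme and a sign-based containment check. It is an illustration under stated assumptions, not the protocol proposed here: the function names (`keygen`, `he_sub`, `blinded_margins`, `all_nonnegative`), the small fixed primes, the integer grid coordinates, and the multiplicative blinding step are all hypothetical. In particular, the key holder in this sketch stands in for an assisting party, and the blinding shown would not be secure in practice.

```python
import math
import secrets

def keygen(p, q):
    """Toy Paillier keys from two primes, using the generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m; negative values are stored modulo n."""
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return pow(n + 1, m % n, n2) * pow(r, n, n2) % n2

def decrypt(n, sk, c):
    """Decrypt and map the result back to a signed integer."""
    lam, mu = sk
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    return m - n if m > n // 2 else m

def he_sub(n, c1, c2):
    """Enc(a), Enc(b) -> Enc(a - b), computed on ciphertexts only."""
    return c1 * pow(c2, -1, n * n) % (n * n)

def he_scale(n, c, k):
    """Enc(a), plaintext k -> Enc(k * a)."""
    return pow(c, k, n * n)

def blinded_margins(n, enc_point, enc_rect):
    """Server side: encrypted margins to the four rectangle edges,
    each scaled by a random positive factor (hypothetical blinding)."""
    ex, ey = enc_point
    exmin, exmax, eymin, eymax = enc_rect
    margins = [he_sub(n, ex, exmin), he_sub(n, exmax, ex),
               he_sub(n, ey, eymin), he_sub(n, eymax, ey)]
    return [he_scale(n, c, secrets.randbelow(1000) + 1) for c in margins]

def all_nonnegative(n, sk, ciphertexts):
    """Key holder side: the point is inside iff every margin is >= 0."""
    return all(decrypt(n, sk, c) >= 0 for c in ciphertexts)

# Illustrative values only: small primes and integer grid coordinates.
n, sk = keygen(104729, 1299709)
rect = [encrypt(n, v) for v in (100, 200, 50, 150)]  # x_min, x_max, y_min, y_max
inside = [encrypt(n, v) for v in (120, 60)]
outside = [encrypt(n, v) for v in (250, 60)]
print(all_nonnegative(n, sk, blinded_margins(n, inside, rect)))   # True
print(all_nonnegative(n, sk, blinded_margins(n, outside, rect)))  # False
```

The server in this sketch only multiplies and exponentiates ciphertexts, so it never sees a coordinate; the traffic measurements themselves are not touched at any point.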
On top of this privacy-preserving data acquisition pipeline, we build a Gated Recurrent Unit (GRU) based traffic flow prediction model and evaluate it on real-world California Performance Measurement System (PeMS) highway data. Because the privacy layer leaves traffic flow values intact, the GRU operates on high-fidelity time series under strict location privacy constraints. Experiments show that the GRU-based predictor achieves lower Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) than Long Short-Term Memory (LSTM) and Sparse Autoencoder (SAE) baselines, indicating that strong location privacy can be enforced without sacrificing the fidelity of short-horizon traffic forecasts.
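For readers less familiar with the three error metrics, a short Python sketch of their standard definitions follows. The flow values are made up for illustration and are not taken from PeMS or from the reported experiments; the function names are likewise only illustrative.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean Squared Error: average of squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

# Hypothetical flow counts (vehicles per interval), not real measurements.
observed = [100, 120, 90, 110]
predicted = [98, 125, 92, 107]
print(mae(observed, predicted))   # 3.0
print(mse(observed, predicted))   # 10.5
print(rmse(observed, predicted))  # square root of 10.5
```

Because MSE and RMSE square the errors, they penalize occasional large misses more heavily than MAE, which is why the three are commonly reported together.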