The dqr paper is published in the Journal of Business & Economic Statistics.

Title: A Note on Distributed Quantile Regression by Pilot Sampling and One-Step Updating

Authors: Rui Pan, Tunan Ren, Baishan Guo, Feng Li, Guodong Li, and Hansheng Wang

Abstract: Quantile regression is a method of fundamental importance. How to efficiently conduct quantile regression for a large dataset on a distributed system is of great importance. We show that the popularly used one-shot estimation is statistically inefficient if data are not randomly distributed across different workers. To fix the problem, a novel one-step estimation method is developed with the following nice properties. First, the algorithm is communication efficient. That is, the communication cost demanded is practically acceptable. Second, the resulting estimator is statistically efficient. That is, its asymptotic covariance is the same as that of the global estimator. Third, the estimator is robust against data distribution. That is, its consistency is guaranteed even if data are not randomly distributed across different workers. Numerical experiments are provided to corroborate our findings. A real example is also presented for illustration.
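
The contrast the abstract draws between one-shot and one-step estimation can be illustrated with a toy sketch. This is not the paper's code or its Spark implementation. It assumes the simplest possible model (intercept only, so quantile regression reduces to estimating a single quantile), a Gaussian kernel, and a rule-of-thumb bandwidth. All function names and parameter values here are hypothetical. Data are sorted before being split across workers to mimic a non-random distribution.

```python
import math
import random
import statistics


def sample_quantile(values, tau):
    """Empirical tau-quantile (inverse-CDF definition)."""
    ordered = sorted(values)
    idx = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[idx]


def one_shot(workers, tau):
    """Average the local quantile estimates, one per worker."""
    return sum(sample_quantile(w, tau) for w in workers) / len(workers)


def gauss_kernel(u):
    """Standard normal density, used as the smoothing kernel (assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def one_step(workers, tau, pilot_size, rng):
    """Pilot estimate from a small random sample, then one Newton-type update."""
    total = sum(len(w) for w in workers)

    # Pilot sample: each worker contributes a random subsample
    # proportional to its share of the data.
    pilot = []
    for w in workers:
        k = max(1, round(pilot_size * len(w) / total))
        pilot.extend(rng.sample(w, k))
    q0 = sample_quantile(pilot, tau)

    # Rule-of-thumb bandwidth; an assumption, not the paper's choice.
    h = 1.06 * statistics.stdev(pilot) * total ** (-0.2)

    # Each worker only needs to report two scalars evaluated at q0.
    count_below = 0
    kernel_sum = 0.0
    for w in workers:
        count_below += sum(1 for y in w if y <= q0)
        kernel_sum += sum(gauss_kernel((y - q0) / h) for y in w)

    cdf_hat = count_below / total           # estimated F(q0)
    density_hat = kernel_sum / (total * h)  # estimated f(q0)
    return q0 + (tau - cdf_hat) / density_hat


rng = random.Random(0)
tau = 0.5
data = [rng.expovariate(1.0) for _ in range(20000)]

# Non-random distribution: sort, then cut into 10 contiguous blocks.
ordered = sorted(data)
workers = [ordered[i:i + 2000] for i in range(0, len(ordered), 2000)]

global_q = sample_quantile(data, tau)
one_shot_q = one_shot(workers, tau)
one_step_q = one_step(workers, tau, pilot_size=500, rng=rng)

print("global   :", round(global_q, 4))
print("one-shot :", round(one_shot_q, 4))
print("one-step :", round(one_step_q, 4))
```

In this toy setting the one-shot average drifts far from the global estimate because each worker sees a different slice of the distribution, while the one-step update stays close to it and needs only a pilot sample plus two numbers per worker.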

Links: Journal version | Spark Implementation

