Information Directed Sampling for Sparse Linear Bandits


Stochastic sparse linear bandits offer a practical model for high-dimensional online decision problem and have rich information/regret structures. In this work, we explore the use of information-directed sampling (IDS) for this model who can naturally balance the information and regret trade-off. We develop a class of information-theoretical Bayesian regret bounds under different problem instances that nearly match the lower bounds. This demonstrates the adaptivity of IDS for different information/regret structures. Moreover, we propose an efficient implementation of IDS based on an empirical Bayesian approach for sparse posterior sampling. Numerical results demonstrate significant regret reductions by our method relative to several baselines.