2024 PKU Workshop on Modern Bayesian Computation

Venue

December 09, 2024, Monday 8:00-12:30

Room 216, Guanghua Building 2

Guanghua School of Management, Peking University, Beijing China

Speakers

Minh-Ngoc Tran, Associate Professor at the Business Analytics discipline, University of Sydney Business School

Shijia Wang, Assistant Professor at Institute of Mathematical Sciences, ShanghaiTech University

Shufei Ge, Assistant Professor at Institute of Mathematical Sciences, ShanghaiTech University

Cheng Zhang, Assistant Professor, School of Mathematical Sciences, Peking University

Feng Li, Associate Professor at Guanghua School of Management, Peking University

Organizers

Feng Li, Guanghua School of Management, Peking University

Cheng Zhang, School of Mathematical Sciences, Peking University

Program

8:00-9:45 Keynote lecture: Modern Bayesian Computation as Optimisation on Euclidean and Wasserstein spaces.

Speaker: Minh-Ngoc Tran, University of Sydney Business School

Abstract: Sampling from the posterior distribution and sequential prediction are fundamental challenges in Bayesian computation. This talk explores the interplay between Bayesian computation and optimisation. The first part reviews the standard Variational Bayesian Inference method which can be viewed as an optimisation problem on Euclidean space. The second part delves into the use of Optimal Transport (OT) to design geometry-assisted and optimisation-guided Bayesian computation techniques. OT is a powerful mathematical theory that sits at the interface of several fundamental theories, including probability and optimisation, and provides a mathematically elegant tool for solving optimisation problems on the space of probability measures. By equipping this space with the Wasserstein distance, it acquires the structure of a Riemannian manifold, enabling rich geometric insights and practical tools for statistical applications in optimisation and sampling. The talk will also cover several particle-based Variational Inference frameworks that traverse a set of particles to approximate the target distribution by iteratively solving optimisation problems on Wasserstein space.

Bio: Minh-Ngoc Tran is an Associate Professor at the Business Analytics discipline, the University of Sydney Business School. He is a Chief Investigator in the ARC Centre for Data Analytics for Resources and Environments (DARE). His research interests include developing efficient estimation methods (MCMC, SMC and Variational Bayes) for structured and/or high-dimensional data.

9:45-10:00 Coffee Break

10:00-10:30 Improving approximate Bayesian computation via machine learning methods.

Speaker: Shijia Wang, ShanghaiTech University

Abstract: Approximate Bayesian computation (ABC) is a class of Bayesian inference algorithms that targets problems with intractable or unavailable likelihood functions. It uses synthetic data drawn from the simulation model to approximate the posterior distribution. Firstly, we propose an early rejection Markov chain Monte Carlo (ejMCMC) sampler based on Gaussian processes to accelerate inference speed. Secondly, we propose a novel Global-Local ABC-MCMC algorithm that combines the “exploration” capabilities of global proposals with the “exploitation” finesse of local proposals.

Bio: Shijia Wang is an assistant professor in the Institute of Mathematical Sciences, ShanghaiTech University. He received his PhD degree from Simon Fraser University, Canada. His main research interests include Bayesian Statistics, machine learning and genetics.

10:30-11:00 Semi-Implicit Variational Inference with Score Matching

Speaker: Cheng Zhang, School of Mathematical Sciences, Peking University

Abstract: Semi-implicit variational inference (SIVI) greatly enriches the expressiveness of variational families by considering implicit variational distributions defined in a hierarchical manner. However, due to the intractable densities of semi-implicit distributions, typical SIVI approaches often use surrogate evidence lower bounds (ELBOs) or employ expensive inner-loop MCMC runs for unbiased ELBOs for training. In this talk, we will introduce new SIVI methods based on several alternative training objectives via score matching, which allows us to leverage the hierarchical structure of semi-implicit distributions to bypass the intractability of their densities. We will start with the basic score matching framework for SIVI, called SIVI-SM, via a minimax formulation. We will then talk about how to further enhance the flexibilty of semi-implicit distribution by allowing multiple hierarchical layers, which can also be used to accelerate diffusion model given the learned score networks. Lastly, we will introduce KSIVI, a variant of SIVI-SM that eliminates the need for lower level optimization through kernel tricks. An upper bound for the variance of the Monte Carlo gradient estimators of the KSD objective is derived, which allows us to establish novel convergence guarantees of KSIVI.

Bio: Dr Cheng Zhang is Assistant Professor in the School of Mathematical Sciences at Peking University. His research area is Bayesian statistics and machine learning, with interests including graphical models, efficient Markov chain Monte Carlo methods and variational inference methods for Bayesian models, deep Bayesian learning/Bayesian deep learning, and various applications using probabilistic modeling.

11:00-11:15 Coffee Break

11:15-11:45 Poisson Hyperplane Processes with Rectified Linear Units

Speaker: Shufei Ge, ShanghaiTech University

Abstract: Neural networks have shown state-of-the-art performances in various classification and regression tasks. Rectified linear units (ReLU) are often used as activation functions for the hidden layers in a neural network model. In this work, we establish the connection between the Poisson hyperplane processes (PHP) and two-layer ReLU neural networks. We show that the PHP with a Gaussian prior is an alternative probabilistic representation to a two-layer ReLU neural network. In addition, we show that a two-layer neural network constructed by PHP is scalable to large-scale problems via the decomposition propositions. Finally, we propose an annealed sequential Monte Carlo algorithm for Bayesian inference. Our numerical experiments demonstrate that our proposed method outperforms the classic two-layer ReLU neural network.

Bio: Shufei Ge is an assistant professor at the Institute of Mathematical Sciences, ShanghaiTech University. She received her PhD from Simon Fraser University, Canada, in August 2020, and since then, she joined ShanghaiTech University. Her research interests include Bayesian non-parametrics, statistical machine learning, and Monte Carlo methods. Particularly, she is interested in developing scalable methods to address important tasks in computational biology, such as genetic similarity analysis, neuroimaging genetics, and disease prediction.

11:45-12:15 Distributed Computing on Dependent Data

Speaker: Feng Li, Guanghua School of Management, Peking University

Abstract: Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. Parametrically, we develop a novel distributed forecasting framework to tackle the challenges of forecasting ultra-long time series using the industry-standard MapReduce framework. The proposed model combination approach retains the local time dependency. Nonparametrically, we propose a novel grid point approximation (GPA) method, which is as statistically efficient as the global estimator under mild conditions. Second, it requires no communication and is extremely efficient in terms of computation for prediction. Third, it is applicable to the case where the data are not randomly distributed across different machines.

Bio: Feng Li joined the Guanghua School of Management at Peking University as an Associate Professor in 2024. Dr. Feng Li earned his Ph.D. in Statistics from Stockholm University, Sweden in 2013 (awarded Cramér Prize for the best Ph.D. thesis in Statistics). His research interests include Bayesian Statistics, Econometrics and Forecasting, and Distributed Learning. He develops highly scalable algorithms and software for solving real business problems.