Research
Research Interests
- Generative models
- Tree-based models and methods
- Multi-scale methods
- Nonparametric inference
- Bayesian modeling and computation
- Biomedical applications
A current focus of my work is the statistical assessment of generative models, particularly methods that accommodate a range of study and sampling designs. I am also interested in models and algorithms for nonparametric inference, as well as methods for analyzing complex biomedical data, especially microbiome sequencing and flow cytometry data.
Support
My research group is currently supported by both the NSF (DMS-2152999) and the NIH (NIGMS grant R01-GM135440). Prior support: NSF (DMS-1309057, DMS-1612889, DMS-1749789, DMS-2013930, and EEC-2133504) and a Google Faculty Research Award.
Publications
- 2026 Horiguchi A, Ma L, and Szabo B. Sampling depth trade-off in function estimation under a two-level design. Electronic Journal of Statistics. In press. [arxiv]
- 2026 Wang Z, Mao J, and Ma L. A tree-based model for addressing sparsity and taxa covariance in microbiome compositional count data. Statistics in Medicine. 45, no. 10-12: e70584. [talk@IMSI] [online] [arxiv]
- 2026 Kim H, Siddiqui N, Karstens L, and Ma L. A negative binomial latent factor model for paired microbiome sequencing data. BMC Bioinformatics. 27:45. [online]
- 2026 Kang G, Mao J, and Ma L. Multiscale Cochran-Mantel-Haenszel Scanning for Conditional Dependency. [arxiv]
- 2026 Awaya N, Xu Y, and Ma L. Two-sample comparison through additive tree models for density ratios. [plenary talk@bnp14] [arxiv]
- 2026 Luo H, Horiguchi A, and Ma L. Wavelet tree ensembles for triangulable manifolds. [arxiv]
- 2026 Dai Q, Fodor AA, Wei G, Ma L, Gunsch C, and Granek JA. From external-input sensitivity to resident persistence: community assembly in a sink p-trap model. [bioRxiv]
- 2026 Kim H and Ma L. An exponential scale mixture model for metatranscriptomics data with application to inflammatory bowel disease. [bioRxiv]
- 2025 Ignatiadis N and Ma L. Partially Bayes p-values for large-scale inference. [arxiv]
- 2025 Xu Y, Wei Y, and Ma L. Distributional evaluation of generative models via relative density ratio. [arxiv]
- 2025 Xu Y, Luo K, and Ma L. A tree-based kernel for densities and its applications in clustering DNase-seq profiles. [arxiv]
- 2025 Liu L and Ma L. On forest-type tree ensemble approaches to density learning under a generalized Bayesian framework.
- 2025 Ma L and Bruni B. A partial likelihood approach to tree-based density modeling and its application in Bayesian inference. [talk@ISSI] [arxiv]
- 2025 Luo H, Horiguchi A, and Ma L. Efficient decision trees for tensor regressions. Journal of Computational and Graphical Statistics. Published online. [online] [arxiv] [code]
- 2025 Horiguchi A, Chan C, and Ma L. A tree perspective on stick-breaking models in covariate-dependent mixtures (with discussion). Bayesian Analysis. Vol. 20, No. 3, 1139–1230. [BA discussion webinar] [online] [arxiv]
- 2025 Wei G and Ma L. Stream-level flow matching with Gaussian processes. Proceedings of the 42nd International Conference on Machine Learning, PMLR. 267:66137-66154. [online]
- 2024 Wang Z, Awaya N, and Ma L. Generative modeling of conditional densities through additive tree flows. [arxiv]
- 2024 Awaya N and Ma L. Unsupervised tree boosting for learning probability distributions. Journal of Machine Learning Research. 25(198):1–52. [talk@bnp-webinar] [online] [R package]
- 2024 Liu L and Ma L. Spatial adaptation by Bayesian unsupervised trees. Proceedings of Thirty Seventh Conference on Learning Theory, PMLR. 247:3556-3581. [online]
- 2024 Liu R, Li M, and Ma L. Efficient in-situ image and video compression through probabilistic image representation. Signal Processing. Vol. 215, 109268. [online] [arxiv] [Matlab code]
- 2024 Gorsky S, Chan C, and Ma L. Coarsened mixtures of hierarchical skew normal kernels for flow cytometry analyses. Bayesian Analysis. Vol. 19, No. 2, 439–463. [15-min video] [online] [R package]
- 2024 Awaya N and Ma L. Hidden Markov Pólya trees for high-dimensional distributions. Journal of the American Statistical Association, Theory and Methods. Vol. 119, No. 545, 189–201. [online] [arxiv] [R package] [slides]
- 2023 Ji Z and Ma L. Controlling taxa abundance improves metatranscriptomics differential analysis. BMC Microbiology. 24:60. [online]
- 2023 LeBlanc P and Ma L. Microbiome subcommunity learning with logistic-tree normal latent Dirichlet allocation. Biometrics. Vol. 79, Iss. 3, 2321–2332. [online] [arxiv] [R package]
- 2022 Gorsky S and Ma L. Multiscale Fisher’s independence test for multivariate dependence (with discussion). Biometrika. Vol. 109, No. 3, 569–587. [online] [arxiv] [R package]
- 2022 Gorsky S and Ma L. Rejoinder: “Multiscale Fisher’s independence test for multivariate dependence.” Biometrika. Vol. 109, No. 3, 605–609. [online]
- 2022 Mao J and Ma L. Dirichlet-tree multinomial mixtures for clustering microbiome compositions. Annals of Applied Statistics. Vol. 16, No. 3, 1476–1499. [talk] [15-min video] [online] [arxiv] [R package] [numerical examples]
- 2022 Siddiqui N, Ma L, Brubaker L, Mao J, Hoffman C, Wang Z, and Karstens L. Updating urinary microbiome analyses to enhance biologic interpretation. Frontiers in Cellular and Infection Microbiology. 12:789439. [online]
- 2022 Vaughan M, Zemtsov GE, Dahl EM, Karstens L, Ma L, and Siddiqui N. Concordance of urinary microbiota detected by 16S rRNA amplicon sequencing versus expanded quantitative urine culture. American Journal of Obstetrics & Gynecology. [online]
- 2022 Luo K, Zhong J, Safi A, Hong LK, Tewari AK, Song L, Reddy TE, Ma L, Crawford GE, and Hartemink AJ. Quantitative occupancy of myriad transcription factors from one DNase experiment enables efficient comparisons across conditions. Genome Research. 32: 1183–1198. [online] [bioRxiv]
- 2022 Li M and Ma L. Learning asymmetric and local features in multi-dimensional data through wavelets with recursive partitioning. IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol. 44, No. 11, 7674–7687. [online] [arxiv] [Matlab toolbox & R package]
- 2021 Vaughan M, Mao J, Karstens L, Ma L, Amundsen C, Schmader K, Siddiqui N. The urinary microbiome in postmenopausal women with recurrent urinary tract infections. Journal of Urology. Vol. 206, No. 5, 1222–1231. [online] [bioRxiv]
- 2021 Ramalingam S, Siamakpour-Reihani S, Bohannan L, Ren Y, Sibley, Sheng J, Ma L, Nixon AB, Lyu J, Parker DC, Bain B, Muehlbauer M, Ilkayeva O, Kraus VB, Huebner J, Spitzer T, Brown J, Peled J, van den Brink M, Gomes A, Choi T, Gasparetto C, Horwitz M, Long G, Lopez R, Rizzieri D, Sarantopoulos S, Chao N, and Sung AD. A phase 2 trial of the somatostatin analog pasireotide to prevent GI toxicity and acute GVHD in allogeneic hematopoietic stem cell transplant. PLOS ONE. Vol. 16, No. 6. [online]
- 2021 Giri V, Kegerreis K, Ren Y, Bohannon L, Lobaugh-Jin E, Messina J, Matthews A, Mowery Y, Sito E, Lassiter M, Saullo J, Jung S, Ma L, Greenberg M, Andermann T, van den Brink M, Peled J, Gomes A, Choi T, Gasparetto C, Horwitz M, Long G, Lopez R, Rizzieri D, Sarantopoulos S, Chao N, Allen D, and Sung A. Chlorhexidine gluconate bathing reduces the incidence of bloodstream infections in adults undergoing inpatient hematopoietic cell transplantation. Transplantation and Cellular Therapy. Vol. 27, No. 1, 262e1–e11. [online]
- 2020 Liu R, Li M, and Ma L. CARP: Compression through adaptive recursive partitioning for multi-dimensional images. CVPR 2020: IEEE/CVF Conference on Computer Vision and Pattern Recognition. [online] [Matlab code]
- 2020 Christensen J and Ma L. A Bayesian hierarchical model for related densities using Pólya trees. Journal of the Royal Statistical Society. Series B. Vol. 82, 127–153. [online] [preprint] [R package]
- 2020 Mao J, Chen Y, and Ma L. Bayesian graphical compositional regression for microbiome data. Journal of the American Statistical Association, Applications and Case Studies. Vol. 115, No. 530, 610–624. [talk] [online] [preprint] [R package] [Source code for examples]
- 2019 Ma L. Discussion on “Latent Nested Nonparametric Priors” by Camerlanghi et al. Bayesian Analysis. Vol. 14, No. 4, 1303–1356. [online] [preprint]
- 2019 Ma L and Mao J. Fisher exact scanning for dependency. Journal of the American Statistical Association, Theory and Methods. Vol. 114, No. 525, 245–258. [online] [preprint] [R code]
- 2019 Soriano J and Ma L. Mixture modeling on related samples through psi-stick breaking and kernel perturbation. Bayesian Analysis. Vol. 14, No. 1, 161–180. [online] [R package] [examples]
- 2018 Ma L and Soriano J. Analysis of distributional variation through multi-scale Beta-Binomial modeling. Journal of Computational and Graphical Statistics. Vol. 27, No. 3, 529–541. [online] [preprint] [R package]
- 2018 Tang Y, Ma L, and Nicolae DL. A phylogenetic scan test on Dirichlet-tree multinomial model for microbiome data. Annals of Applied Statistics. Vol. 12, No. 1, 1–26. [online] [preprint] [R code]
- 2018 Ma L and Soriano J. Efficient functional ANOVA through wavelet-domain Markov groves. Journal of the American Statistical Association, Theory and Methods. Vol. 113, No. 3, 802–818. [online] [R package]
- 2017 Soriano J and Ma L. Probabilistic multi-resolution scanning for two-sample differences. Journal of the Royal Statistical Society. Series B. Vol. 79, No. 2, 547–572. [online] [R package]
- 2017 Ma L. Recursive partitioning and multi-scale modeling on conditional densities. Electronic Journal of Statistics. Vol. 11, No. 1, 1297–1325. [online] [R package]
- 2017 Ma L. Adaptive shrinkage in Pólya tree type models. Bayesian Analysis. Vol. 12, No. 3, 779–805. (Featured in the editor’s invited session “Highlights from Bayesian Analysis” at JSM 2017.) [online] [supplement] [R package]
- 2015 Ma L. Scalable Bayesian model averaging through local information propagation. Journal of the American Statistical Association, Theory and Methods. Vol. 110, No. 510, 795–809. [online] [preprint] [R package]
- 2013 Ma L. Adaptive testing of conditional association through recursive mixture modeling. Journal of the American Statistical Association, Theory and Methods. Vol. 108, No. 504, 1493–1505. [online] [R package]
- 2012 Ma L, Wong WH, and Owen AB. A sparse transmission disequilibrium test for haplotypes based on Bradley-Terry graphs. Human Heredity. Vol. 73, No. 1, 52–61. [online] [preprint]
- 2011 Ma L and Wong WH. Coupling optional Pólya trees and the two sample problem. Journal of the American Statistical Association, Theory and Methods. Vol. 106, No. 496, 1553–1565. [online] [arxiv]
- 2011 Ma L, Stein ML, Wang M, Shelton AO, Pfister CA, and Wilder KJ. A method for unbiased estimation of population abundance along curvy margins. Environmetrics. Vol. 22, No. 3, 330–339. [online]
- 2010 Wong WH and Ma L. Optional Pólya tree and Bayesian inference. Annals of Statistics. Vol. 38, No. 3, 1433–1459. [online] [pdf] [R package]
- 2010 Ma L, Assimes T, Asadi NB, Iribarren C, Quertermous T, and Wong WH. An “almost exhaustive” search based sequential permutation method for detecting epistasis in disease association studies. Genetic Epidemiology. Vol. 34, No. 5, 434–443. [online] [software]
- 2010 Ma L, Mease D, and Russell D. A four group cross-over design for measuring irreversible treatments on web search tasks. Proceedings of HICSS-44. [online]