Non-sequential Pipelines and Tuning

Research output: Chapter in Book/Report/Conference proceeding › Book chapter › Research › peer-review

Real-world applications often require complicated pipelines that do not progress sequentially. For example, many experiments have demonstrated that bagging is a powerful method to improve model performance. Bagging can be thought of as a non-sequential pipeline in which a learner is replicated, each copy is trained and makes predictions, and the results are combined. This is non-sequential because data does not flow through the pipeline in a single stream but is instead passed to all learners (which may then subsample the data) and later recombined, creating a pipeline in which operations have multiple inputs and outputs. Pipeline operations also have hyperparameters that can be set and tuned to improve model performance. Moreover, the choice of which operations to include in a pipeline can itself be tuned, a process known as combined algorithm selection and hyperparameter optimization (CASH). This chapter looks at more advanced uses of mlr3pipelines. This is put into practice by demonstrating how to build bagging and stacking pipelines from scratch, as well as how to access common pipelines that are readily available in mlr3pipelines. The chapter then looks at tuning pipelines and CASH.
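As an illustrative sketch only (not the chapter's own code), the following R snippet shows how such a non-sequential bagging graph can be assembled with mlr3pipelines; the choice of learner (classif.rpart), the subsampling fraction, the number of replicates, and the sonar task are arbitrary placeholders.

```r
# Minimal bagging sketch with mlr3pipelines (illustrative; parameters are placeholders).
library(mlr3)
library(mlr3pipelines)

# One path: subsample the data, then fit a decision tree on the subsample.
single_path = po("subsample", frac = 0.7) %>>%
  po("learner", lrn("classif.rpart"))

# Replicate the path 10 times so each copy trains on a different subsample,
# then average the 10 predictions into a single ensemble prediction.
graph = ppl("greplicate", single_path, n = 10) %>>%
  po("classifavg", innum = 10)

# Wrap the graph so it behaves like any other mlr3 learner.
bagged = as_learner(graph)
bagged$train(tsk("sonar"))
```

Stacking and CASH follow the same idea: sub-graphs are combined (e.g. with gunion()) or selected between (e.g. with po("branch")), and the hyperparameters of the resulting graph, including the selection itself, can then be tuned like those of any other learner.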

Original language: English
Title of host publication: Applied Machine Learning Using mlr3 in R
Editors: Bernd Bischl, Raphael Sonabend, Lars Kotthoff, Michel Lang
Number of pages: 22
Publisher: CRC Press
Publication date: 2024
Pages: 174-195
Chapter: 8
ISBN (Print): 978-1-032-51567-0, 978-1-032-50754-5
ISBN (Electronic): 978-1-003-40284-8
Publication status: Published - 2024

Bibliographical note

Publisher Copyright:
© 2024 selection and editorial matter, Bernd Bischl, Raphael Sonabend, Lars Kotthoff, and Michel Lang. All rights reserved.
