Binary outcome (“yes” or “no”)
100,000 rows, 18 columns
Mix of numeric and categorical predictors
How long does it take to tune a boosted tree model on my laptop?
| Approach | Area under ROC | Elapsed time |
|---|---|---|
| Default engine + grid search | 0.8957 | 3.68h |
| Optimized engine + search strategy | 0.8954 | 1.52m |
Virtually indistinguishable performance in 0.7% of the time.
Quickly, some background:
Here’s our tuning process visualized similarly:
Sequentially:

In parallel:

In tidymodels, this is one added line of code:
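The code from this slide is not reproduced here. As a minimal sketch, assuming a recent version of tune that dispatches work through the future framework, the added line might look like the `plan()` call below. The worker count of 4 is an illustrative assumption, not a value from the talk.

```r
library(future)

# Assumption: 4 local R sessions; choose a count that suits your machine.
# Tuning functions called after this line can spread model fits across workers.
plan(multisession, workers = 4)
```

Calling `plan(sequential)` afterwards returns to one-at-a-time evaluation.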
Before:

With a carefully chosen modeling engine:

In tidymodels, this is one changed line of code. From:
To:
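The two code chunks are missing here. A sketch of what such a one-line change can look like, assuming the default engine is parsnip's default for `boost_tree()` (xgboost) and assuming, purely for illustration, that the replacement is LightGBM via the bonsai package. The object names are hypothetical.

```r
library(tidymodels)
library(bonsai)  # registers the "lightgbm" engine for boost_tree()

# From: a boosted tree spec on parsnip's default engine (xgboost)
bt_default <- boost_tree(
  mode = "classification",
  learn_rate = tune(),
  trees = tune()
)

# To: the same spec with one changed line selecting a different engine
# (assumption: LightGBM; the slide text does not name the engine)
bt_fast <- bt_default |> set_engine("lightgbm")
```

Everything downstream (resamples, grid, metrics) stays the same; only the engine differs.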
Before:

Fitting a third as many models:

In tidymodels, this is a few added lines of code:
In some cases, this “just works” with no changes.
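The slide's code is missing, and the text does not name the technique. One way to fit fewer models, offered here as an assumption rather than the talk's method, is to use a regular grid so that several candidates share everything except `trees`. For boosted trees, a single fit with the largest `trees` value can then produce predictions for the smaller values. The level counts below are illustrative.

```r
library(tidymodels)

bt <- boost_tree(
  mode = "classification",
  learn_rate = tune(),
  trees = tune()
)

# Regular grid: every learn_rate is crossed with every trees value.
# Assumption: 4 learning rates x 3 tree counts = 12 candidates.
bt_grid <- bt |>
  extract_parameter_set_dials() |>
  grid_regular(levels = c(trees = 3, learn_rate = 4))

# Only one fit per distinct learn_rate is needed per resample:
# 4 fits instead of 12, a third as many.
n_candidates <- nrow(bt_grid)
n_fits <- dplyr::n_distinct(bt_grid$learn_rate)
```

Passing `bt_grid` as the `grid` argument of a tuning function is all that changes. If the grid already has this structure, no change is needed.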
Before:

Giving up on poorly performing models early:

In tidymodels, this is one changed line of code. From:
To:
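The original chunks are missing here too. Dropping poorly performing candidates early is what racing methods do, so a plausible sketch, assuming the finetune package and simulated stand-in data, swaps `tune_grid()` for `tune_race_anova()`. The data, fold count, and grid size are illustrative assumptions.

```r
library(tidymodels)
library(finetune)

set.seed(1)
# Assumption: small simulated data in place of the talk's 100,000-row data set
dat <- sim_classification(500)
folds <- vfold_cv(dat, v = 5)

bt <- boost_tree(
  mode = "classification",
  learn_rate = tune(),
  trees = tune()
)

# From:
# res <- tune_grid(bt, class ~ ., resamples = folds, grid = 5)

# To: after a few initial resamples, candidates that are clearly worse
# than the current best are dropped and not evaluated further
res <- tune_race_anova(bt, class ~ ., resamples = folds, grid = 5)
```

The result can be inspected with the usual helpers, such as `show_best(res, metric = "roc_auc")`.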
We went from:

To:


github.com/simonpcouch/rpharma-24