Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees

Marco Bressan, Mauro Sozio

Research output: Contribution to journalConference articlepeer-review

Abstract

We study the problem of maintaining a decision tree in the fully-dynamic setting, where the dataset is updated by an adversarial sequence of insertions and deletions. We present the first algorithm with strong guarantees on both the quality of the tree and the worst-case update time (the maximum number of operations the algorithm performs between two dataset updates). For instance, we can maintain a tree where each node has Gini gain within β of the optimum, while guaranteeing an update time O(dβ-3 log4 n), where d is the number of features and n the maximum size of the dataset. This is optimal up to polylogarithmic factors, as any dynamic algorithm must have update time in Ω(d). Similar guarantees hold for the variance and information gain, for classification and regression, as well as for boosted trees. This shows that many popular decision trees such as ID3 or C4.5 can be efficiently made dynamic, answering an open question of Bressan, Damay, and Sozio (2023).

Original languageEnglish
Pages (from-to)4517-4541
Number of pages25
JournalProceedings of Machine Learning Research
Volume235
Publication statusPublished - 1 Jan 2024
Externally publishedYes
Event41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria
Duration: 21 Jul 202427 Jul 2024

Fingerprint

Dive into the research topics of 'Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees'. Together they form a unique fingerprint.

Cite this