Online checkpointing with improved worst-case guarantees

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

In the online checkpointing problem, the task is to continuously maintain a set of k checkpoints that allow an ongoing computation to be rewound faster than by a full restart. The only operation allowed is to replace an old checkpoint with the current state. We aim for checkpoint placement strategies that minimize rewinding cost: at any time T, when requested to rewind to some time t ≤ T, the number of computation steps that must be redone to reach t from a checkpoint before t should be as small as possible. In particular, we want the closest checkpoint earlier than t to be no further from t than q_k times the ideal distance T/(k + 1), where q_k is a small constant. Improving over earlier work showing 1 + 1/k ≤ q_k ≤ 2, we show that q_k can be chosen asymptotically less than 2. We present algorithms with asymptotic discrepancy q_k ≤ 1.59 + o(1), valid for all k, and q_k ≤ ln 4 + o(1) ≤ 1.39 + o(1), valid when k is a power of two. Experiments indicate the uniform bound q_k ≤ 1.7 for all k. For small k, we show how to use a linear programming approach to compute good checkpointing algorithms. This gives discrepancies of less than 1.55 for all k < 60. We prove the first lower bound that is asymptotically greater than one, namely q_k ≥ 1.30 - o(1). We also show that optimal algorithms (attaining the infimum discrepancy) exist for all k.
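The quality measure described in the abstract can be illustrated with a minimal Python sketch. It is not one of the paper's algorithms; it only evaluates a given checkpoint set. The function names `rewind_cost` and `discrepancy` are hypothetical, and the sketch assumes that time 0 acts as an implicit checkpoint (a full restart) and that a checkpoint exactly at t counts as usable.

```python
import bisect


def rewind_cost(checkpoints, t):
    """Steps to redo to reach time t from the closest checkpoint at or before t.

    Assumption: time 0 is always available as an implicit checkpoint.
    """
    points = [0.0] + sorted(checkpoints)
    i = bisect.bisect_right(points, t) - 1  # last point <= t
    return t - points[i]


def discrepancy(checkpoints, now):
    """Largest gap between consecutive points, relative to the ideal gap now/(k+1)."""
    k = len(checkpoints)
    points = [0.0] + sorted(checkpoints) + [now]
    largest_gap = max(b - a for a, b in zip(points, points[1:]))
    return largest_gap / (now / (k + 1))


# Evenly spaced checkpoints match the ideal distance T/(k+1).
print(discrepancy([1, 2, 3], now=4))
# Without replacing any checkpoint, the last gap grows as time advances.
print(discrepancy([1, 2, 3], now=6))
```

In this toy setting, a checkpointing strategy would decide which old checkpoint to replace with the current state so that the value returned by `discrepancy` stays below the target q_k at all times.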

Original language: English
Title of host publication: Automata, Languages, and Programming - 40th International Colloquium, ICALP 2013, Proceedings
Pages: 255-266
Number of pages: 12
Edition: PART 1
DOIs
Publication status: Published - 23 Jul 2013
Externally published: Yes
Event: 40th International Colloquium on Automata, Languages, and Programming, ICALP 2013 - Riga, Latvia
Duration: 8 Jul 2013 to 12 Jul 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7965 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 40th International Colloquium on Automata, Languages, and Programming, ICALP 2013
Country/Territory: Latvia
City: Riga
Period: 8/07/13 to 12/07/13
