Unboxed Data Constructors: Or, How cpp Decides a Halting Problem

  • PSL research University & IPSL
  • Jane Street Europe Limited
  • University of Cambridge

Research output: Contribution to journal › Article › peer-review

Abstract

We propose a new language feature for ML-family languages: the ability to selectively unbox certain data constructors, so that their runtime representation gets compiled away to just the identity on their argument. Unboxing must be statically rejected when it could introduce confusion, that is, distinct values with the same representation. We discuss the use case of big numbers, where unboxing makes it possible to write code that is both efficient and safe, replacing either a safe but slow version or a fast but unsafe version. We explain the static analysis necessary to reject incorrect unboxing requests. We present our prototype implementation of this feature for the OCaml programming language, and discuss several design choices and the interaction with advanced features such as Guarded Algebraic Datatypes. Our static analysis requires expanding type definitions in type expressions, which is not necessarily normalizing in the presence of recursive type definitions. In other words, we must decide normalization of terms in the first-order λ-calculus with recursion. We provide an algorithm to detect non-termination on-the-fly during reduction, with proofs of correctness and completeness. Our algorithm turns out to be closely related to the normalization strategy for macro expansion in the cpp preprocessor.
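
The on-the-fly loop detection mentioned in the abstract can be illustrated with a small OCaml sketch. It is not the paper's algorithm or implementation, only a toy model of the cpp-style idea under stated assumptions: every node produced by expanding a definition remembers the names whose expansion created it, and head expansion stops with an error when a name reappears in a node that it produced itself. The types `ty` and `aty`, the exception `Loop`, and the functions `instantiate`, `head_normalize` and `head` are all hypothetical names introduced for this illustration.

```ocaml
(* Toy first-order type expressions. Hypothetical, for illustration only. *)
type ty =
  | Var of int               (* i-th parameter of the enclosing definition *)
  | App of string * ty list  (* named type applied to arguments *)

module Env = Map.Make (String)
module Names = Set.Make (String)

(* Annotated expressions: each node records the set of definition names
   whose expansion produced it (similar in spirit to cpp's rule that a
   macro is not re-expanded inside its own expansion). *)
type aty = AApp of Names.t * string * aty list

exception Loop of string

(* Instantiate a definition body: nodes coming from the body get the
   annotation [ann]; substituted arguments keep their own annotations. *)
let rec instantiate ann args = function
  | Var i -> List.nth args i
  | App (c, ts) -> AApp (ann, c, List.map (instantiate ann args) ts)

(* Expand the head of a type until it is a primitive name (one that is
   absent from [env]), or raise [Loop] if the head was itself produced
   by expanding the same name. *)
let rec head_normalize env (AApp (ann, c, args) as t) =
  match Env.find_opt c env with
  | None -> t
  | Some body ->
    if Names.mem c ann then raise (Loop c)
    else head_normalize env (instantiate (Names.add c ann) args body)

(* Example definitions (assumed):
     type 'a id = 'a
     type u = int id id     -- expands to int
     type t = t id          -- expands forever *)
let env =
  Env.empty
  |> Env.add "id" (Var 0)
  |> Env.add "u" (App ("id", [App ("id", [App ("int", [])])]))
  |> Env.add "t" (App ("id", [App ("t", [])]))

(* Head constructor of the normal form, or the name on which a loop
   was detected. *)
let head name =
  match head_normalize env (AApp (Names.empty, name, [])) with
  | AApp (_, c, _) -> Ok c
  | exception Loop c -> Error c
```

With these definitions, `head "u"` returns `Ok "int"` and `head "t"` returns `Error "t"`. Keeping the annotation on each node, rather than one global set of blocked names, is what allows `id` to be expanded twice in `int id id` without a spurious loop report: the inner `id` is an argument that was not produced by expanding the outer `id`.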

Original language: English
Pages (from-to): 604-637
Number of pages: 34
Journal: Proceedings of the ACM on Programming Languages
Volume: 8
Publication status: Published - 5 Jan 2024

Keywords

  • boxing
  • data representation
  • recursive definitions
  • sum types
  • tagging
  • termination
