TY - JOUR
T1 - Elimination of annotation dependencies in validation for Modern JSON Schema
AU - Attouche, Lyes
AU - Baazizi, Mohamed Amine
AU - Colazzo, Dario
AU - Ghelli, Giorgio
AU - Klessinger, Stefan
AU - Sartiani, Carlo
AU - Scherzinger, Stefanie
N1 - Publisher Copyright:
© 2025 The Author(s)
PY - 2026/2/13
Y1 - 2026/2/13
N2 - JSON Schema is a declarative language that allows one to specify the structure of JSON instances using hierarchical schema objects that combine logical and structural operators. Early versions of JSON Schema, known collectively as Classical JSON Schema, had a straightforward semantics in which a schema's meaning was completely determined by the set of JSON values it validates. This simple foundation enabled researchers to develop robust theoretical frameworks and practical tools for instance validation, and for deciding whether schemas are satisfiable or equivalent to one another. However, Classical JSON Schema had a significant weakness: it could not effectively express certain kinds of extensions of object schemas. This limitation prompted a major overhaul in Draft 2019-09, which introduced two new features that fundamentally alter how JSON Schema works. The first is annotation dependency: validation now produces more than a yes/no result. When a schema validates a JSON instance, it also generates an “annotation” that records which fields and items were “evaluated”. This annotation then influences the behavior of the new operators "unevaluatedProperties" and "unevaluatedItems", creating a dependency that did not exist before. The second feature is dynamic references, a separate mechanism that allows the target of a reference operator to depend on the validation context. These changes were so substantial that all JSON Schema versions from Draft 2019-09 onward are called Modern JSON Schema. This semantic shift invalidated much of the existing theoretical work, and the algorithms that researchers had developed for Classical JSON Schema, particularly those for determining satisfiability and schema inclusion, do not easily adapt to Modern JSON Schema's new behavior. One approach to bridge this gap is “elimination”: converting Modern JSON Schema constructs back into equivalent Classical JSON Schema forms.
Previous research successfully developed algorithms for eliminating dynamic references, but the elimination of annotation dependency remained unsolved. In this paper we solve this problem with three contributions: an expressibility result, proving that eliminating annotation-dependent operators is possible; a succinctness result, proving that eliminating annotation-dependent operators can, in general, cause schemas to grow exponentially in size; and a practical algorithm to perform annotation elimination. This practical algorithm not only matches the asymptotic lower bound provided by the succinctness theorem, but also includes specific optimizations designed to exploit typical features of real-world schemas. Comprehensive experimental testing on a representative set of 305 schemas retrieved from GitHub shows that the practical algorithm runs in less than 10 ms on all of them, and in less than 1 ms in 98% of the cases, and that, in 95% of the cases, it produces Classical JSON Schema schemas whose size is at most ten times the size of the source schema written in Modern JSON Schema.
AB - JSON Schema is a declarative language that allows one to specify the structure of JSON instances using hierarchical schema objects that combine logical and structural operators. Early versions of JSON Schema, known collectively as Classical JSON Schema, had a straightforward semantics in which a schema's meaning was completely determined by the set of JSON values it validates. This simple foundation enabled researchers to develop robust theoretical frameworks and practical tools for instance validation, and for deciding whether schemas are satisfiable or equivalent to one another. However, Classical JSON Schema had a significant weakness: it could not effectively express certain kinds of extensions of object schemas. This limitation prompted a major overhaul in Draft 2019-09, which introduced two new features that fundamentally alter how JSON Schema works. The first is annotation dependency: validation now produces more than a yes/no result. When a schema validates a JSON instance, it also generates an “annotation” that records which fields and items were “evaluated”. This annotation then influences the behavior of the new operators "unevaluatedProperties" and "unevaluatedItems", creating a dependency that did not exist before. The second feature is dynamic references, a separate mechanism that allows the target of a reference operator to depend on the validation context. These changes were so substantial that all JSON Schema versions from Draft 2019-09 onward are called Modern JSON Schema. This semantic shift invalidated much of the existing theoretical work, and the algorithms that researchers had developed for Classical JSON Schema, particularly those for determining satisfiability and schema inclusion, do not easily adapt to Modern JSON Schema's new behavior. One approach to bridge this gap is “elimination”: converting Modern JSON Schema constructs back into equivalent Classical JSON Schema forms.
Previous research successfully developed algorithms for eliminating dynamic references, but the elimination of annotation dependency remained unsolved. In this paper we solve this problem with three contributions: an expressibility result, proving that eliminating annotation-dependent operators is possible; a succinctness result, proving that eliminating annotation-dependent operators can, in general, cause schemas to grow exponentially in size; and a practical algorithm to perform annotation elimination. This practical algorithm not only matches the asymptotic lower bound provided by the succinctness theorem, but also includes specific optimizations designed to exploit typical features of real-world schemas. Comprehensive experimental testing on a representative set of 305 schemas retrieved from GitHub shows that the practical algorithm runs in less than 10 ms on all of them, and in less than 1 ms in 98% of the cases, and that, in 95% of the cases, it produces Classical JSON Schema schemas whose size is at most ten times the size of the source schema written in Modern JSON Schema.
KW - JSON schema
KW - Modern JSON Schema
KW - Schema languages
KW - Schema rewriting
KW - Type theory
KW - Validation
UR - https://www.scopus.com/pages/publications/105024665431
U2 - 10.1016/j.tcs.2025.115645
DO - 10.1016/j.tcs.2025.115645
M3 - Article
AN - SCOPUS:105024665431
SN - 0304-3975
VL - 1063
JO - Theoretical Computer Science
JF - Theoretical Computer Science
M1 - 115645
ER -