TY - JOUR
T1 - THE CATEGORICAL CONTOURS OF THE CHOMSKY-SCHÜTZENBERGER REPRESENTATION THEOREM
AU - Melliès, Paul André
AU - Zeilberger, Noam
N1 - Publisher Copyright:
© P.-A. Melliès and N. Zeilberger.
PY - 2025/1/1
Y1 - 2025/1/1
N2 - We develop fibrational perspectives on context-free grammars and on nondeterministic finite-state automata over categories and operads. A generalized CFG is a functor from a free colored operad (aka multicategory) generated by a pointed finite species into an arbitrary base operad: this encompasses classical CFGs by taking the base to be a certain operad constructed from a free monoid, as an instance of a more general construction of an operad of spliced arrows W C for any category C. A generalized NFA is a functor from an arbitrary bipointed category or pointed operad satisfying the unique lifting of factorizations and finite fiber properties: this encompasses classical word automata and tree automata without ϵ-transitions, but also automata over non-free categories and operads. We show that generalized context-free and regular languages satisfy suitable generalizations of many of the usual closure properties, and in particular we give a simple conceptual proof that context-free languages are closed under intersection with regular languages. Finally, we observe that the splicing functor W: Cat → Oper admits a left adjoint C: Oper → Cat, which we call the contour category construction since the arrows of C O have a geometric interpretation as oriented contours of operations of O. A direct consequence of the contour / splicing adjunction is that every pointed finite species induces a universal CFG generating a language of tree contour words. This leads us to a generalization of the Chomsky-Schützenberger Representation Theorem, establishing that a subset of a homset L ⊆ C(A, B) is a CFL of arrows if and only if it is a functorial image of the intersection of a C-chromatic tree contour language with a regular language.
AB - We develop fibrational perspectives on context-free grammars and on nondeterministic finite-state automata over categories and operads. A generalized CFG is a functor from a free colored operad (aka multicategory) generated by a pointed finite species into an arbitrary base operad: this encompasses classical CFGs by taking the base to be a certain operad constructed from a free monoid, as an instance of a more general construction of an operad of spliced arrows W C for any category C. A generalized NFA is a functor from an arbitrary bipointed category or pointed operad satisfying the unique lifting of factorizations and finite fiber properties: this encompasses classical word automata and tree automata without ϵ-transitions, but also automata over non-free categories and operads. We show that generalized context-free and regular languages satisfy suitable generalizations of many of the usual closure properties, and in particular we give a simple conceptual proof that context-free languages are closed under intersection with regular languages. Finally, we observe that the splicing functor W: Cat → Oper admits a left adjoint C: Oper → Cat, which we call the contour category construction since the arrows of C O have a geometric interpretation as oriented contours of operations of O. A direct consequence of the contour / splicing adjunction is that every pointed finite species induces a universal CFG generating a language of tree contour words. This leads us to a generalization of the Chomsky-Schützenberger Representation Theorem, establishing that a subset of a homset L ⊆ C(A, B) is a CFL of arrows if and only if it is a functorial image of the intersection of a C-chromatic tree contour language with a regular language.
KW - category theory
KW - context-free grammars and context-free languages
KW - finite-state automata and regular languages
KW - operads
KW - representation theorem
KW - tree contour words
UR - https://www.scopus.com/pages/publications/105006903551
U2 - 10.46298/lmcs-21(2:12)2025
DO - 10.46298/lmcs-21(2:12)2025
M3 - Article
AN - SCOPUS:105006903551
SN - 1860-5974
VL - 21
SP - 12:1-12:49
JO - Logical Methods in Computer Science
JF - Logical Methods in Computer Science
IS - 2
ER -