The study of register in computational language research has historically
been divided into register analysis, seeking to determine the registerial
character of a text or corpus, and register synthesis, seeking to generate a
text in a desired register. This article surveys the different approaches to
these disparate tasks. Register synthesis has tended to use more theoretically
articulated notions of register and genre than analysis work, which often seeks
to categorize on the basis of intuitive and somewhat incoherent notions of
prelabeled ‘text types’. I argue that an integration of computational register
analysis and synthesis will benefit register studies as a whole, by enabling a
new large-scale research program in the field. Such a program will enable
comprehensive global mapping of functional language varieties in multiple
languages, including the relationships between them. Furthermore, computational
methods, together with high-coverage, systematically collected and analyzed
data, will enable rigorous empirical validation and refinement of different
theories of register, which will also have implications for our understanding
of linguistic variation in general.
