Alberto Campagnolo, Ligatus Research Centre, University of the Arts, London; Dot Porter, Doug Emery & Dennis Mullen, Schoenberg Institute for Manuscript Studies, University of Pennsylvania

Virtually disbinding codices: The visualization of the construction of codex textblocks

Codices are primarily physical objects composed of leaves combined in quires, used as writing supports, and bound together. The basic physical unit of a codex is the quire (or gathering), the set of sheets folded together to form a group of leaves to be sewn together as part of a textblock. The way the elements of a codex are arranged tells its own story about the object.

Traditionally, information on the gathering structure of books is recorded in highly dense expressions, referred to as 'collation formulas'. These are capable of describing in detail the sequence of bifolia (and singletons) within the gatherings, but their decoding in relation to the physical appearance of the object that they are describing can prove challenging.

Moreover, unlike the bibliographical description of printed books, the field of manuscript studies has no standard for drafting collation formulas that is approved and employed by all scholars.

However, for the most part, both bibliographical collation formulas and the various styles employed in manuscript studies share a set of information units that are necessary to describe the arrangement of the sheets within the textblock.

Digital facsimiles of manuscripts, and of books in general, tend to show only single pages, or facing pages at most, and so fail to show any actual physical connection between the pages. Only at the central opening of a gathering do the facing pages belong to leaves that are physically connected at the fold. Collation formulas are sometimes provided along with the images, but the physical make-up of the object is nonetheless lost in the visual layout, and in the digital facsimile in general.
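The point about central leaves can be illustrated with a minimal sketch. It assumes a regular quire, that is, an even number of leaves with no singletons and no added or removed leaves, in which leaf i is conjoined with leaf n + 1 - i. The function names are hypothetical and chosen for illustration only; they are not taken from the system described here.

```python
def conjugate(leaf: int, leaves_in_quire: int) -> int:
    """Return the position of the leaf conjoined with `leaf` in a regular quire."""
    if leaves_in_quire < 2 or leaves_in_quire % 2 != 0:
        raise ValueError("a regular quire has an even number of leaves")
    if not 1 <= leaf <= leaves_in_quire:
        raise ValueError("leaf position is outside the quire")
    return leaves_in_quire + 1 - leaf


def bifolia(leaves_in_quire: int) -> list[tuple[int, int]]:
    """List the bifolia of a regular quire, from the outermost to the innermost."""
    half = leaves_in_quire // 2
    return [(i, conjugate(i, leaves_in_quire)) for i in range(1, half + 1)]


def conjoined_openings(leaves_in_quire: int) -> list[tuple[int, int]]:
    """List the openings (leaf i facing leaf i + 1) whose two leaves share a fold."""
    return [
        (i, i + 1)
        for i in range(1, leaves_in_quire)
        if conjugate(i, leaves_in_quire) == i + 1
    ]


if __name__ == "__main__":
    print(bifolia(8))             # bifolia of a quire of eight leaves
    print(conjoined_openings(8))  # only the central opening qualifies
```

For a quire of eight leaves, the bifolia are (1, 8), (2, 7), (3, 6) and (4, 5), and the only opening that shows two conjoined leaves is the central one, between leaves 4 and 5.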

We have been working on a way to collect information on the physical make-up of the quire structure of books in codex format, and to present this visually to the end user. It is a novel way of presenting manuscripts on a computer, one that is substantially different from the customary page-turning view, and one that focuses on the physicality of the book as opposed to its state as a text-bearing object.

In the project’s pilot phase[1], the system took the necessary information from collation formulas to present the viewer with a representation of the physical relations between the pages. Because of the aforementioned lack of standards, we have moved away from parsing collation formulas as the system’s first input, and have instead decided to offer users an on-line form[2] in which to record the basic information on the structure of the manuscript. From the information gathered through the webform, we generate an eXtensible Markup Language (XML)[3] model of the gathering structure, which is then used for the visualizations.
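As a rough illustration of how form data might be turned into an XML model, the sketch below builds a description from a list of leaf counts, one per quire. The element and attribute names (`manuscript`, `quire`, `leaf`, `shelfmark`, `folio`, `position`, `conjoin`) are assumptions made for this example and do not reproduce the project's actual schema; the sketch also assumes regular quires only.

```python
import xml.etree.ElementTree as ET


def build_model(shelfmark: str, leaves_per_quire: list[int]) -> ET.Element:
    """Build a hypothetical XML model of a textblock made of regular quires."""
    root = ET.Element("manuscript", {"shelfmark": shelfmark})
    folio = 1  # running folio number across the whole textblock
    for quire_number, leaf_count in enumerate(leaves_per_quire, start=1):
        quire = ET.SubElement(root, "quire", {"n": str(quire_number)})
        for position in range(1, leaf_count + 1):
            ET.SubElement(
                quire,
                "leaf",
                {
                    "folio": str(folio),
                    "position": str(position),
                    # In a regular quire, leaf i is conjoined with leaf n + 1 - i.
                    "conjoin": str(leaf_count + 1 - position),
                },
            )
            folio += 1
    return root


if __name__ == "__main__":
    # "MS Example 1" is an invented shelfmark used only for this demonstration.
    model = build_model("MS Example 1", [8, 4])
    print(ET.tostring(model, encoding="unicode"))
```

Recording the conjoined position on each leaf, rather than only a leaf count per quire, is one possible design choice: it leaves room for a fuller model to describe singletons and irregular quires.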

The system generates three main outputs: (i) an XML description of the textblock structure, (ii) an automated diagrammatic visualization of that description, and (iii) a digital facsimile of the manuscript that is focussed on the bifolium view rather than the customary facing-pages view, i.e. it shows together those pages that are physically conjoined in the manuscript, as opposed to those that are next to each other.

At the time of writing, the system is still in the development stage, but we foresee that it will be flexible and robust enough to be useful not only for scholars interested in using a digital facsimile of manuscript structures for their research, but also for other categories of end users, such as conservators and students.

The system, in fact, lends itself to a multiplicity of uses and offers a variety of corollary features not usually found in other manuscript description practices.

Firstly, the XML model behind the input webform allows the construction of manuscripts to be described in a structured way, and the resulting XML file can be exported and subsequently utilized to generate collation formulas according to any desired format, through eXtensible Stylesheet Language Transformations (XSLT)[4] or another programming language.
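To suggest how a formula might be derived from such an XML file in a language other than XSLT, here is a minimal Python sketch. Both the XML structure (a `manuscript` element containing `quire` elements with an `n` attribute and `leaf` children) and the output style (quire range followed by the leaf count in parentheses) are assumptions chosen for illustration; neither is the project's schema nor an established formula convention.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def sample_xml() -> str:
    """Return a small hypothetical model: quires 1-2 of eight leaves, quire 3 of four."""
    quires = "".join(
        f'<quire n="{number}">' + "<leaf/>" * leaf_count + "</quire>"
        for number, leaf_count in [(1, 8), (2, 8), (3, 4)]
    )
    return f"<manuscript>{quires}</manuscript>"


def collation_formula(xml_text: str) -> str:
    """Summarize consecutive quires of equal size as 'first-last(leaf count)'."""
    root = ET.fromstring(xml_text)
    sizes = [
        (int(quire.get("n")), len(quire.findall("leaf")))
        for quire in root.findall("quire")
    ]
    parts = []
    # Group runs of adjacent quires that have the same number of leaves.
    for leaf_count, run in groupby(sizes, key=lambda item: item[1]):
        numbers = [number for number, _ in run]
        if len(numbers) == 1:
            span = str(numbers[0])
        else:
            span = f"{numbers[0]}-{numbers[-1]}"
        parts.append(f"{span}({leaf_count})")
    return ", ".join(parts)


if __name__ == "__main__":
    print(collation_formula(sample_xml()))  # prints "1-2(8), 3(4)"
```

Because the formula is produced from the model rather than the other way round, supporting another formula style would only require a different formatting function over the same data.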

Secondly, we want to offer a diagram view of each quire whilst it is being described; these automated diagrams would allow for immediate data validation and increase the accuracy of the recorded information.[5]

Thirdly, when dealing with material objects and their visualization, one has to accept a degree of uncertainty, but this is not generally allowed in traditional collation formulas; we aim to allow uncertainty to be indicated both in the gathering structure descriptions and in their visualization.
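One way uncertainty could be carried from the description through to the visualization is sketched below. The class names, the two certainty levels, and the text rendering (a solid link for a certain conjoin, a broken link for an uncertain one) are all hypothetical and serve only to illustrate the idea; they are not the system's data model.

```python
from dataclasses import dataclass
from enum import Enum


class Certainty(Enum):
    CERTAIN = "certain"
    UNCERTAIN = "uncertain"


@dataclass
class Conjoin:
    """A recorded connection between two leaf positions in a quire."""
    first: int
    second: int
    certainty: Certainty = Certainty.CERTAIN


def render(conjoin: Conjoin) -> str:
    """Draw a one-line text diagram; uncertain connections use a broken line."""
    link = "=====" if conjoin.certainty is Certainty.CERTAIN else "- - -"
    return f"{conjoin.first} {link} {conjoin.second}"


if __name__ == "__main__":
    for item in [Conjoin(1, 8), Conjoin(2, 7, Certainty.UNCERTAIN)]:
        print(render(item))
```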

We offer the collation tool as a free on-line system for people to use as best fits their needs, whether as a temporary description tool to generate and export the structure of manuscripts in XML, for its automated diagrammatic visualization (which we think could be of particular interest to conservators), or for the fully-fledged digital edition visualization.

In our paper, we will present the system and its uses, and explain how we hope it will lead to new research questions and new scholarship.

[1] See http://dorpdev.library.upenn.edu/collation/ (accessed June 2015).

[2] See https://protected-island-3361.herokuapp.com/manuscripts/new (accessed June 2015) and https://github.com/leoba/VisColl (accessed June 2015).

[3] http://www.w3.org/XML/ (accessed June 2015).

[4] http://www.w3.org/TR/xslt (accessed June 2015).

[5] As shown in Campagnolo, Alberto (2016), “Errata (per Oculos) Corrige. Visual Identification of Meaningless Data in Database Records of Bookbinding Structures”, in Care and Conservation of Manuscripts 15: Proceedings of the Fifteenth International Seminar Held at the University of Copenhagen, 2nd-4th April 2014, in press, Copenhagen: Tusculanum Press.