Paginated outputs remain important to scholarly communications, and are still critical for books like monographs. Even in today’s increasingly digital discovery landscape, many readers of long-form content continue to prefer print, and the ability to cite page numbers continues to be critical to creating good old-fashioned tools like a book index. But producing paginated books from HTML source files that could also be used for generating other types of digital files has always been a challenge, as Nellie McKesson notes in her recent blog post on Hederis. So, a couple of years ago, the University of California Press and the California Digital Library partnered with Coko to begin an ambitious project to develop a workflow application that would allow books to be built in a browser using entirely open source technologies. Editoria is not the first open source, browser-based book production system that has ever been attempted, but it’s (at least to our knowledge) the first that has attempted to replicate the rigorous production editing process and workflow, which includes styling, copyediting, author review, and proofreading, in a browser-based application.
We borrowed the idea of single-source publishing using HTML source from predecessor applications like Adam Hyde’s Booktype, O’Reilly’s Atlas, and Hugh McGuire’s Pressbooks, all of which use some form of PDF rendering engine (often proprietary) to output beautiful, paginated books in addition to EPUBs and other HTML or XML-based files. We’ve then tried to stand on the shoulders of those applications by building in a greater degree of workflow support. It’s an ambitious project, and supporting paginated outputs from a single HTML-based source file has been a non-trivial aspect of the system’s development.
Editoria starts with a book dashboard where all active titles that a user is working on can be seen and accessed.
Clicking the “edit” link next to any of the books in the library brings you to the so-called “Book Builder” interface, which represents the narrative structure of the book.
We’ve introduced the ability to upload Microsoft Word documents and to order the book into the commonly understood sections of frontmatter, body, and backmatter that are outlined in the Chicago Manual of Style.
Clicking “edit” next to any section of the book drops you into a web-based word processing environment, Wax, where production editors, copyeditors, and authors can all interact around the text.
This is where manuscript styling and copyediting take place. The interface introduces the ability to edit notes as well as text. Access to the text can be managed using a team manager that allows for fairly granular role-based permissions. These allow administrators to control who can do what to the text at what point.
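To make the idea of "who can do what to the text at what point" concrete, here is a minimal sketch in Python of a permission table keyed on role and workflow stage. It is purely illustrative: the role names, stage names, actions, and the `can` helper are assumptions made up for this example and are not taken from Editoria’s actual team manager.

```python
from enum import Enum


class Stage(Enum):
    """Workflow stages named in this post; the enum itself is hypothetical."""
    STYLING = "styling"
    COPYEDITING = "copyediting"
    AUTHOR_REVIEW = "author_review"
    PROOFREADING = "proofreading"


# Hypothetical table: (role, stage) -> actions that role may take at that stage.
# Any pair that is not listed gets no access at all.
PERMISSIONS = {
    ("production_editor", Stage.STYLING): {"read", "edit"},
    ("production_editor", Stage.COPYEDITING): {"read"},
    ("production_editor", Stage.AUTHOR_REVIEW): {"read"},
    ("production_editor", Stage.PROOFREADING): {"read", "edit"},
    ("copyeditor", Stage.COPYEDITING): {"read", "edit"},
    ("author", Stage.AUTHOR_REVIEW): {"read", "edit"},
}


def can(role: str, stage: Stage, action: str) -> bool:
    """Return True if `role` may perform `action` while the text is at `stage`."""
    return action in PERMISSIONS.get((role, stage), set())


if __name__ == "__main__":
    print(can("copyeditor", Stage.COPYEDITING, "edit"))  # allowed in this table
    print(can("author", Stage.COPYEDITING, "edit"))      # not listed, so denied
```

Keying the table on both role and stage is what lets the same person have edit rights at one point in the workflow and read-only (or no) access at another.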
Once the text has reached a point at which everyone feels that it’s ready to be published, the book can be exported in either PDF or EPUB format. At the moment, these are the two standards-based formats that the system supports. We are currently using the Vivliostyle CSS-based typesetting engine. We initially chose Vivliostyle because not only was it open source, but it also tried, insofar as possible, to work with existing web standards for browser-based pagination.
This is a fairly simple, text-only example, but there are a number of complex elements that all have to be handled perfectly in order to render a high-quality PDF from HTML source. These two pages alone contain or require running heads, subheads, footnotes, diacritics, and hyphenation control, all of which are critical to rendering not just a serviceable PDF, but one that is of high enough quality for a publisher or author.
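For readers curious what several of these elements look like in CSS, here is a rough sketch: a small Python helper that assembles a print stylesheet using properties from the CSS Paged Media and Generated Content for Paged Media drafts. This is not Editoria’s stylesheet. The class names (`book-title`, `chapter-title`, `footnote`), the trim size, and the `print_stylesheet` function are all assumptions for illustration, and support for individual properties varies from one rendering engine to another.

```python
def print_stylesheet(page_size: str = "6in 9in") -> str:
    """Build an illustrative print stylesheet (hypothetical class names)."""
    return f"""
@page {{
  size: {page_size};
  margin: 0.75in;
}}
/* Running heads: book title on left-hand pages, chapter title on right-hand pages. */
@page :left {{
  @top-left {{ content: string(book-title); }}
}}
@page :right {{
  @top-right {{ content: string(chapter-title); }}
}}
/* Capture heading text into named strings for the running heads. */
h1.book-title {{ string-set: book-title content(text); }}
h1.chapter-title {{ string-set: chapter-title content(text); }}
/* Move inline note text to the foot of the page. */
span.footnote {{ float: footnote; }}
/* Hyphenate body text, but keep subheads unbroken and attached to what follows. */
p {{ hyphens: auto; orphans: 2; widows: 2; }}
h2 {{ hyphens: none; break-after: avoid; }}
"""


if __name__ == "__main__":
    print(print_stylesheet())
```

Automatic hyphenation also depends on the HTML source declaring its language (for example, a `lang` attribute on the root element), which is one reason the quality of the source markup matters as much as the stylesheet.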