Typesetting a novel for Print with HTML & CSS Paged Media

I’ve been following new developments around CSS Paged Media and PrintCSS for a while now, but I never had the chance to test them out in a real-world project. Until recently. The publishing house I work for wants to produce personalized novels. Customers can customize (*badum tsch*) their novel in a web frontend written in Python. The output is plain HTML5.

To avoid overhead I set out to typeset the different novels with web technologies. These are my findings:

tl;dr: It pretty much went smoothly. CSS Paged Media is even more convenient than InDesign in some aspects.
Docraptor.com (with Prince XML as the engine) was my converter of choice.
I only wish for one missing feature; otherwise typesetting was a breeze.

The Dream (and the reality)

  • Automatic generation of print-ready PDF from HTML & CSS (✔)
  • Easy setup of pages and type-areas (✔)
  • Custom fonts (✔)
  • Widows and orphans can be avoided (✔)
  • Justified text has no rivers (✔)
  • Hyphenation is handled properly and automatically (✔)
  • Fill the given type-area with lines of text from left to right and top to bottom (left to right: ✔ | top to bottom: ⚠)
  • Headings can start on a new right page (✔)
  • Pages can be empty (✔)
  • Running headers (✔)
  • Pagination (✔)

So let’s see how far we can get.

Generate PDF from HTML & CSS

To output print-ready PDFs you need a converter engine. After some research I shortlisted a few candidates (feel free to suggest alternatives); Vivliostyle, WeasyPrint, PDFreactor and Prince XML are compared in the section on hyphenation.

Page Setup

Page setup is fairly easy with CSS Paged Media. You can choose between standard paper sizes like »Letter« or »A4«, or define a completely custom page size and type area via the @page rule.

@page {
  size: 125mm 187mm;            /* page width, page height */
  margin: 15mm 15mm 16mm 15mm;  /* top, right, bottom, left */
}
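
For a bound book the inner and outer margins usually differ. A minimal sketch, assuming a wider inner margin is wanted, using the :left and :right page selectors. The 12mm and 18mm values are illustrative and not the ones used in this project; they are chosen so that the type area keeps the same width as above.

```css
/* Base page: trim size and the top/bottom margins shared by all pages */
@page {
  size: 125mm 187mm;
  margin-top: 15mm;
  margin-bottom: 16mm;
}

/* Left-hand pages: the binding is on the right side */
@page :left {
  margin-left: 12mm;   /* outer margin (illustrative) */
  margin-right: 18mm;  /* inner margin (illustrative) */
}

/* Right-hand pages: the binding is on the left side */
@page :right {
  margin-left: 18mm;   /* inner margin (illustrative) */
  margin-right: 12mm;  /* outer margin (illustrative) */
}
```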

Custom Fonts

easy:

@font-face {
  font-family: Garamond;
  font-weight: normal;
  src: url(fonts/AGaramondPro-Regular.otf) format('opentype');
}

Embedding custom fonts works as it should.
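
A novel usually needs at least an italic cut as well. The following is a sketch under assumptions: the italic file name is hypothetical, and the font size and line height are illustrative values, not the ones from this project.

```css
/* Italic cut registered under the same family name;
   the file name is hypothetical */
@font-face {
  font-family: Garamond;
  font-style: italic;
  font-weight: normal;
  src: url(fonts/AGaramondPro-Italic.otf) format('opentype');
}

/* Use the family for the body text; emphasised text picks up
   the italic face automatically because the family name matches */
body {
  font-family: Garamond, serif;
  font-size: 10.5pt;   /* illustrative */
  line-height: 14pt;   /* illustrative */
}
```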

Widows & Orphans

Widows and orphans can be avoided with this code:

p.example {
  widows: 2;
  orphans: 2;
}

Just as a refresher, as CSS-Tricks phrases it:

widows = minimum number of lines in a paragraph split on the new page.
orphans = minimum number of lines in a paragraph split on the old page.
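
In a real stylesheet the rule would probably target every paragraph rather than a single class. A minimal sketch, assuming plain p and h1/h2 elements; the heading rule is an addition for illustration and not something the project necessarily used:

```css
/* At least two lines of a paragraph must stay together
   at the bottom (orphans) and at the top (widows) of a page */
p {
  widows: 2;
  orphans: 2;
}

/* Avoid a page break directly after a heading, so the heading
   does not end up alone at the bottom of a page */
h1, h2 {
  page-break-after: avoid;
}
```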

Justified Text

Justification can easily be set in the CSS.

p.example {
  text-align: justify;
}

This was easy. Are we done? Not quite.

Nice-looking justified text is intrinsically linked with hyphenation. Without hyphens your text will have big gaps and sport the occasional river, which is very ugly and hard to read.

So on to …

Hyphenation

Hyphenation is handled by the renderer. You can switch it on and off via a CSS rule.

p.example {
  hyphens: auto;
}

Different renderers use different algorithms and dictionaries for hyphenation, so the output will vary:

  • Vivliostyle: No hyphenation at the moment (it uses WebKit as its rendering engine; as soon as WebKit does hyphenation they will, too)
  • WeasyPrint: Experimental hyphenation possible via the vendor-specific prefix »-weasy-hyphens« (uses Pyphen as hyphenation engine)
  • PDFreactor: Hyphenation possible (dictionaries for the following languages: Bulgarian, Catalan, Danish, German (new orthography), German (traditional), Greek (Modern), English (US), English (GB), Estonian, Galician, Interlingua, Indonesian (Bahasa Indonesia), Icelandic, Italian, Kurmanji (Northern Kurdish), Latin, Dutch, Polish, Russian, Swedish)
  • PrinceXML: Hyphenation possible (I think they use the TeX algorithm and OpenOffice dictionaries)

It’s important that you set the »lang« attribute for your document, because the renderer picks its hyphenation dictionary based on it. The attribute can be overridden on individual paragraphs.
Soft hyphens (&shy;) can be used to influence how words are hyphenated.

All of the converters that supported hyphenation produced fairly good-looking justified text.
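
The three values of the hyphens property can be combined with the :lang() selector. The sketch below assumes English and German text, and the class name »poem« is made up for illustration. The hyphenate-limit-chars property comes from the CSS Text Level 4 draft; renderers that do not know it will simply ignore the line.

```css
/* Automatic hyphenation for paragraphs whose language is declared
   via the lang attribute */
p:lang(en),
p:lang(de) {
  hyphens: auto;
  /* Draft property, support varies: only hyphenate words of at
     least 6 characters, leaving at least 3 before and 2 after
     the break */
  hyphenate-limit-chars: 6 3 2;
}

/* Only break at explicit soft hyphens (&shy;);
   the class name is hypothetical */
p.poem {
  hyphens: manual;
}

/* Never hyphenate headings */
h1 {
  hyphens: none;
}
```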

Filling the page from top to bottom automatically

So this is the single point that did not work as I wished. Because I’m avoiding widows and orphans, not every page can be completely filled with lines of text. If this were an InDesign project I’d fiddle with the kerning and the width of the letters to shorten or lengthen certain paragraphs and thus fill the page completely. I could do this manually for every book in this project, but this would defeat the purpose of the whole thing.

Ideally this would be done by the renderer:

  • Is the type-area filled with lines?
  • If not, try to fill it by shortening or lengthening preceding paragraphs using kerning and font width (within given parameters, of course)

Headings can start on a new right page

easy:

h1 {
  page-break-before: right;
}

This will start every h1 on a new right-hand page, leaving the preceding left-hand page empty if necessary. Which brings us to …

Empty pages

CSS Paged Media has a page selector for empty pages: »:blank«.

Note: Some of the PDF renderers (e.g. PDFreactor) do not support this yet.
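
One typical use of »:blank« is to keep page numbers and running headers off the empty pages. A minimal sketch, assuming those elements live in the page areas named below (the page areas themselves are described in the pagination section); adjust the list to whichever areas your stylesheet actually fills:

```css
/* Clear generated content on pages that are empty
   because of a forced page break */
@page :blank {
  @top-left     { content: none; }
  @top-right    { content: none; }
  @bottom-left  { content: none; }
  @bottom-right { content: none; }
}
```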

Pagination

Pagination can be set up with CSS counters, generated content and page areas. First we set up a counter (I gave it the name »page-counter«). This counter will be incremented by one on every page.

@page {
  counter-increment: page-counter;
}

Now we need a place to display this counter. CSS Paged Media has 16 page areas in the margin and one for the main page itself:

top-left-corner     top-left     top-center     top-right     top-right-corner
left-top                                                      right-top
left-middle                       main page area              right-middle
left-bottom                                                   right-bottom
bottom-left-corner  bottom-left  bottom-center  bottom-right  bottom-right-corner

So let’s display the page counter on »bottom-right« or »bottom-left« depending on the page:

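/* Right-hand pages: page number in the outer (right) corner */
@page :right {
  @bottom-right {
    content: counter(page-counter);
  }
}

/* Left-hand pages: page number in the outer (left) corner */
@page :left {
  @bottom-left {
    content: counter(page-counter);
  }
}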
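
The same page areas can carry the running headers from the checklist at the top. A minimal sketch, assuming each chapter title sits in an h1, and using the string-set property and the string() function from the CSS Generated Content for Paged Media draft (support varies between renderers). The name »chapter-title« and the »Book title« text are illustrative, not taken from the project.

```css
/* Store the text of the most recent h1 in a named string */
h1 {
  string-set: chapter-title content();
}

/* Left-hand pages: a fixed header, here placeholder text */
@page :left {
  @top-left {
    content: "Book title";
  }
}

/* Right-hand pages: the title of the current chapter */
@page :right {
  @top-right {
    content: string(chapter-title);
  }
}
```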

Conclusion

It worked pretty well. I’d definitely do another project like this one. With a basic understanding of HTML & CSS and the documentation all around the web, this task was fun and easy to do. Granted, this was »only« a novel.

Would you rather have used InDesign?

Some things are so easily set up with CSS that I wish InDesign would adopt this kind of workflow.
For example: our book has a first-line indent on every paragraph, except the ones that follow a heading or an ornament. This can be set up with one simple rule:

h1 + p.normal, p.ornament + p.normal {
  text-indent: 0;
}
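
For context, the rules that the exception builds on could look like the sketch below. The class names »normal« and »ornament« come from the rule above; the indent and spacing values are illustrative. Because a selector like »h1 + p.normal« is more specific than »p.normal«, the exception wins regardless of the order in which the rules appear.

```css
/* Default for body paragraphs: first-line indent,
   no vertical gap between paragraphs */
p.normal {
  margin: 0;
  text-indent: 1em;   /* illustrative */
}

/* Ornaments separate scenes: centred, no indent,
   some space above and below */
p.ornament {
  margin: 1em 0;      /* illustrative */
  text-align: center;
  text-indent: 0;
}
```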

Done. No search and replace or applying custom styles to paragraphs. So much easier.

Why I chose Docraptor for Production

  • Flexible pricing model (monthly fee for a given number of created documents)
  • Support for @page :blank
  • Good hyphenation
