Typesetting a novel for Print with HTML & CSS Paged Media

I’ve been following new developments around CSS Paged Media and PrintCSS for a while now, but I never had the chance to test them in a real-world project. Until recently. The publishing house I work for wants to produce personalized novels. Customers can customize their novel in a web frontend written in Python. The output is plain HTML5.

To avo­id over­head I set out to type­set the dif­fe­rent novels with web tech­no­lo­gies. The­se are my findings:

tl;dr: It pretty much went smoothly. CSS Paged Media is even more convenient than InDesign in some respects.
Docraptor.com (with Prince XML as the engine) was my converter of choice.
I only wished for one missing feature; otherwise typesetting was a breeze.

The Dream (and the reality)

  • Auto­ma­tic gene­ra­ti­on of print rea­dy PDF from HTML & CSS (✔)
  • Easy set­up of pages and type-areas (✔)
  • Cus­tom Fonts (✔)
  • Widows and orphans can be avoided (✔)
  • Jus­ti­fied text has no rivers (✔)
  • Hyphena­ti­on is hand­led pro­per­ly and automatically (✔)
  • Fill the given type-area with lines of text, from left to right and top to bottom (left to right: ✔ | top to bottom: ⚠)
  • Hea­dings can start on a new right page (✔)
  • Pages can be empty (✔)
  • Run­ning headers (✔)
  • Pagi­na­ti­on (✔)

So let’s see how far we can get.

Generate PDF from HTML & CSS

To output print-ready PDFs you need a converter engine. After some research I came up with several possibilities. The candidates discussed in this post are Vivliostyle, Weasyprint, PDFreactor and PrinceXML (feel free to suggest alternatives).

Page Setup

Page setup is fairly easy with CSS Paged Media. You can choose between standard paper sizes like »Letter« or »A4«, or define a completely custom page size and type area via the @page rule.

@page {
  size: 125mm 187mm;
  margin: 15mm 15mm 16mm 15mm;
}
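
For a bound book the inner margin often differs from the outer one. A minimal sketch of how that could look with the :left and :right page selectors; the 18mm and 12mm values are made up for illustration and are not the margins of the actual book:

```css
/* Shared page size and vertical margins (taken from the rule above). */
@page {
  size: 125mm 187mm;
  margin-top: 15mm;
  margin-bottom: 16mm;
}

/* Right-hand pages: the binding is on the left. */
@page :right {
  margin-left: 18mm;  /* inner margin, hypothetical value */
  margin-right: 12mm; /* outer margin, hypothetical value */
}

/* Left-hand pages: the binding is on the right. */
@page :left {
  margin-left: 12mm;  /* outer margin, hypothetical value */
  margin-right: 18mm; /* inner margin, hypothetical value */
}
```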

Custom Fonts

easy:

@font-face {
  font-family: Garamond;
  font-weight: normal;
  src: url(fonts/AGaramondPro-Regular.otf) format('opentype');
}

Embed­ding cus­tom fonts works as it should.
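
A novel usually needs at least an italic cut as well. A sketch of how a second face could be registered under the same family name and then applied; the italic file name is hypothetical and only follows the naming of the regular file above:

```css
/* Italic cut of the same family; the file name is an assumption. */
@font-face {
  font-family: Garamond;
  font-weight: normal;
  font-style: italic;
  src: url(fonts/AGaramondPro-Italic.otf) format('opentype');
}

/* Use the embedded family for body text, with a generic fallback. */
body {
  font-family: Garamond, serif;
}

/* Emphasis now resolves to the real italic face, not a slanted regular. */
em {
  font-style: italic;
}
```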

Widows & Orphans

Avo­i­ding widows and orphans can be done with this code.

p.example {
  widows: 2;
  orphans: 2;
}

Just as a refresher, as CSS-Tricks phrases it:

widows = mini­mum num­ber of lines in a para­graph split on the new page.
orphans = mini­mum num­ber of lines in a para­graph split on the old page.

Justified Text

Jus­ti­fi­ca­ti­on can be easi­ly set in the CSS.

p.example {
  text-align: justify;
}

This was easy. Are we done? Not quite.

Nice-looking justified text is intrinsically linked with hyphenation. Without hyphens your text will have big gaps and sport the occasional river, which is very ugly and hard to read.

So on to …

Hyphenation

Hyphena­ti­on is hand­led by the ren­de­rer. You can switch it on and off via a CSS-rule.

p.example {
  hyphens: auto;
}

Different renderers use different algorithms and dictionaries for hyphenation, so the output will vary:

  • Vivliostyle: No hyphenation at the moment (uses WebKit as rendering engine; as soon as WebKit does hyphenation they will, too)
  • Weasyprint: Experimental hyphenation possible via the vendor-specific prefix »-weasy-hyphens« (uses pyphen as hyphenation engine)
  • PDFreactor: Hyphenation possible (dictionaries for the following languages: Bulgarian, Catalan, Danish, German (new orthography), German (traditional orthography), Modern Greek, English (US), English (GB), Estonian, Galician, Interlingua, Indonesian (Bahasa Indonesia), Icelandic, Italian, Kurmanji (Northern Kurdish), Latin, Dutch, Polish, Russian, Swedish)
  • PrinceXML: Hyphenation possible (I think they use the TeX algorithm and OpenOffice dictionaries)

It’s important that you set the »lang« attribute for your document, because the renderer uses it to pick the hyphenation dictionary. Individual paragraphs can override it with their own »lang« attribute.
Soft hyphens (&shy;) can be used to influence how words are hyphenated.
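
A sketch of how language and hyphenation could be combined in the stylesheet, assuming the document is marked up in one main language and a few paragraphs carry lang="en"; the selectors and the policy for English passages are illustrative, not taken from the actual project:

```css
/* Hyphenate body text; the dictionary follows each element's language. */
p {
  hyphens: auto;
}

/* Assumed policy: English passages only break at explicit soft hyphens. */
p:lang(en) {
  hyphens: manual;
}

/* Headings are short, so keep them unhyphenated. */
h1 {
  hyphens: none;
}
```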

All the con­ver­ters that sup­port­ed hyphena­ti­on pro­du­ced fair­ly good loo­king jus­ti­fied text.

Filling the page from top to bottom automatically

This was the single point that did not work as I wished. Because I’m avoiding widows and orphans, not every page can be filled completely with lines of text. If this were an InDesign project, I’d fiddle with the kerning and the width of the letters to shorten or lengthen certain paragraphs and thus fill the page completely. I could do this manually for every book in this project, but that would defeat the purpose of the whole thing.

Ide­al­ly this would be done by the renderer:

  • Is the type-​area fil­led with lines?
  • If not, shorten or lengthen preceding paragraphs by adjusting kerning and font width until it is filled (within given parameters, of course)

Headings can start on a new right page

easy:

h1 {
  page-break-before: right;
}
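
page-break-before is the older CSS 2.1 property name; the CSS Fragmentation module calls the same thing break-before. A sketch that states both, plus a rule that keeps the heading together with the text after it. Whether a given renderer honours the newer names is an assumption worth checking against its documentation:

```css
h1 {
  page-break-before: right; /* CSS 2.1 name */
  break-before: right;      /* newer name from CSS Fragmentation */
  page-break-after: avoid;  /* keep the heading with the following text */
  break-after: avoid;
}
```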

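This starts each h1 on a new right-hand page. If the preceding text already ends on a right-hand page, the renderer inserts an empty left-hand page in between. Which brings us to …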

Empty pages

CSS Paged Media has a page selector for empty pages: »:blank«.

Note: Some PDF renderers (e.g. PDFreactor) do not support this yet.
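
A typical use is to keep empty pages really empty. A minimal sketch, assuming page numbers sit in the bottom-left and bottom-right margin boxes (as in the Pagination section) and a running header in top-center; the choice of top-center is an assumption:

```css
/* Pages left empty by a forced break: no page number, no running header. */
@page :blank {
  @top-center   { content: none; }
  @bottom-left  { content: none; }
  @bottom-right { content: none; }
}
```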

Pagination

Pagination can be set up with CSS counters, generated content and page areas. First we set up a counter (I gave it the name »page-counter«). This counter is incremented by one on every page.

@page {
  counter-increment: page-counter;
}

Now we need a place to dis­play this coun­ter. CSS Paged Media has 16 page are­as in the mar­gin and one for the main page itself:

top-left-corner    | top-left    | top-center    | top-right    | top-right-corner
left-top           | main page area                             | right-top
left-middle        |                                            | right-middle
left-bottom        |                                            | right-bottom
bottom-left-corner | bottom-left | bottom-center | bottom-right | bottom-right-corner

So let’s dis­play the page coun­ter on »bottom-​right« or »bottom-​left« depen­ding on the page:

@page :right {
  @bottom-right {
    content: counter(page-counter);
  }
}

@page :left {
  @bottom-left {
    content: counter(page-counter);
  }
}
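
The running headers from the checklist use the same margin boxes. A minimal sketch, assuming chapter titles are h1 elements and the renderer supports named strings (string-set and string() from the CSS Generated Content for Paged Media draft); the name »chapter-title«, the top-center position and the font values are made up for illustration:

```css
/* Capture the text of each chapter heading in a named string. */
h1 {
  string-set: chapter-title content();
}

/* Print the current chapter title at the top of right-hand pages. */
@page :right {
  @top-center {
    content: string(chapter-title);
    font-style: italic; /* assumed styling */
    font-size: 9pt;     /* assumed styling */
  }
}
```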

Conclusion

It worked pretty well. I’d definitely do another project like this one. With a basic understanding of HTML & CSS and the documentation all around the web, this task was fun and easy to do. Granted, this was »only« a novel.

Would you rather have used InDesign?

Some things are so easily set up with CSS that I wish InDesign would adopt this kind of workflow.
For example: Our book has first-line indents on every paragraph, except the ones that follow a headline or an ornament. This can be set up with one simple rule:

h1 + p.normal, p.ornament + p.normal {
  text-indent: 0;
}

Done. No search and replace, no applying custom styles to paragraphs. So much easier.
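
For context, the rule above only overrides a general indent. A sketch of how the two rules might sit together; the 1em indent and the zero paragraph margin are assumptions, only the selectors come from the rule above:

```css
/* Default: every normal paragraph gets a first-line indent and no gap. */
p.normal {
  margin: 0;
  text-indent: 1em; /* assumed value */
}

/* Exceptions: the first paragraph after a heading or an ornament. */
h1 + p.normal,
p.ornament + p.normal {
  text-indent: 0;
}
```

The exception wins without any extra tricks because its selectors are more specific than the plain p.normal rule.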

Why I chose Docraptor for Production

  • Flexible pricing model (monthly fee for a given number of created documents)
  • Support for @page :blank
  • Good hyphenation
