I frequently use Calibre to convert academic papers from PDF to epub, so I can read them on my tablet. But the conversion often results in weird formatting artifacts.

I know that the conversion dialog has a Search & Replace section, where I can list regular expressions that remove things like page numbers. This post has some helpful expressions for that. For example:

\d+<br>\n|\d+\w<br>\n|\w\d+<br>\n
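
As far as I understand, Calibre's Search & Replace uses Python-style regular expressions, so an expression like the one above can be tried out in plain Python before saving it. Here is a rough sketch; the sample text is made up, and it assumes the intermediate text has page numbers sitting right before a `<br>` and a newline, which may not match what every PDF produces.

```python
import re

# The expression quoted above, unchanged.
PAGE_NUMBER = re.compile(r"\d+<br>\n|\d+\w<br>\n|\w\d+<br>\n")

# Made-up sample: a stray page number "12" on a line of its own.
sample = (
    "the results are discussed below.<br>\n"
    "12<br>\n"
    "In the following section we<br>\n"
)

# Replace every match with nothing, as an empty replacement field would.
cleaned = PAGE_NUMBER.sub("", sample)
print(cleaned)

# Caveat: the expression is not anchored to the start of a line, so a
# number that merely ends a line of running text is removed as well.
risky = "first published in 2019<br>\nby the"
print(PAGE_NUMBER.sub("", risky))
```

The second case shows why it is worth testing an expression on a few real pages first: a year at the end of a line gets stripped along with the line break.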

Does anyone have a list of other expressions that they use when converting academic papers from PDF to epub? Especially any good expressions for handling footnotes? I would love to collect a bunch, save them, and then simply load them every time I have to do this.

Thanks all!

  • nobloat
    11 months ago

    I don’t have any ideas, but I want to thank you for the link. This is not related to converting to epub, but have you looked at K2pdfopt? It has options to optimize PDFs for e-readers. It doesn’t work all the time, but it’s worth checking out. Last I checked, it had templates for dealing with scientific papers.