pdf2djvu creates DjVu files from PDF files.
It's able to extract:
- graphics,
- text layer,
- hyperlinks,
- document outline (bookmarks),
- metadata (including XMP metadata).
pdf2djvu is developed primarily for Unix-like systems, but a Windows distribution is available too.
The following software is required to build pdf2djvu:
- C++11 compiler (recent Poppler releases might require a newer compiler);
- GNU make;
- DjVuLibre (≥ 3.5.25), both the library and the command line tools;
- Poppler (≥ 0.62), including xpdf headers;
- pkg-config.
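The minimum versions above can be checked before building. The following is a minimal sketch, not part of pdf2djvu: it assumes the libraries are registered with pkg-config under the module names `poppler` and `ddjvuapi` (an assumption, adjust for your system), and the helper names are made up for illustration.

```python
import shutil
import subprocess

# Minimum versions taken from the requirements list above.
# The pkg-config module names are assumptions; adjust them for your system.
MINIMUM_VERSIONS = {
    "ddjvuapi": "3.5.25",  # DjVuLibre (assumed module name)
    "poppler": "0.62",     # Poppler
}


def parse_version(text):
    """Turn '0.62.0' into (0, 62, 0); parsing stops at a non-numeric part."""
    parts = []
    for piece in text.strip().split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(found, minimum):
    """Compare two dotted version strings numerically."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(module):
    """Ask pkg-config for a module version; return None if unavailable."""
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--modversion", module],
        capture_output=True, text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    for module, minimum in MINIMUM_VERSIONS.items():
        found = installed_version(module)
        if found is None:
            print(f"{module}: not found")
        elif is_at_least(found, minimum):
            print(f"{module}: {found} (ok, needs >= {minimum})")
        else:
            print(f"{module}: {found} (too old, needs >= {minimum})")
```

Versions are compared as tuples of integers rather than as strings, so that a Poppler version such as 22.02.0 is correctly treated as newer than 0.62.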
The following software is recommended:
- gettext for internationalization;
- Exiv2 (≥ 0.21) and libuuid (part of util-linux or e2fsprogs) for correctly dealing with XMP metadata;
- GraphicsMagick for the --fg-colors=N option.
For the -j/--jobs option, the compiler must support OpenMP.
To run the tests, the following software is needed:
- Python 3.
To correctly convert some PDF files (mostly in Chinese, Japanese or Korean), the poppler-data package must be installed.
This repository initially just ported the tests of the original repository to Python 3.
The process involved the 2to3 tool and manual fixes afterwards to get the existing tests to pass.
Because the upstream repository has been archived (Issue #157), there will be no pull request. Upstream also never considered this port desirable: the corresponding issue (Issue #141) was marked as wontfix.
In the meantime, I try to incorporate the changes needed to stay compatible with the latest upstream Poppler.
Please note that I currently have no plans to do further work on pdf2djvu itself, as I do not use it myself. If someone wants to step up and maintain the C++ code, feel free to open an issue to discuss this.
- Add support for building against the latest upstream versions of Poppler.
- Tests require recent Python 3 versions.
- Migrate tests from nose to plain unittest stdlib module.
- Conform tests to PEP8 coding style.
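To illustrate the move from nose to the plain unittest module, here is a rough sketch of what such a conversion can look like. It is not taken from the actual test suite: `count_pages` and the test names are hypothetical and exist only to make the example self-contained.

```python
import unittest


def count_pages(listing):
    """Hypothetical helper: count the non-empty lines in a page listing."""
    return sum(1 for line in listing.splitlines() if line.strip())


# nose style (before): a module-level function using nose.tools helpers.
#
#     from nose.tools import assert_equal
#
#     def test_count_pages():
#         assert_equal(count_pages("p1\np2\n"), 2)


# unittest style (after): the same check as a TestCase method,
# with no third-party dependency.
class CountPagesTest(unittest.TestCase):

    def test_count_pages(self):
        self.assertEqual(count_pages("p1\np2\n"), 2)

    def test_blank_lines_are_ignored(self):
        self.assertEqual(count_pages("p1\n\n  \np2\n"), 2)
```

Tests written this way can be run with `python3 -m unittest` from the directory containing the test modules.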