Last week I got a PDF file whose pages were JPEG images. As the images were quite large (2095x2995), my lBook V3 had trouble flipping them (15 seconds a page).
After some consideration I decided to convert the PDF to DJVU format.
What can be easier?
- Extract JPEG pages from PDF
- Transform JPEG to PBM
- Transform PBM to DJVU
- Compile the individual DJVU files into one book
The JPEGs were extracted without a problem:
$ pdfimages -j source.pdf output
which gives a set of output-nnn.jpg files in the current directory.
And here comes the trouble.
Each JPEG contained a scanned color (RGB) or grayscale book page at a resolution of about 2100x3000. Each image showed not only the front of the page but the back too, as the sheets of paper were quite thin and the scanner saw through them. It looked like this:

I had once scanned a paper book to read on my lBook reader. That time I simply told XSane to scan in “black & white” mode, so the result had only the letters from the front of the page on a clean white background, with nothing showing through from the back.
After several minutes of searching I found the name of the function that reduces the number of colors in an image: Posterize.
I tried the function in GIMP and was quite pleased with the result. As there were hundreds of images to convert, I needed a more automagic solution. “What can be easier?” I asked myself again and ran man convert (convert is part of the ImageMagick suite). A quick search of the manual turned up the -posterize option, and all the JPEGs were quickly converted by a shell for loop.
Alas, the result was quite different from the one produced by GIMP.
Image By GIMP:

Image By Convert:

The convert utility reduced the number of colors but emulated the underlying image with dithering. To add insult to injury, the cjb2 utility from the djvulibre suite wasn’t able to tell letters from background, so the result was a mess.
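For what it's worth, a hard threshold is one way to get two levels out of ImageMagick without any dithering. The following is only a sketch: threshold_all is a made-up helper name, the 50% cut-off is an assumption that would need tuning per scan, and I can't promise the output matches what GIMP's Posterize produced.

```shell
# Convert every JPEG in the current directory to a bitonal PBM using a
# hard threshold instead of -posterize, so no dither pattern is added.
# threshold_all is a hypothetical name; 50% is an assumed cut-off.
threshold_all() {
    for f in *.jpg; do
        [ -e "$f" ] || continue   # the glob matched nothing
        convert "$f" -colorspace Gray -threshold 50% "${f%.jpg}.pbm"
    done
}
```

Calling threshold_all in the directory with the extracted pages would write output-nnn.pbm next to each output-nnn.jpg.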
I must admit that I don’t know how to use the ImageMagick utilities properly.
So I tried to make GIMP convert images automagically. That cost me several hours of debugging and cursing.
Beware of the snakes there.
Note to the brave: don’t always trust the tutorials and the reference. Functions that the reference says return TRUE or FALSE in fact return (1) or (0), a list containing one element, and both lists are considered TRUE. Hence the (if (= 1 (car (function ...))) ...) syntax.
First run GIMP in batch mode:
$ gimp -b -
Paste the script there:
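; Convert every file matching filemask to a two-level image saved next to it.
(define (batch-jpeg-pbm filemask)
  (let* ((filelist (cadr (file-glob filemask 1))))
    (while (not (null? filelist))
      (let* ((filename (car filelist))
             (image (car (gimp-file-load RUN-NONINTERACTIVE
                                         filename filename)))
             (drawable (car (gimp-image-get-active-layer image))))
        ; Predicates return a one-element list, so unwrap it with car.
        (if (= 1 (car (gimp-drawable-is-rgb drawable)))
            (gimp-image-convert-grayscale image))
        ; Two levels on a grayscale image: black and white.
        (gimp-posterize drawable 2)
        (plug-in-autocrop RUN-NONINTERACTIVE image drawable)
        ; Swap the 4-character extension (".jpg") for ".pbm".
        (let ((newfilename (string-append (substring filename 0
                                            (- (string-length filename) 4)) ".pbm")))
          ; The final 1 asks for raw (binary) output.
          (file-pgm-save RUN-NONINTERACTIVE image drawable
                         newfilename newfilename 1))
        ; Free the image before moving on to the next file.
        (gimp-image-delete image))
      (set! filelist (cdr filelist)))))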
And run it like this:
> (batch-jpeg-pbm "/path/to/your/images/*.jpg")
After several minutes there will be a bitonal .pbm file for each JPEG, which the cjb2 utility can convert to DJVU like this:
$ for i in *.pbm; do cjb2 "$i" "$i.djvu"; done
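The loop above names each page like output-000.pbm.djvu and redoes every page on each run. Here is a minimal sketch of a variant that drops the .pbm part from the name and skips pages that are already converted. convert_pages is a made-up name, and it passes no cjb2 options beyond the two file names.

```shell
# Convert each PBM in the current directory to a single-page DJVU.
# convert_pages is a hypothetical helper, not part of djvulibre.
convert_pages() {
    for i in *.pbm; do
        [ -e "$i" ] || continue            # the glob matched nothing
        out="${i%.pbm}.djvu"               # output-000.pbm -> output-000.djvu
        if [ -e "$out" ]; then
            continue                       # already converted on an earlier run
        fi
        cjb2 "$i" "$out" || echo "cjb2 failed on $i" >&2
    done
}
```

Running convert_pages a second time after an interrupted batch would only process the pages that are still missing.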
The next command compiles all the DJVU files into one book:
$ djvm -c book.djvu *.djvu
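Rerunning that command in the same directory would also feed the previous book.djvu back in, because *.djvu matches it too. Below is a sketch that only picks up the page files, assuming they still carry the output- prefix given to pdfimages earlier. build_book is a hypothetical helper name. The shell expands the glob in sorted order, and the zero-padded page numbers keep the pages in sequence.

```shell
# Bundle the per-page DJVU files into one book, ignoring any old book file.
# build_book is a hypothetical helper; the output- prefix is assumed.
build_book() {
    book=${1:-book.djvu}
    set -- output-*.djvu                   # page files only, in sorted order
    if [ ! -e "$1" ]; then
        echo "no page files found" >&2
        return 1
    fi
    djvm -c "$book" "$@"
}
```

With no argument, build_book writes book.djvu. A different target name can be passed as the first argument.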
Read more about Script-Fu