There are a lot of scanned books in the DjVu format, but many of them are not professionally scanned. One DjVu book I came across has inconsistent page sizes (the size changes every few pages). There are nice PDF scaling programs, but as far as I know no such program exists for DjVu.
So I made my own. The Python 3 program searches for INFO chunks, which specify the pixel size and DPI of each page image, and changes the DPI so that every page ends up with the same physical width. The program could be improved: it currently reads the entire file into memory for processing. However, this is not a problem for my files, so I'm not going to change it.
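To make the arithmetic concrete, here is a minimal sketch of decoding an INFO payload and computing the new DPI. The field layout (big-endian width and height, two version bytes, little-endian DPI, then gamma and flags) is my reading of the DjVu format and matches the offsets the script uses. The helper names `parse_info` and `dpi_for_width` and the sample numbers are made up for illustration.

```python
import struct

def parse_info(payload):
    """Decode width, height and DPI from the first 8 bytes of an INFO payload."""
    width, height = struct.unpack('>HH', payload[0:4])  # big-endian pixel counts
    (dpi,) = struct.unpack('<H', payload[6:8])          # little-endian DPI
    return width, height, dpi

def dpi_for_width(width_px, page_width_in):
    """DPI that makes a page of width_px pixels page_width_in inches wide."""
    return int(width_px / page_width_in)

# Synthetic payload (assumed values): 2550 x 3300 px, version 26.0,
# 300 dpi, gamma byte 22, flags byte 1.
payload = bytearray(
    struct.pack('>HHBB', 2550, 3300, 26, 0)
    + struct.pack('<H', 300)
    + bytes([22, 1])
)

width, height, old_dpi = parse_info(payload)
new_dpi = dpi_for_width(width, 6.0)           # 2550 px / 6.0 in = 425 dpi
payload[6:8] = struct.pack('<H', new_dpi)     # overwrite only the DPI field
```

Only the two DPI bytes change, so the chunk length and every other offset in the file stay the same. That is why the patch can be done in place.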
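# usage: python3 djvu.py [FILE] [PAGE_WIDTH(IN)]
import sys
import struct

def main():
    args = sys.argv
    if len(args) != 3:
        print('filename? page_width?')
        return
    filename = args[1]
    page_width = float(args[2])  # target page width in inches
    # Read the whole file into memory; fine for typical book sizes.
    with open(filename, mode='rb') as f:
        file_bytearray = bytearray(f.read())
    # Leave room for the 8-byte chunk header plus the 8 payload bytes we touch.
    for i in range(len(file_bytearray) - 15):
        # Chunk ID 'INFO' followed by the three zero high bytes of its length.
        if file_bytearray[i:i+7] == b'INFO\0\0\0':
            info_start = i + 8  # skip chunk ID (4 bytes) and length (4 bytes)
            # Width and height are big-endian 16-bit pixel counts.
            (width, height) = struct.unpack('>HH', file_bytearray[info_start:info_start+4])
            desired_dpi = int(width / page_width)
            # The DPI field is stored little-endian at offset 6 of the payload.
            file_bytearray[info_start+6:info_start+8] = struct.pack('<H', desired_dpi)
    with open(filename + 'mod.djvu', mode='wb') as f:
        f.write(file_bytearray)

if __name__ == '__main__':
    main()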
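If reading the whole file into memory ever did become a problem, one option would be to memory-map the file and patch the DPI bytes directly. The following is only a sketch of that idea, not something I have run on real books. The function name `rescale_in_place` is hypothetical, and unlike the script above it modifies the original file instead of writing a copy, so work on a backup.

```python
import mmap
import struct

def rescale_in_place(path, page_width_in):
    """Patch the DPI of every INFO chunk in the file; return the number patched."""
    marker = b'INFO\0\0\0'  # chunk ID plus the three zero high bytes of its length
    patched = 0
    with open(path, 'r+b') as f, mmap.mmap(f.fileno(), 0) as mm:
        pos = mm.find(marker)
        while pos != -1:
            start = pos + 8  # skip chunk ID and 4-byte length
            if start + 8 <= len(mm):
                (width,) = struct.unpack('>H', mm[start:start+2])
                # Same-length slice assignment writes through to the file.
                mm[start+6:start+8] = struct.pack('<H', int(width / page_width_in))
                patched += 1
            pos = mm.find(marker, pos + 1)
    return patched
```

Usage would look like `rescale_in_place('book.djvu', 6.0)`. The operating system pages the file in on demand, so memory use stays small even for large scans.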
[Screenshots: the same book before and after rescaling]