As it is not at all clear from the help-file how to make use of SkanKromSator, I thought I ought to write down the way I’ve been using it, as it has seemed successful.
What you put into the program is a collection of the scans of a book’s pages. What you get out is a set of page-images that have been beautifully straightened and have also been cropped so as to be all the same size, with a neat margin round the text on all four sides. This is perfect for making a pdf of the book’s scans, to help you while editing your OCRed text.
Make sure the scans are clear of mess outside the text area. You can use IrfanView to clean any of this up. Just draw a box round the mess, and hit control-X. This will fill the box with white, if you have set white to be the control-X fill-colour.
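For anyone who would rather see the idea than click through it, here is a minimal sketch of what that white-fill does. It is not IrfanView's own code: the page is assumed to be a plain grid of grey values (0 for black, 255 for white), and the function name `fill_box_white` is made up for illustration.

```python
WHITE = 255
BLACK = 0

def fill_box_white(pixels, left, top, right, bottom):
    """Paint the box white in place; right and bottom are exclusive."""
    for y in range(top, bottom):
        for x in range(left, right):
            pixels[y][x] = WHITE
    return pixels

# A tiny all-white "page", 6 pixels wide and 4 high, with one stray speck
# in the top-right corner, outside where the text would be.
page = [[WHITE] * 6 for _ in range(4)]
page[0][5] = BLACK

# Draw a box round the speck and fill it with white.
fill_box_white(page, left=4, top=0, right=6, bottom=2)
```

With a real scan you would load the pixels with an imaging library first, but the fill itself is no more than this.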
SkanKromSator is available as a free download in an 11.393-megabyte .RAR file. RAR is just another form of compressed file, like ZIP, and it can be expanded by almost any recent unzipping utility, such as FreeZip or 7-Zip.
Using Windows Explorer, find the file called sk.exe in the expanded folder. Actually there are several, one of which is labelled “Latest Version”, but it doesn’t seem to work. Find the one which is 4.735 megabytes, dated January 11 2008. Click on this one, and you will see the utility in all its glory.
At the top left there is a pane, and above that File, Edit, etc. Click on File, navigate to where your input files are, and they will appear in the pane with check-boxes. These will be filled in automatically, as we shall see shortly.
Below the pane you can see Pages, Book, Files, Options, etc.
Pages: If you have single pages, check Deskew and Despeckle. That’s the default, anyway.
Files: identify the output directory (I think you need to have created this already, for example in Explorer). Then set the names to be given to the output files. For example, if you wanted the pages to be called book007.tif, book008.tif, book009.tif, etc, you would set the output prefix to book, start from 7 with step 1, and use a name-length of 3. The default output format is TIFF G4FAX, which we would call “Group 4”, a highly compressed TIFF format. For various reasons I select “TIFF RLE Compress”, which we would call “Tiff packed”. You remember that RLE means Run-Length-Encoded, just the same as “Packed”.
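If the naming settings seem fiddly, this little sketch shows how the four values combine. It only mimics the scheme described above and is not the program's own code: the function `output_names` is hypothetical, and I am assuming that name-length means the number of digits, padded with leading zeros, as the book007 example suggests.

```python
def output_names(prefix, start, step, name_length, count, extension=".tif"):
    """Return the names of `count` consecutive output pages."""
    names = []
    for i in range(count):
        number = start + i * step
        # Pad the page number with leading zeros to name_length digits.
        names.append(f"{prefix}{number:0{name_length}d}{extension}")
    return names

# Prefix "book", starting from 7 with step 1, name-length 3:
print(output_names("book", 7, 1, 3, 3))
# prints ['book007.tif', 'book008.tif', 'book009.tif']
```

Zero-padded numbers of equal length also mean the files sort into page order in Explorer.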
Now look to the top right of the screen, where you will see a row of icons. The fifth one is called Process, but don’t hit it yet. You need to set certain simple things with the icon just to the left of Process.
When you click on this you will see a box named “Draft Kromsate”. Just use “Options”, not “Preprocess” or “Advanced”. You just need to set “Cutting Lines”, by unchecking all five of the options provided. Then click on “OK”, and you will see those checkboxes next to the file names being filled in automatically. When that is done, click on “Process”, and you will see the program muttering away to itself.
When it is through with that you’ll see that it reports the names of the five longest files and the five widest. You might need to know this if the straightened book is not quite right. When you click on this report you will be able to see the output pages. Hitting “C” moves you to the next page right, and hitting “X” moves you to the next page left. Right away you will know if the work has been done well, because all the pages will look beautiful. If the margin is too wide on one of the four sides it is because the book’s input pages had some scuff somewhere, and looking right through the book will let you find where it is.
If the output version of the book is perfect, as it probably will be, you can just carry on with your work. But if a little cleaning up on the input files is indicated, do that, and start completely afresh with the program again. I’ve done five books in the week since I knew about this program, and this has only happened to me once.
The actual processing will take typically five minutes, much the same as the time it takes to make a nice cup of coffee. With setting up and checking the output we might be talking of ten to fifteen minutes. Not bad.
I hope this helps. Best of luck with it. Nick Hodson