Documents here are kept in a minimal subset of PDF format, just using it as a
container for lossless Group 4 fax compression (ITU-T recommendation T.6) images.
Contributions are normally post-processed by tools to put them in exactly this format,
so that all of the documents here are the same and can be burst at some point in the
future when OCR technology is mature enough to do a good job of recognition.
Documents were scanned using a Ricoh IS520 30ppm duplex production scanner from the late 90's through 2007.
Conversion to higher performance Kodak DS 2500D scanning occured in July, 2007.
The 2500D is an OEM version of the Panasonic KV-S2055 scanner.
In 2008, the Kodak was replaced by a Panasonic KV-S3065W, which
is capable of color 600dpi scanning, and has the capability to scan
sheets several feet long.
Post-processing is done using Lemkesoft's Graphic Converter
TIFF to PDF conversion is done using Eric Smith's tumble
The preferred form for any contributed text scan is as a collection of lossless
Group 4 fax compression (ITU-T recommendation T.6) images saved as TIFF
files with a minium scan resolution of 400 dpi.
Lower scan resolutions produce noticable artifacts if a page needs to be
straightened in post-processing.
Lossy compression formats, such as JPEG, should NEVER be used to save pages
of text, since the compression format destroys edge resolution and contrast
would make it difficult to OCR in the future.
Document Scanning Station
Tape processing over the years
These were taken in rooms that no longer exist at CHM, ca. 2006. The rooms were demolished when the Revolution
exhibit was built and are roughly where the gift shop and orientation theatre are now. You can
see the four XServe RAIDs that are finally being retired in 2019.
Where does the source material come from?
Most of the documents are from my personal collection that I have either bought
or been given over the course of many decades in the computer industry, or have been
loaned to me for scanning.
I have a VERY
large backlog of material to scan and don't actively sollicit material to work on.
If I do decide to scan something from a donor
I will return it if requested.
Unless it is a very rare document I probably
won't accept something that requires manual scanning, since scanning time in my
day is limited.
I do not personally archive any paper that has been scanned.
The scanning process I use is destructive. Bindings are removed and paper is recycled.
Original documents that are still in good condition may be donated to the Computer History Museum for archiving, depending on
if they are within CHM's collecting scope.
The CHM running lot number for my donated documents is
This project was started to downsize my collection of paper in the early 90's and continues
to be its primary purpose.
and.. the site looks this way for a reason, to leave it static and easy to mirror, so don't remind me that it looks like it's from 1995