Word Locations in Laserfiche Snapshot

November 2, 2005 | KB: 1011056
Snapshot 7

Summary

Laserfiche uses word location information to associate each word in a document's text with the position of that word on the document's image files. When a document is printed and sent to a Laserfiche repository using Snapshot, the text stream generated by the printer drivers is used to determine word locations.
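
As an illustration of the idea only, word location information can be pictured as a list of records, each tying a span of the document text to a rectangle on a page image. The sketch below is a conceptual model, not Laserfiche's actual storage format or API; the WordLocation class, its fields, and the find_word helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordLocation:
    """Hypothetical record tying one word of text to a rectangle on a page image."""
    start: int    # offset of the word's first character in the document text
    length: int   # number of characters in the word
    page: int     # 1-based page (image) number
    left: int     # rectangle on the page image, in pixels (assumed unit)
    top: int
    width: int
    height: int


def find_word(text, word, locations):
    """Return the locations whose text span is exactly `word`."""
    return [
        loc for loc in locations
        if text[loc.start:loc.start + loc.length] == word
    ]


# Made-up sample data for a one-line document.
text = "Invoice total due"
locations = [
    WordLocation(start=0, length=7, page=1, left=100, top=50, width=140, height=24),
    WordLocation(start=8, length=5, page=1, left=250, top=50, width=90, height=24),
    WordLocation(start=14, length=3, page=1, left=350, top=50, width=60, height=24),
]

hits = find_word(text, "total", locations)
```

With records like these, a search hit in the text can be mapped back to a rectangle on the image, which is what allows a matched word to be highlighted on the page.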

More Information

When a document is printed and sent to Laserfiche using Snapshot, the printer drivers by default output the document's text as a text stream. This text stream is passed into Laserfiche and converted into Laserfiche word location information, so the document can have associated text without being OCRed in the Laserfiche repository.

Note: The printer drivers can only generate this text stream if there is text associated with the document. Some documents, such as image files or non-searchable PDFs, do not have a text stream, so the printer drivers cannot produce associated text for them.

If you do not want text to be extracted in this fashion, you can configure Snapshot so that word location data is not generated from the printer driver's text stream, and OCR the document in Laserfiche instead. In Snapshot 7.0 and 7.0.1, you will need to OCR the document manually. In Snapshot 7.0.2, an option to automate the OCR step is available.
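
The choices described above can be summarized as a small decision function. This is a sketch of the logic in this article only; Snapshot exposes these choices through its dialog boxes, not through code, and the function name, parameters, and return labels below are hypothetical.

```python
def text_source(version, use_print_job_text, has_text_stream):
    """Describe where a printed document's text and word locations come from.

    version            -- Snapshot version as a tuple, e.g. (7, 0, 1)
    use_print_job_text -- True if Snapshot is set to use the printer drivers' text
    has_text_stream    -- True if the source document has text associated with it
    """
    if use_print_job_text:
        if has_text_stream:
            return "print job"
        # Image files and non-searchable PDFs have no text stream to use.
        return "none"
    if version >= (7, 0, 2):
        # 7.0.2 can be configured to OCR the created images automatically.
        return "automatic OCR"
    # 7.0 and 7.0.1: the document must be OCRed manually in the repository.
    return "manual OCR"


print(text_source((7, 0, 1), use_print_job_text=False, has_text_stream=True))
print(text_source((7, 0, 2), use_print_job_text=True, has_text_stream=False))
```

The second call reflects the note above: when a document has no text stream, selecting the print job as the text source yields no text, and OCR is the only way to obtain it.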

To configure word location generation in Snapshot 7.0 and 7.0.1

  1. Open the Printer Properties dialog box of the Laserfiche Snapshot 7 printer. The location for this dialog box varies. When printing from an application, it is generally available by clicking the Properties or Settings button in the print dialog. When printing from the Laserfiche client, it can be opened by clicking the Properties button.
  2. Select the File Formats tab.
  3. Perform one of the following:
    • Select the Generate Text and Generate Text Locations options to configure Snapshot to generate word locations from the printer driver's text stream.
    • Clear both options if you do not want word locations to be generated in this fashion.
  4. Click OK.
  5. If you wish to OCR the printed TIFF images, select the document in the Laserfiche repository and select OCR/Extract Text/Index from the Action menu.

To configure word location generation in Snapshot 7.0.2

  1. Open the Laserfiche Snapshot Configuration utility.
  2. Select the Advanced tab.
  3. Perform one of the following:
    • In the Text Generation section, select Obtain text from the print job to use the text and word locations generated by the printer drivers.
    • Select Perform OCR on the images created for the print job to generate text and word locations by OCRing the images in the Laserfiche repository.
  4. Click OK.
  5. If you are generating text from the printer drivers, you will also need to configure the printer properties to generate text and text locations:
    1. Open the Printer Properties dialog box of the Laserfiche Snapshot 7 printer. The location for this dialog box varies. When printing from an application, it is generally available by clicking the Properties or Settings button in the print dialog. When printing from the Laserfiche client, it can be opened by clicking the Properties button.
    2. Select the File Formats tab.
    3. Select the Generate Text and Generate Text Locations options to configure Snapshot to generate word locations from the printer driver's text stream.
    4. Click OK.

Related Links

For information on word locations, please see the following Knowledge Base article:

1011057 Basic Information Regarding Word Location Information Generated Through the Laserfiche Client.

For an issue relating to word locations generated by Snapshot, please see the following Knowledge Base article:

1011012 Generating Text From PDFs May Produce Unintelligible Text.