Call Us: 877-FIG-LEAF
Request a Quote: google@figleaf.com

ABBYY Recognition Server for Document Recognition

Introducing a way to finally make your documents in image formats discoverable. Fig Leaf Software has partnered with ABBYY, a leading provider of document recognition, data capture, and linguistic software, to bring the ABBYY Recognition Server for the Google Search Appliance (GSA) to our customers.

The ABBYY Recognition Server works as a background optical character recognition (OCR) service, enabling the GSA to index full text content from documents in image formats.

Organizations need a fast and reliable way to sort through the large volumes of information they accumulate over time. Implementing the ABBYY Recognition Server for the Google Search Appliance lets you sort through your organization's information faster and work more efficiently, without spending a large amount of time on training.

How It Works

ABBYY Recognition Server consists of several components, which can be installed on the same computer or on different computers in a LAN. The main components are:

  • Server Manager - the central service component, which controls the document processing queue and orchestrates the work of Processing Stations and Verification Stations.
  • Processing Station - a service that performs recognition and document conversion.
  • Verification Station - a client station which provides an interface for proofreading the recognition results.
  • Remote Administration Console - a client console used for configuring and monitoring Recognition Server.

The document conversion process in Recognition Server can be divided into four logical parts:

1. Uploading documents

The user (or a client software program) uploads the images to one of the following network resources:
  • network folder (which is convenient in case of centralized processing of many image files);
  • FTP folder (e.g. if images should be uploaded from remote locations);
  • email folder (e.g. if users send their images for conversion by e-mail).

The Server Manager component of Recognition Server imports the images from the input source and arranges them in a queue for processing.
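
The import step can be pictured with a small sketch. This is not ABBYY code: the function name `build_queue`, the list of accepted extensions, and the oldest-first ordering are all assumptions made for illustration. It scans a network folder and arranges the image files it finds in a first-in, first-out queue.

```python
from collections import deque
from pathlib import Path

# Hypothetical set of accepted extensions; the product's real list may differ.
IMAGE_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".png", ".pdf"}


def build_queue(input_folder):
    """Return a FIFO queue of image files found in input_folder, oldest first."""
    folder = Path(input_folder)
    candidates = [
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    ]
    # Oldest files first; the file name breaks ties so the order is stable.
    candidates.sort(key=lambda p: (p.stat().st_mtime, p.name))
    return deque(candidates)
```

Calling `build_queue(r"\\server\scans")` (a made-up share name) would return the waiting files in the order they arrived, and non-image files in the folder are ignored.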

2. Processing

The processing of the images and PDF files is done on a Processing Station.

It is possible to connect several computers to the Server Manager as Processing Stations, and the Server Manager will balance the workload among these stations evenly. This will result in much faster processing of the documents.
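
How the Server Manager balances work internally is not described here, so the following is only a minimal sketch of one common approach: give each new job to the station with the fewest pages assigned so far. The function `assign_jobs` and the station names are hypothetical.

```python
import heapq


def assign_jobs(job_pages, station_names):
    """Assign each (job, page_count) pair to the least-loaded station."""
    # Heap entries are (pages_assigned_so_far, station_name).
    heap = [(0, name) for name in station_names]
    heapq.heapify(heap)
    assignment = {name: [] for name in station_names}
    for job, pages in job_pages:
        load, name = heapq.heappop(heap)   # station with the smallest load
        assignment[name].append(job)
        heapq.heappush(heap, (load + pages, name))
    return assignment


jobs = [("a", 10), ("b", 2), ("c", 3), ("d", 4)]
plan = assign_jobs(jobs, ["station-1", "station-2"])
```

In this example the 10-page job occupies one station while the three small jobs go to the other, so neither station sits idle while work is waiting.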

There are a few essential steps in the document conversion process. Recognition Server does them all automatically without any user assistance.

The first step is image pre-processing, in which some preliminary actions are performed on each page:

  • skew correction;
  • automatic detection of page orientation;
  • splitting of facing pages in the case of book scans;
  • noise and garbage removal.

Next comes the recognition step. The OCR and barcode recognition technologies implemented in Recognition Server deliver high accuracy and support processing of various types of text and the most popular 1D and 2D barcodes. The OCR process is supported by extensive language databases, which include:

  • 37 main languages with Latin and Cyrillic alphabets;
  • 133 additional languages with Latin, Cyrillic, Greek and other alphabets;
  • Old European languages;
  • Chinese, Japanese and Korean languages;
  • Hebrew;
  • Thai;
  • Chemical formulas, artificial and programming languages.

For images scanned in a batch, Recognition Server offers several document separation options. For example, the batch can be split into individual documents using blank separator sheets, barcode sheets, or barcodes stuck or printed on the first page of each document. Recognition Server performs document separation based on the separation rule and the recognized data. Each document will then be exported to a separate output file.
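
As an illustration of separation by blank sheets, here is a minimal sketch. It assumes each page is represented by its recognized text and that a page with no text is a separator; the function `split_batch` and that blank-page test are illustrative assumptions, not the product's actual rule engine.

```python
def split_batch(pages, is_separator):
    """Split a scanned batch into documents at every separator page."""
    documents = []
    current = []
    for page in pages:
        if is_separator(page):
            # A separator closes the current document and is itself dropped.
            if current:
                documents.append(current)
                current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


batch = ["invoice p1", "invoice p2", "", "letter p1", "", "", "memo p1"]
docs = split_batch(batch, is_separator=lambda text: text.strip() == "")
```

Passing a different `is_separator` function, for example one that checks for a recognized barcode value, would model the barcode-based options in the same way.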

3. Quality Control

Some important documents have to be recognized with exceptional accuracy, yet the quality of the scanned images may not be perfect: they may suffer from low resolution or unwanted noise. In this case it is very important to have a reliable quality check mechanism. Recognition Server provides options for both automatic quality control and visual verification.
  • Automatic quality control allows the administrator to set a threshold for recognition accuracy. When this option is on, documents with poor-quality text will not be converted, but rather stored in a separate folder for special treatment;
  • If the Verification option is enabled, the pages will be routed to available Verification Stations. Verification Stations allow operators to check the accuracy of the layout and the recognized text, make any necessary corrections, and run a spell check. Verification can be enabled either for all recognized pages or only for those pages which are recognized with an accuracy below a certain threshold.
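
The two options can be combined into a simple routing rule. The sketch below is an assumption about how such a rule might look, not the product's logic: the function `route_page`, the 0.0 to 1.0 accuracy scale, and the two threshold parameters are all hypothetical.

```python
def route_page(accuracy, reject_below, verify_below):
    """Decide where a recognized page goes, given its estimated accuracy.

    accuracy, reject_below and verify_below are values between 0.0 and 1.0
    (an assumed scale used only for this sketch).
    """
    if accuracy < reject_below:
        return "exceptions folder"      # automatic quality control
    if accuracy < verify_below:
        return "verification station"   # operator proofreads the page
    return "export"                     # good enough to convert directly


destinations = [route_page(a, reject_below=0.5, verify_below=0.9)
                for a in (0.3, 0.75, 0.97)]
```

Setting `verify_below` to a value above 1.0 would model the "verify all recognized pages" option, since every page that passes automatic quality control would then go to an operator.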

4. Getting converted documents

Administration

The administration of Recognition Server is performed via a convenient administration interface based on the Microsoft Management Console. It allows the administrator to configure the system and monitor its activity: to set processing parameters, to manage licenses, stations, and user permissions, to manage the processing queue and to view the log files.

The priority management and advanced scheduling features allow the administrator to control the order in which the documents are processed and use the stations’ hardware resources efficiently by scheduling OCR for night hours or weekends.
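
A rough sketch of these two ideas follows. The class `JobQueue`, the function `in_off_hours`, and the 20:00 to 06:00 window are illustrative assumptions and do not reflect the product's actual settings. Lower priority numbers are served first, and jobs with equal priority keep their arrival order.

```python
import heapq
import itertools
from datetime import datetime, time


class JobQueue:
    """Priority queue: lower number means higher priority, FIFO within ties."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # preserves arrival order on ties

    def add(self, name, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def pop(self):
        return heapq.heappop(self._heap)[2]


def in_off_hours(moment, start=time(20, 0), end=time(6, 0)):
    """True on weekends, or on weekdays between start (evening) and end (morning)."""
    if moment.weekday() >= 5:          # Saturday is 5, Sunday is 6
        return True
    t = moment.time()
    return t >= start or t < end


queue = JobQueue()
queue.add("archive backlog", priority=5)
queue.add("legal request", priority=1)
queue.add("board minutes", priority=1)
order = [queue.pop() for _ in range(3)]
```

A scheduler built on this would release low-priority bulk jobs only when `in_off_hours(datetime.now())` is true, leaving daytime capacity for urgent work.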

Integration

ABBYY Recognition Server provides an application programming interface (API) for integration with other applications. The API can be used to pass image files and processing parameters to Recognition Server, get notifications about job completion and obtain converted files. See more information in the Development and integration section.

ABBYY Recognition Server 3.0

Recognition Server 3.0 is now available.  Key features include:

  • Scanning Station with TWAIN and ISIS support
  • Indexing Station with point-and-click indexing capabilities
  • Scripts for easy customization and integration
  • Modules for enterprise search systems such as Google Search Appliance and MS Search

What's New in Recognition Server 3.0

  • Scanning Station
  • Indexing Station
  • GSA and iFilter connectors
  • Scripts:
      Document Separation Scripts
      Document Type Detection and Indexing Scripts
      Export Scripts for handling output documents and failed jobs
  • Improved CJK (Chinese/Japanese/Korean languages)
  • New Barcodes: Data Matrix, QR Code, Aztec
  • 11th technologies including ADRT and MRC
  • SharePoint Server connector

Usage scenarios:

  • Indexing and Archiving
      PDF conversion; export to DOC, TXT, and image formats
      Point-and-click indexing
      Simple classification
  • Enterprise Search Systems
      Unlocks image-based documents and feeds back searchable text
      GSA and iFilter modules
  • E-Discovery
      Convert images and email attachments into searchable formats
      Highly scalable
      Bates stamping
  • Everyday conversion
      24/7 service
      Easy install
      No training necessary