Paperless Office Software Glossary of Terms

This glossary is a resource of document management software industry terms. FileHold tries to make document management software affordable and easy to use. Learn more about document management features, or attend a free document management software webinar to learn more about document management software.

Document Management Software Glossary

Acrobat - Acrobat is a program from Adobe that lets you capture a document and then view it in its original format and appearance. Acrobat is ideal for making documents or brochures that were designed for the print medium viewable electronically and capable of being shared with others on the Internet. To view an Acrobat document, which is called a Portable Document Format (PDF) file, you need Acrobat Reader. The Reader is free and can be downloaded from Adobe. You can use it as a standalone reader or as a plug-in in a Web browser.

Active Directory - Active Directory is Microsoft's trademarked directory service, an integral part of the Windows architecture. Like other directory services, such as LDAP or NT Domain, Active Directory is a centralized and standardized system that automates network management of user data, security, and distributed resources, and enables interoperation with other directories. Active Directory is designed especially for distributed networking environments.

Compound document - Compound document features provide the ability to create document-to-document relationships that quickly organize documents into logical groups. This allows you to link related documents together that would not be stored in the same folder.

Document Management - Document management describes the systems and strategies in place for the management of electronic and paper-based documents. Document management resources strive to create systems that can handle paper and electronic documents together, using tools such as document imaging software and Optical Character Recognition (OCR) alongside categorization, indexing, full text search, records management and archival tools.

Document Profiling / Indexing - The process by which metadata is associated with a document.

Document Usage Logs - Document usage logging is provided by the system to log and track all access and usage of documents stored within a document management system. This allows administrators to track usage for audit purposes, and end users to use it for collaborative purposes.

Document Scanning/Imaging - The process by which print and film documents are fed into a scanner and converted into electronic documents. During the scanning process documents can be OCR'd and indexed to ensure quick retrieval at a later date. In document imaging, the emphasis is on capturing, storing, and retrieving information from the images (which are often mainly images of text).

Electronic Document Management (EDM) - An EDM system allows an enterprise and its users to create a document or capture a hard copy in electronic form, and then to store, edit, print, process, and otherwise manage documents in image, video, and audio, as well as in text form. An EDM system may include scanners for document capture, printers for creating hard copy, storage devices, and computer servers and server programs for managing the databases that contain the documents.

Extranets - An Intranet that is partially accessible to authorized outsiders. Where an Intranet resides behind a firewall and is accessible only to people who are members of the same company or organization, an extranet provides various levels of accessibility to others outside the company. Extranets are becoming a very popular means for businesses and their partners to exchange corporate information.

File Folder Hierarchy - The hierarchy is the file and folder tree structure which a company's documents reside in. Each node / branch (folder) in the tree can contain child objects (files / documents) or other nodes / branches (folders).
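
The tree structure described above can be sketched in a few lines of Python. This is only an illustration: the names `Folder` and `walk` are hypothetical and are not part of any FileHold interface.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """A node in the tree: holds documents and child folders."""
    name: str
    documents: list = field(default_factory=list)
    subfolders: list = field(default_factory=list)


def walk(folder, prefix=""):
    """Return the full path of every document at or below `folder`."""
    path = f"{prefix}/{folder.name}"
    paths = [f"{path}/{doc}" for doc in folder.documents]
    for child in folder.subfolders:
        paths.extend(walk(child, path))
    return paths


# Hypothetical example data.
invoices = Folder("Invoices", documents=["inv-001.pdf"])
root = Folder("Accounting", documents=["policy.docx"], subfolders=[invoices])
print(walk(root))
```

Each folder lists its own documents first and then descends into its child folders, which mirrors the node / branch description in the entry.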

Full Text Search (FTS) / Retrieval - A capability that enables you to search for documents stored in a database based on the text contained in the documents. It can be used in conjunction with metadata-based searching, which relies on a description of the document entered by a scan operator.
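
One common way to implement full text search is an inverted index, which maps each word to the documents that contain it. The sketch below assumes that technique for illustration only. The function names and sample documents are hypothetical, and real products may index text quite differently.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical document texts keyed by document id.
docs = {
    "doc1": "Purchase order for office paper",
    "doc2": "Timesheet for the purchase department",
}
index = build_index(docs)
print(sorted(search(index, "purchase order")))
```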

Intranets / Corporate Portals - A network based on TCP/IP protocols that belongs to an organization and is accessible only to the organization's members, employees, or others with authorization. An intranet site often looks and acts just like an Internet site, but firewalls, virtual private networking access and other secure remote access tools safeguard it from unauthorized access. Intranets that have developed into large-scale sites, sometimes including knowledge management and Customer Relationship Management tools, are often known as 'corporate portals'.

Knowledge Management - Knowledge management is the name of a concept in which an enterprise consciously and comprehensively gathers, organizes, shares, and analyses its internal knowledge in terms of resources, documents, and people skills.

Mass Document Import Utility - Mass Document Import Utilities are able to quickly and easily index (profile) and import legacy files and folders into FileHold document management software.

Metadata - Meta is a prefix that in most information technology usages means 'an underlying definition or description.' Thus, document metadata, as it relates to document management, is a definition or description of the document it relates to. When using document management software this information is typically entered by an end user or a scanning operator.

Metadata can include physical location information (e.g., where the document is stored) and document identification information (e.g., date archived, creator, and contents).

Named User Licensing - Named user licensing provides each individual user with a unique license to a document management system or other software application.

Optical Character Recognition (OCR) - OCR is the recognition of printed or written text characters by a computer. This involves photo scanning of the text character-by-character, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII, commonly used in data processing. In OCR processing, the scanned-in image or bitmap is analyzed for light and dark areas in order to identify each alphabetic letter or numeric digit. When a character is recognized, it is converted into an ASCII code.

OCR is being used by libraries to digitize and preserve their holdings. OCR is also used to process checks and credit card slips and sort the mail.

Personalization - Content can be personalized by country, publication date, subject or even user. Personalization tools can use web registration information, an email address or an IP address to make intelligent choices when serving content to a web page, making the browsing experience more valuable.

PDF (Portable Document Format) - PDF (Portable Document Format) is a file format that captures all the elements of a printed document as an electronic image that you can view, navigate, print, or forward to someone else. PDF files are created using Adobe Acrobat, Acrobat Capture, or similar products. To view and use the files, you need the free Acrobat Reader, which you can easily download. Once you've downloaded the Reader, it will start automatically whenever you want to look at a PDF file.

Records Management - Purchase orders, timesheets and other accounting and HR information are often collected on paper. Records management automates that process, sometimes by the conversion of paper records to digital, often by the use of web forms on intranets and extranets.

Repository/Database/Storage - Digital data that is being created, converted, syndicated or scanned needs a repository to reside in so it can be accessed by a content management system and served to a user. These database systems must be secure, reliable and expandable to create a stable environment for the storage of content. This can often only be guaranteed through the implementation of a dedicated storage environment.

Roles Based Security - Users are placed into groups that are then authorized to access different modules in a document management application. The groups provide rights ranging from Read-only to Delete privileges. Users can access documents and perform various tasks based on their group memberships and the authorization and access control those memberships provide.
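
A minimal sketch of group-based authorization follows. The group names, the user names, and the assumption that rights are cumulative (a group with Delete rights can also edit and read) are illustrative only and do not describe any particular product.

```python
# Hypothetical rights, ordered from least to most privileged.
RIGHTS = ["read", "edit", "delete"]

# Hypothetical groups and the highest right each one grants.
GROUP_RIGHTS = {
    "viewers": "read",
    "editors": "edit",
    "admins": "delete",
}

# Hypothetical user-to-group memberships.
USER_GROUPS = {
    "alice": ["viewers"],
    "bob": ["viewers", "editors"],
}


def can(user, action):
    """True if any of the user's groups grants `action` or a higher right."""
    needed = RIGHTS.index(action)
    return any(
        RIGHTS.index(GROUP_RIGHTS[group]) >= needed
        for group in USER_GROUPS.get(user, [])
    )


print(can("alice", "edit"), can("bob", "edit"))
```

The check never looks at the individual user's rights, only at the groups the user belongs to, which is the point of the roles based approach.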

Scanning Integration - Scanning Integration allows for the seamless connection of scanning / imaging systems to a document management system. For more information on scanning or imaging see Document Scanning / Imaging above.

Subscriptions - Within FileHold document management software, you can subscribe to a document or folder to be notified by email when there is an updated document.

TIFF (Tagged Image File Format) - TIFF is a common format for exchanging raster graphics (bitmap) images between application programs, including those used for scanner images. A TIFF file can be identified as a file with a '.tiff' or '.tif' file name suffix. The TIFF format was developed in 1986 by an industry committee chaired by the Aldus Corporation (now part of Adobe). Microsoft and Hewlett-Packard were among the contributors to the format. One of the most common graphic image formats, TIFF is widely used in desktop publishing, faxing, 3-D applications, and medical imaging applications.

Version Control - Version control allows you to manage the lifecycle of a document from conception to final copy. FileHold includes the ability to roll back versions and track usage within all versions. Only one person at a time can check out a document or file from the library, although people can access the current version and get a copy of the latest version if required. However, once the person updates the document and checks it back in, a new version is then created. 
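
The check-out and check-in behavior described above can be modeled roughly as follows. This is a hypothetical sketch, not FileHold's implementation; the class and method names are invented for illustration.

```python
class Document:
    """Hypothetical model of exclusive check-out with versioned check-in."""

    def __init__(self, content):
        self.versions = [content]    # version 1 is the initial content
        self.checked_out_by = None   # at most one user holds the check-out

    def check_out(self, user):
        """Give `user` exclusive edit rights and return the latest version."""
        if self.checked_out_by is not None:
            raise RuntimeError(f"already checked out by {self.checked_out_by}")
        self.checked_out_by = user
        return self.versions[-1]

    def check_in(self, user, new_content):
        """Store a new version, release the check-out, return version number."""
        if self.checked_out_by != user:
            raise RuntimeError("document is not checked out by this user")
        self.versions.append(new_content)
        self.checked_out_by = None
        return len(self.versions)

    def latest(self):
        """Anyone may read a copy of the current version."""
        return self.versions[-1]


doc = Document("draft")
doc.check_out("alice")
print(doc.latest())                      # others can still read the current version
print(doc.check_in("alice", "final"))    # checking in creates a new version
```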

Document Scanning Software Jargon

The Document Management Software supports over 400 document scanners "out of the box" from the industry leading vendors, as well as many other vendors who benefit from ISIS driver support from EMC-Captiva Corporation. Learn more about document scanning.

This technical glossary of terms is a resource for records management prospects who may want to add scanning to their document management software. This may be especially helpful for those considering the conversion of an existing archive of physical records into an electronic record repository. Learn more about how to purchase records management software.

Anti-aliasing - A process used to remove the stair-stepping effect found in diagonal lines of an image. It involves inserting dots of an in-between tone along the edges.

Aspect Ratio - The relative proportion of the length and width of an image. For example, if you scan an original that measures 4 by 6 inches, it will have an aspect ratio of 4:6, or 2:3.
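
The reduction from 4:6 to 2:3 amounts to dividing both dimensions by their greatest common divisor. A small sketch, with a hypothetical function name:

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


print(aspect_ratio(4, 6))  # (2, 3)
```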

Attribute - Characteristics of a page or character, such as underlining, boldface, or font that can be captured by an optical character recognition (OCR) program.

Automatic Document Feeder (ADF) - A device attached to a scanner that automatically feeds in one page at a time, allowing the scanning of multiple pages.

Auto Trace - A feature found in many object-oriented image editing programs, such as Adobe Illustrator, that allows you to trace a scanned image and convert it to an outline or vector format.

Batch - Actions carried out consecutively on a set of files.

Binary - Base-two arithmetic, which uses only 1's and 0's to represent numbers. 0001 represents 1 decimal, 0010 represents 2 decimal and so forth. Binary numbers are used indirectly to refer to color depth, as in 24-bit or 8-bit color.

Bit - The abbreviation for binary digit, either a 0 or a 1. Scanners typically use multiple bits to represent information about each pixel of an image.

Bit Depth - The number of bits used to represent colors or tones.
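
The link between binary numbers and bit depth can be checked directly: n bits can represent 2 to the power n distinct values. The function name below is hypothetical.

```python
def tone_count(bit_depth):
    """Number of distinct colors or tones representable with `bit_depth` bits."""
    return 2 ** bit_depth


# Binary strings from the Binary entry, converted to decimal.
print(int("0001", 2), int("0010", 2))  # 1 2

# Distinct values available at common bit depths.
for bits in (1, 8, 24):
    print(bits, tone_count(bits))
```

A 1-bit image has 2 tones, an 8-bit image has 256, and 24-bit color has 16,777,216 possible colors.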

Bitmap - An image represented as pixels in a row and column format. (Note that Adobe refers to a bitmap as a two-color image.)

Calibration - A way of correcting for the variation in output of a device such as a printer or monitor when compared to the original image data from the scanner.

Carriage - The scanner component that moves down a page to capture an image.

CMYK - The abbreviation for cyan, magenta, yellow, and black.

Compression - Squeezing a file (especially an image) into a more efficient form to reduce the amount of storage space required.

Contrast - The range between the lightest and darkest tones in an image. In a high-contrast image, the shades fall at the extremes of the range between white and black. In a low contrast image, the tones are closer together.

Data Compression - A method of reducing the size of files, such as image files, by representing the sets of binary numbers in the file with a shorter string that conveys the same information. Many image editing programs offer some sort of image compression as an optional mode when saving a file to disk.
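
Run-length encoding is one simple scheme of this kind: a run of identical values is stored as a count plus the value. It is chosen here only for illustration and is not necessarily what any given image editor uses. The function names are hypothetical.

```python
def rle_encode(data):
    """Collapse runs of identical byte values into (count, value) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1          # extend the current run
        else:
            runs.append([1, value])   # start a new run
    return [(count, value) for count, value in runs]


def rle_decode(runs):
    """Expand (count, value) pairs back into the original bytes."""
    return bytes(value for count, value in runs for _ in range(count))


# One row of a scanned page: mostly white (255) with a short black (0) stroke.
row = bytes([255] * 6 + [0] * 2 + [255] * 4)
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

Twelve pixel values are stored as three pairs, and decoding restores the row exactly, so the shorter form conveys the same information.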

Digitize - To convert analog information, such as a continuous tone image, to a binary form that can be processed by a computer.

Dot - A unit used to represent the smallest element a printer can image, but sometimes used to represent the resolution of other devices, such as monitors or scanners.

Dots Per Inch (DPI) - The resolution of a printed page, expressed in the number of printer dots in an inch, abbreviated dpi. Scanner resolution is also expressed, somewhat inaccurately, in dpi.

Downsampling - To reduce the amount of information in an image, usually to make it smaller or to discard some colors when changing bit depth. Also used when reducing the number of pixels in an image.

Dynamic Range - The range of densities between the highlights and shadows of an image.

Export - To transfer an image to another format type.

Filter - An image transform tool used to process an image; for example, to sharpen, blur, or diffuse it. Often this is a plug-in in an image editor, but filters are also built into scanning software or hardware.

Gamma - A way of representing the contrast of an image, shown as the slope of a curve showing tones from white to black.

Gamma Correction or Gamma Compensation - The process of preconditioning or adjusting an image to correct for the gamma of the device used to reproduce the image, such as a printer or display screen. Without gamma compensation, the image will look too dark when printed or displayed.
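
A minimal sketch of gamma compensation on 8-bit tones follows. It assumes a simple power-law device response with a gamma of 2.2, a commonly quoted value that is used here only as an assumption; the function name is hypothetical.

```python
def gamma_correct(value, gamma=2.2):
    """Pre-compensate an 8-bit tone (0-255) for a device with the given gamma."""
    normalized = value / 255
    return round(255 * normalized ** (1 / gamma))


# Black and white are unchanged; midtones are lifted so that the device's
# own darkening brings them back to the intended level.
for tone in (0, 64, 128, 255):
    print(tone, gamma_correct(tone))
```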

Gang Scan - The process of scanning more than one picture at a time, used when images are of the same density and color balance range.

Graphics Interchange Format (GIF) - A compressed image format popular on the Web. GIF was one of the first image formats in common use on the Web, but for photographs it has largely been replaced by JPEG.

Grayscale - Gray values in an image.

Halftoning - A method of representing the gray tones of an image by varying the size of the dots used to show the image.

Interpolation - A method of changing the size, resolution, or colors in an image by calculating the pixels used to represent the new image from the old ones. It is also being used to increase bit-depth claims on scanners (as in "Enhanced Bit Depth" or "Enhanced Color").
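
Linear interpolation is one simple way to calculate new pixels from old ones. The sketch below resizes a single row of grayscale values and is an assumption for illustration; scanner software may use other methods. The function name is hypothetical.

```python
def resize_row(row, new_length):
    """Resize one row of grayscale pixels using linear interpolation."""
    if new_length == 1 or len(row) == 1:
        return [row[0]] * new_length
    # Map each new pixel index to a fractional position in the old row.
    scale = (len(row) - 1) / (new_length - 1)
    result = []
    for i in range(new_length):
        position = i * scale
        left = int(position)
        right = min(left + 1, len(row) - 1)
        fraction = position - left
        # Blend the two nearest original pixels by distance.
        result.append(round(row[left] * (1 - fraction) + row[right] * fraction))
    return result


print(resize_row([0, 100, 200], 5))  # [0, 50, 100, 150, 200]
```

The two added pixels are calculated rather than captured, which is why interpolated resolution does not add real detail.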

Invert - To reverse an image's tones to its opposite value: to make a negative.

Joint Photographic Experts Group (JPEG) - The JPEG format offers a compression scheme that makes the image file smaller than files in other formats by discarding some of the image information.

Landscape - The orientation of a page in which the longest dimension is horizontal.

Legal size - Paper or other media that is 8 1/2 inches wide and 14 inches long.

Moire - In scanning, an objectionable pattern caused by interference of halftone screens, often produced when a halftone is rescanned and the sampling frequency of the scanner (spi) interferes with the halftone or dither pattern of the original.

Monochrome - Having a single color. Typically refers to a black and white image, but could be any single color image.

Noise - Random information that distorts an image, especially the background distortion of an analog image before it is converted to digital format.

Optical Character Recognition (OCR) - The process of converting a bitmapped image of printed text into ASCII characters and other attributes.

Optical Resolution - The resolution of a scanner that is calculated by dividing the number of pixels in the CCD by the width of the scanned area. For example, a CCD with 5,100 pixels spanning an 8.5-inch scan width gives 600 pixels per inch. Optical resolution is also often called true resolution and does not include any interpolation to increase pixels.

Pixel - A picture element of an image that refers to a single dot within a digital photograph. A photograph is made up of thousands of pixels.

Pixels Per Inch (ppi) - The number of pixels captured per inch by a scanner. This is a more accurate term than dpi (dots per inch) when applied to scanners because scanners capture pixels.

Portable Network Graphics (PNG) - A lossless file format created to overcome deficiencies of the Graphics Interchange Format (GIF), such as the limited number of colors.

Portrait - The orientation of a page in which the longest dimension is vertical.

Preview Scan - A preliminary scan that can be used to define the exact area for the final scan. A low-resolution image of the full page or scanning area is shown, and a frame of some type is used to specify the area to be included in the final scan.

Raster Image - An image defined by rows and columns of pixels. Scanners capture images as raster images, although some can convert them to vector images.

Raster to Vector Conversion - The process of examining a raster image for lines and strokes, and creating a new image that looks the same but is made up of lines rather than pixels. When a person draws, they are creating a vector image. Vector images can be enlarged much more accurately and often have a smaller file size.

Resolution - The number of pixels or dots per inch in an image. Also the capability of a scanner to resolve detail, which requires quality optics as well as high ppi or spi.

Sample Rate or Samples Per Inch - The number of pixels per inch captured by a scanner.

Scanner - A device that captures images or text and converts them to a bitmapped image.

Selection Area - The part of an HP Deskscan preview scan that you select to be saved to a file or sent directly to a printer.

Sharpening - Increasing the apparent sharpness of an image by increasing the contrast between the adjacent tones or colors.

Smoothing - To blur the boundaries between tones of an image, usually to reduce a rough or jagged appearance.

Threshold - A predefined level used by scanners to determine whether a pixel will be represented as black or white.
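
Thresholding can be sketched as a single comparison per pixel. The level of 128 and the function name are assumptions for illustration; scanners let you choose the level.

```python
def threshold(pixels, level=128):
    """Map each 8-bit gray value to 0 (black) or 255 (white)."""
    return [255 if value >= level else 0 for value in pixels]


print(threshold([10, 127, 128, 240]))  # [0, 0, 255, 255]
```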

Thumbnail - A miniature copy of a page or image, which gives you an idea of what the original looks like without having to open the original file or view the full size image.

Tagged Image File Format (TIFF) - A graphic file format originally developed specifically for scanners. It can be used to store grayscale and color images and is now a standard image file format supported by most applications, printers, and scanners.

Transparency Adapter - An add-on device used with a scanner to scan slides and other see-through media.

TWAIN - A software driver interface for scanners and other image capturing devices that lets you scan images directly into an application such as Adobe Photoshop.

Vector Image - An image defined by the beginning and ending points of each line.

Zoom - To enlarge a portion of an image.