Extraction Rules

The Extraction Rules tool manages the extraction of metadata from Microsoft Office Outlook msg files, the file properties of any file type, PDF forms, and Microsoft Office Word forms. The information contained in the emails, file properties, or forms can then automatically populate the metadata fields of a document schema.

Extraction rules can be used in conjunction with Import Jobs (Automatic Document Importation). They are applied automatically when an import job processes documents on the server. Any extracted metadata values take precedence over the metadata values defined in the import job.
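The precedence rule can be illustrated with a minimal Python sketch. This is not FileHold code: the function name `resolve_metadata`, the field names, and the choice to ignore empty extracted values are assumptions made purely for illustration.

```python
def resolve_metadata(import_job_values, extracted_values):
    """Merge two metadata dictionaries; extracted values win on conflicts.

    Both arguments map a schema field name to its value (hypothetical shape).
    """
    # Start from the defaults defined in the import job.
    merged = dict(import_job_values)
    for field, value in extracted_values.items():
        # Assumption: an empty extraction does not overwrite a job default.
        if value not in (None, ""):
            merged[field] = value
    return merged


job_defaults = {"Author": "Import Service", "Department": "Finance"}
extracted = {"Author": "J. Smith", "Subject": "Invoice 1042"}
print(resolve_metadata(job_defaults, extracted))
# {'Author': 'J. Smith', 'Department': 'Finance', 'Subject': 'Invoice 1042'}
```

In this sketch the import job supplies `Department` because no rule extracted it, while the extracted `Author` replaces the job's default.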

Extraction rules work only when documents are added through the FileHold Desktop Application (FDA) or through Automatic Document Importation.

Extraction rules are accessible only to users with the Library Administrator role or higher.

To access the extraction rules

  1. Do one of the following:
  • In the FDA, log in as a library administrator and go to Tools > Extraction Rules.
  • In the Web Client, go to Administration Panel > Library configuration > Extraction Rules.

There are four types of extraction rules that can be created:

  1. Email Headers - Values contained in the headers of Microsoft Outlook msg files.
  2. File Properties - File properties of any file type.
  3. XML Nodes - Values entered into Microsoft Word content controls.
  4. PDF Forms - Values entered in an Adobe PDF form.
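To make the XML Nodes type more concrete, the following Python sketch shows where content control values sit inside a Word .docx file: each control is a `w:sdt` element in `word/document.xml`. It illustrates the file format only and is not how FileHold performs extraction; the function name `read_content_controls` is hypothetical, and keying each value by the control's tag (falling back to its title) is an assumption.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def read_content_controls(docx_bytes):
    """Return {control tag or title: text} for each content control."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    values = {}
    for sdt in root.iter(f"{{{W}}}sdt"):
        # Prefer the control's tag; fall back to its title (w:alias).
        name_el = sdt.find("w:sdtPr/w:tag", NS)
        if name_el is None:
            name_el = sdt.find("w:sdtPr/w:alias", NS)
        if name_el is None:
            continue  # unnamed control: nothing to key the value by
        name = name_el.get(f"{{{W}}}val")

        # Join all text runs found inside the control's content.
        content = sdt.find("w:sdtContent", NS)
        text = ""
        if content is not None:
            text = "".join(t.text or "" for t in content.iter(f"{{{W}}}t"))
        values[name] = text
    return values
```

A call such as `read_content_controls(open("form.docx", "rb").read())` (with a hypothetical file name) returns a dictionary that could then be mapped onto schema fields.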

When the extraction rules are properly configured, the values from emails, file properties, XML nodes, or PDF forms can be automatically extracted into the metadata fields of a schema.

To see the extraction rules in action, watch our video tours.