Automatic extraction of metadata from Microsoft Outlook emails

Automating the capture of email metadata allows users to easily store, search, and archive important emails. The document management system can automatically capture metadata from emails that are added to FileHold from Microsoft Outlook. The fields captured are To, CC, Date sent, From, Subject, and a logical value indicating if there is any attachment.
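To make the six captured values concrete, here is a minimal sketch that reads the same kinds of values from a standard RFC 5322 message using Python's standard `email` package. It is an illustration only: FileHold reads Outlook messages through its own integration, and the function name `extract_email_metadata` and the dictionary keys are hypothetical.

```python
from email import message_from_string
from email.policy import default
from email.utils import parsedate_to_datetime


def extract_email_metadata(raw_message: str) -> dict:
    """Return From, To, CC, Subject, Date sent, and an attachment flag."""
    msg = message_from_string(raw_message, policy=default)
    date_header = msg["Date"]
    return {
        # A missing header is returned as None, so fall back to an empty string.
        "From": str(msg["From"] or ""),
        "To": str(msg["To"] or ""),
        "CC": str(msg["Cc"] or ""),
        "Subject": str(msg["Subject"] or ""),
        "Date sent": parsedate_to_datetime(str(date_header)) if date_header else None,
        # Logical value: True when the message has at least one attachment.
        "Attachments": any(True for _ in msg.iter_attachments()),
    }
```

A message without a CC or Date header yields an empty string or `None` for that value rather than raising an error.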

To extract email metadata, you must enable this feature. You also need to create an email extraction schema and the metadata fields to which the CC, Date, From, Subject, and To values are mapped. For example, you can name the fields “Email CC”, “Email Date”, “Email From”, “Email Subject”, and “Email To”.

After the feature is enabled and the schema is created, the values for the schema and mapped fields are automatically populated as the emails are moved into the client. This feature simply extracts values for metadata. It has no effect on storing files, deleting messages at the source, etc. That configuration is handled in central client options or in the client preferences depending on how your administrator has configured the software.

When setting up file properties extraction rules, the UTC date or local file date can be used. If the file type uses UTC, then select the UTC check box in the configuration settings.
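The difference between the two options can be sketched as follows. This is not FileHold code; `interpret_file_date` and its `assume_utc` parameter are hypothetical names that mirror the check box. The sketch shows how the same timestamp without a time zone resolves to a different moment depending on the setting.

```python
from datetime import datetime, timezone


def interpret_file_date(naive: datetime, assume_utc: bool) -> datetime:
    """Attach a time zone to a timestamp that does not carry one."""
    if assume_utc:
        # Treat the stored value as UTC.
        return naive.replace(tzinfo=timezone.utc)
    # astimezone() on a naive datetime treats it as local system time.
    return naive.astimezone()
```

On a machine whose local zone is not UTC, the two results differ by the local UTC offset, which is why the check box should match how the file type stores its dates.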

In addition to the FDA, extraction rules can be used in conjunction with Import Jobs (Automatic Document Importation). The extraction rules are automatically applied when an import job is processing documents on the server.

Only users with Library Administrator or higher permission can create extraction rules.

Watch the Email Extraction Rule training video.

If the FileHold toolbar does not appear in the Add-Ins ribbon in Microsoft Outlook, refer to the Office Troubleshooting section.

To create the email schema and metadata

  1. Create a new schema called Email or something similar. For more information on creating schemas, see Creating Document Schemas.

  2. Create fields for each of the email properties you intend to extract:

  • From

  • To

  • CC

  • Subject

  • Date sent

  • Attachments - This is a boolean value indicating whether the message has at least one attachment.

  3. Save the schema.

Email property name    Destination field type(s)

From, To, CC           Text - up to 513 characters each

Subject                Text - up to 998 characters

Date sent              Date / time

Attachments            • Text - 4 to 5 characters
                       • Checkbox
                       • Number - a single digit, no decimal
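The field sizes in the table above can be checked before a schema is finalized. The sketch below is a hedged illustration, not part of FileHold: the names `MAX_LENGTHS`, `oversized_fields`, and `attachment_field_values` are hypothetical. It also assumes that the 4 to 5 character text form of the Attachments value is "True" or "False" and that the single-digit number form is 1 or 0; the source does not state the exact values.

```python
# Maximum text lengths taken from the table above.
MAX_LENGTHS = {"From": 513, "To": 513, "CC": 513, "Subject": 998}


def oversized_fields(values: dict) -> dict:
    """Return the length of every text value that exceeds its limit."""
    return {
        name: len(text)
        for name, text in values.items()
        if name in MAX_LENGTHS and len(text) > MAX_LENGTHS[name]
    }


def attachment_field_values(has_attachments: bool) -> dict:
    """Show the three destination forms of the Attachments flag (assumed values)."""
    return {
        "text": str(has_attachments),     # "True" (4 chars) or "False" (5 chars)
        "checkbox": has_attachments,      # selected or cleared
        "number": int(has_attachments),   # 1 or 0
    }
```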

Although you must set a default schema to use when the fields are mapped, you can change this value before adding the document. The destination schema must have fields that match the fields used in the extraction rule in order for those values to be transferred. For example, if the extraction rule is putting the email "from" address into a field called From, any schema you want to copy the "from" address to will also need the From field.
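The matching rule can be pictured with a short sketch. It assumes, purely for illustration, that extracted values and schema fields are both keyed by field name; `transferable_values` is a hypothetical helper, not a FileHold function.

```python
def transferable_values(extracted: dict, destination_fields: set) -> dict:
    """Keep only the extracted values whose field exists in the destination schema."""
    return {
        name: value
        for name, value in extracted.items()
        if name in destination_fields
    }
```

For example, if the rule fills From and Subject but the destination schema only has a From field, only the From value carries over.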

To enable the extraction of metadata from email

  1. Do one of the following:
  • In the FDA, log in as a library administrator and go to Tools > Extraction Rules.
  • In the Web Client, go to Administration Panel > Library configuration > Extraction Rules.
  2. In the List of Extraction Rules window, click Add Email Headers Rule.
  3. In the Email Headers Rule window, enter a name for the rule such as "Email extraction rule".

  4. Enter a description (optional).

  5. To enable the rule, ensure the Rule is Enabled check box is selected.

  6. Select the Assume UTC dates check box if the file being extracted uses UTC dates and times. If it uses local dates and times, leave the check box cleared.

  7. In the Document Schema field, select the Email schema name from the list.

  8. Map From, To, CC, Subject, Attachments, and Date Sent to the metadata fields you created in the previous section.

  9. Click OK.

  10. To test the email settings, launch Microsoft Outlook and log in to the FDA. If you are already logged in to the FDA, refresh it (File > Refresh). Open an email in Outlook so that it is displayed in full screen. From the Add-ins ribbon, click Add to FileHold. The metadata fields are automatically populated based on the email content. You can also add an Outlook MSG file directly to an FDA folder.

Addresses are formatted as 'name' ('email address'), where 'name' is replaced with the display name in the message and 'email address' is replaced with the email address property of the message, or with the SMTP address property if the email address is empty. Where the address type is "EX" (typically Microsoft Exchange addresses), the SMTP address property is used unless it is empty, in which case the Exchange address is used.
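The fallback order described above can be sketched as follows. The `Recipient` class and `format_address` function are hypothetical names used only for illustration, and the sketch assumes that for "EX" recipients the email address property holds the Exchange address.

```python
from dataclasses import dataclass


@dataclass
class Recipient:
    display_name: str
    address_type: str   # for example "SMTP" or "EX"
    email_address: str  # assumed to hold the Exchange address when the type is "EX"
    smtp_address: str


def format_address(recipient: Recipient) -> str:
    """Format a recipient as: name (email address)."""
    if recipient.address_type.upper() == "EX":
        # Exchange recipients: prefer the SMTP address, fall back to the Exchange address.
        address = recipient.smtp_address or recipient.email_address
    else:
        # Other recipients: prefer the email address, fall back to the SMTP address.
        address = recipient.email_address or recipient.smtp_address
    return f"{recipient.display_name} ({address})"
```

With this ordering, an Exchange recipient that has an SMTP address is shown with that address rather than with the longer Exchange form.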