Automatic document importation (ADI)

The Automatic Document Importation (ADI) mechanism allows importing a large number of documents into the document management system with minimal user intervention. It runs on the FileHold server to facilitate the mass migration of documents. ADI is similar to the Watched Folders functionality but can also be integrated with various custom migration tools using an API.

Configuring and using Automatic Document Importation typically requires skilled IT personnel for a self-hosted system. If you are using FileHold Cloud or if you require assistance configuring or using ADI, our professional services team is available to help.

Several ADI “jobs” can be created by a Library Administrator or higher role. Each ADI “job” stores the configuration and status of the job. An administrator can configure the source type (Watched Folder or API), a time restriction for the job to run, the user account that is adding the documents, the source folder, target location and so on.

Documents can be imported from three sources:

  • If a Watched Folder is being used for the job, files from a specified directory are added to a queue. Once processed, they are imported into the destination folder in the library using the specified schema and metadata field values (direct), or using indirect metadata. The files from the specified directory can be monitored and brought automatically into the system. The input files can also be deleted.
  • If a Watched FTP site is being used for the job, files from an FTP server can be downloaded and processed. This method is useful when, for example, a scanning company completes a batch of scans and wants to send them into their customer’s FileHold repository. The scans are zipped along with the metadata and stored on an FTP server. When the file is stored on the FTP server, the download is triggered either by the appearance of the file or by a notification email sent to a specific email inbox. Direct or indirect metadata methods can be used.
  • If the source is an API, documents along with their target location in the library and metadata values are added to the queue using API calls. See the Knowledge Base for more information on the API.

Once an ADI job is configured, the user specified in the job becomes the owner of the documents when the files are processed. This user must have a Document Publisher role or higher and must have access to the schema and destination folder.

For each job, the status, which includes the number of processed documents, pending documents, and errors, is shown. Within each job, a detailed list of documents is shown with the status (pending, completed, error), the date they were added to the queue, the date they were processed, the source path, and the target folder. These import details can be exported into a CSV file. Once a document has been successfully imported, the summary information and the document with associated metadata can be viewed. Summary information can be viewed for any pending documents or documents with errors.

The time at which documents are processed is controlled in two places: the job settings and a scheduled task. In the job, you specify when the source directory is scanned for documents and they are added to the queue. When the queued documents are actually processed and imported into the library is controlled by the scheduled task “FH process ADI job”. For example, documents can be added to the queue all day (no time restriction in the job settings) while the actual import occurs only at night (via the scheduled task settings) so the FileHold server is not additionally burdened during the day. The default setting for the scheduled task “FH process ADI job” is to run every 10 minutes indefinitely.

Extraction rules can be applied to documents that are imported when indirect metadata is not used. The extraction rule can either be used when the import job is set to use the same schema as the rule or it can override the schema according to the extraction rule. The metadata values that are extracted through the extraction rule take precedence over the metadata values set in the import job. If there is no value mapped in the rule, then the value set in the job is used. Note that if the metadata field is a drop down list, ensure that the value being extracted from the document exists in the list. If the value does not exist then the value set in the job is used.

Metadata field values can be extracted from a delimited file instead of using the static values when using a Watched Folder or Watched FTP site type import job. This is called "indirect metadata". A text delimited file, such as a CSV, that contains the schema, full path and document name, and metadata fields and values, is used to define the values that populate the metadata fields. Document versions can be imported and auto-filing templates or scripts can also be used if using indirect metadata. See Using Indirect Metadata in an Import Job for more information.

Automatic Document Importation (ADI) is an optional feature that is controlled in the FileHold license. To purchase this feature, contact [email protected]. The FileHold professional services team has used ADI to help customers migrate documents from their legacy systems including Windows shared folders, Sharepoint, Image Now, ImageWare, FileNet, Mango Apps, ImageSilo, Computer Filing Cabinet, ApplicationXtender, Worldox, M-Files, Laserfiche, boxes of paper and more.

Automatic document importation job for a watched folder source

To create an ADI job for a watched folder source

  1. In the Web Client, go to Administration Panel > System management > Import Jobs.
  2. In the List of Import Jobs, click Add Job.
  3. Enter the Name of the job.
  4. Enter a Description for the job.
  5. Select a Source Type of Watched Folder. Documents are imported from a specified folder path. This folder can be on the server or in a network location; however, the designated FileHold service account must have full control permissions on the folder. Select this option if you are using direct or indirect metadata.
  6. In the Job Settings area, select the Job is enabled check box to enable the job.
  7. The Restrict operation time fields determine when the documents will be brought into the queue from the Watched Folder. Select the Restrict operation for check box and enter the start and end time that the job will run. If no time is entered, the job runs as a continuous process and documents are added to the queue as soon as they are added to the source (Watched Folder).
  8. In the Max Documents Per Trigger field, enter the maximum number of documents that will be processed per import instance. For example, there can be 100 documents in the source folder but the maximum documents per trigger setting is set to 50 so only 50 documents will be processed when the scheduled task runs. The next 50 documents will be processed when the scheduled task runs again.

There are two limits to the number of documents that will be processed. In addition to the maximum number of documents there is a timer. The processing will stop when the maximum number of documents is reached or the timer expires. When the timer expires, the document that is currently being processed will be completed. The duration of the timer is set in the library manager web config file and should not normally be changed. It is the "ImportationJobTimeoutSec" key in the appSettings section.
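For reference, a minimal sketch of how this key appears in the appSettings section of the library manager web config file (C:\Program Files\FileHold Systems\Application Server\LibraryManager); the surrounding entries will vary by installation, and the default of 570 seconds should normally be left in place:

<appSettings>
  <!-- Maximum time, in seconds, one run of the ADI processing task spends importing documents -->
  <add key="ImportationJobTimeoutSec" value="570" />
</appSettings>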

  9. In the User Context field, select the user name from the list that will own the imported documents. This must be a user with a role of Document Publisher or higher.
  10. In the Post Import Actions field, select an option from the list:
  • None — No changes are made to the document.
  • Force document format to electronic record — The document format is converted into an electronic record.
  11. If a Watched Folder source was selected, enter the Source Folder Path. This is the folder that is “watched” for new documents; files found here are brought into the queue.
  • You must use a UNC path for remote folder share locations, making sure that the designated FileHold service account has full control of this remote folder, and that the remote folder is properly shared as well.
  • If using indirect metadata, ensure that the indirect file and documents being imported are in the same directory.
  12. Select the Delete Input Files check box to delete the files from the source folder once they are imported into the library.
  13. Select the Automatically add new files to the queue check box to run this job without user intervention; documents are automatically added to the queue when the scheduled task is executed. If this check box is not enabled, then the job is run manually by pressing Watch Now. Manual operation can be helpful when you are first creating a job as you will be able to see the documents that are pending before they are added to the library. You may decide to reset the job if you notice an issue and would like to start the import again.
  14. Select the Use indirect metadata check box if you are using an indirect file that contains the metadata field values for the documents. See Indirect Metadata for more information. Fill out the following information:
  • File extension - Enter csv, tab, txt, etc.
  • Field delimiter - Enter the field separator. This delimiter is used to separate indirect file fields in the heading and value rows.
  • Value delimiter - Enter the value separator. This field is required even if you do not plan to use fields with multiple values.

Most fields only have a single value, but some, such as dropdown menus that allow multiple selections, can have more than one. Note that the field delimiter and the value delimiter cannot be the same.
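As a brief illustration (the complete set of required columns is described under Using indirect metadata in an import job below), a hypothetical fragment of an indirect file using a comma as the field delimiter and a semicolon as the value delimiter could look like this, where Applicable codes is a dropdown field that allows multiple selections:

Property,Applicable codes
PT-550411,ABN;MON;QUE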

  15. Select the Let extraction rules select the schema check box to allow extraction rules to set the document schema.
  16. Click Select to set the Destination Folder from the library tree.
  17. Select the Document Schema from the list.

If you choose a document schema here, any values for the schema and metadata fields in the indirect file will be ignored.

  18. Enter the values in the metadata fields. All fields marked with an asterisk (*) are required.
  19. Click OK to save the job. The job is added to the List of Import Jobs.

When first setting up an ADI job, there may be missing or incorrect configuration on your server or in an indirect file. These errors are reported in the Windows event log for FileHold. If documents are not being added to the job queue, you will likely find an error in the event log.

Automatic document importation job for a watched FTP site source

When using a Watched FTP site as the source, documents and/or metadata are downloaded and imported from an FTP server. Downloads are triggered by the appearance of a file or by a notification email.

To create an ADI job for a watched FTP site source

  1. Complete steps 1-10 as above except select Watched FTP site as the Source Type.
  2. In the FTP Site Settings area, in the Host field, enter the machine name or server IP address of the Source folder. Click Test Connection to verify the Host is accessible.
  3. Enter the Port number. Uses standard port 21 by default.
  4. Select the Encrypted Connection check box if encryption is used in the FTP connection.
  5. In the Authentication area, select Anonymous if the logon type is anonymous. Leave unchecked if using a normal connection type.
  6. If not using an anonymous connection type, enter a User name and Password for the FTP account.
  7. In the FTP Folder Settings area, enter the FTP source folder path. Provide the full path to the Source folder in this field (for example: /FileHold/Data/Source). Make sure the path begins at the base directory to which the FTP server allows connection. The path must start with a forward slash ( / ).
  8. In the Source Filter field, enter the acceptable file types to be transferred. This will filter out any files that do not match the specified source. To accept all file types, enter *.*. This field is unavailable if the option “Get filenames from the email body using a regular expression to search for filename details and form a complete filename with replace” is enabled.
  9. In the Local Destination Folder Path field, specify the folder location on the local computer where the files will be downloaded.
  10. In the Post Download Operation area, select any of the following options:
  • Extract archived files — Extracts the downloaded files after they are downloaded. Enter the list of valid archive file extensions in the field.

The files are extracted to a randomly named folder in the job folder. Do not include a path for the file in the delimited file. The delimited file and files to import should all be in the root of the archive.

  • Delete archive files after contents are extracted — Select the check box to delete the zipped files after the contents have been extracted.
  • Rename source files — Renames the source files on the FTP site with a new extension. Enter the new file extension in the New File Extension field. Cannot be used with the Delete source files option.
  • Delete source files — Deletes the source files from the FTP source folder. Cannot be used with the Rename source files option.
  11. In the Watched Folder Trigger, select one of the following options:
  • File appears — Once a file appears in the FTP source folder path, the source files are downloaded to the local destination folder.
  • Email message received — Source files are downloaded when a notification email is received in a configurable email box. Fill out the following information:
  • POP3 Server – Enter the address for the POP3 server and click Test Connection to verify.
  • Port – Enter the port number. Uses standard port 110 by default.
  • Encrypted connection – Select the check box if the connection is encrypted.
  • Authentication – Select Anonymous or enter a User name and Password.
  • Get filenames from the email body using a regular expression to search for filename details and form a complete filename with replace – Select the option to use a regular expression in the Search and Replace options below.
  • Search – Provide a regular expression that finds each filename in the body of the email.
  • Replace – Include a regular expression to form a filename using characters found in the search above (see the example below).
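A hypothetical illustration only, since the exact expressions depend entirely on the format of the notification email; this sketch assumes .NET-style regular expressions where $1 refers to the first captured group. If the email body contains a line such as “Batch: 20201003-042 is ready”, the following settings would form the filename scan-20201003-042.zip:

Search: Batch: (\d{8}-\d{3}) is ready
Replace: scan-$1.zip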

 

  12. Continue to fill out the Local File Processing Settings as described in steps 12 to 18 of the watched folder procedure above.
  13. Click OK to save the job.

Controlling ADI jobs

To enable or disable a job

  1. In the Web Client, go to Administration Panel > System management > Import Jobs.
  2. In the List of Import Jobs, click Enable or Disable next to the job name.

To view job details

  1. In the Web Client, go to Administration Panel > System management > Import Jobs.
  2. In the List of Import Jobs, click the name of the job to edit.
  3. In the Summary of job page, click View Details. In the Details of Job page, a list of the files that were processed is shown:
  • The document name, schema type, source location, destination folder, date the file was added to the queue, and the date the import was completed is displayed for each document.
  • The status of pending, completed, or error is displayed. In the case of an error, this indicates that the import failed for that document and it will need to be re-added to the queue.
  • Click Download as CSV to download the job details as a CSV file.
  • To view the details of a specific document, click the document name. In the Details of <file name> Document screen, the metadata fields and summary for the document are shown. In the case of an error, the Error Log message is displayed. Where the status of a document is “completed”, click Go to Document to view the document in the library.
  • To reload the details for the document from the import folder, click Re-process Document. Ensure that the issue that caused the error has been corrected prior to attempting to reprocess the document. Click Previous or Next to move to the previous or next document in the details list. Click Return to Job Details to return to the previous screen.
  • To clear the details of the successfully completed documents, click Clear Completed.
  • To clear the details of unsuccessfully imported documents, click Clear Errors.
  • To reprocess all the documents that generated errors, click Re-process Errors. In the Reprocess Errors of <job name>, select the documents to be reprocessed. Ensure that the issue that caused the error has been corrected prior to attempting to reprocess the documents. The document details are reloaded into the queue from the import folder and reprocessed. If the documents were able to be processed, they will have a status of “completed” in the job details. If the documents were not able to be processed, they will have a status of “error” in the job details.
  4. In the Details of Job page, click Return to Summary to return to the Job Summary page.
  5. In the Job Summary page, click Return to List to return to the List of Import Jobs.

Resetting a job marks all pending documents and documents with errors as deleted in the queue. The local import folder will be rescanned, requeuing the pending and error documents and any new documents in the folder. FTP folders will not be scanned. This rescan is independent of the "Automatically add new files to the queue" setting.

To reset a job

  1. In the Web Client, go to Administration Panel > System management > Import Jobs.
  2. Select the job from the list.
  3. In the Summary of the job page, review if there are any errors. If present, click Reset Job.
  4. The message “Are you sure you want to reset this import job? All pending and failed documents will be removed and the import folder will be rescanned.” is displayed. Click OK to reset the job.

To delete a job

  1. In the Web Client, go to Administration Panel > System management > Import Jobs.
  2. In the List of Import Jobs, click the name of the job to delete.
  3. In the Summary of job page, click Delete Job.
  4. At the message prompt, click OK.

To edit a job

  1. In the Web Client, go to Administration Panel > System management > Import Jobs.
  2. In the List of Import Jobs, click the name of the job to edit.
  3. In the Summary of job page, click Edit Job.
  4. Make the job changes and click OK.

If the “Automatically add new files to the queue” option is not enabled for the job, the job must be run manually for a watched folder.

To manually run a job on a watched folder

  1. In the Web Client, go to Administration Panel > System management > Import Jobs.
  2. In the List of Import Jobs, click the name of the job to run.
  3. In the Summary of job page, click Watch Now. Any files in the source folder are added to the queue for processing.

If you manually watch a folder, the documents in that folder will be immediately added to the job queue, but they will only be processed when the "FH process ADI job" scheduled task runs. By default this is every 10 minutes. If you are testing a job, you can run the Windows task manually to speed up the testing process.

 

Using indirect metadata in an import job

For a Watched Folder type import job, a text delimited file, such as a CSV, that contains the schema, full path and document name, and metadata fields and values, can be used to define the values that populate the metadata fields. This allows you to import documents that have metadata values that vary from document to document. Without indirect metadata, the metadata values set in the import job are static, or extraction rules can be used.

The option "Use Indirect Metadata" is available in the import job. When selected, the file extension (typically csv), field delimiter (typically a comma or semicolon) and the value delimiter (typically a comma or semicolon) which is used for multiple selection type metadata fields. Note that the field delimiter and the value delimiter cannot be the same. Field and value delimiters can be any Unicode character

Offline documents can be added with ADI using an API-based import or the indirect metadata method. For the indirect metadata method, the schema listed in the text delimited file must be an offline document schema.

Document versions can be imported via ADI. The versions of a document that need to be associated with each other must be defined in the delimited file along with the document in order for the system to connect the versions together. The association can be made from a metadata ID, document version ID, document ID, external ID using a quick search, or an internal ID. It is not possible to change the document schema or metadata fields when importing a version record.

An auto-filing template or script can also be used when the indirect metadata option is enabled. The auto-filing template or script can vary automatically according to the document schema or it can be set the same for all documents imported in the job.

The following table describes the columns used for indirect metadata import. When creating the delimited file, the order of the columns does not matter.

Column heading – Description

adiImportType

(ImportType must be used in versions less than 16)

*required field

Use one of the following:

  • Document – Used for importing the first or only version of a document. When additional versions will be included in the same indirect file, set the adiVersionKeyType to InternalId and use the same InternalId GUID for this and every version of the same document.
  • Version (version 16 and higher) – Used for importing one or more additional versions of an existing document. The adiVersionKeyType and adiVersionSequence columns must be configured with this import type. It is possible to import all versions of a document in a single indirect file when the Document and Version import types are used together.

adiImportFilename

(ImportFilename must be used in versions less than 16)

*required field

The full path and name of the document to import.

In the case of offline documents, use the document name only (not the path).

The FH_Service account must have full control of this directory. This field is mandatory.

adiDocumentSchema

(DocumentSchema must be used in versions less than 16)

*required field

The name of the document schema that should be assigned. This field is mandatory when the adiImportType is set to Document and must be blank when the import type is set to Version.
adiDocumentName

The document name if different than the import filename. In the case of an offline file, the import filename will continue to be used for the original filename and document name if the new field is not provided. If this new field is provided, the import filename will only be used as the original filename. This is an optional field.

adiOwner

Overrides the user set as document owner in the job for each document version. It is supplied as a user GUID. This is an optional field.

adiVersionKeyType

adiVersionKey

adiQuickSearchName

The version key is used to determine how a version record will be associated as a new version of a document. The adiVersionKeyType and adiVersionKey columns are mandatory for version records and mandatory for a document record that will have subsequent versions in the same delimited file when using InternalId. The adiQuickSearchName is mandatory when the version key type is ExternalId.

In order to import a file as a version of a document, there needs to be some way to associate the version record with the document that it will become a version of. There are generally two cases: the document and version records are all contained in the same delimited file, or the document already exists in FileHold and only the version records appear in the delimited file. The first case can use an identifier that is unique among all versions of the document and repeat this value in the delimited file for each record. For the second case, the version records in the delimited file will need a reference to the document in FileHold.

There are five possible values for the version key type. Only the first is used when all document versions are in the indirect file. The last four assume at least one version of the document already exists in FileHold. The primary difference between them is that ExternalId is a universal approach that does not require any information internal to FileHold. The other three all require information taken from the FileHold API or database or otherwise extracted from the system. They all share a performance advantage over ExternalId.

  • InternalId – The adiVersionKey field value will be interpreted as an id unique in the indirect file. This id must have previously been defined with a Document import type record in the same indirect file. The unique id must be in the format of a GUID. The scope of this value is limited to the delimited file.
  • ExternalId –  The adiVersionKey field value will be interpreted as a parameter to a quick search. When this version key type is used, a public quick search must also be specified in the adiQuickSearchName column. The quick search will be executed using the external id as a parameter in order to find the latest version of the document already in FileHold.
  • MetadataId – The adiVersionKey field value will be interpreted as the metadata id of an existing document in FileHold.
  • DocumentVersionId – The adiVersionKey field value will be interpreted as the document version id of an existing document in FileHold.
  • DocumentId – The adiVersionKey field value will be interpreted as the document id of an existing document in FileHold.

Metadata field values are ignored when importing a version. The values for the associated document are used for the version.

adiVersionSequence

This is used to help ensure that versions are added in a specific order. The value starts at 1 and increases by 1 for each version. This sequence number is not related to the FileHold version.

For example, if there are three version records for the same document in a delimited file, sequence numbers 1, 2, and 3 are expected in that order. If there are three version records for the same document in three delimited files, the value 1 is expected for each. The versions are added in the order the delimited files are added to the queue. The version sequence is not required with a document record as this is always the first version of a document. Any related version records in the same file must start with sequence number 1. This is a mandatory column for a version record.

adiCreatedDate

Provides a way to override the created date/time for the document version. This does not affect the last modified date. So, if a document is added on Jun 1 and this field value is provided as Feb 15, the created date will be Feb 15, but the last modified date will be Jun 1. This also does not affect the action date for the Add Document action, which will be the actual date. The usage log will include a note/column that indicates the actual create date was overridden.

The ability to use this field depends on a system administration setting “Allow document version create date to be overridden.” By default, this is disabled. Changing this value will create an entry in the system audit log and cause a confirmation prompt: “Compliance or regulatory requirements may require this option to remain disabled. Confirm that you would like to enable setting arbitrary create dates on document versions.”

This is an optional column. When this column is not present, the create date will be the date the document is added with ADI.

adiApprovalStatus

Sets the approval status for the document version using the normal enumeration values. The only valid values are Not submitted for approval, Approved, and Not approved. This is an optional column. When this column is not present, the approval status will be not submitted for approval.

adiVCN

The version control number for the document version. This is an optional column.

adiDCN

The document control number for the document version. This is an optional column.
metadata field

The name of a metadata field when the import type is set to Document. There must be one for every required field in the document schema. Optional fields can be added as needed.

Metadata field names and drop down list values must exactly match the configuration in FileHold, including case. If you notice a blank field after you import documents, a misspelled field name in the indirect file may be the cause.

The indirect file and files for import do not need to be in the same directory location when being imported, but the designated FileHold service account must have full control.

A sample indirect file is shown below as displayed in Microsoft Excel®. There is one document and one version being imported.

  • adiImportType - Set to Document or Version
  • adiImportFilename - The full path and name of the document to import.
  • adiDocumentName - Document name used to rename the original filename.
  • adiOwner - The GUID of the owner.
  • adiVersionKeyType - The method used to determine how the document is associated with the version.
  • adiVersionKey - The unique ID that links the document with the version. The number must be the same for both document and version.
  • adiVersionSequence - The order in which the versions are added. Must start at 1.
  • adiDocumentSchema - The name of the document schema being used.
  • Invoice number - The metadata field name.
  • Total - The metadata field name.
  • Invoice Date - The metadata field name.
  • Vendor - The metadata field name.
[Image: ADI version key in csv file]
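The screenshot itself is not reproduced here. Expressed as plain text with placeholder values (the paths, GUID, schema name, and metadata values are illustrative only), a comparable delimited file could look like this:

adiImportType,adiImportFilename,adiDocumentName,adiOwner,adiVersionKeyType,adiVersionKey,adiVersionSequence,adiDocumentSchema,Invoice number,Total,Invoice Date,Vendor
Document,\\localhost\C$\importfiles\invoice-1001-v1.pdf,Invoice 1001,,InternalId,3f2504e0-4f89-11d3-9a0c-0305e82c3301,,Invoice,1001,150.00,2020-10-03,Acme Supplies
Version,\\localhost\C$\importfiles\invoice-1001-v2.pdf,,,InternalId,3f2504e0-4f89-11d3-9a0c-0305e82c3301,1,,,,,

Both rows share the same InternalId GUID, which links the Version row to the Document row. The schema and metadata values are left blank on the Version row because they are taken from the associated document.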

Arbitrary Unicode characters can be used as delimiters by prefixing a decimal Unicode value with a backslash. The most common delimiter characters will come from the ASCII punctuation symbols. A tab character can be expressed as \9, a vertical tab as \11, a space as \32, and the \ (backslash) character as \92. A complete list of values is available on the Unicode Consortium website.
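For example (a hypothetical configuration with placeholder path and values), to use the pipe character as the field delimiter you could enter either | or \124 in the Field delimiter setting, keeping a semicolon as the value delimiter, and the indirect file rows would then look like this:

adiImportType|adiImportFilename|adiDocumentSchema|Property|Applicable codes
Document|\\localhost\C$\importfiles\form-property-412-20201004.pdf|Property|PT-550412|ABN;QUE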

A second sample indirect file is below as plain text. There is one document being imported including a dropdown field with multiple values.

  • adiDocumentName - Left blank, so the base filename will be used as the document name.
  • Property - Text metadata field.
  • Applicable codes - Dropdown menu field with allow multiple selections enabled. Three selections have been included in the value using the default value delimiter, a semicolon.
adiImportType,adiImportFilename,adiDocumentName,adiDocumentSchema,Property,Applicable codes
Document,\\localhost\C$\importfiles\form-property-411-20201003.pdf,,Property,PT-550411,ABN;MON;QUE
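A version record can also be associated with a document that already exists in FileHold by using the ExternalId key type together with a public quick search. In the following hypothetical fragment, “Find property by number” is an assumed public quick search that takes the Property value as its parameter, and the path is a placeholder:

adiImportType,adiImportFilename,adiDocumentSchema,adiVersionKeyType,adiVersionKey,adiQuickSearchName,adiVersionSequence
Version,\\localhost\C$\importfiles\form-property-411-20201101.pdf,,ExternalId,PT-550411,Find property by number,1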

Special field type conditions with indirect metadata

Date fields

The following short date formats are supported in the delimited file with ADI:

  • mm-dd-yyyy
  • mm/dd/yyyy
  • yyyy-mm-dd
  • yyyy/mm/dd

Once the date metadata value is imported into the library, the date format will match the format set in the date metadata field properties in FileHold. For example, if the date format in the delimited file is mm-dd-yyyy and the date format in the Date metadata field is ddmmyyyy, the format for the date field in the metadata pane is ddmmyyyy. The most universal approach for dates is year, month, day.

Dropdown field

It is possible to define dropdown menu metadata fields to allow duplicate values. The behavior of indirect metadata assignment is undefined when more than one dropdown menu value matches the value in the indirect metadata. As a general rule, creating duplicate values in a dropdown menu is bad practice.

Drilldown field

The value for only a single node can be specified in the indirect metadata. This value will be compared against every node value in the tree to find a match. If the leaf node restriction is defined for the field, the comparison will only be performed on the leaf node. The same problem with duplicate values as for dropdown menus exists for drilldown menus.

Understanding ADI processing

Like many autonomous tasks in FileHold, ADI normally operates as a Windows scheduled task. By default it is set to run every 10 minutes and will internally turn itself off after the maximum number of documents is processed or it reaches its timeout value, whichever comes first. Only one instance of the task should ever run at a time.

The maximum number of documents for each job is set to unlimited by default and the timer is set to 570 seconds. As only one instance of the task should ever run at a time, the Windows scheduled task definition is set to prevent overlapping runs. If the documents are small and can be added quickly, approximately 30 seconds of processing time will be unused at the end of the 10 minute task interval. If the last document processed before the end of the timer is large and requires more than 30 seconds to process, the task will end after its next scheduled time and processing will be idle until the following scheduled time.

FileHold does not guarantee the order or priority for processing jobs. It is possible that one job may have a large number of documents and effectively consume all task time and starve other jobs from loading documents. In this case, the maximum number of documents can be set to throttle one or more jobs and allow other jobs to process. Since the task is running every 10 minutes, any changes to the job definition will take effect at the next 10 minute interval. This allows you to modify the maximum number of documents or disable a job to accommodate changes in volume, for example.

If you are processing a very large number of documents, such as a migration from a legacy system, you may want to make some additional system adjustments to complete the import as quickly as possible. These adjustments must be made with the help of a Windows administrator. The first adjustment is simple: disable all jobs except the migration job and set the maximum number of documents to unlimited. Adjust the maximum duration of the import task to a value appropriate to the volume of documents. The value is in seconds, but it can be set to run for multiple days as needed. Disable the import task in the task scheduler and run it manually from the command line.

fileholdadm /lmprocessimportationjobs

Depending on the expected duration, you may wish to temporarily disable IIS recycling as this can interrupt the task. Recycling is an optional maintenance feature in IIS, but it is typically set to run every day. To improve performance you may also wish to disable the full text search indexing task. This will mean new documents will not be added to the full text index, but they will be queued to be added later. If server side OCR is enabled, disable that task also.

If you are performing a mass document migration from a legacy system, you are likely using the indirect import option to capture metadata. Before running a job with all your documents, it is advisable to test your import data to make sure it matches the configuration in FileHold. Common problems include missing or misspelled dropdown fields or missing data in required fields. The easiest way to test is on your test FileHold server. Prepare your production configuration and make sure it is loaded on the test server. Prepare ADI as above and run the task manually from the command line. Look to the Windows FileHold event log for warning messages relating to the format of the delimited file. Look to the job details for errors related to mismatched data.

ADI tracks the name of imported files and it will not import the same files twice. It compares the full path of the file with the full path of previously queued files to determine if they are the same. If you attempt to import a file that was previously queued for import it will be silently ignored. 

If a large number of documents is being imported using ADI (such as during a document migration), there are some settings and a scheduled task that assist in the "clean up" of the SQL table that tracks ADI processing. This table needs cleaning periodically to ensure that ADI runs at optimum speeds. In the web config file in C:\Program Files\FileHold Systems\Application Server\LibraryManager, the following settings can be modified for the duration of the migration and returned to default values for normal day to day use (a sample configuration is shown after the list):

  • ImportationJobDeleteProcessedEntriesAfterPeriodEnabled – Removes entries in the SQL table after a specified period of time (ImportationJobDeleteProcessedEntriesAfterPeriod). The default value is "true".
  • ImportationJobDeleteProcessedEntriesAfterPeriod – The amount of time before a record in the table is cleaned up, specified as days; hours; minutes; seconds. For example, "1;12;0;0" would be a clean up after a day and a half. So, if the record was added at 11 am yesterday, it is deleted the first time the scheduled task runs after 11 pm today. The default value is 20 days.
  • ImportationJobDeleteProcessedEntriesNumberOfEntries – The number of entries in the SQL table that are eligible for removal via the scheduled task. The default value is 10000. If there is no entry in the web config file for this setting, the default is 100.
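A minimal sketch of how these keys could appear in the appSettings section of the LibraryManager web config file, shown with the documented default values; confirm the exact entries against your installed file, and note that the period value follows the days; hours; minutes; seconds pattern described above:

<!-- Remove processed ADI queue entries after the period below -->
<add key="ImportationJobDeleteProcessedEntriesAfterPeriodEnabled" value="true" />
<!-- Clean up processed entries after 20 days (days;hours;minutes;seconds) -->
<add key="ImportationJobDeleteProcessedEntriesAfterPeriod" value="20;0;0;0" />
<!-- Number of entries eligible for removal via the scheduled task -->
<add key="ImportationJobDeleteProcessedEntriesNumberOfEntries" value="10000" />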

The scheduled task "FH cleanup ADI job" runs once per day (default is 1 am) that cleans up the SQL table according to the configured settings above.