Automatically Extracting Metadata from the Document Properties of any Document Type
The properties of a document can be automatically extracted into the metadata fields of a defined schema when an extraction rule for that file type is configured. Since all document types have properties, you can extract metadata from any type of document. This is useful for file types such as images, where you can extract information such as the picture size, camera type, exposure time, and resolution directly from the file.
The properties are taken from the Details tab of the file's Properties dialog, which can be viewed from Microsoft Windows Explorer. The available properties vary by document type and by operating system, such as Windows XP or Windows 7. The example below shows some of the document properties of an image in Windows Explorer in Windows 7.
You can create an extraction rule for each type of document that you want to extract values from. For example, you can set separate rules for docx, xlsx, pdf, jpg, tiff, and so on. You can create several extraction rules per file extension; however, only one extraction rule per file extension can be enabled at a time. The example below shows a rule created for photographs with the jpg file type.
When creating a document properties rule, you will need to select a document "template". A document template is simply any document of the file type that you want to extract metadata from. The template determines the type of document property extraction rule that is created, such as docx, xlsx, pdf, or jpg. For example, to create a jpg file extraction rule, select a jpg file as the template.
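The relationship between templates, file extensions, and enabled rules can be sketched in a few lines of Python. This is only an illustrative model, not FileHold's implementation or API: the names `ExtractionRule`, `rule_from_template`, and `RuleRegistry` are hypothetical, and the sketch assumes that enabling a rule switches off any other rule for the same extension (the product may instead ask you to disable the other rule first).

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExtractionRule:
    """Hypothetical model of a document properties extraction rule."""
    name: str
    extension: str        # file type the rule applies to, e.g. "jpg"
    enabled: bool = False


def rule_from_template(name, template_path):
    """Create a rule whose file type comes from the template's extension."""
    ext = Path(template_path).suffix.lower().lstrip(".")
    if not ext:
        raise ValueError("the template must have a file extension")
    return ExtractionRule(name=name, extension=ext)


class RuleRegistry:
    """Holds many rules per extension but keeps at most one enabled."""

    def __init__(self):
        self.rules = []

    def add(self, rule):
        self.rules.append(rule)

    def enable(self, rule):
        # Assumption: enabling one rule disables the others for that extension.
        for other in self.rules:
            if other.extension == rule.extension:
                other.enabled = False
        rule.enabled = True

    def active_rule(self, extension):
        """Return the enabled rule for an extension, or None if there is none."""
        for rule in self.rules:
            if rule.extension == extension and rule.enabled:
                return rule
        return None
```

In this model, two rules built from jpg templates can both exist in the registry, but `active_rule("jpg")` only ever returns the one that was enabled most recently.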
A document schema is also assigned to the rule, and the schema's metadata fields are mapped to the file properties. In the example below, an extraction rule was created for the jpg image file type using the Photographs schema.
When a document with that schema is added to the document management system, the mapped file properties are extracted automatically. In the example below, a jpg document was added to the system using the Photographs schema and the mapped metadata was extracted automatically.
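Separately from the screen captures, the mapping step itself can be pictured as a simple lookup from file property names to schema field names. The Python sketch below is an assumption-laden illustration, not the product's code: the function `extract_metadata`, the property names, the field names, and the sample values are all made up for the example.

```python
def extract_metadata(file_properties, field_mapping):
    """Copy mapped file properties into schema metadata fields.

    file_properties: property name -> value, as read from the file.
    field_mapping:   property name -> schema metadata field name.
    Properties that are unmapped or missing from the file are skipped.
    """
    return {
        schema_field: file_properties[prop_name]
        for prop_name, schema_field in field_mapping.items()
        if prop_name in file_properties
    }


# Hypothetical mapping for a Photographs schema (names are illustrative only).
PHOTO_MAPPING = {
    "Camera model": "Camera Type",
    "Exposure time": "Exposure",
    "Dimensions": "Picture Size",
}

# Made-up property values standing in for what a jpg file might report.
sample_properties = {
    "Camera model": "ExampleCam 100",
    "Exposure time": "1/60 sec.",
    "Title": "Beach",
}

metadata = extract_metadata(sample_properties, PHOTO_MAPPING)
```

Here `metadata` contains only the two mapped properties that the sample file actually has; "Title" is ignored because no schema field is mapped to it, and "Picture Size" is left out because the file reports no "Dimensions" value.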
To find out more information on how to purchase FileHold 12, please contact [email protected] or fill out the information request form.
P.S. You may have noticed from the screen captures that there are some other new extraction rules. We will be reviewing those too so check back soon!