Improving search results for casual users
FileHold provides extensive mechanisms for searching documents using document contents and system and custom-defined fields, while automatically overlaying document permissions, library types, and history.
Making the best use of all these search facilities requires some knowledge and training, which may not be practical or possible for cases like corporate or public portals. Since the beginning, FileHold has provided a “simple search” mechanism which accepts search criteria like an internet search engine.
For example, typing the words apple and pie into the simple search bar will return all documents where apple and pie appear anywhere in the document contents or metadata. Enclosing them in quotes like “apple pie” will only find the documents where the two words are next to each other. As with a Google search, there are also little tricks, such as putting a minus (-) character in front of pie to return documents that contain apple but not pie.
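The three behaviors just described (all words, quoted phrase, minus exclusion) can be sketched in a few lines of Python. This is only an illustration of the query semantics, not FileHold's parser; the names SimpleQuery, parse_simple_query, and matches are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Optional leading minus, then either a quoted phrase or a bare word.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


@dataclass
class SimpleQuery:
    # Each entry is a tuple of words that must be adjacent, in order.
    required: list = field(default_factory=list)
    excluded: list = field(default_factory=list)


def words_of(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def parse_simple_query(text):
    """Split a simple search string into required and excluded terms."""
    # Treat curly quotes the same as straight quotes.
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    query = SimpleQuery()
    for minus, phrase, word in TOKEN.findall(text):
        term = tuple(words_of(phrase or word))
        if not term:
            continue  # nothing searchable in this token
        (query.excluded if minus else query.required).append(term)
    return query


def contains(doc_words, term):
    """True if the words of term appear next to each other in doc_words."""
    n = len(term)
    return any(tuple(doc_words[i:i + n]) == term
               for i in range(len(doc_words) - n + 1))


def matches(query, document_text):
    """Apply the query to one document's text."""
    doc_words = words_of(document_text)
    return (all(contains(doc_words, t) for t in query.required)
            and not any(contains(doc_words, t) for t in query.excluded))
```

With this sketch, the query apple pie matches a document containing "apple and cherry pie", the quoted form “apple pie” does not, and apple -pie matches only documents that mention apple without pie.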
While the simple search does make it easy to form a working query, it may produce unexpected results for any given repository, particularly if there are a large number of documents. An important aspect of searching within full text is that it is a two-phase process. In phase one, the query is passed to the full text engine, which searches the entire index, including old versions of documents, documents in the archive, and all documents without regard to permission. Phase two applies criteria that are stored in the database such as permission, location, etc.
The first problem might be the volume of results and their respective relevance. If there are a large number of documents that the casual user has access to, they might find themselves paging through lists of documents when their query is complete. The second problem is the sheer volume of full text matches and its impact on performance. Even if the user has permission to very few documents, the full text results may produce a large list that then needs to be processed by the database engine. This can unnecessarily impact performance for all users while the extra documents are weeded out of the results.
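The two phases, and the wasted work between them, can be pictured with a toy model. This is a sketch under stated assumptions rather than a description of FileHold internals: IndexedVersion, full_text_phase, and database_phase are hypothetical names, and the second phase here assumes a search that keeps only the latest version and checks group permission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedVersion:
    doc_id: int
    text: str
    is_latest: bool            # False for an old version of the document
    allowed_groups: frozenset  # groups with permission to the document


def full_text_phase(index, word):
    """Phase one: match on content only, across the entire index."""
    return [v for v in index if word in v.text.lower().split()]


def database_phase(candidates, user_group):
    """Phase two: apply database criteria (assumed: latest version, permission)."""
    return [v for v in candidates
            if v.is_latest and user_group in v.allowed_groups]


index = [
    IndexedVersion(1, "apple pie recipe", True, frozenset({"Portal"})),
    IndexedVersion(1, "apple tart recipe", False, frozenset({"Portal"})),
    IndexedVersion(2, "apple supplier contract", True, frozenset({"Legal"})),
    IndexedVersion(3, "cherry pie recipe", True, frozenset({"Portal"})),
]

candidates = full_text_phase(index, "apple")
results = database_phase(candidates, "Portal")
print(len(candidates), "full text matches,", len(results), "returned to the user")
```

Every candidate that phase two discards is work the database still had to do, which is why a broad full text match can be costly even for a user who can see very few documents.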
The good news is that you as a FileHold administrator can create a bespoke search experience for users, and they do not even need to know you did it. The first step is to disable ad hoc searching. This is done by FileHold group, so it can be fine-tuned to the users that you do not want to have arbitrary access. When ad hoc searching is turned off, the simple search box only works with quick searches, and saved searches of all types are the only search options available to those users.
The nice thing about a quick search is that you can provide the full text searching capability but add restrictions that the user does not know about. These restrictions can limit the documents that will be considered in the search and add additional criteria like a custom metadata field value.
The following examples show features that require at least version 17.0 of FileHold.
Prefilter search set based on folder or document schema
By default, when a folder or document schema is included in search criteria, a list of all documents contained in the folder or document schema will be sent to the full text search engine along with the other criteria. This list will effectively reduce the number of documents that get searched. If you have ten million documents in your system, but only twenty thousand for a particular document schema, only those twenty thousand documents will get searched. This can dramatically reduce the search time especially if common words or wildcards are used in the search criteria.
In this example we will add the document schema named Policy.
Save this as a saved quick search and assign permissions to the target user group. They will see this in their saved searches, and it will be available from the simple search bar.
By default, the maximum number of prefiltered documents is one hundred thousand which should work well in most circumstances.
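The effect of the prefilter can be illustrated with a small model that counts how many documents the full text step has to examine. This is a hedged sketch, not FileHold code: the search function, the index and schema_of dictionaries, and the sample data are all hypothetical, and the one hundred thousand document limit is not modeled.

```python
def search(index, word, allowed_ids=None):
    """Return (sorted matching ids, number of documents examined).

    index maps document id to text. When allowed_ids is given, only
    those documents are examined, which models the prefilter list.
    """
    if allowed_ids is None:
        candidate_ids = set(index)
    else:
        candidate_ids = set(index) & set(allowed_ids)
    hits = sorted(doc_id for doc_id in candidate_ids
                  if word in index[doc_id].lower().split())
    return hits, len(candidate_ids)


index = {
    1: "Annual leave policy",
    2: "Invoice for office chairs",
    3: "Parental leave policy",
    4: "Email asking about leave dates",
    5: "Purchase order for printers",
}
schema_of = {1: "Policy", 2: "Invoice", 3: "Policy", 4: "Email", 5: "Order"}

# The prefilter list: every document that uses the Policy schema.
policy_ids = {doc_id for doc_id, schema in schema_of.items() if schema == "Policy"}

all_hits, all_examined = search(index, "leave")
policy_hits, policy_examined = search(index, "leave", allowed_ids=policy_ids)
print(all_examined, "documents examined without the prefilter")
print(policy_examined, "documents examined with the prefilter")
```

The same ratio is what matters at scale: the fewer documents on the prefilter list relative to the whole index, the less work a common word or wildcard creates.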
Using metadata to limit full text results
The prefilter approach to getting the most relevant results might fit some circumstances, but not all. Perhaps the search is intended to provide a focus on departmental policy documents. While users may have permission to all organizational policies, they would typically like to search for those policies related to their department.
In this case we can update the previous Policy search and create one for each department. If we assume each department has its own group, we can easily add the correct departmental group as a member of each search.
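As a rough picture of that setup, the sketch below models one quick search per department, each combining the Policy schema, a metadata criterion, and a departmental group as its member. Everything named here is hypothetical, including the QuickSearch class, the "Department" field, and the "HR Users" style group names; in FileHold these searches are configured by the administrator, not written as code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuickSearch:
    name: str
    schema: str           # only documents of this schema are searched
    metadata: tuple       # (field name, required value)
    member_groups: tuple  # groups that can see and run the search


def department_searches(departments):
    """Build one Policy quick search per department."""
    return [
        QuickSearch(
            name=f"{dept} policies",
            schema="Policy",
            metadata=("Department", dept),
            member_groups=(f"{dept} Users",),
        )
        for dept in departments
    ]


def searches_for(group, searches):
    """Names of the quick searches a given group would see."""
    return [s.name for s in searches if group in s.member_groups]


def run(search, documents, word):
    """Full text match restricted by schema and metadata value."""
    field_name, value = search.metadata
    return sorted(
        d["id"] for d in documents
        if d["schema"] == search.schema
        and d["fields"].get(field_name) == value
        and word in d["text"].lower().split()
    )
```

A user in the hypothetical "HR Users" group would see only "HR policies", and any words typed into that search would be matched only against Policy documents whose Department field is HR.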
You can extend the concept to include multiple quick searches that are easily selectable from the library panel and the simple search bar. You can even mix and match full text with all other search types.
You can learn more about searching in our knowledgebase. Start from Searching for documents.
Russ Beinder is the Chief Technology Officer at FileHold. He is an entrepreneur, a seasoned business analyst, computer technologist and a certified Project Management Professional (PMP). For over 35 years he has used computer technology to help organizations solve business problems.