Full-text searching with IFilters

30 Jul 2005
Indexing Server, SQL Server, Windows SharePoint Services, SharePoint Portal Server, Exchange Server and Windows Desktop Search provide full-text search capabilities. Each uses so-called IFilter components to index content and then lets clients search the index.

Introduction

The nineties were all about creating and sharing information. Today's challenge is finding the information you need when you need it. We all know the pain of not being able to find the one piece of information that would help with the task at hand. As a result we either spend a lot of time searching or, if we cannot find it, spend a lot of time working through the task by trial and error until we figure out how to do it. Microsoft products like Indexing Server, Exchange Server, SharePoint Server, SQL Server and Windows Desktop Search provide powerful full-text search capabilities. All of these products share a common building block for their full-text searching: IFilters.

All Microsoft full-text search engines index the actual content and then run searches against these indexes. The indexing process determines the file type associated with the content and then invokes the associated IFilter. The COM object which implements the IFilter encapsulates the knowledge of the content structure and extracts the text and properties that get indexed. If a third-party ISV has proprietary content which should be searchable by these Microsoft products, the ISV needs to create an appropriate IFilter COM object. As soon as this IFilter is registered it can be used by all Microsoft full-text search engines. This greatly simplifies the process for ISVs of making their content searchable across the different Microsoft products.

How do IFilters get associated with the different file/content types?

Any content searched has a "file extension" associated with it. Indexing Server, SharePoint and Windows Desktop Search are used to index and search files on the file system. Exchange Server, SharePoint and SQL Server can have files embedded which again have a file extension. All other fields in SQL Server are assumed to be in text format and are therefore treated as having the ".txt" extension. Messages in Exchange Server are also treated as ".txt". The registry is therefore the natural place to associate IFilters with each file extension. The indexing process first determines the file extension of the content. Then it performs the following steps:

  • Step 1: Determine if there is a PersistentHandler associated with the file extension. This can be found in the registry under HKEY_LOCAL_MACHINE\Software\Classes\FileExtension, e.g. HKLM\Software\Classes\.htm. The default value of the sub key called PersistentHandler gives you the GUID of the PersistentHandler. If it is present, skip to step four; otherwise continue with step two.
  • Step 2: Determine the CLSID associated with the file extension. Take the default value which is associated with the extension, for example "htmlfile" for the key HKLM\Software\Classes\.htm. Next search for that entry, e.g. "htmlfile", under HKLM\Software\Classes. The default value of its sub key CLSID contains the CLSID associated with that file extension.
  • Step 3: Next search for that CLSID under HKLM\Software\Classes\CLSID. The default value of the sub key called PersistentHandler gives you the GUID of the PersistentHandler.
  • Step 4: Search for that GUID under HKLM\Software\Classes\CLSID. Under it you will find a sub key PersistentAddinsRegistered which always has a sub key {89BCB740-6119-101A-BCB7-00DD010655AF} (this is the GUID of the IFilter interface). The default value of this key holds the GUID (CLSID) of the IFilter implementation registered for this PersistentHandler.
  • Step 5: Search for this GUID once more under HKLM\Software\Classes\CLSID. Under its key you will find the InProcServer32 sub key and its default value contains the name of the DLL which provides the IFilter interface to use for this extension. For example for the .htm and .html extension this is the DLL nlhtml.dll.
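The five steps above can be sketched in a few lines of Python. This is only an illustration of the lookup order, not how any of the Microsoft products implement it. To keep the sketch runnable anywhere it walks an in-memory dictionary instead of the real registry: each entry maps a key path to the default value of that key. The helper names, the ".abc" extension and all GUIDs except the IFilter interface GUID are made up for the example.

```python
IFILTER_IID = "{89BCB740-6119-101A-BCB7-00DD010655AF}"  # GUID of the IFilter interface
CLASSES = r"HKLM\Software\Classes"


def key(*parts):
    """Join key names into one lower-cased path (registry key names are case-insensitive)."""
    return "\\".join(parts).lower()


def find_ifilter_dll(registry, extension):
    """Follow steps 1 to 5 and return the IFilter DLL for the extension, or None.

    registry maps a lower-cased key path to the default value of that key.
    """
    # Step 1: PersistentHandler sub key directly under the file extension.
    handler = registry.get(key(CLASSES, extension, "PersistentHandler"))
    if handler is None:
        # Step 2: extension -> document type (e.g. "htmlfile") -> CLSID.
        doc_type = registry.get(key(CLASSES, extension))
        clsid = registry.get(key(CLASSES, doc_type, "CLSID")) if doc_type else None
        # Step 3: CLSID -> PersistentHandler.
        if clsid:
            handler = registry.get(key(CLASSES, "CLSID", clsid, "PersistentHandler"))
    if handler is None:
        return None  # no PersistentHandler, so the content is not searchable
    # Step 4: PersistentHandler -> GUID registered for the IFilter interface.
    filter_clsid = registry.get(
        key(CLASSES, "CLSID", handler, "PersistentAddinsRegistered", IFILTER_IID))
    if filter_clsid is None:
        return None
    # Step 5: that GUID -> InProcServer32 -> name of the DLL.
    return registry.get(key(CLASSES, "CLSID", filter_clsid, "InProcServer32"))


# Hypothetical sample data; the GUIDs below are placeholders, not real registrations.
HANDLER = "{00000000-0000-0000-0000-000000000001}"
FILTER = "{00000000-0000-0000-0000-000000000002}"
DOC_CLSID = "{00000000-0000-0000-0000-000000000003}"

SAMPLE_REGISTRY = {
    key(CLASSES, ".htm", "PersistentHandler"): HANDLER,           # found in step 1
    key(CLASSES, ".abc"): "abcfile",                              # needs steps 2 and 3
    key(CLASSES, "abcfile", "CLSID"): DOC_CLSID,
    key(CLASSES, "CLSID", DOC_CLSID, "PersistentHandler"): HANDLER,
    key(CLASSES, "CLSID", HANDLER, "PersistentAddinsRegistered", IFILTER_IID): FILTER,
    key(CLASSES, "CLSID", FILTER, "InProcServer32"): "nlhtml.dll",
    key(CLASSES, ".chm"): "chm.file",                             # no handler at all
}
```

With this sample data, find_ifilter_dll(SAMPLE_REGISTRY, ".htm") resolves in step 1, ".abc" goes through steps 2 and 3, and ".chm" returns None because no PersistentHandler can be found. On a real machine the dictionary lookups would be replaced by registry reads.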

The following article provides a more detailed description with examples of how the IFilter DLL is found. For more information about the PersistentHandler refer to this article.

How to create your own IFilter component?

ISVs that register their own file extensions with proprietary content structures need to provide their own IFilter components so that these file types can be searched by Microsoft products. The Platform SDK describes the IFilter interface in detail. The Platform SDK also contains three sample IFilter implementations.

The three IFilter components used in this article

The "Channel9 Wiki" lists the IFilter components which are present out of the box. Please note that a number of software packages install additional IFilter components. It also provides links to a number of additional IFilter components available. Windows Desktop Search has its own site for additional IFilter components. The rest of this article explains how the full-text search in Indexing Server, SQL Server, Windows Desktop Search and SharePoint works. It also documents any additional settings you need to make in order for new IFilter components to work. The three IFilter components used are:

  • CHM file extension - The CHM extension is used by compiled Windows help files. Out of the box CHM files have no PersistentHandler associated so they are not searchable. The installer places and registers one DLL - CHMIFilter.dll.
  • ZIP file extension - ZIP files are not searchable out of the box because they have no PersistentHandler associated. This installer places and registers one DLL - ztvArchFil.dll. The ZIP IFilter made available by Citeknet worked fine with Indexing Server and Windows Desktop Search but I could not get it to work with SQL Server. It places and registers one DLL - ZIPIFilter.dll.
  • XML file extension - The XML file extension has by default a PersistentHandler associated which works fine with Indexing Server and Windows Desktop Search. But the default IFilter did not work with SQL Server. This XML IFilter component works with all three. First you need to extract the file, then copy XMLFilter.dll to the windows\system32 folder and then register it.

You can also download a Filter Explorer from Citeknet. This explorer walks the registry and will list all the IFilter components available. It can also show all the file extensions which have no IFilter associated, meaning they are not searchable. This can be very useful in understanding what content is searchable or what is not. It also simulates the slightly different behavior of the different Microsoft products as to how each will read the registry entries to find the available IFilter components.

Full text search with Indexing Server

The following article describes how you can use Indexing Server to index and search files on the file system. Indexing Server can auto-register filters if they are added to the DLLsToRegister registry value (under HKLM\System\CurrentControlSet\Control\ContentIndex). When the Indexing service starts up it calls DllRegisterServer for each DLL listed. In Windows 2003 and XP this is a multi-string value, so you can edit it through the Windows registry editor. In Windows 2000 this is a binary value. Some filters add themselves to this registry value during registration, e.g. the ZIP IFilter.

A newly registered IFilter takes effect only after the Indexing service has been restarted or an individual Indexing catalog has been restarted (in which case it takes effect only for that catalog). Unregistering an IFilter likewise takes effect only once the Indexing service or an individual Indexing catalog has been restarted. To remove already indexed content you need to start a full rescan.

Full text search with SQL Server

The SQL Server full-text search capability is only available when the "Full-Text Search" component (under Server Components) has been installed. The component is included if you choose a typical or complete installation. Otherwise you need to launch setup again and add the missing component. This adds the "Microsoft Search" service which performs the actual indexing and full-text searching. This service needs to be running in order for the full-text search to work. Any time SQL Server encounters a CONTAINS or FREETEXT predicate or the CONTAINSTABLE or FREETEXTTABLE function, it calls out to the "Microsoft Search" service to perform the actual full-text search against the full-text catalog. The following article provides an overview of the Microsoft Search service.

How do I create a full text search catalog for my SQL Server database?

Open the Enterprise Manager for SQL Server and navigate to your SQL Server database (in the left side navigation pane select a SQL Server Group, next select a registered SQL Server and finally select under the "Databases" entry your SQL Server database). You see a number of entries under your database, one of them is called "Full Text Catalogs". This shows you any full text search catalog defined for your database. Right click on the right side pane or the item "Full Text Catalogs" and select "New Full-Text Catalog" from the popup menu. Give the full text search catalog a name and select where the catalog files get placed which is by default "c:\Program Files\Microsoft SQL Server\MSSQL\FTData". The "Schedules" tab allows you to set one or more schedules when the full text search catalog gets updated, for example perform a full population every Sunday and an incremental population every 10 minutes.

Click on the button "New Catalog Schedule" to define a new schedule. Give the schedule a name and select whether it is enabled or disabled through the "Enabled" check box. Under the job type select whether this schedule performs a full population or an incremental population. A full population rebuilds the search catalog and an incremental population indexes the changes since the last population. Finally you select the schedule, which can be either at startup time of the SQL Server Agent, a specific date and time or a recurring schedule. If you select "Recurring", click on the "Change" button to define the recurrence schedule, for example every 10 minutes or every day at 1:00 AM from July 1st to July 31st.

How do I enable full text searching on one or more fields of a table?

In order to be able to perform a full text search on a table you need to create a full text index on the table. You can only define one full text index per table. Select the "Table" entry in the left side navigation pane to get a list of all tables defined in the database. Right click on a table and select "Full-Text Index Table | Define Full-Text Indexing on a Table" from the popup menu. This will bring up a wizard which allows you to select which fields should be indexed. Tables which are indexed need to have a "unique single column index" which does not allow Nulls. For example a primary key on an ID field which does not allow Nulls or an index on an ID field which has a unique constraint and does not allow Nulls. As the index gets built each index entry points back to the table rows it applies to (through the unique identifier). SQL Server delegates any free text search as part of a query to the Microsoft Search service. The Search service performs the actual free text search and then returns the list of rows to include in the result-set. This is done through the unique identifier associated with each index entry.

The "Full-Text Indexing" wizard guides you through the process of creating a full-text index. First you select the "unique index" to use from a list of all unique indexes present on the table. Next you select the fields to index by checking the checkbox in front of each field. The list shows only fields which can be indexed, which are fields of the following data types: char, nchar, varchar, nvarchar, text, ntext and image. All data types except image are treated as text fields, so the TXT IFilter is used during the indexing process. Fields of the data type image contain file images. When you select a field of the data type image, you also select under "document type column" the field which holds the file type (extension) of the file stored in the image field. During the indexing process the Search service looks at this field to determine the file type stored and which IFilter to apply. For example you may have a field called FileImage of the data type image and a field called FileType of the data type nvarchar. While creating records in the table you would store the file in the field FileImage and the file type in the field FileType, e.g. "zip".

After selecting all the fields to index you select the full-text catalog this index belongs to. You can select from the list of existing catalogs or create a new one. Next you can add a new catalog schedule or table index schedule. Any catalog schedule you add applies to all table indexes in the full-text catalog. A table index schedule you add applies only to that table index, which allows you to create different schedules for different tables. Finishing the wizard applies all the changes, meaning it adds the catalog schedules, creates the table index and creates the table index schedules.
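The same setup can also be scripted instead of clicking through the wizard. The following Python sketch only generates the T-SQL; it does not connect to a server. It relies on the full-text stored procedures sp_fulltext_catalog, sp_fulltext_table and sp_fulltext_column as I recall them from the SQL Server 2000 documentation, so check the parameters against Books Online before running the output. The function names and the index name "PK_DocumentLibrary" are made up for the example, and the database is assumed to be enabled for full-text indexing already.

```python
def quote(value):
    """Quote a value as a T-SQL string literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def full_text_setup_script(catalog, table, key_index, columns, type_columns=None):
    """Return the T-SQL statements which mirror the wizard steps.

    columns      -- names of the fields to index
    type_columns -- optional mapping: image field -> field holding its file extension
    """
    type_columns = type_columns or {}
    statements = [
        # Create the catalog and register the table with its unique index.
        f"EXEC sp_fulltext_catalog {quote(catalog)}, 'create'",
        f"EXEC sp_fulltext_table {quote(table)}, 'create', "
        f"{quote(catalog)}, {quote(key_index)}",
    ]
    for column in columns:
        statement = (f"EXEC sp_fulltext_column @tabname = {quote(table)}, "
                     f"@colname = {quote(column)}, @action = 'add'")
        if column in type_columns:
            # Image fields need the "document type column" so the right IFilter is used.
            statement += f", @type_colname = {quote(type_columns[column])}"
        statements.append(statement)
    # Activate the index and kick off the first full population.
    statements.append(f"EXEC sp_fulltext_table {quote(table)}, 'activate'")
    statements.append(f"EXEC sp_fulltext_catalog {quote(catalog)}, 'start_full'")
    return statements


if __name__ == "__main__":
    script = full_text_setup_script(
        catalog="DocumentLibrary",
        table="DocumentLibrary",
        key_index="PK_DocumentLibrary",  # hypothetical name of the unique index
        columns=["DocumentImage"],
        type_columns={"DocumentImage": "DocumentType"},
    )
    print("\n".join(script))
```

Generating the statements as plain strings keeps the sketch independent of any database driver; the output can be pasted into the SQL Query Analyzer.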

How do I manage existing full-text catalogs?

SQL Server provides a number of options to manage your full-text catalogs. The "Full-Text Catalogs" entry will list all defined catalogs. In the right side pane right click on the catalog to manage. The popup menu will show you a number of options:

  • Rebuild Catalog - rebuilds the catalog which generates a new empty catalog.
  • Start Full Population - starts a full population, which effectively rebuilds the catalog.
  • Start Incremental Population - indexes all changes since last population.
  • Stop Population - stops a running population.
  • Schedules - brings up the list of defined schedules and allows you to change the existing schedules or create new ones.
  • Delete - deletes the catalog with all its table indexes.
  • Properties - shows the properties of the catalog; the "Tables" tab shows all the tables which have an index and are part of this catalog; the "Schedules" tab shows all the schedules defined.

How do I manage table indexes?

You can find out through the full-text catalogs which tables have a table index defined. You then find the appropriate table and right click on it. From the "Full-Text Index Table" popup menu you can select from a number of options:

  • Edit Full-Text Indexing - brings up the "Full-Text Indexing" wizard which lets you edit the full-text index.
  • Remove Full-Text Indexing from a table - removes the table index.
  • Start Full Population - starts a full population, which effectively rebuilds the table index.
  • Start Incremental Population - indexes all changes since last population.
  • Stop Population - stops a running population.
  • Schedules - brings up the list of defined schedules and allows you to change the existing schedules or create new ones.

The attached sample database

The attached SQL Server database named "FullTextSearchSample" illustrates how you can store files in a database and search the file contents through the full-text search engine from SQL Server. It contains a table called DocumentLibrary which has a field DocumentImage of type image and DocumentType of type nvarchar. The attached sample application "Insert Files into Database" provides a Windows Forms application to insert files into a table. You enter the name of the database server and the user credentials to use. Next you enter the name of the table where to insert the file into, the name of the field where to insert the file contents and the name of the field where to insert the file extension. Finally you select the file to insert and click "Insert" which will create a new record in the table and insert the file and file type. You can also achieve this by using the TextCopy utility provided by SQL Server. It is located at "c:\Program Files\Microsoft SQL Server\MSSQL\Binn". Here is an example:

TextCopy /S servername /U username /P password 
/D FullTextSearchSample /T DocumentLibrary
/C DocumentImage /I /F filename /W "where ID = 8" /z

For a complete description of all command line arguments click here. This utility can only update an existing record, and the field must not be NULL; otherwise you will get the error "Text or image pointer and timestamp retrieval failed". After you have added some files to the sample database make sure that the index gets updated. You can start the update manually by right clicking on the "DocumentLibrary" full-text catalog and selecting "Start Full Population" from the popup menu. Next open the "SQL Query Analyzer", log on, select the "FullTextSearchSample" database and run the following query:

SELECT * FROM DocumentLibrary 
   WHERE CONTAINS( DocumentImage, 'Enterprise-Minds')

This will query for all records with files in the DocumentImage field which contain the text 'Enterprise-Minds'. The sample database comes pre-populated with some files, so the query will return two records.
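If the search terms come from user input, the search condition has to be assembled with some care: phrases containing spaces must be wrapped in double quotes inside the CONTAINS condition, and single quotes must be doubled inside the T-SQL string literal. The following Python sketch shows one way to do that. The function names are made up for the example, the table and field names are assumed to be trusted identifiers, and double quotes inside a term are simply dropped rather than escaped.

```python
def contains_condition(terms, operator="AND"):
    """Build a CONTAINS search condition from a list of words or phrases."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be AND or OR")
    quoted = []
    for term in terms:
        # Drop double quotes and surrounding blanks, then wrap the term in quotes.
        cleaned = term.replace('"', "").strip()
        if cleaned:
            quoted.append('"' + cleaned + '"')
    if not quoted:
        raise ValueError("at least one non-empty search term is required")
    return (" " + operator + " ").join(quoted)


def contains_query(table, column, terms, operator="AND"):
    """Return a SELECT statement which searches one full-text indexed field."""
    # Double single quotes so the condition is a valid T-SQL string literal.
    literal = contains_condition(terms, operator).replace("'", "''")
    return f"SELECT * FROM {table} WHERE CONTAINS({column}, '{literal}')"


if __name__ == "__main__":
    print(contains_query("DocumentLibrary", "DocumentImage",
                         ["Enterprise-Minds", "full text"]))
```

For a single term this produces the same query as above, except that the term is wrapped in double quotes.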

A newly registered IFilter takes effect without the need to restart services. Unregistering an IFilter also takes effect without restarting any services. To remove already indexed content you need to start a full population.

Full text search with Windows Desktop Search

Windows Desktop Search is Microsoft's approach to enabling individual users to index and search their personal content. You can download the MSN Search toolbar with Windows Desktop Search from here. Similar products are Google Desktop and Yahoo! Desktop Search. This article covers Windows Desktop Search because it also uses IFilter components; it does not rate or compare any of the desktop search tools mentioned above. The installation of the MSN Search toolbar with Windows Desktop Search is very straightforward. It can also search your personal emails, which requires Outlook 2000 or higher. If Outlook 2000 or higher is not present, the installation will ask whether you want to proceed. You can then still search the content of files, just not emails. Confirm the message with OK to proceed (if the message appears). This starts the installation and registration of all the files, followed by the "MSN Search Toolbar Customization Wizard" which allows you to configure the MSN Search toolbar with Windows Desktop Search.

You can check the option "Use the default settings and close this wizard" if you want to use all default options. Otherwise proceed with the Next button. There are three MSN Search toolbars that you can install: the MSN Search toolbar displayed in Internet Explorer and Windows Explorer, the MSN Search toolbar displayed in Outlook (grayed out if Outlook 2000 or higher is missing) and the MSN Search Deskbar, which is shown in the Windows taskbar. Select which search toolbars you want to enable. Next you can choose whether you want to participate in the "Customer Experience Improvement Program" and whether you want to make msn.com your default internet search. You also have control over which content gets indexed and over the indexing process in general. Next you are presented with the following options:

  • Automatically start Windows Desktop Search - This is recommended as it will automatically start Windows Desktop Search when you log in (the install adds Windows Desktop Search to the Startup group). Otherwise you need to start it manually through the Windows start menu. Under "All Programs" you will find a Windows Desktop Search icon. As soon as Windows Desktop Search is started you will see an icon in the icon tray of the Windows taskbar.
  • What to index - Here you specify what drives, folders and email folders to index. You have the option to index all emails and hard-disks (this indexes only the files and folders you have access to), all emails and files stored under your "My Documents" (which is all your personal content and is the default option) or you can specify which folders and emails to index. If you choose the last option then click the Browse button to see a list of all hard drives, folders and email folders. Select the ones you want to index (this shows only the drives and folders you have access to).
  • Index email attachments - Check this checkbox if you want to index also file attachments (which is recommended and the default).
  • Index new items while on battery power - This is useful to preserve power while on battery and is again the recommended and default selection.

This finishes the configuration wizard and also launches (if selected) this site in your browser. The site shows Windows Desktop Search addins available, for example additional IFilter components. The installation automatically starts the Windows Desktop Search which shows an icon in the icon tray of the Windows taskbar. This starts two processes WindowsSearch.exe and WindowsSearchIndexer.exe. A third process WindowsSearchFilter.exe gets started when an indexing is in progress. You can right click on the icon in the icon tray of the Windows taskbar to see a list of available options:

  • Snooze Indexing 1 Hour - Allows you to pause the indexing process for an hour.
  • Index Now - Windows Desktop Search only indexes when the user is inactive. This allows optimal responsiveness of Windows while the user is active and indexes content while the user is inactive. Through this option you can force an indexing to happen now.
  • Indexing Status - Shows a Windows Form in the lower right corner with the current indexing status. You can snooze the indexing process for up to a day or force an indexing now.
  • Desktop Search options - Brings up the "MSN Search Toolbar Options" dialog which provides a number of settings:
    • General - Here you can set which country/region will be searched if you invoke a web search from the Windows Desktop Search (which by default uses msn.com). You can also specify any other search engine to use, for example http://www.google.com/search?hl=en&q=$w to use Google.
    • Deskbar - Lets you set a number of options for the Windows Deskbar. You can show or hide the Deskbar from here, set the key combination which activates it (same as clicking into the search box on the Deskbar), choose whether to search and display results while you type, show the Go button, etc.
    • Desktop Search - Provides a number of options for what to index and how to run the indexing. These are the same options you were presented with during the installation. Under the sub-item Advanced you can specify which file extensions are handled as text files (for example files with the extension .c or .cs) and which file extensions should be ignored (for example files with the extension .386 or .bak). It also lets you change the location where the index files are placed, which is by default under "c:\Documents and Settings\Username\Local Settings\Application Data\Microsoft". Under this you will find a folder named "Desktop Search" which contains configuration information, temporary folders, log files, etc. Placing it under a user's profile ensures that each user only stores and sees their own information. By checking the option "Prioritize Indexing" you can tell Windows Desktop Search to index while the user is active.
    • Toolbar - Contains a number of options like which icons to show in the toolbars, to enable tabbed browsing, to enable the popup blocker, turn on search result highlighting (which highlights the search term in the search result), etc.
  • Search Now - Brings up the "Windows Desktop Search Result" window which allows users to perform searches for local or web content. It can also be opened through the "Windows Desktop Search" icon under All Programs of the Windows start menu.
  • Exit - Closes Windows Desktop Search which then stops all indexing.

How do I use the Windows Deskbar to search local content?

You can show or hide the Windows Deskbar by right clicking on the Windows taskbar and selecting "Toolbars | MSN Search Deskbar" from the popup menu. This shows a toolbar with a search box and Go button. Activate the search box by clicking on it or by pressing Ctrl + Alt + M (the default key combination). This brings up the "Windows Desktop Results" popup window which shows matching results as you type your search phrase. Clicking the Go button beside the search box or the "Search Desktop" button at the bottom of the "Windows Desktop Results" popup window brings up the "Windows Desktop Search Result" window. The search phrase gets passed along and the window shows the result of the local search. Clicking on the Web button at the bottom of the "Windows Desktop Results" popup window invokes a web search and launches a browser showing the web search result. By default this uses the msn.com search, or whatever search engine has been configured.

Searching from within the "Windows Desktop Search Result" window

As explained above, the "Windows Desktop Search Result" window can be launched through the Windows start menu or the Windows Deskbar (which already passes along a search term and shows the result). At the top of the window you will see the search bar where you can enter a search phrase and perform a local search by clicking on the "Search Desktop" button or perform a web search by clicking on the "Web" button. The web search will again use the configured search engine. Local searches show the result in the lower left pane. You can filter the result set by content type by clicking on the filter buttons (on the filter bar right underneath the search bar), for example Everything, Documents, Music, IM Chats, etc. The "Other" button shows you a list of all available filter options to choose from. In the lower right pane you see a preview of the file selected in the search result. The second-to-last icon from the right in the filter bar lets you switch the result set from large icons to small icons and select where to place the preview pane (right, bottom or off).

A newly registered IFilter takes effect without the need to restart services. Unregistering an IFilter also takes effect without restarting any services. To remove already indexed content you need to rebuild your index. Open the "Desktop Search Options", select the "Desktop Search" item on the left side and then select the "Rebuild Index" button. This will close Windows Desktop Search, rebuild the catalog, start Windows Desktop Search again and then start the indexing process. This may take a few minutes.

Full text search with Windows SharePoint Services

Both Windows SharePoint Services and SharePoint Portal Server use IFilter components for searching. The following article explains how to install Windows SharePoint Services so that searching is available. The settings around searching in Windows SharePoint Services are limited. Open the SharePoint Central Administration (through the Administrative Tools of the Windows start menu) and select under the section "Component Configuration" the item "Configure full-text search". The only option is to enable or disable the full-text search capability, which applies to all SharePoint portals. When full-text searching is enabled, you can search the contents of any document uploaded into any "Document Library". Windows SharePoint Services automatically refreshes the index when new documents get uploaded. Normally it takes a few minutes until the index has been updated and the new document is included in the search result.

A newly registered IFilter takes effect only after you disable and then re-enable the full-text search option which effectively rebuilds the catalog. It may then take a few minutes till the index has been rebuilt. Unregistering an IFilter also takes effect only after disabling and then re-enabling the full-text search option.

Full text search with SharePoint Portal Server

SharePoint Portal Server provides much greater control over the indexing process. SharePoint Portal Server registers its separate "Microsoft SharePointPS Search" Windows service for full-text searching. Similar to the "Microsoft Search" for SQL Server, it is responsible for the actual indexing and search process. To configure the search in SharePoint Portal Server open your portal and then click on "Site Settings". This brings up the "Site Settings" page which has a section called "Search Settings and Indexed Content". This section provides a number of settings about how portal content and non portal content gets indexed. Non portal content can be a web page or complete web site, a public folder from Exchange Server or a file share. So SharePoint Portal Server allows you to index and search a much wider array of sources.

Select the item "Configure search and indexing" under the section "Search Settings and Indexed Content". The "Configure Search and Indexing" page has three sections. The "General Content Settings and Indexing Status" section shows you an overview of the index. You can see how many portal or non portal documents have been indexed, when the portal or non portal index was last updated, as well as any errors or warnings from the last index update. Click on the number of errors or warnings shown to bring up the "Gatherer log details" page with the detailed list of errors or warnings that occurred. SharePoint has four different modes of index updates:

  • Full - this mode indexes all content and performs a full update of the index.
  • Incremental - this mode indexes changes since the last update.
  • Incremental (inclusive) - this mode indexes changes including Web Part pages and pages since the last update.
  • Adaptive - this mode indexes content that is likely to have changed based on the site history.

You can find a detailed description of each index update mode in the following article. Under the overall index overview you will find a list of links as follows:

  • Refresh - Refreshes this page with the latest information.
  • Start portal content update - Shows the four modes of index updates available for portal content. Select one to perform that index update, for example click on Full to perform a full index update. If an index update is in progress then you see only Stop which allows you to stop the index update.
  • Start non portal content update - Provides the same index update modes for non portal content.
  • Manage Search schedules - Brings up the "Manage Search Schedules" page which lets you manage the index update schedules. Out of the box there are three schedules defined: an incremental update of non portal content once a day, an inclusive incremental update of portal content once a day and an incremental update of portal content every ten minutes. You can add new schedules and remove or update existing schedules.
  • View errors and warnings on - Brings up the "Gatherer log details" page to view all errors and warnings for either portal content or non portal content.
  • Manage Search scopes - Brings up the "Manage Search Scopes" page which is used to manage search scopes. Out of the box there is only one scope with the name "All Sources" defined. You can create new search scopes and remove or edit existing search scopes. Search scopes allow you to limit the search to only one or more areas. When users perform a search they can select from a list of available search scopes.
  • Include file types - Brings up the "Specify File Types to Include" page. This lets you specify the file types which will be included in searches. In order for new IFilter components to take effect you need to register the component and then add the file type to this list, for example register the "ztvArchFil.dll" component and then add the "zip" extension to the include file types list. If you unregister IFilter components then you should also remove the file type from this list. You also have the option to keep the IFilter registered and only remove the file type from this list. All changes take effect when a full index update is performed. New file types have no image associated out of the box. Copy the image file to associate with the new file type to the folder "c:\Program Files\Common Files\Microsoft Shared\web server extensions\60\template\images". Then edit the file "c:\Program Files\Common Files\Microsoft Shared\web server extensions\60\template\xml\docicon.xml". Under the node "ByExtension" add a new mapping between the file type and the image to use for that file type, for example "<Mapping Key="zip" Value="zip.gif"/>". You need to restart IIS for the image to take effect. Note that SharePoint allows you to block file types. If you want to add new file types which are blocked then you need to unblock them. Open the "SharePoint Central Administration" (through the Administrative Tools of the Windows start menu) and select under the section "Security Configuration" the item "Manage blocked file types". Remove the file type you want to unblock or add the file type you want to block. Changes take effect immediately.
  • Add content source - Shows the "Add Content Source" wizard, which allows you to add new non-portal data sources to search. These can be public folders on Exchange Server, file shares, individual web pages or entire web sites. For example, to add a new file share select the option "File Share" and click next. Then enter the file share in the Address field (e.g. \\Enterprise-Minds\MyShare), add a description and select whether to include sub folders. Finally click the finish button, which shows you a summary of the new data source. From here you can create a new update schedule or start a full index update.
  • Enable advanced search administration mode - The advanced mode allows you to manage multiple search indexes. Switching to this mode shows a new section called "Content Indexes" on the "Configure Search and Indexing" page. By default you will see a Portal_Content and a Non_Portal_Content index. The "Manage content indexes" link brings up the "Manage Content Indexes" page, where you can create new indexes and edit or remove existing ones. This mode removes the "Start portal content update", "Start non portal content update" and "View errors and warnings on" links from the "Configure Search and Indexing" page. You now perform these actions through the "Manage Content Indexes" page: click one of the indexes to bring up a popup menu and select the action from there.
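
The docicon.xml edit described under "Include file types" can also be scripted. The following is a minimal Python sketch, not a tested SharePoint tool. It assumes only what is stated above, namely that the file contains a "ByExtension" node holding "Mapping" elements with "Key" and "Value" attributes. The sample XML and the function name add_icon_mapping are hypothetical and made up for illustration; the real file contains many more entries.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for docicon.xml (assumption: the real
# file has the same ByExtension/Mapping shape, but many more entries).
SAMPLE_DOCICON = """<DocIcons>
  <ByExtension>
    <Mapping Key="doc" Value="icdoc.gif"/>
  </ByExtension>
</DocIcons>"""


def add_icon_mapping(xml_text, extension, image_file):
    """Return xml_text with a Mapping for `extension` under ByExtension."""
    root = ET.fromstring(xml_text)
    by_ext = root if root.tag == "ByExtension" else root.find(".//ByExtension")
    if by_ext is None:
        raise ValueError("no ByExtension node found")
    key = extension.lstrip(".").lower()
    for mapping in by_ext.findall("Mapping"):
        if mapping.get("Key", "").lower() == key:
            # The file type is already mapped: only swap the image.
            mapping.set("Value", image_file)
            break
    else:
        # No entry yet: append a new Mapping element.
        ET.SubElement(by_ext, "Mapping", {"Key": key, "Value": image_file})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_icon_mapping(SAMPLE_DOCICON, "zip", "zip.gif"))
```

To apply this to a server you would read the real docicon.xml, pass its text through the function, write the result back (after taking a backup copy) and restart IIS as described above.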

The section "Other Content Sources" on the "Configure Search and Indexing" page allows you to add new content sources. By default you will see a "This Portal" and "People" data source. The "Manage content sources" link brings up the "Manage Content Sources" page which allows you to add new data sources (same as the "Add content source" wizard described above) and edit or remove existing ones. Through the section "Site Directory" on the "Configure Search and Indexing" page you can manage how SharePoint crawls web sites. Please refer to the SharePoint documentation as this topic is beyond the scope of this article.

Can I use Windows Desktop Search as a front-end for other search applications?

The Windows Deskbar allows you to create shortcuts which point to URLs, files or applications. The search text box of the Windows Deskbar is used to create, update and execute these shortcuts. To create or update a shortcut, type the @ character followed by the name of the shortcut, a comma and then the URL or file name to associate with the shortcut. If the shortcut points to an application, type an equals sign after the comma, followed by the application name and its arguments. Here is an example:

@ixs,=iexplore.exe file:///c:/windows/help/ciquery.htm#machine=MachineName,catalog=CatalogName

This creates a shortcut with the name ixs which launches Internet Explorer and passes it a local HTML page to load, including some parameters. The page ciquery.htm is provided by the Indexing Server and is a front-end for searching an indexing catalog. The machine argument names the Indexing Server to query and the catalog argument names the indexing catalog to query. The page shows the "Indexing Query Form", which allows a user to enter search criteria, execute the query and view the matching results. To launch the "Indexing Query Form", type the shortcut name into the search box of the Windows Deskbar and press the enter key. This launches the browser and loads the form. On Windows XP and Windows Server 2003 the browser will show a warning about active content. Click the information bar at the top and select "Allow Blocked Content" from the popup menu. This allows the "Indexing Query Form" to load properly. The form only works with Internet Explorer.
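
The shortcut syntax described above can be sketched as a small Python helper, which may be handy if you want to generate the definition text for several catalogs. This is only an illustration of the string format; the function names make_shortcut and indexing_query_url are hypothetical, and the sketch assumes the machine and catalog parameters are joined by a comma without spaces.

```python
def make_shortcut(name, target, application=None):
    """Compose the text typed into the Deskbar to define a shortcut."""
    if application:
        # Application shortcuts put '=' and the program name after the comma.
        return f"@{name},={application} {target}"
    return f"@{name},{target}"


def indexing_query_url(machine, catalog,
                       page="file:///c:/windows/help/ciquery.htm"):
    """Build the ciquery.htm address with its machine and catalog parameters."""
    # Assumption: the two parameters are separated by a comma, no spaces.
    return f"{page}#machine={machine},catalog={catalog}"


if __name__ == "__main__":
    url = indexing_query_url("MachineName", "CatalogName")
    print(make_shortcut("ixs", url, application="iexplore.exe"))
```

The resulting line still has to be typed or pasted into the Deskbar search box; the sketch does not register the shortcut itself.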

You can also create a shortcut to search SharePoint Portal Server from the Windows Deskbar. Create a shortcut pointing to the search.aspx page of your SharePoint Portal Server. Please note that this does not work with Windows SharePoint Services. Here is an example:

@sp,http://<SharePoint Server>/search.aspx?k=$w&s=All%20sources

This example creates a shortcut called "sp" which calls the search.aspx page and passes along two arguments. The "k" argument is the search term. It is set to $w, which the Windows Deskbar replaces with the text the user types after the shortcut name. The "s" argument is the search scope, which in this example is always set to "All sources". The user can then enter "sp <search term>" into the Windows Deskbar, for example "sp Vancouver", which brings up a browser showing the results of the search performed by SharePoint Portal Server. This also works with multiple search terms separated by spaces, for example "sp City of Vancouver".
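
To make the $w substitution concrete, here is a rough Python sketch of what the resulting request might look like. It is a model of the behavior described above, not the Deskbar's actual implementation. It assumes the typed text is percent-encoded before it replaces $w, and the host name "portal" and both function names are placeholders invented for the example.

```python
from urllib.parse import quote


def split_input(typed):
    """Split Deskbar input such as 'sp City of Vancouver' into name and terms."""
    name, _, terms = typed.partition(" ")
    return name, terms.strip()


def expand_shortcut(template, terms):
    """Replace $w in the shortcut URL with the percent-encoded search terms."""
    # Assumption: spaces and other reserved characters are percent-encoded.
    return template.replace("$w", quote(terms, safe=""))


if __name__ == "__main__":
    # "portal" is a placeholder for the SharePoint Portal Server host name.
    template = "http://portal/search.aspx?k=$w&s=All%20sources"
    name, terms = split_input("sp City of Vancouver")
    print(name, expand_shortcut(template, terms))
```

Under these assumptions, all words typed after the shortcut name end up in the single "k" argument, which is why multi-word searches work.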

Summary

With Indexing Server, Microsoft released its first full-text search engine. Since then it has built on that same concept and provided a common way to extend the full-text search engines of Indexing Server, SQL Server, SharePoint, Exchange Server and Windows Desktop Search. This makes it easy for ISVs to make their proprietary content searchable in most Microsoft products. IFilters are widely known in the community and you can find IFilter components for most file types. The Platform SDK has detailed examples and also provides a number of tools to test IFilter components. If you have comments or questions about this article or this topic, please contact me at klaus_salchner@hotmail.com. I want to hear if you have learned something new.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
