Since my last post I have been working on a really annoying problem with a SQL Full-Text Search implementation I manage. As anyone who has worked with MSSFTE knows, the query results it returns offer no mechanism for highlighting which search tokens were hit within the results. I totally understand why this isn’t a feature in SQL from Microsoft’s standpoint, but it’s annoying nonetheless.
To address this issue I did a fair amount of research and got some insight from Jonathan Kehayias in this MSDN thread. He and I both thought refactoring the http://www.codeproject.com/KB/aspnet/DotLuceneSearch.aspx project might be an avenue worth investigating. That did not turn out to be very practical, so I ended up building a new highlighter from scratch.
Using this highlighter is very straightforward, as demonstrated below. You simply provide it with an optional list of stop words, your query, and the content returned by SQL Server, and it will do the rest. The remaining properties should be self-explanatory.
Search Result Highlighter for SQL Full-Text Search Results.zip (85.92 kb)
' searchText is assumed to hold the user's query string.
Dim summaryGenerator As New QuerySummary()
summaryGenerator.StopWords = New String() {"the", "a", "and"}
summaryGenerator.OpenHighlightMark = "<B>"
summaryGenerator.CloseHighlightMark = "</B>"
summaryGenerator.SummaryLength = 350
summaryGenerator.RemoveHTMLBeforeProcessing = False
' The second argument is the content returned by SQL Server (here, a test resource).
Console.WriteLine(summaryGenerator.GenerateSummary(searchText, _
    My.Resources.TestInputData.ResourceManager.GetString("TestInputData")))
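For anyone curious about what a highlighter like this has to do under the hood, here is a rough sketch of the core idea: drop the stop words from the query, then wrap every remaining term found in the content with the open and close marks. This is not the code in the download. HighlightTerms and its parameters are hypothetical names made up for illustration, and the sketch ignores summary length, phrase queries, and terms that begin or end with punctuation.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module HighlightSketch

    ' Hypothetical helper, not part of the QuerySummary class in the download.
    Function HighlightTerms(query As String, content As String,
                            stopWords As IEnumerable(Of String),
                            openMark As String, closeMark As String) As String
        ' Case-insensitive lookup of words that should never be highlighted.
        Dim ignored As New HashSet(Of String)(stopWords, StringComparer.OrdinalIgnoreCase)

        ' Split the query on spaces, drop stop words, and escape the rest for regex use.
        Dim terms = query.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries).
            Where(Function(t) Not ignored.Contains(t)).
            Select(Function(t) Regex.Escape(t)).
            Distinct(StringComparer.OrdinalIgnoreCase).
            ToList()

        ' Nothing left to highlight, so return the content unchanged.
        If terms.Count = 0 Then Return content

        ' Match any remaining term as a whole word, ignoring case.
        Dim pattern As String = "\b(" & String.Join("|", terms) & ")\b"
        Return Regex.Replace(content, pattern,
                             Function(m) openMark & m.Value & closeMark,
                             RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        Dim result = HighlightTerms("the quick fox", "The quick brown fox.",
                                    New String() {"the", "a", "and"}, "<B>", "</B>")
        Console.WriteLine(result) ' The <B>quick</B> brown <B>fox</B>.
    End Sub

End Module
```

Note that "The" at the start of the content is left alone because "the" is in the stop word list, even though it appears in the query.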
This is the first version of the highlighter, so some of the terms it returns might not look too pretty. I’ll keep updating my blog with new versions as I refine the process.
As with all the code I provide, you will need to add your own exception handling routines. Also, this version wasn’t specifically designed to work with HTML content returned by SQL Server. You will need to replace the HTML removal method with a DOM-based implementation if you wish to highlight HTML content.
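As a starting point for the DOM approach, one option is to let an HTML parser build the document tree and then read the text out of it. The sketch below assumes the third-party HtmlAgilityPack package is referenced in your project, which is my assumption and not something the download uses. ExtractText is a hypothetical helper name, and how you would plug it into the highlighter's HTML removal step is left open.

```vbnet
Imports System
Imports System.Linq
Imports System.Net
Imports HtmlAgilityPack ' third-party NuGet package (an assumption, not part of the download)

Module HtmlTextSketch

    ' Hypothetical helper: returns the visible text of an HTML fragment.
    Function ExtractText(html As String) As String
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' Drop script and style elements so their contents are not treated as text.
        ' SelectNodes returns Nothing when there are no matches.
        Dim hidden = doc.DocumentNode.SelectNodes("//script|//style")
        If hidden IsNot Nothing Then
            For Each node As HtmlNode In hidden.ToList()
                node.Remove()
            Next
        End If

        ' InnerText concatenates the remaining text nodes; decode entities such as &amp;.
        Return WebUtility.HtmlDecode(doc.DocumentNode.InnerText)
    End Function

    Sub Main()
        Console.WriteLine(ExtractText("<p>Full-text <b>search</b> &amp; highlighting</p>"))
    End Sub

End Module
```

This only gets you clean text to highlight. If you need to keep the original markup and insert the highlight marks inside it, you would have to walk the text nodes of the tree and rewrite them individually.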