
C# and VB.NET Code Searcher - Using Roslyn

7 Mar 2013
A fast C# and VB.NET code searcher using Roslyn.


Introduction

This article is about a tool using Roslyn that can search through a large codebase in five ways:

  1. Search text in methods
  2. Search calls to certain methods  
  3. Search for methods with certain names
  4. Search for properties with certain names
  5. Search for classes with certain names

Screenshot 

Here's a screenshot of C# and VB.NET Code Searcher in action.

 

The problem  

Recently I had an assignment that required a lot of searching through the source code of a large legacy codebase (61 solutions, C# code). A field had to be moved from one table to another table, a change that would impact some parts of the codebase. To find out which parts, I first had to find the methods in the data layer where stored procedures were called. Then I had to go bottom-up through the codebase to see where these methods were called, and what the impact on the code was.

At first I used the freeware tool "TextCrawler 2" for that (http://www.digitalvolcano.co.uk/content/index.php). This is quite a fast text search utility. The problem is that it doesn't "know" anything about the C# language. For example, if you search for calls to a certain method, TextCrawler will happily find files for you in which the method calls are commented out. Another problem was that it wasn't fast enough (searching through 61 solutions can take some time). I also used the Microsoft Desktop Search tool; this was fast, but also not "intelligent" about the source code.

Since I read about Roslyn I thought of ways I could make it useful for this purpose.

Background about Roslyn

Roslyn is Microsoft’s project to open up the VB and C# compilers through APIs, and provide easy access to the information it gathers during the different stages of the compilation process. To get started on what Roslyn is about, you can read about it here:

Or if you'd prefer to take a deeper dive into Roslyn, here's a whitepaper from Microsoft:  

This article is not meant to give you an introduction to Roslyn; there are a couple of good CodeProject articles that do that:

I also found out that after installing the Microsoft Roslyn CTP - June 2012 there were lots of sample projects installed in my Documents folder. 

The solution

So I thought I'd give Roslyn a try to see if I could create a tool that could search through code faster. I think I succeeded in this. I use it all the time now! I created a Windows Forms application that has 5 ways of searching through C# and VB.NET code:  

  1. Search text in methods 
  2. Search calls to certain methods
  3. Search for methods with certain names
  4. Search for properties with certain names
  5. Search for classes with certain names 

I decided to share this with the world so everyone can enjoy it. By posting this article, I hope that:

  • People will find this useful too.
  • I get valuable feedback so the tool can be improved.
  • People will extend / adapt the tool or parts of it in ways I haven't thought of yet.

Why would you use it? 

For example: "Go To Definition" for other solutions

Let's say you're in a debugging session. You're debugging in solution X which calls a service that's in another solution Y. Now you see a method being called on a class in solution Y. In Visual Studio you can go to the definition of a method with right mouse click - "Go To Definition" or F12. But not when the method is in the other solution! So if you want to look up the definition of the method, the only ways to do that are:

  1. Step inside the method during the debug session
  2. Open solution Y and find the method you want to see. 

With RoslynCodeSearcher it's very easy to look up a method that's in another solution: just type its name in the search field, select "Search methods" and click [Search].

As a help during refactoring

Sometimes you want to know "What will happen if I remove this method, where is it called in the jungle of solutions?". You can do a text search, or you can start a compile build to see where it breaks, but for some projects that have lots of solutions a full compile build just takes too long. With RoslynCodeSearcher, you type the name of the method in the search field, select "Search calls" and click [Search]. Wait a second, et voilà!

Why don't you use reflection for this?

The reason I don't use reflection for this is that I want access to the actual source code of the solutions I search. I want to return the method body, for example. Reflection can't do that; it only works on metadata (types of classes, names and signatures of methods, etc.). Also, when I want to search for pieces of text in methods across a larger number of solutions, using Roslyn is faster than a plain text search, because the solutions are already compiled in memory.
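To illustrate the metadata-only point, here is a minimal sketch using plain reflection on String.Trim. The class and method names (ReflectionLimitSketch, DescribeTrim) are made up for this example and are not part of the tool.

```csharp
using System;
using System.Reflection;

public static class ReflectionLimitSketch
{
    // Hypothetical helper: shows what reflection can and cannot tell us about a method.
    public static string DescribeTrim()
    {
        // Metadata is available: the method name, return type and parameter list.
        MethodInfo trim = typeof(string).GetMethod("Trim", Type.EmptyTypes);
        string signature = trim.ReturnType.Name + " " + trim.Name + "()";

        // The body is at best available as raw IL bytes, never as C# source text.
        MethodBody body = trim.GetMethodBody();
        byte[] il = body == null ? null : body.GetILAsByteArray();

        return il == null
            ? signature + " -> no IL available, no source text"
            : signature + " -> " + il.Length + " bytes of IL, no source text";
    }
}
```

Calling ReflectionLimitSketch.DescribeTrim() returns a string starting with "String Trim()", followed by what reflection could find out about the body.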

How fast is it  

The first search after startup is slower, because the tool first has to compile the solutions in memory (604 MB of memory in my situation). The compiled solutions are kept in IWorkspace objects. This happens every time the tool starts, and a progress indicator shows how far the compilation is. With a couple of solutions the compilation finishes in a second or so; with a whole lot of solutions it takes longer. To give you an indication: on my computer it took about half a minute to compile 61 solutions in memory.

After the initial compilation, searching is very fast: one to a few seconds for 61 solutions, depending on how much is found. This is because the list of IWorkspace objects is already in memory. After I started using the .NET 4 Parallel.ForEach method, performance increased significantly (by a factor that depends on the number of cores in the processor of your computer: dual core, quad core, etc.).
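The parallel idea can be sketched without Roslyn. The example below is an illustration only: it assumes the source texts are already in memory as a dictionary from file path to text, and the names ParallelSearchSketch and FindFilesContaining are hypothetical. It collects hits in a ConcurrentBag, because a plain List or StringBuilder is not safe to write to from several threads at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelSearchSketch
{
    // Hypothetical stand-in for the in-memory documents: file path -> source text.
    public static List<string> FindFilesContaining(
        IDictionary<string, string> sources, string textToSearch)
    {
        // ConcurrentBag can be added to from several threads at the same time.
        var hits = new ConcurrentBag<string>();

        Parallel.ForEach(sources, entry =>
        {
            // Case-insensitive search without allocating upper-cased copies.
            if (entry.Value.IndexOf(textToSearch, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                hits.Add(entry.Key);
            }
        });

        // Parallel execution gives no ordering guarantee, so sort for stable output.
        return hits.OrderBy(path => path, StringComparer.Ordinal).ToList();
    }
}
```

Sorting at the end costs little and keeps the result list the same from run to run, which parallel execution alone does not guarantee.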

How to use it   

Prerequisites

Make sure you have the following software installed in this order, otherwise the solution will not build.

This article was written for the Roslyn June 2012 CTP version, that was compatible with Visual Studio 2010 SP1. However, the new version of Roslyn, the September 2012 CTP version, is only compatible with Visual Studio 2012. I have added a download link to the sources of Code searcher that work together with Visual Studio 2012. The rest of this article still needs to be updated to reflect this fact (or I will create a new article specially for the Visual Studio 2012 version, I still have to decide).

If you have Visual Studio 2012, the software needs to be installed in this order:

Next: solutions.txt file 

You have to provide the tool with a list of solutions to search through.

There are two ways you can do this:

  1. With a text file "solutions.txt" placed in the directory of the executable (or \bin\debug after you build the solution). The tool will read this on startup if it exists. This text file should contain the full paths to the solutions, each on its own line.
  2. If the solutions.txt file doesn't exist yet, click on [Browse ...] and in the File dialog select a directory. Next click on [Update solution List]. The tool will then walk recursively down the directory structure, starting at the selected directory, looking for solution (.sln) files.

The result will be stored in the "solutions.txt" file in the directory of the executable. The existing "solutions.txt" file will be overwritten. 
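As an illustration of the second option, the recursive walk and the file format can be sketched with System.IO alone. This is not the tool's actual code: the names SolutionListSketch, FindSolutions, WriteSolutionList and ReadSolutionList are hypothetical, and the sketch assumes every directory below the starting directory is readable.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SolutionListSketch
{
    // Walk down from rootDirectory and collect the full paths of all .sln files.
    public static List<string> FindSolutions(string rootDirectory)
    {
        return Directory
            .EnumerateFiles(rootDirectory, "*.sln", SearchOption.AllDirectories)
            // Guard against look-alike extensions that the wildcard may also match.
            .Where(path => Path.GetExtension(path)
                .Equals(".sln", StringComparison.OrdinalIgnoreCase))
            .OrderBy(path => path, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }

    // Overwrite the list file with one full solution path per line.
    public static void WriteSolutionList(string rootDirectory, string solutionsTxtPath)
    {
        File.WriteAllLines(solutionsTxtPath, FindSolutions(rootDirectory));
    }

    // Read the list back, ignoring empty lines and surrounding whitespace.
    public static List<string> ReadSolutionList(string solutionsTxtPath)
    {
        return File.ReadAllLines(solutionsTxtPath)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0)
            .ToList();
    }
}
```

Note that Directory.EnumerateFiles throws if it runs into a directory it may not read, so a real implementation would probably need to handle that case.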

Next: search 

  • Type the text you want to search for in the textbox.
  • Select one of the ways to search with by clicking one of the radio buttons.
  • Click [Search].

The solutions from solutions.txt, all underlying projects, and all underlying source files will be searched through.

The result of the search consists of:

  1. The path to the source files containing the found methods. 
  2. The body of the methods. 

Including / excluding files

You can also specify words in the textboxes on the right that say:

"Do not include files containing words in filename. Separate by comma."

or

"Only include files containing word in filename. Separate by comma."

  • "Do not include" means, the tool will not search in code files that have any of the words in the path. 
  • "Only include" means, the tool will only search in code files that have any of the words in the path.  

These text boxes are meant to be used one at a time. If both are filled in, "Do not include" takes precedence over "Only include".
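The include / exclude rules above can be sketched as follows. This is only an illustration: the names PathFilterSketch, ParseFilters and ShouldSearch are hypothetical (the tool's own GetFilters method is not shown in this article), and the sketch assumes the words are compared case-insensitively against the full file path.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PathFilterSketch
{
    // Turn "test, Generated" into ["TEST", "GENERATED"]; empty input gives an empty list.
    public static List<string> ParseFilters(string commaSeparated)
    {
        if (string.IsNullOrWhiteSpace(commaSeparated))
        {
            return new List<string>();
        }

        return commaSeparated.Split(',')
            .Select(word => word.Trim().ToUpperInvariant())
            .Where(word => word.Length > 0)
            .ToList();
    }

    // Excludes are checked first, so they win over includes when both are given.
    public static bool ShouldSearch(string filePath, List<string> excludes, List<string> includes)
    {
        string upperPath = filePath.ToUpperInvariant();

        if (excludes.Any(word => upperPath.Contains(word)))
        {
            return false;
        }

        // An empty include list means "search everything that is not excluded".
        return includes.Count == 0 || includes.Any(word => upperPath.Contains(word));
    }
}
```

With excludes "test" and no includes, a path such as C:\src\Shop\OrderTests.cs is skipped, while C:\src\Shop\Order.cs is searched.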

Searching part of text

It is possible to type only part of the text you want to search. For example, if you want to search for all methods that contain the word "Save", like "SaveCustomer", "SaveOrder", then check the option checkbox "Search part of text". If you select the search option "Search text in method" the option will be set by default. 
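The difference between the two modes can be sketched like this. The names NameMatchSketch and Matches are hypothetical, and case-insensitive comparison is an assumption of this sketch, not something confirmed for every search type in the tool.

```csharp
using System;

public static class NameMatchSketch
{
    // searchPartOfText = true: "Save" matches "SaveCustomer".
    // searchPartOfText = false: only the complete name matches.
    public static bool Matches(string candidateName, string searchText, bool searchPartOfText)
    {
        if (searchPartOfText)
        {
            return candidateName.IndexOf(searchText, StringComparison.OrdinalIgnoreCase) >= 0;
        }

        return string.Equals(candidateName, searchText, StringComparison.OrdinalIgnoreCase);
    }
}
```

Using IndexOf with StringComparison.OrdinalIgnoreCase avoids creating upper-cased copies of both strings for every comparison.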

Syntax Highlighting with Fast Colored TextBox

To present the results of the code search I needed a text editor that could do Syntax Highlighting. I researched a couple of those, and decided to use the great "Fast Colored TextBox" from Pavel Torgashov in my project (also on CodeProject): Fast Colored TextBox for Syntax Highlighting. Which is fast indeed! It also supports searching in the textbox with Ctrl-F.  

Multiple Searches using KRBTabControl 

To be able to start multiple searches using a tabbed interface, I gladly used the excellent "KRBTabControl" from Burak299 in my project (also on CodeProject): KRBTabControl. This gave me the possibility to provide tabs that can be closed just like browser tabs. When there are too many tabs to display, you will see two tiny arrows on the right so you can switch between tabs with the mouse.

The implementation   

The code below is not entirely the same as the source code itself, but this is meant to show you the basics of how the tool works.

When the Search button is clicked, a search is started using the selected Search method (the radio buttons).

public enum SearchType
{
    SearchTextInMethod,
    SearchCallers,
    SearchMethods,
    SearchProperties,
    SearchClasses
} 
private SearchType _searchType = new SearchType();
 
/// <summary>
/// - Do some checks to see if the input is correct and the solutions.txt file exists
/// - Update the text of the tab to the text that is being searched
/// - Show an hourglass icon on the tab during the search
/// - Start a new worker that will do the search
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void btnSearch_Click(object sender, EventArgs e)
{
    string searchText = txtTextToSearch.Text;
    //Remove leading and trailing spaces
    searchText = searchText.Trim();

    if (searchText.Contains("(") || searchText.Contains(")"))
    {
        MessageBox.Show("Please specify searchtext without parentheses or parameters.");
        return;
    }

    if (!File.Exists(Constants.BaseDirectorySolutionsTxtPath))
    {
        MessageBox.Show("There is no solutions.txt file in the directory where the .exe resides." + 
          " Please click the [Browse] button to select a starting directory. Then click [Update solution List].");
    }
    else
    {
        SearchType searchType = new SearchType();

        if (!String.IsNullOrEmpty(searchText))
        {
            TabController.UpdateSearchTextOnTab(searchText);
            TabController.ShowHourGlass();

            if (rbSearchTextInMethod.Checked)
            {
                searchType = SearchType.SearchTextInMethod;
            }
            else if (rbSearchCallers.Checked)
            {
                searchType = SearchType.SearchCallers;
            }
            else if (rbSearchMethods.Checked)
            {
                searchType = SearchType.SearchMethods;
            }
            else if (rbSearchProperties.Checked)
            {
                searchType = SearchType.SearchProperties;
            }
            else if (rbSearchClasses.Checked)
            {
                searchType = SearchType.SearchClasses;
            }

            //Create and start a new worker that will do the searching for us.
            WorkerFactory.Start(searchType, searchText, txtExclude.Text, 
              txtInclude.Text, TabController.SelectedTab.Guid);
        }
        else
        {
            MessageBox.Show("Please enter text to search");
        }
    }
} 

The WorkerFactory.Start method creates a new Worker object every time you do a search.

public static class WorkerFactory
{
    private static List<Worker> _workerList = new List<Worker>();

    public static void Start(SearchType searchType, string searchText, 
           string filter, string include, Guid guid)
    {
        Worker worker;

        worker = new Worker(searchType, searchText, filter, include, guid);
        _workerList.Add(worker);

        worker.Start();
    }

    /// <summary>
    /// Select a worker from the workerlist with a certain Guid.
    /// </summary>
    /// <param name="guid"></param>
    /// <returns></returns>
    private static Worker SelectWorker(Guid guid)
    {
        var selectWorker = from worker in _workerList
                           where worker.Guid == guid
                           select worker;

        if (selectWorker != null && selectWorker.Count() == 1)
        {
            return (Worker)selectWorker.First();
        }

        return null;
    }

    /// <summary>
    /// If a tab is deleted the accompanying worker must be cancelled.
    /// It won't be killed, but the results will not be written to a tab anymore.
    /// If it's not needed anymore, doesn't matter because they will be cleaned up once the program quits.
    /// </summary>
    /// <param name="guid">The unique identifier of the worker</param>
    public static void Delete(Guid guid)
    {
        Worker selectWorker = SelectWorker(guid);

        //Does the worker exist in the workerlist?
        //Because, if a tab is deleted, but a worker was not started for that tab,
        //there is no worker to delete.
        if (selectWorker != null)
        {
            selectWorker.Cancel();
            _workerList.Remove(selectWorker);
        }
    }
} 

This Worker uses a BackgroundWorker to start a thread that starts a codesearch using Roslyn.

public class Worker
{
    private CodeSearcher _searcher;
    BackgroundWorker _worker;
    private string _result;
    private Guid _guid;
    private bool _cancel;
 
    public Worker(SearchType searchType, string searchText, string filter, string include, Guid guid)
    {
        _guid = guid;
        _searcher = new CodeSearcher(searchType, searchText, filter, include);

        _worker = new BackgroundWorker();
        _worker.DoWork += new DoWorkEventHandler(worker_DoWork);
        _worker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(worker_RunWorkerCompleted);
    }

    public Guid Guid
    {
        get { return _guid; }
        set { _guid = value; }
    }

    public void Start()
    {
        _worker.RunWorkerAsync();
    }

    /// <summary>
    /// Cancel means the backgroundworker will finish its job,
    /// but won't write the results to the tabcontroller anymore.
    /// </summary>
    public void Cancel()
    {
        _cancel = true;
    }

    private void worker_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
    {
        if (!_cancel)
        {
            TabController.WriteResults(_guid, _result);
        }
    }

    private void worker_DoWork(object sender, DoWorkEventArgs e)
    {
       _result = _searcher.Search();
    }
}

When the worker is started with the Start method, it runs worker_DoWork asynchronously, which calls the CodeSearcher.Search method. That method searches in one of five ways, depending on the selected SearchType.

public class CodeSearcher
{
    /// <summary>
    /// Search for the provided searchtext in the sourcecode files of the solutions.
    /// Use the provided SearchType (method, callers, text in method).
    /// Return the result in a string.
    /// </summary>
    /// <returns></returns>
    public string Search()
    {
      string result = "";

      List<string> excludes = CodeSearcher.GetFilters(_exclude);
      List<string> includes = CodeSearcher.GetFilters(_include);

      if (CodeRepository.Workspaces.Count() == 0)
      {
        //Get the solutions from the solutions.txt file and load them into Workspaces
        //If it doesn't exist, this will be checked at the moment user presses the [Search] button.
        CodeRepository.Solutions = CodeRepository.GetSolutions(Constants.BaseDirectorySolutionsTxtPath);

        CodeRepository.Workspaces = CodeRepository.GetWorkspaces(CodeRepository.Solutions);
      }

      if (_searchType == SearchType.SearchTextInMethod)
      {
        result = SearchMethodsForTextParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
      }
      else if (_searchType == SearchType.SearchCallers)
      {
        result = SearchCallersParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
      }
      else if (_searchType == SearchType.SearchMethods)
      {
        result = SearchMethodsParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
      }
      else if (_searchType == SearchType.SearchProperties)
      {
        result = SearchPropertiesParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
      }
      else if (_searchType == SearchType.SearchClasses)
      {
        result = SearchClassesParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
      }

      return result;
    }

    //The five Search...Parallel methods follow; one of them is shown below.
}

If a solutions.txt file exists in the directory where the RoslynCodeSearcher.exe resides, the paths to the solutions will be put in a List and the Workspaces with the solutions will be loaded. A workspace is an active representation of your solution as a collection of projects, each with a collection of documents. The workspace provides access to the current model of the solution. You can read more about it here.  

In the CodeSearcher class I have five search methods. This is where the searching happens. The searching uses the .NET 4 method Parallel.ForEach to speed things up, depending on the number of cores in the processor of your computer. I will show one of the search methods here; the other four you can see in the source code.

/// <summary>
/// Search through the code for methods that contain the text textToSearch.
/// Return the resulting method bodies as a string.
/// excludes are used to exclude files that have paths that contain certain words.
/// includes are used to include files that have paths that contain certain words.
/// </summary>
/// <param name="workspaces"></param>
/// <param name="textToSearch"></param>
/// <param name="excludes">Projects / documents to exclude by name</param>
/// <param name="includes">Projects / documents to include by name</param>
/// <returns></returns>
public string SearchMethodsForTextParallel(List<IWorkspace> workspaces, 
  string textToSearch, List<string> excludes, List<string> includes)
{
  StringBuilder result = new StringBuilder();
  string language = "";

  foreach (IWorkspace w in workspaces)
  {
    ISolution solution = w.CurrentSolution;

    foreach (IProject project in solution.Projects)
    {
      language = project.LanguageServices.Language;

      Parallel.ForEach(project.Documents, document =>
      {
        //Filter and include document names containing certain words
        if (!excludes.Any(s => document.FilePath.ToUpper().Contains(s)) &&
            (
              includes.Count() == 0 || includes.Any(s => document.FilePath.ToUpper().Contains(s)))
            )
        {
          if (language == LANG_CS)
          {
            string found = SearchMethodsForTextCSharp(document, textToSearch);

            //StringBuilder is not thread-safe, so only one thread at a time may append.
            lock (result)
            {
              result.Append(found);
            }
          }
        }
      });
    }
  }

  return result.ToString();
}

private string SearchMethodsForTextCSharp(IDocument document, string textToSearch)
{
  StringBuilder result = new StringBuilder();

  CommonSyntaxTree syntax = document.GetSyntaxTree();
  var root = (Roslyn.Compilers.CSharp.CompilationUnitSyntax)syntax.GetRoot();

  var syntaxNodes = from methodDeclaration in root.DescendantNodes()
                   .Where(x => x is MethodDeclarationSyntax || x is PropertyDeclarationSyntax)
                    select methodDeclaration;

  if (syntaxNodes != null && syntaxNodes.Count() > 0)
  {
    foreach (MemberDeclarationSyntax method in syntaxNodes)
    {
      if (method != null)
      {
        string methodText = method.GetFullText();
        if (methodText.ToUpper().Contains(textToSearch.ToUpper()))
        {
          result.Append(GetMethodOrPropertyTextCSharp(method, document));
        }
      }
    }
  }

  return result.ToString();
}

When the text, call, method, or property is found, the method GetMethodOrPropertyTextCSharp is called to get the body of the method / property in which the searched item is found. The full text of the method / property is returned, including the path to the .cs file.

/// <summary>
/// Get the full text of the method or property body.
/// </summary>
/// <param name="node"></param>
/// <param name="document"></param>
/// <returns></returns>
private string GetMethodOrPropertyTextCSharp(Roslyn.Compilers.CSharp.SyntaxNode node, IDocument document)
{
  StringBuilder resultStringBuilder = new StringBuilder();

  string methodText = node.GetFullText();
  bool isMethod = node is Roslyn.Compilers.CSharp.MethodDeclarationSyntax;
  string methodOrPropertyDefinition = isMethod ? "Method: " : "Property: ";

  object methodName = isMethod ? ((Roslyn.Compilers.CSharp.MethodDeclarationSyntax)node).Identifier.Value : 
    ((Roslyn.Compilers.CSharp.PropertyDeclarationSyntax)node).Identifier.Value;
  resultStringBuilder.AppendLine("//=====================================================================================");
  resultStringBuilder.AppendLine(document.FilePath);
  resultStringBuilder.AppendLine(methodOrPropertyDefinition + (string)methodName);
  resultStringBuilder.AppendLine(methodText);

  return resultStringBuilder.ToString();
} 

Jumping back to the Worker object above, when the worker is finished and has the results, the TabController.WriteResults method will be called to update the FastColoredTextBox with the results. 

public static class TabController
{
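  private static List<FastColoredTextBoxNS.FastColoredTextBox>
     _fastColoredTextBoxes = new List<FastColoredTextBoxNS.FastColoredTextBox>();

  //Guards WriteResults so that only one thread at a time updates a tab.
  private static readonly object _lockobj = new object();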

  /// <summary>
  /// The results of the search will be written to the tab specified with the guid
  /// </summary>
  /// <param name="guid"></param>
  /// <param name="text"></param>
  public static void WriteResults(Guid guid, string text)
  {
    //If another thread comes here, block it temporarily until this thread is finished.
    lock (_lockobj)
    {
      var selectFastColoredTextBox = from fctb in _fastColoredTextBoxes
                       where fctb.Guid == guid
                       select fctb;

      if (selectFastColoredTextBox != null && selectFastColoredTextBox.Count()==1)
      {
        FastColoredTextBox currentTextBox = (FastColoredTextBox)selectFastColoredTextBox.First();

        currentTextBox.Text = text;

        if (text == "") currentTextBox.Text = "Nothing found.";

        //move caret to start text
        currentTextBox.Selection.Start = Place.Empty;
        currentTextBox.DoCaretVisible();
      }
    }
  }
} 

As you will see in the source code, there is much more to it than I have shown in this article. For example, it is possible to start multiple searches independently at the same time from different tabs. This uses some threading and proper handling / locking. Also, the tool itself can search in both C# and VB.NET source code.

About the Source Code

The attached projects will open and build in Visual Studio 2010 SP1 (or Visual Studio 2012 if you download the 2012 version). In the section "How to use it" I explain the prerequisites that are necessary to use the tool.

Future of this project

Some thoughts about the direction this project might go in the future:

Regular Expression Support  

I want to be able to search using regular expressions, for example: give me all the methods that are named "SaveCustomer" or "InsertCustomer". You would have to type a regular expression like this:
(Save|Insert)Customer 
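A minimal sketch of how such a name filter could work with System.Text.RegularExpressions is shown below. This feature does not exist in the tool yet. The names RegexNameSearchSketch and FilterNames are hypothetical, and the choice to match the whole name case-insensitively is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexNameSearchSketch
{
    // Return only the names that match the pattern from start to end.
    public static List<string> FilterNames(IEnumerable<string> methodNames, string pattern)
    {
        // Wrap the pattern in a non-capturing group and anchor it,
        // so "(Save|Insert)Customer" does not also match "SaveCustomerOrder".
        var regex = new Regex(
            "^(?:" + pattern + ")$",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

        return methodNames.Where(name => regex.IsMatch(name)).ToList();
    }
}
```

Given the names SaveCustomer, InsertCustomer, DeleteCustomer and SaveCustomerOrder, the pattern above keeps only SaveCustomer and InsertCustomer.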

Visual Studio Extension  

This could be reworked as a Visual Studio extension. That way it could make use of the C# code editor and other parts of Visual Studio. That could make it even more powerful and accessible to more people. 

Advanced stuff 

To make refactoring source code across multiple solutions friendlier, it would be nice if you could run some kind of "queries" on your source code, just like LINQ. Something like http://www.ndepend.com/Doc_CQLinq_Syntax.aspx or http://www.codeproject.com/Articles/408663/Using-NRefactory-for-analyzing-Csharp-code. Making these queries strongly typed rather than dynamic would need IntelliSense in some kind of interactive window. Maybe the new Roslyn "C# Interactive window" could be of use for this. But this would probably be easier to realize as a Visual Studio extension.

Output  

Let the user define what the output should contain, for example:

  • The whole code file
  • A graphical view of connections between methods / classes / solutions etc.

History    

2012-12-05

  • Created a version for Visual Studio 2012 and Roslyn September 2012 CTP 
  • Fixed the breaking changes in the version that works with Roslyn September 2012 CTP
  • Fixed the unit tests because of the breaking changes of Roslyn September 2012 CTP 

2012-08-21

  • Fixed a bug in searching callers; some callers were not found.

2012-08-05

  • Added "precompile" option to compile the solutions in memory at program startup to speed things up 

2012-08-04

  • Added class name search ability and "part of text" search  

2012-08-01

  • Used Parallel.ForEach for searching. +/- 2x as fast with 2 cores, 4 cores not able to test, but probably 4x as fast.  

2012-07-28

  • More unit tests (TabController)  
  • Use .Any() instead of Count() > 0 
  • Unit test to test performance of  @"A".ToUpper().Contains(@"B".ToUpper()) versus @"A".IndexOf(@"B", StringComparison.OrdinalIgnoreCase)  

2012-07-24

  • Added unit tests   
  • Able to search in VB.NET code also  

2012-07-18

  • Added property search ability
  • Input check on search textbox
  • Remove leading / trailing spaces on text from search textbox when click [Search]
  • Show "Method:" or "Property:" depending on which search type is selected

2012-07-16

  • Fixed ability to Copy (Ctrl-C) from the FastColoredTextBox 
  • Show hourglass icon on tabs when threads are running 
  • If you click button [New tab] the program automatically jumps to the next tab
  • Separator lines between tabs 
  • Changed text of include / exclude text fields to better describe what they mean 

2012-07-15

  • Fixed issue causing error with parentheses in search text
  • Added extra comments to source
  • Tested if different types of method definitions can be searched
  • Some refactoring: regions etc. 
  • Added MessageBox for button "Update solution List".   

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
