Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database

SQL Object Dependencies with Simple RegEx and Some Algo Magic

5.00/5 (2 votes)
10 Jan 2018CPOL4 min read 11.6K  
A simple yet powerful tool to analyze SQL object dependencies in quite a fast and reliable manner.

Introduction

Databases came to the world in 70. Since that, they became bigger in size, with more complicated logic and still keep growing. So, there are many tools, SQL parsers, which tokenize by-object script.

But common parsers (ANTLR, e.g.) may face with script parsing errors, each SQL dialect has its own specials, and time issues on huge number of DB objects.

So, I want to show much simpler, a RegEx+some trick way.

Main Concepts

  1. Get one-file SQL script of whole db objects.
  2. Split SQL script onto text blocks by using regex that match DDL statements
  3. Search some text string among these text blocks

Implementation

1. Take All SQL Objects Script

Each RDBMS has an option to generate SQL drop/create scripts. So, first we get single SQL script of all database objects. Or we can take northwind.sql as an example.

2. Get all DDL Statements with Regular Expression

I use this expression:

C#
\b(create|alter)\s+(unique)*\s*(nonclustered)*\s*
(procedure|proc|table|trigger|view|function|index)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*

FastColoredTextBox gives the great tools to work with regular expressions, so we use:

C#
var result = range.GetRangesByLines(regexStr, RegexOptions.IgnoreCase);

First of all, I built some useful regex and collect them in one static class:

C#
public static class RegexValues
{
    public static string SqlCmdObjects = @"\[\$\(([^=<>\[\]\s\']+)\)\].\[[^=<>\s\']+\]";
    public static string SqlCmdObjectsShort = @"\$\(([^=<>\[\]\s\']+)\)";

    public static string DdlObjects = @"\b(create)\s+(procedure|proc|table|trigger|
           view|function)\b\s\[\$\(([^=<>\[\]\s\']+)\)\].\[[^=<>\s\']+\]";
    public static string DdlObjectsPrepared = @"\b(create)\s+(procedure|proc|table|
           trigger|view|function)\b\s\[([^=<>\[\]\s\']+)\].(\[[^=<>\s\']+\])";

    public static string DdlObjects_ = @"\b(create)\s+(procedure|proc|table|
                  trigger|view|function)\b\s\$\(([^=<>\[\]\s\']+)\).[^=<>\s\']+";

    public static string DdlObjectsPrepared_ = @"\b(create)\s+(procedure|proc|table|
                  trigger|view|function)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*";

    public static string DdlObjectsPreparedProcViewTrigger = @"\b(create)\s+
    (procedure|proc|trigger|view|function)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*";

    public static string DdlObjectsPreparedWithIndex = @"\b(create)\s+(unique)*\s*
    (nonclustered)*\s*(procedure|proc|table|trigger|view|function|index)\b\s
    ([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*";

    public static string DdlIndexAll = @"\b(create|alter)\s+(procedure|proc|table|
                  trigger|view|function)\b\s\[([^=<>\[\]\s\']+)\].\[[^=<>\s\']+\]";

    public static string Variables = @"\@([^=<>\s\'\)\(\,]+)";
    public static string SqlCmdVariables = @"\:SETVAR\s+([a-zA-Z_]+)\s+([a-zA-Z_]+)";
}
  • SqlCmdObjects - matches for [$(some_var)].[obj_name] or [$(some_var)].[obj_schema].[obj_name]
  • SqlCmdObjectsShort - matches for $(some_var)
  • DdlObjects - same as SqlCmdObjects plus create PROC/TABLE/VIEW/FUNC/TRIGGER statements
  • DdlObjectsPrepared - same as DdlObjects, but removed $ (sqlcmd variable) restriction
  • DdlObjects_ - same as DdlObjects, but removed enclosing [] restriction
  • DdlObjectsPrepared_ - same as DdlObjectsPrepared, but removed enclosing [] restriction
  • DdlObjectsPreparedProcViewTrigger - DDL statements for only proc, views, triggers and functions, i.e., only code objects
  • DdlObjectsPreparedWithIndex - same as DdlObjectsPrepared_, but extended with index statements
  • DdlIndexAll - same as DdlObjectsPrepared_, but extended with alter statements
  • Variables - regex that finds all @variables in script
  • SqlCmdVariables - finds SQL CMD variables (like :SETVAR var1 val1)

The heart of this software is the following lines of code:

C#
var range = Syncronized(() => fastColoredTextBox1.Selection.Clone());
range.Normalize();

range.Start = new Place(0, 0);
range.End = Syncronized(() => new Place(tb.GetLineLength(tb.LinesCount - 1),
                                        tb.LinesCount - 1));

So we just load SQL script into FastColoredTextBox (thanks alot to its authors!!!) and apply some regex to all its contents. As the output, we got RESULT variable, which contains a list of found ranges.

Range is the powerful class (thanks again to FastColoredTextBox author). Range contains line number and column number of found fragment in SQL script. We just store list of ranges in virtual listbox (common technique) and on SelectedIndexChanged do the following:

C#
fastColoredTextBox1.Selection = range;
fastColoredTextBox1.DoSelectionVisible();

by these two lines, we instantly navigate to found piece of code (i.e., DDL statement).

Then we build another list of ranges, but instead of regex string, we put the name of some object to be found.

The last step is to find occurrences of second list of ranges in the first list of ranges. This is done in method:

C#
private void FindUsage(string regex)

of Form1.cs.

That's all, the rest of the work is just common .NET coding to build up all together.

As a brief to the above - the logic is as follows:

  1. We find all create/alter statements by using regex. And remember those line numbers.
  2. We find all occurrences of text (object, e.g., table name) by using regex. And also remember line numbers.
  3. Now join these two datasets, finding, where the text line is in between create/alter statements lines

As a result, we got a simple GUI designer, where you can:

  • open a SQL script or whole folder (scripts will be merged into temp file and opened as single script)
  • Search all DDL statements (CREATE/ALTER) - list will be build on the left pane, with full navigation support.
  • select some piece of text by mouse (e.g., some SQL object name)
  • right click and select "Find Usages" - list will be build on the right pane - these are SQL objects, containing selected text

Another good thing is that you can search not only SQL objects occurrence in SQL objects, but almost any piece of code, text, comment, etc.

Important Notice

As this way of analyzing SQL dependencies is based not on metadata info, you should always keep in mind that you're joining two datasets of rows/ So there are some limitations, or, better say, "features" :)

Let's say we have a stored procedure:

SQL
create proc test

as

declare @somevar int

create table #tmp(

colum_we_search nvarchar(255),
somevar int)

If we apply regex matching "CREATE TABLE" statement, we match temp table along with CREATE PROC.

Then, if we try to search "COLUMN_WE_SEARCH" - it will be found inside that temp table #tmp, not in procedure test.

This can be worked around with more precise initial regex. For such cases, I wrote DdlObjectsPreparedProcViewTrigger regex statement (see the above).

Points of Interest

I want to add MSAGL https://github.com/Microsoft/automatic-graph-layout support to visualize dependencies.

FastColoredTextBox - is the high-end, the total breakthrough! I didn't even expect that it would have so many cool features!

History

  • 10th January, 2018: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)