Introduction
Databases came to the world in 70. Since that, they became bigger in size, with more complicated logic and still keep growing. So, there are many tools, SQL parsers, which tokenize by-object script.
But common parsers (ANTLR, e.g.) may face with script parsing errors, each SQL dialect has its own specials, and time issues on huge number of DB objects.
So, I want to show much simpler, a RegEx+some trick way.
Main Concepts
- Get one-file SQL script of whole db objects.
- Split SQL script onto text blocks by using regex that match DDL statements
- Search some text string among these text blocks
Implementation
1. Take All SQL Objects Script
Each RDBMS has an option to generate SQL drop/create scripts. So, first we get single SQL script of all database objects. Or we can take northwind.sql as an example.
2. Get all DDL Statements with Regular Expression
I use this expression:
\b(create|alter)\s+(unique)*\s*(nonclustered)*\s*
(procedure|proc|table|trigger|view|function|index)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*
FastColoredTextBox
gives the great tools to work with regular expressions, so we use:
var result = range.GetRangesByLines(regexStr, RegexOptions.IgnoreCase);
First of all, I built some useful regex
and collect them in one static
class:
public static class RegexValues
{
public static string SqlCmdObjects = @"\[\$\(([^=<>\[\]\s\']+)\)\].\[[^=<>\s\']+\]";
public static string SqlCmdObjectsShort = @"\$\(([^=<>\[\]\s\']+)\)";
public static string DdlObjects = @"\b(create)\s+(procedure|proc|table|trigger|
view|function)\b\s\[\$\(([^=<>\[\]\s\']+)\)\].\[[^=<>\s\']+\]";
public static string DdlObjectsPrepared = @"\b(create)\s+(procedure|proc|table|
trigger|view|function)\b\s\[([^=<>\[\]\s\']+)\].(\[[^=<>\s\']+\])";
public static string DdlObjects_ = @"\b(create)\s+(procedure|proc|table|
trigger|view|function)\b\s\$\(([^=<>\[\]\s\']+)\).[^=<>\s\']+";
public static string DdlObjectsPrepared_ = @"\b(create)\s+(procedure|proc|table|
trigger|view|function)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*";
public static string DdlObjectsPreparedProcViewTrigger = @"\b(create)\s+
(procedure|proc|trigger|view|function)\b\s([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*";
public static string DdlObjectsPreparedWithIndex = @"\b(create)\s+(unique)*\s*
(nonclustered)*\s*(procedure|proc|table|trigger|view|function|index)\b\s
([^=<>\s\']+).([^=<>\s\'\(]+)[*\s\b\t]*";
public static string DdlIndexAll = @"\b(create|alter)\s+(procedure|proc|table|
trigger|view|function)\b\s\[([^=<>\[\]\s\']+)\].\[[^=<>\s\']+\]";
public static string Variables = @"\@([^=<>\s\'\)\(\,]+)";
public static string SqlCmdVariables = @"\:SETVAR\s+([a-zA-Z_]+)\s+([a-zA-Z_]+)";
}
SqlCmdObjects
- matches for [$(some_var)].[obj_name]
or [$(some_var)].[obj_schema].[obj_name]
SqlCmdObjectsShort
- matches for $(some_var)
DdlObjects
- same as SqlCmdObjects
plus create PROC
/TABLE
/VIEW
/FUNC
/TRIGGER
statements DdlObjectsPrepared
- same as DdlObjects
, but removed $
(sqlcmd
variable) restriction DdlObjects_
- same as DdlObjects
, but removed enclosing []
restriction DdlObjectsPrepared_
- same as DdlObjectsPrepared
, but removed enclosing []
restriction DdlObjectsPreparedProcViewTrigger
- DDL statements for only proc, views, triggers and functions, i.e., only code objects DdlObjectsPreparedWithIndex
- same as DdlObjectsPrepared_
, but extended with index
statements DdlIndexAll
- same as DdlObjectsPrepared_
, but extended with alter
statements Variables
- regex that finds all @variables
in script SqlCmdVariables
- finds SQL CMD variables (like :SETVAR var1 val1
)
The heart of this software is the following lines of code:
var range = Syncronized(() => fastColoredTextBox1.Selection.Clone());
range.Normalize();
range.Start = new Place(0, 0);
range.End = Syncronized(() => new Place(tb.GetLineLength(tb.LinesCount - 1),
tb.LinesCount - 1));
So we just load SQL script into FastColoredTextBox (thanks alot to its authors!!!) and apply some regex to all its contents. As the output, we got RESULT
variable, which contains a list of found ranges.
Range
is the powerful class (thanks again to FastColoredTextBox
author). Range
contains line number and column number of found fragment in SQL script. We just store list of ranges in virtual listbox
(common technique) and on SelectedIndexChanged
do the following:
fastColoredTextBox1.Selection = range;
fastColoredTextBox1.DoSelectionVisible();
by these two lines, we instantly navigate to found piece of code (i.e., DDL statement).
Then we build another list of ranges, but instead of regex string, we put the name of some object to be found.
The last step is to find occurrences of second list of ranges in the first list of ranges. This is done in method:
private void FindUsage(string regex)
of Form1.cs.
That's all, the rest of the work is just common .NET coding to build up all together.
As a brief to the above - the logic is as follows:
- We find all
create
/alter
statements by using regex. And remember those line numbers. - We find all occurrences of text (object, e.g., table name) by using regex. And also remember line numbers.
- Now join these two datasets, finding, where the text line is in between
create
/alter
statements lines
As a result, we got a simple GUI designer, where you can:
- open a SQL script or whole folder (scripts will be merged into temp file and opened as single script)
- Search all DDL statements (
CREATE
/ALTER
) - list will be build on the left pane, with full navigation support. - select some piece of text by mouse (e.g., some SQL object name)
- right click and select "Find Usages" - list will be build on the right pane - these are SQL objects, containing selected text
Another good thing is that you can search not only SQL objects occurrence in SQL objects, but almost any piece of code, text, comment, etc.
Important Notice
As this way of analyzing SQL dependencies is based not on metadata info, you should always keep in mind that you're joining two datasets of rows/ So there are some limitations, or, better say, "features" :)
Let's say we have a stored procedure:
create proc test
as
declare @somevar int
create table #tmp(
colum_we_search nvarchar(255),
somevar int)
If we apply regex matching "CREATE TABLE
" statement, we match temp
table along with CREATE PROC
.
Then, if we try to search "COLUMN_WE_SEARCH
" - it will be found inside that temp
table #tmp
, not in procedure test.
This can be worked around with more precise initial regex. For such cases, I wrote DdlObjectsPreparedProcViewTrigger regex
statement (see the above).
Points of Interest
I want to add MSAGL https://github.com/Microsoft/automatic-graph-layout support to visualize dependencies.
FastColoredTextBox
- is the high-end, the total breakthrough! I didn't even expect that it would have so many cool features!
History
- 10th January, 2018: Initial version