The Semantic Database In Action

Marc Clifton

4 Nov 2014, CPOL

A Feed Reader Use Case, demonstrating in part how the Semantic Database works, as well as the Higher Order Programming Environment IDE

Watch the Video!

The video is a great way to see the process described in this article. It's pretty raw, so at some point I might redo it to make it a bit more professional. The audio is rather low for some reason, so you'll have to turn your volume settings up.

Grab the Code!

git clone https://github.com/cliftonm/HOPE.git
cd HOPE
git checkout feed-reader-semantic-database

Introduction

This is Part II of the article series on the Semantic Database (Part I is here). In this article, we will again visit the idea of a feed reader (which I wrote about earlier with regard to Natural Language Processing), but this time we'll focus on the persistence and querying of semantic information using the Semantic Database that I wrote about in Part I.

The format for this article will basically be a text-based tutorial of the video that I haven't made yet, but with some additional "behind the scenes" code examples. This pattern is repeated throughout the article to illustrate what is going on in the receptors for each step.

It is assumed that you are familiar with the HOPE concept. If not, visit the HOPE website for tutorials, links to articles on Code Project, white papers, etc.

Creating A Feed Reader Applet

For each of the steps in this tutorial, there is a corresponding applet file, so you can load the applet and play with just the behavior in that step.

Step 1: The Basics First

Creating a basic feed reader applet requires only two receptors:

Feed Reader
Carrier List Viewer

In the HOPE IDE, select the Carrier List Viewer and the Feed Reader receptors and click the Add button:

This creates two receptors on the HOPE surface, as well as an un-initialized viewer:

Double-clicking on the Carrier List Viewer brings up its configuration UI:

Enter "Code Project" for the window name and select the semantic type RSSFeedItem, as we want to be displaying feed items emitted by the feed reader. When we save the configuration, notice that the list viewer's columns are now configured based on the selected semantic type:

Also notice that the surface now shows the two receptors to be interacting with each other. From the View menu, we'll select "Show Semantics" to see what the protocols are, thus revealing:

We want to disable the RSSFeedUrl signal from being emitted by the carrier list viewer. Why? Because later, when we interact with the List Viewer by double-clicking on a feed item to display it in a browser page, we don't want the feed URL to be sent to the Feed Reader, which would re-trigger a read operation (the Feed Reader can receive this semantic structure, aka protocol, as a way to programmatically trigger the Feed Reader.)

To disable this protocol, right-click on the Carrier List Viewer and uncheck the protocol RSSFeedUrl:

We now see that this protocol is no longer being received by the Feed Reader:

Now let's configure the Feed Reader receptor. Double-click on it and enter the URL for the Code Project Article feed (http://www.codeproject.com/WebServices/ArticleRSS.aspx):

Click on Save, and notice that, after a few seconds, the list view populates with the items in the feed. You will see the items being sent to the list viewer as little red triangles:

And observe the items in the List Viewer:

This applet is saved as "Step 1 - The Basics" and, whenever you load it into the HOPE IDE, it will re-acquire the feed and display the articles.

When you work with semantic structures, you gain a lot of flexibility in the ways data can be presented. One of the simplest options is to inspect a particular sub-type of a semantic structure. For example, we can configure the Carrier List Viewer to respond to any other protocol in the RSSFeedItem semantic structure. If we want to view only RSS titles, which are one of the semantic structures of which RSSFeedItem is composed, simply change the semantic type to "RSSFeedTitle":

Now, when we receive feeds, we only see their titles:

We will take advantage of this capability later on when we create custom ontology queries (a fancier word than "relationships" or "table joins", and in my opinion, more accurate when dealing with a semantic database, and besides, I enjoy creating buzzword bingo terms as much as the next guy.)

Behind the Scenes

The Feed Reader

The feed reader reads the feed asynchronously, returning a SyndicationFeed object:

protected async Task<SyndicationFeed> GetFeedAsync(string feedUrl)
{
  CreateCarrier("LoggerMessage", signal => 
  {
    signal.TextMessage.Text.Value = "Acquiring feed " + feedUrl + ".";
    signal.MessageTime = DateTime.Now;
  });

  SyndicationFeed feed = await Task.Run(() =>
  {
    // To handle this error:
    // For security reasons DTD is prohibited in this XML document. To enable DTD processing set the DtdProcessing property on XmlReaderSettings to Parse and pass the settings into XmlReader.Create method.
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.XmlResolver = null;
    settings.DtdProcessing = DtdProcessing.Ignore;

    // Pass the settings to the reader; otherwise the DTD options above have no effect.
    XmlReader xr = XmlReader.Create(feedUrl, settings);
    SyndicationFeed sfeed = SyndicationFeed.Load(xr);
    xr.Close();

    return sfeed;
  });

  CreateCarrier("LoggerMessage", signal =>
  {
    signal.TextMessage.Text.Value = "Feed " + feedUrl + " has " + feed.Items.Count().ToString() + " items.";
    signal.MessageTime = DateTime.Now;
  });

  return feed;
}

You'll note some logging messages as well. Once the reader acquires the items, it creates carriers with the RSSFeedItem protocol and populates the signal with the syndication feed items:

protected void EmitFeedItems(SyndicationFeed feed, int maxItems = Int32.MaxValue)
{
  // Allow -1 to also represent max items.
  int max = (maxItems == -1 ? feed.Items.Count() : maxItems);
  max = Math.Min(max, feed.Items.Count()); // Which ever is less.

  feed.Items.ForEachWithIndexOrUntil((item, idx) =>
  {
    CreateCarrier("RSSFeedItem", signal =>
    {
      signal.RSSFeedName.Name.Text.Value = FeedName;
      signal.RSSFeedTitle.Title.Text.Value = item.Title.Text;
      signal.RSSFeedUrl.Url.Value = item.Links[0].Uri.ToString();
      signal.RSSFeedDescription.Description.Text.Value = item.Summary.Text;
      signal.RSSFeedAuthors.Text.Value = String.Join(", ", item.Authors.Select(a => a.Name));
      signal.RSSFeedCategories.Text.Value = String.Join(", ", item.Categories.Select(c => c.Name));
      signal.RSSFeedPubDate.Value = item.PublishDate.LocalDateTime;
    });
  }, ((item, idx) => idx >= max));
}
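The ForEachWithIndexOrUntil extension method used above is part of the HOPE code base and is not shown in this article. A minimal sketch of how such a helper might behave, assuming the "until" predicate is tested before the action runs, is given below. The class name EnumerableSketch and the ClampMax helper are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableSketch
{
    // Hypothetical stand-in for HOPE's ForEachWithIndexOrUntil: run the action
    // on each item with its index, stopping as soon as the predicate is true.
    public static void ForEachWithIndexOrUntil<T>(
        this IEnumerable<T> items,
        Action<T, int> action,
        Func<T, int, bool> until)
    {
        int idx = 0;

        foreach (T item in items)
        {
            if (until(item, idx))
            {
                break;
            }

            action(item, idx);
            idx++;
        }
    }

    // Mirrors the clamping in EmitFeedItems: -1 means "no limit".
    public static int ClampMax(int maxItems, int available)
    {
        int max = (maxItems == -1) ? available : maxItems;
        return Math.Min(max, available);
    }
}
```

With a predicate of idx >= max, the action runs for indices 0 through max - 1, which is the behavior EmitFeedItems relies on.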

The Carrier List Viewer

On the receiving side, the Carrier List Viewer is paying attention to the protocol that we configured earlier:

protected void ConfigureBasedOnSelectedProtocol()
{
  ProtocolName = cbProtocols.SelectedValue.ToString();
  CreateViewerTable();
  ListenForProtocol();
  UpdateCaption();
}

The workhorse function here is CreateViewerTable, which queries the semantic structure (drilling down into sub-types) and creates the columns, ordered by the Ordinality property value set in the schema. Column names use aliases, as determined by the Alias property in the schema.

protected virtual void CreateViewerTable()
{
  if (!String.IsNullOrEmpty(ProtocolName))
  {
    DataTable dt = new DataTable();
    List<IFullyQualifiedNativeType> columns = rsys.SemanticTypeSystem.GetFullyQualifiedNativeTypes(ProtocolName).OrderBy(fqn=>fqn.Ordinality).ToList();
    uniqueKey.Clear();

    columns.ForEach(col =>
    {
      try
      {
        DataColumn dc = new DataColumn(col.FullyQualifiedName, col.NativeType.GetImplementingType(rsys.SemanticTypeSystem));

        // If no alias, then use the FQN, skipping the root protocol name.
        String.IsNullOrEmpty(col.Alias).Then(() => dc.Caption = col.FullyQualifiedName.RightOf('.')).Else(() => dc.Caption = col.Alias);
        dt.Columns.Add(dc);
        col.UniqueField.Then(() => uniqueKey.Add(col));
      }
      catch
      {
        // If the implementing type is not known by the native type system (for example, List<dynamic> used in the WeatherInfo protocol), we ignore it.
        // TODO: We need a way to support implementing lists and displaying them in the viewer as a sub-collection.
        // WeatherInfo protocol is a good example.
      }
    });

    dvSignals = new DataView(dt);
    dgvSignals.DataSource = dvSignals;

    foreach(DataColumn dc in dt.Columns)
    {
      dgvSignals.Columns[dc.ColumnName].HeaderText = dc.Caption;
    }
  }
}
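The ordering and caption rules in CreateViewerTable can be isolated from the UI. The sketch below uses a hypothetical ColumnInfoSketch class in place of IFullyQualifiedNativeType, and it assumes that RightOf('.') returns everything after the first dot, which the article does not confirm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the three properties CreateViewerTable reads.
public class ColumnInfoSketch
{
    public string FullyQualifiedName { get; set; }
    public string Alias { get; set; }
    public int Ordinality { get; set; }
}

public static class CaptionSketch
{
    // Orders columns by ordinality, then picks the alias when there is one,
    // or else the fully qualified name without its root protocol name.
    public static List<string> OrderedCaptions(IEnumerable<ColumnInfoSketch> columns)
    {
        return columns
            .OrderBy(c => c.Ordinality)
            .Select(c => String.IsNullOrEmpty(c.Alias)
                ? c.FullyQualifiedName.Substring(c.FullyQualifiedName.IndexOf('.') + 1)
                : c.Alias)
            .ToList();
    }
}
```

A column with no alias and the name "RSSFeedItem.RSSFeedUrl.Url.Value" would therefore be captioned "RSSFeedUrl.Url.Value" under this assumption.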

When the Carrier List Viewer receives a signal of the specified semantic structure (aka protocol), it populates a row of the grid, as long as that row is not a duplicate. Duplicates are determined by whether the semantic structure has flagged the structure, element, and/or native types as unique (refer to the article on Semantic Databases for a discussion of this.)

protected void ShowSignal(dynamic signal)
{
  form.IfNull(() => ReinitializeUI());
  List<IFullyQualifiedNativeType> colValues = rsys.SemanticTypeSystem.GetFullyQualifiedNativeTypeValues(signal, ProtocolName);

  if (!RowExists(colValues))
  {
    try
    {
      DataTable dt = dvSignals.Table;
      DataRow row = dt.NewRow();
      colValues.ForEach(cv =>
      {
        try
        {
          row[cv.FullyQualifiedName] = cv.Value;
        }
        catch
        {
          // Ignore columns we can't handle.
          // TODO: Fix this at some point. WeatherInfo protocol is a good example.
        }
      });
      dt.Rows.Add(row);
    }
    catch (Exception ex)
    {
      EmitException(ex);
    }
  }
}
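RowExists is not shown in this article. A minimal sketch of a duplicate check of this kind, assuming a row counts as a duplicate when it matches the incoming values on every column flagged as unique, might look like the following. The class name, the parameter types, and the choice to treat "no unique columns" as "never a duplicate" are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DuplicateCheckSketch
{
    // Returns true when some existing row matches the candidate on every
    // unique column. With no unique columns, nothing is treated as a duplicate.
    public static bool RowExists(
        DataTable table,
        IList<string> uniqueColumns,
        IDictionary<string, object> candidate)
    {
        if (uniqueColumns.Count == 0)
        {
            return false;
        }

        return table.Rows.Cast<DataRow>().Any(row =>
            uniqueColumns.All(col => object.Equals(row[col], candidate[col])));
    }
}
```

Under this scheme, two feed items with the same URL but different descriptions are the same row if only the URL is flagged as unique.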

Step 2: Viewing the Feed Item in a Browser

Obviously we will want to view the feed items that interest us. By default, when we double click on the Carrier List Viewer, it will emit the protocol and sub-protocols to any receptor listening to those protocols. In our case, let's use a web page viewer, which is a simple viewer based on .NET's WebBrowser control. First, we add the Web Page Viewer receptor to the surface by selecting it and clicking Add:

Notice that the sub-protocol "Url" is being received by our new receptor from both the Carrier List Viewer and the Feed Reader:

We don't want the Web Page Viewer displaying every page that the Feed Reader emits as part of the RSSFeedItem protocol, so we'll right-click on the Feed Reader and disable the Url semantic type:

We now see the desired configuration:

Lastly, when we double-click on an item in the list, we get a browser window with the item content:

One of the key aspects of the HOPE system is that receptors are lightweight components. For example, instead of using the Web Page Viewer receptor, we could instead have used the Web Page Launcher receptor, which would launch the page as a tab in your browser. It is this component-oriented (another buzzword!) development approach, coupled with the full semantics of the data, that makes it so easy to build applications.

Behind the Scenes

The Carrier List Viewer Receptor

What happens when we double-click on an item in the viewer? Essentially, the opposite occurs, where the signal is reconstructed from the selected row:

protected virtual void OnCellContentDoubleClick(object sender, DataGridViewCellEventArgs e)
{
  ISemanticTypeStruct st = rsys.SemanticTypeSystem.GetSemanticTypeStruct(ProtocolName);
  dynamic outsignal = rsys.SemanticTypeSystem.Create(ProtocolName);
  List<IFullyQualifiedNativeType> ntList = rsys.SemanticTypeSystem.GetFullyQualifiedNativeTypes(ProtocolName);

  ntList.ForEach(nt =>
  {
    // Store the value into the signal using the FQN.
    string colName = nt.FullyQualifiedName;

    // Columns that can't be mapped to native types directly (like lists) are not part of the data table.
    if (dgvSignals.Columns.Contains(colName))
    {
      rsys.SemanticTypeSystem.SetFullyQualifiedNativeTypeValue(outsignal, nt.FullyQualifiedNameSansRoot, dvSignals[e.RowIndex][colName]);
    }
  });

  // Send the record on its way.
  rsys.CreateCarrier(this, st, outsignal);
}
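The key idea in the handler above is that each grid column is named by a dotted path, and that path says where the cell value belongs in the outgoing signal. The real SetFullyQualifiedNativeTypeValue works on HOPE's dynamically generated types. The sketch below only illustrates the path-walking idea, using nested dictionaries as a stand-in for a signal, and every name in it is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class DottedPathSketch
{
    // Walks (and creates as needed) nested dictionaries for every path segment
    // except the last, then stores the value under the final segment.
    public static void SetValue(IDictionary<string, object> root, string path, object value)
    {
        string[] parts = path.Split('.');
        IDictionary<string, object> node = root;

        for (int i = 0; i < parts.Length - 1; i++)
        {
            object child;

            if (!node.TryGetValue(parts[i], out child) || !(child is IDictionary<string, object>))
            {
                child = new Dictionary<string, object>();
                node[parts[i]] = child;
            }

            node = (IDictionary<string, object>)child;
        }

        node[parts[parts.Length - 1]] = value;
    }
}
```

Calling SetValue(signal, "RSSFeedUrl.Url.Value", cellValue) for each column rebuilds the nested structure one leaf at a time, which is the reverse of flattening the signal into a grid row.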

The Web Page Viewer Receptor

The implementation here is trivial, so I'll show you the whole class:

public class WebBrowserReceptor : WindowedBaseReceptor
{
  protected WebBrowser browser;

  public override string Name { get { return "Web Page Viewer"; } }
  public override bool IsEdgeReceptor { get { return true; } }

  public WebBrowserReceptor(IReceptorSystem rsys)
    : base("webPageViewer.xml", false, rsys)
  {
    AddReceiveProtocol("Url", (Action<dynamic>)(signal => ShowPage(signal.Value)));
  }

  protected void ShowPage(string url)
  {
    form.IfNull(() => ReinitializeUI());
    browser.Navigate(new Uri(url));
  }

  protected override void InitializeUI()
  {
    base.InitializeUI();
    browser = (WebBrowser)mycroParser.ObjectCollection["browser"];
  }
}

Notice that it listens to "Url" protocols and, on receipt, navigates the browser to that page, creating the form if it doesn't exist. The only "complexity" is the backing XML code for the UI:

<?xml version="1.0" encoding="utf-8" ?>
<MycroXaml Name="Form"
  xmlns:wf="System.Windows.Forms, System.Windows.Forms, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
  xmlns:def="def"
  xmlns:ref="ref">
  <wf:Form Text="Web Page Viewer" Size="400, 300" StartPosition="CenterScreen" ShowInTaskbar="false">
    <wf:Controls>
      <wf:WebBrowser def:Name="browser" Dock="Fill"/>
    </wf:Controls>
  </wf:Form>
</MycroXaml>

Step 3: More Feed Readers

We'll add a few more feed readers:

Gigaom: https://gigaom.com/feed/

Wired: http://feeds.wired.com/wired/index

InfoWorld: http://www.infoworld.com/index.rss

Ars Technica: http://feeds.arstechnica.com/arstechnica/technology-lab?format=xml

In each new feed reader, we disable the Url and RSSFeedUrl protocols from being emitted; otherwise these are received by the Web Page Viewer and the Feed Reader receptors respectively (try saying "receptors respectively" five times real fast). You can arrange the receptors as you like:

And enjoy a list of all recent items:

Step 4: Refreshing Feeds

It would be nice if the Feed Reader receptors re-acquired the feed at some regular interval. Fortunately, the Feed Reader receptor will respond to an RSSFeedRefresh protocol. We can emit this protocol at a timed interval by adding an Interval Timer receptor:

and configuring it to emit RSSFeedRefresh every 30 minutes:

Once the protocol is specified, we observe how the Interval Timer receptor emits this protocol to the Feed Reader receptors. The UI also displays the countdown:

Behind the Scenes

The Feed Reader Receptor

The RSSFeedRefresh protocol is a semantic structure with no structure! It is simply a protocol -- there are no sub-types.

In the Feed Reader receptor, receiving this protocol:

AddReceiveProtocol("RSSFeedRefresh", (Action<dynamic>)(s => AcquireFeed(lastFeedUrl)));

causes the feed to be re-acquired.

The reason we don't implement automatic refresh in the Feed Reader receptor is that we want to maintain a separation of concerns. Why should the Feed Reader receptor know anything about re-acquiring the feed at some designated interval? This paints the receptor into a corner, as there may be a variety of different ways we want to trigger re-acquiring the feed, having nothing to do with timers!

The Interval Timer Receptor

This receptor is very simple as well -- it configures a Timer to fire every second (so that the UI can be updated) and, when the configured interval has elapsed, it emits a carrier with the specified protocol. Note that the signal has no values.

protected void FireEvent(object sender, EventArgs e)
{
  if (triggerTime <= DateTime.Now)
  {
    triggerTime = DateTime.Now + new TimeSpan(days, hours, minutes, seconds);

    if (Enabled)
    {
      CreateCarrierIfReceiver(ProtocolName, signal => { });
    }
  }

  TimeSpan ts = triggerTime - DateTime.Now;

  Subname = String.Format("{0}:{1}:{2:D2}:{3:D2}", ts.Days, ts.Hours, ts.Minutes, ts.Seconds);
}
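The trigger-and-countdown logic in FireEvent can be separated from the Timer and the UI. The sketch below is one way to do that, assuming the current time is passed in rather than read from DateTime.Now. The class CountdownSketch and its members are hypothetical and are not part of HOPE.

```csharp
using System;

public class CountdownSketch
{
    private readonly TimeSpan interval;

    public DateTime TriggerTime { get; private set; }

    public CountdownSketch(TimeSpan interval, DateTime now)
    {
        this.interval = interval;
        TriggerTime = now + interval;
    }

    // Call once per timer tick. Returns true when the interval has elapsed,
    // and re-arms the trigger relative to the supplied clock value.
    public bool Tick(DateTime now)
    {
        if (TriggerTime <= now)
        {
            TriggerTime = now + interval;
            return true;
        }

        return false;
    }

    // Same days:hours:mm:ss format that FireEvent writes to Subname.
    public string Remaining(DateTime now)
    {
        TimeSpan ts = TriggerTime - now;
        return String.Format("{0}:{1}:{2:D2}:{3:D2}", ts.Days, ts.Hours, ts.Minutes, ts.Seconds);
    }
}
```

Passing the clock in as a parameter is a design choice made here so the logic can be exercised without waiting for real time to pass.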

Step 5: Adding Persistence

This has all been fine, but we need to get to the meat of the matter now, which is adding persistence. So, instead of displaying the feed items with a Carrier List Viewer, we'll replace this receptor with the Semantic Database receptor, and remove the Web Page Viewer. Drag the Carrier List Viewer and Web Page Viewer receptors off the surface, and they will be removed.

Next, add the Semantic Database receptor:

Once the receptor is on the surface, we can configure it for which protocols (semantic structures) we want to persist. The only semantic structure we're interested in is "RSSFeedItem", so we configure it to persist that semantic type:

Once this is done, we note that the Feed Reader receptors are emitting RSSFeedItem to the Semantic Database receptor:

Behind the Scenes

There isn't any "Behind the Scenes" here. We haven't written any code in the Feed Reader receptors to communicate with a database; the signals are now persisted simply because we have a receptor (the Semantic Database in this case) listening to the protocol that the Feed Reader emits. In other words, we can add persistence anywhere in the application by listening to the protocols that the receptors emit.

Furthermore, the Semantic Database receptor automatically checks to see if the supporting tables exist for the protocols that it listens to, and if they don't, it creates the necessary tables.

Of course, saying there's nothing going on behind the scenes is a bit of a lie -- there's a lot going on in the Semantic Database receptor, which I've described in the previous article.
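The point that persistence is added purely by listening can be shown with a toy dispatcher. The sketch below is not the HOPE receptor system. It is a minimal, hypothetical publish/subscribe class in which the emitter knows nothing about who receives its signals.

```csharp
using System;
using System.Collections.Generic;

public class ProtocolBusSketch
{
    private readonly Dictionary<string, List<Action<IDictionary<string, object>>>> listeners =
        new Dictionary<string, List<Action<IDictionary<string, object>>>>();

    // Registers a handler for every signal emitted under the given protocol name.
    public void Listen(string protocol, Action<IDictionary<string, object>> handler)
    {
        List<Action<IDictionary<string, object>>> handlers;

        if (!listeners.TryGetValue(protocol, out handlers))
        {
            handlers = new List<Action<IDictionary<string, object>>>();
            listeners[protocol] = handlers;
        }

        handlers.Add(handler);
    }

    // Delivers the signal to each listener and returns how many received it.
    public int Emit(string protocol, IDictionary<string, object> signal)
    {
        List<Action<IDictionary<string, object>>> handlers;

        if (!listeners.TryGetValue(protocol, out handlers))
        {
            return 0;
        }

        foreach (Action<IDictionary<string, object>> handler in handlers)
        {
            handler(signal);
        }

        return handlers.Count;
    }
}
```

A "database" here would be nothing more than a handler registered with Listen("RSSFeedItem", ...) that stores what it receives. The code that calls Emit does not change when that listener is added or removed.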

Step 6: Add some Logging

Let's add some logging capability. The Semantic Database emits the SQL statements that it generates, and we want to look at those, so we'll add a Carrier List Viewer receptor and a Text Display receptor:

The result (after configuring the Carrier List Viewer to receive LoggerMessage protocols) is something of a mess because just about every receptor has some logging output:

Instead of turning off selective protocols in all the Feed Readers, let's put the Carrier List Viewer and Semantic Database into their own membrane:

Now, we just turn off the Text protocol in the Semantic Database and, by double clicking anywhere in the membrane, we enable RSSFeedItem as a protocol that can permeate into the membrane:

Giving us the desired configuration:

Now we just wait 5 more minutes for the feeds to refresh so that we can see in the log what the database is doing. In the meantime, let's look at the log to see what the Semantic Database did with a SQLite database (it's nice to work with SQLite because you can simply delete the entire file and start afresh). We note first that it created the various tables necessary for the RSSFeedItem structure:

and indeed we see these tables using the SQLite Database Browser:

Next, once the feeds re-acquire (the signals emitted by the Semantic Database come from it checking the database schema against the semantic schema):

We see (just once, when the database is created, because most of this will be duplicate data on subsequent re-acquisitions) a lot of checks for uniqueness and a lot of insert statements:

You can see the semantic structure for the RSSFeedItem being built up. It's a lot of transactions because the RSSFeedItem structure is rather deep and we have to test every record for uniqueness, since we know the Feed Reader will be re-acquiring the same feed items much of the time.
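The "check for uniqueness, then insert" pattern visible in the log can be illustrated without a database. The sketch below uses an in-memory dictionary as a stand-in for one table with a single unique field. It is only an illustration of the pattern, not the SQL that the Semantic Database generates, and the class UniqueTableSketch is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class UniqueTableSketch
{
    private readonly Dictionary<string, int> idsByValue = new Dictionary<string, int>();

    public int InsertCount { get; private set; }

    // Returns the ID of the row holding this value, inserting a new row only
    // when no row with the same value exists yet.
    public int GetOrInsert(string value)
    {
        int id;

        if (idsByValue.TryGetValue(value, out id))
        {
            return id;
        }

        id = idsByValue.Count + 1;
        idsByValue[value] = id;
        InsertCount++;

        return id;
    }
}
```

Re-acquiring the same feed item then costs a lookup but no insert, which matches the observation that the burst of insert statements appears only on the first acquisition.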

Happily, we didn't have to code these statements, we didn't have to create a complicated ORM and populate objects, we didn't have to figure out the data model beforehand -- the Semantic Database takes care of automating all of these processes.

Of course we all know that there are cases when automation doesn't get you to that last 1%, but fortunately we don't encounter any of those issues in this tutorial.

One last point - we can double-click on a line item in the logger to more easily review the SQL statements:

Behind the Scenes

We've already seen how selecting an item in a Carrier List Viewer emits a protocol. As the LoggerMessage semantic structure contains a Text semantic structure, the Text Viewer receptor, which listens for this protocol, will respond. The implementation of its core behavior is quite simple:

public ReceptorDefinition(IReceptorSystem rsys) : base(rsys)
{
  AddReceiveProtocol("Text", (Action<dynamic>)(signal =>
  {
    form.IfNull(() =>
    {
      InitializeViewer();
      UpdateFormLocationAndSize();
    });

    string text = signal.Value;

    if (!String.IsNullOrEmpty(text))
    {
      tb.AppendText(text.StripHtml());
      tb.AppendText("\r\n");
    }
  }));
}

Step 7: Querying the Database and Using a Real Feed Reader Control

Let's wrap what we've done in an outer membrane to create a "computational island", as we don't want any interactions with the next step. To do this, we left-click and drag to surround all the receptors and the membrane, resulting in:

We can put this "computational island" aside now and work on displaying feed items with the Feed List Viewer receptor, which is a derivation of the Carrier List Viewer. We'll start by adding several receptors:

Feed List Viewer
Semantic Database
Signal Creator
Interval Timer
Web Page Launcher

Resulting in:

Note the new dialog and the unconfigured receptors. But also note that a few things are already happening, once we separate out the receptors:

We see that the Feed List receptor is issuing a Query to the Semantic Database. It does this immediately, and since it is querying for bookmark categories, a response signal is being returned. We also note that the protocol "Url" is being received by the Web Page Launcher receptor (this is a sub-type of the RSSFeedBookmark protocol, and is therefore being "seen" by the Web Page Launcher receptor). To stop this (otherwise it'll actually launch web pages as a result of returning data from queries containing this sub-type), we right-click on the Semantic Database receptor and disable the "Url" emitted protocol (similar to how we disabled protocols earlier, so I won't show this yet again).

Also, we "know" that we want to configure this instance of the semantic database to persist the following semantic structures:

UrlVisited
RSSFeedItemDisplayed
RSSFeedBookmark

as the Feed List Viewer will emit these protocols as we interact with the UI. We double-click on the Semantic Database receptor and add these protocols:

We now see these protocols described in the interaction between the Feed Item List receptor and the Semantic Database Receptor:

Next, let's add an initial query to populate the display. We double-click on the Signal Creator and add the following:

Note the query and how we are "joining" several semantic structures.

The Semantic Database infers the joins from the structures themselves, creating this lovely query:

Once we've done this, the Feed Item List receptor is populated with data that was acquired while we were fussing with the feed reader persistence:

and we also note that the Query protocol is being sent from the Signal Creator receptor to the Semantic Database receptor:

The last step is to wire up the Interval Timer receptor. The Signal Creator listens to a "Resend" protocol (again with no actual value types). Here we'd like to re-query the database, say, every 10 minutes, so that new items picked up by the 30-minute feed refresh are displayed within 10 minutes of being acquired. We double-click on the Interval Timer receptor and, as before, configure it:

Now we have a new computational island (which I've wrapped into a membrane) that updates our feed display.

Our complete Feed Reader Applet now has two "computational islands", one for displaying feeds and the other for reading and persisting new feed items:

Behaviorally, the UI will show:

New feeds with a white background
Old feeds with a blue background
Visited feeds with a green background

(Incidentally, the first two items, while they look identical, are not -- the description has changed, but the URL has not. Because the URL is the same, the viewer shows that both items are visited even though I only visited one of them.)

You can bookmark feeds and provide a category and add a note (notes are displayed in the note field when viewing bookmarked items):

And you can query specific bookmarked items by selecting the category and clicking on the Show button:

Behind the Scenes

One of the interesting things to look at is what happens when you select a bookmark category and click on Show. Internally, the Feed Item List receptor dispatches a Query protocol:

protected void ShowItemInCategory(object sender, EventArgs args)
{
  ClearGrid();
  string categoryName = ((ComboBox)form.Controls.Find("cbCategories", false)[0]).SelectedItem.ToString();
  CreateCarrierIfReceiver("Query", signal =>
  {
    signal.QueryText = "RSSFeedBookmark, RSSFeedItem, UrlVisited, RSSFeedItemDisplayed where [BookmarkCategory] = @0 order by RSSFeedPubDate desc, RSSFeedName";
    signal.Param0 = categoryName;
  });
}

With regard to the Semantic Database, the first semantic structure is considered the "root" structure -- everything else is left-joined to this or other structures. Therefore, to acquire only the RSSFeedItem records that are bookmarked, we start with the RSSFeedBookmark structure as the first semantic type in the query.
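To make the "first type is the root" rule concrete, here is a small sketch that pulls the list of semantic types out of a query string like the one above. The actual query parser belongs to the Semantic Database and is covered in Part I. This helper, including its name and its simple keyword search, is only an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryTextSketch
{
    // Returns the semantic type names that precede any "where" or "order by"
    // clause, in the order written. The first entry is the root structure.
    public static List<string> ParseTypes(string queryText)
    {
        int end = queryText.Length;

        foreach (string keyword in new[] { " where ", " order by " })
        {
            int pos = queryText.IndexOf(keyword, StringComparison.OrdinalIgnoreCase);

            if (pos >= 0 && pos < end)
            {
                end = pos;
            }
        }

        return queryText.Substring(0, end)
            .Split(',')
            .Select(t => t.Trim())
            .Where(t => t.Length > 0)
            .ToList();
    }
}
```

For the query in ShowItemInCategory, the first element is RSSFeedBookmark, so only bookmarked items come back, and the remaining three types are the ones joined to it.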

This is the resulting query:

In the previous article on the Semantic Database, I described how a multi-structure query creates a new semantic type at runtime. The reason the above query is so complicated is because it is returning the entire semantic graph for the structures, which looks like this:

The items in green are the semantic structures that are specifically being queried.

Because the semantic graph above includes RSSFeedItem, this sub-element is emitted due to the Feed Item List receptor listening to this protocol. But this receptor is also listening to the BookmarkCategory protocol, so when this protocol is received, we can do something interesting to associate a bookmark note with a URL:

protected void AssociateBookmarkNote(ICarrier carrier)
{
  string url = carrier.ParentCarrier.Signal.RSSFeedUrl.Url.Value;
  string note = carrier.ParentCarrier.Signal.BookmarkNote.Note.Text.Value;
  urlNote[url] = note ?? "";
}

Note (haha) how we're navigating up the semantic graph to the RSSFeedBookmark semantic structure and then acquiring both the note and the feed URL. This is particularly interesting because it illustrates how the Semantic Database returns a signal in exactly the same semantic structure as is defined by the semantic schema. We have therefore completely eliminated the "impedance mismatch" between the database model and the code representation, since both are mirroring exactly the same semantic structure.
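The parent-navigation idea can be sketched with plain classes. In the sketch below, CarrierSketch is a hypothetical stand-in for ICarrier, and the signal is flattened to a dictionary rather than the nested dynamic structure HOPE actually uses, so the keys "RSSFeedUrl" and "BookmarkNote" are simplifications.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical carrier: a signal plus a link to the carrier it was extracted from.
public class CarrierSketch
{
    public string Protocol { get; set; }
    public IDictionary<string, object> Signal { get; set; }
    public CarrierSketch ParentCarrier { get; set; }
}

public static class BookmarkNoteSketch
{
    // Reads the URL and the note from the parent signal and records the
    // association, storing an empty string when there is no note.
    public static void Associate(CarrierSketch carrier, IDictionary<string, string> urlNote)
    {
        IDictionary<string, object> parent = carrier.ParentCarrier.Signal;
        string url = (string)parent["RSSFeedUrl"];

        object note;
        parent.TryGetValue("BookmarkNote", out note);

        urlNote[url] = (note as string) ?? "";
    }
}
```

The receptor only subscribes to the small BookmarkCategory sub-type, yet it can still reach sibling data because each emitted sub-type keeps a reference to the structure that contained it.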

Conclusion

This is definitely not your mainstream development path. Nonetheless, I hope (pun intended) that it spurs the imagination! Thanks for reading or watching the video, or both. HOPE will be back!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)