TryGetLineAndOffsetOf

7 Nov 2016 · CPOL · 2 min read
A method to match a string within a string, but with added features

Background

Last week, the following message was posted in The Lounge:
I despair for our profession.[^]
I considered a gauntlet to have been thrown down, so I picked it up. That's just what I do with a gauntlet.

Introduction

This method meets the specified requirements:

  1. Search a string for some text.
  2. Allow line breaks (LINEFEED characters and optional CARRIAGE-RETURN characters) within the matched value.
  3. Return the (zero-based) line the text was found on.
  4. Return the (zero-based) offset of the text within the line.

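Plus, I added some support for HYPHENs: a HYPHEN in the text being searched that doesn't match the sought text is skipped, so a word that has been hyphenated across a line break can still be found. And I chose to make it more flexible by accepting any sequence of characters (IEnumerable<char>) rather than only strings. The search is case-sensitive, but the calling code can provide values that have been converted to all upper or all lower case.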

Using the Code

I decided to follow the pattern set forth by the various TryParse methods, so the method signature is as follows:

C#
public static bool
TryGetLineAndOffsetOf
(
  System.Collections.Generic.IEnumerable<char> ToSearch
,
  string                                       SearchFor
,
  out int                                      Line
,
  out int                                      Offset
)

Calling the method looks something like this:

C#
int  line   ;
int  offset ;
bool found  = TryGetLineAndOffsetOf ( stringtosearch , stringtoseek , out line , out offset ) ;

The first part of the method sets defaults for the values to return, checks that the provided values are usable (neither is null and the sought text is not empty), retrieves the enumerator for the characters to search, and instantiates a LimitedQueue to hold the characters as they get matched. Using the enumerator directly, rather than foreach, lets the loop condition combine the "found it" test with MoveNext, which yields tighter code.
LimitedQueue, you ask? That's over here: LimitedQueue[^]

C#
{
  bool result = false ;

  Line   = 0 ;
  Offset = 0 ;

  if
  (
    ( ToSearch != null )
  &&
    ( SearchFor != null )
  &&
    ( SearchFor.Length > 0 )
  )
  {
    System.Collections.Generic.IEnumerator<char> tosearch =
      ToSearch.GetEnumerator() ;

    PIEBALD.Type.LimitedQueue<CharPos> charpos =
      new PIEBALD.Type.LimitedQueue<CharPos> ( SearchFor.Length ) ;
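
For readers who don't have LimitedQueue to hand, here is a minimal stand-in that offers only the members this method relies on (Count, an indexer, Enqueue, Dequeue and Peek). The name BoundedQueue and everything inside it are assumptions made for illustration; this is not the PIEBALD class. In particular, discarding the oldest item when the queue is full is a guess at the overflow policy. The method never exercises that case, because the loop stops as soon as the queue holds SearchFor.Length items.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for PIEBALD.Type.LimitedQueue<T>; not the author's class.
public sealed class BoundedQueue<T>
{
    private readonly List<T> items;
    private readonly int limit;

    public BoundedQueue(int limit)
    {
        if (limit < 1) throw new ArgumentOutOfRangeException(nameof(limit));
        this.limit = limit;
        this.items = new List<T>(limit);
    }

    // Number of items currently held.
    public int Count { get { return items.Count; } }

    // Index 0 is the head (oldest item).
    public T this[int index] { get { return items[index]; } }

    // Assumed overflow policy: drop the oldest item to make room.
    public void Enqueue(T item)
    {
        if (items.Count == limit) items.RemoveAt(0);
        items.Add(item);
    }

    // Removes and returns the head item.
    public T Dequeue()
    {
        if (items.Count == 0) throw new InvalidOperationException("Queue is empty.");
        T head = items[0];
        items.RemoveAt(0);
        return head;
    }

    // Returns the head item without removing it.
    public T Peek()
    {
        if (items.Count == 0) throw new InvalidOperationException("Queue is empty.");
        return items[0];
    }
}
```

A List<T> keeps the sketch short; removing from the front is O(n), which hardly matters when the queue is never longer than the sought text.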

A while loop enumerates the provided characters and stops when either the sought text has been found or the characters to search are exhausted.

C#
while
(
  !( result = ( charpos.Count == SearchFor.Length ) )
&&
  ( tosearch.MoveNext() != false )
)

Inside the while loop is a switch to determine what to do with the current character.

  1. If it's a LINEFEED -- Increment the count of lines and reset the Offset.
  2. If it's a CARRIAGE-RETURN -- Just increment the Offset.
  3. If it's a HYPHEN -- If it matches the next character expected in the sought text, treat it as a regular character; otherwise, just increment the Offset.
  4. All other characters get counted and enqueued.

Then the magic happens. Once a character is added to the queue, the queue is validated to ensure that the characters it contains match the start of the sought text. Characters are dequeued from the head until the queue is empty or holds a partial match.

C#
{
  switch ( tosearch.Current )
  {
    case '\n' :
    {
      Line++ ;

      Offset = 0 ;

      break ;
    }

    case '\r' :
    {
      Offset++ ;

      break ;
    }

    case '-' :
    {
      if ( SearchFor [ charpos.Count ] == '-' )
      {
        goto default ;
      }

      Offset++ ;

      break ;
    }

    default :
    {
      int j = charpos.Count ;

      charpos.Enqueue ( new CharPos
      (
        Line
      ,
        Offset++
      ,
        tosearch.Current
      ) ) ;

      while ( j < charpos.Count )
      {
        if ( charpos [ j ].Char.CompareTo ( SearchFor [ j ] ) == 0 )
        {
          j++ ;
        }
        else
        {
          charpos.Dequeue() ;

          j = 0 ;
        }
      }

      break ;
    }
  }
}

(The use of CompareTo, rather than ==, is a hold-over from an earlier experiment with case-insensitivity.)
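
To watch the queue validation on its own, here is a small sketch that pulls the default case out into a helper working on a plain List<char>. The names QueueValidationSketch, Feed and Trace are made up for this illustration, and line and offset tracking is left out so that only the dequeue-and-recheck logic remains.

```csharp
using System.Collections.Generic;

// Illustration only: the validation step from the default case, in isolation.
public static class QueueValidationSketch
{
    // Appends c, then drops characters from the head until what remains
    // is a (possibly empty) prefix of soughtFor.
    // Precondition: queue.Count < soughtFor.Length on entry.
    public static void Feed(List<char> queue, char c, string soughtFor)
    {
        int j = queue.Count;    // items before j were validated on earlier calls
        queue.Add(c);

        while (j < queue.Count)
        {
            if (queue[j] == soughtFor[j])
            {
                j++;
            }
            else
            {
                queue.RemoveAt(0);  // head cannot start a match; shift and recheck
                j = 0;
            }
        }
    }

    // Feeds text one character at a time and returns what is left in the queue.
    public static string Trace(string text, string soughtFor)
    {
        List<char> queue = new List<char>();

        foreach (char c in text)
        {
            if (queue.Count == soughtFor.Length) break;  // full match already held
            Feed(queue, c, soughtFor);
        }

        return new string(queue.ToArray());
    }
}
```

Searching "aaab" for "aab" shows why dequeuing matters: after the third 'a' the queue holds "aaa", which fails at index 2, so the head is dropped and "aa" is rechecked from the start. The final 'b' then completes "aab". Text with no matching characters at all, such as "xyz", leaves the queue empty.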

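Once the while loop exits, if the sought text was found, the Line and Offset are copied from the head item in the queue. Otherwise they are reset to zero, so that a failed search returns the defaults (as the TryParse methods do) rather than wherever the scan happened to stop.

C#
    if ( result )
    {
      Line   = charpos.Peek().Line   ;
      Offset = charpos.Peek().Offset ;
    }
    else
    {
      Line   = 0 ;
      Offset = 0 ;
    }
  }

  return ( result ) ;
}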
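
For anyone who wants to try the technique without pulling in LimitedQueue, the pieces can be condensed into one self-contained sketch. This is an approximation under stated assumptions, not the code from the download: a List<CharPos> stands in for the queue, the class name LineAndOffsetSketch is invented, the enumerator is wrapped in a using block, and both outputs are reset to zero when nothing is found (a choice made in this sketch).

```csharp
using System;
using System.Collections.Generic;

// Condensed sketch of the technique; a List stands in for LimitedQueue.
public static class LineAndOffsetSketch
{
    private struct CharPos
    {
        public readonly int Line;
        public readonly int Offset;
        public readonly char Char;

        public CharPos(int line, int offset, char ch)
        {
            Line = line;
            Offset = offset;
            Char = ch;
        }
    }

    public static bool TryGetLineAndOffsetOf(
        IEnumerable<char> toSearch, string searchFor, out int line, out int offset)
    {
        line = 0;
        offset = 0;

        if (toSearch == null || string.IsNullOrEmpty(searchFor)) return false;

        List<CharPos> queue = new List<CharPos>(searchFor.Length);

        using (IEnumerator<char> chars = toSearch.GetEnumerator())
        {
            while (queue.Count < searchFor.Length && chars.MoveNext())
            {
                char c = chars.Current;

                if (c == '\n') { line++; offset = 0; continue; }

                // CARRIAGE-RETURNs and non-matching HYPHENs only advance the offset.
                if (c == '\r' || (c == '-' && searchFor[queue.Count] != '-'))
                {
                    offset++;
                    continue;
                }

                int j = queue.Count;
                queue.Add(new CharPos(line, offset++, c));

                // Drop head items until the queue is a prefix of searchFor.
                while (j < queue.Count)
                {
                    if (queue[j].Char == searchFor[j]) j++;
                    else { queue.RemoveAt(0); j = 0; }
                }
            }
        }

        bool found = queue.Count == searchFor.Length;

        // Report where the match starts; reset to zero if there was none (sketch's choice).
        line = found ? queue[0].Line : 0;
        offset = found ? queue[0].Offset : 0;

        return found;
    }
}
```

With this sketch, searching "one\r\ntwo three" for "three" returns true with line 1 and offset 4. Searching "say hel-\r\nlo" for "hello" returns true with line 0 and offset 4, because the HYPHEN and the line break inside the match are skipped and the reported position is where the match starts.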

CharPos

The CharPos class is minimal: it just records a character along with the line and offset at which it was read.

C#
private sealed class CharPos
{
  public int  Line   { get ; private set ; }
  public int  Offset { get ; private set ; }
  public char Char   { get ; private set ; }

  public CharPos
  (
    int  Line
  ,
    int  Offset
  ,
    char Char
  )
  {
    this.Line   = Line   ;
    this.Offset = Offset ;
    this.Char   = Char   ;

    return ;
  }
}

History

  • 2016-11-07: First submitted

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)