Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / desktop / Win32

Longest Common Subsequence

4.20/5 (4 votes)
25 Mar 2010CPOL2 min read 1   408  
Generic algorithm of the longest common subsequence

Introduction

The classic longest common subsequence algorithm needs the length of two sequences are known. It is usually used to calculate the common sub-sequence of strings.

Sample

C++
left sequence :"1233434printSection(SectionCommon"
right sequence:" using printSection(SScpeardectionCommon::"
output:"printSection(SectionCommon"
LCS size=27(include '\0')

Background

My algorithm is extremely adaptable. It can be used to calculate common subsequence of strings, or file diff calculation. Theoretically it can be used for any two sequences with similar characteristics of longest common subsequence.

Using the Code

C++
template<typename L_Iterator,typename R_Iterator,typename Container> 
LCS_Size FEtool::LCS_Calculate(L_Iterator lbeg,L_Iterator lend,
    R_Iterator rbeg,R_Iterator rend, Container &out); 
template<typename L_Iterator,typename R_Container,typename Container> 
inline LCS_Size FEtool::LCS_Calculate(L_Iterator lbeg,L_Iterator lend,
    R_Container const&rcontainer, Container &out); 
template<typename L_Container,typename R_Container,typename Container> 
inline LCS_Size FEtool::LCS_Calculate(L_Container const& lcontainer,
    R_Container const&rcontainer, Container &out);

L_Iterator accepts input iterator. R_Iterator accepts random access iterators. L_Container and R_Container by calling begin () and end () to the first LCS_Calculate .

L_ * Heading refer to the left sequence , R_ * Heading refer to the right sequence .
The last parameter Container & out is used to output sequences of the same section. Typical parameters std:: deque <FEtool::SectionCommon> can also FEtool:: EmptyContainer. If FEtool:: EmptyContainer parameters be chosen a specialization of template code Guarantee does not do back calculation.

LCS_Calculate optimization algorithms work according to the incoming parameter. It requires the right sequence is always the random access iterators, the left of the sequence can be the input iterator. Within the array used in the calculation using the scroll array implementation(LCSArray), space size is the right sequence length * 2.

If the last parameter of LCS_Calculate is not EmptyContainer, it will calculate common section of the sequences. If L_Iterator is not, random access iterators will be used within a dynamic growth array to do back calculation. If L_Iterator is a random access iterator Direct Request space (left sequence size * the right sequence size) is used to support the back calculation.

Points of Interest

struct FEtool::LCS_Compare_Trait defines the comparison algorithm by operator ==. You can customize it.

struct FEtool:: SectionCommon defines the same part of the two sequences. For example, SectionCommon:: L_begin is 0, SectionCommon:: R_begin is 10, SectionCommon:: count is 5. This means that the left sequence of 0 start, and the right sequence from 10 start, 5 data are identical.

Related Resources

History

  • 25th March, 2010: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)