Tricky Binary Collision

4 Dec 2020
An overview of a tricky binary collision example when there are multiple definitions of symbols in linked binaries
While analyzing production code(!), I found a very interesting problem. The code consists of a static library and an executable, and both contain a definition of a class with the same name (the method definitions differ, though). The library holds a pointer to an object of that class, but the pointer actually refers to an object created from the executable's definition. As long as the method signatures match in the library's class and the executable's class, nothing looks wrong: the executable's methods are linked in and they are the ones called. The problems begin when someone changes the signature of a method, because a method of one class is then called with this pointing to an object of the other class.

Introduction

Let's define two binaries: a static library "examplelib.a" and a program "example".

examplelib.a:

  • test_class_lib.h
  • test_class_lib.cpp
  • test_class_user.h
  • test_class_user.cpp

example:

  • test_class_prog.h
  • test_class_prog.cpp
  • test_class_main.cpp

So, in the library, we'll have class TestClass defined as below:

test_class_lib.h

#pragma once

class TestClass {
    public:
        int MethodA();
        virtual int MethodV();
        virtual ~TestClass();
    private:
        const int value = 22;
};

test_class_lib.cpp

#include <test_class_lib.h>
#include <iostream>

int TestClass::MethodA()
{
    std::cout << "lib::TestClass::MethodA, value=" << value << std::endl;
    return 5;
}

int TestClass::MethodV()
{
    std::cout << "lib::TestClass::MethodV" << std::endl;
    return 7;
}

TestClass::~TestClass()
{
}

In the same library "examplelib.a", we have another class which uses TestClass above:

test_class_user.h

#pragma once

class TestClass;

class TestClassUser {
    public:
        TestClass * pObj = nullptr;
        void CallMethodA();
        void CallMethodV();
};

test_class_user.cpp

#include <test_class_user.h>
#include <test_class_lib.h>

void TestClassUser::CallMethodA()
{
    if (pObj) {
        pObj->MethodA();
    }
}

void TestClassUser::CallMethodV()
{
    if (pObj) {
        pObj->MethodV();
    }
}

Then we have the program "example", where another class (with the same name and the same method signatures) is defined:

test_class_prog.h

#pragma once

class TestClass {
    public:
        int MethodA();
        virtual int MethodV();
        virtual ~TestClass();
    private:
        const int value = 42;
};

test_class_prog.cpp

#include <test_class_prog.h>
#include <iostream>

int TestClass::MethodA()
{
    std::cout << "prog::TestClass::MethodA" << std::endl;
    return 15;
}

int TestClass::MethodV()
{
    std::cout << "prog::TestClass::MethodV" << std::endl;
    return 17;
}

TestClass::~TestClass()
{
}

For now, the signatures in both classes are the same. You can see that the method implementations are slightly different: they output different strings and return different values.

There is also a main function:

test_class_main.cpp

#include <test_class_prog.h>
#include <test_class_user.h>
#include <memory>

int main()
{
    TestClassUser user;
    auto obj = std::make_unique<TestClass>();
    user.pObj = obj.get();
    user.CallMethodA();
    user.CallMethodV();
    return 0;
}

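Let's build this program. To link it successfully, we need to pass the "--allow-multiple-definition" option to the linker (with GCC, for example, as "-Wl,--allow-multiple-definition"); otherwise the linker may reject the duplicate definitions of the TestClass symbols.

If we look at the symbols in "example", we can see both methods there. With this option the linker keeps the first definition it encounters, so the methods from "example" take precedence as long as the program's own object files are linked before the library: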

000000000041a1c0 T TestClass::MethodV()
000000000041a230 T TestClass::MethodA()

Let's run the program to make sure that the methods from the program (not the library) are called:

$ ./example

prog::TestClass::MethodA

prog::TestClass::MethodV

Good!

Now let's play some tricks.

In the class from "example", let's add an argument to MethodA() (its definition in test_class_prog.cpp changes accordingly):

test_class_prog.h

#pragma once

class TestClass {
    public:
        int MethodA(int arg);
        virtual int MethodV();
        virtual ~TestClass();
    private:
        const int value = 42;
};

After building the program, we'll see that both MethodA methods are present: the one from the library (without arguments) and the one from the program (with an int argument):

000000000041a1c0 T TestClass::MethodV()
000000000041a230 T TestClass::MethodA(int)
000000000041ac54 T TestClass::MethodA()

When the program runs, the library's method is the one that gets called:

$ ./example

lib::TestClass::MethodA, value=42

prog::TestClass::MethodV

Notice that the value of the TestClass::value data member comes from the class definition in "example" (42, not 22). This simply means that this points to the object instantiated in the main function of "example".

So the behavior of calling MethodA() in this case is unpredictable (formally, it is undefined behavior), since the method works with the data members of a totally different class.

Since these are totally different classes, what if the other one has no TestClass::value member at all?

Let's remove it and see:

test_class_prog.h

#pragma once

class TestClass {
    public:
        int MethodA(int arg);
        virtual int MethodV();
        virtual ~TestClass();
    private:
        // const int value = 42;
};

After compiling and executing, we'll see something like this:

$ ./example

lib::TestClass::MethodA, value=27958528

prog::TestClass::MethodV

Tada! This is just whatever happens to be in memory at the offset that TestClass::value has in the library's definition, read from an object that was built from the other TestClass definition.

Virtual methods behave in such a case much like data members, since they are just pointers to functions in the object's virtual method table (VMT).

Which function gets called depends not on the method name (as is the case for non-virtual methods), but on the method's position in the class definition.

For example, let's rename the virtual method in the class from "example":

test_class_prog.h

#pragma once

class TestClass {
    public:
        int MethodA(int arg);
        virtual int MethodOtherV();
        virtual ~TestClass();
    private:
        // const int value = 42;
};

By calling MethodV() from the library, we'll actually call MethodOtherV(), since it is the first method in the VMT in both classes:

$ ./example

lib::TestClass::MethodA, value=27958528

prog::TestClass::MethodOtherV

But if we add a new virtual method above MethodV(), this new method will be called instead:

test_class_prog.h

#pragma once

class TestClass {
    public:
        int MethodA(int arg);
        virtual int NewOtherMethodV();
        virtual int MethodV();
        virtual ~TestClass();
    private:
        // const int value = 42;
};

$ ./example

lib::TestClass::MethodA, value=27958528

prog::TestClass::NewOtherMethodV

Conclusion

This interesting problem came from analyzing production code. It appeared when someone decided to copy-paste a class definition in order to alter its behavior, because making the change inside the library seemed "dangerous" to them (there are other binaries which use this library).

Before copy-pasting source code, think about how many man-days of support for such code you are about to add.

History

  • 4th December, 2020: Initial version

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.