When What You Set is Not What You Get: SetEnvironmentVariable and getenv

13 Oct 2009
Mixing SetEnvironmentVariable and getenv is asking for trouble, as we recently found to our dismay (and exasperation!). It took some serious debugging to figure out the actual problem – let’s see if you can figure it out.

Here’s some C++/CLI code that uses getenv to read the value of an environment variable and print it to the console.

MC++
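#include <cstdlib>
#include <iostream>
#include <string>

using namespace std;
using namespace System;
using namespace System::Runtime::InteropServices;

public ref class EnvironmentVariablePrinter
{
public:
	void PrintEnv(String ^key)
	{
		// Copy the managed string into unmanaged ANSI memory for getenv.
		IntPtr ptr = Marshal::StringToHGlobalAnsi(key);
		const char *ckey = (const char *)ptr.ToPointer();
		// getenv returns NULL when the CRT does not know the variable.
		const char *cval = getenv(ckey);
		string val = cval == NULL ? "" : string(cval);
		cout << "Key is " << ckey << " and value is " << val;

		Marshal::FreeHGlobal(ptr);
	}
};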

PrintEnv merely converts the .NET string into an unmanaged one and calls getenv, printing the returned value to the console.

And here’s some C# code that tests the class.

C#
class Test
{
    const string key = "MyKey";
    public static void Main()
    {
        Environment.SetEnvironmentVariable(key, "MyValue");
        PrintValue();
    }
 
    private static void PrintValue()
    {
        EnvironmentVariablePrinter er = new EnvironmentVariablePrinter();
        er.PrintEnv(key);
    }
}

The code uses System.Environment.SetEnvironmentVariable to set the value of the variable and then calls the C++/CLI code to verify that it prints the correct value. And of course, being written in different languages, the two pieces of code must reside in different projects, say CPlusPlusLib.dll and ConsoleApplication.exe, with the latter referencing the former.

No surprises here - this works as expected and prints “Key is MyKey and value is MyValue”.

However, a seemingly harmless change breaks the code big time.

C#
class Test
{
    const string key = "MyKey";
    public static void Main()
    {
        Environment.SetEnvironmentVariable(key, "MyValue");
        EnvironmentVariablePrinter er = new EnvironmentVariablePrinter();
        er.PrintEnv(key);
    }
}

All I've done is inline PrintValue, yet running this code prints “Key is MyKey and value is”: getenv is now returning NULL instead of “MyValue”.

It gets even more interesting.

C#
class Test
{
    const string key = "MyKey";
    public static void Main()
    {
        Environment.SetEnvironmentVariable(key, "MyValue");
        EnvironmentVariablePrinter er = null;
        PrintValue(); 
    }
 
    private static void PrintValue()
    {
        EnvironmentVariablePrinter er = new EnvironmentVariablePrinter();
        er.PrintEnv(key);
    }
}

This doesn't work either – the mere declaration of EnvironmentVariablePrinter inside Main makes getenv return NULL for “MyKey”. This can't be good, can it?

Actually yes, because that is a valuable clue: it means the change in behavior has something to do with JITting. As long as all code that references EnvironmentVariablePrinter is in a separate method, everything works fine (in debug mode, at least, where the JIT is not expected to inline PrintValue into Main). Things start going south when any such code is in Main itself.

What would the JITter do differently in the two scenarios? Load the assembly containing EnvironmentVariablePrinter (CPlusPlusLib.dll) at different times, of course. When all code referencing EnvironmentVariablePrinter is inside PrintValue, it has to load the DLL only when JITting PrintValue, whereas in the other case, it has to load it when JITting Main. JITting Main obviously occurs well before JITting PrintValue, so DLL load time (relative to other code) is one big difference between the two scenarios, and it is caused entirely by JITting.

Why would loading CPlusPlusLib.dll a little early make getenv return NULL?

To understand that, you'll first have to know how the getenv function works. Windows has APIs to set and get environment variables (SetEnvironmentVariable/GetEnvironmentVariable), and the .NET method P/Invokes into the Windows API to set and get values. getenv, on the other hand, is a CRT function, and does not delegate to the Windows API. Instead, the CRT gets all environment variables and their values when it is starting up (using the GetEnvironmentStrings Windows API), and copies them into its own data structures (MSVCR80!environ). getenv then works on the copied data from then on.

Now do you see the problem? When CPlusPlusLib.dll is loaded early (when JITting Main), the CRT also gets loaded as one of its dependencies, and the startup code that copies environment variables runs right away. At that point, Main hasn't even finished JITting, let alone started executing, so there’s no way our call to System.Environment.SetEnvironmentVariable could have run by that time. And when it actually runs, it’s too late: the CRT environ block was populated much earlier, and calling the Windows API SetEnvironmentVariable has no effect on the cached values. When getenv runs, it looks in the cached values and returns NULL.

It’s easy to see why it works in the first case now: CRT loading occurs when JITting PrintValue, and that occurs after our call to SetEnvironmentVariable has executed. So when the CRT calls GetEnvironmentStrings as part of its startup, it gets the variable (and its value) that we just set.
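The two orderings can be modeled in a few lines of plain C++. This is only a sketch of the behavior described above, not the CRT's actual code: ProcessEnv, CrtModel, SetThenLoad and LoadThenSet are hypothetical names made up for illustration, and a std::map stands in for the environment that Windows owns.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the process environment owned by Windows.
using ProcessEnv = std::map<std::string, std::string>;

// Hypothetical stand-in for one CRT instance: it copies the process
// environment once, at "load" time, and never looks at it again.
class CrtModel
{
public:
    explicit CrtModel(const ProcessEnv& processEnv) : snapshot_(processEnv) {}

    // Mimics getenv: consults only the snapshot. Returns "" for a missing key.
    std::string GetEnv(const std::string& key) const
    {
        assert(!key.empty());
        auto it = snapshot_.find(key);
        return it == snapshot_.end() ? std::string() : it->second;
    }

private:
    ProcessEnv snapshot_;
};

// Working case: the variable is set first, the CRT loads afterwards.
std::string SetThenLoad()
{
    ProcessEnv env;
    env["MyKey"] = "MyValue";  // SetEnvironmentVariable runs
    CrtModel crt(env);         // CRT loads and snapshots the environment
    return crt.GetEnv("MyKey");
}

// Failing case: the CRT loads first, the variable is set afterwards.
std::string LoadThenSet()
{
    ProcessEnv env;
    CrtModel crt(env);         // CRT loads and snapshots an empty environment
    env["MyKey"] = "MyValue";  // too late: the snapshot is not updated
    return crt.GetEnv("MyKey");
}
```

In this model, SetThenLoad() returns "MyValue" and LoadThenSet() returns an empty string, which mirrors the two outputs seen earlier.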

Nasty, ain’t it? The actual scenario was a lot messier: things suddenly stopped working when we linked to a DLL ported to VS2008. We actually figured out the problem backwards. We first saw that the CRT load time was different, theorized how getenv works, verified the theory by stepping through the assembly code and looking at the environ block, and once we understood the problem, figured out what was causing early loading of the CRT. WinDbg was awesome for debugging this; things would have been very difficult if not for sxe ld:MSVCR90 and x MSVCR80!environ.

The fix was rather simple: in our case, we merely had to move the code that sets the environment variable so that it runs before the CRT loads. There’s another twist though; mscorwks.dll, which is the heart of the CLR, loads MSVCR80 when it loads, and you can't set your environment variables before that, not from managed code anyway. Fortunately, in our case, the getenv call is from a library that links to MSVCR90, so as long as we set the environment variable before that version of the CRT loads, we're good to go. Until the CLR gets linked to MSVCR90, anyway :).
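When reordering is not possible, another option is to have the native side skip the CRT cache and ask Windows directly. This is not what we did; it is a minimal sketch that assumes you control the code calling getenv. ReadEnvLive is a hypothetical helper name. On Windows it calls the GetEnvironmentVariableA API, and on other platforms it simply falls back to std::getenv so that the sketch still compiles there.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: reads a variable from the live process environment.
// Returns "" when the variable is missing (or empty).
std::string ReadEnvLive(const std::string& key)
{
    assert(!key.empty());
#ifdef _WIN32
    // First call: ask for the required buffer size, including the terminator.
    DWORD needed = GetEnvironmentVariableA(key.c_str(), nullptr, 0);
    if (needed == 0)
        return std::string();

    std::string buffer(needed, '\0');
    // Second call: on success, returns the length without the terminator.
    DWORD written = GetEnvironmentVariableA(key.c_str(), &buffer[0], needed);
    if (written == 0 || written >= needed)
        return std::string();  // variable vanished or grew between the calls

    buffer.resize(written);
    return buffer;
#else
    // Non-Windows fallback, only so the sketch builds elsewhere.
    const char* value = std::getenv(key.c_str());
    return value ? std::string(value) : std::string();
#endif
}
```

In PrintEnv, this would replace the getenv call, for example `string val = ReadEnvLive(ckey);`. The trade-off is that it only helps for code you can change; a third-party library that calls getenv internally still sees the cached values.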

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)