Introduction
Production systems often need to be migrated from one server to another. There are many possible reasons, for example:
- Changing data center
- Upgrading the operating system, for example from Windows Server 2003 to Windows Server 2008/2012
- Upgrading hardware, for example from a 32-bit processor to a 64-bit multi-core processor
- Changing the company's geographical location
- and many more.
When a person or team starts migrating production servers, they can run into many technical challenges, related to either hardware or software. In this article I explain the software-related challenges we faced and how we resolved them.
Background
A few months ago we migrated our production servers from one data center to another, and we faced many software-related challenges along the way. Before explaining them, let me describe our old and new production server environments.
| | Old Server | New Server |
|---|---|---|
| Operating System (OS) | Windows Server 2003 | Windows Server 2008 |
| Processor | 32 bit | 64 bit |
| Database Server | SQL Server 2005 | SQL Server 2008 |
| Reporting Server | SSRS 2005 | SSRS 2008 |
| Internet Information Services (IIS) | Version 6 | Version 7.5 (Integrated Mode) |
Before we started, we thought the migration would be a simple, straightforward task. During execution we realized it was not as easy as we had expected: we faced many challenges and had a really hard time during that period. So I decided to share our experience, in the hope that it will help others who do a similar job in the future.
Migration Challenging Items
I will go through them one by one.
Showing Maintenance Alert:
A migration needs planning and preparation before it starts. We had to back up the old servers' databases, copy the deployed code files (binaries and resources), configuration files and other items, zip them, and move them to the new server. This takes roughly 5 to 6 hours. During that time we do not want clients to access our sites/applications and change anything, so we shut the production sites down. But we must show a proper message so that clients understand that server maintenance is in progress and that they need to wait until it finishes. We created a static HTML file for this; suppose its name is MaintenanceAlert.htm.
The question is when and how to use this file. The answer: whenever a user requests our site or any bookmarked URL, the request is redirected to that static page.
How do we redirect? One solution is to redirect from the master page's code-behind file.
Response.Redirect("MaintenanceAlert.htm", false);
The problem is that we have more than one master page. We would need to change all of them, deploy them to the production servers, and roll back to the old master pages afterwards. The process would be:
Change > Deploy > Roll back > Deploy
This is not a good solution because it requires code changes. We were looking for a solution without code changes.
Can we do anything from our web.config file? If so, it would be very easy for us. After some searching we found the IIS HTTP redirection setting, httpRedirect, which goes in the <system.webServer> section.
<httpRedirect enabled="true" destination="MaintenanceAlert.htm"
exactDestination="true" httpResponseStatus="Permanent" />
This httpRedirect entry in web.config redirects all requests to the MaintenanceAlert.htm page.
That still does not completely solve our problem. We have a Hello.htm page in our site's root directory which is used by a load balancer component. Requests for this page must not be redirected; if they are, the load balancer stops working.
How do we achieve that? Another web.config entry solves this problem.
<location path="Hello.htm">
<system.webServer>
<httpRedirect enabled="false" />
</system.webServer>
</location>
This entry overrides the default httpRedirect behaviour, so requests for the Hello.htm page are not redirected.
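Putting the two entries together, a maintenance-time web.config might look like the sketch below. It assumes the maintenance page is MaintenanceAlert.htm in the site root. Two details are suggestions rather than what we deployed: it uses httpResponseStatus="Temporary", because browsers may cache a permanent (301) redirect and keep sending users to the maintenance page after the site is back, and it excludes the maintenance page itself from redirection so that it cannot redirect to itself.

```xml
<configuration>
  <system.webServer>
    <!-- Send every request to the maintenance page; Temporary = HTTP 307 -->
    <httpRedirect enabled="true" destination="/MaintenanceAlert.htm"
                  exactDestination="true" httpResponseStatus="Temporary" />
  </system.webServer>

  <!-- Load balancer health-check page: must not be redirected -->
  <location path="Hello.htm">
    <system.webServer>
      <httpRedirect enabled="false" />
    </system.webServer>
  </location>

  <!-- The maintenance page itself (assumed exclusion, to avoid a redirect loop) -->
  <location path="MaintenanceAlert.htm">
    <system.webServer>
      <httpRedirect enabled="false" />
    </system.webServer>
  </location>
</configuration>
```

When maintenance is over, removing these entries (or setting enabled="false") restores normal behaviour without touching any code.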
Configuration Files:
Every web site/application has a web.config file. This configuration file contains various configuration data, such as:
- database connection string
- mail settings
- http handlers/modules registration
- AppSettings key-values
- 3rd party component registration etc.
In real life this web.config file becomes very large, and its size grows day by day. To keep web.config small and simple, we moved the connection strings and appSettings values into two separate files. The config settings look like this:
<connectionStrings configSource="connection.config"/>
<appSettings configSource="AppSettings.config"/>
At run time, the framework looks for connection.config and AppSettings.config in the same directory as Web.config and merges them in.
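For reference, here is a rough sketch of what the two external files could contain. Each file holds only the section it replaces, with that section as its root element (note the plural connectionStrings), and the element in web.config that carries configSource must have no other attributes or child elements. The connection string name MainDb, the server name DBSERVER and the database name AppDb are placeholders, not our real values.

```xml
<!-- connection.config (separate file, same folder as Web.config) -->
<connectionStrings>
  <add name="MainDb"
       connectionString="Data Source=DBSERVER;Initial Catalog=AppDb;Integrated Security=True"
       providerName="System.Data.SqlClient" />
</connectionStrings>

<!-- AppSettings.config (separate file, same folder as Web.config) -->
<appSettings>
  <add key="smtpHost" value="192.168.0.1" />
</appSettings>
```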
That does not solve every problem, though. The connection.config file contains the database connection; usually each site has a single database, so connection.config always stays small. AppSettings.config grows larger, because developers keep adding application keys to it. After the migration AppSettings.config had to change, because some old values needed to be replaced with new ones. During the test run we found that some functionality did not work as expected. After investigating, we found that some values appeared multiple times under different keys. For example, in one place we found:
<add key="smtpHost" value="192.168.0.1" />
and in another place:
<add key="mailServer" value="192.168.0.1" />
When we switched to the new mail server IP we changed only the smtpHost key; the mailServer key remained unchanged, so the components that used the mailServer key could not send email. We had to find all duplicated values and replace them with the new values. Because the AppSettings file was so large, and the keys had no proper comments, this was very difficult.
Another scenario: our system has 10-12 .NET executables that are run by Windows Task Scheduler. Each executable has its own configuration file, so 10 executables have 10 config files, all containing the same database connection strings, and the new value has to be put into every one of those files. During the testing period our QA engineers reported that some schedulers were not working properly because of configuration errors.
How did we solve this? After investigating, we found that we had missed updating some config files. We corrected those files, and to avoid the same problem in future we created a ConnectionString.config file in the root of the directory as a base config file. All the other config files refer to that ConnectionString.config file, so they reuse the base config file.
<connectionStrings configSource="ConnectionString.config" />
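As a sketch, the config file of one scheduler executable could then shrink to the fragment below (the element name is connectionStrings, plural). One constraint to keep in mind: configSource must be a relative path to a file in the same directory as the referencing config file or in a subdirectory of it, so this layout assumes the executables are deployed together in the folder that holds ConnectionString.config. The file name Publish.exe.config is only an illustration.

```xml
<!-- Publish.exe.config: config file of one scheduler executable (illustrative name) -->
<configuration>
  <!-- Every executable in this folder points at the same shared file -->
  <connectionStrings configSource="ConnectionString.config" />
</configuration>
```

With this arrangement a changed database server means editing one file instead of ten.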
File System Security:
During the test run we found that our sites' log data was not being written to the log file. We did not identify the problem quickly. After investigating, we realized that we needed to grant the appropriate permission on the log directory.
Which user needs that permission?
There are two solutions:
- Grant permission to the user account under which the application pool runs (in our case, Network Service).
- Use identity impersonation, either from the web.config file or by writing impersonation code.
<identity impersonate="true"
userName="domain\username"
password="password"/>
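One thing to watch on the new server: under the IIS 7 integrated pipeline, an <identity impersonate="true"> entry in <system.web> is normally rejected with an HTTP 500.24 error. If you choose impersonation from web.config, a commonly used workaround is to switch off that validation, as in the sketch below (the alternative is to run the application pool in classic mode). Whether you need this depends on your pipeline mode, so treat it as something to verify on your own server.

```xml
<configuration>
  <system.web>
    <identity impersonate="true" userName="domain\username" password="password" />
  </system.web>
  <system.webServer>
    <!-- Stops IIS 7 integrated mode from rejecting the <identity> setting above -->
    <validation validateIntegratedModeConfiguration="false" />
  </system.webServer>
</configuration>
```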
For our schedulers we created a separate user account with appropriate permissions for running the scheduler executables. During development I see that many developers, when they face a permission-related problem, grant Everyone full control on their PC to make it go away. That merely bypasses the security problem on the developer's PC. If we do the same thing on a production server, we open a security hole in the system, which can lead to dangerous security problems. I found a useful link on security best practices:
http://technet.microsoft.com/en-us/library/cc779601(v=ws.10).aspx
Http Handler Delete Verb Missing:
We use a generic HTTP handler (SimpleHandlerFactory-Integrated) to handle Ajax requests that delete resources from the server. By default this handler mapping does not include the DELETE HTTP verb, so we cannot delete any resource/file through it.
To delete, we just need to allow the DELETE HTTP verb for this handler. We had done this on our old production server, but after such a long gap we had forgotten about it. Our QA engineer reported that he could not delete any resource from the server. After investigating, we found that we had missed adding that HTTP verb to the handler. After adding the verb, the problem was solved.
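The verb can be added in IIS Manager (Handler Mappings, then Request Restrictions, then Verbs) or in the site's web.config. The fragment below is a sketch of the web.config route: it removes the inherited mapping and re-adds it with DELETE appended. The verb list GET,HEAD,POST,DEBUG is what a default IIS 7 installation typically ships with, so check your own applicationHost.config before copying it; on a .NET 4 application pool the mapping is named SimpleHandlerFactory-Integrated-4.0 instead.

```xml
<system.webServer>
  <handlers>
    <!-- Replace the inherited registration so that DELETE is accepted as well -->
    <remove name="SimpleHandlerFactory-Integrated" />
    <add name="SimpleHandlerFactory-Integrated" path="*.ashx"
         verb="GET,HEAD,POST,DEBUG,DELETE"
         type="System.Web.UI.SimpleHandlerFactory"
         resourceType="Unspecified" requireAccess="Script"
         preCondition="integratedMode" />
  </handlers>
</system.webServer>
```

Keeping the change in web.config has the advantage that it travels with the site on the next migration.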
Unified Login:
Unified Login means authenticating on a single site and using that authentication to access multiple sites. It is also called Single Sign-On. It is implemented by passing the authentication ticket in a cookie.
We developed a separate site for handling user authentication, and several of our sites depend on it. If the user is authenticated, it returns an authentication cookie for the requested sites. Multiple sites share the same cookie and allow users access based on it. In short, a user logs in once on a single site and can then access multiple sites without authenticating again.
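As a rough sketch only, and assuming ASP.NET Forms Authentication is what issues the ticket, cookie sharing is commonly configured along the following lines. Each participating site uses the same cookie name, the same parent domain and the same explicit machineKey, so that a ticket issued by the login site can be decrypted by the others. The cookie name, domain, login URL and key values here are all placeholders.

```xml
<system.web>
  <authentication mode="Forms">
    <!-- Same cookie name and parent domain on every participating site -->
    <forms name=".UNIFIEDAUTH" domain=".example.com"
           loginUrl="https://login.example.com/Login.aspx"
           path="/" protection="All" enableCrossAppRedirects="true" />
  </authentication>
  <!-- Identical explicit keys on every site; auto-generated keys differ per server -->
  <machineKey validationKey="[same validation key on all sites]"
              decryptionKey="[same decryption key on all sites]"
              validation="SHA1" decryption="AES" />
</system.web>
```

These are exactly the kind of settings that must be carried over unchanged during a migration, because a key mismatch between sites invalidates every shared ticket.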
During the test run we could not log in to any of the sites. The only thing we saw was a custom error message: "user is not valid. Please try again". Initially we guessed that the database connection string was wrong, but after checking we found the configuration was fine, so we still had not identified the problem. I set up the unified login environment on my local development PC and found that it worked as expected. I reviewed the code but did not find any problem.
For authentication, the login site uses a WCF security service. We ran it independently of the production site with the error mode set to "off" so that we could see the actual error. After analyzing the trace data, we learned that the site was throwing a directory-not-found exception, but we could not find any such directory on the new server. I turned to the SVN repository for help: reviewing a previous version, I found one place where the code reads an XML file from a specific directory and adds it to a file cache dependency object. We looked for that folder on our old production server and found that it existed there. We created the directory on the new server, and that fixed it.
Identifying the problem took longer because our code replaced the exception message "Directory Not Exist" with the message "User not Valid". That confused us and cost extra time. We changed the exception handling code to send the actual exception message to the client.
Note that sending the actual exception message to the client is not always recommended, because it can create a security hole. In that case you can write the actual exception message to the log file and send a general message to the client, such as "An error occurred. Please contact the administrator". But you should never send a wrong message; it misleads the user as well as the technical person.
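For unhandled errors on the ASP.NET side, a similar split can be had from configuration alone with the customErrors setting. In the sketch below, RemoteOnly shows full error details only to requests made on the server itself, while remote users are sent to a generic page. The page name Error.htm is a placeholder, and writing the real exception to the log file still has to be done in code.

```xml
<system.web>
  <!-- Remote users see Error.htm; requests from the server itself see full details -->
  <customErrors mode="RemoteOnly" defaultRedirect="Error.htm" />
</system.web>
```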
32 Bit VS 64 Bit:
As mentioned earlier, our old server ran the 32-bit Windows Server 2003 operating system and the new server runs 64-bit Windows Server 2008. We simply copied our code base from the old server to the new one, and the site worked as expected. We have 10-12 executables that run under Windows Task Scheduler, and all of them worked fine. But our QA engineer reported that some schedulers, which can also be run on demand from one of the site's aspx pages through an Ajax request, were not working. During investigation we found that they ran when started by double-click, but did not run from the page's Ajax request and did not even throw an exception. We reviewed the code and saw that the page's web method starts the exe like this:
Process.Start("Publish.exe");
We searched the event log, error log, site log and the internet for any clue as to why this was happening, but found nothing useful. After spending a long time on it, we realized that our web site's application pool was running as 64-bit, while our staging server was 32-bit. We use CruiseControl to build our executables from the SVN code repository, and it was producing 32-bit assemblies. We had two possible solutions:
- Switch our application pool from 64-bit to 32-bit.
- Rebuild the exe files from source on a 64-bit machine.
We rebuilt our scheduler files on our 64-bit development PC and redeployed them to the new production server.
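For completeness, the first option (which we did not take) corresponds to the "Enable 32-Bit Applications" setting in the application pool's Advanced Settings in IIS Manager. In applicationHost.config it appears roughly as in the sketch below; the pool name MySitePool is a placeholder.

```xml
<!-- %windir%\System32\inetsrv\config\applicationHost.config -->
<system.applicationHost>
  <applicationPools>
    <!-- "MySitePool" is a placeholder pool name -->
    <add name="MySitePool" enable32BitAppOnWin64="true" />
  </applicationPools>
</system.applicationHost>
```

This switches every site in that pool to 32-bit worker processes, which is one reason rebuilding the executables can be the less invasive choice.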
AjaxPro Registration:
AjaxPro is a .NET-based open source component for Ajax communication. If you want to know more about the component, see the AjaxPro .NET site.
We use AjaxPro and the jQuery library side by side for Ajax communication. On our previous server both worked fine, but on the new production server AjaxPro did not work while jQuery Ajax did. Two articles we found online helped us solve the problem.
On the previous IIS 6.0 site, we registered the component in the IIS 6 section:
<httpHandlers>
<add verb="POST,GET" path="ajaxpro/*.ashx"
type="AjaxPro.AjaxHandlerFactory, AjaxPro.2"/>
</httpHandlers>
In IIS 7 (integrated mode) this entry does not work. It is replaced by a new entry:
<system.webServer>
<handlers>
<add name="AjaxPro" verb="*" path="ajaxpro/*.ashx"
type="AjaxPro.AjaxHandlerFactory, AjaxPro.2" resourceType="Unspecified" />
</handlers>
</system.webServer>
TransactionScope Not Working:
We use the System.Transactions.TransactionScope class to execute and manage our transactions.
During the testing phase an error message was shown and the site crashed.
We started investigating. We found that TransactionScope relies on the MSDTC service (Microsoft Distributed Transaction Coordinator) for distributed transactions, and its security configuration had not been set up properly. For more detail on TransactionScope, see the article All About TransactionScope.
Our web server and database server are two separate physical servers.
We needed to enable distributed transactions through MSDTC on both servers. After doing so, everything worked fine.
HTTP VS HTTPS:
On our development and staging servers we use the http protocol, but on the production server we use https for secure communication. During the test run we saw a strange error in the browsers: both Chrome and Internet Explorer displayed an insecure-content warning, each in its own form.
We started investigating and found two useful links that helped us solve the problem. We realized that the alert was a consequence of something we had done incorrectly: we were using the http CDN path for jQuery, http://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js, inside our https site, so the browsers raised the insecure-content alert.
We simply changed the CDN path from http to https: https://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js
localhost VS 127.0.0.1:
After the migration and configuration of our new mail server were complete, we found that the system did not send any email. In the configuration file, the smtpHost key had localhost as its value. While investigating, we found two very useful links:
- localhost or 127.0.0.1
- StackOverflow thread
We learned that in some scenarios localhost and 127.0.0.1 are not the same, especially when the network uses both Internet Protocol version 4 and version 6: localhost can then resolve to the IPv6 loopback address ::1, while 127.0.0.1 is always IPv4. So we replaced the configuration value localhost with 127.0.0.1, which resolved the problem, and emails are now delivered properly.
SSRS Report Files Not Uploading:
After downloading all the rdl files from the old server and uploading them to the new server, I found that 2 files failed to upload with an exception. The exception message was "Private Assembly custom.dll not found to ssrs report private location". After investigating, we found that those 2 rdl files use the private assembly custom.dll, which SSRS looks for in its private assembly folder. The location is:
%ProgramFiles%\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies
After deploying the dll to that location, the rdl files uploaded properly and our problem was solved.
CLR Functions Not Working:
CLR integration, sometimes called SQL CLR, is a technology for hosting .NET code inside SQL Server. It is useful in many scenarios. Inside the database, data access and manipulation are set-based. In most scenarios that works well, but in some it is costly. For example, if you have a large amount of data and need to do string manipulation, data conversion and similar work, a CLR function/procedure can give a performance benefit. There are many articles online on the subject if you are interested in studying it further.
We call CLR functions from SQL Server stored procedures, and we also use some third-party SQL CLR functions. While testing the new server, an exception was raised: "SQL CLR not enabled to your database". While investigating the issue I found a helpful link, CLR Enabled Process.
After we enabled the SQL Server clr enabled option (a server-level setting changed with sp_configure) on the production database server, the problem was resolved.
SSRS Report Folder Permission:
After deploying all reports to our new production server, the reports still did not show. While investigating the issue we found a useful link, SSRS Report Folder Permission.
We did the following:
- Created a new Windows user account
- Opened Folder Settings for the report folder
- Assigned a new role on the report-specific folder
SSRS Report Icons Not Showing:
Every SSRS report has default navigation icons:
- First > Navigate to the first page
- Previous > Navigate to the previous page from the current page
- Next > Navigate to the next page from the current page
- Last > Navigate to the last page
During the test run we found that these icons were not showing. We were confused, because these images are the SSRS report's default icons and are loaded automatically, and we did not know how to fix it. Two articles we found online helped.
We learned that it was an IIS-related issue. The HTTP handler registration section in web.config differs between IIS 6 and IIS 7: in IIS 6 it is the <httpHandlers> section, and in IIS 7 and later it is the <handlers> section. So we registered our handler in the new <handlers> section.
<handlers>
<add name ="ReportViewerWebControlHandler" preCondition="integratedMode"
verb="*" path="Reserved.ReportViewerControl.axd"
type="Microsoft.Reporting.WebForms.HttpHandler,
Microsoft.ReportViewer.WebForms, Version=8.0.0.0,
Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />
</handlers>
After registering the handler in the IIS 7 section, the default icons showed up and our problem was solved.
DNS Flushing:
When we started testing our newly migrated sites, the old server was still live, and both servers had the same DNS name. So how could we test the new sites independently? One solution is to add an entry to the hosts file on the test machine that maps the site's DNS name to the new server's IP address.
But the hosts entry alone does not solve the problem. We need to follow these steps:
- Close all browsers
- Run the DNS flush command (ipconfig /flushdns) from the command prompt
- Open a browser
- Request the sites with the proper URL
- Open an HTTP debugging tool (Firebug etc.)
- Check and confirm that the remote IP shown is the new server's
Conclusion
It is very difficult to cover every scenario faced during a production server migration. I have tried my best to cover the most important ones we faced. I hope that this article will help anyone who does a similar job in the near future and save them valuable time.