Abstract
The S.O.L.I.D programming principles are not new; unfortunately, I was not aware of them until I read a few articles by Mr. Shivprasad Koirala. In this article, we will work through these principles one by one.
Myths Related to S.O.L.I.D
I have come across a few myths about SOLID that confuse discussions of architecture, design patterns, and so on. Some of these myths concern how a program is structured, and others concern design patterns. Let's discuss a few of them.
I Know OOP Then Why Should I Learn S.O.L.I.D
Equating S.O.L.I.D with Object Oriented Programming is the biggest mistake of all, and I used to be one of the people who believed this myth. OOP is a programming paradigm built around the concept of objects, while S.O.L.I.D is a set of programming principles that tell us how to write a good program.
I Know Design Patterns Then Why Should I Learn S.O.L.I.D
This is a mistake as well. Design patterns tell us how to design our programs/software, whereas S.O.L.I.D is a set of principles for keeping our code clean.
SOLID Principles are Only Applicable to .NET/C# and Not for Java
Another big misunderstanding is the myth that these principles do not apply to Java. Ah! What a myth :). These principles are not tied to any programming language; in other words, they were not created for any specific language. They are guidelines for making our code robust, and it does not matter which language the program is written in.
As an Architect, Do I Really Need to Care About S.O.L.I.D?
It is simply unrealistic to say that these principles are not for architects. As an architect, you should think about robust design, scalability, and the distribution of components, and while designing any software/application you should keep all these principles in mind. I agree that they sometimes overlap, but following them is what makes an application design robust. These principles apply to everyone involved in the technical side of the software:
- A developer should write code that obeys these principles
- A reviewer should check that the code under review obeys these principles
- A Tech Lead should guide the team so that the team obeys these principles
The list may be much longer, depending upon the nature and size of the software/application.
I Am Working in Maintenance Project, Why Do I Care About S.O.L.I.D?
This is an interesting one. How these principles can be applied varies from project to project, but we can't say that a maintenance project is free to neglect them. Let me illustrate with a real scenario. A long time back, I worked on a maintenance project, and by the time I joined the team the project was almost done. I was given an assignment and noticed a lot of repeated code throughout the application. I had to complete that assignment in 16 hours (2 days), including QA's effort.
I approached my team manager, but unfortunately I could not convince him. His words were: "We are almost done with this project, I do not care about the SOLID principles, and we have received a green flag from our users/clients. If you can't complete this task within the time limit, then I have to assign this task to someone else, the choice is yours :(". It wasn't my day. Can you imagine what I did? Most of us would simply complete the task within the stipulated time; we are developers, and we don't like to leave tasks unfinished. Of course, I did not give up the task. I completed it ahead of time, but in my own style: I created a new class, added the functionality my task required, and wrote it the S.O.L.I.D way. I also attached similar pieces of code to this class so that other areas of the code could benefit from S.O.L.I.D. When I sent my changes for review, I included this note:
"I noticed that a lot of code is repeated, and some places are messy and need cleaning up. I implemented my changes following the S.O.L.I.D principles."
Can you imagine what the code review reply was? Ah! I got a reply from a reviewer, who was much senior to me, with this note: "Gaurav, I have been reviewing this project for the last 17 months and did not find any discrepancies. However, I noticed a few things in your code, as mentioned below. I request you to make the appropriate changes."
I made all the requested changes, which were purely cosmetic and had nothing to do with S.O.L.I.D or my way of coding. My changes were approved after QA and deployed with the release. Now, the rest of the story: the client called for a code review. There was a meeting, which I was not part of; only a few senior members of our team attended. Suddenly, I got a phone call from my manager with the words "Come to the conference hall, the client is calling". It was a video conference call, and my manager introduced me to the client. I was shocked when the client asked about my work experience and why I had written the code this way. Crap, man! I was out of my comfort zone. Then the client clapped and said 'great, you've done it in a good manner'. ... My reason for sharing this scenario is simple: what if I had thought, "I am a developer on a maintenance project that is about to have its final release, so why should I bother about all these principles?" As I said, how you implement these principles varies from project to project, but you should never forget them.
I am a Quality Assurance Person and Working with Selenium, Why Do I Care About S.O.L.I.D?
Recently, I attended a seminar at an engineering college (for their Science Fest), where one of the QA developers asked me this surprising question. In my view, a QA developer should care about SOLID. Take Selenium as a specific scenario: the QA developer has to write code to automate his/her QA process, and that code deserves the same care as any other code. I will discuss this in detail in a later write-up, with a few examples that support this view.
Note: There might be more myths. If you have any in mind, please add them here and we will definitely discuss them.
Introduction
Object Oriented Programming (OOP) gives us a way to write our programs and shape them well. Its concepts are the basic points that guide us in writing a good program. For great Object Oriented Design (OOD), we need to think beyond them. The S.O.L.I.D principles give us a way to produce great designs. Yes, there are design patterns that guide us in the same direction, but SOLID comes well before these patterns. Let's discuss and understand all of them using C# code snippets as examples.
Defining S.O.L.I.D
This is an acronym introduced by Michael Feathers as:
- S for SRP: Single responsibility principle
- O for OCP: Open/closed principle
- L for LSP: Liskov substitution principle
- I for ISP: Interface segregation principle
- D for DIP: Dependency inversion principle
Learning Single Responsibility Principle (SRP)
The name of this principle describes itself: something should have a single responsibility. What should have a single responsibility? Here, we are learning principles for designing the best classes/programs/systems, so in that context I would say "a class should have a single responsibility".
Let's dive into the ocean. Can we read this as "a class should not be designed to do multiple activities"? If so, what kind of activities? Suppose I have to design a system that provides employee details. Its activities would generally be the CRUD (Create, Read, Update, Delete) operations.
Now, as per the Single Responsibility Principle, I should design each class to handle one of these operations, not all of them. I remember my old days, when I was learning C++. I used to write thousands of lines of code in a single program, full of if..else statements. At that time, I was happy just to see these programs run. Now, I do not like a method that contains more than 4-5 lines. How the world has changed!
When I was learning this principle, a question was on my mind: why should a class not have multiple responsibilities? The answer I found is that more responsibilities tie a class to more changes in the future. Consider a class that is responsible for modifying data, retrieving data, and then saving data (I will explain using my old code in the coming parts of this S.O.L.I.D series). If a business requirement later changes our modification or retrieval logic, we have to change the class many times and in many places, which means more code changes and more bugs.
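The employee example above can be sketched roughly as follows. This is only one possible split, written under assumed names: `Employee`, `EmployeeReader`, and `EmployeeWriter` are hypothetical types that do not appear in the article's source code, and an in-memory dictionary stands in for real storage. The point is that read logic and write logic each live in their own class, so each class has one reason to change.

```csharp
using System.Collections.Generic;

// Hypothetical types for illustration only; not part of the article's source code.
public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Responsible only for reading employees.
public class EmployeeReader
{
    private readonly IDictionary<int, Employee> _store;

    public EmployeeReader(IDictionary<int, Employee> store)
    {
        _store = store;
    }

    // Returns the employee with the given id, or null when there is none.
    public Employee Find(int id)
    {
        Employee employee;
        return _store.TryGetValue(id, out employee) ? employee : null;
    }
}

// Responsible only for creating, updating and deleting employees.
public class EmployeeWriter
{
    private readonly IDictionary<int, Employee> _store;

    public EmployeeWriter(IDictionary<int, Employee> store)
    {
        _store = store;
    }

    // Creates the employee, or replaces an existing one with the same id.
    public void Save(Employee employee)
    {
        _store[employee.Id] = employee;
    }

    // Removes the employee; returns false when the id was not present.
    public bool Delete(int id)
    {
        return _store.Remove(id);
    }
}
```

If the lookup rules change later, only `EmployeeReader` needs to be touched; code that saves or deletes employees is unaffected.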
Now, take a look at the following snippet:
public class DataMigrater
{
    // Fetches the data for the given date range.
    public IList<ServerData> GetData(DateTime initialDate, DateTime endDate)
    {
        return new List<ServerData>();
    }
    // Processes the raw data before migration.
    public IList<ServerData> ProcessData(IEnumerable<ServerData> rawData)
    {
        return rawData.ToList();
    }
    // Migrates the data.
    public void Migrate(IEnumerable<ServerData> rawData)
    {
    }
}
class Program
{
    private static readonly DateTime StartDate = new DateTime(2014, 07, 01);
    private static readonly DateTime EndDate = DateTime.Now;
    static void Main(string[] args)
    {
        var dataMigrater = new DataMigrater();
        var rawData = dataMigrater.GetData(StartDate, EndDate);
        var processedData = dataMigrater.ProcessData(rawData);
        dataMigrater.Migrate(processedData);
    }
}
Our class, DataMigrater, has too many responsibilities. This class:
- fetches data
- processes data
- then, migrates data
So, as per SRP, our class is doing it wrong; here, we are not following S.O.L.I.D. What should our class do? Let's start from the name of the class, DataMigrater. To me, the name suggests that the class should be responsible only for migrating data, so it should not be concerned with what data arrives for migration or how it gets there. There is one more reason why a class should be responsible for one thing. Think of a scenario where the method:
public void Migrate(IEnumerable<ServerData> rawData){ }
throws an exception and our data does not get migrated. Now we need to check all three methods, because we cannot be sure whether the data was fetched or processed correctly. This puts a heavy burden on us. Do you want to bear this load? I am sure no developer does.
Now, the big question is 'how do we do that?' Take a look at the following snippet:
public class DataMigrater
{
    private readonly IList<ServerData> _data;
    private readonly IServerDataRepository _repository;
    private readonly ILog _logger = LogManager.GetLogger(typeof(DataMigrater));
    public DataMigrater(IList<ServerData> data, IServerDataRepository repository)
    {
        _data = data;
        _repository = repository;
    }
    public void Migrate()
    {
        try
        {
            // Time the whole run, so the summary is logged once rather than per record.
            var stopWatch = Stopwatch.StartNew();
            foreach (var data in _data)
            {
                Migrate(data);
            }
            stopWatch.Stop();
            _logger.InfoFormat("Total data {0} migrated in {1}", _data.Count, stopWatch.Elapsed);
        }
        catch (Exception ex)
        {
            _logger.Error(ex);
            throw;
        }
    }
    private void Migrate(ServerData data)
    {
        try
        {
            _logger.InfoFormat("Migrating data with Id:{0}", data.Id);
            // The migration of a single record goes here.
        }
        catch (Exception)
        {
            _logger.ErrorFormat("An exception occurred attempting to migrate data with Id:{0}", data.Id);
            throw;
        }
    }
}
Now, our class has only one public method, Migrate(). You will notice that a lot of other changes have been made in this class; we will discuss them one by one in the coming articles.
As of now, let's concentrate on the Single Responsibility Principle. In the above, our class is now concerned only with migrating server data. It does not care whether the supplied data is processed or raw; other classes are responsible for those things now. See below:
public class ServerProcessedOrRawDataQuery
{
    internal IServerDataRepository Repository { get; set; }
    public ServerProcessedOrRawDataQuery(IServerDataRepository repository)
    {
        Repository = repository;
    }
    public IQueryable<ServerData> Query(DateTime startDate, DateTime endDate)
    {
        return new ServerDataQuery(Repository).ProcessedData()
            .Where(d => d.InitialDate <= endDate && d.EndDate >= startDate);
    }
}
public class ServerDataQuery
{
    internal IServerDataRepository Repository { get; set; }
    public ServerDataQuery(IServerDataRepository repository)
    {
        Repository = repository;
    }
    public IQueryable<ServerData> Query()
    {
        return Repository.Get().AsQueryable<ServerData>();
    }
    public IQueryable<ServerData> ProcessedData()
    {
        return Query().Where(d => d.IsDirty == false);
    }
}
Now, our program changes to:
class Program
{
    private static readonly DateTime StartDate = new DateTime(2014, 07, 01);
    private static readonly DateTime EndDate = DateTime.Now;
    static void Main(string[] args)
    {
        var repository = new ServerDataRepository();
        var processedData = GetProcessedData(repository);
        var dataMigrater = new DataMigrater(processedData, repository);
        Console.WriteLine("Data migration started...");
        dataMigrater.Migrate();
        Console.WriteLine("Data migration completed...");
        Console.ReadLine();
    }
    private static IList<ServerData> GetProcessedData(IServerDataRepository repository)
    {
        return new ServerProcessedOrRawDataQuery(repository).Query(StartDate, EndDate).ToList();
    }
}
Finally, we can define the Single Responsibility Principle as:
"A class should be designed for a single responsibility, and there should not be more than one reason to change it. That responsibility should be completely tied to/encapsulated by the class."
Learning Open/Closed Principle (OCP)
When I first read the name of this principle, I thought it meant my class should be either open or closed, one or the other. But then I read the following definition from wiki:
"Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification."
I was shocked. How can my class be open and closed, rather than open or closed? In other words, how can I allow behavior to change without actually modifying my object? It was confusing to me. I dove into OOP (Object Oriented Programming) for the answer to my question.
Let's think about abstraction: we can create a base class with overridable functions, and give each derived class different behavior. Yes, we can allow changes while keeping existing objects unchanged. Let's take an example. We have to send emails for different operations; the body of each email depends upon certain business rules and can contain different or identical messages.
Now, what can we do here? We can create a class like CreateEmail, or whatever you want to name it, with one method, BuildMessage. This class is responsible only for building email messages according to different logic. As this method is overridable, I can define its functionality as I choose. Doesn't it look interesting and easy? Continuing with our example code above, suppose we need to update our database from one server to another (say we need to refresh our Development database from production), but there are some rules: the data type should be the same, the data value should be changed, etc.
public class ValidateData
{
    public void SyncronizeData(ServerData data, SourceServerData sourceData)
    {
        if (IsValid(data, sourceData))
        {
        }
    }
    private bool IsValid(ServerData data, SourceServerData sourceData)
    {
        var result = false;
        if (data.Type == sourceData.Type)
            result = true;
        if (data.IP != sourceData.IP)
            result = true;
        return result;
    }
}
Can you see what is wrong with the above code? Let's discuss. Take a look at the class ValidateData. First of all, it does two things. In other words, our class is responsible for two things:
- to validate incoming data (from source server)
- to save data
Now, think of a scenario where someone wants to extend this so that it can use another external service. He/she would have no choice but to modify the IsValid method.
Also, if someone needs to package it as a component and provide it to third parties, then its users would have no way to add another service. This means the class is not open for extension.
On the other hand, if someone needs to modify how data is persisted, she/he needs to change the class itself. To sum up, this class directly violates OCP, as it is neither open for extension nor closed for modification. So, what would be a better solution that lets this class follow OCP? Remember abstraction; let's try creating an interface:
public interface IDataValidator
{
    bool Validate(ServerData data, SourceServerData sourceData);
}
public class IPValidator : IDataValidator
{
    public bool Validate(ServerData data, SourceServerData sourceData)
    {
        return data.IP != sourceData.IP;
    }
}
public class TypeValidator : IDataValidator
{
    public bool Validate(ServerData data, SourceServerData sourceData)
    {
        return data.Type == sourceData.Type;
    }
}
The above code snippet is self-explanatory: we defined IDataValidator, which has a single method, Validate. The method name describes itself: it is part of a data validator and it validates data, so it is named Validate.
Now, let's redesign our class ValidateData:
public class ValidateData
{
    public bool IsDataValidated(ServerData data, SourceServerData sourceData)
    {
        IList<IDataValidator> validators = new List<IDataValidator>();
        validators.Add(new IPValidator());
        validators.Add(new TypeValidator());
        return IsDataValid(validators, data, sourceData);
    }
    private bool IsDataValid(IList<IDataValidator> validators, ServerData data, SourceServerData sourceData)
    {
        // As in the original IsValid, the data passes as soon as any one rule passes.
        foreach (var validator in validators)
        {
            if (validator.Validate(data, sourceData))
                return true;
        }
        return false;
    }
}
Let's pause here to discuss the above snippet. We now have a ValidateData class that is responsible only for validating data against certain validations/rules. With the above changes, our class is now more stable and robust, and we can add as many validators as we want. You can also use this validator before saving your data.
Ah! I forgot to mention: you can save the data simply by calling this validator from another class, which could be a repository class or a custom class where you persist your data. I am not going to write the save part; you can easily implement it yourself.
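One way to push this a step further is to hand the validators to the class from outside, so that adding a rule never requires editing it. The sketch below is an illustration under stated assumptions, not the article's own code: `ServerData` and `SourceServerData` are reduced to the two properties used here (the property types are assumed), and `RuleSetValidator` and `NonEmptyIPValidator` are hypothetical names. Note also that this sketch requires every rule to pass, whereas the `IsDataValid` method above accepts the data as soon as one rule passes.

```csharp
using System.Collections.Generic;
using System.Linq;

// Minimal stand-ins for the article's data classes; property types are assumed.
public class ServerData
{
    public int Type { get; set; }
    public string IP { get; set; }
}

public class SourceServerData
{
    public int Type { get; set; }
    public string IP { get; set; }
}

public interface IDataValidator
{
    bool Validate(ServerData data, SourceServerData sourceData);
}

public class IPValidator : IDataValidator
{
    public bool Validate(ServerData data, SourceServerData sourceData)
    {
        return data.IP != sourceData.IP;
    }
}

public class TypeValidator : IDataValidator
{
    public bool Validate(ServerData data, SourceServerData sourceData)
    {
        return data.Type == sourceData.Type;
    }
}

// Hypothetical rule added later; no existing class has to change for it.
public class NonEmptyIPValidator : IDataValidator
{
    public bool Validate(ServerData data, SourceServerData sourceData)
    {
        return !string.IsNullOrEmpty(data.IP);
    }
}

// Hypothetical class: closed for modification, open for extension through
// whatever validators the caller passes in.
public class RuleSetValidator
{
    private readonly IList<IDataValidator> _validators;

    public RuleSetValidator(IEnumerable<IDataValidator> validators)
    {
        _validators = validators.ToList();
    }

    // Assumption of this sketch: the data is valid only when every rule passes.
    public bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return _validators.All(v => v.Validate(data, sourceData));
    }
}
```

A caller composes the rule set it needs, for example `new RuleSetValidator(new IDataValidator[] { new IPValidator(), new TypeValidator(), new NonEmptyIPValidator() })`, and `RuleSetValidator` itself stays untouched when the list grows.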
Learning Liskov Substitution Principle (LSP)
Here is the definition from wiki:
"if S is a subtype of T, then objects of type T may be replaced with objects of type S (i.e., objects of type S may be substituted for objects of type T) without altering any of the desirable properties of that program (correctness, task performed, etc.)"
I understood the above definition like this: a child object should be able to replace its parent easily. To understand it a bit more, let's look at the EmailNotifications example (considered in OCP, above). Now let's say we also need to send this email for printing. What can we do? We can create a new class, let's call it NotificationsForPrint or whatever name you like, that inherits our class EmailNotifications.
Now, both our base class and our child class have at least one similar method. Can we use our child class as a substitute for our base class? No, not in this situation. So, rather than relying on inheritance alone, we define two separate interfaces, one for building a message and another for sending it, and decide in the implementation what we need to build and where we need to send it. Take a look at the following snippet:
public class DataBase
{
    public virtual bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return new Validator(new List<IValidator>()).Validate(data, sourceData);
    }
    public virtual void Save()
    {
    }
}
public class ProdDB : DataBase
{
    public override bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return base.IsValid(data, sourceData);
    }
    public override void Save()
    {
        base.Save();
    }
}
public class QADB : DataBase
{
    public override bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return base.IsValid(data, sourceData);
    }
    public override void Save()
    {
        base.Save();
    }
}
public class LocalDB : DataBase
{
    public override bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return base.IsValid(data, sourceData);
    }
    public override void Save()
    {
        throw new Exception("Local Data should not be saved!");
    }
}
Recall inheritance, and you can see that DataBase is the parent class of ProdDB, QADB and LocalDB. Let's think about polymorphism for a while; we can write:
DataBase pDataBase = new ProdDB();
DataBase qDataBase = new QADB();
DataBase lDataBase = new LocalDB();
var dataBases = new List<DataBase> {new ProdDB(), new QADB(), new LocalDB()};
foreach (var dataBase in dataBases)
{
    if (dataBase.IsValid(data, sourceData))
        dataBase.Save();
}
Wait, wait...
What’s Wrong with the Above, NOTHING?
Yes, you are absolutely correct: nothing is wrong with how the above code is written. The problem only shows up when it executes. When the above code runs, it will also invoke the Save method of the LocalDB object, and we get an exception, because our LocalDB object is not supposed to save data.
A Big Question is "Why This happened?"
In simple words, LocalDB is not really a kind of DataBase, or we can say DataBase is not a true parent of LocalDB. Another question comes to mind: "Why is LocalDB not a kind of DataBase when it inherits DataBase?" Hold on; go back to the LocalDB class and check: it is not meant to implement the Save() method, and this is what makes LocalDB a separate entity. In simple words, Liskov says a child should be able to replace its parent easily.
How to Implement Liskov Principle?
We know LocalDB is not supposed to save data, but the others are. Let's consider the following snippet:
public interface IRule
{
    bool IsValid(ServerData data, SourceServerData sourceData);
}
public interface IRepository
{
    void Save();
}
Now, we have two interfaces, each with its own method: IRule to validate data, and IRepository to save/persist data.
Let's change our LocalDB class as follows:
public class LocalDB : IRule
{
    public bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return new Validator(new List<IValidator>()).Validate(data, sourceData);
    }
}
Why We Implement IRule?
For LocalDB, we only want to check whether data is valid or not. We do not want to persist data in any scenario. Now, we can no longer write this:
DataBase lDataBase = new LocalDB();
Our DataBase class should be like this:
public class DataBase : IRule, IRepository
{
    public virtual bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return new Validator(new List<IValidator>()).Validate(data, sourceData);
    }
    public virtual void Save()
    {
    }
}
Other classes will remain unchanged. Our execution code goes as:
public void Execute(ServerData data, SourceServerData sourceData)
{
    var dataBases = new List<IRepository> { new ProdDB(), new QADB() };
    foreach (var dataBase in dataBases.Where(dataBase => ((IRule)dataBase).IsValid(data, sourceData)))
    {
        dataBase.Save();
    }
}
Now our code looks easier to handle.
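The cast to `IRule` inside `Execute` still relies on every `IRepository` in the list also being an `IRule`, which the compiler cannot check. One possible refinement, offered here as an assumption rather than as part of the original source, is a combined interface so that the list can only hold types that honor both contracts. In the sketch below, `IValidatedRepository` and `Migration` are hypothetical names, the data classes are empty stand-ins, and validation is stubbed to always succeed.

```csharp
using System.Collections.Generic;

// Empty stand-ins for the article's data classes.
public class ServerData { }
public class SourceServerData { }

public interface IRule
{
    bool IsValid(ServerData data, SourceServerData sourceData);
}

public interface IRepository
{
    void Save();
}

// Hypothetical combined contract: anything in the save loop must support both.
public interface IValidatedRepository : IRule, IRepository
{
}

public class DataBase : IValidatedRepository
{
    // Validation is stubbed out for this sketch.
    public virtual bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return true;
    }

    public virtual void Save()
    {
    }
}

public class ProdDB : DataBase { }

public class QADB : DataBase { }

// LocalDB only validates, so it never promises a Save() it cannot honor.
public class LocalDB : IRule
{
    public bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return true;
    }
}

public static class Migration
{
    // Saves every valid database and returns how many were saved.
    public static int Execute(IEnumerable<IValidatedRepository> dataBases,
                              ServerData data, SourceServerData sourceData)
    {
        var saved = 0;
        foreach (var dataBase in dataBases)
        {
            if (!dataBase.IsValid(data, sourceData))
                continue;
            dataBase.Save();
            saved++;
        }
        return saved;
    }
}
```

With this shape, trying to put `new LocalDB()` into a `List<IValidatedRepository>` is rejected at compile time, so the substitution problem is caught before the code ever runs.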
Learning Interface Segregation Principle (ISP)
Here is a definition from wiki:
"No client should be forced to implement methods which it does not use, and the contract should be broken into smaller and more specific interfaces."
I took this as:
"As a client, why should I implement 9 methods of an interface when I need only 3 of them?" Doesn't that make developers' lives easier?
This is also similar to the High Cohesion principle of GRASP. Think: can we consider our EmailNotification example (for both scenarios, print and send via SMTP server)?
Let's explore this with an example. First of all, go back and take a look at the code example discussed in LSP, where some of our databases are saved after validation. Now, think of a scenario where there are more databases, and for these additional databases we require a report. In other words, the new databases need to be read as well as saved.
At first glance, I can think of adding a new method (which can read data or generate a report) to the IRepository interface.
public interface IRepository
{
    void Save();
    void Generate();
}
Do you think the above approach is good? Think, think, and think again... :) By adding a new method to an existing interface, we force every class that implements the interface to implement the new method, even though those classes are not supposed to use it. So, my ProdDB class looks like:
public class ProdDB : DataBase
{
    public override bool IsValid(ServerData data, SourceServerData sourceData)
    {
        return base.IsValid(data, sourceData);
    }
    public override void Save()
    {
        base.Save();
    }
    public override void Generate()
    {
    }
}
But the ProdDB class does not actually need to generate a report; with the above implementation, it still has to implement the new Generate() method. Here, we are forcing our class to implement a method it does not want.
So, we are not following the Interface Segregation Principle in the above code [go to the top and read the ISP definition :)].
What is the Solution for this Problem?
First, let's try to segregate our IRepository interface.
public interface IReport : IRepository
{
    void Generate();
}
To segregate, I created another interface, IReport, with a new method, Generate(). Now, we have two separate interfaces, IRepository and IReport. Let's create a new class, meant for those clients who want to generate a report:
public class DataBaseReport : IReport
{
    public void Save()
    {
        var dataBase = new DataBase();
        dataBase.Save();
    }
    public void Generate()
    {
    }
}
At this point, we have two different classes, DataBase and DataBaseReport: one is for clients who don't want to generate a report, and the other is for those who do :).
So, our Execute method would look like:
public void Execute()
{
    IRepository repository = new DataBase();
    repository.Save();
    IReport report = new DataBaseReport();
    report.Generate();
}
You can see how we resolved the problem. This solution is very useful when new clients need things that our existing clients don't.
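In the snippets above, IReport inherits IRepository, so DataBaseReport still has to provide a Save() method. A stricter variation, sketched here as an assumption and not as the article's own design, keeps the two interfaces fully independent, so that a report-only client implements nothing but Generate(). The `Saved` and `Generated` flags and the `ReportingDataBase` class are hypothetical additions that exist only to make the behavior observable.

```csharp
public interface IRepository
{
    void Save();
}

// Independent of IRepository in this variation.
public interface IReport
{
    void Generate();
}

// Client that only saves.
public class DataBase : IRepository
{
    public bool Saved { get; private set; }

    public void Save()
    {
        Saved = true;
    }
}

// Client that only reports; it is not forced to implement Save().
public class DataBaseReport : IReport
{
    public bool Generated { get; private set; }

    public void Generate()
    {
        Generated = true;
    }
}

// Hypothetical client that needs both and opts in to both contracts explicitly.
public class ReportingDataBase : IRepository, IReport
{
    public bool Saved { get; private set; }
    public bool Generated { get; private set; }

    public void Save()
    {
        Saved = true;
    }

    public void Generate()
    {
        Generated = true;
    }
}
```

The trade-off is that a caller holding only an `IReport` can no longer call Save() on it; code that needs both operations has to ask for both interfaces.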
Learning Dependency Inversion Principle (DIP)
This principle reminds us of decoupling:
"High-level modules should not depend upon low-level modules; both should be tied together using abstractions."
What does this mean? Let's take a look at our EmailNotification example once again, and think about why our code should decide where to send the email (SMTP server or printer) at the very beginning.
Why should it not automatically perform the preferable action [we will look at this in detail in the coming parts]? Till then, take a look at this pattern on wiki.
Let's explore this with an example. Go back to the code example we discussed under the Single Responsibility Principle. There is a property, Type, which tells us the data type. Now take a look at the following snippet:
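public class DataBase : IRule, IRepository
{
    private ISession session;
    public virtual bool IsValid(ServerData data, SourceServerData sourceData)
    {
        // Same validation as in the LSP example.
        return new Validator(new List<IValidator>()).Validate(data, sourceData);
    }
    public virtual void Save(ServerData data)
    {
        // Problem: DataBase chooses and creates the concrete session itself.
        if (data.Type == 1)
        {
            session = new ServerDataSession();
        }
        else
        {
            session = new SourceServerDataSession();
        }
        session.Save(data);
    }
}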
The above violates SRP, and it also makes DataBase depend directly on concrete session classes. To solve this, let's create a separate interface that holds the Save method:
interface ISession
{
    void Save<T>(T data);
}
and let's create a different saver for each case, as per the below snippet:
class ServerDataSession : ISession
{
    public void Save<T>(T data)
    {
    }
}
class SourceServerDataSession : ISession
{
    public void Save<T>(T data)
    {
    }
}
Now, What?
We need to supply the session according to the data type; here we are delegating responsibility to someone else.
In other words, in the above snippet our DataBase class delegates its responsibility to others (ServerDataSession, SourceServerDataSession). We need to modify our DataBase class as:
public class DataBase : IRule, IRepository
{
    private ISession _session;
    public DataBase(ISession session)
    {
        _session = session;
    }
    public virtual bool IsValid(ServerData data, SourceServerData sourceData)
    {
        // Same validation as in the LSP example.
        return new Validator(new List<IValidator>()).Validate(data, sourceData);
    }
    public virtual void Save(ServerData data)
    {
        _session.Save<ServerData>(data);
    }
}
At this point, our client is free to inject whichever session he/she wants to use:
IRepository dataBase = new DataBase(new ServerDataSession());
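The text says the session has to be supplied according to the data type, but does not show where that choice ends up. A minimal sketch of one option follows, under stated assumptions: the `SessionFactory` class is hypothetical, `ServerData` is reduced to the single `Type` property with an assumed `int` type, the `IRule`/`IRepository` plumbing is left out for brevity, and the sessions simply record what they receive in a `Saved` list instead of talking to a real store.

```csharp
using System.Collections.Generic;

// Minimal stand-in; only the Type property from the article is kept.
public class ServerData
{
    public int Type { get; set; }
}

public interface ISession
{
    void Save<T>(T data);
}

// Stand-in sessions that record what they were asked to save.
public class ServerDataSession : ISession
{
    public readonly List<object> Saved = new List<object>();

    public void Save<T>(T data)
    {
        Saved.Add(data);
    }
}

public class SourceServerDataSession : ISession
{
    public readonly List<object> Saved = new List<object>();

    public void Save<T>(T data)
    {
        Saved.Add(data);
    }
}

// High-level class: depends only on the ISession abstraction.
public class DataBase
{
    private readonly ISession _session;

    public DataBase(ISession session)
    {
        _session = session;
    }

    public void Save(ServerData data)
    {
        _session.Save(data);
    }
}

// Hypothetical helper: the Type check that used to live inside DataBase.Save
// now sits at the place where objects are wired together.
public static class SessionFactory
{
    public static ISession For(ServerData data)
    {
        if (data.Type == 1)
            return new ServerDataSession();
        return new SourceServerDataSession();
    }
}
```

A caller would then write something like `new DataBase(SessionFactory.For(data)).Save(data);`. Because `DataBase` only sees `ISession`, a test can pass in a recording session like the ones above without touching the class.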
Conclusion
We discussed the SOLID programming principles, which help make our application/software design better. This is just a beginning; as you go through the other design patterns, you will gain more power to make your software more beautiful. The complete source code is also available at Learning SOLID.