|
That sounds like a good reason.
|
|
|
|
|
Any time I'm doing multiple queries, I manually open the connection. From my understanding, each query (without manually opening) is in essence doing this:
-> Open Connection
-> Query
-> Close connection
-> Open Connection
-> Query
-> Close connection
So by calling db.Database.Connection.Open() myself (roughly as in the sketch below), it should be doing this:
-> Open Connection
-> Query
-> Query
-> Close connection
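Roughly like this (a sketch only, assuming an EF-style DbContext named db; the entity sets shown are placeholders, not my actual classes):

db.Database.Connection.Open();   // open once, reuse for several queries
try
{
    var lockedUsers  = db.Users.Where(u => u.Locked).ToList();   // query 1 (placeholder)
    var mailboxSizes = db.MailboxSizes.ToList();                  // query 2 (placeholder)
}
finally
{
    db.Database.Connection.Close();   // closed once, at the end
}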
|
|
|
|
|
Here is the remarks section for your "Connection" property:
Quote: You can use this property to interoperate with relational ADO.NET code.
The returned connection will be closed unless it has been explicitly opened by the user.
(Which appears to also reflect what is in the EF docs).
DataContext.Connection Property (System.Data.Linq)[^]
|
|
|
|
|
OK, as a test I took out all of the following:
db.Connection.Open();
I kept my try/catch/finally, which calls Dispose on the DataContext, for testing. Since I am no longer manually opening connections, it should open and close around each query/update, but it should handle the connections by itself.
I'm going to keep an eye on memory today. If this doesn't work, then I will try the using statement, roughly as sketched below. I still want to handle the connection, because I don't want it opening and closing all the time since that causes a delay. Once I narrow down what it is, I'll call Close() before disposing of the DataContext.
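Something along these lines (a sketch; CloudPanelDbContext and the query are placeholders for my actual DataContext class and code):

using (var db = new CloudPanelDbContext(connectionString))
{
    db.Connection.Open();   // optional: hold one connection open across several queries
    var users = db.Users.Where(u => u.MailboxPlan > 0).ToList();
    // ... more queries / db.SubmitChanges() ...
}   // Dispose() closes the connection even if an exception is thrown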
I will let you all know once I have an update.
|
|
|
|
|
So I thought it was looking promising, but the memory maxed out again. I could see the memory going up and down, but over time it just steadily climbed.
I posted the code here:
KnowMoreIT/CloudPanel-Service · GitHub[^]
|
|
|
|
|
Hi again,
that's a lot of code, but I like your coding style (except for too many vars), so that kept me going; I have been browsing some of the code, and have a few questions already:
1) are you running this in 32-bit or in 64-bit mode? and how much RAM does your PC have?
2) what is the estimated count (order of magnitude) of your lists (users, messages received, messages sent, ...)?
3) what is the frequency of the Get_MessageTrackingLogsTask? (while hoping that is where the problem is).
In the end I will probably be curious to see all the intervals as collected by GetIntervalsFromConfig().
4) please give a quantitative description instead of "the memory going up and down but over time it just steadily climbed". E.g. is it like a sawtooth? how much does it go down when it does? how much does it go up from one tooth to the next? what is the periodicity of it going down? at what rate does it go up on average? what is the final value? and how long does it run fine?
5) when it fails, what is the exact message you are getting? is this repetitive? how many runs did you perform with the same code (and pretty much the same data)? do you know where in the code it comes to a sudden end?
6) do you know of anything in the application domain that (essentially without changing the code) dramatically changes the up-time? e.g. can you make it fail much more rapidly by altering some parameters?
7) suggestion: if acceptable, could you also provide a log of a failing run that fits the code currently on github?
modified 12-Jan-16 22:38pm.
|
|
|
|
|
Hey Luc!
Glad to hear you like my coding style... it's been completely self-taught (with help from random people on the web), so I never really knew, LOL.
1) I am running a 32-bit process. The server is a virtual machine running 2008 R2 with 16GB of memory (4GB is used normally).
2) Here is the breakdown:
Users: 4333 rows
Message Tracking Logs: 1432284 rows
Mailbox Sizes: 2209537 rows
Database Sizes: 10520 rows
History Stats table: 69806 rows
3) Get_MessageTrackingLogsTask normally would run every 720 minutes. To speed up the testing phase I changed it to every 60 minutes (along with the rest of the tasks)
4) The service starts out at 5MB of memory. From 6pm to 9pm tonight it has climbed to 314MB. I usually end up stopping the service around 2.8GB used. (I've let it just go once, and it crashed when it hit the max memory for a 32-bit process.) Here are the intervals:
GetDisabledUsers has run 180 times (every 10 min) [Users table]
GetLockedUsers has run 180 times (every 10 min) [Users table]
GetMailboxSizes has run 3 times (every 60 min) [Mailbox sizes table]
GetMailboxDatabaseSizes has run 6 times (every 30 min) [Database sizes table]
GetMessageTrackingLogs has run 3 times (every 60 min) [Message Tracking table]
FindMissingData has run 9 times (every 20 min) [Users table]
UpdateDatabaseHistory has run 3 times (every 60 min) [History Stats table]
5) I've only let it crash once, so I don't know where it crashed. I don't see any errors and it continues to function fine until it runs out of memory. This does happen every time I run it. I've tried removing the db.Connection.Open() and currently (right now) I'm trying the "using" statement.
6) I can make the memory climb faster by running the tasks more often (shorter intervals). So one of my tasks is causing the issue, but examining the code over and over I can't find where it possibly is doing it.
7) It would probably take about 2 days for it to fully climb and crash. I could set up something to capture a memory dump when it crashes, but it would take a few days.
One thing I can try is to increase the interval for all tasks to like 999999 minutes except for one (set at 10 min). Then I can slowly include each task until I see the memory problem.
But right now it has gone from 5MB to 314MB in 3 hours. It should just be querying from Exchange/AD, dumping the data to SQL, and disposing of all of it... so technically the memory should go back down, right?
|
|
|
|
|
Hi,
thanks for all that info. Here are 3 remarks:
1) changing from explicit Dispose() calls to using constructs will not solve the problem by itself; it merely results in simpler code that is less likely to contain errors. So it should not be a priority.
2) I have an experiment I want to suggest to you with high priority: your code contains some 16 instances of
List<something> somename=new List<something>();
I want you to replace those by:
List<something> somename=new List<something>(someconstant);
i.e. specify an initial capacity for each and every list, where someconstant is a small factor larger than your current number of items; also preferably do not use more than 2 different values. So I'd suggest 0x00400000 for Message Tracking Logs and Mailbox Sizes, and 0x00040000 for everything else; yes, those are hex numbers for a reason, and they exceed 4 million and a quarter of a million respectively. The idea is to have the Lists at a sufficient size right away (this saves a lot of data copying while populating lists that currently also have to keep growing), and to improve the probability that the same memory chunks will be reused for newer generations of those lists.
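In other words, something like this (the element types here are placeholders for your actual classes):

// pre-allocate the backing arrays once, so the lists never have to grow
List<MessageTrackingLog> trackingLogs = new List<MessageTrackingLog>(0x00400000);  // > 4 million slots
List<MailboxSizeRow>     mailboxSizes = new List<MailboxSizeRow>(0x00400000);      // > 4 million slots
List<UserRow>            users        = new List<UserRow>(0x00040000);             // > 256 K slots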
This may seem mysterious, so here is the rationale: a List internally has an array holding the items; an array is an object; objects larger than around 80KB are kept in what is known as the "Large Object Heap". That heap used to have very poor memory management in the early days of .NET (say .NET 2.0); the issue was kept quiet by MS as long as possible, but it was (and maybe or probably still is) a weakness of the .NET memory management. It has been said it got improved, however I don't know whether it has, how, and by how much. (BTW: I expect behavior to be better in a 64-bit application.) If my suggested experiment behaves much better, it more or less proves the issue still exists and is relevant to your application.
Warning: the memory usage may initially be higher, however the "climbing steadily" should disappear completely.
3) I discovered one thing that is wasteful, and fixing that would then probably be the final solution I'd suggest: in your scenario you first build a list of all messages (sent, received), then repetitively use LINQ to reduce that list to a smaller list, and then basically are interested in the count of that smaller list. With a little effort, you could write a method that doesn't build the big list, it could create a list of lists (one for each user) hopefully resulting in many lists each of them much smaller than the current big list (a list with fewer than 10K items is bound to be managed well on the regular heap). If needed we can discuss this more later on, after we have the results from the experiment under (2).
That's it for now.
|
|
|
|
|
Luc,
Thank you for assisting me with this. So I'm going to update the code and push it out tomorrow morning with the pre-allocated list sizes. Also, just to let you know, the process is at 386MB of memory right now.
I know you said not to worry about the using statements, but when I started this at 6pm I had already put them in place. The code on GitHub is still showing the try/catch/finally, but I can update it with what I changed if you think it is necessary.
Again,
I really appreciate you helping!
|
|
|
|
|
OK, the change to usings is no problem; once it works, it should be fine.
Yes, if anything is still the matter, I will want access to the code whose results you report later on.
This will be my last message for today. CU tomorrow.
|
|
|
|
|
Luc,
Sorry for the delay. Yesterday I discovered another issue in the code: the Quartz library I'm using would let a job fire again while a previous run was still executing; it didn't stop the timer while the job was executing and restart it after the job was done.
After that I discovered the memory leak still exists, and I'm trying your suggestions now. HOWEVER, the memory leak is MUCH slower now because the jobs are not occurring so often.
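(For reference, one way to keep a Quartz.NET job from overlapping itself is to decorate the job class with [DisallowConcurrentExecution] — a rough sketch with a placeholder job name:)

using Quartz;

[DisallowConcurrentExecution]   // Quartz will not fire this job again while a run is still executing
public class GetMessageTrackingLogsJob : IJob
{
    public void Execute(IJobExecutionContext context)
    {
        // ... the task's work ...
    }
}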
From yesterday morning to now the memory has grown to 1.5GB.
|
|
|
|
|
OK, the code is updated and running now. I changed the triggers that were every 60 minutes to every 30 minutes to hopefully reproduce the problem quicker.
Updated code is here:
KnowMoreIT/CloudPanel-Service · GitHub[^]
|
|
|
|
|
OK, it is running with that new code and is currently at 677MB. Just a few minutes ago it went up to 1.5GB and back down... however, it is slowly climbing.
|
|
|
|
|
Hi,
1. once again I must ask you to be more specific; "slowly climbing" doesn't offer much info.
2. changing the code and changing run-time parameters too much makes it extremely hard if not impossible to understand what sporadic measurement results actually mean.
3. I won't be in tomorrow.
4. Today I have been, and still am, researching the behavior of the Large Object Heap, and probably will write a CP article about it; if so, that will take a few weeks though.
In summary, while there may be a way to make it perform as required (with a lot of conditions), I am still inclined to avoid large objects as much as possible. And hence:
5. I developed a little class that would keep your huge message lists in the regular heap, hence avoiding the LOH fragmentation you are currently bound to get. It does not offer an initial capacity; I don't think that would be very useful. It has been tested with foreach and with LINQ. Here it is:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Text;

namespace LargeObjectsHeapTester {

    // Stores items in many small sub-lists so that no single backing array
    // ever grows large enough to land on the Large Object Heap.
    public class ListOfLists<T> : IEnumerable, IEnumerable<T> {
        public const int MAX_ITEMS_IN_LEVEL1 = 10000;

        private List<List<T>> level0;      // the list of sub-lists
        private List<T> currentLevel1;     // the sub-list currently being filled

        public ListOfLists() {
            Clear();
        }

        private void log(string msg) {
        }

        public void Clear() {
            level0 = new List<List<T>>();
            currentLevel1 = new List<T>();
            level0.Add(currentLevel1);
        }

        public void Add(T item) {
            // start a new sub-list once the current one is full
            if (currentLevel1.Count >= MAX_ITEMS_IN_LEVEL1) {
                currentLevel1 = new List<T>();
                level0.Add(currentLevel1);
            }
            currentLevel1.Add(item);
        }

        public int Count {
            get {
                int totalCount = 0;
                foreach (List<T> level1 in level0) totalCount += level1.Count;
                return totalCount;
            }
        }

        public void DumpCounts() {
            int totalCount = 0;
            log("level0.Count=" + level0.Count);
            int i = 0;
            foreach (List<T> level1 in level0) {
                int count1 = level1.Count;
                log("level0[" + i + "].Count=" + count1);
                totalCount += count1;
                i++;
            }
            log("total count=" + totalCount);
        }

        IEnumerator IEnumerable.GetEnumerator() {
            return this.GetEnumerator();
        }

        public IEnumerator<T> GetEnumerator() {
            foreach (List<T> level1 in level0) {
                foreach (T item in level1) {
                    yield return item;
                }
            }
        }
    }
}
6. Your LINQ statements such as
var allMailboxes = db.Users.Where(x ... ... ... ).ToList();
clearly also create lists; I don't know how large they are, but if they have more than 10000 elements, they too will contribute to the LOH fragmentation, and they have escaped every remedy I have suggested so far.
Questions:
a) how many users satisfy x.MailboxPlan > 0
b) what would be a good upper limit for the Count of this allMailboxes?
c) how many users would there typically be in the below a.Users (sent or received)
List<MessageTrackingLog> totalSentLogs = sentLogs.Where(a => a.Users.Contains(x.EmailAddress, StringComparer.OrdinalIgnoreCase)).ToList();
That's it for now.
modified 15-Jan-16 8:18am.
|
|
|
|
|
Hi Luc!
I think after reading your questions/answers and doing some research that I have a much better understanding of what is going on. Basically, when a List has more than 10,000 objects in it, its backing array goes onto the LOH. Now, from what I understand, the garbage collector doesn't normally "compact" these, as it expects to reuse the space?
Now I read that in 4.5.1 they introduced this[^], which gives you the ability to tell it TO compact once; then it resets to the default, which doesn't compact the LOH.
Now, what you posted creates a List of Lists (as named) and keeps each list under 10,000 items, so it never enters the LOH.
I'm testing your class right now and recompiling...
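Basically swapping my big lists out like this (a rough sketch; the element type and the loop that fills it are just illustrative):

var sentLogs = new ListOfLists<MessageTrackingLog>();
foreach (var log in resultsFromExchange)     // whatever currently fills the List<T>
    sentLogs.Add(log);

int total = sentLogs.Count;                              // still available
var nonEmpty = sentLogs.Where(l => l.TotalBytes > 0);    // LINQ over IEnumerable<T> still works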
Edit: (Sorry forgot to answer your questions):
Questions:
a) how many users satisfy x.MailboxPlan > 0
Right now it is under 5000 but technically it could be well over 10,000 one day
b) what would be a good upper limit for the Count of this allMailboxes?
Same thing.. right now under 5000 but could be over 10,000 one day
c) how many users would there typically be in the below a.Users (sent or received)
This is how many messages a single user has sent. Most of the time it will be under 1000 but if someone spams it could be larger. The list of ALL users for sent messages is well over 10,000. I can put in some logs to get the exact data
modified 17-Jan-16 22:19pm.
|
|
|
|
|
1.
Yes, your first paragraph is correct.
2.
I have almost zero experience with the improved LOH GC in .NET 4.5/4.5.1/4.5.2/4.6 (yes it came incrementally!), I've read it all and I'm working on some experiments, however I do not fully trust it for the potential side effects. I'd rather avoid the fragmentation if there happens to be a reasonable way to do so. Keep in mind an LOH compaction is a potentially huge operation that is rumored to maybe take ten seconds or so, in which time your app probably doesn't respond to anything.
3.
Once you got an OOMExc, you're stuck, unless you put try-catch AND retry logic everywhere! As an OOMExc could occur in many places, you would have to:
(a) either include a lot of GCSettings.LargeObjectHeapCompactionMode=...Once statements,
(b) or trust that setting it once at the start of some/all intervals would suffice.
But that assumption might be hard to prove correct.
Anyway, I'd be more interested in the newer forms of GC.Collect, see here[^], but that requires 4.6.
You could also set LargeObjectHeapCompactionMode to CompactOnce AND call GC.Collect(2) to force a LOH collect right away. That should work on 4.5.1.
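i.e. something like this minimal sketch:

using System.Runtime;   // GCSettings, GCLargeObjectHeapCompactionMode

// request a one-time LOH compaction on the next blocking gen-2 collection,
// then trigger that collection right away
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(2);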
BTW: I've seen an article suggesting a timer that periodically causes an LOH compaction, but that sounds horrible, it would not synchronize to your app at all.
4.
I'm inclined to recommend you eventually switch to a 64-bit app if that is at all possible. Yes pointers become twice as large (so lists become LOH candidates sooner), but the usable virtual address space theoretically grows from 2 GB (maybe 3 or 4) up to 8 TB (other limits may apply, the Windows memory system is pretty complex!).
A 64-bit app needs a CPU that supports x64 and a Windows OS version that does the same (check under MyComputer/Properties, not sure how virtual machines handle it), and a .NET app that is built for "Any CPU", an option in Build/ConfigurationManager (no problem, unless you are referencing libraries/DLLs that are built explicitly for 32-bit).
5.
Warning: every little step you take to solve a problem like this may make it less probable, so it becomes harder to detect if anything more needs to be done. It is essential you find and use a way to "stress test" your code, maybe by shortening intervals, entering duplicates of actual list elements, etc.
6.
Finally, depending on how your program is going to be used, maybe an automatic periodic restart is quite acceptable. Application.Restart()[^] works really well!
|
|
|
|
|
I have no issues moving it to x64, except it seems like it would use way more memory than it should even if I switched it, right?
I have shortened the intervals (just an XML file) to every 30 minutes... I'm querying all of this from Exchange so I don't want to run it too often. I may need to just create some fake data, seed it into a list, and bypass Exchange for testing.
This is a Windows Service... The restart would technically help, but I don't like that idea lol
I'm testing your ListOfLists class right now... worst case I could combine my PowerShell commands with the SQL insert statements so I'm not passing Lists around. However, the PowerShell command does return an ICollection of PSObjects, so most likely I guess we would end up in the same situation?
|
|
|
|
|
1.
The C# code remains the same; after JIT compilation, your code is somewhat larger, but this won't be relevant.
IIRC the smallest object grows from 32 to 48B when switching to x64.
And obviously every reference grows from 4 to 8B.
So memory usage is bound to be less than twice the original, and typically much less than that, as value types, texts, etc. don't grow at all.
2.
A piece of code that returns a collection and doesn't care about fragmentation issues probably is based on an array; so is the ToList() you're using in LINQ. An array (or an array-based class) is just the easiest to produce and consume.
There are alternatives, such as streaming (produce while consuming, never have it all in memory; that is still likely to contain an array internally, just a smaller one). Or just ask for less data at once (you could keep start and end datetimes closer to each other, possibly in a loop, inside ExchActions.Get_TotalSentMessages()).
BTW: My ListOfLists class is an IEnumerable, i.e. to the consumer it only shows one element at a time. That is extreme streaming! Fortunately the compiler converts "yield return" statements into all the code required to keep track of where the consumer is currently fetching the data.
3.
What the powershell interface gives you is an ICollection, which is slightly more than an IEnumerable (e.g. it has a Count property, and an Add method). From this one can only gamble how it is implemented. Or look at it using Reflector or some other tool.
4.
I'm not sure you need ToList() in your LINQ statements. Did you try without?
Select() returns an IEnumerable, and I expect that is all you are needing. I don't know how Select is implemented, I would hope it works in some kind of streaming mode (IEnumerable in, IEnumerable out, and produce while being consumed).
If this holds true, that is a number of potentially big objects you no longer need.
You would have to declare differently, and count yourself (while at it, I also summed the bytes!):
IEnumerable<MessageTrackingLog> totalSentLogs = LINQ statement without .ToList();
int sentCount = 0;
int sentBytes = 0;
foreach (MessageTrackingLog item in totalSentLogs) {
    sentCount++;
    sentBytes += item.TotalBytes;
}
which is much cheaper than having the CLR hand you a List first, and then use LINQ to process it.
|
|
|
|
|
Just wanted to update you that I've tried a couple of different things... The first thing I've tried is completely disabling all Exchange actions (message tracking logs, mailbox sizes, etc.).
Now it basically just processes the Active Directory options, which give a total of 4533 items that can be put in a list.
What I am finding is that the memory usage is still growing, and is up to 1GB now, even with all those tasks disabled.
I've had this service working without memory issues in the past. I completely rewrote it, changing from Entity Framework to LINQ to SQL, because I didn't want to worry about the "context" being different. My goal was to make it so the scheduler version could last across multiple versions of the primary application. I'm really starting to wonder if it's LINQ to SQL, because nothing should be over 5000 items in a list now after disabling those other options.
I may try switching to SqlConnection and SqlCommand for a test (BTW, I updated my code if you want to check it out again in its current state).
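If I go that route it would be plain ADO.NET, roughly like this (a sketch only; the table and column names are placeholders):

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("SELECT COUNT(*) FROM Users WHERE MailboxPlan > 0", conn))
{
    conn.Open();
    int mailboxCount = (int)cmd.ExecuteScalar();
}   // connection and command are both disposed here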
|
|
|
|
|
Luc,
I've been running ANTS Memory Profiler 8.8 on the service... the Large Object Heap size is actually only about 40MB according to this profiler.
It shows the "Private Bytes" and "Working Set - Private" as the one that has all the memory.
When I took a snapshot this is what it is saying:
-> Generation 1 -> 0 bytes
-> Generation 2 -> 1.105MB
-> Large Object Heap -> 2.645MB
-> Unused memory allocated to .NET -> 108.6MB
-> Unmanaged -> 618.6MB
It also shows this for the class list (Live size (bytes)):
-> ConditionalWeakTable<TKey, TValue>+Entry<Object, PSMemberInfoInternalCollection<PSMemberInfo>> (8,073,936 bytes)
-> Int32[] (2,726,312 bytes)
-> string (337,040)
-> AdsValueHelper (151,956)
and it just goes down from there. It does show that "string" has 4,782 live instances and "AdsValueHelper" has 4,221 live instances.
|
|
|
|
|
1.
I have no idea what all that means.
2.
I'm not a PowerShell user, nor will I become one any time soon.
I have been reading up on it a bit, and seem to have hit on two reasons for it to leak memory:
one of the first results googling "C# powershell memory leak"[^]
leaky PowerShell scripts[^]
3.
If I were to expect lots of output from something like PowerShell, and having seen the number of questions and complaints on it after a 1 minute Google, I would opt for a file interface: launch it with Process.Start() and have it create a file, hence avoiding most potential trouble.
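For instance, roughly like this (the script name and arguments are hypothetical):

using System.Diagnostics;

var psi = new ProcessStartInfo
{
    FileName = "powershell.exe",
    Arguments = "-NoProfile -File Export-TrackingLogs.ps1",   // hypothetical script that writes a CSV
    UseShellExecute = false,
    CreateNoWindow = true
};
using (Process p = Process.Start(psi))
{
    p.WaitForExit();
}
// then read and parse the file here, instead of holding thousands of PSObjects in memory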
4.
I recommend you reduce your program to a fraction of the intended functionality, make the memory consumption numbers very visible, and work on it till your "climbing slowly" is completely gone. Then iteratively add code and functionality, keeping a sharp eye on the memory situation at all times.
|
|
|
|
|
edit: Piebald and others raised concerns about the possibility that the use of IEnumerable<T>.Count() here would make the code break with Stack and Queue.
That is not the case; the code works:
private void TestChunking()
{
    string testString = "aaabbbcccdddeeefffggghhh";
    int[] intary = new int[36];
    List<int> intlist = new List<int>(36);
    Stack<int> stack = new Stack<int>();
    Queue<int> queue = new Queue<int>();

    for (int i = 0; i < 36; i++)
    {
        intary[i] = i;
        intlist.Add(i);
        queue.Enqueue(i);
        stack.Push(i);
    }

    var result1 = intary.ToChunkedKvPList(9);
    var result2 = intlist.ToChunkedKvPList(6);
    var result3 = stack.Reverse().ToChunkedKvPList(9);
    var result4 = queue.ToChunkedKvPList(4);
    var result5 = testString.ToChunkedKvPList(3);
}
The goal here (an extension method on IEnumerable) was to take an IEnumerable of any Type, and a chunk-size, and return a List of KeyValuePairs where each KeyValuePair had as its 'Key the first element in a chunk, and as its 'Value a List containing all but the first element of the chunk:
public static class IEnumerableExtensions
{
    public static IEnumerable<KeyValuePair<T1, List<T1>>> ToChunkedKvPList<T1>(this IEnumerable<T1> source, int chunksz)
    {
        // the source must divide evenly into chunks of size chunksz
        if (source.Count() % chunksz != 0) throw new ArgumentException("source.Count() must be an exact multiple of chunksz");

        int ndx = 0;
        int listsz = chunksz - 1;

        return source
            .GroupBy(x => (ndx++ / chunksz))
            .Select(grp => grp.ToList())
            .Select(lst => new KeyValuePair<T1, List<T1>>(lst[0], lst.GetRange(1, listsz)));
    }
}
Yeah, this works, but I remain convinced there is probably a much more elegant way of doing this using LINQ; a way that would not require using an indexer external to the LINQ operation. Perhaps a way to avoid two levels of 'Select?
«Tell me and I forget. Teach me and I remember. Involve me and I learn.» Benjamin Franklin
modified 12-Jan-16 6:18am.
|
|
|
|
|
You can avoid the first Select like this:
return source
    .GroupBy(x => (ndx++ / chunksz))
    .Select(grp => new KeyValuePair<T1, List<T1>>(grp.First(), grp.Skip(1).ToList()));
This also makes the variable listsz obsolete.
I don't see a way to get rid of the external indexer without making it more convoluted.
If the brain were so simple we could understand it, we would be so simple we couldn't. — Lyall Watson
|
|
|
|
|
Thanks for this excellent response, Sascha.
It would be interesting to know if the use of 'First and 'Skip makes for any difference in computation-time and memory use compared to the code I showed. I doubt it.
cheers, Bill
«Tell me and I forget. Teach me and I remember. Involve me and I learn.» Benjamin Franklin
modified 11-Jan-16 14:40pm.
|
|
|
|
|
You're very welcome, Bill. Best of luck with your eye surgery!
If the brain were so simple we could understand it, we would be so simple we couldn't. — Lyall Watson
|
|
|
|
|