|
I have no issue with moving it to x64, except that it seems it would use far more memory than it should even after the switch, right?
I have shortened the intervals (set in an XML file) to every 30 minutes... I'm querying all of this from Exchange, so I don't want to run it too often. I may need to create some fake data, seed it into a list, and bypass Exchange for testing.
This is a Windows Service... A restart would technically help, but I don't like that idea lol
I'm testing your ListOfLists class right now... Worst case, I could combine my PowerShell command and SQL insert statements so I'm not passing Lists around. However, the PowerShell command does return an ICollection of PSObjects, so I guess we would most likely end up in the same situation?
|
|
|
|
|
1.
The C# code stays the same. After JIT compilation the native code is somewhat larger, but that won't be relevant here.
IIRC the smallest object grows from 32 to 48B when switching to x64.
And obviously every reference grows from 4 to 8B.
So memory usage is bound to be less than twice the original, and typically much less than that, as value types, texts, etc. don't grow at all.
2.
A piece of code that returns a collection and doesn't care about fragmentation issues is probably based on an array; so is the ToList() you're using in LINQ. An array (or an array-based class) is simply the easiest thing to produce and consume.
There are alternatives, such as streaming (produce while consuming, never holding it all in memory; that is still likely to contain an array internally, but a smaller one). Or just ask for less data at once (you could keep the start and end datetimes closer to each other, possibly in a loop, inside ExchActions.Get_TotalSentMessages()).
BTW: My ListOfLists class is an IEnumerable, i.e. to the consumer it only shows one element at a time. That is extreme streaming! Fortunately the compiler converts "yield return" statements into all the code required to keep track of where the consumer currently is in fetching the data.
3.
What the PowerShell interface gives you is an ICollection, which is slightly more than an IEnumerable (e.g. it has a Count property, and an Add method). From this one can only guess how it is implemented. Or look at it using Reflector or some other tool.
4.
I'm not sure you need ToList() in your LINQ statements. Did you try without?
Select() returns an IEnumerable, and I expect that is all you are needing. I don't know how Select is implemented, I would hope it works in some kind of streaming mode (IEnumerable in, IEnumerable out, and produce while being consumed).
If this holds true, that is a number of potentially big objects you no longer need.
You would have to declare it differently and count yourself (while at it, I also summed the bytes!):
IEnumerable<MessageTrackingLog> totalSentLogs = /* LINQ statement without .ToList() */;
int sentCount = 0;
int sentBytes = 0;
foreach (MessageTrackingLog item in totalSentLogs) {
    sentCount++;
    sentBytes += item.TotalBytes;
}
which is much cheaper than having the CLR hand you a List first, and then use LINQ to process it.
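To illustrate the streaming idea from points 2 and 4, here is a minimal sketch. It assumes nothing about the real Exchange code: `Source()` is a hypothetical stand-in for the data source, and the counters exist only to make the behaviour observable. It suggests that Select() on its own reads nothing, and that the foreach pulls items one at a time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingSketch
{
    public static int Pulled;            // source items read so far
    public static int PulledBeforeLoop;  // snapshot taken before the foreach starts

    // Hypothetical stand-in for the real data source (an iterator method).
    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 1000; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Sums the first three doubled values without ever building a list.
    public static int SumFirstThreeDoubled()
    {
        Pulled = 0;
        IEnumerable<int> doubled = Source().Select(x => x * 2);
        PulledBeforeLoop = Pulled;   // Select alone has not read anything

        int sum = 0;
        int seen = 0;
        foreach (int value in doubled)
        {
            sum += value;
            if (++seen == 3) break;  // stop early; remaining items are never produced
        }
        return sum;
    }
}
```

After calling `StreamingSketch.SumFirstThreeDoubled()`, `Pulled` shows how many of the 1000 potential items were actually produced, which is the point of leaving out ToList().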
|
|
|
|
|
Just wanted to update you that I've tried a couple of different things... The first thing I tried was completely disabling all Exchange actions (message tracking logs, mailbox sizes, etc).
Now it basically just processes the Active Directory options, which have a total of 4533 items that can be put in a list.
What I am finding is that memory usage is still up to 1GB and growing, even with all the tasks disabled.
I've had this service working without memory issues in the past. I completely rewrote it, changing from Entity Framework to LINQ to SQL because I didn't want to worry about the "context" being different. My goal was to make it so the scheduler version could last through multiple versions of the primary application. I'm really starting to wonder if it's LINQ to SQL, because nothing should be over 5000 in a list now after disabling those other options.
I may try switching to SqlConnection and SqlCommand for a test (BTW I updated my code if you want to check it out again in its current state)
|
|
|
|
|
Luc,
I've been running ANTS Memory Profiler 8.8 on the service... the Large Object Heap size is actually only about 40MB according to this profiler.
It shows the "Private Bytes" and "Working Set - Private" as the one that has all the memory.
When I took a snapshot this is what it is saying:
-> Generation 1 -> 0 bytes
-> Generation 2 -> 1.105MB
-> Large Object Heap -> 2.645MB
-> Unused memory allocated to .NET -> 108.6MB
-> Unmanaged -> 618.6MB
It also shows this for class list (Live size (bytes)):
-> ConditionalWeakTable<TKey, TValue>+Entry<Object, PSMemberInfoInternalCollection<PSMemberInfo>> (8,073,936 bytes)
-> Int32[] (2,726,312 bytes)
-> string (337,040)
-> AdsValueHelper (151,956)
and it just goes down from there. It does show "string" has 4,782 live instances and "AdsValueHelper" has 4,221 live instances
|
|
|
|
|
1.
I have no idea what all that means.
2.
I'm not a PowerShell user, nor will I become one any time soon.
I have been reading up on it a bit, and seem to have hit on two reasons for it to leak memory:
one of the first results googling "C# powershell memory leak"[^]
leaky PowerShell scripts[^]
3.
If I were to expect lots of output from something like PowerShell, and having seen the number of questions and complaints on it after a 1 minute Google, I would opt for a file interface: launch it with Process.Start() and have it create a file, hence avoiding most potential trouble.
4.
I recommend you reduce your program to a fraction of the intended functionality, make the memory consumption numbers very visible, and work on it till your "climbing slowly" is completely gone. Then iteratively add code and functionality, keeping a sharp eye on the memory situation at all times.
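One way to make the memory numbers "very visible", sketched here as an assumption rather than a prescription: log the managed heap size next to the process's private bytes after each task. The class and method names (`MemoryReport`, `Snapshot`) are made up for illustration; `GC.GetTotalMemory` and `Process.PrivateMemorySize64` are standard framework members.

```csharp
using System;
using System.Diagnostics;

public static class MemoryReport
{
    // Returns a one-line snapshot: managed heap size versus process private bytes.
    public static string Snapshot(string label)
    {
        // false = do not force a garbage collection before measuring
        long managed = GC.GetTotalMemory(false);

        using (Process self = Process.GetCurrentProcess())
        {
            long privateBytes = self.PrivateMemorySize64;
            return string.Format("{0}: managed={1:N0} B, private={2:N0} B",
                                 label, managed, privateBytes);
        }
    }
}
```

Writing `MemoryReport.Snapshot("after AD sync")` to the service log after every task gives a running picture. If the private figure keeps climbing while the managed one stays flat, the growth is outside the managed heap, which would be in line with the large "Unmanaged" number the profiler reported.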
|
|
|
|
|
edit: Piebald and others raised concerns here that the use of IEnumerable<T>.Count() would make the code break with Stack and Queue.
That is not the case; the code works:
private void TestChunking()
{
    string testString = "aaabbbcccdddeeefffggghhh";
    int[] intary = new int[36];
    List<int> intlist = new List<int>(36);
    Stack<int> stack = new Stack<int>();
    Queue<int> queue = new Queue<int>();

    // Fill every collection with the values 0..35.
    for (int i = 0; i < 36; i++)
    {
        intary[i] = i;
        intlist.Add(i);
        queue.Enqueue(i);
        stack.Push(i);
    }

    // Every chunk size below divides the source length evenly, as the method requires.
    var result1 = intary.ToChunkedKvPList(9);
    var result2 = intlist.ToChunkedKvPList(6);
    var result3 = stack.Reverse().ToChunkedKvPList(9);
    var result4 = queue.ToChunkedKvPList(4);
    var result5 = testString.ToChunkedKvPList(3);
}
The goal here (an extension method on IEnumerable) was to take an IEnumerable of any Type, and a chunk-size, and return a sequence of KeyValuePairs where each KeyValuePair has as its 'Key the first element in a chunk, and as its 'Value a List of all but the first element in the chunk:
using System;
using System.Collections.Generic;
using System.Linq;

public static class IEnumerableExtensions
{
    public static IEnumerable<KeyValuePair<T1, List<T1>>> ToChunkedKvPList<T1>(this IEnumerable<T1> source, int chunksz)
    {
        // Enumerates the source once up front to check that it splits into whole chunks.
        if (source.Count() % chunksz != 0) throw new ArgumentException("Source.Count must equal ChunkSize modulo 0");

        int ndx = 0;                 // running element index, captured by the GroupBy lambda
        int listsz = chunksz - 1;    // number of elements that go into each 'Value list

        return source
            .GroupBy(x => (ndx++ / chunksz))
            .Select(grp => grp.ToList())
            .Select(lst => new KeyValuePair<T1, List<T1>>(lst[0], lst.GetRange(1, listsz)));
    }
}
Yeah, this works, but I remain convinced there is probably a much more elegant way of doing this using Linq; a way that would not require using an indexer external to the Linq operation. Perhaps a way to avoid two levels of 'Select ?
«Tell me and I forget. Teach me and I remember. Involve me and I learn.» Benjamin Franklin
modified 12-Jan-16 6:18am.
|
|
|
|
|
You can avoid the first Select like this:
return source
    .GroupBy(x => (ndx++ / chunksz))
    .Select(grp => new KeyValuePair<T1, List<T1>>(grp.First(), grp.Skip(1).ToList()));
This also makes the variable listsz obsolete.
I don't see a way to get rid of the external indexer without making it more convoluted.
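For comparison, here is one possible way to drop the external indexer, using the Select overload that supplies the element index. This is only a sketch, and arguably it is exactly the "more convoluted" variant: the method name `ToChunkedKvPListIndexed` is made up, and the Count check from the original is deliberately left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkingSketch
{
    public static IEnumerable<KeyValuePair<T, List<T>>> ToChunkedKvPListIndexed<T>(
        this IEnumerable<T> source, int chunksz)
    {
        return source
            // Select((item, index) => ...) lets LINQ supply the running index.
            .Select((item, index) => new { item, index })
            // The chunk number is the key; only the item itself is kept in the group.
            .GroupBy(p => p.index / chunksz, p => p.item)
            .Select(grp => new KeyValuePair<T, List<T>>(grp.First(), grp.Skip(1).ToList()));
    }
}
```

The cost is one anonymous object per element. Without the up-front check, a final short chunk simply comes back with a shorter 'Value list, and a chunksz of zero fails with a DivideByZeroException once the result is enumerated.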
If the brain were so simple we could understand it, we would be so simple we couldn't. — Lyall Watson
|
|
|
|
|
Thanks for this excellent response, Sascha.
It would be interesting to know if the use of 'First and 'Skip makes for any difference in computation-time and memory use compared to the code I showed. I doubt it.
cheers, Bill
modified 11-Jan-16 14:40pm.
|
|
|
|
|
You're very welcome, Bill. Best of luck for your eyes surgery!
|
|
|
|
|
You could also get rid of the second Select :
return source.GroupBy(
    x => (ndx++ / chunksz),
    (key, grp) => new KeyValuePair<T1, List<T1>>(grp.First(), grp.Skip(1).ToList()));
Enumerable.GroupBy(TSource, TKey, TResult) Method (IEnumerable(TSource), Func(TSource, TKey), Func(TKey, IEnumerable(TSource), TResult)) (System.Linq)[^]
Add in a KeyValuePair<TKey, TValue> factory method:
public static class KeyValuePair
{
    public static KeyValuePair<TKey, TValue> Create<TKey, TValue>(TKey key, TValue value)
    {
        return new KeyValuePair<TKey, TValue>(key, value);
    }
}
and the statement becomes almost readable:
return source.GroupBy(
    x => (ndx++ / chunksz),
    (key, grp) => KeyValuePair.Create(grp.First(), grp.Skip(1).ToList()));
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
I'd actually prefer the separate .Select over the .GroupBy with resultSelector, to me that's a split second faster to recognize. I like the idea with the factory method though
|
|
|
|
|
thanks for this ! Bill
|
|
|
|
|
I suspect that would be so much easier (and quicker) in straight procedural code.
And IEnumerable doesn't have a Count member, so you're doomed to failure from the first statement. Maybe you want to use IList instead? I think a better behaviour would be to not attempt to count the items, but to leave the final item short or to pad the final item with default(T1) values. Maybe even allow the caller to specify which behaviour to use (throw, pad, as-is). And, of course, document such behaviour.
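A rough sketch of the procedural approach with the "as-is" behaviour, i.e. no counting and a final chunk that is simply left short. The names (`ProceduralChunking`, `Chunks`) are invented for this example, and rejecting a non-positive chunk size is an assumption, not something settled in this thread.

```csharp
using System;
using System.Collections.Generic;

public static class ProceduralChunking
{
    // Yields chunks of up to chunkSize items; the last chunk may be shorter.
    public static IEnumerable<List<T>> Chunks<T>(IEnumerable<T> source, int chunkSize)
    {
        // Argument checks run immediately, not on first enumeration.
        if (source == null) throw new ArgumentNullException("source");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        return ChunksIterator(source, chunkSize);
    }

    private static IEnumerable<List<T>> ChunksIterator<T>(IEnumerable<T> source, int chunkSize)
    {
        List<T> current = new List<T>(chunkSize);
        foreach (T item in source)   // a single pass over the source, no Count needed
        {
            current.Add(item);
            if (current.Count == chunkSize)
            {
                yield return current;
                current = new List<T>(chunkSize);
            }
        }
        if (current.Count > 0) yield return current;   // final short chunk, left as-is
    }
}
```

Because the source is walked exactly once, this also works for sequences that are expensive or impossible to enumerate twice. Padding or throwing on a short final chunk would be a small change at the last line.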
|
|
|
|
|
PIEBALDconsult wrote: And IEnumerable doesn't have a Count member;
:cough: MSDN[^] :cough:
Bad command or file name. Bad, bad command! Sit! Stay! Staaaay...
|
|
|
|
|
But that's an extension method, and it would consume the IEnumerable while counting.
|
|
|
|
|
That's an interesting comment: the word "consume" usually means "use-up;" but, in this case, the code works, and works because a source IEnumerable can be "used" any number of times.
Of real interest is whether multiple evaluations of the IEnumerable source are very expensive ... in terms of memory, time.
Perhaps it is the case that transforming the IEnumerable to a List<T> is a good thing to do, if it needs to be evaluated more than once.
thanks, Bill
|
|
|
|
|
BillWoodruff wrote: because a source IEnumerable can be "used" any number of times.
Not all of them; and you can't tell. Queue and Stack implement IEnumerable, but they can be consumed only once (fortunately they have Count properties).
BillWoodruff wrote: whether multiple evaluations of the IEnumerable source are very expensive ... in terms of memory, time.
It may have to enumerate it fully; that takes time. Enumerating may also involve file or network access or similar (e.g. database access, reading from a socket) that uses time, IO, and memory.
And this particular result is not worth the effort in this case; so it's a waste.
Of course, it's possible that the Count method you use checks for certain types (e.g. Stack, Queue, Array, String) or interfaces (e.g. IList) and then uses the appropriate Count or Length methods rather than enumerating, but failing that, it must enumerate.
But the bottom line, in this case, is that there is no reason to check the Count anyway. And as Luc pointed out, if you're checking the Count you might as well check the chunksz before trying to divide by it. Which then leads to the question "what to do when the caller specifies a chunksz of zero?" -- and I suspect the "best" thing to do is to treat it as an "all the rest" value. But that's just my thought.
I suggest leaving the burden of checking such things to the caller. Document what the method does and let the buyer beware.
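The point about Count() checking for certain interfaces can be tried out with a small experiment. This is a sketch under stated assumptions: `CountOnlyCollection` is a deliberately odd, hypothetical type that reports a Count but throws if anyone enumerates it, and `Lazy()` is an iterator with a counter attached.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection that reports a Count but refuses to be enumerated.
public sealed class CountOnlyCollection : ICollection<int>
{
    public int Count { get { return 42; } }
    public bool IsReadOnly { get { return true; } }
    public void Add(int item) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public bool Contains(int item) { return false; }
    public void CopyTo(int[] array, int arrayIndex) { }
    public bool Remove(int item) { throw new NotSupportedException(); }
    public IEnumerator<int> GetEnumerator() { throw new InvalidOperationException("enumerated"); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class CountSketch
{
    public static int Enumerated;   // items produced by Lazy() so far

    public static IEnumerable<int> Lazy()
    {
        for (int i = 0; i < 5; i++) { Enumerated++; yield return i; }
    }

    public static int CountOfCollection()
    {
        IEnumerable<int> source = new CountOnlyCollection();
        return source.Count();   // served by the Count property; GetEnumerator is never called
    }

    public static int CountOfLazy()
    {
        Enumerated = 0;
        return Lazy().Count();   // no shortcut available: walks all five items
    }
}
```

So for anything that implements ICollection<T> (List<T>, arrays, Queue<T> and Stack<T> via the non-generic ICollection), Count() is cheap; for a plain iterator it costs a full pass.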
modified 9-Jan-16 23:34pm.
|
|
|
|
|
Thanks for the interesting response, the "quick sketch" I showed here was not meant to show all the programmer-is-an-idiot-proofing that might go in "production code."
I'll follow up on your comments by doing some testing with Queues and Stacks; never even thought of trying those.
cheers, Bill
|
|
|
|
|
Be sure to hydrate.
|
|
|
|
|
PIEBALDconsult wrote: Queue and Stack implement IEnumerable, but they can be consumed only once
The enumerators for both Queue<T> and Stack<T> will not remove items from the collection, so you can iterate them as many times as you want.
BlockingCollection(T).GetConsumingEnumerable[^] is a better example.
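A short sketch contrasting the two cases. The class and method names are made up for illustration; the collection types and members are the standard ones.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public static class ConsumingSketch
{
    // Enumerates a Queue<int> twice; returns { first pass, second pass, items left }.
    public static int[] QueuePasses()
    {
        Queue<int> queue = new Queue<int>(new[] { 1, 2, 3 });
        int first = 0, second = 0;
        foreach (int item in queue) first++;
        foreach (int item in queue) second++;   // items are still there
        return new[] { first, second, queue.Count };
    }

    // Same experiment with a consuming enumerable, which removes what it yields.
    public static int[] BlockingPasses()
    {
        using (BlockingCollection<int> items = new BlockingCollection<int>())
        {
            items.Add(1);
            items.Add(2);
            items.Add(3);
            // Without CompleteAdding the consuming enumerator would block waiting for more.
            items.CompleteAdding();

            int first = items.GetConsumingEnumerable().Count();
            int second = items.GetConsumingEnumerable().Count();   // nothing left to take
            return new[] { first, second, items.Count };
        }
    }
}
```

`QueuePasses()` sees all three items on both passes, while `BlockingPasses()` finds the collection drained after the first pass.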
|
|
|
|
|
What Piebald said.
I believe the best examples of IEnumerables that would be consumed are Iterator Methods (using yield return) and Enumerable.Range(,).
<Edit>removed some brain rot</Edit>
Wrong is evil and must be defeated. - Jeff Ello
modified 11-Jan-16 12:32pm.
|
|
|
|
|
That looks horribly inefficient!
|
|
|
|
|
Worse than that, it doesn't work as intended.
Note to self, never post untested code you thought up before going to bed.
|
|
|
|
|
I realized from Piebald's response below that I owe you a proper answer.
If you take a look at the IEnumerable(T) Interface[^] it specifies only one Method, GetEnumerator.
All other methods are extensions that depend on that one single method.
So if we look at the IEnumerator(T) Interface[^] you'll notice that it has only three methods.
Dispose is of no interest here. MoveNext means that it's a forward only enumerator.
But note that Reset does not need to be implemented, which means you cannot count on being able to restart. So you can only count on enumerating once.
This is why I wrote that "it would consume the IEnumerable while counting".
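The Reset point can be checked with a small experiment on an iterator method. This is a sketch with invented names (`ResetSketch`, `Numbers`); it only shows what one compiler-generated iterator does, not what every IEnumerable is obliged to do.

```csharp
using System;
using System.Collections.Generic;

public static class ResetSketch
{
    public static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
    }

    // Returns true if Reset() on the compiler-generated enumerator throws.
    public static bool ResetThrows()
    {
        using (IEnumerator<int> e = Numbers().GetEnumerator())
        {
            e.MoveNext();
            try { e.Reset(); return false; }
            catch (NotSupportedException) { return true; }
        }
    }

    // Counts the items seen on two separate passes over the same IEnumerable.
    public static int[] TwoPasses()
    {
        IEnumerable<int> numbers = Numbers();
        int first = 0, second = 0;
        foreach (int n in numbers) first++;
        foreach (int n in numbers) second++;   // a fresh GetEnumerator() call starts over
        return new[] { first, second };
    }
}
```

Here Reset is indeed unsupported, yet a second foreach still sees both items because each loop asks for a new enumerator. Whether a second pass is possible, or cheap, remains up to the particular source.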
So why did your code work?
I guess your source parameter also implemented the ICollection(T) Interface[^], which specifies the Count property (a List<T>, for example).
And if you have a class property/method with the same signature as an extension method, the class property/method will always take priority.
The compiler never complained since it could see the extension method Count().
Therefore I should have written "it might consume the IEnumerable while counting" or maybe rather "might enumerate".
Luckily the extension method Count and the class property Count do the same thing.
BillWoodruff wrote: Perhaps it is the case that transforming the IEnumerable to a List<T> is a good thing to do, if it needs to be evaluated more than once.
In general I'd say yes, but it probably depends on whether the cost of instantiating objects or saving memory is of the highest importance.
|
|
|
|
|
BillWoodruff wrote: Perhaps it is the case that transforming the IEnumerable to a List<T> is a good thing to do, if it needs to be evaluated more than once.
Probably not, as that may copy all of the elements, which is just not worth the effort in this case.
I say again; either stop trying to get the Count, or change the parameter type to IList -- it is clear that you (Bill) do not want an IEnumerable at all for the method presented.
Actually, it's good that this discussion came up now because for the last few weeks I have been tweaking some Extension Methods that were accepting IEnumerable and I was concerned about what could be sent in. I have now changed the methods to specify IList and I think everything will be much better.
One of the problems I definitely had when specifying IEnumerable was that String implements IEnumerable, but I did not want to treat it the same as other IEnumerables, which meant testing for is string all the time. By changing to IList (which String does not implement), I no longer need that test.
|
|
|
|
|