|
But that's an extension method, and it would consume the IEnumerable while counting.
|
|
|
|
|
That's an interesting comment: the word "consume" usually means "use up", but in this case the code works, and it works because a source IEnumerable can be "used" any number of times.
Of real interest is whether multiple evaluations of the IEnumerable source are very expensive ... in terms of memory, time.
Perhaps it is the case that transforming the IEnumerable to a List<T> is a good thing to do, if it needs to be evaluated more than once.
thanks, Bill
«Tell me and I forget. Teach me and I remember. Involve me and I learn.» Benjamin Franklin
|
|
|
|
|
BillWoodruff wrote: because a source IEnumerable can be "used" any number of times.
Not all of them; and you can't tell. Queue and Stack implement IEnumerable, but they can be consumed only once (fortunately they have Count properties).
BillWoodruff wrote: whether multiple evaluations of the IEnumerable source are very expensive ... in terms of memory, time.
It may have to enumerate it fully; that takes time. Enumerating may also involve file or network access or similar (e.g. database access, reading from a socket) that uses time, IO, and memory.
And this particular result is not worth the effort in this case; so it's a waste.
Of course, it's possible that the Count method you use checks for certain types (e.g. Stack, Queue, Array, String) or interfaces (e.g. IList) and then uses the appropriate Count or Length methods rather than enumerating, but failing that, it must enumerate.
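A minimal sketch of that distinction, assuming the standard System.Linq Enumerable.Count(), which as far as I know looks for ICollection<T> (and the non-generic ICollection) before it falls back to enumerating. CountDemo, Numbers and ItemsProduced are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountDemo
{
    // Hypothetical counter: how many items the iterator below has produced.
    public static int ItemsProduced;

    // An iterator method exposes nothing but GetEnumerator.
    public static IEnumerable<int> Numbers()
    {
        for (int i = 0; i < 5; i++)
        {
            ItemsProduced++;
            yield return i;
        }
    }

    public static void Main()
    {
        // List<T> implements ICollection<T>, so Count() can read the Count property.
        var list = new List<int> { 1, 2, 3 };
        Console.WriteLine(list.Count());      // 3

        // No collection interface here, so Count() has to walk every item.
        ItemsProduced = 0;
        Console.WriteLine(Numbers().Count()); // 5
        Console.WriteLine(ItemsProduced);     // 5
    }
}
```

If Numbers() read from a file or a socket instead of a loop, that full walk is where the time and IO would go.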
But the bottom line, in this case, is that there is no reason to check the Count anyway. And as Luc pointed out, if you're checking the Count you might as well check the chunksz before trying to divide by it. Which then leads to the question "what to do when the caller specifies a chunksz of zero?" -- and I suspect the "best" thing to do is to treat it as an "all the rest" value. But that's just my thought.
I suggest leaving the burden of checking such things to the caller. Document what the method does and let the buyer beware.
modified 9-Jan-16 23:34pm.
|
|
|
|
|
Thanks for the interesting response, the "quick sketch" I showed here was not meant to show all the programmer-is-an-idiot-proofing that might go in "production code."
I'll follow up on your comments by doing some testing with Queues and Stacks; never even thought of trying those.
cheers, Bill
|
|
|
|
|
Be sure to hydrate.
|
|
|
|
|
PIEBALDconsult wrote: Queue and Stack implement IEnumerable, but they can be consumed only once
The enumerators for both Queue<T> and Stack<T> will not remove items from the collection, so you can iterate them as many times as you want.
BlockingCollection<T>.GetConsumingEnumerable is a better example.
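A rough sketch of the difference, using only Queue<T> and BlockingCollection<T> from the framework. ConsumeDemo, Walk, QueueTwice and BlockingTwice are hypothetical names; Walk uses a plain foreach so that no Count shortcut gets in the way.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class ConsumeDemo
{
    // Counts items with a plain foreach, so the source really is enumerated.
    private static int Walk(IEnumerable<int> source)
    {
        int n = 0;
        foreach (int item in source) n++;
        return n;
    }

    // Queue<T>: the enumerator only reads, so both passes see every item.
    public static (int first, int second, int remaining) QueueTwice()
    {
        var queue = new Queue<int>(new[] { 1, 2, 3 });
        return (Walk(queue), Walk(queue), queue.Count);
    }

    // GetConsumingEnumerable removes each item as it is yielded.
    public static (int first, int second, int remaining) BlockingTwice()
    {
        using (var items = new BlockingCollection<int>())
        {
            items.Add(1);
            items.Add(2);
            items.Add(3);
            // Without CompleteAdding the consuming enumerator would block, waiting for more items.
            items.CompleteAdding();
            int first = Walk(items.GetConsumingEnumerable());
            int second = Walk(items.GetConsumingEnumerable());
            return (first, second, items.Count);
        }
    }

    public static void Main()
    {
        Console.WriteLine(QueueTwice());    // (3, 3, 3)
        Console.WriteLine(BlockingTwice()); // (3, 0, 0)
    }
}
```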
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
What Piebald said.
I believe the best examples of IEnumerables that would be consumed are Iterator Methods (using yield return) and Enumerable.Range(,).
<Edit>removed some brain rot</Edit>
Wrong is evil and must be defeated. - Jeff Ello
modified 11-Jan-16 12:32pm.
|
|
|
|
|
That looks horribly inefficient!
|
|
|
|
|
Worse than that, it doesn't work as intended.
Note to self, never post untested code you thought up before going to bed.
|
|
|
|
|
I realized from Piebald's response below that I owe you a proper answer.
If you take a look at the IEnumerable(T) Interface, it specifies only one method, GetEnumerator.
All other methods are extensions that depend on that one single method.
So if we look at the IEnumerator(T) Interface, you'll notice that it has only three methods.
Dispose is of no interest here. MoveNext means that it's a forward-only enumerator.
But note that Reset does not need to be implemented, which means you cannot count on getting restarted. So you can only count on enumerating once.
This is why I wrote that "it would consume the IEnumerable while counting".
So why did your code work?
I guess your Source parameter also implemented the ICollection(T) Interface, which specifies the Count property (a List<T>, for example).
And if you have a Class property/method with the same signature as an Extension method, the Class property/method will always take priority.
The compiler never complained since it could see the Extension method Count().
Therefore I should have written "it might consume the IEnumerable while counting" or maybe rather "might enumerate".
Luckily the extension method Count and the Class property Count do the same thing.
BillWoodruff wrote: Perhaps it is the case that transforming the IEnumerable to a List<T> is a good thing to do, if it needs to be evaluated more than once.
In general I'd say yes, but it probably depends on whether the cost of instantiating objects or saving memory is of the highest importance.
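A small sketch of that trade-off, assuming an iterator method as the source. ReenumerateDemo, Source and BodyRuns are invented for the example. It shows that a second foreach simply calls GetEnumerator again (so the iterator body runs again), that ToList() pays for one pass plus a copy, and that Reset on a compiler-generated iterator throws NotSupportedException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReenumerateDemo
{
    // Hypothetical counter: how many times the iterator body has started.
    public static int BodyRuns;

    public static IEnumerable<int> Source()
    {
        BodyRuns++;          // executes on the first MoveNext of each enumeration
        yield return 1;
        yield return 2;
    }

    // Compiler-generated iterators do not support Reset.
    public static bool ResetIsSupported()
    {
        using (IEnumerator<int> e = Source().GetEnumerator())
        {
            try { e.Reset(); return true; }
            catch (NotSupportedException) { return false; }
        }
    }

    public static void Main()
    {
        BodyRuns = 0;
        IEnumerable<int> lazy = Source();
        int a = lazy.Sum();  // first pass
        int b = lazy.Sum();  // second pass: GetEnumerator is called again
        Console.WriteLine(a + " " + b + " " + BodyRuns);   // 3 3 2

        BodyRuns = 0;
        List<int> cached = Source().ToList();              // one pass, items copied
        Console.WriteLine(cached.Sum() + " " + cached.Sum() + " " + BodyRuns); // 3 3 1

        Console.WriteLine(ResetIsSupported());             // False
    }
}
```

Whether the copy is worth it depends on what a pass costs: cheap for an in-memory loop like this one, possibly not for a database query.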
|
|
|
|
|
BillWoodruff wrote: Perhaps it is the case that transforming the IEnumerable to a List<T> is a good thing to do, if it needs to be evaluated more than once.
Probably not, as that may copy all of the elements, which is just not worth the effort in this case.
I say again; either stop trying to get the Count, or change the parameter type to IList -- it is clear that you (Bill) do not want an IEnumerable at all for the method presented.
Actually, it's good that this discussion came up now because for the last few weeks I have been tweaking some Extension Methods that were accepting IEnumerable and I was concerned about what could be sent in. I have now changed the methods to specify IList and I think everything will be much better.
One of the problems I definitely had when specifying IEnumerable was that String implements IEnumerable, but I did not want to treat it the same as other IEnumerables, which meant testing for is string all the time. By changing to IList (which String does not implement), I no longer need that test.
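To illustrate the String point, here is a short sketch. StringParameterDemo, CountItems and CountListItems are hypothetical stand-ins, not the extension methods I was actually tweaking.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StringParameterDemo
{
    // Accepts anything enumerable, including string (as IEnumerable<char>).
    public static int CountItems<T>(IEnumerable<T> source)
    {
        return source.Count();
    }

    // Accepts only list-like types; string does not qualify.
    public static int CountListItems<T>(IList<T> source)
    {
        return source.Count;
    }

    public static void Main()
    {
        object text = "abc";
        Console.WriteLine(text is IEnumerable<char>); // True
        Console.WriteLine(text is IList<char>);       // False

        Console.WriteLine(CountItems("abc"));         // 3

        // CountListItems("abc");  // does not compile: string is not an IList<char>

        // The caller has to opt in explicitly, for example via a char array.
        Console.WriteLine(CountListItems("abc".ToCharArray())); // 3
    }
}
```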
|
|
|
|
|
PIEBALDconsult wrote: it is clear that you (Bill) do not want an IEnumerable at all for the method presented.
That's not correct; I wanted the method to work with Type 'String'; and it does.
modified 11-Jan-16 14:11pm.
|
|
|
|
|
Well, that's alright then.
|
|
|
|
|
Thanks for the full response, Jorgen !
|
|
|
|
|
Jörgen Andersson wrote: would consume
I suggest "may have to enumerate" as a more accurate word choice.
|
|
|
|
|
Yes, you're probably right.
I actually believe I owe Bill a proper answer.
|
|
|
|
|
|
You are bound to get a DivideByZeroInExceptionException
|
|
|
|
|
Luc?! 'Zat really you?
|
|
|
|
|
Yep. Sprinkling Seasons Exceptions.
|
|
|
|
|
Hey Luc, whaddup?
|
|
|
|
|
Hi Nish,
I'm just checking both sides of the moon. It seems rather dark either way...
|
|
|
|
|
Heheh
|
|
|
|
|
As others have said, this is the sort of thing that's better served using a procedural iterator method than trying to fudge it with LINQ.
Something like this would work, and would only require one pass through the source list:
public static IEnumerable<KeyValuePair<T1, List<T1>>> ToChunkedKvPList<T1>(this IEnumerable<T1> source, int chunksz)
{
    // Validate eagerly; the iterator below does not run until it is enumerated.
    if (source == null) throw new ArgumentNullException("source");
    if (chunksz <= 0) throw new ArgumentOutOfRangeException("chunksz", "The chunk size must be greater than 0.");
    return ToChunkedKvPListIterator(source, chunksz);
}

private static IEnumerable<KeyValuePair<T1, List<T1>>> ToChunkedKvPListIterator<T1>(IEnumerable<T1> source, int chunksz)
{
    int currentChunkSize = 0;
    T1 currentHeader = default(T1);
    var currentChunk = new List<T1>(chunksz);

    foreach (T1 item in source)
    {
        if (currentChunkSize == 0)
        {
            // The first item of each chunk becomes the key.
            currentHeader = item;
        }
        else
        {
            currentChunk.Add(item);
        }

        currentChunkSize++;
        if (currentChunkSize == chunksz)
        {
            yield return new KeyValuePair<T1, List<T1>>(currentHeader, currentChunk);
            currentChunkSize = 0;
            currentHeader = default(T1);
            currentChunk = new List<T1>(chunksz);
        }
    }

    // Emit whatever is left when the source is not an exact multiple of chunksz.
    if (currentChunkSize != 0)
    {
        yield return new KeyValuePair<T1, List<T1>>(currentHeader, currentChunk);
    }
}
The only thing you lose is the exception thrown when the source list count isn't a precise multiple of the chunk size; checking for that up front would require multiple passes.
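To make the header/chunk shape concrete, here is a rough usage sketch. It repeats the iterator in condensed form (argument checks left out) only so that it compiles on its own; ChunkDemo and the sample input are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkDemo
{
    // Condensed restatement of the iterator above, without the argument checks.
    public static IEnumerable<KeyValuePair<T, List<T>>> ToChunkedKvPList<T>(this IEnumerable<T> source, int chunksz)
    {
        int size = 0;
        T header = default(T);
        var chunk = new List<T>(chunksz);
        foreach (T item in source)
        {
            if (size == 0) header = item; else chunk.Add(item);
            size++;
            if (size == chunksz)
            {
                yield return new KeyValuePair<T, List<T>>(header, chunk);
                size = 0;
                chunk = new List<T>(chunksz);
            }
        }
        // A short final chunk is returned rather than rejected.
        if (size != 0) yield return new KeyValuePair<T, List<T>>(header, chunk);
    }

    public static void Main()
    {
        // Seven items in chunks of three: the last chunk holds only a header.
        foreach (var pair in Enumerable.Range(1, 7).ToChunkedKvPList(3))
        {
            Console.WriteLine(pair.Key + ": [" + string.Join(", ", pair.Value) + "]");
        }
        // 1: [2, 3]
        // 4: [5, 6]
        // 7: []
    }
}
```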
|
|
|
|
|
Very interesting, Richard, thanks ! I do not see using Linq as shown here as "fudging" in any sense of that word, but perhaps that word connotes something for you I am unaware of. As I said to Piebald, the example here was not meant to show all the fire-proofing/validation I would expect to see in "production code," but, of course, I am happy to see you included what you felt was needed.
My hypothetical take-away from your, and others', responses here is:
1. Using the Count() method on an IEnumerable may do either or both of the following:
a. be expensive in terms of memory or time of computation
b. "break" re-use of that IEnumerable, depending on what the IEnumerable base-Type is.
As time, and pending eye-surgery, permits, I'll look into those.
cheers, Bill
|
|
|
|