Re: Surprising performance difference - C# Discussion Boards

Re: A windows service with System.Web.Mail

Heath Stewart, 24-Jun-04 12:44

Referencing assemblies is akin to binding libraries. If you need a type defined in another assembly, you have to reference it. Just like if you need to call a function or use a class defined in another library, you have to link it.

Hope that helps.

Microsoft MVP, Visual C#
My Articles

Surprising performance difference

the real bryon, 23-Jun-04 12:09

Hello,

I recently stumbled across a huge performance difference when writing a large hash table out to a file. The hashtable holds a very simple object called Token, which holds a string value called token and a couple of counters. ToString() is overridden for the object to return:

tokenstring [space] counter1 [space] counter2

If I iterate through the values of the hashtable and write it out to a file, it takes around 1 minute and 40 seconds to complete.

Seems a little slow. Here is the weird part. If I iterate through the values of the hashtable and add them to an arraylist and then iterate through the arraylist to write out the values, it takes between 10 and 12 seconds.

The hash table is large: the output file is 39 MB, with one line of around 18 characters per hash table entry.

Below is some of my test code to clarify the situation in case my explanation above has anyone confused. WriteTokensSlow takes about 1 minute and 40 seconds. WriteTokensFast takes between 10 and 12 seconds.

private void WriteTokensSlow()
{
    WriteCollection(tokensHT.Values);
}

private void WriteTokensFast()
{
    ArrayList tokenList = new ArrayList(tokensHT.Values);
    WriteCollection(tokenList);
}

private void WriteCollection(ICollection collection)
{
    using (StreamWriter sw = new StreamWriter(savePath, false))
    {
        foreach (Token tokenItem in collection)
            sw.WriteLine(tokenItem.ToString());
    }
}

Can anyone explain why this is happening? Thanks.

Re: Surprising performance difference

leppie, 23-Jun-04 13:02

I'm stumped! Both operations are O(n), so I can't understand why one should take so much longer (if that's the only code). :confused:

^{top secret xacc-ide 0.0.1}

Re: Surprising performance difference

Heath Stewart, 24-Jun-04 5:00

The difference is in the enumerator. The Hashtable.Values property is, of course, an ICollection (which you already know). Hashtable+ValueCollection.GetEnumerator returns a Hashtable+HashtableEnumerator, which is significantly more complex (and hence slower) than the simple ArrayList+ArrayListEnumerator used when you enumerate the ArrayList.

When you create an ArrayList from an ICollection, Array.Copy is used to perform a shallow or deep copy depending on whether the source and destination arrays hold reference or value types. Either way, this operation is much faster than enumerating a collection and is also handled by the runtime itself (native code). For both these reasons, copying the arrays is much faster.
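
A rough way to check this on your own machine is sketched below. It is only a sketch under stated assumptions: the Token class is a hypothetical stand-in for the one described in the original post, the item count is arbitrary, Stopwatch needs .NET 2.0 or later, and output goes to TextWriter.Null so that disk I/O does not mask the enumeration cost.

```csharp
using System;
using System.Collections;
using System.Diagnostics;
using System.IO;

public static class EnumerationTiming
{
    // Hypothetical stand-in for the Token class from the original post.
    public sealed class Token
    {
        public string Text;
        public int Counter1;
        public int Counter2;

        public Token(string text) { Text = text; }

        public override string ToString()
        {
            return Text + " " + Counter1 + " " + Counter2;
        }
    }

    // Fills a Hashtable with the requested number of tokens.
    public static Hashtable BuildTable(int count)
    {
        Hashtable table = new Hashtable();
        for (int i = 0; i < count; i++)
        {
            string key = "token" + i;
            table[key] = new Token(key);
        }
        return table;
    }

    // Writes every item to the writer and returns the number of lines written.
    public static int WriteCollection(ICollection collection, TextWriter writer)
    {
        int lines = 0;
        foreach (Token item in collection)
        {
            writer.WriteLine(item.ToString());
            lines++;
        }
        return lines;
    }

    public static void Main()
    {
        Hashtable table = BuildTable(200000); // arbitrary size for the sketch

        // Enumerate Hashtable.Values directly.
        Stopwatch direct = Stopwatch.StartNew();
        int a = WriteCollection(table.Values, TextWriter.Null);
        direct.Stop();

        // Copy the values into an ArrayList first, then enumerate the list.
        Stopwatch copied = Stopwatch.StartNew();
        int b = WriteCollection(new ArrayList(table.Values), TextWriter.Null);
        copied.Stop();

        Console.WriteLine("Hashtable.Values: {0} lines, {1} ms", a, direct.ElapsedMilliseconds);
        Console.WriteLine("ArrayList copy:   {0} lines, {1} ms", b, copied.ElapsedMilliseconds);
    }
}
```

The numbers will vary by runtime version and hardware, so treat them as a relative comparison only.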

Hope that helps.


Re: Surprising performance difference

Jeremy Kimball, 24-Jun-04 6:07

That's why I try my damnedest not to use the foreach construct unless there is absolutely no alternative. Enumeration sucks.

Jeremy Kimball

I have traveled the gutters, lo these many days, with no signs of life. Well met.
-brianwelsch

Re: Surprising performance difference

Heath Stewart, 24-Jun-04 6:50

The problem here, though, isn't enumeration in general. It's the Hashtable+HashtableEnumerator that's the problem (though I wouldn't call it a problem). A Hashtable isn't a simple collection (it's not really a collection at all), so enumeration is complicated. Enumerating over an ArrayList (using the ArrayList+ArrayListEnumerator) is much, much faster, as his performance results showed.

The problem here is just a case of using the right enumeration. It wasn't just the native code behind Array.Copy that produced better results (the ArrayList was still enumerated, remember).


Re: Surprising performance difference

Jeremy Kimball, 25-Jun-04 1:39

Doh...that's what I get for not reading the original code thoroughly...for some reason, my brain translocated the foreach loop.

At any rate, although it's extremely unlikely, might leppie be on to something with collisions? What method does the Hashtable use to solve collisions, some sort of linked bucket? Only problem with that theory is insertion would take a much longer time as well...


Re: Surprising performance difference

Heath Stewart, 25-Jun-04 3:35

No, because collisions occur when you insert items into a Hashtable. The original post is merely enumerating it. That merely entails enumerating the root buckets as well as any child buckets. In such a case, no hashing is necessary at all.


Re: Surprising performance difference

the real bryon, 24-Jun-04 6:59

Hi Heath,

If you change the WriteTokensFast function to this:

private void WriteTokensFast()
{
    ArrayList tokenList = new ArrayList();
    foreach (Token tokenItem in tokensHT.Values)
        tokenList.Add(tokenItem);
    WriteCollection(tokenList);
}

It still only takes 10 to 12 seconds to run even though it is iterating through the Values and casting them to a Token object (the slow function takes a minute and 40 seconds).

Re: Surprising performance difference

Heath Stewart, 24-Jun-04 9:04

Interesting, though there is another likely possibility: if this is the first time you're using those particular types, they must be JIT compiled. After that is complete, the types are cached as native code so the performance is akin to native code.

Try reversing the order of your calls and you should find the previously faster one takes as much time.

The Hashtable+HashtableEnumerator is still more complex. I'm surprised you're not seeing a significant difference with around 39 MB of data in the Hashtable. I would expect a few seconds' difference.


Re: Surprising performance difference

the real bryon, 24-Jun-04 12:09

Hi Heath,

I actually ran these in separate test app runs. They weren't getting executed in the same process one after another, so that wouldn't be it.

I am completely clueless as to the cause, but I spent a lot of time testing slightly different scenarios to figure out where the problem was with the slow function. I've basically ruled out iterating through the hashtable and casting as the culprits, but that leaves me with nothing.

I've been programming using C# and the .Net framework since the beta of version 1 was available, and this is easily the strangest performance quirk I have come across.

Re: Surprising performance difference

Heath Stewart, 24-Jun-04 12:34

the real bryon wrote:
I've been programming using C# and the .Net framework since the beta of version 1 was available

Same here. That case is far from true for the majority of programmers (notice I didn't say "developers"?) that come here.

The only other thing I can think of is to profile the code and see where the bottleneck is. You can download a decent free profiler somewhere on MSDN, and there are other commercial ones around the 'net (including the ANTS profiler that advertises here on CodeProject).
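
Short of a real profiler, a crude first pass is to time each phase separately. The sketch below is one hypothetical way to do that, not a replacement for profiling: plain strings stand in for the Token objects, the item count is arbitrary, the output goes to a temporary file, and Stopwatch, Action and lambdas assume .NET 3.5 or later.

```csharp
using System;
using System.Collections;
using System.Diagnostics;
using System.IO;

public static class PhaseTiming
{
    // Runs one phase and returns the elapsed time in milliseconds.
    public static long Time(Action phase)
    {
        Stopwatch watch = Stopwatch.StartNew();
        phase();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Hypothetical data: strings stand in for the Token objects.
        Hashtable table = new Hashtable();
        for (int i = 0; i < 200000; i++)
            table["token" + i] = "token" + i + " 0 0";

        string path = Path.GetTempFileName();
        int seen = 0;

        // Phase 1: enumeration only, no formatting and no I/O.
        long enumerateOnly = Time(() => { foreach (object o in table.Values) seen++; });

        // Phase 2: enumeration plus ToString(), still no I/O.
        long formatOnly = Time(() => { foreach (object o in table.Values) o.ToString(); });

        // Phase 3: the full job, including writing to disk.
        long writeToFile = Time(() =>
        {
            using (StreamWriter sw = new StreamWriter(path, false))
            {
                foreach (object o in table.Values)
                    sw.WriteLine(o.ToString());
            }
        });

        Console.WriteLine("Items enumerated:    {0}", seen);
        Console.WriteLine("Enumerate only:      {0} ms", enumerateOnly);
        Console.WriteLine("Enumerate + format:  {0} ms", formatOnly);
        Console.WriteLine("Enumerate + write:   {0} ms", writeToFile);

        File.Delete(path);
    }
}
```

If the first two phases are quick and only the third is slow, the enumerator is probably not the bottleneck on its own.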

This sure is a strange problem.


Re: Surprising performance difference

leppie, 24-Jun-04 7:16

Interesting, I didn't notice that passing your ICollection implementation as a parameter to the constructor makes use of it. I will keep this in mind :)

However, I still cannot see why it should take so much longer. Maybe fixing the Capacity to a HUGE initial value could solve it. The only thing I can think of is collision checking; a larger capacity would spread the hashes out more, though that will require more memory too.


Re: Surprising performance difference

Heath Stewart, 24-Jun-04 8:51

When you pass an ICollection to the ArrayList ctor, the Capacity is set to the number of elements in the ICollection using the Count, yes.
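
This is easy to observe with a small sketch. The table contents and sizes below are made up for illustration; the second list is grown one Add at a time for comparison, and its final Capacity depends on the runtime's growth policy.

```csharp
using System;
using System.Collections;

public static class CapacityDemo
{
    // Copies the table's values through the ArrayList(ICollection) constructor.
    public static ArrayList CopyValues(Hashtable table)
    {
        return new ArrayList(table.Values);
    }

    public static void Main()
    {
        // Hypothetical data, only here to give the collection a known size.
        Hashtable table = new Hashtable();
        for (int i = 0; i < 1000; i++)
            table[i] = "value" + i;

        ArrayList copy = CopyValues(table);
        Console.WriteLine("Constructed from ICollection: Count = {0}, Capacity = {1}",
            copy.Count, copy.Capacity);

        // For comparison: a list grown one item at a time.
        ArrayList grown = new ArrayList();
        foreach (object value in table.Values)
            grown.Add(value);
        Console.WriteLine("Grown with Add:               Count = {0}, Capacity = {1}",
            grown.Count, grown.Capacity);
    }
}
```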


DbNull to integer

IamADotNetGuy, 23-Jun-04 10:54

How to cast from type 'DBNull' to type 'integer'

Re: DbNull to integer

Colin Angus Mackay, 23-Jun-04 11:00

IamADotNetGuy wrote:
How to cast from type 'DBNull' to type 'integer'

Why? Null means the absence of a value. What integer value would you cast it to?

"You can have everything in life you want if you will just help enough other people get what they want." --Zig Ziglar


Re: DbNull to integer

IamADotNetGuy, 23-Jun-04 11:03

Hi, sorry, a small correction:
One of the tables in the database has a column that accepts an integer value,
but sometimes I want to pass a null value to this column.

Please let me know how to do this.

Re: DbNull to integer

Colin Angus Mackay, 23-Jun-04 11:10

Ah! Okay that is easier:

If you are passing values in as parameters it is something like this:

command.Parameters.Add("@MyInteger", DBNull.Value);

If you are using some other means of inserting data and this doesn't make sense, let me know and I'll see if I can cook up an example for you.
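
If the value is only sometimes missing, a small helper can pick between the number and DBNull.Value. This is a sketch, not part of any library: ToDbValue is a hypothetical name, and it assumes a compiler with nullable value types (C# 2.0 or later).

```csharp
using System;

public static class DbNullHelper
{
    // Hypothetical helper: returns the boxed int when a value is present,
    // or DBNull.Value when it is missing.
    public static object ToDbValue(int? value)
    {
        return value.HasValue ? (object)value.Value : DBNull.Value;
    }

    public static void Main()
    {
        int? present = 42;
        int? missing = null;

        Console.WriteLine(ToDbValue(present));
        Console.WriteLine(ToDbValue(missing) == DBNull.Value);
    }
}
```

The result would then be passed in place of the constant, for example command.Parameters.Add("@MyInteger", DbNullHelper.ToDbValue(someNullableInt)), where someNullableInt is whatever nullable value you hold.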

Re: DbNull to integer

Valdair, 24-Jun-04 7:19

To pass a null value to a DB, you must specify the value NULL in the insert string...

INSERT INTO MyTable (MyColumn) VALUES (NULL)

Now, if you want to retrieve NULLs as a 0 value from the column, you could do the following in SQL:

SELECT ISNULL(MyColumn, 0) FROM MyTable

This will return the value 0 instead of NULL.

Of course, this assumes you are writing the SQL string yourself; if not, then I'd go with the other solutions.

Re: DbNull to integer

Colin Angus Mackay, 23-Jun-04 11:02

The best I can think of is something like this:

object fieldValue = reader.GetValue(columnIndex);
if (fieldValue == System.DBNull.Value)
    fieldValue = 0; // Treat null values in the column as a zero.

Re: DbNull to integer

Arjan Einbu, 23-Jun-04 11:04

Sorry. No can do...

DBNull represents an unknown value, and an int has no means of representing that.

One way around this is to define a specific integer value as the value of unknown/not set/DBNull. For instance, use -1 for this if valid values for your integer are always positive.

object obj = some_value_thats_either_an_int_or_dbnull;
int i = obj is DBNull ? -1 : (int)obj;
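
Where a sentinel such as -1 is awkward, another option is to carry the "unknown" state in a nullable int. The sketch below assumes a compiler with nullable value types (C# 2.0 or later); FromDbValue is a hypothetical helper name, not a framework method.

```csharp
using System;

public static class DbNullReader
{
    // Hypothetical helper: maps DBNull (or a null reference) to a null int?,
    // and converts anything else to an int.
    public static int? FromDbValue(object value)
    {
        if (value == null || value is DBNull)
            return null;
        return Convert.ToInt32(value);
    }

    public static void Main()
    {
        int? fromNull = FromDbValue(DBNull.Value);
        int? fromInt = FromDbValue(5);

        Console.WriteLine(fromNull.HasValue); // no value for DBNull
        Console.WriteLine(fromInt.Value);     // the stored integer
    }
}
```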

Have a look at my latest article about Object Prevalence with Bamboo Prevalence.

Re: DbNull to integer

IamADotNetGuy, 23-Jun-04 11:07

How can I insert a null value into a column of number type?

Re: DbNull to integer

Arjan Einbu, 23-Jun-04 11:13

Ah... The other way around... I'd go with Colin Angus Mackay's solution above...


Logic Question on Databinding and others

se6en, 23-Jun-04 9:34

I am new to C# but have plenty of programming knowledge in C++ and Java.

I have a question on my logic for a program I am doing.

What I am doing is reading in a few tables from an Access DB and binding them to textbox fields on a form. The form comes up with "company" information, and then a subform shows the contacts for that specific company. The company information is in one table with a foreign key to the contact table.

Now I can bring up all the company info and all the contact info right now. But I want the subform to show only the contacts for that specific company. Is there a "filter" or something that I can run against my contact DataSet to show the appropriate rows and hide the others?

I know this is pretty vague, but I just need the logic on how to do something like this.

_Jacob

Re: Logic Question on Databinding and others

Heath Stewart, 23-Jun-04 10:39

What are you using for a control to display the contacts?

Yes, there is a way to filter results. See the DataView class, which you create over a DataTable (or modify DataTable.DefaultView). But you may not have to.
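
For reference, the DataView route could look roughly like the sketch below. The table and column names (Contact, ContactName, CompanyId) and the sample rows are assumptions for illustration, not taken from your database.

```csharp
using System;
using System.Data;

public static class ContactFilter
{
    // Builds a small in-memory contact table (hypothetical schema and data).
    public static DataTable BuildContacts()
    {
        DataTable contacts = new DataTable("Contact");
        contacts.Columns.Add("ContactName", typeof(string));
        contacts.Columns.Add("CompanyId", typeof(int));
        contacts.Rows.Add("Alice", 1);
        contacts.Rows.Add("Bob", 2);
        contacts.Rows.Add("Carol", 1);
        return contacts;
    }

    // Returns a view over the table that shows only one company's contacts.
    public static DataView ForCompany(DataTable contacts, int companyId)
    {
        DataView view = new DataView(contacts);
        view.RowFilter = "CompanyId = " + companyId;
        return view;
    }

    public static void Main()
    {
        DataView view = ForCompany(BuildContacts(), 1);

        // Only the rows that pass the filter are visible through the view.
        foreach (DataRowView row in view)
            Console.WriteLine(row["ContactName"]);
    }
}
```

The view can be used as a data source in place of the table, and the underlying rows stay untouched.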

If you use a DataGrid to display the contacts, you set the DataSource property to the DataSet, and set the DataMember to the "table.relationshipName", where "table" is your table name and "relationshipName" is your company<-contact relationship name. I know this works with two DataGrids, and it should work with bound TextBoxes since they use the same currency manager.

When using two DataGrids, they both have the DataSource property set to the same DataSet. The master grid sets the DataMember to the table name only. The detail grid does what I outlined above. When you select a record in the master grid, the current item is used to automatically filter (via the DataRelation) the detail grid. Since this all ties back to the CurrencyManager, a bound TextBox should work, too.
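
Setting the grids aside, the underlying DataRelation can be sketched in memory as below. The table, column and relation names (Company, Contact, CompanyId, CompanyContacts) and the rows are hypothetical; with these names the detail grid's DataMember would be "Company.CompanyContacts".

```csharp
using System;
using System.Data;

public static class MasterDetail
{
    // Builds a DataSet with a company table, a contact table and a relation
    // between them. All names and rows are made up for this sketch.
    public static DataSet BuildDataSet()
    {
        DataSet ds = new DataSet();

        DataTable company = ds.Tables.Add("Company");
        company.Columns.Add("CompanyId", typeof(int));
        company.Columns.Add("Name", typeof(string));

        DataTable contact = ds.Tables.Add("Contact");
        contact.Columns.Add("ContactName", typeof(string));
        contact.Columns.Add("CompanyId", typeof(int));

        // Parent column first, then child column.
        ds.Relations.Add("CompanyContacts",
            company.Columns["CompanyId"], contact.Columns["CompanyId"]);

        // Parent rows go in first so the child rows satisfy the relation.
        company.Rows.Add(1, "Acme");
        company.Rows.Add(2, "Globex");
        contact.Rows.Add("Alice", 1);
        contact.Rows.Add("Bob", 2);
        contact.Rows.Add("Carol", 1);

        return ds;
    }

    public static void Main()
    {
        DataSet ds = BuildDataSet();
        DataRow acme = ds.Tables["Company"].Rows[0];

        // The relation returns only the contacts that belong to this company.
        foreach (DataRow child in acme.GetChildRows("CompanyContacts"))
            Console.WriteLine(child["ContactName"]);
    }
}
```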

Here's to hoping! [beer]

Let me know. I don't have time to throw something together right now and have never had a need to try it this particular way.

