|
Hi all, I need help removing stop words. I have an ArrayList containing about 10,000 words, one word per index, and I have to remove every occurrence of some 300 words, which are stored in a second ArrayList. I am currently trying to do it like this:
for (int i = 0; i < stopWords.Count; i++)
{
    while (totalWords.Contains(stopWords[i]))
        totalWords.Remove(stopWords[i]);
}
I have also tried it this way:
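for (int i = 0; i < totalWords.Count; i++)
{
    for (int j = 0; j < stopWords.Count; j++)
    {
        if (totalWords[i].Equals(stopWords[j]))
        {
            totalWords.RemoveAt(i);
            i--;     // re-check the element that moved into slot i
            break;   // leave the inner loop; otherwise totalWords[i] fails when i is -1
        }
    }
}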
But both of these methods take ages to complete, so could anyone please suggest a better and more efficient approach? In the code above, totalWords is the ArrayList that contains all the words, and stopWords is the ArrayList that contains the words to be removed.
Looking forward to your help.
Regards,
-- modified at 8:09 Thursday 18th May, 2006
|
|
|
|
|
It's probably a combination of two things: you're doing a linear search through a large list (which can lead to 300*10000 comparisons), and the ArrayList is reorganized each time you remove an element.
Two things to improve:
1. Use BinarySearch (log n instead of n comparisons)
a) Sort the list with: totalWords.Sort();
b) Call totalWords.BinarySearch(stopWords[i]) to get the index of a matching item (you'll get a negative number if it is not found)
2. Instead of removing from the existing list, create a new one and add the elements that are not in the stop word list. This reduces the reorganization overhead of the ArrayList.
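To make the return value of BinarySearch concrete, here is a minimal sketch. The class name, method name and sample words are made up for illustration; only ArrayList.Sort and ArrayList.BinarySearch come from the framework.

```csharp
using System;
using System.Collections;

static class BinarySearchDemo
{
    // Returns true when 'word' is present in the (already sorted) list.
    public static bool ContainsSorted(ArrayList sortedWords, string word)
    {
        // BinarySearch returns the index of a matching item, or a negative
        // number (the bitwise complement of the insertion point) if none exists.
        return sortedWords.BinarySearch(word) >= 0;
    }

    static void Main()
    {
        ArrayList words = new ArrayList(new string[] { "pear", "apple", "fig" });
        words.Sort();   // BinarySearch only gives valid results on a sorted list

        Console.WriteLine(ContainsSorted(words, "fig"));
        Console.WriteLine(ContainsSorted(words, "kiwi"));
    }
}
```

Testing for "less than zero" rather than "equal to -1" matters, because the value returned for a missing item depends on where it would be inserted.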
|
|
|
|
|
Thanks, sir, but the problem is that I can't sort the ArrayList, because it contains the words from different documents. If I sort it, I can no longer keep track of which word occurs in which document, and since I can't sort the ArrayList I can't apply the binary search.
So please suggest another solution.
Looking forward to your help.
Regards,
|
|
|
|
|
Then at least follow the other point:
ArrayList newTotalWords = new ArrayList(totalWords.Count);
for (int i = 0; i < totalWords.Count; i++)
{
    if (!stopWords.Contains(totalWords[i]))
    {
        newTotalWords.Add(totalWords[i]);
    }
}
If you can sort the stop word list, then you can even use BinarySearch here:
stopWords.Sort();
ArrayList newTotalWords = new ArrayList(totalWords.Count);
for (int i = 0; i < totalWords.Count; i++)
{
    if (stopWords.BinarySearch(totalWords[i]) < 0)
    {
        newTotalWords.Add(totalWords[i]);
    }
}
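For anyone who wants to try the second variant in isolation, here is a self-contained sketch of it. The class name, method name and sample data are hypothetical; the approach is the one described above (sort the stop words, binary-search each word, copy the survivors to a new list).

```csharp
using System;
using System.Collections;

static class StopWordFilter
{
    // Returns a new list with the words of totalWords, in their original
    // order, minus every word that appears in stopWords.
    public static ArrayList RemoveStopWords(ArrayList totalWords, ArrayList stopWords)
    {
        // Work on a sorted copy so the caller's stop word list is not reordered.
        ArrayList sortedStops = new ArrayList(stopWords);
        sortedStops.Sort();

        ArrayList kept = new ArrayList(totalWords.Count);
        foreach (object word in totalWords)
        {
            // A negative result means the word is not a stop word.
            if (sortedStops.BinarySearch(word) < 0)
                kept.Add(word);
        }
        return kept;
    }

    static void Main()
    {
        ArrayList words = new ArrayList(new string[] { "the", "cat", "and", "the", "dog" });
        ArrayList stops = new ArrayList(new string[] { "the", "and" });

        foreach (object word in RemoveStopWords(words, stops))
            Console.WriteLine(word);
    }
}
```

Because totalWords itself is never sorted or modified, the word order (and with it the document boundaries) stays intact.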
|
|
|
|
|
Thanks a lot, sir, it's working really well, even better than I expected. I used the second option: sorting the stop word list and then applying a binary search over it.
Now I have another, similar problem. After removing the stop words I have to build an inverted index of the remaining words, i.e. keep a record of which documents contain a certain word and how many times the word occurs in each of those files. I have done that, but again time is the major problem; it takes a very long time. Below is the code I am using.
temp keeps track of the current document number.
wIndex is the index into the array of Terminology objects; each object of this class keeps all the info about a certain term.
for(int i = 0; i < wordList.Count; i++)
{
    if (wordList[i].ToString().Equals(EOF))
    {
        temp++;
        continue;
    }
    word[wIndex] = new Terminology();
    word[wIndex].term = wordList[i].ToString();
    termCount = 1;
    cDocNo = temp;
    jtemp = 0;
    for(int j = i+1; j < wordList.Count; j++)
    {
        if(i == j)
            continue;
        if(cDocNo >= 1 && jtemp == 0 && temp >=1 )
        {
            jtemp++;
            for(int k = 0; k< cDocNo;k++)
            {
                word[wIndex].tf.Add(0);
                word[wIndex].docID.Add(k+1);
            }
        }
        if(wordList[j].ToString().Equals(EOF))
        {
            cDocNo++;
            word[wIndex].tf.Add(termCount);
            word[wIndex].docID.Add(cDocNo);
            if(termCount >= 1)
            {
                word[wIndex].df++;
                docCount++;
                termCount = 0;
            }
            continue;
        }
        if(wordList[i].Equals(wordList[j]))
        {
            wordList.RemoveAt(j);
            termCount++;
            j--;
        }
    }
    wIndex++;
}
Any suggestions to improve the efficiency of this code are welcome.
Looking forward to your help.
Regards,
-- modified at 10:34 Thursday 18th May, 2006
|
|
|
|
|
Could you please correct your post? There seem to be some errors in it, like:
for(int j = i+1;j
I could guess what should be there, but it's easier if you repost.
|
|
|
|
|
I have modified the code, please check it.
|
|
|
|
|
To be honest, I have not fully understood your code (and I currently do not have the time to invest in this). First of all, you should check that it is really doing what you expect (probably with some small sample data).
The only thing I can advise is to avoid using RemoveAt on the wordList. For your understanding: if you remove the first element of an ArrayList, it copies all the other elements internally (which in this case means copying 9999 words).
Also, you again have the problem of not being able to use BinarySearch. You could try to reorganize your data. Instead of having one big list you could have a separate list for each document. You could then sort each of those without losing the reference to its document and do some BinarySearches.
I will probably have some time later on. As performance tuning is fun for me, you could send me the complete code by mail (along with some sample data).
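The reorganization suggested above (one sorted list per document, then BinarySearch) could look roughly like the sketch below. Everything here is an assumption for illustration: the class and method names are made up, the "EOF" marker string stands in for the poster's EOF constant, and an ordinal comparer is used so that sorting and searching agree exactly. It is not a drop-in replacement for the Terminology code.

```csharp
using System;
using System.Collections;

static class PerDocumentIndex
{
    // Splits one flat word list into one sorted ArrayList per document.
    // 'eofMarker' is the string that ends each document in the flat list.
    public static ArrayList SplitAndSort(ArrayList wordList, string eofMarker)
    {
        ArrayList documents = new ArrayList();
        ArrayList current = new ArrayList();
        foreach (object item in wordList)
        {
            string word = item.ToString();
            if (word.Equals(eofMarker))
            {
                current.Sort(StringComparer.Ordinal);
                documents.Add(current);
                current = new ArrayList();
            }
            else
            {
                current.Add(word);
            }
        }
        if (current.Count > 0)   // last document without a trailing marker
        {
            current.Sort(StringComparer.Ordinal);
            documents.Add(current);
        }
        return documents;
    }

    // Counts how often 'term' occurs in one sorted document.
    public static int TermFrequency(ArrayList sortedDocument, string term)
    {
        int hit = sortedDocument.BinarySearch(term, StringComparer.Ordinal);
        if (hit < 0)
            return 0;

        // Equal words sit next to each other in a sorted list,
        // so widen the hit to the left and to the right.
        int first = hit, last = hit;
        while (first > 0 && term.Equals(sortedDocument[first - 1])) first--;
        while (last < sortedDocument.Count - 1 && term.Equals(sortedDocument[last + 1])) last++;
        return last - first + 1;
    }

    static void Main()
    {
        ArrayList flat = new ArrayList(new string[] { "b", "a", "b", "EOF", "c", "b", "EOF" });
        ArrayList docs = SplitAndSort(flat, "EOF");
        for (int d = 0; d < docs.Count; d++)
            Console.WriteLine("doc " + (d + 1) + ": tf(b) = " + TermFrequency((ArrayList)docs[d], "b"));
    }
}
```

Nothing is ever removed from a list here, so the internal copying caused by RemoveAt disappears, and the document number is simply the position of the per-document list.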
|
|
|
|
|
As you might know, the .NET Compact Framework is a minimized version of the normal .NET Framework; enough has been cut out for it to run smoothly on devices with little memory.
The problem I am facing right now:
How can I ensure that only one instance of my application is running? Named mutexes are not supported in the .NET Compact Framework, so I have to find another way that's still safe.
Thanks for any help you can give,
Davy
|
|
|
|
|
I have a 500 by 400 form with a StatusStrip and two StatusStrip labels on it.
I show the time on one of the labels. Everything works fine so far.
But if I resize the form, the position of the time label stays the same.
Any other control on the form can be positioned with Dock or Anchor.
But if I use Anchor on the StatusStrip, it moves
and ends up somewhere in the middle of the form.
I want it at the bottom, but it should resize itself with the form.
How can I do this?
|
|
|
|
|
Leave the StatusStrip docked at the bottom. Set the Spring property of the left StatusLabel to true, and it will fill up the space in the StatusStrip and keep the right StatusLabel at the right end of the strip.
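In code, that setup could look like the minimal sketch below. The form class, label names and label texts are made up for illustration; in a real project the same properties would normally be set in the designer.

```csharp
using System;
using System.Drawing;
using System.Windows.Forms;

class StatusForm : Form
{
    public StatusForm()
    {
        ClientSize = new Size(500, 400);

        // A StatusStrip docks to the bottom of the form by default,
        // so it follows the form when the form is resized.
        StatusStrip strip = new StatusStrip();

        ToolStripStatusLabel message = new ToolStripStatusLabel("Ready");
        message.Spring = true;                          // takes up all free space
        message.TextAlign = ContentAlignment.MiddleLeft;

        // With the left label springing, this one is pushed to the right end.
        ToolStripStatusLabel clock = new ToolStripStatusLabel(DateTime.Now.ToShortTimeString());

        strip.Items.Add(message);
        strip.Items.Add(clock);
        Controls.Add(strip);
    }

    [STAThread]
    static void Main()
    {
        Application.Run(new StatusForm());
    }
}
```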
--
I've killed again, haven't I?
|
|
|
|
|
Hi all,
Let's say I have a few text boxes and a 'Submit' button. How can I make the Enter key always trigger that button?
I mean, if I have 3 text boxes and a Submit button, the button only becomes active (pressable) after pressing the TAB key 4 times.
I want the Submit button to be activated any time I hit Enter.
Thanks in advance.
|
|
|
|
|
You can try the following code in the TextBox's KeyPress event:
if (Convert.ToInt16(e.KeyChar) == 13)   // 13 is the Enter key
    button1.PerformClick();
rahul
|
|
|
|
|
Check the AcceptButton property of your form and select the button that you want to be pressed when the Enter key is hit.
I hope this helps.
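The same thing can be set in code. This is only a sketch with made-up control names and layout; the one line that matters is the AcceptButton assignment.

```csharp
using System;
using System.Drawing;
using System.Windows.Forms;

class SubmitForm : Form
{
    public SubmitForm()
    {
        TextBox nameBox = new TextBox();
        nameBox.Location = new Point(10, 10);

        Button submitButton = new Button();
        submitButton.Text = "Submit";
        submitButton.Location = new Point(10, 40);
        submitButton.Click += delegate { MessageBox.Show("Submitted: " + nameBox.Text); };

        Controls.Add(nameBox);
        Controls.Add(submitButton);

        // Enter now clicks submitButton whichever text box has the focus.
        AcceptButton = submitButton;
    }

    [STAThread]
    static void Main()
    {
        Application.Run(new SubmitForm());
    }
}
```

One thing to keep in mind: a multiline TextBox with AcceptsReturn set to true keeps the Enter key for itself.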
Regards,
|
|
|
|
|
As MSDN's docs say, Thread.Abort() throws an exception in the corresponding thread:
"Raises a ThreadAbortException in the thread on which it is invoked, to begin the process of terminating the thread. Calling this method usually terminates the thread."
Does it mean that I have to catch the exception, and perform an aborting routine?
Or perhaps I should leave the exception, and expect it to just terminate?
Thanks,
Shy.
|
|
|
|
|
You do not strictly have to catch it: an unhandled ThreadAbortException ends only the thread being aborted and does not bring up that nasty unhandled-exception dialog from .NET.
Catching it (or using a finally block) and gracefully releasing resources/doing cleanup is still the way to go.
|
|
|
|
|
You can catch the ThreadAbortException in the thread if you want to make it aware that it is being aborted.
There are three ways of keeping the thread from being aborted:
:: Go into an eternal loop after catching the exception to keep from exiting the catch block, as exiting the catch block automatically raises the exception again.
:: Go into an eternal loop in a finally block, as all pending finally blocks are executed before the thread is killed.
:: Call Thread.ResetAbort after catching the exception to cancel the abort. This is of course the preferred method if the intention really is to keep the thread alive.
In any other case, the thread will be aborted.
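A small sketch of that behaviour, assuming the classic .NET Framework (on .NET Core and later, Thread.Abort is not supported and throws PlatformNotSupportedException). The class name and the messages are made up for illustration.

```csharp
using System;
using System.Threading;

static class AbortDemo
{
    static void Worker()
    {
        try
        {
            while (true)
                Thread.Sleep(100);   // stand-in for real work
        }
        catch (ThreadAbortException)
        {
            Console.WriteLine("Abort requested, cleaning up.");
            // Thread.ResetAbort();  // uncomment to cancel the abort
        }   // without ResetAbort the exception is raised again right here
        finally
        {
            Console.WriteLine("finally block runs before the thread ends.");
        }

        Console.WriteLine("Only reached if ResetAbort was called.");
    }

    static void Main()
    {
        Thread worker = new Thread(Worker);
        worker.Start();
        Thread.Sleep(300);

        worker.Abort();   // raises ThreadAbortException inside Worker
        worker.Join();
        Console.WriteLine("Worker has ended; the main thread carries on.");
    }
}
```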
|
|
|
|
|
Hi,
Can anyone tell me how to convert the code below to C#?
int *ptr;
ptr = &array[0];
sum = avg(ptr, index);
I know that we can use the ref keyword, but how do I assign the address to ptr?
|
|
|
|
|
Hello Folks,
I am wondering if I can start a second application from a first one, something like shell execute but on Win CE.
I use C# on Win CE and try to write something like a bootloader. The bootloader starts and checks a few things and then should launch another application.
Does somebody know how to do that?
Thanks a lot,
Marco
|
|
|
|
|
You can try
System.Diagnostics.Process.Start(complete path of the application to run);
rahul
|
|
|
|
|
Thanks Rahul,
unfortunately, for whatever reason, I don't have the Process class under System.Diagnostics ...
Anyway lots of thanks for your help!
Marco
|
|
|
|
|
You can use the native calls ShellExecute or CreateProcess via DllImport.
In the class from which you wish to start another app, use the following namespace:
using System.Runtime.InteropServices;
Then use DllImport to state which DLL the function comes from and what it is called, and on the line immediately after it declare the function with all its parameters:
[DllImport("Coredll.dll", EntryPoint="TheFunction")]
public static extern void TheFunction(... all your params...);
In the above example, replace TheFunction with the actual name of the function you wish to import (in your case CreateProcess or ShellExecute), and replace the ...all your params... part with the correct parameters, which depend on which function you use.
I think CreateProcess is in the Coredll.dll lib and ShellExecute is in CeShell.dll, but you'll have to double-check that.
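As a rough sketch of what such a declaration could look like for CreateProcess: the signature below is the one commonly used from the Compact Framework, but treat it as an assumption and check it against the Windows CE SDK documentation. The class name, struct name, parameter names and the path in Main are all made up for illustration, and the code only makes sense on a Windows CE device.

```csharp
using System;
using System.Runtime.InteropServices;

static class Launcher
{
    // Filled in by CreateProcess with handles and IDs of the new process.
    [StructLayout(LayoutKind.Sequential)]
    struct ProcessInformation
    {
        public IntPtr hProcess;
        public IntPtr hThread;
        public int dwProcessId;
        public int dwThreadId;
    }

    // Assumed signature; most parameters are unused here and passed as zero.
    [DllImport("coredll.dll", SetLastError = true)]
    static extern bool CreateProcess(
        string applicationName, string commandLine,
        IntPtr processAttributes, IntPtr threadAttributes,
        bool inheritHandles, int creationFlags,
        IntPtr environment, IntPtr currentDirectory,
        IntPtr startupInfo, out ProcessInformation processInformation);

    [DllImport("coredll.dll")]
    static extern bool CloseHandle(IntPtr handle);

    // Starts the executable and returns true if the process was created.
    public static bool Launch(string exePath, string arguments)
    {
        ProcessInformation info;
        bool started = CreateProcess(exePath, arguments, IntPtr.Zero, IntPtr.Zero,
                                     false, 0, IntPtr.Zero, IntPtr.Zero,
                                     IntPtr.Zero, out info);
        if (started)
        {
            // The handles are not needed afterwards, so release them.
            CloseHandle(info.hThread);
            CloseHandle(info.hProcess);
        }
        return started;
    }

    static void Main()
    {
        // Hypothetical path, for illustration only.
        if (!Launch(@"\Program Files\MyApp\MyApp.exe", ""))
            Console.WriteLine("CreateProcess failed, error " + Marshal.GetLastWin32Error());
    }
}
```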
Friendly Regards,
Davy
|
|
|
|
|
Davy,
thanks a lot for your help!
I'm not quite familiar with the dll-import but I guess I'll figure that out.
Best regards,
Marco
|
|
|
|
|
Hi,
I've created a control which has a string field, accessed through a bindable property.
However, when the control is on a form and the BindingSource has a null value in the bound property, the value of the string is never updated.
I've tried using the advanced databinding properties to set an empty string when the incoming value is null, but no luck so far.
Any ideas? thanks in advance
|
|
|
|
|
I have an application where one of my dialogs is created when the program starts. I would like to call Form.Show() and Form.Hide() to show and hide the form.
When the user presses the system X (close) button, the form is disposed by default. How do I prevent that? I just want a Form.Hide() when the user does that, so that I can call Form.Show() again without creating a new instance of it.
_____________________________
...and justice for all
APe
|
|
|
|