|
Try the "/data/page" XPath instead.
Otherwise, it might be the schema that removed the nodes. Does the schema that the data refers to exist?
Check the contents of the InnerXml property once the document is loaded, to see whether the page nodes get loaded at all.
---
b { font-weight: normal; }
|
Hmm, that didn't work, although the page nodes do get loaded.
The data that I am trying to process comes from the Wikipedia database dump (you know, the open source encyclopedia, see http://en.wikipedia.org/wiki/Main_Page).
I am practising on a small foreign-language XML dump, which can be downloaded from http://download.wikimedia.org/wikipedia/am/20051020_pages_current.xml.bz2 (the English Wikipedia dump is about 3 GB, so you don't exactly want to process that every time you test the program!).
I extracted the exact first line of the XML, which is:
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.3/ http://www.mediawiki.org/xml/export-0.3.xsd" version="0.3" xml:lang="am">
This includes links to various websites that explain the schema.
I find it strange that it works if I simply remove all the attributes from this first line. I tried removing them programmatically using the Attributes.RemoveAll() method, but as I found out it should really be called RemoveAllButOne(), so that doesn't work either.
Thanks for trying to help, really appreciated. Don't worry if you don't have the time to go any further though.
Martin
|
I see. When you show the complete tag, it's obvious.
You need to use an XmlNamespaceManager object along with the xml document to be able to reach the nodes that belong to a namespace.
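As a minimal sketch of that advice: the root element declares a default namespace, so every child element belongs to it and an unprefixed XPath such as "/mediawiki/page" matches nothing. The prefix "mw", the class name NamespaceDemo and the tiny inline document are assumptions made for illustration; only the namespace URI is taken from the tag quoted above.

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Namespace URI copied from the xmlns attribute of the dump's root element.
    private const string MediaWikiNs = "http://www.mediawiki.org/xml/export-0.3/";

    // Counts the page elements directly under the mediawiki root element.
    public static int CountPages(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // "mw" is an arbitrary prefix chosen here; it only has to match the XPath.
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("mw", MediaWikiNs);

        return doc.SelectNodes("/mw:mediawiki/mw:page", ns).Count;
    }

    public static void Main()
    {
        // Tiny stand-in for the real dump, using the same default namespace.
        string xml = "<mediawiki xmlns=\"" + MediaWikiNs + "\" version=\"0.3\">"
                   + "<page><title>A</title></page>"
                   + "<page><title>B</title></page>"
                   + "</mediawiki>";
        Console.WriteLine(CountPages(xml));
    }
}
```

The prefix used in the query does not have to appear in the XML file itself; it is the namespace URI registered with the manager that has to match.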
---
b { font-weight: normal; }
|
I am having trouble using the code from the tutorial posted here. The following is all of my code. I have commented what the original download version had in it...
using System;
using System.IO;
using org.pdfbox.pdmodel;
using org.pdfbox.util;

namespace Pdf2Text
{
    class Program
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            DateTime start = DateTime.Now;
            if (args.Length < 2)
            {
                //The following line is the way it is written in the downloaded version
                //Console.WriteLine("Usage: PDF2TEXT ");
                //I wrote this
                Console.WriteLine("Usage: PDF2TEXT MyPDF.pdf MYTEXT.txt");
                //return;
            }
            using (StreamWriter sw = new StreamWriter(args[1]))
            {
                sw.WriteLine(parseUsingPDFBox(args[0]));
            }
            Console.WriteLine("Done. Took " + (DateTime.Now - start));
            Console.ReadLine();
        }

        private static string parseUsingPDFBox(string input)
        {
            PDDocument doc = PDDocument.load(input);
            PDFTextStripper stripper = new PDFTextStripper();
            return stripper.getText(doc);
        }
    }
}
I get the following error message when I try to run it:
"An unhandled exception of type 'System.IndexOutOfRangeException' occurred in Pdf2Text.exe
Additional information: Index was outside the bounds of the array."
Can someone help!
Curt
|
You probably get the IndexOutOfRangeException when you use args[1] and args[1] is not available. I think that you should uncomment this return:
if (args.Length < 2) {
    Console.WriteLine("Usage: PDF2TEXT MyPDF.pdf MYTEXT.txt");
    return;
}
Hope this helps.
protected internal static readonly ... and I wish the list could continue ...
|
Thanks for responding! I did that, and now the console flashes up, then disappears, and nothing happens beyond that. I'm at a loss! If you have any more suggestions, please let me know! Thanks!
Curt
|
I think this happens because you supply an insufficient number of parameters. Try this:
if (args.Length < 2) {
    Console.WriteLine("Usage: PDF2TEXT MyPDF.pdf MYTEXT.txt");
    Console.ReadLine();

    return;
}
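The same check can be pulled into a small helper so that Main never touches args[1] unless it exists. This is only a sketch: the names ArgCheck and TryGetPaths are hypothetical and not part of the tutorial code, and the PDFBox call is replaced by a placeholder message.

```csharp
using System;

public static class ArgCheck
{
    // Returns true and fills in both paths only when two arguments were supplied.
    public static bool TryGetPaths(string[] args, out string pdfPath, out string textPath)
    {
        pdfPath = null;
        textPath = null;
        if (args == null || args.Length < 2)
        {
            return false;
        }
        pdfPath = args[0];
        textPath = args[1];
        return true;
    }

    public static int Main(string[] args)
    {
        string pdfPath, textPath;
        if (!TryGetPaths(args, out pdfPath, out textPath))
        {
            Console.Error.WriteLine("Usage: PDF2TEXT MyPDF.pdf MYTEXT.txt");
            return 1; // non-zero exit code signals the missing arguments
        }

        // Placeholder for the real conversion step.
        Console.WriteLine("Would convert {0} to {1}", pdfPath, textPath);
        return 0;
    }
}
```

When the program is started from Visual Studio rather than from a command prompt, no arguments are passed unless they are entered in the project's debug settings, which would explain the console window that flashes up and closes.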
protected internal static readonly ... and I wish the list could continue ...
|
I am using the System.Threading.Timer timer.
If I set the interval quite low, say 10 ms, my timer event handlers can end up queued because they cannot be processed while the UI thread is busy or another application is running. When the UI thread becomes free, it processes all the queued events at the same time.
Is it possible to determine programmatically the number of timer event handlers that are queued? I want to be able to detect when there is a build-up of queued event handlers.
Some code to demonstrate this would be useful.
Also, is it possible to limit the number of timer event handlers that are queued?
Thanks,
Liam
|
You could start a separate thread to do the actual work of the timer; that way the thread that is waiting for the event is never busy. This of course only works if the code is thread safe, as you could have several threads running at the same time.
You can use a counter to see how many threads are running. Increase the counter when a task starts and decrease it when it finishes.
If the code can't be made fully thread safe, you could queue the events yourself to be able to keep track of them. You would still use a separate thread to do the work (to keep the main thread responsive to the events), but you would not start a new thread until the previous one finishes.
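The counter idea could look roughly like the sketch below. It assumes the System.Threading.Timer from the question; the class name CountedTimerWork, the limit of 2 and the 50 ms of simulated work are invented for illustration. Note that it counts callbacks that have actually started, not work items still waiting inside the thread pool, which cannot be inspected this way.

```csharp
using System;
using System.Threading;

// Hypothetical helper: tracks how many timer callbacks are running at once
// and skips the work when a limit is exceeded.
public class CountedTimerWork
{
    private int running;   // callbacks currently inside OnTick
    private int dropped;   // ticks skipped because the limit was exceeded
    private readonly int maxRunning;
    private readonly Action work;

    public CountedTimerWork(int maxRunning, Action work)
    {
        this.maxRunning = maxRunning;
        this.work = work;
    }

    public int Running { get { return Volatile.Read(ref running); } }
    public int Dropped { get { return Volatile.Read(ref dropped); } }

    // Signature matches System.Threading.TimerCallback.
    public void OnTick(object state)
    {
        int now = Interlocked.Increment(ref running);
        try
        {
            if (now > maxRunning)
            {
                // Too many callbacks already in flight: count the tick and skip it.
                Interlocked.Increment(ref dropped);
                return;
            }
            work();
        }
        finally
        {
            Interlocked.Decrement(ref running);
        }
    }
}

public static class TimerCounterDemo
{
    public static void Main()
    {
        // Each tick simulates 50 ms of work while the timer fires every 10 ms.
        CountedTimerWork counted = new CountedTimerWork(2, delegate { Thread.Sleep(50); });
        using (Timer timer = new Timer(counted.OnTick, null, 0, 10))
        {
            Thread.Sleep(500);
            Console.WriteLine("running={0} dropped={1}", counted.Running, counted.Dropped);
        }
    }
}
```

Interlocked is used because the callbacks may run on several thread pool threads at the same time, so a plain ++ on the counter would not be safe.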
---
b { font-weight: normal; }
|
I found the question a little odd considering the Timer he's using doesn't use events. It uses a callback delegate executed on a separate thread out of the thread pool.
So, my question would be, does each tick get its own thread? If so, then in theory, a 10ms time interval could exhaust the thread pool if the callback code takes more than 10ms to execute. What happens to the Timer then? Does the callback get queued up waiting for the thread pool? Does this queue have an upper limit...the size of the stack, maybe?
RageInTheMachine9532
"...a pungent, ghastly, stinky piece of cheese!" -- The Roaming Gnome
|
OK, my terminology may be wrong. I use the following, which is why I used the wording "event handler":
tmrTimersTimer.Elapsed += new ElapsedEventHandler(tmrTimersTimerElapsedHandler);
No, each tick does not get its own thread. The callback code takes about 1-2 ms, so it should complete in plenty of time.
Now, if another thread has the processor when my timer elapses, the callback is queued, and the timer continues to queue callbacks. When my code gets the CPU back, it empties the queue and executes all the callbacks.
What I was hoping to achieve was to interrogate the queue that holds the timer callbacks.
|
OK. Now you've got me confused. Which timer are you using???
System.Threading.Timer - From your original post. Uses a callback delegate, not an event.
System.Timers.Timer - Uses an event called Elapsed .
RageInTheMachine9532
"...a pungent, ghastly, stinky piece of cheese!" -- The Roaming Gnome
|
Dave, sorry for the mixed-up information; I think I am confusing myself too.
What I have found is that both of these timers:
System.Threading.Timer
System.Timers.Timer
will queue the "handler code" if it cannot be processed immediately. The "handler code" is then processed when CPU time is available.
Whereas
System.Windows.Forms.Timer
does not queue, and the "handler code" is essentially lost.
So my question applies to both
System.Threading.Timer
System.Timers.Timer
Can I find out if there is a queue of "handlers" (either a callback delegate or an event handler) waiting to be processed?
|
Threading.Timer uses the ThreadPool to execute the callback code. You could try calling ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads to see how many threads are in use, but I doubt it'll be that accurate.
Timers.Timer uses events to notify your app of the timer tick, I THINK by sending WM_TIMER messages to your app's message pump. You might try looking into GetQueueStatus to see if it'll return the number of WM_TIMER messages you want.
There is no queue of callbacks maintained anywhere. They are executed by the ThreadPool, which does not keep track of the source, or type, of callback to be executed.
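A rough sketch of the ThreadPool check mentioned above. The class and method names are made up for illustration, and the figure covers every user of the pool in the process, not just the timer, so it is only an indication of load.

```csharp
using System;
using System.Threading;

public static class PoolUsage
{
    // Returns the number of thread pool worker threads currently in use,
    // computed as the maximum minus the number still available.
    public static int BusyWorkerThreads()
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);
        return maxWorkers - freeWorkers;
    }

    public static void Main()
    {
        Console.WriteLine("Busy worker threads: " + BusyWorkerThreads());
    }
}
```

A value that keeps climbing while a 10 ms timer is running would suggest that callbacks are taking longer than the interval, but it cannot say which callbacks are responsible.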
RageInTheMachine9532
"...a pungent, ghastly, stinky piece of cheese!" -- The Roaming Gnome
|
I'm trying to detect the interfaces directly implemented by a class. I've tried to use the Type.GetInterfaces method, but in a situation like this:
public interface I
{
    void f();
}
public interface Ii : I
{
    new void f();
    void f1();
}
public class CC : Ii
{
    public void f(){}
    public void f1(){}
}
the entire interface hierarchy is returned as a list (Ii and I instead of just Ii).
Is there any way to detect the relationship between a parent interface and a child interface? Or: is it possible to obtain only the interfaces directly implemented by a class?
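One possible approach, sketched here as a heuristic rather than a definitive answer: take everything Type.GetInterfaces returns and subtract the interfaces that are themselves inherited by another interface in the list or implemented by the base class. The helper name DirectInterfaces.Of is hypothetical. The heuristic cannot tell "class CC : Ii" apart from "class CC : Ii, I", and it will omit an interface that a derived class re-declares when its base class already implements it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface I { void f(); }
public interface Ii : I { new void f(); void f1(); }
public class CC : Ii
{
    public void f() { }
    public void f1() { }
}

public static class DirectInterfaces
{
    // Heuristic: interfaces of the type that are not inherited through another
    // interface in the list and are not implemented by the base class.
    public static Type[] Of(Type type)
    {
        Type[] all = type.GetInterfaces();

        // Interfaces reachable through some other interface, e.g. I via Ii.
        HashSet<Type> inherited = new HashSet<Type>(all.SelectMany(i => i.GetInterfaces()));

        // Interfaces that already come from the base class.
        if (type.BaseType != null)
        {
            inherited.UnionWith(type.BaseType.GetInterfaces());
        }

        return all.Where(i => !inherited.Contains(i)).ToArray();
    }

    public static void Main()
    {
        foreach (Type t in Of(typeof(CC)))
        {
            Console.WriteLine(t.Name);
        }
    }
}
```

The parent-child relationship between two interfaces can be checked the same way: typeof(Ii).GetInterfaces() contains typeof(I) because Ii derives from I.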
"quot capita, tot sententiæ"
rechi+
|
We have drawn a graphics board in the OnPaint event. Our assignment is that when we click in one of the rectangles of the board, an event is triggered that changes the color of that rectangle. We don't understand how to do this. The coordinates of the click are available through the MouseDown event.
How can these two events interact in order to achieve the color change?
|
Make the changes needed in your internal data to keep track of the color change, then call the Invalidate method on the control that shows the board. That raises a Paint event, so OnPaint runs and redraws the control.
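A minimal Windows Forms sketch of that flow, assuming an 8x8 board of 40-pixel squares drawn directly on a form. The class name BoardForm, the sizes and the colors are all made up for illustration. MouseDown only updates the data and calls Invalidate; all drawing stays in OnPaint.

```csharp
using System;
using System.Drawing;
using System.Windows.Forms;

public class BoardForm : Form
{
    private const int Cells = 8;      // assumed number of squares per side
    private const int CellSize = 40;  // assumed square size in pixels

    // Internal data: which squares have been clicked.
    private readonly bool[,] highlighted = new bool[Cells, Cells];

    public BoardForm()
    {
        ClientSize = new Size(Cells * CellSize, Cells * CellSize);
        DoubleBuffered = true; // reduces flicker while repainting
    }

    // Maps a pixel coordinate to a square index, or -1 when outside the board.
    public static int CellIndex(int pixel)
    {
        if (pixel < 0 || pixel >= Cells * CellSize) return -1;
        return pixel / CellSize;
    }

    protected override void OnMouseDown(MouseEventArgs e)
    {
        base.OnMouseDown(e);
        int col = CellIndex(e.X);
        int row = CellIndex(e.Y);
        if (col < 0 || row < 0) return;

        // Step 1: change the data, not the screen.
        highlighted[row, col] = !highlighted[row, col];

        // Step 2: ask Windows to repaint just that square.
        Invalidate(new Rectangle(col * CellSize, row * CellSize, CellSize, CellSize));
    }

    protected override void OnPaint(PaintEventArgs e)
    {
        base.OnPaint(e);
        for (int row = 0; row < Cells; row++)
        {
            for (int col = 0; col < Cells; col++)
            {
                // Clicked squares are red; the rest form a checkerboard.
                Brush brush = highlighted[row, col]
                    ? Brushes.Red
                    : ((row + col) % 2 == 0 ? Brushes.White : Brushes.Gray);
                e.Graphics.FillRectangle(brush, col * CellSize, row * CellSize, CellSize, CellSize);
            }
        }
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new BoardForm());
    }
}
```

The two events never call each other directly: the array is the only thing they share, which is why the paint code always shows the current state no matter what caused the repaint.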
---
b { font-weight: normal; }
|
Sorry sir/madam,
We are not at a professional level of development right now and didn't understand your response, though we do appreciate your concern and your willingness to help. Please try to explain in more detail; we really need some help.
|
What specifically was it in my response that you didn't understand?
---
b { font-weight: normal; }
|
Here is one example:
http://www.codeproject.com/csharp/chess.asp
Check this too, it is a pretty good example:
http://www.csharphelp.com/archives/archive246.html
|
Yes, but that is still too abstruse, considering that Valil chess is broken into so many classes and interfaces, and we don't understand the reasoning behind the fragmented code. We just need a simple answer to our question: how to trigger the color change on a mouse click on the chess board that is part of our project. Any help will be really appreciated.
|
Check this too, it is a pretty good example:
http://www.csharphelp.com/archives/archive246.html
|
Thank you for the response. Please understand that the aim of our current project is only to change the colour of a rectangle on the board when an event is triggered (JUST THAT FOR NOW). Thanks anyway.
|