|
I suggest you get a beginner's book and start from chapter one. This falls under basic IO functions, followed by using the System.Diagnostics.Process class to execute the batch file you created.
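As a rough illustration of those two steps, here is a minimal sketch that writes a batch file with basic file IO and then runs it through System.Diagnostics.Process. The class name `BatchRunner`, the file name `example.bat`, and the choice to launch it via `cmd.exe /c` are assumptions made for the example, not something taken from the original question, and it only runs on Windows.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class BatchRunner
{
    // Writes the given commands to a batch file (basic file IO).
    // "example.bat" is a hypothetical file name chosen for this sketch.
    public static string WriteBatchFile(string directory, string[] commands)
    {
        string path = Path.Combine(directory, "example.bat");
        File.WriteAllLines(path, commands);
        return path;
    }

    // Builds the start info for running the batch file through cmd.exe.
    public static ProcessStartInfo BuildStartInfo(string batchPath)
    {
        return new ProcessStartInfo
        {
            FileName = "cmd.exe",
            Arguments = "/c \"" + batchPath + "\"",
            UseShellExecute = false,        // required for output redirection
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };
    }

    // Runs the batch file and returns what it wrote to standard output.
    public static string Run(string batchPath)
    {
        using (Process process = Process.Start(BuildStartInfo(batchPath)))
        {
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return output;
        }
    }
}
```

A call such as `BatchRunner.Run(BatchRunner.WriteBatchFile(Path.GetTempPath(), new[] { "@echo off", "echo hello" }))` would then return the text printed by the batch file.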
|
|
|
|
|
Hi my friends. I want to do term weighting using an approach called the composite measure (TF*IDF), which has its own formula.
What I want to do is read files (Word 2003, Word 2007, PDF, etc.) and then get each word into an array. I can read all files line by line with the following code:
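
try
{
    // The third argument lets the reader detect a byte-order mark and override ASCII.
    using (StreamReader sr = new StreamReader(textBox1.Text, Encoding.ASCII, true))
    {
        string line;
        while ((line = sr.ReadLine()) != null)
        {
            richTextBox1.Text = richTextBox1.Text + Environment.NewLine + line;
        }
    }
}
catch (IOException ex)
{
    // A try block needs a catch (or finally) to compile; report read errors here.
    MessageBox.Show(ex.Message);
}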
To check the content of the file I tried to display it in richTextBox1, but what it shows looks encrypted.
What I want to know is:
1. How can I put each word (separated by spaces and newlines) into an array, just to know each word? (Displaying the content is not necessary here.)
2. How can I display the content of each file in richTextBox1, but not in an encrypted form?
Thanks for your help.
|
|
|
|
|
The first thing you need to know is: Word and PDF files (as well as many others) are not necessarily line-based text documents. Instead they are frequently binary files, which explains why they look encrypted.
The second thing you need to know is: Word and PDF files can be encrypted. So that may be why they look encrypted.
Google for "Word File Format" and "PDF File Format". This will provide a starting point for you to gather the information you are going to need.
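One quick way to see that such a file is a binary format rather than plain text is to look at its first few bytes. The sketch below checks for three well-known signatures: `%PDF` for PDF, `PK\x03\x04` for the ZIP container used by Word 2007 .docx files, and the OLE compound file header used by Word 2003 .doc files. The class and method names are made up for this example, and it only identifies the container; it does not extract any text.

```csharp
using System;
using System.IO;
using System.Linq;

static class FileSniffer
{
    static readonly byte[] PdfMagic = { 0x25, 0x50, 0x44, 0x46 };   // "%PDF"
    static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };   // "PK\x03\x04", used by .docx
    static readonly byte[] OleMagic =                               // legacy .doc container
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Classifies a file by the bytes at its very start.
    public static string Describe(byte[] header)
    {
        if (StartsWith(header, PdfMagic)) return "PDF";
        if (StartsWith(header, ZipMagic)) return "ZIP container (e.g. Word 2007 .docx)";
        if (StartsWith(header, OleMagic)) return "OLE compound file (e.g. Word 2003 .doc)";
        return "unknown (possibly plain text)";
    }

    // Reads up to eight bytes from the file and classifies them.
    public static string DescribeFile(string path)
    {
        byte[] header = new byte[8];
        int read;
        using (FileStream fs = File.OpenRead(path))
        {
            read = fs.Read(header, 0, header.Length);
        }
        return Describe(header.Take(read).ToArray());
    }

    static bool StartsWith(byte[] data, byte[] prefix)
    {
        return data.Length >= prefix.Length
            && data.Take(prefix.Length).SequenceEqual(prefix);
    }
}
```

Calling `FileSniffer.DescribeFile(textBox1.Text)` before reading the file with a StreamReader would tell you whether line-by-line reading has any chance of producing readable text.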
All those who believe in psycho kinesis, raise my hand.
|
|
|
|
|
So how can I decrypt them? Or can you tell me how to work with them, please?
You can see what I am after in my first question.
Thank you
|
|
|
|
|
Ummm...[^]
.45 ACP - because shooting twice is just silly ----- "Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997 ----- "The staggering layers of obscenity in your statement make it a work of art on so many levels." - J. Jystad, 2001
|
|
|
|
|
Got nothing ... just http://75.11.0.157/homenet/stupid.htm on the title bar. What is that supposed to mean?
|
|
|
|
|
Okay, how about this one[^]?
|
|
|
|
|
I couldn't open them either, using Chrome. It works in FF, though.
I are Troll
|
|
|
|
|
It's a picture and a caption - nothing special, and certainly nothing exotic. If Chrome can't open something that simple, I'd certainly entertain the idea of using one of the alternative browsers for my regular web-browsing pleasures...
|
|
|
|
|
John Simmons / outlaw programmer wrote: It's a picture and a caption - nothing special, and certainly nothing exotic.
content="IE=EmulateIE7" ..and some css to position all that
John Simmons / outlaw programmer wrote: If Chrome can't open something that simple, I'd certainly entertain the idea of using one of the alternative browsers for my regular web-browsing pleasures...
You could also entertain the idea of installing multiple browsers. It's not a marriage, and I'm not going to commit to a single system.
|
|
|
|
|
Well, IE 6 and 8 show it just fine without any special compatibility tags.
|
|
|
|
|
Gets my five!
|
|
|
|
|
Decrypt: First, find out the password...
Because they aren't stored as straight text, you can't just read them and identify the words. The files contain heaps of other stuff: font, size, colour, location, lines, boxes, italics, bold, pictures, spreadsheets, and so on. If all you are interested in is the text of the document and doing some textual analysis, then the best thing you can do is to throw away as much of the formatting as possible, and save the Word and/or PDF document as a straight .TXT file. You can then read the whole thing in, and use string.Split (with space and reasonable punctuation) to break it into words.
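For a plain .TXT file, that approach could look roughly like the sketch below. The separator list (whitespace plus a handful of punctuation marks) is an assumption and certainly incomplete; `WordSplitter` and its method names are invented for the example.

```csharp
using System;
using System.IO;

static class WordSplitter
{
    // Whitespace plus some common punctuation; extend as needed for your texts.
    static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '(', ')', '"' };

    // Breaks a string into words, dropping the empty entries that
    // appear when two separators are adjacent (e.g. ", " or "\r\n").
    public static string[] SplitWords(string text)
    {
        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }

    // Reads a whole text file and returns its words as an array.
    public static string[] ReadWords(string path)
    {
        return SplitWords(File.ReadAllText(path));
    }
}
```

With the form from the original question, `string[] words = WordSplitter.ReadWords(textBox1.Text);` would give the array of words without displaying anything.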
|
|
|
|
|
Or forget about the PDF and Word files. How do I gather the words of a plain text document into an array?
Thank you
|
|
|
|
|
Use string.Split with a blank space as your separator to populate an array with each individual word. If the text spans several lines, add the newline characters to the separators as well.
Check out the documentation[^] for more basic string manipulation.
|
|
|
|
|
Each document has a specific structure. Word documents and PDF files can't simply be "read" as text, because your program doesn't know how to interpret them. Those documents contain extra information like "this part of the text is in bold formatting" and "this part is in red". All that information is stored in between the words that you see when you open the thing in Word.
CoderForEver wrote: 1. How can I put each words(separated by Space and newline) in to array ... just to know each word (here displaying the content is not necessary)
You can't until you have something to decode the file. You can save Word-files as RTF. Take a look at the result with a text-editor, and you'll see where the extra codes are located. You can also save the file as HTML. Again, a coded form, just like the binary representation.
|
|
|
|
|
Eddy Vluggen wrote: You can save Word-files as RTF
So can I read this RTF file and then display it in the RichTextBox? Or is something else still needed?
Thank you for your help
|
|
|
|
|
|
CoderForEver wrote: So can I read this RTF file .... then display it on Richtext box ?
Yup. The same method can be used to read plain text files. If you want to read another format, then you'll have to provide methods to read those formats.
Reading Word-files directly is a fair bit more complex.
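As a minimal sketch of loading either format into the control, assuming a Windows Forms project: `RichTextBox.LoadFile` takes a stream type that tells it whether to interpret RTF codes or show the text as-is. The helper name `RichTextLoader` and the decision to switch on the `.rtf` extension are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Windows.Forms;

static class RichTextLoader
{
    // Loads a file into the box, interpreting RTF codes for .rtf files
    // and treating everything else as plain text.
    public static void Load(RichTextBox box, string path)
    {
        bool isRtf = string.Equals(
            Path.GetExtension(path), ".rtf", StringComparison.OrdinalIgnoreCase);

        box.LoadFile(path, isRtf
            ? RichTextBoxStreamType.RichText
            : RichTextBoxStreamType.PlainText);
    }
}
```

In the form from the original question this would be called as `RichTextLoader.Load(richTextBox1, textBox1.Text);`. After loading, `richTextBox1.Text` holds the text without the formatting codes, which can then be split into words.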
|
|
|
|
|
One way to do this would be to use the Index Server IFilter approach and read the words this way, outlined here[^].
"WPF has many lovers. It's a veritable porn star!" - Josh Smith As Braveheart once said, "You can take our freedom but you'll never take our Hobnobs!" - Martin Hughes.
|
|
|
|
|
Hi All,
I am writing an application in which I need to check whether a class is user-defined or built in (C# or .NET).
Can someone help me with this?
Thanks,
AksharRoop
|
|
|
|
|
This might not be the ideal solution, but you can get the namespace of a type. You could check whether the class is part of the "System" or "Microsoft" namespaces.
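A sketch of that namespace check is shown below. It is only a heuristic: nothing stops third-party or user code from declaring a namespace that starts with "System" or "Microsoft", so a `true` result does not prove the type ships with the framework. The class and method names are invented for the example.

```csharp
using System;

static class TypeOrigin
{
    // Returns true when the type's namespace is "System", "Microsoft",
    // or nested below one of them. Types without a namespace return false.
    public static bool LooksLikeFrameworkType(Type type)
    {
        string ns = type.Namespace ?? string.Empty;
        return ns == "System"
            || ns.StartsWith("System.", StringComparison.Ordinal)
            || ns == "Microsoft"
            || ns.StartsWith("Microsoft.", StringComparison.Ordinal);
    }
}
```

For instance, `TypeOrigin.LooksLikeFrameworkType(typeof(string))` is true because `string` lives in the `System` namespace, while a class declared in your own namespace yields false.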
|
|
|
|
|
Thanks, but I need a better solution, if there is one.
|
|
|
|
|
There is no way to reliably tell. All classes in the .NET Framework are "user defined".
|
|
|
|
|
Hi,
If you write the classes yourself, you can mark them with an attribute and use that to determine it.
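A minimal sketch of the attribute idea follows. `UserDefinedAttribute` and the sample class `Invoice` are hypothetical names; the approach only works for classes you control and remember to mark.

```csharp
using System;

// Marker attribute with no data; its presence is the only information.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct, Inherited = false)]
sealed class UserDefinedAttribute : Attribute { }

// Example of a class marked as one of your own.
[UserDefined]
class Invoice { }

static class TypeMarker
{
    // Returns true when the type itself carries the marker attribute.
    public static bool IsMarkedUserDefined(Type type)
    {
        return Attribute.IsDefined(type, typeof(UserDefinedAttribute), false);
    }
}
```

Here `TypeMarker.IsMarkedUserDefined(typeof(Invoice))` is true and `TypeMarker.IsMarkedUserDefined(typeof(string))` is false, since framework types never carry your marker.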
|
|
|
|