What is a wrapper?
I want to use a parser in a C# program, but the parser I found is written in Java. I believe there is a wrapper for C#. How can I add it to my program?
Is it best to use a parser?
Parser link: http://htmlunit.sourceforge.net/
Comments
Sergey Alexandrovich Kryukov 11-Feb-13 1:55am    
I doubt this is a wrapper. Who told you so? Also, the page you referenced makes no mention of a wrapper.
What parser are you looking for? HTML? Is it well-formed XML or not?
—SA
e.v.r 11-Feb-13 2:08am    
Thank you for the comment.
This is HtmlUnit, a parser for Java, but I want one for C#. This parser has a wrapper.
Sergey Alexandrovich Kryukov 11-Feb-13 2:13am    
Who told you it does? Where? How?!
You need a pure .NET parser, that's it. Please see my answer.
—SA
e.v.r 11-Feb-13 2:22am    
The parser does not have the features that I want.
As for the wrapper, I noticed it in the documentation; nobody told me.
I tried the crawler and the parser, but I did not get the result I wanted.
I need a good crawler, and I felt it would work better with a parser.
What do you think?
Sergey Alexandrovich Kryukov 11-Feb-13 2:36am    
How can you mix up a crawler and a parser? You need to explain your ultimate goal, otherwise the discussion makes no sense at all...
—SA

Please see my comment to the question. I doubt you can simply use what you have referenced… You need to use something else.

If you want to parse HTML which is well-formed as XML (and only such HTML has a right to exist, but in real life… :-( ), this is the best case, as the .NET FCL has at least three different XML parsers (I can reference and overview them if you want, but you will easily find them). If this is not the case, you will need some HTML parser to do the dirty job. Try this one:
http://www.majestic12.co.uk/projects/html_parser.php
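
To illustrate the two cases above, here is a minimal sketch. It is written in Python with only the standard library rather than C#, purely to show the idea; the names `LinkCollector` and `extract_links` are made up for this example and do not come from any of the libraries mentioned in this thread. It tries a strict XML parser first and falls back to a tolerant HTML parser when the markup is not well-formed.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Tolerant HTML parser that collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(markup):
    """Return href values, using strict XML parsing when the markup allows it."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # Not well-formed XML: let the tolerant parser do the dirty job
        collector = LinkCollector()
        collector.feed(markup)
        collector.close()
        return collector.links
    # Well-formed case (this sketch assumes the document uses no XML namespace)
    return [a.get("href") for a in root.iter("a") if a.get("href")]
```

Called on `'<html><body><a href="/a">A</a></body></html>'` the strict path is taken; on markup with unclosed tags or unquoted attributes the fallback path is taken. A real page would also need character-encoding detection, which this sketch leaves out.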

[EDIT]

This is a problem of Web scraping: http://en.wikipedia.org/wiki/Web_scraping.

Please also see my past answers:
http://en.wikipedia.org/wiki/Web_scraping,
get specific data from web page,
How to get the data from another site.

—SA
 
Comments
Sergey Alexandrovich Kryukov 11-Feb-13 2:37am    
[OP commented:]

Hello
Thank you for answer.
Your link is good, but allow me to give a checklist of what I need from a parser.
For example:

- Output as a text file or an HTML DOM tree
- Ability to recognize the character encoding
- Parse JavaScript
- Interpret and execute Java
- Support the HTTP and HTTPS protocols
- Support cookies
- On failure, determine whether the server response should be treated as an exception or returned as a specific page (based on content)
- Support the POST and GET methods
- Written in C#
Sergey Alexandrovich Kryukov 11-Feb-13 2:39am    
First of all: please don't post your comment as an "answer".

Now I see: you don't really have a clue, not at all... And your purpose is very questionable. So, 1) you need to learn how Web and HTTP work, the role of HTML, etc; 2) you need to explain your ultimate purpose (and, next time, always start with it), otherwise there is nothing to talk about...

—SA
e.v.r 11-Feb-13 2:57am    
Thanks, and sorry, I am new to this site.
What should I explain? Do you mean my question is too broad?
Sergey Alexandrovich Kryukov 11-Feb-13 2:59am    
I advise you to explain your ultimate purpose. The idea is to abstract away from your own ideas about how to approach your goals, as they can be wrong.
—SA
e.v.r 11-Feb-13 3:03am    
I wrote a crawler that lists the links on a site, but it cannot find all the links correctly, so I decided to use a parser. However, I could not find a suitable parser, except the one I linked.
I also cannot use that one, because its code was difficult for me to understand.
I want to use this crawler in a scanner.
OK?
e.v.r. wrote:
What is the difference between a crawler and a parser?
All "difference" questions are inherently incorrect. The word can only be used as a figure of speech, when things are very similar. If some things have nothing to do with one another, you won't be able to define the notion of "difference". If you could, you would be able to answer "what's the difference between apple and Apple", but can you? :-)

And this is exactly the case. So, just learn what they are:

http://en.wikipedia.org/wiki/Parser,
http://en.wikipedia.org/wiki/WebCrawler.

The crawler has to use some parser, as its implementation detail, I guess. It depends, though.
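
To make the relationship concrete, here is a rough sketch of a crawler that uses a parser as an implementation detail. It is Python with only the standard library, not C#, and everything in it is an assumption made for illustration: `HrefParser`, `crawl`, the `fetch` callback (any function that returns the HTML for a URL, or `None` on failure), and the `max_pages` limit. The parser finds the links on one page; the crawler resolves them against the page URL, stays on the same host, and decides which page to visit next.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class HrefParser(HTMLParser):
    """The parser part: finds raw href values on a single page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl(start_url, fetch, max_pages=100):
    """The crawler part: breadth-first walk over same-host pages.

    Returns the URLs that were fetched successfully, in visiting order.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # fetch failed; skip this page
        visited.append(url)
        parser = HrefParser()
        parser.feed(html)
        parser.close()
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" parts
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP library: for a quick check you can hand it the `get` method of a dictionary that maps URLs to HTML strings. Relative links that are not resolved against the page URL are a common reason a hand-written crawler misses links.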

Clear enough, isn't it?

—SA
 
Comments
e.v.r 11-Feb-13 3:40am    
Oh yes, apple and Apple :). That's so funny.
Thank you
Sergey Alexandrovich Kryukov 11-Feb-13 3:45am    
Funny or not, but this is true. :-)
You are very welcome. Good luck, call again.
—SA
e.v.r 11-Feb-13 5:46am    
I have a question.
What technology is used in http://htmlagilitypack.codeplex.com/?
Is it for ASP.NET or not?
Can a crawler use it?
I have seen it used in many of these programs.
Thanks
Sergey Alexandrovich Kryukov 11-Feb-13 11:39am    
I don't know; and you can never say for sure.
—SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


