A program to grab web information, but I don't know why the result is empty

Question

It's a very simple piece of code. I've downloaded the web page source file to my local machine, but I can't see where my code is wrong. Both values it prints are empty lists. I've checked the XPath expressions repeatedly, and they seem fine. The web page source is a bit large, at over 2000 lines, but it's manageable.

the website code download

What I have tried:

Python

from lxml import etree
 
 
def main():
 
    html = etree.HTML('toolify_202408.html')
    publish_month = html.xpath('/html/body/div/section/div[1]/h2/text()')
    app_name = html.xpath('/html/body/div/section/div[2]/div[1]/div/div[1]/div[1]/div/div/a/div/text()')
    print(publish_month, app_name)
 
 
if __name__ == '__main__':

Posted 30-Aug-24 21:47pm

Hui Shawn

Comments

[no name] 31-Aug-24 4:43am

It is not clear what you are trying to extract, but looking at the source of the page I cannot figure out where either of your paths is expected to lead.

OriginalGriff 31-Aug-24 9:23am

To the magic world of cut'n'paste, I suspect.

[no name] 31-Aug-24 9:38am

:)

boxsinger 19-Sep-24 3:29am

I understand your concern. It seems like the paths or links you're referring to might not be properly defined or are leading to unclear destinations. Looking at the page source, it’s a bit confusing as well. Could you provide more details or clarify what specific information you’re trying to access? This might help in pinpointing where the issue lies or how we can correct the paths.

Yvan Rodrigues 18-Sep-24 20:59pm

main() is never called in your code.
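
Building on the comment above: besides the missing call, `etree.HTML()` takes markup text, not a file name, so in the posted code the string 'toolify_202408.html' itself gets parsed as a tiny document and both XPath queries come back empty. Here is a minimal sketch of the difference, assuming lxml is installed. `SAMPLE`, `sample.html` and `read_heading` are made-up names for illustration, and the stand-in markup is far simpler than the real page.

```python
import os
import tempfile

from lxml import etree

# Hypothetical stand-in for the downloaded page (assumption: the real
# toolify_202408.html is much larger and structured differently).
SAMPLE = (
    "<html><body><div><section>"
    "<div><h2>August 2024</h2></div>"
    "</section></div></body></html>"
)

HEADING_XPATH = '/html/body/div/section/div[1]/h2/text()'


def read_heading(path):
    # etree.parse() opens and reads the file at `path`;
    # HTMLParser makes it tolerant of real-world HTML.
    tree = etree.parse(path, etree.HTMLParser())
    return tree.xpath(HEADING_XPATH)


def main():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'sample.html')
        with open(path, 'w', encoding='utf-8') as f:
            f.write(SAMPLE)

        # What the question does: the file NAME is parsed as if it were markup.
        wrong = etree.HTML(path).xpath(HEADING_XPATH)
        # Reading the file contents instead.
        right = read_heading(path)

    print(wrong, right)
    return wrong, right


if __name__ == '__main__':
    main()  # without this call nothing runs at all
```

The first list comes back empty and the second holds the heading text. For the real file the same idea would be `etree.parse('toolify_202408.html', etree.HTMLParser())` followed by the original XPath queries, with `main()` called under the `if __name__ == '__main__':` guard. Whether those absolute paths match the real page is a separate question.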


This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)