It's a very simple piece of code. I've downloaded the web page source to my local machine, but I can't see where my code goes wrong: both values it prints are empty lists. I've checked the XPath expressions repeatedly, and they look fine. The page source is fairly large (over 2,000 lines), but manageable.


the website code download

What I have tried:

Python
from lxml import etree
 
 
def main():
 
    html = etree.HTML('toolify_202408.html')
    publish_month = html.xpath('/html/body/div/section/div[1]/h2/text()')
    app_name = html.xpath('/html/body/div/section/div[2]/div[1]/div/div[1]/div[1]/div/div/a/div/text()')
    print(publish_month, app_name)
 
 
if __name__ == '__main__':
Comments
[no name] 31-Aug-24 4:43am    
It is not clear what you are trying to extract, but looking at the source of the page I cannot figure where either of your paths are expected to lead.
OriginalGriff 31-Aug-24 9:23am    
To the magic world of cut'n'paste, I suspect.
[no name] 31-Aug-24 9:38am    
:)
boxsinger 19-Sep-24 3:29am    
I understand your concern. It seems the paths you're referring to might not be properly defined, or they lead to unclear destinations. Looking at the page source, it's a bit confusing as well. Could you provide more details or clarify what specific information you're trying to access? This might help in pinpointing where the issue lies or how the paths can be corrected.
Yvan Rodrigues 18-Sep-24 20:59pm    
main() is never called in your code.
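
Besides the missing call, there is a second likely cause of the empty lists: etree.HTML() expects markup text, not a file name, so etree.HTML('toolify_202408.html') parses the name itself as a tiny document and the XPath queries never see the downloaded page. Below is a minimal sketch of reading the file with etree.parse() and an HTMLParser instead. The SAMPLE_HTML string, the sample.html file name and the parse_file helper are assumptions made up for illustration, since the real page structure is not shown in this thread; with the real file you would pass 'toolify_202408.html' to parse_file.

```python
import os
import tempfile

from lxml import etree

# Hypothetical stand-in for the downloaded page; the real markup will differ.
SAMPLE_HTML = (
    "<html><body><div><section>"
    "<div><h2>August 2024</h2></div>"
    "</section></div></body></html>"
)


def parse_file(path):
    # etree.parse() reads from a file path; HTMLParser tolerates real-world HTML.
    return etree.parse(path, etree.HTMLParser())


def main():
    # Passing a file name to etree.HTML() parses the name itself as markup text,
    # so this query finds nothing.
    wrong = etree.HTML("toolify_202408.html")
    print(wrong.xpath("/html/body/div/section/div[1]/h2/text()"))

    # Write the sample to a temporary file and parse it from disk.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.html")
        with open(path, "w", encoding="utf-8") as f:
            f.write(SAMPLE_HTML)
        tree = parse_file(path)
        publish_month = tree.xpath("/html/body/div/section/div[1]/h2/text()")

    print(publish_month)
    return publish_month


if __name__ == "__main__":
    main()  # the call that was missing in the question
```

If the lists are still empty after both fixes, the absolute paths probably do not match the saved source. A shorter relative expression such as //h2/text() is usually less brittle than a long path copied from the browser.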

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)