Hi,
I want to scrape data from some websites. My code gets the data correctly the first time. When I run it again, the request still returns a body with status code 200, but the data does not look like what I got the first time. I think the site may be blocking my connection. How can I solve this?
What I have tried:
This is my code:
import requests
from lxml import html

url = "http://www.carparts.com/results/?N=0&Nr=AND%28universal%3A0%29&Ntk=Main&Ntx=mode+matchallany&Nty=1&PN=0+5727&VN=4294953018+4294962799+4294962221+4294957507+4294965468&universal=0"
request_headers = {
"Accept-Language": "en-US,en;q=0.5",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Cache-Control": "no-cache, no-store, must-revalidate,post-check=0, pre-check=0",
# Content-Length, Content-Type and Vary describe a body or a response, so a GET without a body should not send them.
"Connection": "keep-alive",
"Pragma":"no-cache"
}
pageSkimData = requests.get(url, headers=request_headers)
treeSkimData = html.fromstring(pageSkimData.content)
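
To make "does not look like the 1st time" concrete, the two bodies could be compared directly. This is only a diagnostic sketch, not a fix: the helper name describe_difference and its max_lines parameter are made up for illustration, and it uses only the standard library (difflib).

```python
import difflib


def describe_difference(first: bytes, second: bytes, max_lines: int = 20) -> dict:
    """Summarise how two response bodies differ (hypothetical helper)."""
    # Decode leniently so a bad byte in either body does not abort the comparison.
    first_lines = first.decode("utf-8", errors="replace").splitlines()
    second_lines = second.decode("utf-8", errors="replace").splitlines()

    # n=0 drops context lines, leaving only the lines that actually changed.
    diff = list(difflib.unified_diff(
        first_lines, second_lines,
        fromfile="first", tofile="second", lineterm="", n=0,
    ))

    return {
        "identical": first == second,
        "first_length": len(first),
        "second_length": len(second),
        "diff": diff[:max_lines],  # cap the output; real pages can be large
    }
```

Saving pageSkimData.content from the first run and from a later run, then passing both to describe_difference, should show whether the second body is a much shorter page (for example a block or captcha page) or the same page with only small dynamic parts changed.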