Saturday, June 27, 2015

Scrapy: multiple start URLs

I need to scrape a website for data. It has 10 categories with 180 pages each. So far I have managed to scrape the pages of one category like this: start_urls = ["http://ift.tt/1TW6ITG" % d for d in range(0, 180)]. That works and scrapes all 180 pages, but I need a way to pass the category names too.

I tried

start_urls = [
"http://ift.tt/1TW6ITG" % d for d in range(0, 180),
"http://ift.tt/1TW6HPs" % d for d in range(0, 180),
"http://ift.tt/1TW6HPu" % d for d in range(0, 180)
]

but it doesn't work (it fails with a Python error).

Any ideas?
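
One possible direction, as a minimal sketch: a single list comprehension can carry two `for` clauses, one over the categories and one over the page numbers, instead of several comprehensions separated by commas inside one pair of brackets. The real URL pattern is hidden behind the shortened links above, so `URL_TEMPLATE`, the category names, and the helper `build_start_urls` are all hypothetical placeholders.

```python
# Hypothetical URL template: one slot for the category, one for the page number.
# Replace it with the site's real pattern.
URL_TEMPLATE = "http://example.com/%s?page=%d"

# Hypothetical category names (the real site is said to have 10).
CATEGORIES = ["category-a", "category-b", "category-c"]


def build_start_urls(template, categories, pages):
    """Return one URL per (category, page) pair, grouped by category."""
    return [
        template % (category, page)
        for category in categories   # outer loop: each category
        for page in pages            # inner loop: each page of that category
    ]


start_urls = build_start_urls(URL_TEMPLATE, CATEGORIES, range(0, 180))
```

In a spider, the resulting list could be assigned to the `start_urls` class attribute. Building it in a module-level function sidesteps the scoping rule that stops a comprehension inside a class body from seeing other class-level names in Python 3. If each category really has a separate base URL, adding three separate comprehensions together with `+` would be an alternative.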
