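I wrote this with scrapy and ran into a problem: when it runs, the items never pass through the pipelines file.
wincos is the project's main directory.
wincos/spiders/win4.py contains: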
import scrapy
from wincos.items import WincosItem
from scrapy.http import Request


class Win4Spider(scrapy.Spider):
    name = 'win4'
    allowed_domains = ['www.win4000.com']
    start_urls = ['http://www.win4000.com/meinvtag26_1.html']

    def parse(self, response):
        mtitem = WincosItem()
        mtitem['title'] = response.xpath("//a/img/@src").extract()  # title
        # http://www.win4000.com/meinv
        print("================")
        print(mtitem['title'])
        yield mtitem
        for i in range(1, 3):
            url = "http://www.win4000.com/meinvtag26_" + str(i) + ".html"
            print(url)
            yield Request(url, callback=self.parse)
The items file contains:
import scrapy


class WincosItem(scrapy.Item):
    title = scrapy.Field()
The pipelines file contains:
class WincosPipeline(object):
    def process_item(self, item, spider):
        print("===========88888888============")
        print(item)
        for i in range(0, len(item['title'])):
            print("===========666666============")
            print(item['title'][i])
        return item
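Running it produces data like {'title': ['all the images']}
But execution never enters the pipeline, and I can't tell where the problem is. I want the data to reach the pipeline so I can save it.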
1 wuyifar 2020-03-05 11:10:00 +08:00
Have you set ITEM_PIPELINES in settings.py? Try raising the priority a bit and see.
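For reference, a minimal sketch of what that setting might look like. It assumes the pipeline class sits in wincos/pipelines.py (the default project layout, not confirmed in the post), and the value 300 is only an illustrative priority. Scrapy skips any pipeline that is not listed here, which would match the symptom described.

```python
# settings.py (sketch): register the pipeline so Scrapy calls process_item().
# Assumption: WincosPipeline is defined in wincos/pipelines.py.
ITEM_PIPELINES = {
    # Values are conventionally in the 0-1000 range; lower values run first.
    "wincos.pipelines.WincosPipeline": 300,
}
```

With a single pipeline the exact number matters little; it only decides the order when several pipelines are enabled.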
4 Dustyposa 2020-03-05 17:04:19 +08:00
Use `Path(name).write_bytes()` to save the images.
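A rough sketch of that suggestion, assuming the image bytes have already been downloaded by some other step. The helper name `save_image`, the `images` folder, and the choice to name the file after the last URL segment are all hypothetical, not from the thread.

```python
from pathlib import Path
from urllib.parse import urlparse


def save_image(url: str, data: bytes, out_dir: str = "images") -> Path:
    """Write already-downloaded image bytes into out_dir and return the path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    # Name the file after the last segment of the URL path (illustrative choice).
    name = Path(urlparse(url).path).name or "unnamed.bin"
    target = folder / name
    target.write_bytes(data)  # the call suggested in the reply
    return target
```

Inside `process_item` this could be called once per entry of `item['title']`, after the bytes for that URL have been fetched.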