An error when running IPProxyPool, a proxy pool project on GitHub - V2EX

    agua199408 · 2017-03-29 22:27:17 +08:00 · 1184 views

    I am running an open-source proxy pool project from GitHub, and under certain conditions it fails with the following error:

    Traceback (most recent call last):
      File "E:\proxy\IPProxyPool\spider\HtmlDownloader.py", line 18, in download
        r = requests.get(url=url, headers=config.get_header(), timeout=config.TIMEOUT)
      File "D:\anaconda\lib\site-packages\requests\api.py", line 70, in get
        return request('get', url, params=params, **kwargs)
      File "D:\anaconda\lib\site-packages\requests\api.py", line 56, in request
        return session.request(method=method, url=url, **kwargs)
      File "D:\anaconda\lib\site-packages\requests\sessions.py", line 488, in request
        resp = self.send(prep, **send_kwargs)
      File "D:\anaconda\lib\site-packages\requests\sessions.py", line 609, in send
        r = adapter.send(request, **kwargs)
      File "D:\anaconda\lib\site-packages\requests\adapters.py", line 479, in send
        raise ConnectTimeout(e, request=request)
    requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='proxy-list.org', port=443): Max retries exceeded with url: /english/index.php?p=9 (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x0000000006B95390>, 'Connection to proxy-list.org timed out. (connect timeout=5)'))

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "D:\anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1139, in _execute_context
        context)
      File "D:\anaconda\lib\site-packages\sqlalchemy\engine\default.py", line 450, in do_execute
        cursor.execute(statement, parameters)
    sqlite3.OperationalError: database is locked

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "D:\anaconda\lib\multiprocessing\process.py", line 249, in _bootstrap
        self.run()
      File "D:\anaconda\lib\multiprocessing\process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "E:\proxy\IPProxyPool\spider\ProxyCrawl.py", line 26, in startProxyCrawl
        crawl.run()
      File "E:\proxy\IPProxyPool\spider\ProxyCrawl.py", line 56, in run
        self.crawl_pool.map(self.crawl, parserList)
      File "D:\anaconda\lib\site-packages\gevent\pool.py", line 308, in map
        return list(self.imap(func, iterable))
      File "D:\anaconda\lib\site-packages\gevent\pool.py", line 102, in next
        raise value.exc
      File "D:\anaconda\lib\site-packages\gevent\greenlet.py", line 534, in run
        result = self._run(*self.args, **self.kwargs)
      File "E:\proxy\IPProxyPool\spider\ProxyCrawl.py", line 67, in crawl
        response = Html_Downloader.download(url)
      File "E:\proxy\IPProxyPool\spider\HtmlDownloader.py", line 27, in download
        proxylist = sqlhelper.select(10)
      File "E:\proxy\IPProxyPool\db\SqlHelper.py", line 127, in select
        return query.order_by(Proxy.score.desc(), Proxy.speed).limit(count).all()
      File "D:\anaconda\lib\site-packages\sqlalchemy\orm\query.py", line 2613, in all
        return list(self)
      File "D:\anaconda\lib\site-packages\sqlalchemy\orm\query.py", line 2761, in __iter__
        return self._execute_and_instances(context)
      File "D:\anaconda\lib\site-packages\sqlalchemy\orm\query.py", line 2776, in _execute_and_instances
        result = conn.execute(querycontext.statement, self._params)
      File "D:\anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 914, in execute
        return meth(self, multiparams, params)
      File "D:\anaconda\lib\site-packages\sqlalchemy\sql\elements.py", line 323, in _execute_on_connection
        return connection._execute_clauseelement(self, multiparams, params)
      File "D:\anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1010, in _execute_clauseelement
        compiled_sql, distilled_params
      File "D:\anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1146, in _execute_context
        context)
      File "D:\anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1341, in _handle_dbapi_exception
        exc_info
      File "D:\anaconda\lib\site-packages\sqlalchemy\util\compat.py", line 202, in raise_from_cause
        reraise(type(exception), exception, tb=exc_tb, cause=cause)
      File "D:\anaconda\lib\site-packages\sqlalchemy\util\compat.py", line 185, in reraise
        raise value.with_traceback(tb)
      File "D:\anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1139, in _execute_context
        context)
      File "D:\anaconda\lib\site-packages\sqlalchemy\engine\default.py", line 450, in do_execute
        cursor.execute(statement, parameters)
    sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked [SQL: 'SELECT proxys.ip AS proxys_ip, proxys.port AS proxys_port, proxys.score AS proxys_score \nFROM proxys ORDER BY proxys.score DESC, proxys.speed\n LIMIT ? OFFSET ?'] [parameters: (10, 0)]

    I found the following answer on Stack Overflow:

    SQLite locks the database when a write is made to it, such as when an UPDATE, INSERT or DELETE is sent. When using the ORM, these get sent on flush. The database will remain locked until there is a COMMIT or ROLLBACK.
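    The locking behaviour described in that answer can be reproduced with nothing but the standard-library `sqlite3` module. The following is a minimal sketch, not code from IPProxyPool: the function name `demo_lock`, the temporary file, and the two-column `proxys` table are assumptions made for illustration. One connection opens a write transaction and leaves it uncommitted; a second connection, with `timeout=0` so that it does not wait, then tries to write and gets the same `database is locked` message.

```python
import os
import sqlite3
import tempfile


def demo_lock():
    # A file-backed database is needed; two connections to ":memory:"
    # would each get their own private database.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE proxys (ip TEXT, port INTEGER)")

    # timeout=0: raise immediately instead of waiting for the lock.
    other = sqlite3.connect(path, timeout=0, isolation_level=None)

    # Start a write transaction and leave it open (no COMMIT yet).
    writer.execute("BEGIN IMMEDIATE")
    writer.execute("INSERT INTO proxys VALUES ('127.0.0.1', 8080)")

    try:
        other.execute("INSERT INTO proxys VALUES ('127.0.0.2', 8081)")
        first_attempt = "ok"
    except sqlite3.OperationalError as exc:
        first_attempt = str(exc)

    # Once the first transaction commits, the lock is released.
    writer.execute("COMMIT")
    other.execute("INSERT INTO proxys VALUES ('127.0.0.2', 8081)")
    count = other.execute("SELECT COUNT(*) FROM proxys").fetchone()[0]

    writer.close()
    other.close()
    return first_attempt, count


if __name__ == "__main__":
    print(demo_lock())
```

    The `timeout` argument of `sqlite3.connect` defaults to 5 seconds; a larger value makes a connection wait longer for a lock before raising, which is one knob worth trying when transactions are merely slow rather than never committed.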

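    Putting this together with the traceback above, my guess is that the error arises as follows: while the program is crawling, the proxies it has scraped are sent to the database on flush, which locks it. A connection then times out, the program reaches line 27 of HtmlDownloader.py and issues another database command (the SELECT shown in the traceback) while the lock is still held, and that triggers the error.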
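    If that guess is right, one possible mitigation is to retry a database call that fails with "database is locked" instead of letting the exception end the crawler process. The sketch below assumes this is acceptable for the project; `retry_on_locked` and `flaky_select` are hypothetical names, and the failing query is simulated rather than taken from SqlHelper.py. In the real project the error surfaces as `sqlalchemy.exc.OperationalError` (see the traceback), so the `except` clause would need to catch that class instead.

```python
import sqlite3
import time


def retry_on_locked(operation, attempts=5, delay=0.1):
    """Call operation(); retry if SQLite reports that the database is locked."""
    for attempt in range(attempts):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            # Re-raise unrelated errors, and give up after the last attempt.
            if "locked" not in str(exc) or attempt == attempts - 1:
                raise
            time.sleep(delay * (attempt + 1))  # linear back-off between tries


def demo_retry():
    calls = {"n": 0}

    def flaky_select():
        # Simulated query: fails twice as if a writer held the lock,
        # then succeeds once the lock would have been released.
        calls["n"] += 1
        if calls["n"] < 3:
            raise sqlite3.OperationalError("database is locked")
        return [("127.0.0.1", 8080)]

    rows = retry_on_locked(flaky_select, attempts=5, delay=0.01)
    return rows, calls["n"]


if __name__ == "__main__":
    print(demo_retry())
```

    Retrying only hides the symptom. If a write transaction is left open for a long time, committing or rolling back promptly after each flush would address the cause described in the quoted answer.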

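    However, I am not very experienced, and after studying this for a long time I still do not know how to fix it. I would appreciate it if someone could help analyze the problem. Thanks!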

    2 replies    2017-03-30 00:13:03 +08:00
    #1  xrlin  2017-03-29 23:27:43 +08:00 via iPhone
    Report it to the author; the author is quite quick with updates.
    #2  agua199408 (OP)  2017-03-30 00:13:03 +08:00
    @xrlin OK, I have already submitted an issue.