Python simulated login: help with a cookie problem - V2EX
    exoticknight 2013-04-29 04:56:44 +08:00 5627 views
    The code is as follows:
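    import cookielib
    import urllib2

    # cookie jar that stores cookies between requests
    cj = cookielib.CookieJar()
    # opener that reads and writes cookies through the jar
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    # first request (GET) so the site can set its initial cookie
    opener.open(url)
    # second request (POST); data must be a urlencoded string
    op = opener.open(url, data)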

    When I print the response in op, the site reports that the cookie cannot be read.
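The GET-then-POST flow described above can be checked offline before pointing it at the real site. This is a minimal sketch, assuming Python 3, where `urllib.request` and `http.cookiejar` replace `urllib2` and `cookielib`. The `LoginHandler` class, the `sid` cookie, and the form fields are hypothetical stand-ins for the real site, not anything from the original post.

```python
import http.cookiejar
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class LoginHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the site: sets a cookie on GET, expects it on POST."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Set-Cookie", "sid=abc123; Path=/")
        self.end_headers()
        self.wfile.write(b"login page")

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the form body
        ok = "sid=abc123" in self.headers.get("Cookie", "")
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"cookie ok" if ok else b"cookie unreadable")

    def log_message(self, *args):  # keep the demo quiet
        pass


def login_with_cookies(url, form):
    """GET once to collect cookies, then POST the form through the same opener."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore any system proxy for this local demo
        urllib.request.HTTPCookieProcessor(jar),
    )
    opener.open(url).read()
    data = urllib.parse.urlencode(form).encode("ascii")
    body = opener.open(url, data).read()
    return jar, body


# Start the stand-in server on a free local port
server = HTTPServer(("127.0.0.1", 0), LoginHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

jar, body = login_with_cookies(url, {"username": "1234", "password": "12345"})
print([c.name for c in jar], body)

server.shutdown()
server.server_close()
```

If the jar holds the cookie here but the real site still complains, the difference is probably in what the site expects beyond the cookie itself (headers, extra form fields, or a cookie set by JavaScript), which a packet capture of a browser login would show.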
    13 replies
    alexrezit  #1  2013-04-29 07:51:56 +08:00
    urllib2.install_opener(opener)

    I don't know Python, but adding this line right after build_opener should do it. Give it a try.
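    mckelvin  #2  2013-04-29 08:57:33 +08:00 via iPhone
    I haven't run the OP's code. If it was copied from the documentation and is correct as written, the server may have detected a crawler and left Set-Cookie out of the response headers. Try adding User-Agent, Referer, and similar fields to the request headers.
    I recommend the session in the requests module; it is very convenient to use. It wraps cookie handling for you, though there are still a few pitfalls, and only in rare cases do you need to manage cookies by hand.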
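The header suggestion above could look roughly like this. It is a sketch, assuming Python 3 (`urllib.request` in place of `urllib2`); `USER_AGENT`, `LOGIN_URL`, and `make_opener` are hypothetical names and values chosen for illustration.

```python
import http.cookiejar
import urllib.request

# Hypothetical values; copy what your own browser sends to the real site.
USER_AGENT = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/120.0 Safari/537.36")
LOGIN_URL = "http://example.com/login"


def make_opener(user_agent, referer):
    """Build a cookie-aware opener whose requests carry browser-like headers."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Assigning addheaders replaces the default Python-urllib User-agent entry
    opener.addheaders = [("User-Agent", user_agent), ("Referer", referer)]
    return opener, jar


opener, jar = make_opener(USER_AGENT, LOGIN_URL)
print(opener.addheaders)
```

Every request made with `opener.open(...)` then sends these headers, so the server sees the same User-Agent and Referer on both the initial GET and the login POST.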
    lfhong  #3  2013-04-29 10:16:14 +08:00
    Let me paste a browser class I wrote; I hope it is useful to you.

    import gzip
    import socket
    import urllib
    import urllib2
    import cookielib
    from StringIO import StringIO

    class Browser(object):
        """Small urllib2 wrapper: keeps cookies, sends browser-like headers, handles gzip."""

        def __init__(self, filecookie=None, PROXY=None):
            VERSION = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_7) AppleWebKit/534.35 (KHTML, like Gecko) Chrome/13.0.761.0 Safari/534.35'
            self.version = VERSION
            self.headers = []
            self.headers.append(('User-agent', self.version))
            self.headers.append(('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'))
            self.headers.append(('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.3'))
            self.headers.append(('Accept-Encoding', 'gzip'))
            self.headers.append(('Accept-Language', 'en-US,en;q=0.8'))
            self.headers.append(('Connection', 'keep-alive'))
            # Use a file-backed jar when a cookie file is given, otherwise keep cookies in memory
            if filecookie:
                self.cj = cookielib.MozillaCookieJar(filecookie)
            else:
                self.cj = cookielib.CookieJar()
            # PROXY is a dict such as {'http': 'host:port'}
            if PROXY and 'http' in PROXY:
                proxy_handler = urllib2.ProxyHandler(PROXY)
                self.opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(self.cj), proxy_handler)
            else:
                self.opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(self.cj))
            self.opener.addheaders = self.headers

        def addheaders(self, headers):
            # headers: list of (name, value) tuples sent in addition to the defaults
            self.opener.addheaders = self.headers + headers

        def open(self, url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
            # urllib2 needs the POST body as a urlencoded string, so encode dicts here
            if isinstance(data, dict):
                data = urllib.urlencode(data)
            if data:
                pg = self.opener.open(url, data, timeout=timeout)  # POST
            else:
                pg = self.opener.open(url, timeout=timeout)  # GET
            # Decompress the body when the server honored Accept-Encoding: gzip
            if pg.info().get('Content-Encoding') == 'gzip':
                buf = StringIO(pg.read())
                f = gzip.GzipFile(fileobj=buf)
                return f.read()
            else:
                return pg.read()
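The gzip handling at the end of `open` can be tried on its own. Below is a small sketch, assuming Python 3, where `gzip.decompress` replaces the `StringIO` plus `GzipFile` combination; `decode_body` is a hypothetical helper name.

```python
import gzip


def decode_body(raw, content_encoding):
    """Return the response body, decompressing it when it is gzip-encoded."""
    if content_encoding == "gzip":
        return gzip.decompress(raw)
    return raw


# Simulate a compressed and an uncompressed response body
compressed = gzip.compress(b"<html>hello</html>")
print(decode_body(compressed, "gzip"))
print(decode_body(b"plain", None))
```

This step matters only because the class sends `Accept-Encoding: gzip`; without that header most servers reply uncompressed and the body can be read directly.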
    lfhong  #4  2013-04-29 10:18:12 +08:00
    Usage:

    import urllib

    browser = Browser()

    pg_content = browser.open(url)  # GET
    # POST: urllib2 expects the body as a urlencoded string, not a dict
    pg_content = browser.open(url, data=urllib.urlencode({'username': '1234', 'password': '12345'}))
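One detail in the POST call is easy to miss: the opener sends `data` as the raw request body, so form fields have to be urlencoded first. A minimal sketch of that step, assuming Python 3 (`urllib.parse.urlencode`, with the result encoded to bytes); `encode_form` is a hypothetical helper name.

```python
import urllib.parse


def encode_form(fields):
    """Turn a dict of form fields into the bytes body that a POST expects."""
    return urllib.parse.urlencode(fields).encode("ascii")


body = encode_form({"username": "1234", "password": "12345"})
print(body)
```

The field names must match the `name` attributes of the site's login form, which may include hidden fields that a packet capture or the page source would reveal.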
    for4  #5  2013-04-29 10:58:00 +08:00 via iPad
    I suggest using requests.
    exoticknight (OP)  #6  2013-04-29 13:56:00 +08:00
    @alexrezit Still doesn't work...
    exoticknight (OP)  #7  2013-04-29 13:56:53 +08:00
    @mckelvin The output of info() does show a Set-Cookie header, which is why I started digging into the cookie handling.
    exoticknight (OP)  #8  2013-04-29 13:57:44 +08:00
    @lfhong Thanks for posting code~ I'll try it first.
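    scola  #9  2013-04-29 15:12:25 +08:00
    Not long ago I wrote a download helper for a colleague that quickly downloads files from an internal site that requires login.
    OP, you can use it as a reference:
    https://gist.github.com/325862401/5403766
    It uses two libraries, ClientCookie and ClientForm, and is based on this recipe:
    http://code.activestate.com/recipes/391929-access-password-protected-web-applications-for-scr/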
    exoticknight (OP)  #10  2013-04-29 15:14:24 +08:00
    @for4 It seems to be a keep-alive issue. I'll go try requests.
    exoticknight (OP)  #11  2013-04-29 15:19:40 +08:00
    @scola Thanks, I'll look into it.
    qdcanyun  #12  2013-04-29 19:25:45 +08:00
    May I recommend the Session in requests? It fully meets your requirements.
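For readers who have not used it, here is a rough sketch of what a requests `Session` does with headers and cookies. It assumes the third-party `requests` package is installed; `LOGIN_URL`, the `sid` cookie, and the form fields are hypothetical, and nothing is sent over the network because the request is only prepared, not sent.

```python
import requests

# Hypothetical URL; the request below is prepared but never sent.
LOGIN_URL = "http://example.com/login"

session = requests.Session()
session.headers.update({"User-Agent": "Mozilla/5.0", "Referer": LOGIN_URL})
# Stand-in for a cookie that an earlier session.get(LOGIN_URL) would have stored
session.cookies.set("sid", "abc123", domain="example.com", path="/")

# prepare_request merges the session's headers and cookies into the request,
# the same merging that session.post(LOGIN_URL, data=...) performs before sending.
request = requests.Request(
    "POST", LOGIN_URL, data={"username": "1234", "password": "12345"}
)
prepared = session.prepare_request(request)

print(prepared.headers["Cookie"])
print(prepared.body)
```

In real use the two calls would simply be `session.get(LOGIN_URL)` followed by `session.post(LOGIN_URL, data=...)`; the session carries cookies from the first response into the second request without any explicit cookie jar code.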
    exoticknight (OP)  #13  2013-04-29 19:29:19 +08:00
    @qdcanyun I took a look and that seems to be the case. I'm capturing packets and experimenting now ^_^