
urllib2:

Date: XXX
Server: Apache
Last-Modified: XXX
Accept-Ranges: bytes
Content-Length: 12345678
Vary: Accept-Encoding
Connection: close
Content-Type: text/plain

requests:

Content-Encoding: gzip
Accept-Ranges: bytes
Vary: Accept-Encoding
Keep-Alive: timeout=5, max=128
Last-Modified: XXX
Connection: Keep-Alive
ETag: xxxxxxxxx
Content-Type: text/plain

Why is Content-Length missing from the requests response? Every other setting on the two requests is exactly the same. What requests returns matches what Chrome DevTools shows, but I need the Content-Length value here (to resume interrupted downloads).
    import urllib2
    import requests

    # Placeholder URL and credentials
    url = 'http://example.com'
    headers = {
        "Authorization": "Basic xxxx",
        "Range": "bytes=0-"
    }

    # Same headers sent through urllib2 ...
    req = urllib2.Request(url, headers=headers)
    resp = urllib2.urlopen(req)
    print resp.info()

    # ... and through requests
    r = requests.get(url, headers=headers)
    print r.headers

    # Both responses describe the same resource
    assert resp.info()['ETag'] == r.headers['ETag']

urllib2:

Date: Sat, 14 Jan 2017 09:39:50 GMT
Server: Apache
Last-Modified: Sat, 14 Jan 2017 09:39:49 GMT
ETag: "e91103-10e04f7-5460abb4743a3"
Accept-Ranges: bytes
Content-Length: 17695991
Vary: Accept-Encoding
Content-Range: bytes 0-17695990/17695991
Connection: close
Content-Type: text/plain

requests:

{'Content-Encoding': 'gzip', 'Transfer-Encoding': 'chunked', 'Accept-Ranges': 'bytes', 'Vary': 'Accept-Encoding', 'Keep-Alive': 'timeout=5, max=128', 'Server': 'Apache', 'Last-Modified': 'Sat, 14 Jan 2017 09:39:49 GMT', 'Connection': 'Keep-Alive', 'ETag': '"e91103-10e04f7-5460abb4743a3"', 'Date': 'Sat, 14 Jan 2017 09:39:50 GMT', 'Content-Type': 'text/plain'}

I also realized that the two libraries must be sending different request headers. It is finally solved now:
The responses differ because requests advertises support for gzip-encoded bodies by sending an Accept-Encoding: gzip, deflate header field, while urllib2 does not. If you add that header to your urllib2 request, you will get the same behaviour as with requests.
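To make the difference concrete, here is a minimal sketch, assuming Python 3 (where urllib2 became urllib.request) and a placeholder URL. The helper name build_request is hypothetical. It only builds the request objects and never sends them, so it shows nothing more than the one header field that separates the two cases.

```python
import urllib.request

# Placeholder URL (an assumption, not the poster's real server).
URL = "http://example.com/file.txt"


def build_request(url, accept_encoding=None):
    """Build (but do not send) a ranged GET request.

    When accept_encoding is given, the request advertises those content
    codings, which is what the answer says requests does by default.
    """
    headers = {"Range": "bytes=0-"}
    if accept_encoding is not None:
        headers["Accept-Encoding"] = accept_encoding
    return urllib.request.Request(url, headers=headers)


plain = build_request(URL)
gzip_capable = build_request(URL, "gzip, deflate")

# urllib.request.Request normalises header names with str.capitalize(),
# so the lookup key is "Accept-encoding" rather than "Accept-Encoding".
print(plain.has_header("Accept-encoding"))
print(gzip_capable.get_header("Accept-encoding"))
```

Passing gzip_capable to urllib.request.urlopen should then trigger the same gzip, chunked response that requests received, provided the server behaves as described in this thread.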
Clearly, in this case, the server is gzipping the responses dynamically. This means it does not know how long the compressed response will be when it sends the headers, so it omits Content-Length and sends the body using chunked transfer encoding.
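The following sketch illustrates why on-the-fly compression and Content-Length do not go together. It is an illustration in Python 3 using only the standard library, not the server's actual implementation, and the function names gzip_stream, chunked and dechunk are made up for this example. The compressor emits output piece by piece, and each piece is framed as a chunk with its own length, so the total size is only known once the last chunk has been sent.

```python
import gzip
import zlib


def gzip_stream(blocks):
    """Compress blocks incrementally, yielding gzip output as it is produced."""
    compressor = zlib.compressobj(wbits=31)  # wbits=31 selects the gzip container
    for block in blocks:
        out = compressor.compress(block)
        if out:
            yield out
    yield compressor.flush()  # remaining data plus the gzip trailer


def chunked(pieces):
    """Frame byte pieces using HTTP/1.1 chunked transfer coding."""
    for piece in pieces:
        if piece:
            # Each chunk: size in hex, CRLF, data, CRLF
            yield b"%x\r\n%s\r\n" % (len(piece), piece)
    yield b"0\r\n\r\n"  # zero-length chunk terminates the body


def dechunk(data):
    """Reassemble a chunked body (without trailers) into the original bytes."""
    out, pos = [], 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol], 16)
        if size == 0:
            return b"".join(out)
        start = eol + 2
        out.append(data[start:start + size])
        pos = start + size + 2  # skip the CRLF that follows the chunk data


body_blocks = [b"hello " * 1000, b"world " * 1000]
wire = b"".join(chunked(gzip_stream(body_blocks)))

# The receiver only learns the full length after the terminating chunk.
restored = gzip.decompress(dechunk(wire))
print(len(restored) == sum(len(b) for b in body_blocks))
```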
If you really must get the Content-Length header, add the following header to your requests call: {'Accept-Encoding': 'identity'}. This tells the server that you want the body without any content encoding.
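For the resume-download use case from the question, a rough sketch of how this could be wired up follows (Python 3). The helper names resume_headers and total_size are hypothetical, and the sample values are taken from the output posted in this thread. It assumes the response headers are available as a plain dict with the canonical capitalisation shown; requests itself returns a case-insensitive mapping.

```python
import re


def resume_headers(offset, auth=None):
    """Request headers for fetching a resource from byte `offset`, uncompressed."""
    headers = {
        "Range": "bytes=%d-" % offset,
        # Ask for no content coding so the server can report exact lengths.
        "Accept-Encoding": "identity",
    }
    if auth is not None:
        headers["Authorization"] = auth
    return headers


def total_size(response_headers):
    """Return the full resource size in bytes, or None if it cannot be determined.

    For a ranged (206) response, Content-Length is only the length of the
    returned part, so the total after the slash in Content-Range is used.
    """
    content_range = response_headers.get("Content-Range")
    if content_range is not None:
        match = re.match(r"bytes\s+\d+-\d+/(\d+)$", content_range.strip())
        return int(match.group(1)) if match else None
    length = response_headers.get("Content-Length")
    return int(length) if length is not None else None


# Headers as returned to urllib2 in the thread (identity encoding).
identity_response = {
    "Content-Length": "17695991",
    "Content-Range": "bytes 0-17695990/17695991",
}
# Headers as returned to requests in the thread (gzip, chunked).
gzip_response = {"Content-Encoding": "gzip", "Transfer-Encoding": "chunked"}

print(total_size(identity_response))
print(total_size(gzip_response))
print(resume_headers(1024))
```

With requests, the headers would be passed as requests.get(url, headers=resume_headers(offset), stream=True), where offset is the number of bytes already saved to disk.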
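1 redhatping 2017-01-14 17:18:47 +08:00
Read the official documentation.

2 binux 2017-01-14 17:20:43 +08:00 via Android
Content-Length should not be set manually.

3 dofine OP
@binux Sorry, my description was unclear. The output above shows the response headers. What I mean is that I need to know the Content-Length value of the response, but requests does not return it while urllib2 does. @redhatping I have read the docs many times already. Could it be a server issue?

4 hahastudio 2017-01-14 17:25:47 +08:00
What you posted is the result. The cause must still be that the requests you are sending differ.

5 Lonely 2017-01-14 17:36:14 +08:00 via iPhone
Post your code as well...

6 dofine OP
{'Range': 'bytes=0-', 'Authorization': 'Basic XXX'} I added these headers manually, and the ETag returned to urllib2 and requests is the same. So why would the requests being sent differ? @hahastudio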
7 dofine OP
```
import urllib2
import requests

# Placeholder URL and credentials
url = 'http://example.com'
headers = {
    "Authorization": "Basic xxxx",
    "Range": "bytes=0-"
}

# Same headers sent through urllib2 ...
req = urllib2.Request(url, headers=headers)
resp = urllib2.urlopen(req)
print resp.info()

# ... and through requests
r = requests.get(url, headers=headers)
print r.headers

# Both responses describe the same resource
assert resp.info()['ETag'] == r.headers['ETag']
```
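9 lhbc 2017-01-14 17:45:03 +08:00
Your request headers are clearly different.

10 hahastudio 2017-01-14 17:45:36 +08:00
Does requests allow auth to be set separately? http://docs.python-requests.org/en/master/user/authentication/

11 dofine OP
@hahastudio I used the method from the docs to begin with. The result is the same as setting the auth header manually.

12 lbp0200 2017-01-14 17:55:21 +08:00
Try a random User-Agent.

13 dofine OP
Thanks, everyone.

14 dsg001 2017-01-14 19:10:06 +08:00
Capture the traffic and check whether the requests being sent actually differ.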
15 qgy18 2017-01-14 23:11:08 +08:00 via iPhone