
Let me post the code first:
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Open a plain TCP connection to the server on port 80.
Socket socket = new Socket();
InetSocketAddress inetSocketAddress = new InetSocketAddress("music.163.com", 80);
socket.connect(inetSocketAddress);

// Write the HTTP/1.1 request by hand: request line, headers, then an empty line.
OutputStream outputStream = socket.getOutputStream();
BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(outputStream));
bufferedWriter.write("GET /artist?id=10000 HTTP/1.1\r\n");
bufferedWriter.write("Host: music.163.com\r\n");
bufferedWriter.write("User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\r\n");
bufferedWriter.write("Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\n");
bufferedWriter.write("Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,fr;q=0.7\r\n");
bufferedWriter.write("\r\n");
bufferedWriter.flush();

// Read the raw response bytes until read() stops returning data.
// On a keep-alive connection this may block until the server closes the socket.
InputStream inputStream = socket.getInputStream();
byte[] bytes = new byte[1024];
ByteArrayOutputStream baos = new ByteArrayOutputStream(1024 * 1024 * 4);
int len;
while ((len = inputStream.read(bytes, 0, bytes.length)) > 0) {
    baos.write(bytes, 0, len);
}
System.out.println(baos.toString(StandardCharsets.UTF_8.name()));

The output is as follows:
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 26 Dec 2017 02:14:35 GMT
Content-Type: text/html;charset=utf8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Cache-Control: no-store
Pragrma: no-cache
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: no-cache
Content-Language: zh-CN
X-Via: MusicServer
X-From-Src: 218.17.158.4

a3e
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="baidu-site-verification" cOntent="cNhJHKEzsD" />
<meta property="qc:admins" cOntent="27354635321361636375" />
...
</head>
<body>
...
</body>
</html>

In the response, what on earth is the 'a3e\n' between the response headers and the response body? Is it because NetEase's server doesn't strictly follow the HTTP protocol, or does it have some special meaning?
1 gouchaoer 2017-12-26 10:26:11 +08:00 Is it a BOM?
2 janxin 2017-12-26 10:30:52 +08:00 The a3e\n is part of the body to begin with
3 wsy2220 2017-12-26 10:33:55 +08:00
4 mengzhuo 2017-12-26 10:37:53 +08:00 It's chunked encoding. Doesn't Java even have a standard HTTP library? (just kidding)
5 fcten 2017-12-26 10:49:49 +08:00 Transfer-Encoding: chunked. The a3e is the length of the content that follows
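To make the "length" in the reply above concrete: in chunked transfer encoding the chunk-size line is a hexadecimal number, optionally followed by `;`-separated chunk extensions. The following is a minimal sketch of parsing such a line; the class and method names (`ChunkSizeDemo`, `parseChunkSize`) are made up for illustration and are not from the thread.

```java
class ChunkSizeDemo {
    /**
     * Parses a chunk-size line such as "a3e" or "a3e;name=value".
     * The size is hexadecimal; anything after ';' is a chunk extension and is ignored here.
     */
    static int parseChunkSize(String line) {
        int semicolon = line.indexOf(';');
        String hex = (semicolon >= 0 ? line.substring(0, semicolon) : line).trim();
        return Integer.parseInt(hex, 16);
    }

    public static void main(String[] args) {
        // 0xa3e = 10 * 256 + 3 * 16 + 14
        System.out.println(parseChunkSize("a3e")); // prints 2622
    }
}
```

So the `a3e` in the response above announces that the next 2622 bytes belong to the first chunk of the body.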
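6 Shazoo 2017-12-26 10:51:33 +08:00 The response header from the server already tells you that the transfer-encoding is chunked. (Also, it didn't give you a content-length...) Transfer-Encoding: chunked. Chunked is a fairly old transfer method. It is normally hidden by the underlying HTTP library. Since you are implementing the HTTP protocol yourself over a raw socket, it gets exposed.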
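As a rough idea of what an HTTP library does behind the scenes here, below is a minimal sketch of a chunked-body decoder. It assumes the caller has already consumed the status line and headers up to the empty line, so the stream is positioned at the first chunk-size line. The names `ChunkedBodyDecoder`, `decode`, and `readLine` are hypothetical, and trailer headers are simply skipped rather than parsed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ChunkedBodyDecoder {
    /** Reads one line up to LF, dropping CR/LF; returns null if the stream is already at EOF. */
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b = in.read();
        if (b == -1) {
            return null;
        }
        while (b != -1 && b != '\n') {
            if (b != '\r') {
                sb.append((char) b);
            }
            b = in.read();
        }
        return sb.toString();
    }

    /** Decodes a chunked body: repeated "hex-size CRLF data CRLF", ended by a zero-size chunk. */
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            if (sizeLine == null) {
                throw new EOFException("stream ended before the last chunk");
            }
            // Drop optional chunk extensions after ';' and parse the hex size.
            int semicolon = sizeLine.indexOf(';');
            String hex = (semicolon >= 0 ? sizeLine.substring(0, semicolon) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) {
                // Last chunk: skip optional trailer lines up to the final empty line.
                String trailer;
                while ((trailer = readLine(in)) != null && !trailer.isEmpty()) {
                    // trailers are ignored in this sketch
                }
                return body.toByteArray();
            }
            // read() may return fewer bytes than requested, so loop until the chunk is complete.
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n < 0) {
                    throw new EOFException("stream ended inside a chunk");
                }
                off += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that follows the chunk data
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] decoded = decode(new ByteArrayInputStream(wire));
        System.out.println(new String(decoded, StandardCharsets.UTF_8)); // prints Hello, world
    }
}
```

Because the last chunk marks the end of the body, a decoder like this can stop reading without waiting for the server to close a keep-alive connection.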
12 clearbug OP @mengzhuo #4 Haha, it feels like before Java 9 everyone used third-party libraries as their HTTP client; I'd guess the native HTTP client in Java 9 is quite nice to use
13 misaka19000 2017-12-26 11:44:57 +08:00 Is there any difference between this and content-length?
14 clearbug OP @misaka19000 #13 |
15 clearbug OP |
16 torbrowserbridge 2017-12-26 12:32:44 +08:00 via Android Shouldn't chunked encoding have a 0 at the end?
17 clearbug OP @torbrowserbridge It does; I just didn't paste the whole response above. I didn't understand chunked encoding before, so I kept focusing on what the a3e at the start meant
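To show where that trailing 0 comes from, here is a minimal sketch of the sending side. Each piece of the body is written as its own chunk, so the total length does not have to be known up front (which is the practical difference from Content-Length), and a zero-size chunk marks the end. The names `ChunkedBodyEncoder`, `writeChunk`, `writeLastChunk`, and `encode` are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class ChunkedBodyEncoder {
    private static final byte[] CRLF = {'\r', '\n'};

    /** Writes one chunk: hex size, CRLF, data, CRLF. */
    static void writeChunk(OutputStream out, byte[] data) throws IOException {
        if (data.length == 0) {
            return; // a zero-size chunk would end the body, so empty pieces are skipped
        }
        out.write(Integer.toHexString(data.length).getBytes(StandardCharsets.US_ASCII));
        out.write(CRLF);
        out.write(data);
        out.write(CRLF);
    }

    /** Writes the terminating zero-size chunk followed by the final empty line (no trailers). */
    static void writeLastChunk(OutputStream out) throws IOException {
        out.write("0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
    }

    /** Encodes each part as one chunk and appends the last chunk. */
    static byte[] encode(String... parts) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String part : parts) {
            writeChunk(out, part.getBytes(StandardCharsets.UTF_8));
        }
        writeLastChunk(out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String wire = new String(encode("Hello", ", world"), StandardCharsets.UTF_8);
        // Make the line terminators visible when printing.
        System.out.println(wire.replace("\r\n", "\\r\\n"));
        // prints 5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n
    }
}
```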