Let me post the code first:
import java.io.*;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class RawHttpGet {
    public static void main(String[] args) throws IOException {
        // try-with-resources closes the socket when we are done
        try (Socket socket = new Socket()) {
            InetSocketAddress inetSocketAddress = new InetSocketAddress("music.163.com", 80);
            socket.connect(inetSocketAddress);
            OutputStream outputStream = socket.getOutputStream();
            // Request headers are ASCII; set the charset explicitly instead of relying on the platform default
            BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(outputStream, StandardCharsets.US_ASCII));
            bufferedWriter.write("GET /artist?id=10000 HTTP/1.1\r\n");
            bufferedWriter.write("Host: music.163.com\r\n");
            bufferedWriter.write("User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\r\n");
            bufferedWriter.write("Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\n");
            bufferedWriter.write("Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,fr;q=0.7\r\n");
            bufferedWriter.write("\r\n"); // an empty line ends the request headers
            bufferedWriter.flush();
            InputStream inputStream = socket.getInputStream();
            byte[] bytes = new byte[1024];
            ByteArrayOutputStream baos = new ByteArrayOutputStream(1024 * 1024 * 4);
            int len;
            // read() returns -1 at end of stream; on a keep-alive connection this blocks until the server closes it
            while ((len = inputStream.read(bytes, 0, bytes.length)) != -1) {
                baos.write(bytes, 0, len);
            }
            System.out.println(baos.toString(StandardCharsets.UTF_8.name()));
        }
    }
}
The output is as follows:
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 26 Dec 2017 02:14:35 GMT
Content-Type: text/html;charset=utf8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Cache-Control: no-store
Pragrma: no-cache
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: no-cache
Content-Language: zh-CN
X-Via: MusicServer
X-From-Src: 218.17.158.4
a3e
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="baidu-site-verification" content="cNhJHKEzsD" />
<meta property="qc:admins" content="27354635321361636375" />
...
</head>
<body>
...
</body>
</html>
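In the response, what on earth is the 'a3e\n' between the response headers and the response body? Is it because NetEase's server does not strictly follow the HTTP protocol, or does it have some special meaning?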
1
gouchaoer 2017-12-26 10:26:11 +08:00
Is it a BOM?
|
2
janxin 2017-12-26 10:30:52 +08:00
a3e\n is part of the body to begin with
|
3
wsy2220 2017-12-26 10:33:55 +08:00
|
4
mengzhuo 2017-12-26 10:37:53 +08:00
It's chunked encoding.
Does Java really not even have a standard HTTP library? (just kidding) |
5
fcten 2017-12-26 10:49:49 +08:00
Transfer-Encoding: chunked
a3e is the length, in hexadecimal, of the chunk that follows (0xa3e = 2622 bytes) |
6
Shazoo 2017-12-26 10:51:33 +08:00
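The response header from the server already tells you that the transfer-encoding is chunked. (And it did not give you a content-length...)
Transfer-Encoding: chunked. Chunked is a fairly old transfer mode. It is normally handled inside the underlying HTTP library. Since you are implementing HTTP directly on top of a socket, it shows up in what you read. |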
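To make the replies above concrete, here is a minimal sketch of decoding a chunked body by hand. It assumes the stream is already positioned just after the blank line that ends the response headers. The class name `ChunkedBodyDecoder` and its helper methods are hypothetical, written for illustration only; trailer headers are skipped and malformed input is not handled gracefully.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library: decodes "Transfer-Encoding: chunked" bodies.
class ChunkedBodyDecoder {

    // Reads one line terminated by CRLF and returns it without the line ending.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') break;
            if (b != '\r') line.write(b);
        }
        return new String(line.toByteArray(), StandardCharsets.US_ASCII);
    }

    // Decodes the chunked body and returns the payload bytes with the framing removed.
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            // A chunk-size line may carry extensions after ';' which we ignore here.
            int semi = sizeLine.indexOf(';');
            if (semi >= 0) sizeLine = sizeLine.substring(0, semi);
            int size = Integer.parseInt(sizeLine.trim(), 16); // e.g. "a3e" -> 2622
            if (size == 0) break; // the zero-length chunk marks the end of the body
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n == -1) throw new IOException("stream ended inside a chunk");
                off += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that follows the chunk data
        }
        // Skip optional trailer headers up to the final empty line.
        while (!readLine(in).isEmpty()) { }
        return body.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String wire = "5\r\nhello\r\n7\r\n, world\r\n0\r\n\r\n";
        byte[] decoded = decode(new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII)));
        System.out.println(new String(decoded, StandardCharsets.UTF_8)); // prints "hello, world"
    }
}
```

In the socket code from the original post, the same idea would apply after the headers have been read and split off; in practice an HTTP client library does this for you.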
12
clearbug OP @mengzhuo #4 Haha, it feels like before Java 9 everyone used third-party libraries as their HTTP client; the native HTTP client in Java 9 looks like it will be quite nice to use
|
13
misaka19000 2017-12-26 11:44:57 +08:00
Is there any difference between this and content-length?
|
14
clearbug OP @misaka19000 #13
|
15
clearbug OP |
16
torbrowserbridge 2017-12-26 12:32:44 +08:00 via Android
Shouldn't chunked encoding end with a 0?
|
17
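clearbug OP @torbrowserbridge Yes, it is there; the response I pasted above was just incomplete. I did not understand chunked encoding before, so I kept focusing on what the leading a3e meant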
|
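As an illustration of the terminating 0 and of how chunked framing differs from Content-Length, here is a minimal sketch of the sending side. With Content-Length the sender must know the total body size before writing the headers; with chunked encoding it can emit each piece as it becomes available, prefixing each with its size in hexadecimal and ending with a zero-length chunk. The class name `ChunkedBodyEncoder` and the `chunkSize` parameter are hypothetical, chosen only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library: frames a body as chunked.
class ChunkedBodyEncoder {

    // Splits body into chunks of at most chunkSize bytes and adds chunked framing.
    static byte[] encode(byte[] body, int chunkSize) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < body.length; off += chunkSize) {
            int n = Math.min(chunkSize, body.length - off);
            // Each chunk starts with its length in hexadecimal followed by CRLF.
            byte[] sizeLine = (Integer.toHexString(n) + "\r\n").getBytes(StandardCharsets.US_ASCII);
            out.write(sizeLine, 0, sizeLine.length);
            out.write(body, off, n);
            out.write('\r');
            out.write('\n');
        }
        // The zero-length chunk plus an empty line ends the body (no trailers here).
        byte[] last = "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        out.write(last, 0, last.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] wire = encode("hello, world".getBytes(StandardCharsets.US_ASCII), 5);
        String shown = new String(wire, StandardCharsets.US_ASCII).replace("\r\n", "\\r\\n");
        System.out.println(shown); // prints 5\r\nhello\r\n5\r\n, wor\r\n2\r\nld\r\n0\r\n\r\n
    }
}
```

The final `0\r\n\r\n` is the part that was cut off from the response pasted at the top of the thread.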