For a dynamic URL, for example https://www.v2ex.com?page=2 versus https://www.v2ex.com/?page=2, which of the two is the most standard, canonical URL form?
1
yulitian888 2017-09-04 08:58:28 +08:00
scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]
|
2
jhaohai 2017-09-04 08:59:43 +08:00 via iPhone
If it's the root, I think the slash does need to be added.
|
3
0ZXYDDu796nVCFxq 2017-09-04 08:59:48 +08:00 via iPhone
The first one is not valid.
|
4
GTim 2017-09-04 09:01:53 +08:00 via iPad
You can leave it out.
|
5
torbrowserbridge 2017-09-04 09:02:07 +08:00
Even when it's not the root, adding the slash or not makes a difference. For example: /news/?page=1 versus /news?page=1
|
6
FanWall 2017-09-04 09:04:32 +08:00 via Android
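The second one.
I think requests to the first form only succeed because the browser automatically completes it into the second form (capture the traffic and you'll see). I ran into something similar before: parsing a URL of the first form with WinHttpCrackUrl fails outright with invalid.
|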
7
zjsxwc 2017-09-04 09:05:51 +08:00
The server can tell whether the url is "" or "/" and serve different content for each. So /something and /something/ are 2 different addresses.
|
8
canbingzt 2017-09-04 09:10:47 +08:00
|
9
weyou 2017-09-04 09:11:08 +08:00 via Android 4
Actually neither of the 2 is compliant; browsers are just being lenient. rfc1738:
HTTP
httpurl = "http://" hostport [ "/" hpath [ "?" search ]]
hpath = hsegment *[ "/" hsegment ]
hsegment = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
search = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
|
11
weyou 2017-09-04 09:27:49 +08:00 via Android
@aleung That is what the rfc specifies, but the major browsers don't follow it either; even without an hpath you can still append a search part, can't you?
|
12
aleung 2017-09-04 09:33:13 +08:00 6
@weyou RFC 1738 is updated by RFC 3968.
```
3. Syntax Components

The generic URI syntax consists of a hierarchical sequence of
components referred to as the scheme, authority, path, query, and
fragment.

   URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

   hier-part   = "//" authority path-abempty
               / path-absolute
               / path-rootless
               / path-empty

The scheme and path components are required, though the path may be
empty (no characters). When authority is present, the path must
either be empty or begin with a slash ("/") character. When
authority is not present, the path cannot begin with two slash
characters ("//"). These restrictions result in five different ABNF
rules for a path (Section 3.3), only one of which will match any
given URI reference.

The following are two example URIs and their component parts:

  foo://example.com:8042/over/there?name=ferret#nose
  \_/   \______________/\_________/ \_________/ \__/
   |           |            |            |        |
scheme     authority       path        query   fragment
   |   _____________________|__
  / \ /                        \
  urn:example:animal:ferret:nose

3.3 Path

   path          = path-abempty    ; begins with "/" or is empty
                 / path-absolute   ; begins with "/" but not "//"
                 / path-noscheme   ; begins with a non-colon segment
                 / path-rootless   ; begins with a segment
                 / path-empty      ; zero characters
```
Therefore, both https://www.v2ex.com?page=2 and https://www.v2ex.com/?page=2 conform to the specification.
|
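To see how the generic syntax quoted above splits the two URLs from the question, here is a minimal sketch using Python's standard `urllib.parse.urlsplit`. The helper name `split_url` is made up for illustration, and Python's parser is only one implementation of the generic syntax, not a formal validator.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple[str, str, str]:
    """Return (authority, path, query) for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    return parts.netloc, parts.path, parts.query


for url in ("https://www.v2ex.com?page=2", "https://www.v2ex.com/?page=2"):
    authority, path, query = split_url(url)
    # The only difference between the two is the path: empty versus "/".
    print(f"{url!r}: authority={authority!r} path={path!r} query={query!r}")
```

Both URLs yield the same authority and query; the first has an empty path (the `path-abempty` case) and the second has the path `/`.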
13
aleung 2017-09-04 09:33:49 +08:00
Also, whether it is supported depends on the server; it has nothing to do with the browser.
|
14
anyforever 2017-09-04 09:35:25 +08:00
OP needs to brush up on the basics.
|
15
aleung 2017-09-04 09:42:30 +08:00 via Android
Sorry, I mistyped; it's rfc 3986.
|
16
Mss 2017-09-04 09:47:21 +08:00
A request without the slash gets a 301.
|
18
doubleflower 2017-09-04 10:46:56 +08:00
@Mss You're kidding, right?
|
19
zzNucker 2017-09-04 10:55:18 +08:00
Specs and implementations have always been pretty far apart.
|
21
zjp 2017-09-04 11:27:09 +08:00 via Android
|
23
Showfom 2017-09-04 12:43:32 +08:00 via iPhone
Strictly speaking, neither is compliant.
|
24
doubleflower 2017-09-04 12:51:49 +08:00 1
|
26
lonelinsky 2017-09-04 15:00:05 +08:00
|
27
FrankFang128 2017-09-04 15:01:12 +08:00
This is a matter of reading the RFC, and yet it had to be a thread...
|
28
mozutaba 2017-09-04 15:17:42 +08:00
It gets filled in...
|
29
zhicheng 2017-09-04 17:08:51 +08:00
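@weyou What is non-compliant about the second one?
httpurl = "http://" hostport [ "/" hpath [ "?" search ]]
hpath = hsegment *[ "/" hsegment ]
hsegment = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
search = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
In the BNF you posted, the '*' means match 0 or more, so hpath can be empty.
'www.example.com?' and 'www.example.com/?' are the same; the browser automatically adds the '/', otherwise it has no way to construct the request.
That does not hold for other paths, though. 'www.example.com/hello' and 'www.example.com/hello/' are two different paths. Some web servers will kindly add or strip the '/' for you, but some won't. If the two paths really do denote the same object, the server can usually be configured with a URL rewrite, or to 301 the path with the '/' to the path without it.
|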
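The point above about the client filling in the '/' can be sketched as follows. This is a minimal illustration, assuming an HTTP/1.1 client that sends the path and query as the request target and substitutes "/" when the path is empty; the helper name `request_target` is hypothetical and real clients handle many more cases.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Build the path-and-query part of a request line (hypothetical helper)."""
    parts = urlsplit(url)
    # An empty path cannot be sent as-is, so the root path "/" is used instead.
    path = parts.path or "/"
    return f"{path}?{parts.query}" if parts.query else path


for url in (
    "https://www.example.com?page=2",
    "https://www.example.com/?page=2",
    "https://www.example.com/hello",
    "https://www.example.com/hello/",
):
    print(url, "->", "GET", request_target(url), "HTTP/1.1")
```

The first two URLs end up with the same request target, while `/hello` and `/hello/` stay distinct, so it is up to the server whether to treat them as one resource or redirect one to the other.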
30
weyou 2017-09-04 17:59:12 +08:00
@zhicheng First of all, we should go by rfc3986 as @aleung said; the rfc1738 I pasted is out of date. Under rfc3986 both forms are compliant.
Second, if hpath were allowed to be empty, the BNF would have to be hpath = *[ "/" hsegment ] rather than hpath = hsegment *[ "/" hsegment ]. The latter form means there is at least one hsegment; the trailing * means 0 or more occurrences of [ "/" hsegment ]. See the explanation in rfc822 and rfc1738:

5. BNF for specific URL schemes

This is a BNF-like description of the Uniform Resource Locator syntax, using the conventions of RFC822, except that "|" is used to designate alternatives, and brackets [] are used around optional or repeated elements. Briefly, literals are quoted with "", optional elements are enclosed in [brackets], and elements may be preceded with <n>* to designate n or more repetitions of the following element; n defaults to 0.
|
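One rough way to test the two readings of the quoted rfc1738 rules is to transcribe them into a regular expression. In this sketch, `HOSTPORT` and `UCHAR` are simplified stand-ins (assumptions) for the full rfc1738 definitions, the helper name `is_rfc1738_httpurl` is made up, and the rule only covers the `http` scheme.

```python
import re

# Simplified stand-ins (assumptions); the real rfc1738 rules are more detailed.
HOSTPORT = r"[A-Za-z0-9.-]+(?::[0-9]+)?"
UCHAR = r"(?:[A-Za-z0-9$\-_.+!*'(),]|%[0-9A-Fa-f]{2})"

# hsegment = *[ uchar | ";" | ":" | "@" | "&" | "=" ]  (may be zero-length)
HSEGMENT = rf"(?:{UCHAR}|[;:@&=])*"
# search has the same definition as hsegment
SEARCH = HSEGMENT
# hpath = hsegment *[ "/" hsegment ]
HPATH = rf"{HSEGMENT}(?:/{HSEGMENT})*"
# httpurl = "http://" hostport [ "/" hpath [ "?" search ]]
HTTPURL = re.compile(rf"http://{HOSTPORT}(?:/{HPATH}(?:\?{SEARCH})?)?")


def is_rfc1738_httpurl(url: str) -> bool:
    """Check a URL against the transcribed rule (hypothetical helper)."""
    return HTTPURL.fullmatch(url) is not None


for url in ("http://www.v2ex.com/?page=2", "http://www.v2ex.com?page=2"):
    print(url, is_rfc1738_httpurl(url))
```

Under this transcription the form with "/?" matches, because the one required hsegment may itself be zero-length, while the form with "?" directly after the host does not, because the search part only appears inside the optional "/" group. That applies only to the older rfc1738 grammar; the thread already notes that rfc3986 accepts both forms.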