How do I split a URL string into separate parts in Python?

Updated: 2023-02-23 08:22:43

The urlparse module in Python 2.x (urllib.parse in Python 3.x) is the way to do it. The session below uses the Python 3 import; in Python 2 the equivalent is "from urlparse import urlparse".

>>> from urllib.parse import urlparse
>>> url = 'http://example.com/random/folder/path.html'
>>> parse_object = urlparse(url)
>>> parse_object.netloc
'example.com'
>>> parse_object.path
'/random/folder/path.html'
>>> parse_object.scheme
'http'

If you want to do more work on the file path portion of the URL, you can use the posixpath module:

>>> from posixpath import basename, dirname
>>> basename(parse_object.path)
'path.html'
>>> dirname(parse_object.path)
'/random/folder'

After that, you can use posixpath.join to glue the parts back together.
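As a minimal sketch of that gluing step, the snippet below swaps the file name in a URL and reassembles the result. The helper name replace_filename and the file name 'other.html' are made up for illustration; urlunparse and the _replace method of the parse result are standard parts of urllib.parse in Python 3.

```python
from posixpath import dirname, join
from urllib.parse import urlparse, urlunparse


def replace_filename(url, new_name):
    """Return url with its last path segment replaced by new_name.

    Hypothetical helper, not part of the original answer.
    """
    parts = urlparse(url)
    # Keep the directory part of the path and attach the new file name.
    new_path = join(dirname(parts.path), new_name)
    # The parse result is a named tuple, so _replace returns a modified copy.
    return urlunparse(parts._replace(path=new_path))


rebuilt = replace_filename('http://example.com/random/folder/path.html',
                           'other.html')
print(rebuilt)  # http://example.com/random/folder/other.html
```

Note that posixpath.join discards everything before a component that starts with '/', so new_name should be a bare file name here.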

I totally forgot that Windows users would choke on the path separator in os.path, which is a backslash on that platform. I read the posixpath module docs, and they specifically mention URL manipulation, so posixpath is the right choice here regardless of operating system.
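To illustrate the separator issue, here is a small comparison that runs the same way on any platform. It imports ntpath directly, which is the module os.path resolves to on Windows, and posixpath, which always uses forward slashes. The variable names are only for illustration.

```python
import ntpath      # what os.path is on Windows
import posixpath   # always uses '/' as the separator

directory = '/random/folder'
filename = 'path.html'

# posixpath keeps the forward slash that URLs require.
posix_joined = posixpath.join(directory, filename)

# ntpath inserts a backslash, which is wrong for a URL path.
nt_joined = ntpath.join(directory, filename)

print(posix_joined)  # /random/folder/path.html
print(nt_joined)     # /random/folder\path.html
```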