Updated: 2023-11-25 16:04:04
Suppose I have the following links:
http://example.com/index.html
http://example.com/stack.zip
http://example.com/setup.exe
http://example.com/news/
The first and fourth links point to web pages; the second and third point to files.
.zip and .exe are only examples of file links; there may be many other file types.
Is there a standard way to distinguish a file URL from a web page link? Thanks in advance.
import urllib
import mimetypes

def guess_type_of(link, strict=True):
    link_type, _ = mimetypes.guess_type(link)
    if link_type is None and strict:
        u = urllib.urlopen(link)
        link_type = u.headers.gettype()  # or using: u.info().gettype()
    return link_type
Demo:
links = ['http://***.com/q/21515098/538284',  # It's an html page
         'http://upload.wikimedia.org/wikipedia/meta/6/6d/Wikipedia_wordmark_1x.png',  # It's a png file
         'http://commons.wikimedia.org/wiki/File:Typing_example.ogv',  # It's an html page
         'http://upload.wikimedia.org/wikipedia/commons/e/e6/Typing_example.ogv'  # It's an ogv file
         ]

for link in links:
    print(guess_type_of(link))
Output:
text/html
image/x-png
text/html
application/ogg
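
The code above targets Python 2, where `urllib.urlopen` and `headers.gettype()` exist. A minimal sketch of the same idea for Python 3 might look like the following. It assumes `urllib.request.urlopen` and `get_content_type()` as the replacements, and `is_web_page` is a hypothetical helper name that is not part of the original answer.

```python
import mimetypes
import urllib.request


def guess_type_of(link, strict=True):
    """Guess the MIME type of a URL (Python 3 sketch of the answer's function)."""
    # First try the file extension alone; this needs no network access.
    link_type, _ = mimetypes.guess_type(link)
    if link_type is None and strict:
        # No usable extension: ask the server for its Content-Type header.
        with urllib.request.urlopen(link) as response:
            link_type = response.headers.get_content_type()
    return link_type


def is_web_page(link):
    # Hypothetical helper: treat HTML types as pages, everything else as files.
    return guess_type_of(link) in ("text/html", "application/xhtml+xml")
```

Because the extension is checked first, a URL that ends in a known file extension is classified by that extension without contacting the server, even if the server would actually return an HTML page. The MIME type reported for a given extension can also vary with the Python version and the operating system's type tables, so results may not match the output shown above exactly.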