且构网

Getting the subdomain from a URL

Updated: 2023-02-26 10:15:08



Getting the subdomain from a URL sounds easy at first.

http://www.domain.example

Scan for the first period then return whatever came after the "http://" ...

Then you remember

http://super.duper.domain.example

Oh. So then you think, okay, find the last period, go back a word and get everything before!

Then you remember

http://super.duper.domain.co.uk

And you're back to square one. Anyone have any great ideas besides storing a list of all TLDs?
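To make the failure concrete, here is a minimal Python sketch of the "find the last period, go back a word" idea. The function name naive_subdomain is made up for illustration; it simply assumes the last two labels are always the registered domain, which is exactly the assumption that breaks on co.uk.

```python
from urllib.parse import urlsplit


def naive_subdomain(url: str) -> str:
    """Treat the last two labels as the domain and return everything before them.

    Hypothetical helper for illustration only: the two-label assumption
    is wrong for suffixes such as co.uk.
    """
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return ".".join(labels[:-2])


for url in (
    "http://www.domain.example",
    "http://super.duper.domain.example",
    "http://super.duper.domain.co.uk",
):
    print(url, "->", repr(naive_subdomain(url)))
```

The first two URLs come out as expected, but the third returns "super.duper.domain", because "co.uk" is mistaken for the registered domain.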

"Anyone have any great ideas besides storing a list of all TLDs?"

No, because each TLD differs on what counts as a subdomain, a second-level domain, and so on.

Keep in mind that there are top-level domains, second-level domains, and subdomains. Technically speaking, everything except the TLD is a subdomain.

In the domain.co.uk example, domain is a subdomain, co is a second-level domain, and uk is the TLD.

So the question is more complex than it looks at first blush, and the answer depends on how each TLD is managed. You'll need a database of all the TLDs that includes each one's particular partitioning: what counts as a second-level domain and what counts as a subdomain. There aren't too many TLDs, so the list is reasonably manageable, but collecting all that information isn't trivial. There may already be such a list available.

Looks like http://publicsuffix.org/ is one such list: all the common suffixes (.com, .co.uk, etc.) in a list suitable for searching. Parsing it still won't be easy, but at least you don't have to maintain the list yourself.
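As a rough idea of what reading the list involves, here is a small Python sketch. It assumes the published file format: one rule per line, comment lines starting with "//", and each line read only up to the first whitespace. The SAMPLE text is a hand-written excerpt in that format, not the real file, and parse_rules is a name invented for this example.

```python
def parse_rules(text: str) -> set:
    """Collect rules from text in the Public Suffix List file format."""
    rules = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("//"):
            continue
        # A rule is the text up to the first whitespace on the line.
        rules.add(line.split()[0].lower())
    return rules


# Hand-written excerpt for illustration; the real list is far longer.
SAMPLE = """\
// plain rules
com
uk
co.uk

// a wildcard rule and an exception rule
*.ck
!www.ck
"""

print(sorted(parse_rules(SAMPLE)))
```

Reading the lines is the easy part. The wildcard ("*.") and exception ("!") rules are what make matching a host name against the list less trivial.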

A "public suffix" is one under which Internet users can directly register names. Some examples of public suffixes are ".com", ".co.uk" and "pvt.k12.wy.us". The Public Suffix List is a list of all known public suffixes.

The Public Suffix List is an initiative of the Mozilla Foundation. It is available for use in any software, but was originally created to meet the needs of browser manufacturers. It allows browsers to, for example:

  • Avoid privacy-damaging "supercookies" being set for high-level domain name suffixes
  • Highlight the most important part of a domain name in the user interface
  • Accurately sort history entries by site

Looking through the list, you can see it's not a trivial problem. I think a list is the only correct way to accomplish this...
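For what it's worth, a list-based lookup could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: RULES is a tiny hand-picked subset rather than the real list, the names public_suffix_length and split_host are invented here, and IP addresses and internationalized names are not handled. The matching follows the list's documented idea: an exception rule wins, otherwise the longest matching rule wins, and a host matching no rule falls back to its last label.

```python
from urllib.parse import urlsplit

# Hypothetical subset of rules; load the real list in practice.
RULES = {"com", "uk", "co.uk", "*.ck", "!www.ck"}


def public_suffix_length(labels, rules):
    """Return how many trailing labels form the public suffix."""
    best = 1  # default rule "*": the last label alone
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        count = len(labels) - i
        if "!" + candidate in rules:
            # Exception rule: the suffix is the rule minus its leftmost label.
            return count - 1
        if candidate in rules:
            best = max(best, count)
        # A wildcard rule "*.<rest>" matches any single label before <rest>.
        if i + 1 < len(labels) and "*." + ".".join(labels[i + 1:]) in rules:
            best = max(best, count)
    return best


def split_host(url, rules=RULES):
    """Split a URL's host into (subdomain, domain, public suffix)."""
    host = (urlsplit(url).hostname or "").rstrip(".")
    labels = host.split(".")
    n = public_suffix_length(labels, rules)
    suffix = ".".join(labels[-n:])
    if len(labels) <= n:
        # The host is itself a public suffix (or empty).
        return "", "", suffix
    return ".".join(labels[:-n - 1]), labels[-n - 1], suffix


print(split_host("http://super.duper.domain.co.uk"))
print(split_host("http://www.domain.example"))
```

With this subset, the co.uk host splits into "super.duper", "domain" and "co.uk", while the .example host falls through to the default rule and treats "example" as the suffix.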

-Adam