Parsing a table with Nokogiri

Updated: 2023-12-05 08:59:52

Use:

td//text()[normalize-space()]

This selects every text node that is not whitespace-only and is a descendant of any td child of the current node (the tr already selected in your code).

Or, if you want to select all descendant text nodes, regardless of whether they are whitespace-only:

td//text()

UPDATE:

The OP has indicated in a comment that they are getting an unwanted td whose content is just &nbsp; (a non-breaking space).

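To also exclude text nodes that consist only of (one or more) nbsp characters, use:

td//text()[translate(normalize-space(), '&#160;', '')]

Here &#160; stands for the non-breaking space character itself (U+00A0). The character reference is only expanded when the expression is written inside an XML document such as an XSLT stylesheet. In a Ruby string passed to Nokogiri it is not expanded, so put the real character in the expression instead, for example with "\u00A0".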
