且构网

Extracting text with Jsoup

Updated: 2023-12-03 21:18:34

You need to use more elaborate CSS selectors. Maybe something like:

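import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// The class name is arbitrary; it only makes the snippet compilable.
public class PlayerUpdates {
  public static void main(String[] args) {
    // Groups: 1 = title, 2 = news, 3 = analysis.
    // \p{Zs} matches any Unicode space separator, including a no-break space.
    Pattern pat = Pattern.compile("(.*)News:\\p{Zs}(.*)Analysis:\\p{Zs}(.*)");
    Document doc = null;
    try {
      doc = Jsoup.connect("http://fantasynews.cbssports.com/fantasyfootball/players/updates/187741").userAgent("Mozilla").get();
    } catch (IOException e1) {
      e1.printStackTrace();
      System.exit(1); // non-zero exit status signals the failed fetch
    }

    // Select every <h3> nested inside a table.
    Elements titles = doc.select("table h3");
    for (Element title : titles) {
      // The parent cell holds the title, news, and analysis text together.
      Element td = title.parent();
      String innerTxt = td.text();
      Matcher mat = pat.matcher(innerTxt);
      if (mat.find()) {
        System.out.println("title = " + mat.group(1));
        System.out.println("news = " + mat.group(2));
        System.out.println("analysis = " + mat.group(3));
      }
    }
  }
}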
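
To see what the regular expression does without fetching the page, here is a minimal sketch that applies an equivalent pattern to a hard-coded string. The class name `UpdateParser`, the `parse` helper, and the sample text are all made up for illustration. The sample only assumes that the cell text has the shape "title, then News:, then Analysis:", which is what the pattern expects.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UpdateParser {
  // Equivalent to the pattern used in the answer:
  // group 1 = title, group 2 = news, group 3 = analysis.
  private static final Pattern PAT =
      Pattern.compile("(.*)News:\\p{Zs}(.*)Analysis:\\p{Zs}(.*)");

  /** Returns {title, news, analysis}, each trimmed, or null if the text does not match. */
  public static String[] parse(String cellText) {
    Matcher mat = PAT.matcher(cellText);
    if (!mat.find()) {
      return null;
    }
    return new String[] {
      mat.group(1).trim(), mat.group(2).trim(), mat.group(3).trim()
    };
  }

  public static void main(String[] args) {
    // Hypothetical stand-in for td.text(); this is not real page content.
    String sample =
        "Player A questionable News: Player A missed practice. Analysis: Monitor his status.";
    String[] parts = parse(sample);
    if (parts != null) {
      System.out.println("title = " + parts[0]);
      System.out.println("news = " + parts[1]);
      System.out.println("analysis = " + parts[2]);
    }
  }
}
```

Because each `(.*)` is greedy, a cell that contains "News:" or "Analysis:" more than once will split at the last occurrence, so it is worth checking the output against a few real cells.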

I suggest you look into CSS selectors and the Jsoup documentation.