且构网

Android - Eclipse: how to get the text content of a WebView as a string

Updated: 2023-11-07 17:37:52

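To get the text content from HTML, use XmlPullParser. Because XmlPullParser is an XML parser, this approach works reliably only when the HTML is well-formed markup (every tag closed, attributes quoted); otherwise parsing is likely to fail with an XmlPullParserException.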

References:

Google developer documentation

XML parsing tutorial

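import java.io.IOException;
import java.io.StringReader;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

// Returns the concatenated text nodes of a well-formed HTML/XML string.
private String readText(String htmlStr) throws IOException, XmlPullParserException {
    StringBuilder text = new StringBuilder();
    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(true);
    XmlPullParser xpp = factory.newPullParser();

    // Pass the variable itself, not the string literal "htmlStr".
    xpp.setInput(new StringReader(htmlStr));
    int eventType = xpp.getEventType();
    while (eventType != XmlPullParser.END_DOCUMENT) {
        if (eventType == XmlPullParser.TEXT) {
            // Append each text node instead of overwriting the previous one.
            text.append(xpp.getText());
        }
        eventType = xpp.next();
    }
    return text.toString();
}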
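
To try the same pull-parsing loop outside an Android project, where the org.xmlpull classes may not be on the classpath, the sketch below uses the JDK's StAX API (javax.xml.stream) instead. This is a swapped-in parser, not XmlPullParser, and the class name TextExtractor and its readText helper are hypothetical names chosen for illustration. Like the method above, it assumes the input is well-formed markup.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class TextExtractor {

    // Hypothetical helper: concatenates all text nodes of well-formed markup.
    public static String readText(String markup) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(markup));
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                // CHARACTERS and CDATA play the role of XmlPullParser.TEXT here.
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String html = "<html><body><p>Hello</p><p>World</p></body></html>";
        System.out.println(readText(html)); // prints "HelloWorld"
    }
}
```

Text nodes are joined without a separator, so adjacent paragraphs run together; append a space or newline after each text node if the output needs to stay readable.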