How would I parse an XML file in R and run basic statistical analysis on the data?

Updated: 2021-11-20 09:04:59

z <- strptime ("HH:MM:SS.ms, "%H:%m:%S.%f")

You are missing a closing " after HH:MM:SS.ms, so this is invalid syntax.

Next, the data is non-standard: a timestamp would normally use a dot between seconds and subseconds, i.e. 12:23:34.567. In that form the milliseconds can be parsed with %OS:

> ts <- "12:00:00.050"
> strptime(ts, "%H:%M:%OS")
[1] "2010-07-09 12:00:00 CDT"
> 

So you not only need to get the data out of the XML first, you also need to convert the timestamp string before parsing it. Alternatively, you can split the string yourself and fill a POSIXlt time structure 'by hand'.
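
The string conversion mentioned above could look like the following minimal sketch. It assumes the digits after the last colon are a decimal fraction of a second (so :025 means .025 s), which is one plausible reading of the data and not something the file itself confirms. The variable names are made up for illustration.

```r
# Timestamps in the file's non-standard form (colon before the subseconds).
ts <- c("12:00:00:01", "12:00:00:025", "12:00:00:050")

# Replace the last colon with a dot so that %OS can read the fraction.
# Assumption: the trailing digits are a decimal fraction of a second.
fixed <- sub(":([0-9]+)$", ".\\1", ts)

# No date is given, so strptime() fills in the current date.
# tz = "UTC" is an arbitrary choice for this sketch.
parsed <- strptime(fixed, "%H:%M:%OS", tz = "UTC")

# The fractional seconds are kept in the POSIXlt 'sec' field.
print(fixed)
print(parsed$sec)
```

If the trailing digits are instead meant as a millisecond count (so :01 would be 1 ms), the fraction would have to be zero-padded to three digits before the colon is replaced.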

Postscript: I forgot to mention that you need to enable printing of sub-second times:

> options("digits.secs"=3)         # shows milliseconds (three digits)
> strptime(ts, "%H:%M:%OS")
[1] "2010-07-09 12:00:00.05 CDT"   # suppresses trailing zero
> 

Postscript 2: You are also in luck with respect to your file, thanks to the XML package, which reads it straight into a data frame:

> library(XML)
> xmlToDataFrame("c:/Temp/foo.xml")     # save your data as c:/Temp/foo.xml
      timeStamp   Price
1   12:00:00:01   25.02
2   12:00:00:02      15
3  12:00:00:025   15.02
4  12:00:00:031   18.25
5  12:00:00:039   18.54
6  12:00:00:050   16.52
7   12:00:01:01   17.50
>
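
For the basic statistics part of the question, a minimal base-R sketch follows. To stay self-contained it rebuilds the data frame printed above by hand instead of reading c:/Temp/foo.xml, and it assumes the columns arrive as text (character or factor), which is why Price is converted explicitly. The names df and price_stats are hypothetical.

```r
# Stand-in for the result of xmlToDataFrame(): both columns held as text.
df <- data.frame(
  timeStamp = c("12:00:00:01", "12:00:00:02", "12:00:00:025", "12:00:00:031",
                "12:00:00:039", "12:00:00:050", "12:00:01:01"),
  Price     = c("25.02", "15", "15.02", "18.25", "18.54", "16.52", "17.50"),
  stringsAsFactors = FALSE
)

# as.character() first, so the conversion is also safe if Price is a factor.
df$Price <- as.numeric(as.character(df$Price))

# A few basic descriptive statistics on the price column.
price_stats <- c(
  n      = length(df$Price),
  mean   = mean(df$Price),
  median = median(df$Price),
  sd     = sd(df$Price),
  min    = min(df$Price),
  max    = max(df$Price)
)

print(summary(df$Price))
print(price_stats)
```

From here the timeStamp column can be converted as well, after which the prices can be ordered or grouped by time.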