Apply a function to every column in a data frame, respecting each column's existing data type

Updated: 2022-12-12 08:30:47

If it were an "ordered factor", things would be different. That is not to say I like ordered factors (I don't), only that some relations are defined for ordered factors that are not defined for plain factors. Factors are treated as ordinary categorical variables. What you are seeing is the natural sort order of the factor levels, which is alphabetical (lexical) order for your locale. If you want automatic coercion to "numeric" for every column, dates and factors and all, then try:

sapply(df, function(x) max(as.numeric(x)) )   # not generally a useful result
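To see why that result is rarely useful for factor columns, here is a minimal sketch. The factor `f` is a made-up example, not data from the question. Calling as.numeric() directly on a factor returns the internal integer codes (the position of each value among the sorted levels), not the numbers its labels spell out:

```r
# Hypothetical factor whose labels happen to look like numbers
f <- factor(c("10", "2", "33"))

levels(f)                                   # levels are sorted as text
codes  <- as.numeric(f)                     # internal integer codes
values <- as.numeric(as.character(f))       # numbers parsed from the labels

max(codes)    # largest code, i.e. the number of levels in use here
max(values)   # largest label value
```

Going through as.character() first is what recovers the label values.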

Or, if you want to test for factors first and get back the values you expect:

sapply(df, function(x) {
    if (is.factor(x)) {
        # convert the labels, not the internal integer codes
        max(as.numeric(as.character(x)))
    } else {
        max(x)
    }
})
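As a quick usage sketch of that factor test, assuming a small made-up data frame (`df_demo` and its columns are hypothetical, not from the question):

```r
# Hypothetical data: one numeric column, one factor with number-like labels
df_demo <- data.frame(
  a = c(1.5, 3, 2),
  b = factor(c("10", "2", "33"))
)

col_max <- sapply(df_demo, function(x) {
  if (is.factor(x)) {
    max(as.numeric(as.character(x)))  # parse labels, not codes
  } else {
    max(x)
  }
})

col_max  # named numeric vector, one maximum per column
```

Note that this only makes sense when the factor labels really are numbers; labels that cannot be parsed become NA, with a warning.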

@Darrens comment does work better:

 sapply(df, function(x) max(as.character(x)) )  

max does succeed with character vectors.
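A short sketch of how max behaves across these types, using made-up values. One caveat worth keeping in mind: once values are converted to character, they are compared as text, so number-like strings do not sort by numeric value.

```r
# Character vector: compared lexically
top_word <- max(c("apple", "pear", "fig"))

# Numbers converted to text are compared as text, not by value
num_as_text <- max(as.character(c(9, 10)))

# Ordered factor: max() follows the declared level order
sizes <- factor(c("small", "large", "medium"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)
largest <- as.character(max(sizes))

# Plain (unordered) factor: max() signals an error, caught here
plain_result <- tryCatch(max(factor(c("b", "a"))),
                         error = function(e) "error")
```

This is the sense in which some relations are defined for ordered factors but not for plain ones.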