Controlling the order of facet_grid/facet_wrap in ggplot2?

Updated: 2023-11-26 12:43:28

I don't think I can really satisfy your "without making a new data frame" requirement, but you can create the new data frame on the fly:

library(ggplot2)
## reorder the factor levels on the fly; the facets follow the level order
ggplot(transform(iris,
      Species=factor(Species,levels=c("virginica","setosa","versicolor")))) + 
    geom_histogram(aes(Petal.Width))+ facet_grid(Species~.)

or, in tidyverse idiom:

library(dplyr)
library(ggplot2)
iris %>%
   mutate(Species = factor(Species, levels=c("virginica","setosa","versicolor"))) %>%
ggplot() + 
   geom_histogram(aes(Petal.Width))+ 
   facet_grid(Species~.)

I agree it would be nice if there were another way to control this, but ggplot is already a pretty powerful (and complicated) engine ...

Note that the order of (1) the rows in the data set is independent of the order of (2) the levels of the factor. #2 is what factor(...,levels=...) changes, and what ggplot looks at to determine the order of the facets. Doing #1 (sorting the rows of the data frame in a specified order) is an interesting challenge. I think I would actually achieve this by doing #2 first, and then using order() or arrange() to sort according to the numeric values of the factor:

neworder <- c("virginica","setosa","versicolor")
library(plyr)  ## or dplyr (transform -> mutate)
iris2 <- arrange(transform(iris,
             Species=factor(Species,levels=neworder)),Species)
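
For anyone who wants to check the two steps separately, here is a minimal base-R sketch (no plyr or dplyr). The names iris_lv and iris_sorted are made up for illustration; the idea is only to show that releveling leaves the rows alone, and that order() on a factor then sorts by the new level order.

```r
## Base R only; iris ships with R. iris_lv and iris_sorted are illustrative names.
neworder <- c("virginica", "setosa", "versicolor")

## Step 1 (#2 in the text): change the level order; the rows stay where they were.
iris_lv <- transform(iris, Species = factor(Species, levels = neworder))
levels(iris_lv$Species)                      # level order is now neworder
identical(as.character(iris_lv$Species),
          as.character(iris$Species))        # row order is untouched

## Step 2 (#1 in the text): order() on a factor sorts by its integer codes,
## i.e. by level order, so the rows now follow neworder.
iris_sorted <- iris_lv[order(iris_lv$Species), ]
unique(as.character(iris_sorted$Species))
```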

I can't immediately see a quick way to do this without changing the order of the factor levels (you could do it and then reset the order of the factor levels accordingly).
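
One way to read the parenthetical, as a rough sketch rather than a recommended recipe: relevel, sort, then put the original level order back. The names orig_levels and iris3 are hypothetical, and this is base R only.

```r
## Illustrative names: orig_levels, iris3. Base R only.
neworder    <- c("virginica", "setosa", "versicolor")
orig_levels <- levels(iris$Species)          # remember the original level order

## Relevel, then sort the rows by the (temporarily) reordered factor.
iris3 <- transform(iris, Species = factor(Species, levels = neworder))
iris3 <- iris3[order(iris3$Species), ]

## Reset the levels: the rows keep their new order, the levels go back.
iris3$Species <- factor(iris3$Species, levels = orig_levels)
```

After this, the rows of iris3 run virginica, setosa, versicolor, but anything that looks at the levels (including ggplot facets) will see the original order again.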

In general, functions in R that depend on the order of levels of a categorical variable are based on factor level order, not the order of the rows in the dataset: the answer above applies more generally.
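
A small illustration of that general point, using base R's table() as an example of a function that follows level order. The vectors f and g are made up for the demonstration.

```r
## f and g are illustrative; they hold the same values with different level orders.
f <- factor(c("b", "c", "a", "b"))            # default levels are sorted: a, b, c
g <- factor(f, levels = c("c", "a", "b"))     # same data, levels reordered

names(table(f))   # follows levels(f)
names(table(g))   # follows levels(g), not the order of appearance in the data
```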