且构网

Most concise way to label/annotate extreme values with ggplot?

Updated: 2023-01-07 13:56:33

Update: scale_size_area() is used in place of scale_area()

You might be able to take something from this to suit your needs.

library(ggplot2)

#Some data
df <- data.frame(x = round(runif(100), 2), y = round(runif(100), 2))

m1 <- lm(y ~ x, data = df)
df.fortified = fortify(m1)

names(df.fortified)   # Names of the variables containing residuals and derived quantities

# Select extreme values
df.fortified$extreme = ifelse(abs(df.fortified$`.stdresid`) > 1.5, 1, 0)

# Based on examples on page 173 in Wickham's ggplot2 book
plot = ggplot(data = df.fortified, aes(x = x, y = .stdresid)) +
 geom_point() +
 geom_text(data = df.fortified[df.fortified$extreme == 1, ], 
   aes(label = x, x = x, y = .stdresid), size = 3, hjust = -.3)
plot

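# Residuals against fitted values with a smoother
plot1 = ggplot(data = df.fortified, aes(x = .fitted, y = .resid)) +
   geom_point() + geom_smooth(se = FALSE)

# Same plot, with point area scaled by Cook's distance; the smoother is kept out of the legend
plot2 = ggplot(data = df.fortified, aes(x = .fitted, y = .resid, size = .cooksd)) +
   geom_point() + scale_size_area("Cook's distance") + geom_smooth(se = FALSE, show.legend = FALSE)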

library(gridExtra)
grid.arrange(plot1, plot2)
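
If the intermediate 0/1 column feels like extra bookkeeping, a somewhat shorter variant is sketched below. It is only one possible approach, not part of the answer above. It assumes that rstandard() from base R is an acceptable source for the standardized residuals, and that the text layer is given a function as its data argument so that it filters the plot data itself. The names stdresid, cutoff and p, and the fixed seed, are introduced here purely for illustration.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(x = round(runif(100), 2), y = round(runif(100), 2))
m1 <- lm(y ~ x, data = df)

# Standardized residuals straight from base R
df$stdresid <- rstandard(m1)

cutoff <- 1.5  # hypothetical name; same threshold as in the code above

# A function passed as `data` receives the plot data and returns the subset
# that this layer should draw, so only the extreme points get a label
p <- ggplot(df, aes(x = x, y = stdresid)) +
  geom_point() +
  geom_text(data = function(d) d[abs(d$stdresid) > cutoff, ],
            aes(label = x), size = 3, hjust = -0.3)
p
```

Changing cutoff is then the only adjustment needed to label more or fewer points.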
