


Updated: 2022-12-06 10:19:41


Your sample contains all 28 countries for 2008 but only a single row for 2009, so I will generate random values for the remaining years. You do not need to do this, since your data is complete; I am only creating data that resembles yours:

mydata_struct = structure( list( Year = c( 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L,
     2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L,
     2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2009L ),
     Country = structure( c( 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
     15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 1L ),
     .Label = c( "Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czechia", "Denmark",
     "Estonia", "Finland", "France", "Germany", "Greece", "Hungary", "Ireland", "Italy",
     "Latvia", "Lithuania", "Luxembourg", "Malta", "Netherlands", "Poland", "Portugal",
     "Romania", "Slovakia", "Slovenia", "Spain", "Sweden", "United Kingdom" ),
     class = "factor" ), Prosperity.Index = c( 79.4, 76.1, 62, 65.1, 69.9, 70.9, 83.2, 73.5,
     81.2, 75.9, 79.9, 66, 66.7, 78.9, 69.6, 67.7, 66.6, 79.9, 73.4, 81.2, 66.9, 71, 62.6,
     68.2, 72.7, 72.6, 82.8, 78, 79.4 ) ), row.names = c(NA, 29L), class = "data.frame" )


Now we build a data frame, Prosperity, that covers 2008 to 2017 by repeating the 28 first-year values for every year:

names <- rep(mydata_struct$Country[1:28], 10)
years <- rep(2008:2017, each=28)
prosp <- rep(mydata_struct$Prosperity.Index[1:28], 10)
Prosperity <- data.frame(Country=names, Year=years, PI=prosp)


Now we add random noise to every value, plus a trend of roughly one point per year toward increasing prosperity:

Prosperity$PI <- rnorm(280, prosp, rnorm(280, 2, .25)) + (years - years[1]) * rnorm(280, 1, .25)
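
The simulation line above packs two steps into one expression. As a rough sketch, it can be split into a noise term and a trend term. The three base values are taken from the sample data; the seed and the names base, noise_sd, noise and trend are my own and only serve the illustration:

```r
# Sketch only: the same kind of simulation on three countries, split into parts.
set.seed(1)                               # assumed seed, only for reproducibility
base  <- c(79.4, 76.1, 62)                # three 2008 values from the sample data
years <- rep(2008:2017, each = length(base))
prosp <- rep(base, 10)
n     <- length(prosp)

noise_sd <- rnorm(n, 2, .25)              # each row gets its own sd, about 2
noise    <- rnorm(n, 0, noise_sd)         # zero-mean noise around the base value
trend    <- (years - years[1]) * rnorm(n, 1, .25)  # about +1 point per year since 2008

PI <- prosp + noise + trend
```

Drawing rnorm(n, prosp, noise_sd) directly, as in the one-liner, follows the same distribution as prosp + noise here. The trend is exactly zero for 2008, so the first year only receives noise.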


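Here is where you start with your actual data. First we can get some statistics (the values shown come from one random run, so they will differ each time the simulation is repeated):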

with(Prosperity, tapply(PI, Year, mean))   # Mean PI for each year
#  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017 
# 73.02 73.74 74.50 76.50 76.13 77.95 78.33 79.55 80.85 81.71 
with(Prosperity, tapply(PI, Year, sd))     # Standard deviation for PI for each year
#  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017 
# 6.861 6.422 6.840 6.935 6.582 6.592 8.331 6.777 7.489 8.044 
with(Prosperity, tapply(PI, Year, max) - tapply(PI, Year, min))  # Range in PI for each year
#  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017 
# 28.24 23.24 22.20 24.83 23.99 23.26 27.97 22.77 27.78 30.66 
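
If you would rather have the three statistics side by side, one possible approach is to split PI by Year and summarise each piece. This is only a sketch: the small data frame below is a made-up stand-in so that the block runs on its own, and stats is a name I chose:

```r
# Sketch: mean, sd and range per year in a single table.
set.seed(1)                                   # assumed seed for the stand-in data
Prosperity <- data.frame(
  Country = rep(c("A", "B", "C"), 3),         # hypothetical countries
  Year    = rep(2008:2010, each = 3),
  PI      = rnorm(9, 70, 5)                   # hypothetical index values
)

stats <- t(sapply(split(Prosperity$PI, Prosperity$Year), function(x) {
  c(mean = mean(x), sd = sd(x), range = diff(range(x)))
}))
round(stats, 2)                               # one row per year, one column per statistic
```

With your real data you would drop the stand-in data frame and run only the last two statements.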


plot(PI~Year, Prosperity)   # Plot all values
boxplot(PI~Year, Prosperity)   # Boxplots
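
To make the trend easier to see, you could also draw the yearly means on top of the scatterplot. Again a sketch on made-up stand-in data, with yearly_mean as an assumed name:

```r
# Sketch: scatterplot of all values with the yearly mean drawn as a line.
set.seed(1)                                   # assumed seed for the stand-in data
Prosperity <- data.frame(
  Year = rep(2008:2010, each = 3),
  PI   = rnorm(9, 70, 5)                      # hypothetical index values
)

yearly_mean <- with(Prosperity, tapply(PI, Year, mean))
plot(PI ~ Year, Prosperity)                                   # all values
lines(as.numeric(names(yearly_mean)), yearly_mean, lwd = 2)   # mean per year
```

tapply() returns the years as names, so they are converted back to numbers before being used as x coordinates.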