且构网


How do I calculate the distance from a state in a cluster to the cluster center?

Updated: 2022-12-06 10:19:41

We can generate random values for the remaining years. You do not need to do this, since your data is complete; I am only creating data that resembles yours:

mydata_struct = structure( list( Year = c( 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L,
     2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L,
     2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2008L, 2009L ),
     Country = structure( c( 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
     15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 1L ),
     .Label = c( "Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czechia", "Denmark",
     "Estonia", "Finland", "France", "Germany", "Greece", "Hungary", "Ireland", "Italy",
     "Latvia", "Lithuania", "Luxembourg", "Malta", "Netherlands", "Poland", "Portugal",
     "Romania", "Slovakia", "Slovenia", "Spain", "Sweden", "United Kingdom" ),
     class = "factor" ), Prosperity.Index = c( 79.4, 76.1, 62, 65.1, 69.9, 70.9, 83.2, 73.5,
     81.2, 75.9, 79.9, 66, 66.7, 78.9, 69.6, 67.7, 66.6, 79.9, 73.4, 81.2, 66.9, 71, 62.6,
     68.2, 72.7, 72.6, 82.8, 78, 79.4 ) ), row.names = c(NA, 29L), class = "data.frame" )

Now we build a data frame, Prosperity, that covers all ten years by copying the first year's data:

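# Repeat the 28 countries and their 2008 values once for each of the 10 years.
# (countries is used instead of names to avoid masking base::names.)
countries <- rep(mydata_struct$Country[1:28], 10)
years <- rep(2008:2017, each=28)
prosp <- rep(mydata_struct$Prosperity.Index[1:28], 10)
Prosperity <- data.frame(Country=countries, Year=years, PI=prosp)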

Now we add random noise to every value, plus a trend toward increasing prosperity in the later years:

set.seed(42)
Prosperity$PI <- rnorm(280, prosp, rnorm(280, 2, .25)) + (years - years[1]) * rnorm(280, 1, .25)
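
As a rough sanity check on that recipe, the sketch below applies the same formula to a hypothetical flat baseline of 70 (an assumption, not the values above) and fits a straight line. The names base, pi_sim, and slope are made up for illustration. The fitted slope should come out near 1, the mean of the per-year increment:

```r
# Hypothetical toy setup: 28 units with an identical baseline, 10 years each
set.seed(1)
base  <- rep(70, 28 * 10)                  # assumed flat baseline of 70
years <- rep(2008:2017, each = 28)

# Same recipe as above: noisy baseline plus a noisy per-year increment
pi_sim <- rnorm(280, base, rnorm(280, 2, .25)) +
  (years - years[1]) * rnorm(280, 1, .25)

# The fitted slope estimates the average yearly gain in the simulated index
fit   <- lm(pi_sim ~ years)
slope <- unname(coef(fit)["years"])
print(slope)
```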

This is where you would start with your actual data. First we can get some statistics:

options(digits=4)
with(Prosperity, tapply(PI, Year, mean))   # Mean PI for each year
#  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017 
# 73.02 73.74 74.50 76.50 76.13 77.95 78.33 79.55 80.85 81.71 
with(Prosperity, tapply(PI, Year, sd))     # Standard deviation for PI for each year
#  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017 
# 6.861 6.422 6.840 6.935 6.582 6.592 8.331 6.777 7.489 8.044 
with(Prosperity, tapply(PI, Year, max) - tapply(PI, Year, min))  # Range in PI for each year
#  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017 
# 28.24 23.24 22.20 24.83 23.99 23.26 27.97 22.77 27.78 30.66 
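
If one table is more convenient than three separate calls, the same per-year summaries can be collected in a single pass. This is a minimal sketch on a hypothetical toy data frame (toy, with made-up values) that has the same columns as Prosperity:

```r
# Hypothetical toy data frame with the same columns as Prosperity
toy <- data.frame(
  Country = rep(c("A", "B", "C"), times = 2),
  Year    = rep(c(2008L, 2009L), each = 3),
  PI      = c(70, 75, 80, 72, 78, 84)
)

# One row per year holding the mean, standard deviation, and range of PI
stats <- do.call(rbind, lapply(split(toy$PI, toy$Year), function(x) {
  c(mean = mean(x), sd = sd(x), range = diff(range(x)))
}))
print(stats)
```

Replacing toy with Prosperity should give the same numbers as the three tapply calls above, arranged as one matrix.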

Finally, some plots:

plot(PI~Year, Prosperity)   # Plot all values
boxplot(PI~Year, Prosperity)   # Boxplots
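
To tie this back to the question in the title, one possible way to get each country's distance to its cluster center is sketched below. It assumes k-means clustering on PI for a single year with k = 3. The choice of method, the value of k, and the PI values are all assumptions for illustration, not taken from the data above:

```r
# Hypothetical PI values for a single year (made up for illustration)
pi_year <- c(A = 62, B = 63, C = 70, D = 71, E = 82, F = 83)
x <- matrix(pi_year, ncol = 1)

set.seed(42)
# k = 3 is an assumption; nstart reruns k-means from several random starts
km <- kmeans(x, centers = 3, nstart = 10)

# Look up each observation's own cluster centre, then take the Euclidean distance
centre <- km$centers[km$cluster, , drop = FALSE]
dist_to_centre <- sqrt(rowSums((x - centre)^2))
names(dist_to_centre) <- names(pi_year)

print(data.frame(PI = pi_year, cluster = km$cluster, distance = dist_to_centre))
```

The same indexing works with more than one clustering variable, because rowSums adds the squared differences across all columns.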