K Means Swiss Data

A Study of Cluster Means in Swiss Data

Rashmikant Dave

K Means Swiss Data

-This app calculates the K Means cluster centers for the Swiss dataset

-There are 2 tabs in this app: Documentation and K Mean Plot

-The app is set to default values initially. Once the number of clusters is chosen, the data is divided into that number of clusters

-See the K Mean Plot tab for the plotted result

-The least-squares mean (center) of each cluster is marked by an X within that cluster
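The center that the X marks is the per-column average of the points assigned to that cluster. The following is a minimal sketch of that check, assuming the Fertility and Agriculture columns and 3 clusters used later in this document; the seed value and the name manualCenters are illustrative choices, not part of the app.

```r
library(datasets)

# Assumption: same two columns and k = 3 as in the example run of the app.
selectedData <- swiss[, c("Fertility", "Agriculture")]

set.seed(42)  # arbitrary seed, only so that the run is repeatable
clusters <- kmeans(selectedData, centers = 3)

# Average each column within each cluster label.
manualCenters <- aggregate(selectedData,
                           by = list(cluster = clusters$cluster),
                           FUN = mean)

# Compare the hand-computed means with the centers reported by kmeans().
print(manualCenters)
print(clusters$centers)
```

The two printed tables should agree up to rounding, since kmeans() reports each center as the mean of its assigned points.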

The formula to calculate K Mean is:
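Assuming the standard k-means objective, which the symbol definitions below describe, the formula can be written as follows. The label J(V) for the objective is introduced here for illustration, and the indices follow the definitions as given.

$$
J(V) = \sum_{i=1}^{c} \sum_{j=1}^{c_i} \left( \lVert x_i - v_j \rVert \right)^2
$$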

Here, '||xi - vj||' is the Euclidean distance between xi and vj, 'ci' is the number of data points in the ith cluster, and 'c' is the number of cluster centers.

Refer to the following Wikipedia link to learn more about K Means clustering:

https://en.wikipedia.org/wiki/K-means_clustering

Input data to the kmeans function

##              Fertility Agriculture
## Courtelary        80.2        17.0
## Delemont          83.1        45.1
## Franches-Mnt      92.5        39.7
## Moutier           85.8        36.5
## Neuveville        76.9        43.5
## Porrentruy        76.1        35.3

Cluster Calculations

##   Fertility Agriculture
## 1  73.25000    70.37273
## 2  59.67000    16.80000
## 3  72.56667    44.32000
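Because kmeans() picks its starting centers at random, the cluster numbering and sometimes the centers themselves can differ between runs. One way to judge a given run is the within-cluster sum of squared Euclidean distances, which k-means tries to minimize. The sketch below computes that quantity by hand and compares it with the value kmeans() reports; the seed and the names assignedCenters, squaredDistances, and objective are illustrative assumptions.

```r
library(datasets)

selectedData <- swiss[, c("Fertility", "Agriculture")]

set.seed(42)  # arbitrary seed, only so that the run is repeatable
clusters <- kmeans(selectedData, centers = 3)

# Look up the center assigned to each row (one row per data point).
assignedCenters <- clusters$centers[clusters$cluster, ]

# Squared Euclidean distance from each point to its own cluster center.
squaredDistances <- rowSums((as.matrix(selectedData) - assignedCenters)^2)

# Summing over all points gives the within-cluster sum of squares.
objective <- sum(squaredDistances)

print(objective)
print(clusters$tot.withinss)  # the same quantity as reported by kmeans()
```

A lower value means the points sit closer to their centers, so runs with different seeds can be compared on this number.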

Code that runs when Fertility, Agriculture, and 3 clusters are selected

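library(datasets)
# Keep only the two selected columns of the swiss data set
selectedData <- swiss[, c('Fertility', 'Agriculture')]
# Run k-means with 3 clusters; the starting centers are chosen at random
clusters <- kmeans(selectedData, 3)
# Colors used to tell the clusters apart
palette(c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3",
          "#FF7F00", "#FFFF33", "#A65628", "#F781BF", "#999999"))
par(mar = c(5.1, 4.1, 0, 1))
# Plot the data points, colored by cluster membership
plot(selectedData,
     col = clusters$cluster,
     pch = 20, cex = 3)
# Mark each cluster center with an X
points(clusters$centers, pch = 4, cex = 4, lwd = 4)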

Plot

Conclusion: The calculations and plot for a test run with Fertility, Agriculture, and 3 clusters show the same result as the application.