German Parliament Election Data — election • samplingbook

Data frame with the number of citizens eligible to vote and the results of the 2002 and 2005 elections to the German Bundestag, the first chamber of the German parliament.

data(election)

Format

A data frame with 299 observations (corresponding to constituencies) on the following 13 variables.

state: factor, the 16 German federal states
eligible_02: a numeric vector, number of citizens eligible to vote in 2002
SPD_02: a numeric vector, proportion of second votes for the Social Democrats (SPD) in 2002
UNION_02: a numeric vector, proportion of second votes for the conservative Christian Democrats (CDU/CSU) in 2002
GREEN_02: a numeric vector, proportion of second votes for the Greens in 2002
FDP_02: a numeric vector, proportion of second votes for the Liberal Party (FDP) in 2002
LEFT_02: a numeric vector, proportion of second votes for the Left Party (PDS) in 2002
eligible_05: a numeric vector, number of citizens eligible to vote in 2005
SPD_05: a numeric vector, proportion of second votes for the Social Democrats (SPD) in 2005
UNION_05: a numeric vector, proportion of second votes for the conservative Christian Democrats (CDU/CSU) in 2005
GREEN_05: a numeric vector, proportion of second votes for the Greens in 2005
FDP_05: a numeric vector, proportion of second votes for the Liberal Party (FDP) in 2005
LEFT_05: a numeric vector, proportion of second votes for the Left Party in 2005

Details

German Federal Elections

Half of the Members of the German Bundestag are elected directly from Germany's 299 constituencies, the other half via the parties' Land (state) lists. Accordingly, each voter has two votes in the elections to the German Bundestag. The first vote, which lets voters elect their local representative, decides which candidates are sent to Parliament from the constituencies. The second vote is cast for a party list, and it is this second vote that determines the relative strengths of the parties represented in the Bundestag. At least 598 Members of the German Bundestag are elected in this way. In addition, under certain circumstances some candidates win what are known as 'overhang mandates' when the seats are distributed.

The data set provides each party's share of second votes, which determines the number of seats the party gets in parliament. Each share is the number of votes for a party divided by the number of valid votes, so it is stored as a proportion between 0 and 1 rather than as a percentage.
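
As a minimal sketch of this calculation, the following base R snippet turns vote counts for a single constituency into shares of the valid votes. The counts and the `other` category are hypothetical and are not taken from the data set.

```r
# Hypothetical second-vote counts for one constituency (illustrative only)
votes <- c(SPD = 57000, UNION = 54000, GREEN = 13500,
           FDP = 12000, LEFT = 7500, other = 6000)

# Number of valid votes is the total over all parties
valid_votes <- sum(votes)

# Share of each party: votes for the party divided by valid votes
shares <- votes / valid_votes
shares
```

The resulting values lie between 0 and 1 and sum to 1, which is the scale used by columns such as SPD_02.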

Source

The data is provided by the R package flexclust.

References

Kauermann, Goeran/Kuechenhoff, Helmut (2010): Stichproben. Methoden und praktische Umsetzung mit R. Springer.

Homepage of the Bundestag: http://www.bundestag.de.

Friedrich Leisch. A Toolbox for K-Centroids Cluster Analysis. Computational Statistics and Data Analysis, 51 (2), 526-544, 2006.

Examples

library(samplingbook)
data(election)
summary(election)
#>                  state     eligible_02         SPD_02          UNION_02     
#>  Nordrhein-Westfalen:64   Min.   :152670   Min.   :0.1785   Min.   :0.1285  
#>  Bayern             :45   1st Qu.:190152   1st Qu.:0.3337   1st Qu.:0.3075  
#>  Baden-Wuerttemberg :37   Median :206148   Median :0.3845   Median :0.3628  
#>  Niedersachsen      :29   Mean   :205461   Mean   :0.3861   Mean   :0.3831  
#>  Hessen             :21   3rd Qu.:219907   3rd Qu.:0.4459   3rd Qu.:0.4342  
#>  Sachsen            :17   Max.   :249388   Max.   :0.6171   Max.   :0.7282  
#>  (Other)            :86                                                     
#>     GREEN_02           FDP_02           LEFT_02          eligible_05    
#>  Min.   :0.02251   Min.   :0.02489   Min.   :0.003352   Min.   :154154  
#>  1st Qu.:0.05734   1st Qu.:0.06001   1st Qu.:0.008415   1st Qu.:191819  
#>  Median :0.07638   Median :0.07428   Median :0.011201   Median :206345  
#>  Mean   :0.08484   Mean   :0.07342   Mean   :0.041898   Mean   :206924  
#>  3rd Qu.:0.10650   3rd Qu.:0.08601   3rd Qu.:0.019522   3rd Qu.:220944  
#>  Max.   :0.25029   Max.   :0.12420   Max.   :0.293128   Max.   :254100  
#>                                                                         
#>      SPD_05          UNION_05         GREEN_05           FDP_05       
#>  Min.   :0.1885   Min.   :0.1104   Min.   :0.02619   Min.   :0.04565  
#>  1st Qu.:0.2959   1st Qu.:0.2881   1st Qu.:0.05664   1st Qu.:0.08095  
#>  Median :0.3361   Median :0.3432   Median :0.07195   Median :0.09679  
#>  Mean   :0.3427   Mean   :0.3507   Mean   :0.08060   Mean   :0.09769  
#>  3rd Qu.:0.3883   3rd Qu.:0.4045   3rd Qu.:0.09818   3rd Qu.:0.11241  
#>  Max.   :0.5586   Max.   :0.6048   Max.   :0.22769   Max.   :0.16630  
#>                                                                       
#>     LEFT_05       
#>  Min.   :0.02275  
#>  1st Qu.:0.03866  
#>  Median :0.04888  
#>  Mean   :0.08870  
#>  3rd Qu.:0.07176  
#>  Max.   :0.35536  
#>                   

# 1) Draw a simple random sample of size n = 20
n <- 20
set.seed(67396)
index <- sample(1:nrow(election), size=n)
sample1 <- election[index,]
Smean(sample1$SPD_02, N=nrow(election))
#> 
#> Smean object: Sample mean estimate
#> With finite population correction: N=299
#> 
#> Mean estimate: 0.3515
#> Standard error: 0.0165
#> 95% confidence interval: [0.3192,0.3839]
#> 
# true mean
mean(election$SPD_02)
#> [1] 0.3861344
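
The Smean output above can be read as a sample mean with a standard error that includes a finite population correction. The base R sketch below spells out the textbook formulas on a small hypothetical sample; that Smean uses exactly these formulas internally is an assumption, although the printed interval is consistent with a normal-quantile interval.

```r
# Hypothetical sample of SPD shares (illustrative values, not drawn from `election`)
y <- c(0.30, 0.35, 0.40, 0.38, 0.32)
N <- 299          # population size: number of constituencies
n <- length(y)    # sample size

# Point estimate: the sample mean
est <- mean(y)

# Standard error with finite population correction (1 - n/N)
se <- sqrt((1 - n / N) * var(y) / n)

# 95% confidence interval based on the normal quantile
ci <- est + c(-1, 1) * qnorm(0.975) * se

round(c(estimate = est, se = se, lower = ci[1], upper = ci[2]), 4)
```

With n much smaller than N the correction factor is close to 1, so it changes the standard error only slightly here.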

# 2) Estimate the sample size needed to forecast the SPD proportion in the 2005 election
sample.size.prop(e=0.01, P=mean(election$SPD_02), N=Inf)
#> 
#> sample.size.prop object: Sample size for proportion estimate
#> Without finite population correction: N=Inf, precision e=0.01 and expected proportion P=0.3861
#> 
#> Sample size needed: 9106
#> 
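
The sample size reported above can be reproduced with the standard formula for estimating a proportion without finite population correction, n = z^2 * P * (1 - P) / e^2. The sketch below uses base R only; the value of P is copied from the printed true mean of SPD_02, and treating this formula as what sample.size.prop computes for N=Inf is an assumption that happens to match the printed result.

```r
P <- 0.3861344         # expected proportion (mean of SPD_02, copied from the output above)
e <- 0.01              # desired precision (half-width of the 95% interval)
z <- qnorm(0.975)      # normal quantile for a 95% confidence level

# Required sample size, rounded up to the next whole unit
n_needed <- ceiling(z^2 * P * (1 - P) / e^2)
n_needed
#> [1] 9106
```

Because the formula ignores the population size, the result far exceeds the 299 constituencies; it refers to sampling from an effectively infinite population.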

# 3) Use prior knowledge via model-based estimation
# draw sample of size n = 20
N <- nrow(election)
set.seed(67396)
sample <- election[sort(sample(1:N, size=20)),]
# secondary information SPD in 2002
X.mean <- mean(election$SPD_02)
# forecast proportion of SPD in election of 2005
mbes(SPD_05 ~ SPD_02, data=sample, aux=X.mean, N=N, method='all')
#> 
#> mbes object: Model Based Estimation of Population Mean
#> Population size N = 299, sample size n = 20
#> 
#> Values for auxiliary variable: 
#> X.mean.1 = 0.3861, x.mean.1 = 0.3515
#> ----------------------------------------------------------------
#> Simple Estimate
#> 
#> Mean estimate:  0.3009 
#> Standard error:  0.0119 
#> 
#> 95% confidence interval [0.2775,0.3242]
#> 
#> ----------------------------------------------------------------
#> Difference Estimate
#> 
#> Mean estimate:  0.3355 
#> Standard error:  0.0088 
#> 
#> 95% confidence interval [0.3183,0.3526]
#> 
#> ----------------------------------------------------------------
#> Ratio Estimate
#> 
#> Mean estimate:  0.3305 
#> Standard error:  0.0072 
#> 
#> 95% confidence interval [0.3163,0.3447]
#> 
#> ----------------------------------------------------------------
#> Linear Regression Estimate
#> 
#> Mean estimate:  0.3223 
#> Standard error:  0.0063 
#> 
#> 95% confidence interval [0.31,0.3346]
#> 
#> ----------------------------------------------------------------
#> Linear Regression Model:
#> Call:
#> lm(formula = formula, data = data)
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -0.054727 -0.022938 -0.003066  0.027230  0.037138 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  0.08290    0.03137   2.643   0.0165 *  
#> SPD_02       0.62004    0.08729   7.103 1.28e-06 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.02908 on 18 degrees of freedom
#> Multiple R-squared:  0.737,	Adjusted R-squared:  0.7224 
#> F-statistic: 50.45 on 1 and 18 DF,  p-value: 1.277e-06
#> 
# true value
Y.mean <- mean(election$SPD_05)
Y.mean
#> [1] 0.3426949
# Use a second predictor variable
X.mean2 <- c(mean(election$SPD_02),mean(election$GREEN_02))
# forecast proportion of SPD in election of 2005 with two predictors
mbes(SPD_05 ~ SPD_02+GREEN_02, data=sample, aux=X.mean2, N=N, method= 'regr')
#> 
#> mbes object: Model Based Estimation of Population Mean
#> Population size N = 299, sample size n = 20
#> 
#> Values for auxiliary variable: 
#> X.mean.1 = 0.3861, x.mean.1 = 0.3515
#> X.mean.2 = 0.0848, x.mean.2 = 0.07
#> ----------------------------------------------------------------
#> Linear Regression Estimate
#> 
#> Mean estimate:  0.3291 
#> Standard error:  0.0051 
#> 
#> 95% confidence interval [0.3191,0.3391]
#> 
#> ----------------------------------------------------------------
#> Linear Regression Model:
#> Call:
#> lm(formula = formula, data = data)
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -0.037753 -0.016922 -0.004229  0.016320  0.048000 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  0.04326    0.02843   1.521  0.14652    
#> SPD_02       0.66001    0.07223   9.138 5.71e-08 ***
#> GREEN_02     0.36537    0.11489   3.180  0.00547 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.0237 on 17 degrees of freedom
#> Multiple R-squared:  0.8351,	Adjusted R-squared:  0.8157 
#> F-statistic: 43.06 on 2 and 17 DF,  p-value: 2.217e-07
#>
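
The point estimates in the one-predictor mbes output above can be checked by hand from the printed summary values. The sketch below uses the usual textbook forms of the difference, ratio, and linear regression estimators; the inputs are the rounded numbers copied from that output, so the results agree with the printed estimates only to about three decimals, and the variable names are illustrative.

```r
# Rounded values copied from the mbes output for SPD_05 ~ SPD_02
y_bar <- 0.3009    # sample mean of SPD_05 (the "Simple Estimate")
x_bar <- 0.3515    # sample mean of the auxiliary variable SPD_02
X_bar <- 0.3861    # population mean of SPD_02
slope <- 0.62004   # fitted regression coefficient of SPD_02

# Difference estimate: shift the sample mean by the gap in the auxiliary means
diff_est <- y_bar + (X_bar - x_bar)

# Ratio estimate: scale the population auxiliary mean by the sample ratio
ratio_est <- y_bar / x_bar * X_bar

# Linear regression estimate: shift by the gap scaled with the fitted slope
regr_est <- y_bar + slope * (X_bar - x_bar)

c(difference = diff_est, ratio = ratio_est, regression = regr_est)
```

All three estimators move the simple estimate towards the true value of 0.3427 because the sample mean of SPD_02 lies below its known population mean.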