United States presidential election, 2012: Difference between revisions

From Opasnet
(→‎Calculations: Oklahoma code works (on desktop) and confirms Choquette results!)
(→‎Calculations: code work on server, the key graph added)
Line 183: Line 183:


===Calculations===
===Calculations===
With this code, you can reproduce parts of the original Choquette and Johnson paper.


<rcode graphics="1">
<rcode graphics="1">
Line 188: Line 190:
library(ggplot2)
library(ggplot2)


opasnet.data
# OHIO DATA
data.oh <- opasnet.data("c/c8/RepublicanPrimaryElection2012_8-StateCountyTotals.csv")
head(data)
#MAKE IT A DATA.FRAME


#data.oh <- opasnet.csv("c/c8/RepublicanPrimaryElection2012_8-StateCountyTotals.csv")
#head(data.oh)


data.ok <- opasnet.data("6/6e/PrimaryElection2012OKresults_20120306.csv")
# OKLAHOMA DATA
#MAKE IT A DATA.FRAME


data <- data[data$Race_desc == "FOR PRESIDENT", ]
data.ok <- opasnet.csv("6/6e/PrimaryElection2012OKresults_20120306.csv", sep = ",", header = TRUE)
library(ggplot2)
ggplot(data, aes(x = County, weight = Total_votes, fill = Cand_desc)) + geom_bar(position = "stack")
data$Precinct <- as.factor(data$Precinct)


data.ok <- read.csv("c:/temp/PrimaryElection2012OKresults_20120306.csv")
data.ok <- data.ok[data.ok$Race_desc == "FOR PRESIDENT" & data.ok$Race_party == "REPUBLICAN" , ]
data.ok <- data.ok[data.ok$Race_desc == "FOR PRESIDENT" & data.ok$Race_party == "REPUBLICAN" , ]
data.ok$Cand_desc <- data.ok$Cand_desc[ , drop = TRUE]
data.ok$Cand_desc <- data.ok$Cand_desc[ , drop = TRUE] # Drop unused levels of candidate names.
ggplot(data.ok, aes(x = County, weight = Total_votes, fill = Cand_desc)) + geom_bar(position = "stack")
 
data.ok$Precinct <- as.factor(data.ok$Precinct)
data.ok$Precinct <- as.factor(data.ok$Precinct) # Change precinct numbers to factors.
Totals <- as.data.frame(as.table(tapply(data.ok$Total_votes, data.ok["Precinct"], sum)))
 
Totals <- as.data.frame(as.table(tapply(data.ok$Total_votes, data.ok["Precinct"], sum))) # Total votes for each precinct.
data.ok <- merge(data.ok, Totals)
data.ok <- merge(data.ok, Totals)
data.ok$Support <- data.ok$Total_votes / data.ok$Freq
data.ok$Support <- data.ok$Total_votes / data.ok$Freq
# Order precincts from smaller to larger and calculate candidate-specific cumulative sums.


data.ok <- data.ok[order(data.ok$Freq), ]
data.ok <- data.ok[order(data.ok$Freq), ]
Line 216: Line 215:
data.ok$Cumfreq[data.ok$Cand_desc == i] <- cumsum(data.ok$Freq[data.ok$Cand_desc == i])
data.ok$Cumfreq[data.ok$Cand_desc == i] <- cumsum(data.ok$Freq[data.ok$Cand_desc == i])
}
}
# Calculate cumulative support


data.ok$Cumsupport <- data.ok$Cumvote / data.ok$Cumfreq
data.ok$Cumsupport <- data.ok$Cumvote / data.ok$Cumfreq
Line 223: Line 224:
geom_point(shape=1) +    # Use hollow circles
geom_point(shape=1) +    # Use hollow circles
geom_smooth() +          # Add a loess smoothed fit curve with confidence region
geom_smooth() +          # Add a loess smoothed fit curve with confidence region
scale_x_log10()
opts(
axis.text.x = theme_text(size = 20),
axis.text.y = theme_text(size = 20),
axis.title.x = theme_text(size = 20),
axis.title.y = theme_text(size = 20, angle = 90),
legend.text = theme_text(size = 20),
legend.title = theme_text(size = 20),
title = paste("Smoothed regression of support of ", candidate, " along the size of precinct"),
plot.title = theme_text(size=20)
) +
scale_x_log10("Vote tally in precinct") +
scale_y_continuous("Candidate result in precinct, %")
 
return(out)
return(out)
}
}
Line 237: Line 250:
geom_line(size = 1.2) +
geom_line(size = 1.2) +
opts(
opts(
axis.text.x = theme_text(size = 10),  
axis.text.x = theme_text(size = 20),  
axis.text.y = theme_text(size = 10),  
axis.text.y = theme_text(size = 20),  
axis.title.x = theme_text(size = 10),
axis.title.x = theme_text(size = 20),
axis.title.y = theme_text(size = 10, angle = 90),  
axis.title.y = theme_text(size = 20, angle = 90),  
legend.text = theme_text(size = 10),
legend.text = theme_text(size = 20),
legend.title = theme_text(size = 10),
legend.title = theme_text(size = 20),
title = "Cumulative support of candidates, Oklahoma Primary election, March 6th, 2012",  
title = "Cumulative support of candidates, Oklahoma Primary election, March 6th, 2012",  
plot.title = theme_text(size=30)
plot.title = theme_text(size=20)
) +  
) +  
scale_x_continuous("Cumulative vote tally") +
scale_x_continuous("Cumulative vote tally") +
scale_y_continuous("Candidate result, %", limits = c(0, 0.5))
scale_y_continuous("Candidate result, fraction", limits = c(0, 0.5))
 
fig("JON HUNTSMAN")
fig("MICHELE BACHMANN")
fig("MITT ROMNEY")
fig("NEWT GINGRICH")
fig("OVER VOTES")
fig("RICK PERRY")
fig("RICK SANTORUM")
fig("RON PAUL")
fig("UNDER VOTES")


</rcode>
</rcode>


{{todo|How do I transform the input into a data.frame? The old version of opasnet.data did this, but now it only downloads a file without data.frame conversion. --[[User:Jouni|Jouni]] 23:32, 4 November 2012 (EET)|Teemu Rintala|project=Opasnet}}
[[image:Cumulative support in Oklahoma primary election 2012.png|thumb|500px|The data do indeed show the same pattern as in the Choquette and Johnson paper. Compare to Fig 6 in <ref name="statisticalanomalies"/>.]]


==See also==
==See also==

Revision as of 23:40, 5 November 2012

On August 13, 2012, Francois Choquette and James Johnson published a paper claiming that the Republican primary election results contain features that would be statistically implausible in an election that had not been manipulated.[1] In Finland, the issue was first raised by the Facebook group Open Democrary Finland on October 31, 2012.

Websites describing the situation

Research plan

The research question is this:

Based on statistical analysis of the election data, is there evidence of fraud in the US presidential election, 2012?

  1. Recruit a large enough group of volunteers capable of a) data management, b) statistical analysis, c) wiki editing, d) dissemination.
  2. Work with the primary election data until the real presidential election data comes out on Tuesday.
  3. Develop data management systems for voting data in Opasnet.
  4. Develop statistical analyses (based on [1]) in Opasnet.
  5. Agree on division of tasks for the presidential election.
  6. When the presidential election data becomes available, manage and analyse the data immediately based on the task division.
  7. Publish results widely.

Data

Various sources (not evaluated, some not very good)

Data of Presidential Primary Elections 2012, Republican Party
State County-level data Precinct-level data Description
Alabama AL [2]
Alaska AK
Arizona AZ
Arkansas AR
California CA
Colorado CO
Connecticut CT
Delaware DE
Florida FL
Georgia GA
Hawaii HI
Idaho ID
Illinois IL
Indiana IN
Iowa IA
Kansas KS
Kentucky KY [3] [4]
Louisiana LA [5] Counties separately
Maine ME
Maryland MD
Massachusetts MA
Michigan MI
Minnesota MN
Mississippi MS
Missouri MO
Montana MT
Nebraska NE
Nevada NV
New Hampshire NH
New Jersey NJ
New Mexico NM
New York NY
North Carolina NC
North Dakota ND
Ohio OH Excel 6th March
Oklahoma OK Zip 6th March
Oregon OR
Pennsylvania PA
Rhode Island RI
South Carolina SC
South Dakota SD
Tennessee TN
Texas TX
Utah UT
Vermont VT
Virginia VA
Washington WA
West Virginia WV [6] 8th May
Wisconsin WI
Wyoming WY

From the Choquette and Johnson paper

  1. US Census Bureau: Census 2000 U.S. Gazetteer Files. County locations: [7]
  2. US Census Bureau: Population, Housing Units, Area, and Density 2010. Broken?
  3. Iowa Election Results, January 3, 2012: [8]
  4. New Hampshire Election Results, January 10, 2012: [9]
  5. Arizona Election Results, February 28, 2012: [10] [11]
  6. Ohio Election Results, March 6, 2012: [12]
  7. Oklahoma Election Results, March 6, 2012: [13]
  8. Alabama Election Results, March 13, 2012: [14] [15]
  9. Louisiana Election Results, March 24, 2012: [16] [17]
  10. Wisconsin Election Results, April 3, 2012: [18]
  11. West Virginia Election Results, May 8, 2012: [19]
  12. Kentucky Election Results, May 22, 2012: [20]
  13. [21] [22]


Unsuccessful searches for the claimed fraud

The claim that a major presidential candidate reached the general election because of election fraud is a very serious one. One would expect it to make headlines if the evidence were strong. However, nothing has been found in the major U.S. daily newspapers.

Calculations

With this code, you can reproduce parts of the original Choquette and Johnson paper.
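As a rough illustration of the main steps in the page's code, the sketch below repeats them on a small made-up data set instead of the Oklahoma file. The column names (Precinct, Cand_desc, Total_votes, Freq, Cumvote, Cumfreq, Cumsupport) follow the code on this page; the vote counts and the candidates A and B are invented for illustration only, and ave() is used here in place of the loop over candidates.

```r
# Toy precinct-level results; the numbers are invented, not real election data.
votes <- data.frame(
  Precinct    = factor(rep(c("p1", "p2", "p3"), each = 2)),
  Cand_desc   = factor(rep(c("A", "B"), times = 3)),
  Total_votes = c(30, 70, 10, 10, 120, 80)
)

# Total votes cast in each precinct (column Freq), merged back to every row.
totals <- as.data.frame(as.table(tapply(votes$Total_votes, votes["Precinct"], sum)))
votes <- merge(votes, totals)

# Candidate's share of the votes within the precinct.
votes$Support <- votes$Total_votes / votes$Freq

# Order precincts from smallest to largest, then accumulate per candidate.
votes <- votes[order(votes$Freq), ]
votes$Cumvote <- ave(votes$Total_votes, votes$Cand_desc, FUN = cumsum)
votes$Cumfreq <- ave(votes$Freq, votes$Cand_desc, FUN = cumsum)

# Cumulative support: candidate's share of all votes counted so far.
votes$Cumsupport <- votes$Cumvote / votes$Cumfreq

print(votes[, c("Precinct", "Cand_desc", "Freq", "Support", "Cumsupport")])
```

If support did not depend on precinct size, the Cumsupport curve of each candidate would be expected to level off as larger precincts are added; the paper's argument concerns curves that keep drifting instead.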


The data do indeed show the same pattern as in the Choquette and Johnson paper. Compare to Fig 6 in [1].
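The page's plotting code uses opts() and theme_text(), which later ggplot2 releases replaced with theme() and element_text(). A minimal sketch of the cumulative-support plot in the newer syntax could look like this; the data frame is invented for illustration (it is not the Oklahoma data), and only the column names are taken from the code on this page.

```r
library(ggplot2)

# Invented cumulative results for two hypothetical candidates.
cum <- data.frame(
  Cand_desc  = rep(c("A", "B"), each = 3),
  Cumfreq    = rep(c(20, 120, 320), times = 2),
  Cumsupport = c(0.50, 0.33, 0.50, 0.50, 0.67, 0.50)
)

# One line per candidate: cumulative vote tally against cumulative support.
p <- ggplot(cum, aes(x = Cumfreq, y = Cumsupport, colour = Cand_desc)) +
  geom_line() +
  labs(
    title  = "Cumulative support of candidates (toy data)",
    x      = "Cumulative vote tally",
    y      = "Candidate result, fraction",
    colour = "Candidate"
  ) +
  theme(
    axis.text    = element_text(size = 20),
    axis.title   = element_text(size = 20),
    legend.text  = element_text(size = 20),
    legend.title = element_text(size = 20),
    plot.title   = element_text(size = 20)
  )

print(p)
```

The y-axis limits of the original figure are left out here because the toy values exceed 0.5.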

See also

References

  1. Francois Choquette, James Johnson: Republican Primary Election 2012 Results: Amazing Statistical Anomalies. The Money Party, 2012. [1]
