File:NZ opinion polls 2009-2011 -parties.png
NZ_opinion_polls_2009-2011_-parties.png (778 × 487 pixels, file size: 31 KB, MIME type: image/png)
Summary
Description | Graph showing support for political parties in New Zealand since the 2008 election, according to various political polls. Data are taken from the Wikipedia page "Opinion polling for the New Zealand general election, 2011". |
Date | (first) 2011-07-29 (last) |
Source | Own work |
Author | Mark Payne, Denmark |
The figure is produced in R with the following code. The script reads the HTML of the Wikipedia page directly, extracts and parses the opinion poll table, and saves the graph as a PNG file in the current working directory. It should run as-is for anyone with R installed.
rm(list=ls())
#==========================================
#Parameters
major.parties <- TRUE
if(major.parties) {
selected.parties <- c("Green","Labour","National") #use precise names from Table headers
ylims <- c(0,65) #Vertical range
output.fname <- "NZ_opinion_polls_2009-2011 -parties.png"
} else { #Small parties - please use "Maori" for the Maori party
selected.parties <- c("ACT","Maori","NZ First","United Future","Mana") #use precise names from Table headers
ylims <- c(0,6) #Vertical range
output.fname <- "NZ opinion polls 2009-2011 -smallparties.png"
}
#==========================================
#Shouldn't need to edit anything below here
#Misc preparation
selected.parties <- gsub(" ","_",selected.parties) #Handle the space in some names
#Load the complete HTML file into memory
html <- readLines(url("http://en.wikipedia.org/wiki/Opinion_polling_for_the_New_Zealand_general_election,_2011",encoding="UTF-8"))
closeAllConnections()
#Extract the opinion poll data table
tbl.no <- 2
tbl <- html[(grep("<table.*",html)[tbl.no]):(grep("</table.*",html)[tbl.no])]
#Now split it into the rows, based on the <tr> tag
tbl.rows <- list()
open.tr <- grep("<tr",tbl)
close.tr <- grep("</tr",tbl)
for(i in 1:length(open.tr)) tbl.rows[[i]] <- tbl[open.tr[i]:close.tr[i]]
#Extract table headers
hdrs <- grep("<th",tbl,value=TRUE)
hdrs <- hdrs[1:(length(hdrs)/2)]
party.names <- gsub("<.*?>","",hdrs)[-c(1:2)]
party.names <- gsub(" ","_",party.names) #Replace space with a _
party.names <- gsub("M.{1}ori","Maori",party.names) #Apologies, but the hard "a" is too hard to handle otherwise
party.cols <- gsub("^.*bgcolor=\"(.*?)\".*$","\\1",hdrs)[-c(1:2)]
names(party.cols) <- party.names
#Extract data rows
tbl.rows <- tbl.rows[sapply(tbl.rows,function(x) length(grep("<td",x)))>1]
#Now extract the data
survey.dat <- lapply(tbl.rows,function(x) {
#Start by only considering where we have <td> tags
td.tags <- x[grep("<td",x)]
#Polling data follow the two leading columns (pollster, date), one column per party
dat <- td.tags[2+seq_along(party.names)]
#Now strip the data and convert to numeric format
dat <- gsub("<td>|</td>","",dat)
dat <- gsub("%","",dat)
dat <- gsub("-","0",dat)
dat <- gsub("<","",dat)
dat <- as.numeric(dat)
names(dat) <- party.names
#Getting the date strings is a little harder. Start by tidying up the dates
date.str <- td.tags[2] #Dates are in the second column
date.str <- gsub("<sup.*</sup>","",date.str) #Throw out anything between superscript tags, as it's a reference to the source
date.str <- gsub("<td>|</td>","",date.str) #Throw out any tags
#Get numeric parts of string
digits.str <- gsub("[^0123456789]"," ",date.str)
digits.str <- gsub("^ +","",digits.str) #Drop leading whitespace
digits <- strsplit(digits.str," +")[[1]]
yrs <- grep("[0-9]{4}",digits,value=TRUE)
days <- digits[!digits%in%yrs]
#Get months
month.str <- gsub("[^A-Za-z]"," ",date.str) #Keep letters only (commas are dropped too)
month.str <- gsub("^ +","",month.str) #Drop leading whitespace
mnths <- strsplit(month.str," +")[[1]] #Split on runs of spaces
#Now paste together to make standardised date strings
days <- rep(days,length.out=2)
mnths <- rep(mnths,length.out=2)
yrs <- rep(yrs,length.out=2)
dates.std <- paste(days,mnths,yrs)
# cat(sprintf("%s\t -> \t %s, %s\n",date.str,dates.std[1],dates.std[2]))
#And finally the survey time
survey.time <- mean(as.POSIXct(strptime(dates.std,format="%d %B %Y")))
#Get the name of the survey company too
survey.comp <- td.tags[1]
survey.comp <- gsub("<sup.*</sup>","",survey.comp)
survey.comp <- gsub("<td>|</td>","",survey.comp)
survey.comp <- gsub("<U+2013>","-",survey.comp,fixed=TRUE)
survey.comp <- gsub("(?U)<.*>","",survey.comp,perl=TRUE)
#And now return results
return(data.frame(Company=survey.comp,Date=survey.time,date.str,t(dat)))
})
#Combine results
surveys <- do.call(rbind,survey.dat)
#Restrict plot(manually) to selected parties
selected.parties <- sort(selected.parties)
selected.cols <- party.cols[selected.parties]
polls <- surveys[,c("Company","Date",selected.parties)]
polls <- subset(polls,!is.na(surveys$Date))
polls <- polls[order(polls$Date),]
polls$date.num <- as.double(polls$Date)
#Setup plot
ticks <- ISOdate(c(rep(2009,2),rep(2010,2),rep(2011,2),2012),c(rep(c(1,7),3),1),1)
xlims <- range(c(ISOdate(2008,11,1),ticks))
png(output.fname,width=778,height=487,pointsize=16)
par(mar=c(5,4,1,1))
matplot(polls$date.num,polls[,selected.parties],pch=NA,xlim=xlims,ylab="Party support (%)",
xlab="",col=selected.cols,xaxt="n",ylim=ylims,yaxs="i")
abline(h=seq(0,95,by=5),col="lightgrey",lty=3)
abline(v=as.double(ticks),col="lightgrey",lty=3)
box()
axis(1,at=as.double(ticks),labels=format(ticks,format="1 %b\n%Y"),cex.axis=0.8)
axis(4,at=axTicks(4),labels=rep("",length(axTicks(4))))
#Now calculate the loess smoothers and add the confidence interval
smoothed <- list()
predict.x <- seq(min(polls$date.num),max(polls$date.num),length.out=100)
for(i in 1:length(selected.parties)) {
smoother <- loess(polls[,selected.parties[i]] ~ polls[,"date.num"],span=0.5)
smoothed[[i]] <- predict(smoother,newdata=predict.x,se=TRUE)
polygon(c(predict.x,rev(predict.x)),
c(smoothed[[i]]$fit+smoothed[[i]]$se.fit*1.96,rev(smoothed[[i]]$fit-smoothed[[i]]$se.fit*1.96)),
col=rgb(0.5,0.5,0.5,0.5),border=NA)
}
names(smoothed) <- selected.parties
#Then add the data points
matpoints(polls$date.num,polls[,selected.parties],pch=20,col=selected.cols)
#And finally the smoothers themselves
for(i in 1:length(selected.parties)) {
lines(predict.x,smoothed[[i]]$fit,col=selected.cols[i],lwd=2)
}
#Add election date too
#abline(v=election.date,lwd=4)
#text(election.date,0,format(election.date,"%d %b %Y"),srt=90,pos=4)
legend("bottom",legend=gsub("_"," ",selected.parties),col=selected.cols,pch=20,bg="white",lwd=2,horiz=TRUE,inset=-0.225,xpd=NA)
#Add best estimates
for(i in 1:length(smoothed)) {
lbl <- sprintf("%2.0f±%1.0f %%",round(rev(smoothed[[i]]$fit)[1],0),round(1.96*rev(smoothed[[i]]$se.fit)[1],0))
text(rev(polls$date.num)[1],rev(smoothed[[i]]$fit)[1],labels=lbl,pos=4,col=selected.cols[i])
}
dev.off()
cat("Complete.\n")
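The date handling in the script (split the date cell into days, month names and years, recycle each to two end points, then average them) can be illustrated in isolation. The following is a minimal sketch, not part of the original script: the helper name `poll_midpoint` is hypothetical, and it assumes day-month-year strings with English month names, such as "3-9 November 2011". It matches month names against R's built-in `month.name` rather than relying on `strptime` with `%B`, so it does not depend on the session locale.

```r
# Hypothetical helper (not in the original script): midpoint of a poll's
# fieldwork period, given a date string such as "3-9 November 2011".
poll_midpoint <- function(date.str) {
  # Numeric tokens: four-digit tokens are years, the rest are days
  digits <- regmatches(date.str, gregexpr("[0-9]+", date.str))[[1]]
  yrs  <- digits[nchar(digits) == 4]
  days <- digits[nchar(digits) != 4]
  # Alphabetic tokens that match an English month name
  words <- regmatches(date.str, gregexpr("[A-Za-z]+", date.str))[[1]]
  mnths <- match(words, month.name)
  mnths <- mnths[!is.na(mnths)]
  # Recycle to two end points, as the script does with rep(..., length.out=2)
  days  <- rep(as.integer(days), length.out = 2)
  mnths <- rep(mnths, length.out = 2)
  yrs   <- rep(as.integer(yrs), length.out = 2)
  ends <- as.Date(ISOdate(yrs, mnths, days))
  # Midpoint of the two end points
  ends[1] + as.numeric(diff(ends)) / 2
}

print(poll_midpoint("3-9 November 2011"))
print(poll_midpoint("25 November 2011"))
```

A single date is treated as a range whose start and end coincide, which mirrors how the script recycles `days`, `mnths` and `yrs` before taking the mean.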
Licensing
- You are free:
- to share – to copy, distribute and transmit the work
- to remix – to adapt the work
- Under the following conditions:
- attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
File history
Date/Time | Dimensions | User | Comment |
---|---|---|---|
20:46, 24 November 2011 (current) | 778 × 487 (31 KB) | Ridcully Jack | + RMR poll 25/11 |
19:18, 24 November 2011 | 778 × 487 (31 KB) | Ridcully Jack | + NZ Herald 25/11 |
05:47, 24 November 2011 | 778 × 487 (31 KB) | Ridcully Jack | + both 24/11 tv polls |
23:06, 22 November 2011 | 778 × 487 (30 KB) | Ridcully Jack | + Fairfax |
06:52, 19 November 2011 | 778 × 487 (31 KB) | Ridcully Jack | + Roy Morgan |
06:31, 18 November 2011 | 778 × 487 (31 KB) | Ridcully Jack | less smooth, follows evolving trends better ("span = 0.25") |
00:30, 18 November 2011 | 778 × 487 (29 KB) | Ridcully Jack | minor change to dates applied |
20:21, 17 November 2011 | 778 × 487 (29 KB) | Ridcully Jack | + Herald Digipoll 18/11 |
08:29, 17 November 2011 | 778 × 487 (29 KB) | Ridcully Jack | + tv3 17/11 |
06:31, 17 November 2011 | 778 × 487 (29 KB) | Ridcully Jack | + ONCB 17/11 |
Metadata
Horizontal resolution | 47.24 dpc |
---|---|
Vertical resolution | 47.24 dpc |