Literally minutes after Donald Trump’s election in 2016, political pundits, consultants and prospective candidates started a march toward the mid-term elections.
The expectations were set extremely high, with Democratic hopes of taking back the House of Representatives led, in part, by a huge gain in the limited number of remaining Republican-held congressional seats in California.
Yet today, less than a week out from election day, we are still unsure what will happen here in California or nationally.
As I discussed in a recent CA 120 article, if Democrats are able to meet their high expectations we should all be amazed. It would be an unprecedented feat for Democrats in California to make these gains in a gubernatorial election cycle – the kind of election cycle in which Republicans can usually relax and safely hold on to their seats, buoyed by higher relative turnout.
The polling and fundraising figures in the last several weeks have strongly reinforced this idea that we are in the middle of a big Democratic year.
As recently as Monday, the New York Times Upshot poll concluded in the 45th Congressional District that first-time candidate Katie Porter (D) is set to defeat incumbent Mimi Walters (R) in nearly every turnout scenario – from low turnout to absolutely every voter in the district casting a ballot.
We have also seen fundraising figures that are tenfold above what has been raised previously for Democrats in these districts. In total, the 10 most heavily targeted seats have seen $65 million in fundraising for Democrats, compared to under $35 million for Republicans.
While we have these strong indicators, others in campaigns throughout California are looking at a new set of data – the state’s relatively large early absentee vote – and doing a little second-guessing.
You can track the returns yourself, using this public tracker from Political Data for Legislative and Congressional districts. To look at local breakdowns at the city council, school board, county or statewide level, click here.
While the tracker is informative, there are good reasons that viewers of this data should take it with a grain of salt and not fall into the trap of over-analyzing it.
First, the 2018 general election is the largest by-mail voting election California has ever seen. Nearly 13 million voters have received a ballot in the mail, compared to just 9 million in the last gubernatorial election in 2014.
This increase in the number of ballots sent to voters primarily reflects young, Latino and independent voters. In 2014, 2.7 million voters aged 18-to-39 received a ballot in the mail. In 2018, this number increased to 4.6 million. There were a million more ballots sent to Latinos and 1.5 million more sent to no-party-preference voters.
Because the by-mail voter population has changed so dramatically, a comparison with 2014 numbers may not be as informative as we’d like.
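To put rough numbers on that shift, here is a minimal Python sketch that uses only the rounded ballot counts quoted above (in millions). The helper name `pct_change` and the dictionary layout are illustrative assumptions, not anything taken from the Political Data tracker.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0

# Ballots mailed, in millions, as quoted in the article (rounded figures).
mailed = {
    "all voters": (9.0, 13.0),   # 2014 vs. 2018
    "ages 18-39": (2.7, 4.6),    # 2014 vs. 2018
}

growth = {group: pct_change(m2014, m2018) for group, (m2014, m2018) in mailed.items()}

# Share of the by-mail universe made up of 18-to-39 year-olds in each cycle.
young_share_2014 = mailed["ages 18-39"][0] / mailed["all voters"][0] * 100.0
young_share_2018 = mailed["ages 18-39"][1] / mailed["all voters"][1] * 100.0

for group, change in growth.items():
    print(f"{group}: {change:+.0f}% ballots mailed")
print(f"18-39 share of mailed ballots: {young_share_2014:.0f}% -> {young_share_2018:.0f}%")
```

On these rounded inputs, the by-mail universe grew by roughly 44 percent overall and by roughly 70 percent among 18-to-39 year-olds, which is why a straight comparison with 2014 returns can mislead.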
Secondly, much of what is being measured involves the mechanics of the elections.
For example, let’s say that in the first week of voting one county loses the staff person who processes ballots, while another county has a new team in place devoted to speedily getting this data processed.
One might quickly jump to the conclusion, based on the early processed returns, that voters in one county are motivated while those in the other aren’t. This kind of analysis would be measuring the mechanics of processing ballots and misinterpreting them as enthusiasm.
And this year in California the mechanics are actually much larger than this example.
In California, we have five counties that have converted to all-mail elections, with voting centers instead of traditional precincts. This means that a lot of voters are being pushed into voting by mail for the first time.
Another forceful argument came from Nate Silver on the FiveThirtyEight elections podcast, in which he argued strongly that the early vote is a kind of fool’s gold, and that it isn’t “new” data since the pollsters are already accounting for it.
His colleague Nathaniel Rakich sounded a similar refrain on Twitter, stating, “Sigh. I guess everyone is going to overreact to early-voting numbers again. Here are all the ways that can go wrong.” Even the New York Times’ Nate Cohn got in on the act last night, announcing the CA 45 polling results with his own Twitter thread on the early vote.
All three Nathans are right.
One function of this data is to help the state’s best pollsters understand the vote that is coming in, and this is clearly reflected in their polling. This data is not independent of the other survey work that is being done to interpret where our elections are headed. It’s part of the larger whole.
When you see the statewide poll from Mark DiCamillo’s team at UC Berkeley, for example, you will find that a portion of the voters have been matched to this “already voted” set of data and are appropriately accounted for in their surveys.
So it wouldn’t be appropriate to then go back and say, “Republicans are voting two points higher than last cycle, so let’s take the UC Berkeley poll and adjust it two points more Republican.”
Finally, there is the very Halloween-appropriate term that experts use called “voter cannibalization,” in which you see a high turnout in early vote and conclude, wrongly, that there is a trend. Actually, it’s just the early voters cannibalizing the poll vote and giving a misleading turnout picture.
There was a great example of this in Texas last year when the early vote showed incredibly high Democratic turnout.
But in the weeks after the election, it was determined that the final overall turnout wasn’t nearly as high as the early data was suggesting. There was an increase, to be sure, but there were also just a lot of would-be poll voters casting their ballots early.
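The cannibalization effect is easier to see with numbers. The sketch below uses entirely hypothetical vote counts (they are not the Texas figures, which the article does not give) to show how a large jump in early voting can coexist with a small change in total turnout. The function name `growth` and both dictionaries are assumptions made for illustration.

```python
def growth(before, after):
    """Fractional change from one cycle to the next."""
    return (after - before) / before

# HYPOTHETICAL counts for illustration only, not real election data.
prior_cycle = {"early": 400_000, "election_day": 600_000}
this_cycle = {"early": 600_000, "election_day": 450_000}

early_growth = growth(prior_cycle["early"], this_cycle["early"])
total_growth = growth(sum(prior_cycle.values()), sum(this_cycle.values()))

# Early voters who simply moved over from election-day voting.
shifted = prior_cycle["election_day"] - this_cycle["election_day"]

print(f"Early vote growth: {early_growth:+.0%}")
print(f"Total turnout growth: {total_growth:+.0%}")
print(f"Would-be poll voters who voted early: {shifted:,}")
```

In this made-up scenario the early vote is up 50 percent, yet total turnout rises only 5 percent, because most of the early increase comes from voters who would otherwise have shown up at the polls.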
Despite all these caveats, and there are many, the early data is useful for testing the early expectations, getting an early read on which parts of the state appear motivated, and seeing how age, gender, party and ethnic subgroups are performing in what is potentially an historic election.
While tracking back to the 2014 cycle might have a lot of analytical landmines, we can reduce the error associated with the changing rates of absentee voting by also making comparisons to the June 2018 primary and even the 2016 general (albeit that was a presidential election).
Looking at those elections for comparison, we see a week out from election day that turnout is up 80% among these early voters compared to the primary, which had 37% total turnout.
However, compared to this point in the 2016 general, turnout is actually down 23%. In a way, this is typical of a gubernatorial election cycle, which usually sees turnout of around 55-to-60%, compared to around 35% in primary elections and the mid- to high-70% range in presidential elections.
Looking at it through a different lens, one-in-five absentee voters have returned their ballots so far, compared to one-in-eight who had done so at this point before the primary – a significant increase. But it isn’t greater than the one-in-four range we saw at this point before the 2016 general.
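For readers who want to compare those return rates directly, here is a small Python sketch based only on the rounded "one-in-N" figures above. The labels and the helper `relative_change` are assumptions for illustration, and because the inputs are rounded, the results should not be expected to reproduce the 80% and 23% figures cited earlier, which presumably rest on different underlying counts.

```python
from fractions import Fraction

# Rounded share of mailed ballots returned one week out, as quoted above.
return_rates = {
    "2018 general": Fraction(1, 5),
    "2018 primary": Fraction(1, 8),
    "2016 general": Fraction(1, 4),
}

def relative_change(current, baseline):
    """Fractional change in the return rate relative to a baseline election."""
    return float(current / baseline - 1)

vs_primary = relative_change(return_rates["2018 general"], return_rates["2018 primary"])
vs_2016 = relative_change(return_rates["2018 general"], return_rates["2016 general"])

print(f"Return rate vs. 2018 primary: {vs_primary:+.0%}")
print(f"Return rate vs. 2016 general: {vs_2016:+.0%}")
```

On these rounded fractions, the return rate is about 60 percent higher than at the same point before the primary and about 20 percent lower than at the same point in 2016.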
Based on the same one-week pre-election window, this increase is greatest in the competitive congressional districts, particularly those in Orange County.
These are all still coming in below 2016 numbers, except for one subgroup: Asians. There is an extremely high early vote from this community, particularly the Vietnamese community, which has a large number of candidates in local races in Orange County, plus the state Senate contest with incumbent Republican Janet Nguyen.
Voters in other districts that have been seen as competitive don’t appear as motivated to turn out – basically hovering at a level higher than the primary, but falling far short of the 2016 general numbers.
While there has been a lot of focus on the Latino community and young voters, their share of the early vote seems to be tracking with their turnout in the primary – with Latinos just one percentage point short of their primary share, and a small, 3-point shift in the share of early voters who are seniors.
There has been some evidence that seniors’ domination in the early vote was waning over the last weekend.
In several districts, particularly those Orange County seats, we had seen very early vote coming in with 50-60% of the ballots coming from seniors. But that has shifted to 45%, with increases in the number of ballots coming from 18-to-45 year-olds.
While we like to focus on the top races that everyone will be watching on election night, not all campaigns are lucky enough to get all this attention. For those contests, the local elections tracker can provide the best early evidence of what to expect a week from now.
Using the early vote, campaigns can see if voters in their area are turning out. They can even dig into ethnic, gender, age, partisanship and other voting patterns that could be key to their local contests. This is valuable information for the vast majority of campaigns that aren’t necessarily in the spotlight.
Right now, the national models – whether at FiveThirtyEight, the Washington Post or the New York Times – don’t account for the early vote. But as models get more sophisticated, and as the early vote starts settling down (being less affected by constant changes in the laws and how they are applied), the early vote will definitely be put into models.
It won’t be the only thing, but this early vote can inform models about the pace, changes, and small adjustments to an election forecast.
Until then, we have a full week to stare at the maps and data as early vote comes in, and try to use it as just one tool in understanding this rather extraordinary election.
One thing to watch is whether there are upticks in the turnout from young voters, Latinos and independents who have for the most part underperformed expectations in this early vote.
And on election night, look closely at the initial results.
In the political data business, these are called the “801’s” because they are posted at one minute after the polls close at 8 p.m. That first wave of returns will be coming from these absentee ballots that have already been accounted for in the tracker.
You can put the composition of the electorate in the last absentee count next to the result and see what story that tells.
At that point, the polls get thrown out the window – I’m sure the public will be happy to hear that! – and we may be able to determine what will happen the rest of the night as the later results to be counted (poll voters, late absentees) come in and generally reflect voters who are younger, more progressive and more heavily minority.