Thank you for tuning in to this audio-only podcast presentation. This is week 140 of The Lindahl Letter publication. A new edition arrives every Friday. This week the topic under consideration for The Lindahl Letter is “Proxy models for elections.”
Sometimes a simplified model of something is easier to work with. We dug into econometric models recently during week 136, and they can introduce a high degree of complexity. Even within the world of econometrics you can find information about proxy models. Today we are digging into proxy models for elections. My search was rather direct. I was looking for a list of proxy models being used for elections [1]. I was trying to find election forecasting proxy models or maybe even some basic two-step models. I even narrowed the search a bit to target machine learning election proxy models [2].
After a little bit of searching around, it seemed like a good idea to consider what it takes to generate a proxy model equation to represent something. Earlier I had considered what the chalk model of election prediction would look like using voter registration as a simplified proxy for voting behavior. I had really thought that would end up being a highly workable proxy, but it did not turn out to be consistently accurate.
Here are 3 papers I looked at this week:
Hare, C., & Kutsuris, M. (2022). Measuring swing voters with a supervised machine learning ensemble. Political Analysis, 1-17. https://www.cambridge.org/core/services/aop-cambridge-core/content/view/145B1D6B0B2877FC454FBF446F9F1032/S1047198722000249a.pdf/measuring_swing_voters_with_a_supervised_machine_learning_ensemble.pdf
Zhou, Z., Serafino, M., Cohan, L., Caldarelli, G., & Makse, H. A. (2021). Why polls fail to predict elections. Journal of Big Data, 8(1), 1-28. https://link.springer.com/article/10.1186/s40537-021-00525-8
Jaidka, K., Ahmed, S., Skoric, M., & Hilbert, M. (2019). Predicting elections from social media: a three-country, three-method comparative study. Asian Journal of Communication, 29(3), 252-273. http://www.cse.griet.ac.in/pdfs/journals20-21/SC17.pdf
I spent some time messing around with OpenAI’s GPT-4 on this topic. That effort narrowed things down to the proxy models that are typically used. The top 10 seemed to be the following: social media analysis, Google Trends, economic indicators, fundraising data, endorsement counts, voter registration data, early voting data, historical voting patterns, event-driven models, and environmental factors. Combining all 10 proxy models into a single equation would result in a complex, multivariable model. Here’s a simplified representation of such a model:
E = α1(S) + α2(G) + α3(Ec) + α4(F) + α5(En) + α6(VR) + α7(EV) + α8(H) + α9(Ed) + α10(Ef) + β
Where:
- E is the predicted election outcome.
- α1, α2,…α10 are coefficients that determine the weight or importance of each proxy model. These coefficients would be determined through regression analysis or other statistical methods based on historical data.
- S represents social media analysis.
- G represents Google Trends data.
- Ec represents economic indicators.
- F represents fundraising data.
- En represents endorsement count.
- VR represents voter registration data.
- EV represents early voting data.
- H represents historical voting patterns.
- Ed represents event-driven models.
- Ef represents environmental factors.
- β is a constant term.
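To make the equation concrete, here is a minimal Python sketch that evaluates E for one election. The function name `predict_outcome`, the dictionary layout, and every number in the example are assumptions made up for illustration. They are not fitted weights or real data.

```python
# Short names for the ten proxy signals, matching the symbols in the equation.
PROXIES = ["S", "G", "Ec", "F", "En", "VR", "EV", "H", "Ed", "Ef"]


def predict_outcome(signals, weights, beta=0.0):
    """Return E = sum(alpha_i * x_i) + beta over the ten proxy signals.

    `signals` maps each proxy name to its (already scaled) value and
    `weights` maps each proxy name to its coefficient alpha_i.
    """
    missing = [name for name in PROXIES if name not in signals or name not in weights]
    if missing:
        raise KeyError(f"missing proxy values: {missing}")
    return sum(weights[name] * signals[name] for name in PROXIES) + beta


# Placeholder inputs: every signal is 0.5 and every weight is 0.1.
# These values are invented purely to show the arithmetic.
example_signals = {name: 0.5 for name in PROXIES}
example_weights = {name: 0.1 for name in PROXIES}
example_E = predict_outcome(example_signals, example_weights, beta=0.02)
print(f"E = {example_E:.2f}")
```

In practice each signal would need to be scaled to a comparable range before weighting, otherwise the coefficients are hard to interpret against each other.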
This equation is a linear combination of the proxy models, but in reality, the relationship might be non-linear, interactive, or hierarchical. The coefficients would need to be determined empirically, and the model would need to be validated with out-of-sample data to ensure its predictive accuracy. Additionally, the model might need to be adjusted for specific elections, regions, or time periods. It would be interesting to try to pull together the data to test that type of complex multivariable model. Maybe later on we can create a model with some agency designed to complete that task.
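Fitting the coefficients and checking them on held-out data could look roughly like the sketch below. It assumes NumPy is available and uses entirely synthetic data: the "elections", the "true" coefficients, and the noise level are all generated at random, so the result says nothing about real elections. Ordinary least squares is used here as one simple choice of regression method, not as the only option.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Synthetic stand-in data: 200 hypothetical elections, 10 proxy signals each.
n_elections, n_proxies = 200, 10
X = rng.normal(size=(n_elections, n_proxies))
true_alpha = rng.uniform(-1.0, 1.0, size=n_proxies)  # invented coefficients
true_beta = 0.5                                       # invented constant term
y = X @ true_alpha + true_beta + rng.normal(scale=0.1, size=n_elections)

# Hold out the last 50 rows for out-of-sample validation.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]


def fit_linear(features, target):
    """Ordinary least squares fit; returns (alpha, beta)."""
    # Append a column of ones so the constant term is estimated with the weights.
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[:-1], coef[-1]


def r_squared(y_true, y_pred):
    """Coefficient of determination for a set of predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


alpha_hat, beta_hat = fit_linear(X_train, y_train)
test_r2 = r_squared(y_test, X_test @ alpha_hat + beta_hat)
print(f"estimated beta: {beta_hat:.3f}")
print(f"out-of-sample R^2: {test_r2:.3f}")
```

The fit is good here only because the synthetic data was generated from a linear model in the first place. With real proxy data, the held-out score is where non-linear or interactive effects would show up as a gap.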
Footnotes:
[1] https://scholar.google.com/scholar?hl=en&as_sdt=0%2C6&q=election+proxy+models&btnG=
[2] https://scholar.google.com/scholar?hl=en&as_sdt=0%2C6&q=election+proxy+models+machine+learning&btnG=
What’s next for The Lindahl Letter?
- Week 141: Building generative AI chatbots
- Week 142: Learning LangChain
- Week 143: Social media analysis
- Week 144: Knowledge graphs vs. vector databases
- Week 145: Delphi method
If you enjoyed this content, then please take a moment and share it with a friend. If you are new to The Lindahl Letter, then please consider subscribing. New editions arrive every Friday. Thank you and enjoy the week ahead.