Machine learning election models

Thank you for tuning in to this audio only podcast presentation. This is week 139 of The Lindahl Letter publication. A new edition arrives every Friday. This week the topic under consideration for The Lindahl Letter is, “Machine learning election models.”

This might be the year that I finally finish that book about the intersection of technology and modernity. During the course of this post we will look at the intersection of machine learning and election models. That could very well be a thin slice of the intersection of technology and modernity at large, but that is the set of questions that brought us here today. It’s one of the things we have been chasing along this journey. Oh yes, a bunch of papers exist related to the topic this week of machine learning and election models [1]. None of them are highly cited. A few of them are in the 20s in terms of citation count, which suggests the academic community surrounding this topic is rather limited. Maybe the papers are written, but have just not arrived yet out in the world of publication. Given that machine learning has an active preprint landscape, that is unlikely.

That dearth of literature is not going to stop me from looking at the papers and sharing a few that stood out during the search. None of these papers approaches the subject from the generative AI side of things; they use machine learning without any degree of agency. Obviously, I was engaging in this literature review to see if I could find examples of the deployment of models with some type of agency doing analysis within this space of election prediction models. My searching over the last few weeks has not yielded anything super interesting. I was looking for somebody in the academic space doing some type of work within generative AI constitutions and election models or maybe even some work in the space of rolling sentiment analysis for targeted campaign understanding. That is probably an open area for research that will be filled at some point.

Here are 4 articles:

Grimmer, J., Roberts, M. E., & Stewart, B. M. (2021). Machine learning for social science: An agnostic approach. Annual Review of Political Science, 24, 395-419. https://www.annualreviews.org/doi/pdf/10.1146/annurev-polisci-053119-015921 

Sucharitha, Y., Vijayalata, Y., & Prasad, V. K. (2021). Predicting election results from twitter using machine learning algorithms. Recent Advances in Computer Science and Communications (Formerly: Recent Patents on Computer Science), 14(1), 246-256. www.cse.griet.ac.in/pdfs/journals20-21/SC17.pdf  

Miranda, E., Aryuni, M., Hariyanto, R., & Surya, E. S. (2019, August). Sentiment Analysis using Sentiwordnet and Machine Learning Approach (Indonesia general election opinion from the twitter content). In 2019 International conference on information management and technology (ICIMTech) (Vol. 1, pp. 62-67). IEEE. https://www.researchgate.net/publication/335945861_Sentiment_Analysis_using_Sentiwordnet_and_Machine_Learning_Approach_Indonesia_general_election_opinion_from_the_twitter_content 

Zhang, M., Alvarez, R. M., & Levin, I. (2019). Election forensics: Using machine learning and synthetic data for possible election anomaly detection. PloS one, 14(10), e0223950. https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0223950&type=printable 

My guess is that we are going to see a wave of ChatGPT related articles about elections post the 2024 presidential cycle. It will probably be one of those waves of articles without any of them really standing out or making any serious contribution to the academy. 

The door is opening to a new world of election prediction and understanding efforts thanks to the recent changes in both model agency and generative AI models that help evaluate and summarize very complex things. It’s really about how they are applied to something going forward that will make the biggest difference in how the use cases play out. These use cases by the way are going to become very visible as the 2024 election comes into focus. The interesting part of the whole equation will be when people are bringing custom knowledge bases to the process to help fuel interactions with machine learning algorithms and generative AI. 

It’s amazing to think how rapidly things can be built. The older models of software engineering are now more of a history lesson than a primer on building things with prompt-based AI. Andrew Ng illustrated in a recent lecture the rapidly changing build times. You have to really decide what you want to build and deploy and make it happen. Ferris Bueller once said, “Life moves pretty fast.” Now code generation is starting to move even faster! You need to stop and look around at what is possible, or you just might miss out on the generative AI revolution.

You can see Andrew’s full video here: https://www.youtube.com/watch?v=5p248yoa3oE 

Footnotes:

[1] https://scholar.google.com/scholar?hl=en&as_sdt=0%2C6&q=Machine+learning+election+models&btnG= 

What’s next for The Lindahl Letter? 

  • Week 140: Proxy models for elections
  • Week 141: Building generative AI chatbots
  • Week 142: Learning LangChain
  • Week 143: Social media analysis
  • Week 144: Knowledge graphs vs. vector databases

If you enjoyed this content, then please take a moment and share it with a friend. If you are new to The Lindahl Letter, then please consider subscribing. New editions arrive every Friday. Thank you and enjoy the week ahead.

What happens at the end of the blog

Earlier this week I was thinking about what exactly happens at the end of the blog. Most of the time in the lifecycle of a weblog or blog the end happens from abandonment. Probably the vast majority of blog type writing projects have been just abandoned. At some point, the writer just stops producing that type of prose and moves along to something new. A few of them were powered by writers that sustained them for years or perhaps decades. Those platforms of prose generation stood the test of online time. Generally, at the point of abandonment most of the self hosted blog experiments eventually vanish, expire, or are terminated. Sometimes they were built on a platform that just sustains and lingers. Those free platforms sometimes can last a very long time in the online world. 

In my case, from this point I know that the servers are paid out 5 years from now and assuming the platform properly updates itself the blog could survive during that time frame. Certainly the prose won’t really improve during that time. It will just survive online. My plans at the moment are to keep adding to the content. I write for the blog without consideration for an audience. The content is created really for my own purposes of writing. Throughout the last 20 years the blog content just mostly sits, lingers, and remains unmoving and uncompelling. It’s writing without a discrete future purpose. The prose was formed within the process of writing. 

Considering some writing schedule updates:

  • Saturday – daily blogging, early morning hours spent on The Lindahl Letter development
  • Sunday – daily blogging, early morning hours spent on The Lindahl Letter podcast recording
  • Monday – daily blogging, nels.ai development
  • Tuesday – daily blogging, nels.ai recording 
  • Wednesday – daily blogging, nels.ai publishes at 5 pm
  • Thursday – daily blogging, big coding adventures
  • Friday – daily blogging, The Lindahl Letter goes out at 5 pm

I have the outline of a book that probably needs to be written sometime soon. I could devote my Saturday and Sunday early morning time to working on the chapters of that book as blocks of content creation. All of that content is listed in the backlog and will eventually get built, but maybe the time to produce a certain section of that backlog is now instead of later. It’s always the reframe of action that the time is now. Finding and sustaining the now is probably the harder part of that equation.

Attending an event and some focus time

Committed the first real post for nels.ai and have that system up and running for publishing on Wednesdays going forward.

I’m attending the CIO Future of Work Summit this morning. The first event was titled, “The Radical Next: Designing the Organization of the Future.” 

Throughout the day I stayed in the sessions. A few of them were pretty interesting:

  • “Is There an Easier Way to Manage and Remediate Risk?”
  • “Game Changer: Deploying Gen AI to Maximize Customer Experience”
  • “From Shadow IT to Shadow AI: How to Gain Visibility and Rein in Unauthorized Usage”
  • “Deliver a Cost-Effective, Sustainable & High-Quality Digital Workplace”
  • “Friend or Foe? What’s the Role of Automation and GenAI for the Future of Work?”

Edited and scheduled vlog day 33.

A lot of housekeeping tasks

Setup for the new nels.ai domain is now complete. The process ended up needing a bit of time (over a day) for the DNS to sort out. I’m going to spend some time focusing on building content for the new domain today. 

Polling for week 38 was completed and loaded. I need to do some benchmarking this week. 

I need to release a version of my base election prediction proxy model. The chalk model was not very predictive on a single factor. It’s time for a new and better model. 

I set up https://linktr.ee/nelslindahl today. This is one of those platforms that I’m not entirely sold on, but I thought it was time to build out as a profile. 

I signed up for https://openai.com/form/red-teaming-network

Working on learning

I set my alarm an hour early to join a class called “Application Development with Cloud Run” which was a part of Google’s Innovators Plus: Live Learning Events. I plan on attending a few of these events, but for some reason this series of events runs very early in the morning. 

You can see my Google Developer profile here: https://g.dev/nelslindahl 

Apparently I have a Google Cloud profile as well here: https://googlecloud.qwiklabs.com/public_profiles/ff63ebc9-8d5b-49a8-9c70-25c0c292ab73 

“LocalGPT Updates – Tips & Tricks”

Aligning the backlog for 2023

Today was the first day in the vlog series that I forgot to upload the video the day it was created and had to quickly finish the upload after my alarm went off. That meant instead of writing this delightful post I had to work on the upload flow. Don’t worry, the video was posted an hour later in the day than the normal series, but it went live and all is well. I’m using a basic video editing workflow where all the content is created on my Pixel 7 Pro, content is edited in PowerDirector on the phone, and then it just gets uploaded to YouTube. I’m actually getting fairly proficient at using PowerDirector after 29 days of consecutive vlog updates. The YouTube Shorts format works well enough and making content that is always less than 60 seconds is a different sort of challenge. At this point, I think that is a better format than turning out sub 10 minute videos.

Last night I got into a fight with my writing backlog. I have been looking at the remaining blocks of content for the year and trying to consider what could best occupy that space. I’m very clear on what will be created in 2024. I have a book in mind that needs to be written and I’ll go about creating it one block of content at a time. Probably from start to finish in my normal writing style. That effort will yield a roughly 20 chapter book. Aligning the backlog for 2023 has taken a bit of time and I’m really considering either combining my code development planning and writing planning or splitting them up. One of those two things needs to happen. Generally, I keep my writing backlog separate. That has worked out well enough. I’m starting to approach a window where I’m probably going to spend more time coding than writing words. I’m sure that is a good thing and naturally a part of the cycle of creation.

Being a reflective builder

Today started off in a rather normal sort of way. Two shots of espresso were made and were delightful. Sunrise happened outside the view of my window. My Saturday morning routine of watching a bit of the WAN show happened without interruption. I took a few moments to review my top 5 things from yesterday and it is somewhat satisfying to review and consider the flow of things from day to day. Being a reflective builder is an important part of the process. My argument represented as a hypothesis would be that on any given day we can accomplish 5 blocks of time building good things. To me that is a reasonable way to look at building and creating. Some people for sure are able to work in a different way, creating more or fewer blocks of production. Generally I’m looking at reasonably hard things that are broken into achievable blocks of things that can be done. I cannot code a whole application in a single block of time. That task could be broken into a reasonable set of blocks and I could certainly work on completing that effort.

Right now I’m working to finish up block 142 of the Lindahl Letter Substack publication. I’m seriously considering closing the newsletter at 150 weeks of writing effort. I might let it go till 156 weeks which would be a complete 3 years of content generation. I had considered switching to a pay model and delivering more in depth independent research each week. Each week right now I provide a brief research note on the topic I’m interested in researching. It’s really a sharing of what I’m interested in and that is the sole and direct focus of the writing enterprise on that one. I have already moved to sharing the same content on my weblog each week at the same time. That got me thinking about where people consume content these days.

Within academic spaces content has always been harder to access than it should have been with paywalls, high prices, and subscriptions. Journals are great for keeping and storing ideas shared between academics who subscribe and read the journal. It’s a community of interest and it works generally for that academic community. People outside that circle wanting access might need to go to a library or decide if they want to pay for the journal. It’s a limiting circle of content management. Publishing a series of research notes is probably essentially ephemeral in nature. While in the abstract the internet never forgets, we have reached the point where it’s really large and probably not backed up. That ephemeral nature will mean that the weekly posts will probably at some point vanish. I had considered that reality from the start of the endeavor and at the end of each year I pooled that year’s Substack content into a book. Right now two of those ponderous tomes of thought sit next to me on the shelf.

Those efforts will probably stay in publication longer than anything stored on the internet at large. I keep my web hosting paid for 5 years out so in theory that is the longest horizon of serving up that content on the open internet. I’m digging into some deeper topics today and that is interesting for a Saturday morning.

Election prediction markets & Time-series analysis

Thank you for tuning in to this audio only podcast presentation. This is week 138 of The Lindahl Letter publication. A new edition arrives every Friday. This week the topic under consideration for The Lindahl Letter is, “Prediction markets & Time-series analysis.”

We have been going down the path of digging into elections for a few weeks now. You knew this topic was going to show up. People love prediction markets. They are really a pooled reflection of sentiment about the likelihood of something occurring. Right now the scuttlebutt of the internet is about LK-99, a potential, maybe debunked, maybe possible room temperature superconductor, and people are predicting whether or not it will be replicated before 2025 [1]. You can read the 22 page preprint about LK-99 on arXiv [2]. My favorite article about why this would be a big deal if it lands was from Dylan Matthews over at Vox [3]. Being able to advance the transmission power of electrical lines alone would make this a breakthrough.

That brief example being set aside, people can now really dial into the betting markets for elections, which right now are not getting nearly the same level of attention as LK-99. That is probably accurate in terms of the general scale of possible impact. You can pretty quickly get to all posts that the team over at 538 have tagged for “betting markets” and that is an interesting thing to scroll through [4]. Beyond that look you could start to dig into an article from The New York Times talking about forecasting what will happen to prediction markets in the future [5].
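To make the idea of a market as a pooled estimate of likelihood a little more concrete, here is a minimal Python sketch. It assumes a market made of contracts that each pay $1 if their outcome occurs, so a price can be read as a rough probability once the prices are rescaled to sum to one. The candidate names and prices are made up for illustration, and simple rescaling is just one possible way to handle prices that do not add up to exactly $1; none of this is drawn from the sources linked here.

```python
def implied_probabilities(prices):
    """Rescale contract prices so they sum to 1 and can be read as probabilities.

    `prices` maps an outcome name to the price (in dollars) of a contract
    that is assumed to pay $1 if that outcome occurs.
    """
    total = sum(prices.values())
    if total <= 0:
        raise ValueError("prices must sum to a positive number")
    return {name: price / total for name, price in prices.items()}


# Hypothetical prices, invented for this example only.
example_prices = {"Candidate A": 0.54, "Candidate B": 0.42, "Other": 0.08}

probs = implied_probabilities(example_prices)
for name, p in probs.items():
    print(f"{name}: {p:.1%}")
```

The raw prices in this made-up market add up to a bit more than $1, which is why the rescaling step is there before reading anything as a probability.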

You know it was only a matter of time before we moved from popular culture coverage to the depths of Google Scholar [6].

Snowberg, E., Wolfers, J., & Zitzewitz, E. (2007). Partisan impacts on the economy: evidence from prediction markets and close elections. The Quarterly Journal of Economics, 122(2), 807-829. https://www.nber.org/system/files/working_papers/w12073/w12073.pdf

Arrow, K. J., Forsythe, R., Gorham, M., Hahn, R., Hanson, R., Ledyard, J. O., … & Zitzewitz, E. (2008). The promise of prediction markets. Science, 320(5878), 877-878. https://users.nber.org/~jwolfers/policy/StatementonPredictionMarkets.pdf

Berg, J. E., Nelson, F. D., & Rietz, T. A. (2008). Prediction market accuracy in the long run. International Journal of Forecasting, 24(2), 285-300. https://www.biz.uiowa.edu/faculty/trietz/papers/long%20run%20accuracy.pdf 

Wolfers, J., & Zitzewitz, E. (2004). Prediction markets. Journal of economic perspectives, 18(2), 107-126. https://pubs.aeaweb.org/doi/pdf/10.1257/0895330041371321 

Yeah, you could tell by the title that a little bit of content related to time-series analysis was coming your way. The papers being tracked within Google Scholar related to election time series analysis were not highly cited and to my extreme disappointment are not openly shared as PDF documents [7]. For those of you who are regular readers you know that I try really hard to only share links to open access documents and resources that anybody can consume along their lifelong learning journey. Sharing links to paywalls and articles inside a gated academic community is not really productive for general learning.
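Since the open access papers did not pan out, here is a small Python sketch of about the simplest form of time-series analysis you could apply to election data: a trailing rolling average that smooths a noisy weekly series. The weekly numbers are invented for illustration, the function name is my own, and this is only a starting point under those assumptions, not a method taken from any of the papers in that search.

```python
from collections import deque


def rolling_mean(values, window):
    """Return the trailing rolling mean of `values`.

    Early entries average over however many points are available so far,
    so the output has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # keeps only the last `window` values
    smoothed = []
    for value in values:
        recent.append(value)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


# Made-up weekly vote share numbers (percent) for a single candidate.
weekly_share = [47.0, 48.5, 46.0, 49.0, 50.5, 48.0]

smoothed = rolling_mean(weekly_share, window=3)
for week, (raw, avg) in enumerate(zip(weekly_share, smoothed), start=1):
    print(f"week {week}: raw={raw:.1f} smoothed={avg:.2f}")
```

A trailing window only looks backward, which is the part that matters for forecasting, because a centered average would quietly use data from the future.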

Footnotes:

[1] https://manifold.markets/QuantumObserver/will-the-lk99-room-temp-ambient-pre?r=RWxpZXplcll1ZGtvd3NreQ

[2] https://arxiv.org/ftp/arxiv/papers/2307/2307.12008.pdf

[3] https://www.vox.com/future-perfect/23816753/superconductor-room-temperature-lk99-quantum-fusion

[4] https://fivethirtyeight.com/tag/betting-markets/ 

[5] https://www.nytimes.com/2022/11/04/business/election-prediction-markets-midterms.html

[6] https://scholar.google.com/scholar?hl=en&as_sdt=0%2C6&q=election+prediction+markets&btnG= 

[7] https://scholar.google.com/scholar?hl=en&as_sdt=0%2C6&q=election+time+series+analysis&oq=election+time+series+an 

What’s next for The Lindahl Letter? 

  • Week 139: Machine learning election models
  • Week 140: Proxy models for elections
  • Week 141: Election expert opinions
  • Week 142: Door-to-door canvassing

If you enjoyed this content, then please take a moment and share it with a friend. If you are new to The Lindahl Letter, then please consider subscribing. New editions arrive every Friday. Thank you and enjoy the week ahead.

Maintaining 5 slots of building each day

Focusing inward on delivering requires a certain balance. My balance has been off recently. I got knocked off my feet and it impacted my ability to produce blocks of content for about a week. That type of thing does not normally happen to me. It was a new set of emotions and things to consider. Getting knocked down hard enough to pause for a moment and need to look around before moving again was a very new sensation. I’m not sure it was something that I was looking for or even prepared to experience. Really the only thing that put me back on the right track to success and deeper inward consideration (restored balance) was the passage of some time. It just took a little bit of time for me to internalize and move on to a new set of expectations. 

Each new day brings forward a set of time for creating blocks of content. My thoughts right now are around the consideration of making and maintaining 5 slots of building each day. To that end I have been sitting down at the whiteboard and writing down 5 good things to work on each day and trying to make sure they are attainable blocks to complete. At this time, I don’t want to put multi-slot blocks or all day blocks on the board for action and review. This is not the time for that type of stretching and personal growth by taking on highly complex activities. Right now is the time to make things clear, work on the clear things, and be stronger with that resolution every single day going forward.

Maybe getting back to the absolute standard of sitting down at the very start of the day after drinking two shots of espresso and writing for a few minutes is the key to reframe my day. It is something that has been missing. It was missed. Perhaps it was missed more than I even realized at the time. I’ll admit to sitting down and watching about 4-5 seasons of the Showtime series Billions instead of actively writing and building. Alternatively, I could have been listing some graded sports cards on eBay and working to sell a few of them each day. Let’s zoom out for a second from those thoughts and consider what the next 30 days will uphold as a standard. 

One block of the daily 5 is going to be related to committing code on GitHub. I’m going to really focus my time and energy on making solid contributions to published code. Taking on that effort will help me be focused and committed to something that will become more and more necessary. Building code has changed a bit with the advent of LLMs, but the general thought exercise and logic remain pretty much the same. You might be able to take a wild run at something that was not attainable before and prompt your way to something magical. Generally you are going to go where logic can take you within the confines of the coding world as the framework is a lot more logical than it is purely chaotic in nature. 

5 good things for 9/15

  1. Rework block 142
  2. Commit something LangChain related in Colab
  3. Work on https://www.coursera.org/learn/intro-to-healthcare/home/week/1
  4. Review blocks 143-145
  5. Start building voter data baseline package

Outside of those efforts generally as a part of my daily routine I’m producing a daily vlog via YouTube Shorts and striving to output a daily reflection functional journal blog post. I’m going to try to take some inline functional journal notes throughout the day as well. That is going to structurally end up with a sort of blog post being written at the start of the day and then a bunch of more inline bullets being created. Posting is still going to happen at the end of the day or potentially a day delayed. 

Delivering independent research is more important now than ever. I have spent some time thinking about the models of how that research is delivered and what value it has generally. 

Block 142 is pretty much ready to go. I’ll be able to record it tomorrow morning and stay on track to have a 4 block recorded backlog of content ready to go for my Substack. 

During the course of reviewing blocks 143 to 145 I considered if those are even the right topics to spend time working. They are probably fine elements of things to research. It’s not about producing timely content, but instead it is about making meaningful blocks of content that are not time sensitive. That of course is always a harder thing to accomplish while producing independent research.

A few hours of extreme focus

This week had two days with a more active degree of functional journaling. It’s interesting to keep the Google Doc up and just drop little insights into it throughout the day. It reminds me of years ago when the internet was exciting and awesome or at least it seemed to be full of wonder. 

Today will be the 21st day of the vlog over on YouTube. It has been a long time since daily video delivery was a part of my workflow. 

I uninstalled Dropbox from my phone. 

I have spent a lot of time on block 142. It has involved going down all sorts of rabbit holes to dig into a variety of different adjacent topics.