Yesterday, I started looking around at all the content I have online. The only base I do not have covered is probably a place to share a new speaking engagement photo online. I need to set up a page for speaking engagements at some point with that photo and a few instructions on how best to request an engagement. Every time I have done a speaking engagement my weblog and Twitter traffic picked up for a little bit. Using the “Print My Blog” plugin I was able to export 1,328 pages of content for a backup yesterday. My initial reaction to that was wondering how many pages of that were useful content and how much of it was muddled prose. Beyond that question of usefulness, I also wondered what would come out as the predicted next batch of writing if I loaded that file into the OpenAI GPT-2 model. That is probably enough content to spit out something that reasonably resembles my writing. I started to wonder if the output would be more akin to my better work or my lesser work. Given that most of my writing is somewhat iterative and I build on topics and themes, the GPT-2 model might very well be able to generate a weblog in my style of writing.
Just for fun, I’m going to try to install and run that little project. When that model was released I spent a lot of time thinking about it, but did not put it into practice. Nothing would be more personal than having it generate roughly the same thing that I tend to produce on a daily basis. A controlled experiment would be to set it up, let it produce content each day, and compare what I write during my morning writing session to what it spits out as the predicted next batch of prose. It would have the advantage, or disadvantage, of being able to review 1,328 pages and predict what is coming next. My honest guess is that the last 90 days are probably more informative for prediction than the last 10 years of content. However, that might not be accurate based on how the generative model works. All that content might very well help fuel the right parameters to generate that next best word selection. I had written “choice” to end that last sentence, but it felt weird to say that the GPT-2 model was making a choice, so I modified the sentence to end with “selection.”
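Sketching it out before I actually try it, a minimal version of that experiment might look something like the following, assuming the gpt-2-simple library and a hypothetical blog_export.txt file holding the “Print My Blog” export; this is a rough guess at the wiring, not a tested setup.

```python
# A rough sketch, not a tested setup: assumes the gpt-2-simple library
# (pip install gpt-2-simple) and a hypothetical blog_export.txt created
# from the Print My Blog backup.
import gpt_2_simple as gpt2

model_name = "124M"  # the smallest of the released GPT-2 models
gpt2.download_gpt2(model_name=model_name)

sess = gpt2.start_tf_sess()

# Fine-tune the base model on the exported weblog text.
gpt2.finetune(sess,
              dataset="blog_export.txt",
              model_name=model_name,
              steps=1000)

# Generate the "predicted next batch of prose" to compare against a
# morning writing session.
text = gpt2.generate(sess, length=500, temperature=0.7, return_as_list=True)[0]
print(text)
```

Running the fine-tuning pass once on the full export and again on only the last 90 days of posts would be one way to test whether recent content really is more informative for prediction than the whole archive.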
Interrupted. School.