This weekend I’m chewing on some of that Microsoft Learn Python coding training. Instead of doing the exercises in Microsoft Visual Studio Code, I’m working out of Google’s Colab online notebook system. It adds a little extra mental challenge to do everything in a different environment. I’m thinking about restructuring some of the content into a more weekly look at how somebody could pick up and learn Python, with tutorials based on chained lessons in a Jupyter notebook. That could make it a little more interactive for the reader working through the learning points of emphasis.
All I really want to do today and during this weekend is spend some time coding in Python. That should be easy enough to focus on today. Recently, I have been working on some of the Microsoft training that is freely available. This section of the Microsoft website is called Microsoft Learn [1] and it has a ton of free content you can consume. As you can tell, I’m trying out a traditional footnote approach this time around. Bracketed footnotes are being used to reference things in a post. This seems like the most durable way to do that while I’m writing and using WordPress. Fancier ways exist to do these types of things, but they are not as durable or transferable between spaces via copy and paste. Beyond that aside, I’m still focused on spending some time coding in Python.
[1] Direct website link: https://docs.microsoft.com/en-us/learn/browse/
Last night I spent some time working and thinking deeply about election engagement modeling to predict voting patterns. I’m hoping that later today the ideas in the pressure cooker of my thoughts will have evolved enough to be sharable on GitHub in the form of a Jupyter notebook. Most of the time I have thought out how I’m going to code or create something before I ever touch the keyboard. After it is all typed up and working, taking the next step of making the model shareable online in a repository will help set a date stamp to it and make it rather official. People have built some reliable and some very poor models for election prediction. It seems like now is as good a time as any to publicly throw my hat into the ring on this one. This notebook will include my first attempt to run a map-driven model in a Jupyter notebook. That alone should be fun to figure out step by step how to load and model based on geographic data tables. Part of the fun of this exercise is learning a little more about how to use Jupyter notebooks and doing something that I would not normally spend my time doing. Right now having a few coding adventures is probably the right thing to do with my time.
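None of that model exists in code yet, but the shape of a first attempt can be sketched now. This is a minimal Monte Carlo electoral simulation; the state names, win probabilities, and electoral vote counts below are made-up placeholders for the example, not real forecasts.

```python
import random

# Illustrative inputs only: probability that candidate A wins each state,
# paired with that state's electoral votes. These numbers are invented.
states = {
    "Colorado": (0.60, 9),
    "Ohio": (0.45, 18),
    "Florida": (0.50, 29),
}

def simulate_election(states, trials=10_000, seed=42):
    """Estimate how often candidate A wins a majority of the listed votes."""
    rng = random.Random(seed)
    total_votes = sum(ev for _, ev in states.values())
    wins = 0
    for _ in range(trials):
        votes = sum(ev for p, ev in states.values() if rng.random() < p)
        if votes > total_votes / 2:
            wins += 1
    return wins / trials

print(simulate_election(states))
```

A real notebook would swap the hand-typed dictionary for a loaded data table and many more trials, but the loop itself stays this simple.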
Colorado has four major wildfires burning right now, and the smoke from those fires has made the air quality in Denver questionable recently. You can see the statements about air quality on the official Colorado Department of Public Health & Environment website. The statements basically say that visibility and air quality have been impacted. Originally I had those previous sentences just hanging off the first paragraph. It took me a second to realize the topic had entirely changed and that a new paragraph was justified. I could probably continue to provide some supporting sentences or thoughts about the air quality right now, but you can imagine what a campfire smells like and extrapolate that to an entire region.
Let’s jump back to the key topic at hand for the day. I’m going to start learning how to use GeoPandas and Geoplot (or maybe Matplotlib) to create some sweet visualizations. I’m going to start out small using a few different examples before working up to building out a 50 state electoral college prediction visualization. It seems like it would be a good skill (or at least a fun one) to have going forward. My goal for this effort is to drop some of these examples on GitHub along the way. I always try to walk step by step through the example to ensure that it is repeatable and that it is easy for somebody to click from step one to the last step and understand what happened. This is great for both helping other people and creating repeatability within the research effort. Having really solid Jupyter notebook documentation reduces the barrier for replication within research, and that is fundamentally a healthy direction to take within academic research. Somebody could easily adapt my methods and change the data, or run it again with the same data to verify things happened and worked as expected. The one problem with this method is that everything in a Jupyter notebook is like a snapshot in time. Things will change within the dependencies and at some point the notebook will have errors and start failing on some deprecated functionality. That is one of the most frustrating parts of working with code. You have to constantly rework things that were done to keep them current.
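I have not written any of the GeoPandas side yet, but one small, dependency-free piece of the pipeline can be sketched now: mapping a vote margin to a fill color for a choropleth. The thresholds and hex codes below are arbitrary choices of mine for the example.

```python
def margin_to_color(margin):
    """Map a two-party vote margin (positive favors candidate A) to a hex
    fill color. Thresholds and colors here are arbitrary assumptions."""
    if margin > 0.05:
        return "#2166ac"   # solid A
    if margin > 0:
        return "#92c5de"   # lean A
    if margin < -0.05:
        return "#b2182b"   # solid B
    if margin < 0:
        return "#f4a582"   # lean B
    return "#f7f7f7"       # tied

print(margin_to_color(0.12))
```

In a GeoPandas notebook, a color column built with something like `gdf["margin"].map(margin_to_color)` should then drive the state-by-state fill when plotting the geometry; that wiring is the part I still need to learn.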
A 98-day publishing streak continues
My continued quest to build new examples of data analysis has been going well enough. Today will be a real test of my efforts to produce things. Very shortly I’m going to go spend the morning walking around nature. Yesterday, I managed to play 18 holes of golf and it wore me out. After returning from the golf course I didn’t really want to sit down and write or work on any coding projects. I mostly just wanted to drink some water and rest. During that period of restful contemplation I did give some thought to reworking my presentation and corresponding talk on, “Effective ML ROI use cases at scale.” The previous versions of that title had been “Building effective ROI ML use cases” and “ML use cases at scale with effective ROI.” Based on the three titles, I’m sure you can see which one was superior. My preference is for what I feel is the crisper title. Normally, when I’m doing research and writing a paper I give the title a few searches using the Google search engine to see if something shows up. During the course of a literature review I end up searching other databases as well. That is one of those things that just has to be done.
Later today I’m going to try to do better than yesterday, pick back up my coding efforts, and write a little bit more prose. My strategy yesterday, which was effective, was to start work on both the writing and coding early in the morning before my expedition to enjoy nature. My golf game is not particularly good or anything. These two days of vacation and golfing are mostly about spending some time outdoors and enjoying the beautiful Colorado weather in June. The high today is going to be about 83 degrees and the morning weather looks to be perfect for walking around. During the course of walking around and seeing the mountains in the background, some of my thoughts will drift back to the best way to teach and demonstrate to others how to move along the path to finishing a solid coding quest. Part of that is slowly bringing people into the world of modifying data and doing machine learning within Jupyter notebooks. Working in Microsoft Excel is something that most people have done to work with data. You can see it right in front of you and it is easy to manipulate. Eventually people working with larger and larger datasets graduate into using Microsoft Access as a database management tool. Eventually that type of effort graduates into using a Microsoft SQL server or maybe one of the open source database alternatives.
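Python’s standard library actually covers a rung on that ladder already: sqlite3 gives you a real SQL database without installing a server, which makes it a gentle next step after spreadsheets. Here is a small sketch; the table name, columns, and rows are invented for the example.

```python
import sqlite3

def summarize_scores(rows):
    """Load (player, holes, strokes) rows into an in-memory SQLite table
    and return players ordered by strokes, lowest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scores (player TEXT, holes INTEGER, strokes INTEGER)"
    )
    conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)
    # The kind of summary you would build with sorting in Excel:
    result = conn.execute(
        "SELECT player, strokes FROM scores ORDER BY strokes"
    ).fetchall()
    conn.close()
    return result

print(summarize_scores([("Nels", 18, 95), ("Guest", 18, 88)]))
```

Swapping `":memory:"` for a filename persists the database to disk, which is roughly the point where somebody outgrows a workbook and starts thinking in tables and queries.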
Interrupted. School and golf.
Yesterday, I spent some time looking at new keyboard options online. Most of those searches happened via the Best Buy application. My last few keyboards have been ergonomic keyboards built by Microsoft. A replacement keyboard in the same model that I have right now is probably the best option in terms of price. The Natural Ergonomic Keyboard 4000 from Microsoft has been my go-to keyboard for years. They now have a couple of different price points and options. One of the options is well over one hundred dollars. Normally, I would be able to head out to a store and take a look at the keyboards. During these strange times in quarantine, that is not happening. I’m going to watch a few videos and see what people think about the Microsoft Surface Ergonomic Keyboard. It might very well be my next keyboard. Then again, this keyboard I’m using right now might just keep on working and no replacement will be ordered.
Right now my time is being split into two categories of activity. The first category of activity happens to be occurring during my morning window of writing some prose. You normally get to see that in the form of a weblog post. The things that happen during the course of waking up are covered and my ideas are converted into writing about the nature of things. Most of that ends up circling back on the idea of striving toward a perfect possible future and the efforts we make to move forward. The rest of it is muddled prose distilled from inaction and shared with the world. Nobody really wants to read about somebody else being conflicted due to a touch of writer’s block or, worse, procrastination. The second category of activity happens to be a little bit of Jupyter notebook development. My new Data-Analysis repository on GitHub is devoted to sharing notebooks with simple working examples of things you can do and modify to do other things. The idea is for me to explore data analysis efforts in Jupyter notebooks in a definable and repeatable method.
I’m really focused on putting together a collection of tools for people who are learning to do some complex things in the data science space with machine learning. Getting to that point means helping people get used to opening up rich data sets and doing something with them using Jupyter notebooks. The benefit of this approach is that somebody of any programming skill level can simply pick up the notebook, read the instructions, and click the executable code cells from top to bottom to run the example. That is powerful because reading and clicking is an easy way to start. You can also tinker with each cell in the notebook until it does what you want. For somebody learning how to code, this really isolates the problem in the chain of commands and you get pretty decent error messaging. Sometimes that is the key to learning enough to overcome the error.
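To make that concrete, here is the kind of small, self-contained cell I have in mind. The data is inlined so the cell runs anywhere with no file downloads, and a learner can tinker with just this one box; the column names and values are invented for the example.

```python
import csv
import io

# Inline data stands in for a real file, so the cell runs anywhere.
raw = """state,votes
Colorado,9
Ohio,18
Florida,29
"""

reader = csv.DictReader(io.StringIO(raw))
rows = list(reader)

# Sum one column the way a learner might before graduating to pandas.
total = sum(int(row["votes"]) for row in rows)
print(total)  # 56
```

If a learner mistypes the column name, the `KeyError` points straight at this cell, which is exactly the isolation I was describing above.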
Today, I called an audible (in honor of the Pro Bowl) and focused on the “Introduction to Git and GitHub” course by Google on Coursera. It is the third course in the “Google IT Automation with Python Professional Certificate” series that was released this year. Today, it seemed like a really good idea to focus on version control systems. This cup of coffee is starting to work. During all these courses I’m using Microsoft Visual Studio Code as my starting point to try to execute code locally. It is free and seems to be gaining popularity with people who are doing interesting things. My goal for the day is to try to finish this course.
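For anyone working through the same course, the core loop it teaches can be sketched in a few commands. The repository name, file, identity, and commit message below are placeholders for the example; the inline `-c user.name`/`-c user.email` flags just let the commit work on a machine with no global Git configuration.

```shell
mkdir scratch-repo
git -C scratch-repo init -q
echo "print('hello')" > scratch-repo/hello.py
git -C scratch-repo add hello.py
git -C scratch-repo -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "Add hello script"
git -C scratch-repo log --oneline
```

The course builds from this edit-add-commit loop out to branches and remotes on GitHub, but everything later is a variation on these few commands.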