Working from draft form to a final manuscript

I have been really focused on writing an introduction to machine learning syllabus to share with everybody over on my Substack newsletter. Most of my time and energy has gone into that effort. Right now I’m at the point where a draft exists and has been shared out. That is generally a great point in the process. For me it means that I need to let it breathe for a bit and then go back and rework and reread it a few days later. Picking it up with fresh eyes lets me catch the little things that otherwise seemed ok in the initial draft. During the course of that process I have learned how to make figures and tables and how to generally use the LaTeX syntax. That was indeed a battle, and I shared the files so others could take a look if they wanted to see how I used the syntax. I ended up having to learn the whole thing from a bunch of tutorials on YouTube each time I wanted to do something new. It was not until the last section of the material that I had to learn how to make tables in LaTeX, which was shockingly complex compared to what I expected. You have to understand a bit about how the structure works to see how to modify it in practice.
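
For anyone curious about what that structure looks like, here is a minimal sketch of a LaTeX table. It is not taken from my syllabus files. The column names, the values, the caption, and the label are all placeholders made up for illustration, and I am assuming the booktabs package is available (it is on Overleaf) for the horizontal rules.

```latex
\documentclass{article}
\usepackage{booktabs} % provides \toprule, \midrule, \bottomrule

\begin{document}

% The outer "table" environment is the float: it handles placement,
% the caption, and the label used for cross-references.
\begin{table}[ht]
  \centering
  \caption{Placeholder caption}
  \label{tab:example} % hypothetical label name

  % The inner "tabular" environment is the actual grid.
  % {l c r} declares three columns: left, centered, right aligned.
  \begin{tabular}{l c r}
    \toprule
    Week & Topic & Words \\   % & separates cells, \\ ends a row
    \midrule
    1 & Placeholder topic A & 1000 \\
    2 & Placeholder topic B & 1200 \\
    \bottomrule
  \end{tabular}
\end{table}

\end{document}
```

The part that took me a while to see is that there are two nested pieces. The outer table environment only positions things on the page, while the inner tabular environment defines the columns and rows. Adding a column means changing both the column specification and every row.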

Part of learning the LaTeX syntax during my journey was learning to appreciate the Overleaf website and how it manages that type of content. At first, I was wondering why this was any different than using the Google Docs or Microsoft Word processing environments. It really is a bit different and it worked out well enough. It is worth the small cost to be able to use it, and I can see where having collaborators and sharing a document is something that the platform helps facilitate in a deeply powerful way. Now that the basic draft process on that syllabus is complete, it is time to really focus deeply on the “what’s next?” question. Within my research trajectory notes and upcoming research pages on the Weblog I have a few ideas of what I’m working toward creating. At the moment, I’m thinking that my work with machine learning literature reviews is not complete. I may work out a few deeper looks at some of the topics contained within the syllabus. I am now able to format my research notes and literature reviews as PDF documents typeset in LaTeX.

I read an article over at The Verge reporting that Google is tracking what I’m doing in my Google Doc, and that is not entirely surprising. I will say that during the course of writing in my Substack file, which is now drafted to week 87 of 104 planned writing sessions, the algorithm has gotten better at providing edit suggestions while I write and at matching my phrasing. That document about machine learning is really close to 100,000 words. Right now it is at a word count of 96,925. I’m guessing that in terms of purely original technical learning prose creation I’m on the deeper end of the documents they are analyzing. Somebody I’m sure has written something that is longer. They probably have a different writing schedule than I do and the overall feel and style is probably different.

Worrying about productivity and LaTeX editing

Yeah, I sort of thought it would be possible to just jump in and use Overleaf to edit a LaTeX template. I’m going to end up going back and looking at a few tutorials on YouTube to understand the finer points of what is happening within the document. It was easy enough to save and load the template. Making a copy was pretty routine and renaming the original was highly intuitive. I was able to edit the title and author information, but a few of the elements in the source file did not really make sense to me. That is why I’m going to watch a couple of tutorial videos to really get a better understanding of what is going on within the document. At this point, I’m pretty sure this will be something that I can manage, and that it will help me produce papers from my work on a more regular basis. That is where things are at right now.
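
For reference, here is a minimal sketch of where title and author information typically sit in a LaTeX source file. This is a generic article skeleton with placeholder values, not the actual template I loaded, so a real template will have more going on in the preamble than this.

```latex
\documentclass{article}

% Preamble: metadata is declared here, before \begin{document}.
\title{Placeholder Title}
\author{Placeholder Author}
\date{\today}

\begin{document}

\maketitle % prints the title, author, and date declared above

\begin{abstract}
Placeholder abstract text.
\end{abstract}

\section{Introduction}
Body text goes here.

\end{document}
```

Editing the title and author only means changing the text inside the braces. Nothing shows up in the PDF until \maketitle is reached inside the document body.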

My current backlog of produced podcasts stands at 2 recorded and loaded episodes. One is ready to go out on July 15 and the other is ready for July 22. That leaves us with the draft of week 79 that is generally complete, but not very compelling. I had moved on and written a pretty decent missive for week 80 that is much longer. The outline for week 81 is clear enough, but it needs more work to bring it up to the standard necessary for recording. I knew that the content from week 81 to week 87 was going to be difficult to generate. Writing out an 8-part syllabus for how I would introduce machine learning is an interesting intellectual challenge. My goal of course is to allow anybody reading the material to come up to speed with a general understanding. The reader would really have to read the materials and dig into them deeply to walk away with next level skills. That is really the hard part of putting this content together. It needs to be approachable and provide the breadth necessary to introduce machine learning. At the same time, the content contained in the syllabus has to provide enough depth for those readers who are consuming it to gain knowledge beyond a basic introduction.

I may very well, for fun, take the 8-part introduction to machine learning syllabus and convert it into a LaTeX document in Overleaf at the end of the process. That would take something that I know is going to be completed and give me an opportunity to really mess around with the typesetting. It might even give me a chance to figure out the integration between Overleaf and GitHub, which seems to exist, but which I have not had the opportunity to explore. That will probably be a good use of my time. The other way to go about getting some practice with Overleaf and LaTeX would be to take a few of my talks from the last few years and convert them over to paper format. Most of those talks have a transcript and a PowerPoint which could be easily converted over to a LaTeX document. Honestly, that content was probably a better fit for dissemination by recorded video and the follow-up transcripts. Most of the content people consume is just text in a browser from a webpage, news source, or some type of application. A much smaller percentage of the population in general consumes all their content from PDFs containing academic papers.

I absolutely read a ton of articles and jump in and out of consuming content generally available and content packaged up as academic articles or research notes. Those of you who have read my work for a longer period of time will know that I enjoy a bit of research trajectory mixed into my papers. Knowing the bigger picture and where things are going is an important part of how I consume knowledge. I want to know where it fits into the broader spectrum of the academy and how the author intended it to either move things forward or cement something that needed to be shored up with additional research. That is an important part of the equation that is missing from a lot of machine learning papers that I end up reading. The authors get very focused on the mechanism of the mousetrap and how it functions. They don’t really share the importance of the mousetrap in the broader context of the research within the field. It’s possible that a few papers on the research trajectory of machine learning are necessary. The thesis I have advanced is that overcrowding is causing a problematic scenario where more content than can possibly be consumed is being created and the noise outpaces the signal by an order of magnitude.

Starting to learn how to edit with the LaTeX typesetting system

This weekend a little blogging on the WordPress Android application occurred via my Google Pixel 5 smartphone. Two different posts were made to keep my writing streak alive. Both of the posts were just updates on my activity during the weekend, but they were enough to keep things moving along. During the lengthy car ride back to Denver from Kansas City I gave some thought to the edges of the things being explored in my writing. I’m getting to the bleeding edge of a lot of different academic work. Writing is occurring often at that edge, but I’m not taking the time to put it into an academic paper format for submission. While I don’t wholesale believe in that type of writing for every purpose, it probably is something that deserves an investment of my time and energy going forward.

I’m learning how to use the online site Overleaf as a LaTeX editor. A lot of people ask questions online about the best LaTeX editor for beginners. Over the years I have become very skilled at using Microsoft Word to produce manuscripts and it has worked just fine. Millions of people use it daily. Right now I’m writing out of a Google Docs file with a .DOCX extension. Working out of a LaTeX editor is not something that I really ever do. Either I have to learn how to write in an editor that supports that format or I have to take the time at the end of the journey to convert everything over to that format. Some people have found ways to edit LaTeX documents in Google Docs and it seems that it might be possible. Instead of messing around with that type of effort I’m going to just go all in with Overleaf and see what happens. Today will be the day that starts and I’m hopeful it will be a fun adventure. Learning how to modify and work with LaTeX formatting is not really something that I want to invest my time and energy into, but it seems like something that will end up paying off in the end. 

It should be possible to take my research note on open software MLOps repositories shared on GitHub and get everything converted over to LaTeX using Overleaf. I found an arXiv style template that will serve as a basis for the final output. It should be a fun little adventure in the fine arts of typesetting. Right at the start it is clear that having the source and the recompiled output split across two sides of a screen is radically different from what I normally handle as a workflow. Right now I’m writing in what is basically a print preview mode that shows me live pretty much what will be sent to the printer or to a PDF document for that matter. I’m not sold on the idea that you need some type of academic typesetting to gatekeep the publishing world as a technological barrier to entry at the port of academic freedom.