Nels Lindahl — Functional Journal

A weblog created by Dr. Nels Lindahl featuring writings and thoughts…

Category: Writing

  • 20250205

    I started to wonder when the Microsoft coding teams will have one of their code improvement agents running against open source software on GitHub to suggest improvements via pull requests. Imagine that type of learning agent improving and refining open source projects all the time, initiating pull requests once some threshold of contribution is reached. It could also be used to help find and fix known security vulnerabilities in repositories posted to GitHub. We could also have a refinement layer for builds and branches that, as code is produced, does some additional deep research on the code and offers improvement suggestions. A whole world of possible GitHub and code repository improvements is going to arrive soon based on what is now possible. It’s a huge door that is opening to really improve overall code quality and to make big contributions across the open source projects that underpin a lot of foundational technological layers.

    I checked out this code earlier today: https://github.com/Deep-Agent/R1-V

  • 20250204

    I’m building up my collection of 4K Blu-ray science fiction movies. Fragmentation of content has meant that streaming is just not what it was at one point. I’d rather have a collection of movies and own them to watch at my leisure. That is not a common or typically shared opinion about modern entertainment content. Things ended up getting off to a slow start today. I did not even really start writing until just before the very late Colorado Avalanche game tonight. These late start games are hard given that I’m going to fall asleep before the end of the game. It’s one of those wake up the next day and find out the final score sort of situations.

  • 20250203

    In typical ChatGPT fashion I did not manage to get a single working .ipynb file out of it today. I hit my analysis limit and I’m going to have to wait until tomorrow to give it another old-fashioned college try. Earlier today I watched this OpenAI YouTube video blog, or relaxed video press release, about how they are building and delivering a deep research agent. I generally hope this thing does a lot better job at producing deeper answers. My curiosity is whether it will take 15 minutes or more to produce some type of .ipynb file and still just take a long time to produce a broken file.

    The other thing I ended up reading today was this code project about the ah-ha moment from the DeepSeek research efforts, or simply put, “Clean, minimal, accessible reproduction of DeepSeek R1-Zero.” You can find the code for it here: https://github.com/Jiayi-Pan/TinyZero

  • 20250202

    Storing the web in a form our language model driven future can interact with is probably something that should be considered. Search engines use sitemaps and crunch that data down after processing and collection. We could preprocess the content and provide files to be picked up alongside the content instead of trusting the processing and collection. I’m not sure we will end up with people packaging content for later distribution. That in some ways is a change from delivering hypertext to the online world to an entirely different method of sharing. We could just pre-build our own knowledge graph node and be ready to share with the world as the internet as it was constructed is functionally vanishing. Agentic interactions are on the rise and the number of people visiting and reading online content will diminish. Our method of interface will be with or through an agent and it will be a totally different online experience.
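    Below is a minimal sketch of what a pre-built knowledge node could look like, assuming the site content already sits in plain text files; the directory name, field names, placeholder URL, and output format are hypothetical illustrations rather than any existing standard.

        # Sketch: package site content into a single "knowledge node" file that
        # agents could pick up directly instead of crawling every page.
        import json
        from datetime import date
        from pathlib import Path

        posts = []
        for path in sorted(Path("posts").glob("*.txt")):  # hypothetical content directory
            text = path.read_text(encoding="utf-8")
            first_line = (text.splitlines() or [""])[0]
            posts.append({
                "id": path.stem,                                    # e.g. "20250202"
                "title": first_line[:120],                          # rough title from the first line
                "body": text,
                "source_url": f"https://example.com/{path.stem}",   # placeholder URL
            })

        node = {
            "site": "example.com",
            "generated": date.today().isoformat(),
            "entries": posts,
        }

        # Publish this file alongside the site so agents get preprocessed content.
        Path("knowledge-node.json").write_text(json.dumps(node, indent=2), encoding="utf-8")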

    I actually spent a bunch of time yesterday working on distilling language models by starting with how to work with GPT-2. A few of those notes are shared out on GitHub. They are fully functional and you can run them on Google Colab or anywhere you can work with a Python based notebook. I’m really interested in model distillation right now. A lot of libraries and frameworks for GPT distillation already exist and have for some time. You could grab Hugging Face’s transformers (DistilBERT, DistilGPT2), torchdistill, Google’s T5 distillation techniques, and DeepSpeed & FasterTransformer (for efficient inference). You could do some testing and see what results and benefits GPT-2 distillation actually delivers. A smaller model has fewer parameters and better efficiency. Faster inference could mean you could run on lower-end GPUs or even a CPU if you absolutely had to go that route. Distillation also helps preserve performance, retaining most of the teacher model’s capabilities.

    Breakdown of the potential code steps (a rough sketch follows the list):

    • Loads GPT-2 as a Teacher Model: The full-size GPT-2 model is used for generating soft targets.
    • Defines a Smaller GPT-2 as Student Model: Reduced layers and attention heads for efficiency.
    • Applies Knowledge Distillation Loss: Uses KL Divergence between student and teacher logits. Adds cross-entropy loss to ensure the model learns ground truth.
    • Trains the Student Model: Uses AdamW optimizer and trains for a few epochs.
    • Saves the Distilled Model: The final distilled model is saved for future use.
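
    Here is a rough, minimal sketch of those steps, assuming PyTorch and the Hugging Face transformers library; the student configuration, hyperparameters, and training loop are illustrative placeholders rather than the exact notebook shared on GitHub.

        # Sketch: distill GPT-2 into a smaller student model with a KL divergence
        # distillation loss plus a cross-entropy loss on the ground-truth tokens.
        import torch
        import torch.nn.functional as F
        from transformers import GPT2Config, GPT2LMHeadModel, GPT2TokenizerFast

        tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
        tokenizer.pad_token = tokenizer.eos_token

        # Teacher: full-size GPT-2 used only to generate soft targets.
        teacher = GPT2LMHeadModel.from_pretrained("gpt2").eval()

        # Student: a smaller GPT-2 with fewer layers and attention heads (illustrative sizes).
        student = GPT2LMHeadModel(GPT2Config(n_layer=4, n_head=4, n_embd=256))

        optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)
        temperature = 2.0  # softens the teacher distribution
        alpha = 0.5        # weight between distillation loss and ground-truth loss

        def distillation_step(batch_texts):
            enc = tokenizer(batch_texts, return_tensors="pt", padding=True, truncation=True)
            input_ids, attention_mask = enc["input_ids"], enc["attention_mask"]

            with torch.no_grad():
                teacher_logits = teacher(input_ids, attention_mask=attention_mask).logits

            out = student(input_ids, attention_mask=attention_mask, labels=input_ids)

            # KL divergence between student and teacher distributions (soft targets).
            kd_loss = F.kl_div(
                F.log_softmax(out.logits / temperature, dim=-1),
                F.softmax(teacher_logits / temperature, dim=-1),
                reduction="batchmean",
            ) * (temperature ** 2)

            # Cross-entropy against the actual next tokens (ground truth) via out.loss.
            loss = alpha * kd_loss + (1 - alpha) * out.loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            return loss.item()

        # Train for a few epochs over a text corpus, then save the distilled model.
        for epoch in range(3):
            distillation_step(["Example training text for the distillation sketch."])
        student.save_pretrained("distilled-gpt2")

    In practice the loop would iterate over a real dataset and the student configuration would be tuned, but the loss construction above is the core of the approach.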

    I’m absolutely getting the most out of the free ChatGPT interface these days. I keep hitting the limit and having to wait till the next day to get more analysis time. That is probably a good use case to pay them via subscription, but I’m not going to do that. It makes it a little bit more fun to just try to get as much out of the free tier as humanly possible.

  • 20250201

    Today was one of those days where we moved 600 cases of cookies. That was a rather intense set of adventures that took up a solid chunk of the day. Tonight I’m watching the Paramount+ movie Section 31. You can apparently buy it on Blu-ray. Somebody probably should have found a way to open it without so much exposition pulling characters together; it felt like somebody asked ChatGPT to merge a trimmed down Ocean’s style plot with some edgy Star Trek themes. I’m not entirely sure that some or all of this Section 31 movie wasn’t written by, or at least in partnership with, some sort of LLM.

  • 20250130

    My Nespresso Vertuo coffee maker had been stuck in the descaling mode for the last 2 days. This was highly disappointing given that I really enjoy having two shots of espresso every morning. I put maybe 10 total full refills of water into the machine and let them run through the descaling mode into a red plastic bowl. Weirdly enough, the cycle of refilling the water and the indicator turning red from overheating just happened over and over again. The solution ended up being to fill the water reservoir with a small cup a few times in a row until the machine would open, then close it and press down on the handle and button combo, and after that it went back to regular operations. It was a very frustrating situation. I could not just sit and watch the machine malfunction for hours on end. To be clear, the engineers working on Nespresso product design should have made an actual physical switch labeled descaling mode, and if you wanted to turn that off you could flip it back to regular operations. The current weird combo of clicking things and reading color combinations is beyond difficult to use, and from reading the comments of many frustrated users online the current design is not very effective.

  • 20250128

    Oh my goodness; that Tuesday vibe is just everywhere today. I’m already thinking about lunch at the very start of the day, which is never really a good sign for things. My primary breakfast food is Huel and that has worked out well enough. This Tuesday turned out to be extra full of Tuesday vibes. That is just how it goes sometimes. Watching the Colorado Avalanche or the Jayhawks are both options tonight. Both games are actually on ESPN+, which is odd to have happen at the same time. The games actually overlap so I will probably be jumping around between the two games. Streaming service fragmentation for sports viewing has gotten out of control.

    I updated the WordPress theme header and footer to use a more centered style so they work better across multiple types of views. Based on a couple of initial tests I’m pleased with how the design turned out. The design is probably locked in for the foreseeable future, and with the drama surrounding WordPress who knows if this will be the last ever major theme release.

  • 20250127

    My blog posts are mostly just my thoughts at the time. It’s not a prompt based model generation using some type of system to generate the content. This in the end is content written the old-fashioned way by an author at a keyboard making the prose happen. Beyond writing every day I spend time reading academic articles. Within that mission, yesterday, I ended up reading that DeepSeek article on my iPad and it turned out I had to turn it sideways to be able to actually read the text [1]. Sometimes I can read articles in the vertical mode, but the ones that are single column end up needing that extra viewing space. It was a pretty easy article to read and it felt like it might have been written by, or at least heavily edited by, the very model they were explaining. Years ago I used to print articles and keep them in stacks on my desk and really all over my office. That was a different time and I’m not sure the same process would even really work for me anymore. I tend to read things on my computer screen and now on the iPad. Key ones get saved to my cloud storage files so I can easily get back to them in the version that I read at the time.

    I’m considering spending some time selecting a couple of topics and writing solid literature reviews. It feels like that might be a good way to get back into the habit of crafting article style blocks of content that are a lot more than a simple research note. It’s totally possible to build a collection of research notes and compile or distill that content into something more substantial. Based on my writing plan each morning on the weekends I have time to work on that type of effort.

    Footnotes:
    [1] https://arxiv.org/pdf/2501.12948

  • 20250126

    Focused writing matters. Producing really good prose on a daily basis is about the practice of writing. It’s gearing up to engage in the process of producing written words. Working on longer pieces of prose does benefit from sitting down and sketching out the path of things that should be produced. Just writing a single page and working within the stream of consciousness method is a little different. I know that even during those writing sessions that are just focused on the actual act of creating whatever comes to the forefront of thought, I need to remain focused and recognize that the writing matters. I don’t want to just produce a string of bullet points that lack focus or have no real context for the reader.

    I’m going to spend some time today reading the new academic paper from the DeepSeek artificial intelligence company [1]. They trained a model using a lot less hardware than comparable models, which is interesting. I have saved the paper to my iCloud drive AI papers folder and it will be read on my iPad later today. It’s 22 pages and looks like a relatively easy read. By easy I mean that it is not wall-to-wall mathematics, which makes a paper much harder for me to read, as walking through other people’s mathematics generally takes me a lot longer than reading some text.

    I’ll be curious to see if the methodology could be extended to run within my more complex knowledge reduce framework. Training would have to occur after the knowledge reduce transformation to make the model more efficient. The other method would be to use a multiple call framework that queries the knowledge repository before or during the primary model query. This is the same type of thing that would occur when you prioritize knowledge graph content over generated model content. My guess is that mixtures of knowledge graphs will be used with the next generation of models. You can only derive so much understanding of the world around you before context provides the additional information you need for understanding. Deriving context is a lot more layered and complex given that the reason a word, situation, or action has a deeper meaning might be implied and not frequently cited. It’s also possible that meaning drifts over time, so that contextual drift makes deriving the meaning situational. That is a lot harder to derive and keep in context.
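    To make the multiple call idea concrete, here is a minimal sketch where a plain dictionary stands in for the knowledge repository and a placeholder function stands in for the primary model query; the names are illustrative only and are not part of the actual knowledge reduce framework.

        # Sketch: check a knowledge repository first and prioritize that content
        # over purely generated model output; fall back to the model otherwise.
        from typing import Optional

        knowledge_repository = {
            "deepseek r1": "Saved notes and citations from the DeepSeek R1 paper.",
        }

        def lookup_knowledge(query: str) -> Optional[str]:
            # Naive retrieval: exact keyword match against the stored notes.
            return knowledge_repository.get(query.strip().lower())

        def query_model(prompt: str) -> str:
            # Placeholder for the primary model call (a hosted or local LLM).
            return f"[generated answer for: {prompt}]"

        def answer(query: str) -> str:
            grounded = lookup_knowledge(query)
            if grounded is not None:
                # Prioritize curated knowledge content and let the model work from it.
                return query_model(f"Answer using this source material:\n{grounded}\n\nQuestion: {query}")
            # Nothing stored for this query, so fall back to a pure model answer.
            return query_model(query)

        print(answer("deepseek r1"))
        print(answer("something not in the repository"))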

    I’m super interested in trying to make a list of what I would describe as my anchor articles, the things that I have written that would be the first things somebody should read. Naturally the next step in that process would be to figure out how to produce more of those primary anchor articles and less of the stuff toward the bottom of the list. As an intellectual exercise this would be pretty simple to accomplish. I would take a sheet of paper and divide it into 4 quadrants. The top left quadrant, let’s call it section one, gets filled with the most important written works. You should take a moment and write down as many things as you can think of that you consider to be a good anchor work or piece of writing. Next move to the top right quadrant and begin to cross items off of the first quadrant and move them into the secondary bucket. It’s ok; they cannot all be box one, top shelf works of content. Get some content moved over.

    After working on that 2nd quadrant for a while, move down to the bottom left quadrant. Let’s just call it the 3rd box and write down everything else you can think of as a written work. Some things just won’t come to mind and that is okay. Some writing efforts end up being forgettable. Logically those are the blocks of content that are trending toward the 4th quadrant of this paper, the ones that will be contained in the bottom right box. You can start to cross off things from the 2nd or 3rd quadrants and move them to the final 4th box. Those are the ones we are really interested in understanding. You cannot move every poor writing effort into the stop doing category of prose generation. It’s going to happen; false starts are as important a part of the process as the winners. Very few people are going to sit down, write one thing that is a quadrant one absolute winner, and then just stop producing new content. That would be a pretty rare thing to have happen. Most people write a variety of things and some of them are better than others, and that is just a part of the process.

    It is important to consider the things that ended up getting crossed out and moved to another quadrant and why that might be the case. You want to figure out what would have helped that content be better, or whether it should be rewritten now with new context, or whether your writing skills improved and now is the time to make it even better. It’s also possible that the writing itself was fine and the idea limited what quadrant it might end up living in during this exercise. Some of those things could end up being more appreciated later, or they were just forgettable. I have written millions of words and some of that prose is interesting and a lot of it is not interesting. That is just the nature of being an avid writer.

    Footnotes:

    [1] https://arxiv.org/pdf/2501.12948 

    Featured image: My MacBook Air with a Glitty cover. I’m setting a featured image for this post to test some of the design elements in the Twenty Twenty-Five theme.

  • 20250125

    Teams at both Apple and Samsung are working on non-invasive glucose monitors in watch form which is exciting. That will be a product that helps a lot of people. It will be a very useful application of technology to provide real time information to people. I think we are going to see a lot of wearable health technology going forward. People seem to have an appetite for it and the technology is now reaching a place where things are becoming possible that could be really interesting.

    Yesterday morning, I took the time to scroll all the way back to the start of my Google Keep feed. Right now I am way back in my notes from 2013 to 2017. I’m going to share some of the more interesting ones. A lot of them just got archived in Google Keep. Very few were just deleted.

    • Writing like a memory jukebox that produces one story at a time
    • What is the utility of all the reminders of a certain day image or notes from agents?
    • Write a modern treatise on society based on expanding observation from a single person’s perspective to a wider social net
    • Where is all the innovation happening? Does that change over time?
    • Has social media fragmentation changed local business marketing?
    • A really weird setting of what should have been an asynchronous meeting because the person is not really ready to fully participate. Person is so busy that they can only engage to be critical and ask questions but are not ready for the second wave of questions or to provide any additional context. 
    • Organizing and sustaining decentralized communities of interest. AKA Community organizing revisited: The basics of organizing are pretty simple. Fundamentally this is a treatise on how to bring people together for a common purpose. A community is a diverse ecosystem that has a number of stakeholders and competing interests. Initially building an organizational map of the community takes time and access to key stakeholders. A shared purpose toward a common outcome is the basic building block of an organically built sustainable community. Organizing decentralized communities requires more than basic political methods. Political organizing and fundraising has become more targeted, but at the same time it has become less personal. A political contributor may make donations and read emails from a group without any personal relationship with the organization. Follow up sections: Mapping the networks, Triangulation strategy, Digital vs. Personal, Managing the message, Building frameworks, and Sustainability vs. Outcomes.
    • Civility’s commons: an uncommon civility
    • Do people really trust a user group to define the future of a product?
    • Questions about how faculty at colleges become a commodity. Does this demonstrate an oversupply of potential faculty or some other change?
    • Survey fatigue breaking the polling industry
    • Influencing the public mind: a study of multichannel influence and the public mind
    • Build an economic model using the original Google search algorithm
    • Revisit sentiment mining paper

    Some of the things that got stored in Google Keep were not part of my backlog system or really anything beyond a note taken at the time. The context is now gone and some of the notes are really short. Most of my notes in Google Keep are just a few words and that is it. A few of them are longer, like the ones above. If I went back to trying to write 3,000 words or more per day, then it would be a great prompt library: use the current context to evaluate the prompt and produce content. Honestly, none of the bullets above sparked the kind of innovation that would get me to write about them in a meaningful way.

    I did listen to this podcast and enjoyed it: