Well over at https://nelslindahl.substack.com/ my next post just went live today.
Substack Week 2: Machine Learning Frameworks & Pipelines
Enter title… Machine Learning Frameworks & Pipelines
Enter subtitle… This is the nuts and bolts of the how in the machine learning equation
Ecosystems are beginning to develop related to machine learning pipelines. Different platforms are building out different methods to manage the machine learning frameworks and pipelines they support. Now is the time to get that effort going. You can go build out an easy to manage end to end method for feeding model updates to production. If you stopped reading for a moment and actually went and started doing research or spinning things up, then you probably ended up using a TensorFlow Serving instance you installed, Amazon SageMaker pipeline, or an Azure machine learning pipeline.[1] Any of those methods will get you up and running. They have communities of practice that can provide support.[2] That is to say the road you are traveling has been used before and used at scale. The path toward using machine learning frameworks and pipelines is pretty clearly established. People are doing that right now. They are building things for fun. They have things in production. At the same time all that is occurring in the wild, a ton of orchestration and pipeline management companies are jumping out into the forefront of things right now in the business world.[3]
Get going. One way to get going very quickly and start to really think about how to make this happen is to go and download TensorFlow Extended (TFX) from Github as your pipeline platform on your own hardware or some type of cloud instance.[4] You can just as easily go cloud native and build out your technology without boxes in your datacenter or at your desk. You could spin up on GCP, Azure, or AWS without any real friction against realizing your dream. Some of your folks might just set up local versions of these things to mess around and do some development along the way.
Build models. You could of course buy a model.[5] Steps exist to help you build a model. All of the machine learning pipeline setup steps are rather academic without models that utilize the entire apparatus. One way to introduce machine learning to the relevant workflow based on your use case is to just integrate with an API to make things happen without having to set up frameworks and pipelines. That is one way to go about it and for some things it makes a lot of sense. For other machine learning efforts complexity will preclude using an out of the box solution that has a callable API. You would be surprised at how many complex APIs are being offered these days, but they do not provide comprehensive coverage for all use cases.[6]
What are you going to do with all those models? You are going to need to save them for serving. Getting setup with a solid framework and machine learning pipeline is all about serving up those models within workflows that fulfill use cases with defined and predictable return on investment models.
From the point you implement it is going to be a race against time at that point to figure out when those models from the marketplace suffer an efficiency drop and some type of adjustment is required. You have to understand the potential model degradation and calculate at what point you have to shut down the effort due to return on investment conditions being violated.[7] That might sound a little bit hard, but if your model efficiency degrades to the point that financial outcomes are being negatively impacted you will want to know how to flip the off switch and you might be wondering why that switch was not automated.
Along the way some type of adjustment to a model or parameters is going to be required. I have talked about this before at length, but just to recap here the way I look at return on investment is pretty straightforward based on the value of the initial ML model minus the initial value of the model and the final value minus the initial value divided by the cost of investment times 100%. Yeah that was a lot to read, but it’s just going to give you a positive or negative look at whether that return on investment is going to be there for you. At that point you are just following your strategy and thinking about the return on investment model.
So again strict return on investment modeling may not be the method that you want to use. I would caution against working for long periods without understanding the financial consequences. At scale, you can very quickly create breakdowns and other problems within a machine learning use case. It could even go so far that you may not find it worthwhile for your business case. Inserting machine learning into a workflow might not be the right thing to do and that is why calculating results and making fact based decisions is so important.
Really any way you do it in a planful way that’s definable and repeatable is gonna work out great. That is fairly easy to say given that inserting fact based decision making and being willing to hit the off switch if necessary help prevent runway problems from becoming existential threats to the business. So having a machine learning strategy, doing things in a definable and repeatable way, and being ruthlessly fact based is kind of where I’m suggesting you go.
Obviously, you got to take everything that I say with a grain of salt, you should know upfront that I’m a big Tensorflow enthusiast. That’s one of the reasons why I use it as my primary example, but it doesn’t mean that that’s the absolute right answer for you. It’s just the answer that I look at most frequently and always look to first before branching out to other solutions. That is always based on the use case and I avoid letting technology search for problems at all costs. You need to let the use case and the problem at hand fit the solution instead of applying solutions until it works or you give up.
At this point in the story, you are thinking about or beginning to build this out and you’re starting to get ramped up. The excitement is probably building to a crescendo of some sort. Now you need somewhere to manage your models. You may need to imagine for a moment that you do have models. Maybe you bought them from a marketplace and you skipped training all together. It’s an exciting time and you are ready to get going. So in this example, you’re going from just building (or having recently acquired) a machine learning model to doing something. At that moment, you are probably realizing that you need to serve that model out over and over again to create an actual machine learning driven workload. Not only does that probably mean that you’re getting to manage those models, but also you are going to need to serve out different models over time.
As you make adjustments and corrections that introduce different modeling techniques you get more advanced with what you are trying to implement. One of the things you’ll find is that even the perfect model that you had and was right where you wanted it to be when you launched is slowly waiting to betray you and your confidence in it by degrading. You have to be ready to model and evaluate performance based on your use case. That is what lets you make quality decisions about model quality and how outcomes are being impacted.
I have a few takeaways to conclude this installment of The Lindahl Letter. You have to remember that at this point machine learning models and pipelines are pretty much democratized. You can get them. They are out in the wild. People are using them in all kinds of different ways. You can just go ahead and introduce this technology to your organization with relatively little friction.
- I’m still amazed that this technology is freely available.
- Frameworks are well developed and have been pressure tested at scale.
- Yeah, people have proven it works.
- The process has been well documented and the path is clear.
- Pipelines and automation save time. Fewer ML team members are needed to deliver this way.
- A lot of the first time doing this gotchas are managed away in this model based on leveraging community knowledge and practice.
- Serving multiple models and model management is hard.
- None of this replaces the deep work required to wrangle the data.
Footnotes:
[1] Links to the referenced ML pipelines: https://www.tensorflow.org/tfx, https://aws.amazon.com/sagemaker/pipelines/, or https://docs.microsoft.com/en-us/azure/machine-learning/concept-ml-pipelines
[2] One of the best places to start to learn about machine learning communities would be https://www.kaggle.com/
[3] Read this if you have a few minutes… it is worth the read https://hbr.org/2020/09/how-to-win-with-machine-learning
[4] https://github.com/tensorflow/tfx
[5] This is one of the bigger ones https://aws.amazon.com/marketplace/solutions/machine-learning
[6] This is one example of services that are open for business right now https://cloud.google.com/products/ai
[7] This is a wonderful site and this article is spot on https://towardsdatascience.com/model-drift-in-machine-learning-models-8f7e7413b563
What’s next for The Lindahl Letter?
- Week 3: Machine learning Teams
- Week 4: Have an ML strategy… revisited
- Week 5: Let your ROI drive a fact-based decision-making process
- Week 6: Understand the ongoing cost and success criteria as part of your ML strategy
- Week 7: Plan to grow based on successful ROI
- Week 8: Is the ML we need everywhere now?
- Week 9: What is ML scale? The where and the when of ML usage
- Week 10: Valuing ML use cases based on scale
- Week 11: Model extensibility for few shot GPT-2
- Week 12: Confounding within multiple ML model deployments
I’ll try to keep the what’s next list forward looking with at least five weeks of posts in planning or review. If you enjoyed reading this content, then please take a moment and share it with a friend.
Leave a Reply