Machine learning ops: how to make AI really work for you — from pilot to production
Machine learning, and artificial intelligence (AI) in general, have been around for a long time. Statistical methods have been used in analytics for decades, and machine learning methods for at least 20 years. The same methods are still in use today, just under a new umbrella term: AI.
With this in mind, some people will think to themselves: “Hey, we have that. I get statistics and forecasts sent to my email in Excel sheets each day!” So in a sense, you already have MLOps in place: somebody has built a report and modeled the data for predictions, and the delivery of the model’s forecast results has been automated.
Then we would like to ask you, the reader, a handful of questions:
- Is this solution of yours easily reproducible?
- Can it be transferred and set up in another environment?
- Do you keep track of your models’ forecasting results?
- How long does it take to react if you later find out that the model wasn’t working correctly and some decisions have already been made based on its forecasts?
Not sure? This is quite common.
These are some of the questions where AI development practices usually fall short. Many companies do not have much hands-on experience with AI in production and struggle to bring software development CI/CD practices to AI development.
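To make the questions about tracking forecast results and reacting to a faulty model more concrete, here is a minimal sketch in Python. It assumes every forecast is stored together with the version of the model that produced it. The names ForecastRecord, ForecastLog and affected_by are hypothetical and only serve as an illustration; a real setup would persist the records in a database rather than in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ForecastRecord:
    """One forecast plus the metadata needed to trace it back (hypothetical schema)."""
    model_version: str
    created_at: datetime
    target_date: str
    value: float


@dataclass
class ForecastLog:
    """In-memory stand-in for a forecast store; assumed for illustration only."""
    records: List[ForecastRecord] = field(default_factory=list)

    def log(self, model_version: str, target_date: str, value: float) -> ForecastRecord:
        # Store the forecast together with the model version and a UTC timestamp.
        record = ForecastRecord(model_version, datetime.now(timezone.utc), target_date, value)
        self.records.append(record)
        return record

    def affected_by(self, model_version: str) -> List[ForecastRecord]:
        # Return every forecast produced by the given model version.
        return [r for r in self.records if r.model_version == model_version]
```

If version "v1" later turns out to be faulty, calling affected_by("v1") lists exactly the forecasts, and therefore the decisions, that need to be reviewed.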
A PoC/MVP usually isn’t production ready. Often it’s just a Python script or a Jupyter notebook with the code needed to prove that the idea behind the model works. After this, the real work starts: building the infrastructure needed to run the model in production.
In our experience with machine learning in production, the modeling is usually 10% of the whole project. Data/feature engineering is 20-40%, and a minimum of 50% goes into setting up the infrastructure for production. Neglecting the fact that time to production in AI development is quite long can lead to not being able to leverage AI at all. But there are options to make this process smoother and faster.
Instead of a “one project, one model, one production integration” way of working, companies should have an MLOps platform: a platform that is modular, easy to use and scales as needed. Essentially, all the building blocks needed to take a model to production are containerized: scheduled data acquisition and transformation, model training, model monitoring, and deployment (e.g. behind an API or as an OTA deployment to a network of IoT machines). This means that every time a new model is needed, the existing pipeline can be scaled and modified relatively quickly to support it.
Once the infrastructure has been set up and the analysts, data scientists and DevOps people are familiar with the platform, more time can be spent developing new models and enhancing old ones. This directly improves the ratio of money earned to money spent over time.
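The modular idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each building block is modeled as a plain function instead of a container, and the step names, the data values and the outlier threshold are all made up. The point is that supporting a new model means swapping one step while the rest of the pipeline stays the same.

```python
from statistics import mean, median
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Step = Callable[[Context], Context]


def acquire_data(context: Context) -> Context:
    # Stand-in for a scheduled data pull; the values are invented.
    return {**context, "raw": [10.0, 11.0, 15.0, 50.0]}


def transform(context: Context) -> Context:
    # Example transformation: drop values at or above a hypothetical outlier threshold.
    return {**context, "features": [x for x in context["raw"] if x < 40.0]}


def train_mean_model(context: Context) -> Context:
    # Toy "model": forecast the mean of the cleaned values.
    return {**context, "forecast": mean(context["features"])}


def train_median_model(context: Context) -> Context:
    # Alternative toy "model": forecast the median instead.
    return {**context, "forecast": median(context["features"])}


def monitor(context: Context) -> Context:
    # Simple health check: the forecast should lie within the range of the training data.
    features = context["features"]
    healthy = min(features) <= context["forecast"] <= max(features)
    return {**context, "healthy": healthy}


def run_pipeline(steps: List[Step]) -> Context:
    # Run the steps in order, passing the accumulated context along.
    context: Context = {}
    for step in steps:
        context = step(context)
    return context


# Same pipeline, two models: only the training step differs.
baseline = run_pipeline([acquire_data, transform, train_mean_model, monitor])
variant = run_pipeline([acquire_data, transform, train_median_model, monitor])
```

In a real platform each step would typically be its own container with a scheduler in front, but the contract between the steps is the same idea.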
This is essentially the idea behind MLOps, so if your company has decided to go full on with AI, then MLOps should definitely be a top priority in your AI development strategy.
This CI/CD way of thinking in AI can be extended further with FeatureOps and feature store functionality. FeatureOps addresses questions such as: how can we be sure that model features are always calculated the same way, even if the dataset is slightly different or the person who built the model and the features leaves the company? How can we quickly select the right columns from a database without prior knowledge of the data, given that we know we are solving a binary classification problem and know the target we are predicting? A feature store is a sort of data warehouse just for AI datasets. Find more information at the following links:
MLOps platforms, feature stores and even FeatureOps can be purchased as a product or as SaaS instead of building your own, and both options have their pluses and minuses.
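As an illustration of the FeatureOps question of how to make sure features are always calculated the same way, here is a minimal Python sketch. It assumes feature definitions live in one shared registry rather than in individual notebooks. The registry, the decorator and the two example features (price_per_unit, is_bulk_order) are hypothetical and do not reflect how any particular product works.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
FeatureFunc = Callable[[Row], float]

# Single source of truth for feature definitions (hypothetical registry).
FEATURES: Dict[str, FeatureFunc] = {}


def feature(name: str) -> Callable[[FeatureFunc], FeatureFunc]:
    """Register a feature function under a unique name."""
    def register(func: FeatureFunc) -> FeatureFunc:
        if name in FEATURES:
            raise ValueError(f"feature {name!r} is already defined")
        FEATURES[name] = func
        return func
    return register


@feature("price_per_unit")
def price_per_unit(row: Row) -> float:
    # Guard against division by zero so every dataset is handled the same way.
    return row["revenue"] / row["units"] if row["units"] else 0.0


@feature("is_bulk_order")
def is_bulk_order(row: Row) -> float:
    # The threshold of 100 units is an invented example value.
    return 1.0 if row["units"] >= 100 else 0.0


def build_features(rows: List[Row]) -> List[Dict[str, float]]:
    # Apply every registered feature to every row, in a fixed (sorted) order.
    return [
        {name: func(row) for name, func in sorted(FEATURES.items())}
        for row in rows
    ]
```

Because training and production both call build_features, a feature cannot silently be calculated differently in the two places, and the definitions stay with the code even if the person who wrote them leaves the company.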
If you would like to hear more about MLOps, see a demo or discuss other AI-related work, don’t hesitate to contact us.