How to enable future AI?
We enable future AI by working with you to collect, clean and consolidate your business-critical data assets and turn them into a real-life AI solution.
Artificial intelligence can learn from data, but it cannot gather the data it needs from typical corporate storage and application databases on its own. More often than not, the results and usefulness of AI depend on the quality and availability of data - and the business value on the exclusivity of your data.
At Veracell, we help companies come up with an actionable data and AI strategy and if needed, take the whole AI project from idea to production. The leanest way to start is our two-week Data & AI strategy sprint, which maps the route to future AI adoption. Take a look at what steps you need to take to start using AI in your company:
Document data models and processes
Enabling AI begins with mapping the current state and documenting data models and processes. We help you identify, polish and protect your most valuable data assets and prepare them for use with state-of-the-art AI algorithms. Both application and AI development benefit from centralising data, resources, code and documentation. This is why we prefer the cloud in everything we do.
Experimentation with models
Model development requires exploratory data analysis (to build understanding of your data and its potential) as well as AI modelling experiments, which often proceed by trial and error. Fortunately, experimentation can be done with relatively low effort to understand the state of the data and what kinds of models can be built on top of it.
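A low-effort experiment of this kind can be as small as cross-validating an off-the-shelf baseline model to see whether the data supports prediction at all. The sketch below uses synthetic data for illustration; in practice you would load a snapshot of your own data.

```python
# A minimal feasibility experiment: cross-validate a robust baseline model.
# The dataset is synthetic (a stand-in for a real data snapshot); all names
# and sizes here are illustrative, not from any particular project.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for a real data snapshot: 1000 rows, 20 features.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# An off-the-shelf baseline; if it scores near chance level, the data
# (or its features) needs more work before deeper modelling.
model = RandomForestClassifier(n_estimators=100, random_state=42)
scores = cross_val_score(model, X, y, cv=5)
print(f"Mean accuracy: {scores.mean():.2f} (+/- {scores.std():.2f})")
```

If the baseline clearly beats chance, it is worth investing in better features and models; if not, the effort is better spent on data quality first.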
Importantly, experiments need to be traceable. They yield assets not usually encountered in regular software development: for example the data snapshot, preprocessor version, model version and model parameters or hyperparameters, all of which can have a large effect on model performance. Conveniently, the most widely used and effective tools for developing AI solutions in business and industry are open source, and several of them address traceability directly.
Results of the experiments are typically shared as a digital notebook that describes the execution steps (code with comments) and outcomes, such as model performance under different scenarios, in visual format. This helps to decide whether the model performance is good enough for production deployment, and if not, whether the model should be improved or the business problem tackled with an alternative AI approach.
Scalable compute infrastructure
Most data sources and stores are not built for artificial intelligence and cannot be used for it directly. AI model execution (training, and sometimes prediction) usually requires heavy CPU/GPU capacity, distributed computation and fast data access. This workload imposes heavy stress on the database, so running it against the production database is not feasible: it would compromise the performance of the production application. The solution is to build a data store for AI that is segregated from the operational (application) database.
With multi-tenant SaaS providers, how the data of multiple clients is stored matters to both application and AI developers, because machine learning usually covers the data of all clients, not just one at a time. Without proper database infrastructure, managing large data sets makes model training slow, and slow training makes innovation slow.
Building a production-grade AI
Software team support is essential in productionizing AI, as software engineers help integrate the AI solution into the production environment. Once production-level models are in use, monitoring, versioning and re-training must be taken care of. Data keeps changing, and knowing what has changed since the last AI model training is important.
A critical aspect of using machine learning models in production systems is monitoring model performance (accuracy and latency). In some cases the situation or data may have drifted so far from what was seen during model training that the model is rendered unusable. Such events should trigger re-training (or, in some cases, falling back to a previous model version).
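One simple way to detect this kind of drift is to compare the distribution of a feature in live data against the same feature at training time. The sketch below uses a Kolmogorov-Smirnov test on synthetic data; the threshold and feature values are illustrative assumptions, not a prescription.

```python
# A minimal data-drift check, assuming numeric features and an illustrative
# p-value threshold. A low p-value suggests live data no longer follows the
# training-time distribution, which could trigger re-training or a rollback.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5000)  # seen at training time
live_feature = rng.normal(loc=0.8, scale=1.0, size=5000)      # shifted production data

def drifted(train, live, p_threshold=0.01):
    """Flag drift when the two samples are unlikely to share a distribution."""
    statistic, p_value = ks_2samp(train, live)
    return p_value < p_threshold

if drifted(training_feature, live_feature):
    print("Drift detected: schedule re-training or roll back the model.")
```

In production this check would run on a schedule per feature, with the alert feeding the re-training pipeline described above rather than printing a message.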
MLOps is the practice of deploying machine learning models to production systems and governing the whole model lifecycle. Tools such as MLflow and Kubeflow support tracking and versioning models, data and experiments, and deploying models to various production environments. Versioning data is also important for reproducibility, which may be required for example for regulatory reasons.
A case study
We met a customer with a highly exclusive and valuable data asset. The data was stored in the application database of a third party, and our customer was building a new application based on the collected data. We helped our customer identify potential use cases for the data and envisioned how AI could help achieve these goals. However, it was too early for that: we had seen only a small part of the data.
For everything to work properly, we built data integrations, new data models, a storage layer and a database layer. Beyond the basics, we also took care of privacy and access control, and documented everything. All of this was done with future AI in mind. Having years of experience in real-life AI solutions, we know what a data scientist wishes for and which barriers we can demolish before running into them.
We started building an AI-enabled data platform to serve the needs of the application. To make it modern, secure and maintainable, and to ensure a painless handover in the future, we made it cloud-native. When there is enough data, a validated use case and a dedicated data scientist, everything is ready.
We help you become AI enabled
We accelerate your way towards unlocking AI opportunities by planning, building and delivering cloud data platforms to store and automatically maintain everything you need to take full advantage of your data assets.
We are not a vendor you are locked into. All our solutions are either built on top of native cloud services or, where we need to customise something, owned by you. This removes barriers to involving other vendors or building your own team in the future.
Do you want to find out how to harness data to grow your business? If so, we would love to have a chat.
From NLP to AutoML – what we are working on
When we founded Veracell, we mostly worked on projects in the healthcare domain. Lately we’ve been active on many fronts. Here's a quick summary of what we are working on.