If you thought training AI models was hard, try building enterprise apps with them
Despite the billions of dollars spent each year training large language models (LLMs), there remains a sizable gap between building a model and actually integrating it into an application in a way that’s useful.
In principle, fine-tuning and retrieval-augmented generation (RAG) are well-understood methods for expanding the knowledge and capabilities of pre-trained AI models, like Meta’s Llama, Google’s Gemma, or Microsoft’s Phi. In practice, however, things aren’t always so straightforward, Aleph Alpha CEO Jonas Andrulis tells El Reg.
“About a year ago, it felt that everybody was under the assumption that fine tuning is this magic bullet. The AI system doesn’t do what you want it to do? It just has to be fine tuned. It’s not that easy,” he said.
As we’ve previously explored, while fine-tuning can be effective at changing a model’s style or behavior, it’s not the best way to teach it new information.
RAG — another concept we’ve looked at in depth — offers an alternative. The idea here is that the LLM functions a bit like a librarian retrieving information from an external archive. The benefit of this approach is twofold: the information in the database can be changed and updated without retraining or fine-tuning the model, and the generated results can be cited and audited for accuracy after the fact.
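The retrieval loop can be sketched in a few lines. The keyword-overlap retriever and prompt format below are illustrative assumptions, not Aleph Alpha’s implementation — a production system would use vector embeddings and an actual LLM call — but the sketch shows why the knowledge base can be updated, and answers audited, without touching the model:

```python
# Toy retrieval-augmented generation loop. The scoring and prompt
# format are illustrative assumptions, not any vendor's implementation.

def retrieve(query, documents, top_k=2):
    """Rank documents by naive keyword overlap with the query."""
    query_terms = set(query.lower().split())
    scored = [
        (len(query_terms & set(doc.lower().split())), doc_id, doc)
        for doc_id, doc in documents.items()
    ]
    scored.sort(reverse=True)
    return [(doc_id, doc) for score, doc_id, doc in scored[:top_k] if score > 0]

def build_prompt(query, documents):
    """Assemble a prompt that cites each retrieved source by ID,
    so answers can be audited after the fact."""
    hits = retrieve(query, documents)
    context = "\n".join(f"[{doc_id}] {doc}" for doc_id, doc in hits)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

# Updating the knowledge base is just a data update -- no retraining needed.
docs = {
    "policy-v1": "Travel expenses must be filed within 30 days.",
    "faq-7": "The cafeteria opens at 8am.",
}
print(build_prompt("When must travel expenses be filed?", docs))
```

The point of the source IDs in the prompt is exactly the auditability Andrulis describes: a generated answer can be traced back to the documents it drew on.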
“Specific knowledge should always be documented and not in the parameters of the LLM,” Andrulis said.
While RAG certainly has its benefits, it relies on key processes, procedures, and other institutional knowledge being documented in a way the model can make sense of. In many cases, Andrulis tells us, this isn’t the case.
But even if it is, it won’t do enterprises any good if those documents or processes rely on out-of-distribution data. That is, data that looks different from the data used to train the base model. For example, if a model was only trained on English datasets, it would struggle with documentation in German — especially if it contains scientific formulas. In many cases, the model simply won’t be able to interpret it at all.
As a result, Andrulis tells us, some combination of fine-tuning and RAG is usually required to achieve a meaningful result.
Bridging the gap
Aleph Alpha hopes to carve out its niche as a sort of European DeepMind by addressing the kinds of problems preventing enterprises and nations from building sovereign AIs of their own.
Sovereign AI generally refers to models that are trained or fine-tuned using a nation’s internal datasets, on hardware built or deployed within its borders.
“What we’re trying to do is be this operating system, this foundation for enterprises and governments, to jump off of and build their own sovereign AI strategy,” Andrulis said. “We try to add innovation where we feel it’s necessary, but also to leverage open source and state of the art where it’s possible.”
We don’t have to build another Llama or DeepSeek because they’re already out there
While this occasionally means training models, like Aleph’s Pharia-1-LLM, Andrulis emphasizes they’re not trying to build the next Llama or DeepSeek.
“I’m always directing our research to do things that are meaningfully different, not just copy what everybody else is doing, because that’s already out there,” Andrulis said. “We don’t have to build another Llama or DeepSeek because they’re already out there.”
Instead, Aleph is largely focused on building frameworks to make adopting these technologies easier and more efficient. The latest example of this is the Heidelberg-based AI startup’s new tokenizer-free, or “T-Free,” training architecture, which aims to make fine-tuning models to understand out-of-distribution data more efficient.
According to Aleph, traditional tokenizer-based approaches often require large quantities of out-of-distribution data in order to effectively fine-tune a model. This not only makes the process computationally expensive, but also assumes that sufficient data exists in the first place.
The startup claims its T-Free architecture sidesteps this problem by ditching the tokenizer entirely. And in early testing, fine-tuning its previously announced Pharia LLM on the Finnish language, Aleph claims to have achieved a 70 percent reduction in training cost and carbon footprint compared to tokenizer-based approaches.
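To see why tokenizers struggle with out-of-distribution languages, consider a toy greedy tokenizer with an English-biased vocabulary. The vocabulary and segmentation scheme below are made up for illustration — real BPE vocabularies are learned from data, and this is not Aleph’s T-Free method — but the effect is the same: unseen Finnish text shatters into near-character fragments the model has barely been trained on:

```python
# Toy greedy longest-match tokenizer with an English-biased vocabulary.
# The vocabulary is a made-up illustration, not a real learned BPE vocab.

VOCAB = {"the", "ing", "tion", "and", "er", "learn", "machine"}

def tokenize(word, vocab):
    """Greedy longest-match segmentation, falling back to single characters."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest substring first
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # out-of-vocabulary: emit one character
            i += 1
    return tokens

# In-distribution English segments into a few long, well-trained tokens...
print(tokenize("machinelearning", VOCAB))  # → ['machine', 'learn', 'ing']
# ...while Finnish shatters into 13 single-character fragments.
print(tokenize("koneoppiminen", VOCAB))    # Finnish for "machine learning"
```

Fine-tuning then has to compensate for all those poorly-trained fragments, which is where the data and compute costs come from; dropping the tokenizer removes that mismatch at the source.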
Aleph has also developed tools to help overcome gaps in documented knowledge, which might lead to the AI drawing inaccurate or unhelpful conclusions.
If, for example, two contracts relevant to a compliance question contradict one another, “the system can basically approach the human saying, I found a discrepancy … can you please give me feedback on whether that is an actual conflict,” Andrulis said.
The information gathered through this framework, which Aleph calls Pharia Catch, can then be fed back into the application’s knowledge base, or be used to fine-tune more effective models.
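The workflow Andrulis describes can be sketched roughly as follows. The discrepancy heuristic and function names are hypothetical stand-ins for whatever Pharia Catch actually does — the point is the loop: detect a conflict, escalate to a human, and feed the resolution back into the knowledge base:

```python
# Rough sketch of a human-in-the-loop conflict check. The discrepancy
# heuristic and API shape are assumptions for illustration only.

def find_discrepancy(doc_a, doc_b):
    """Hypothetical check: flag documents that give different answers
    for the same field. Here the documents are simple 'key: value' records."""
    facts_a = dict(line.split(": ", 1) for line in doc_a.splitlines())
    facts_b = dict(line.split(": ", 1) for line in doc_b.splitlines())
    return {
        key: (facts_a[key], facts_b[key])
        for key in facts_a.keys() & facts_b.keys()
        if facts_a[key] != facts_b[key]
    }

def resolve_with_human(conflicts, ask):
    """Escalate each discrepancy to a human; the answers can then feed
    the knowledge base or later fine-tuning runs."""
    return {key: ask(key, values) for key, values in conflicts.items()}

contract_a = "payment terms: 30 days\njurisdiction: Germany"
contract_b = "payment terms: 60 days\njurisdiction: Germany"

conflicts = find_discrepancy(contract_a, contract_b)
print(conflicts)  # → {'payment terms': ('30 days', '60 days')}

# A real system would prompt a compliance officer; here we stub the answer.
resolved = resolve_with_human(conflicts, ask=lambda key, values: values[1])
```

Storing `resolved` back into the document store is what closes the loop Andrulis describes: each human answer patches a gap in the documented knowledge.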
According to Andrulis, tools like these have helped the company win partners like PwC, Deloitte, Capgemini, and Supra, which work with end customers to implement the startup’s technology.
What about hardware?
Software and data aren’t the only challenges facing sovereign AI adopters. Hardware is another factor that has to be taken into consideration.
Some enterprises and nations may require workloads to run on domestically developed hardware, while others may simply dictate where those workloads can run.
All of this means Andrulis and his team have to support the widest possible range of hardware, and Aleph Alpha is certainly attracting an eclectic group of hardware partners, the least surprising of which is AMD.
Last month, Aleph Alpha announced a partnership with the up-and-coming AI infrastructure vendor to use its MI300-series accelerators.
Andrulis also highlighted the outfit’s collaborations with Britain’s Graphcore, which was acquired by Japanese mega-conglomerate SoftBank last year, and Cerebras, whose CS-3 wafer-scale accelerators are now being used to train AI models for the German armed forces.
Despite all of this, Andrulis is adamant that Aleph Alpha’s goal isn’t to become a managed service or cloud provider. “We will never become a cloud provider,” he said. “I want my customers to be free and without being locked in.”
It’s only going to get more challenging
Looking ahead, Andrulis anticipates that building AI applications is only going to become more complex as the industry moves away from chatbots toward agentic AI systems capable of more complex problem solving.
Agentic AI has become a hot topic over the past year, with model builders, software devs, and hardware vendors promising systems that can complete multi-step processes asynchronously. Early examples include things like OpenAI’s Operator and Anthropic’s computer use API.
“What we did last year was, in most cases, pretty straightforward stuff. Easy things like summarization of documents or a writing assistant,” he said. “Now, it’s getting a little more exciting with things that, at first glance, don’t even look like genAI problems where the UX is not a chat bot.” ®