Mechanical engineering companies have been drawn to Linz for several months. Prof. Dr. Sepp Hochreiter has been conducting research at the JKU Linz for many years and is regarded as one of the few AI luminaries in Europe. He is attracting a new generation of top researchers to the university. The most recent addition is Johannes Brandstetter, who came to the Danube from Microsoft in Amsterdam.

The new year is just a few days old, and Asst. Prof. Dr. Johannes Brandstetter and Prof. Dr. Sepp Hochreiter of the JKU Linz are planning the next courses. The institute's Christmas party is also still to come. "We usually never manage it in December because NeurIPS (the world's most important AI conference, with more than 12,000 attendees) keeps us pretty busy," explains Brandstetter. The Linz team brought nine papers to New Orleans; one was among the top 10. At the same time, the team has for several months been working in Linz on a European alternative to the Transformer architecture that makes large language models such as ChatGPT so powerful. The project is called xLSTM, in reference to the LSTM algorithm that Hochreiter invented 25 years ago. "The LSTM algorithm is widely used in industry today. xLSTM will become even more powerful," predicts Brandstetter. The community is waiting for the first paper. Some observers whisper that xLSTM is LSTM plus exponential gating, improved by vectorisation. xLSTM is an autoregressive approach that can abstract, and Hochreiter wants to keep it in Europe. "xLSTM is faster, needs less memory and has a linear runtime," Hochreiter recently explained on the podcast Industrial AI.
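To make the whispered recipe concrete: in a classic LSTM cell, every gate is squashed through a sigmoid. What follows is a purely speculative sketch of how an exponential input gate could be swapped in - since the paper is not out, the flag, names and formulas below are our own guess, not the Linz team's design:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b, exponential_input_gate=False):
    # Classic LSTM cell: W projects [x; h] onto the four gate pre-activations.
    z = W @ np.concatenate([x, h]) + b
    i, f, g, o = np.split(z, 4)
    if exponential_input_gate:
        # Speculative xLSTM-style variant: an unbounded exponential input gate.
        # A real design would need some form of normalisation to stay
        # numerically stable; this is our guess, not the published method.
        i = np.exp(i)
    else:
        i = sigmoid(i)      # classic input gate, bounded in (0, 1)
    f = sigmoid(f)          # forget gate
    g = np.tanh(g)          # candidate cell update
    o = sigmoid(o)          # output gate
    c_new = f * c + i * g   # new cell state
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# Tiny usage example with random weights.
d_in, d_h = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * d_h, d_in + d_h))
b = np.zeros(4 * d_h)
h, c = lstm_step(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), W, b,
                 exponential_input_gate=True)
```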

The reality, however, is that the AI world is in love with the Transformer architecture. But that architecture is brute force: Transformers offer good performance at the price of huge data sets and a lot of GPU computing power. In Linz, they are working on a leaner solution, and they now have the GPUs to train their own models. "Sepp can do it. Who, if not him?" Brandstetter is surprised at the many malicious comments about Hochreiter, especially from the German research community. "I took a screenshot of all the comments here - for later," he commented under a LinkedIn article a few weeks ago. He defends his mentor and his research. Nowhere is a prophet less recognised than in his homeland - Hochreiter is originally from Germany.

There is a Thomas Mann quote on the whiteboard in Hochreiter's office: "For the sake of goodness and love, man should not allow death to rule his thoughts." To paraphrase for Hochreiter: don't let the badmouthers get to you. Brandstetter has internalised this too. The young professor is regarded as one of Europe's AI prodigies. He did his PhD at CERN in Switzerland. "I was able to do research with brilliant minds," enthuses the physicist. What he doesn't reveal is that he was part of ground-breaking work in Higgs boson physics. And yet he was drawn away - to Hochreiter in Linz. "Sepp pushed me to go to Amsterdam. Max Welling's group took up deep learning very early on, and it was a fantastic time." In his first few weeks in Amsterdam he was intoxicated - not by the weed, but by the many possibilities in the field of AI. Fellow passengers on the bus discussed NeurIPS papers. Brandstetter was amazed. "Over the years, a unique AI ecosystem has emerged in Amsterdam that I had previously only known from the USA." Qualcomm is there, as are Alphabet, Bosch and Microsoft; ASML is not far away; and the Dutch proudly declare that over 100 companies are on the waiting list for a lab at the University of Amsterdam. Brandstetter did research for Microsoft and now wants to advance his field from Linz - especially for industry. We met him.

How was NeurIPS?

Brandstetter: Over 12,000 people staring at countless posters in a hall with poor air quality - it's exhausting, but also a lot of fun to discuss and to meet old and new colleagues. We were well represented with nine papers, and our sessions were very well attended.

Is this how quality is measured?

Brandstetter: Yes, if a lot of people gather around your poster, that's a good sign. (laughs)

And everyone was in a Large Language Model (LLM) frenzy?

Brandstetter: LLMs are fascinating, but many in our community are now happy if you don't talk about LLM topics for a while.

So is the hype over?

Brandstetter: No. The models are now becoming multimodal and can do much more than pure language models. Alphabet's Gemini model is a first approach.

It was pretty much torn apart in the media.

Brandstetter: But it's still the right approach.

LLMs became big in the USA, and many companies are conducting research in this area. All the attention is focused on generative AI. Is AI research at the JKU Linz now in the 2nd Bundesliga, so to speak - or, to put it another way, is it still attractive?

Brandstetter: Good question. We also use LLMs in our research, and we are working on alternative architectures that are certainly more interesting for industry than Transformers.

xLSTM?

Brandstetter: Exactly. But the hype surrounding LLMs also brings us more students: enrolment numbers are at record levels, there is more money for research, and companies are afraid of missing out.

So more industrial projects?

Brandstetter: Yes, a lot of them. We now have to turn projects down. And interestingly, many German mechanical engineering companies are approaching us. We are experiencing the iPhone moment of AI in industry.

Perhaps this is also due to your focus on simulation and AI.

Brandstetter: Yes, definitely. Every day, thousands upon thousands of computing hours are spent modelling turbulence, simulating fluid or air flows, heat transfer in materials, traffic flows and much more. Many of these processes follow similar basic patterns, yet each requires its own specialised simulation software. Worse still, for every new parameter setting the costly full-length simulations have to be rerun from scratch. Deep learning techniques are ready to produce models that perform such simulations in seconds instead of days or even weeks. The hardware can now process high-resolution inputs such as 3D meshes or images on an industrial scale, which creates the conditions for training deep learning models at scale.
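To illustrate the general idea - a minimal sketch, not the JKU models; the class name and architecture here are hypothetical - a neural surrogate can be trained on snapshots from an existing solver to advance a discretised field by one time step, and then stand in for the solver at inference time:

```python
import torch
import torch.nn as nn

# Hypothetical surrogate: learns to advance a 2D field u(t) one time step,
# using (u_t, u_{t+dt}) pairs precomputed by a classical numerical solver.
class SurrogateStepper(nn.Module):
    def __init__(self, channels=1, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.GELU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.GELU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, u):
        # Predict the increment and add it (residual step), a common
        # choice for learned time-steppers.
        return u + self.net(u)

model = SurrogateStepper()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Dummy tensors standing in for real solver snapshots (batch, channel, H, W).
u_t = torch.randn(8, 1, 64, 64)
u_next = torch.randn(8, 1, 64, 64)

for epoch in range(10):
    opt.zero_grad()
    loss = loss_fn(model(u_t), u_next)  # match the solver's next state
    loss.backward()
    opt.step()
```

Once trained, the surrogate is applied autoregressively - feeding each prediction back in as the next input - which is where the speed-up over rerunning the full simulation comes from.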

What do you want to achieve?

Brandstetter: We want to make simulations better, faster and more general - to develop foundation models for simulation. Neural networks have the potential to improve simulations on all fronts. We want to find solutions to problems that previously seemed out of reach. For example, there are many processes in industry that can so far only be modelled in a very rudimentary way, such as certain melting processes.

Data is always a problem.

Brandstetter: Not this time. Fortunately, many of the processes mentioned above share a common underlying dynamic - similar to how different languages share a common structure and grammar. There is an abundance of simulation data; we just need to use the right data, and lots of it.

How can a neural network learn from a simulation and then improve the quality of the simulation?

Brandstetter: We generalise. We show the network many simulations - not just the melting simulation, for example, but also simulations from other domains. Fortunately, nature can be described by a few terms, such as convection and diffusion, that recur again and again across different domains. This increases quality across domains.
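For reference, the two terms Brandstetter names appear together in the textbook convection-diffusion equation (standard material, not from the interview): a quantity u transported with velocity c and diffusing with coefficient nu evolves as

\[ \frac{\partial u}{\partial t} + \mathbf{c} \cdot \nabla u = \nu \, \nabla^{2} u \]

Heat transfer, pollutant transport and many flow problems are instances of this same pattern with different coefficients, which is what makes transfer across simulation domains plausible.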

That's the theory.

Brandstetter: No, it works. At Microsoft, for example, we developed ClimaX, a flexible and generalisable deep learning model for weather and climate science that can be trained on heterogeneous data sets. ClimaX is the first foundation model for weather and climate.
www.hannovermesse.de