'60 Leaders' is an initiative that brings together diverse views from global thought leaders on a series of topics – from innovation and artificial intelligence to social media and democracy. It is created and distributed on principles of open collaboration and knowledge sharing. Created by many, offered to all.
ABOUT 60 LEADERS
'60 Leaders on Artificial Intelligence' brings together unique insights on the topic of Artificial Intelligence - from the latest technical advances to ethical concerns and risks for humanity. The book is organized into 17 chapters - each addressing one question through multiple answers reflecting a variety of backgrounds and standpoints. Learn how AI is changing how businesses operate, how products are built, and how companies should adapt to the new reality. Understand the risks from AI and what should be done to protect individuals and humanity.
What is the ‘AI State of the Art’?
Over the past several years, I have witnessed various interesting trends in the AI systems available on the market that solve real-life use cases. A new generation of models, especially transformer-based models, has reshaped the foundation of what is possible in areas like Natural Language Processing (NLP). Then, as a larger number of parameters in those models generally improved model performance, a competition of ‘whose model has more billions of parameters’ emerged, and it did, in fact, work. As an example, models like the Megatron-Turing NLG model, with 530B parameters, not only push the state of the art (SOTA) on some tasks but do so across a broad set of tasks, including completion prediction, reading comprehension, commonsense reasoning, natural language inference, and others.
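The core operation behind the transformer models mentioned above is scaled dot-product attention, where every token weighs every other token when building its representation. A minimal sketch in plain Python (toy dimensions and values, not any production model):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Each query scores all keys, and the output is a weighted
    mix of the value vectors -- the heart of a transformer layer."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Convex combination of the value vectors.
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy example: three tokens with 2-dimensional representations.
q = k = v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = scaled_dot_product_attention(q, k, v)
```

Real models stack many such layers with learned projections and billions of parameters, but the mechanism is the same.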
And while this does not mean we are anywhere close to general AI systems, these models do extract a higher level of generalization compared to smaller, task-specific models. Those large models come at a cost, though, since they require a large pool of resources; they have only become practical thanks to the availability of powerful optimization and distributed training algorithms. For example, the abovementioned Megatron-Turing NLG model leverages the DeepSpeed library, which enables pipeline parallelism to scale model training across nodes. The cost (including the environmental cost) of training such large-scale models means that effective re-use of those models across use cases is key to gaining positive net value. And, at the same time, those same powerful libraries, next-generation algorithms, and cloud resources are available to literally anyone: as an example, using those same libraries and a cloud ML platform allowed one of our customers – the University of Pecs in Hungary – to train a Hungarian language model in just several months from idea to production, at a total resource cost of around $1,000.
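To see why training cost dominates the economics of such models, a common rule of thumb from the scaling-law literature estimates total training compute as roughly 6 × parameters × training tokens. The throughput and price figures below are hypothetical placeholders, purely for illustration:

```python
def training_flops(n_params, n_tokens):
    """Back-of-envelope rule: compute ~= 6 * parameters * tokens."""
    return 6 * n_params * n_tokens

def training_cost_usd(n_params, n_tokens, cluster_flops_per_sec, usd_per_hour):
    """Convert estimated compute into wall-clock time and dollars.
    Cluster throughput and hourly price are assumed, not measured."""
    seconds = training_flops(n_params, n_tokens) / cluster_flops_per_sec
    hours = seconds / 3600
    return hours * usd_per_hour

# Illustrative only: a 530B-parameter model trained on ~270B tokens,
# assuming an aggregate cluster throughput of 1e18 FLOP/s at a
# hypothetical $400/hour for the whole cluster.
cost = training_cost_usd(530e9, 270e9, 1e18, 400.0)
```

Even under generous assumptions the estimate lands in the hundreds of thousands of dollars, which is why amortizing one pretrained model across many downstream use cases matters so much.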
Another interesting observation is the emergence of models which are multi-task or multi-language, or which are built for one domain of AI tasks and then successfully applied to another. For example, applying attention models initially developed for NLP to image recognition tasks shows impressive results. Even more striking are new models that learn simultaneously from different types of input – multimodal architectures. The most prominent are probably the recent architectures that use a combination of language and image data: they learn from both the objects in an image and the corresponding text, leading to knowledge that can be applied to a range of tasks, from classification to generating image descriptions, or even translation or image generation.
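The practical payoff of such image-text models is that classification can be done with no task-specific training: encode the image and each candidate label into the same vector space, then pick the closest label. A sketch with hand-made vectors standing in for real encoder outputs (the embeddings and labels below are invented for illustration):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_classify(image_embedding, label_embeddings):
    """Pick the label whose text embedding sits closest to the
    image embedding in the shared space -- no fine-tuning needed."""
    scores = {label: cosine(image_embedding, emb)
              for label, emb in label_embeddings.items()}
    return max(scores, key=scores.get)

# Toy embeddings; a real system would produce these with trained
# image and text encoders.
image_emb = [0.9, 0.1, 0.0]
labels = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a cat": [0.1, 0.9, 0.2],
}
prediction = zero_shot_classify(image_emb, labels)
```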
This paradigm of combining narrow learning tasks into a more general model that learns on many tasks simultaneously is also leveraged in the new generation of language models. As an example, Z-code models take advantage of shared linguistic elements across multiple languages to improve the quality of machine translation and other language understanding tasks. Those models combine transfer learning and multitask learning on monolingual and multilingual data to create a language model, improving tasks like machine translation by a significant margin across languages. The same approach is used in the Florence 1.0 model, which builds on XYZ-code, a joint representation of three cognitive attributes: monolingual text (X), audio or visual sensory signals (Y), and multilingual representations (Z). This has advanced multi-task, multi-lingual Computer Vision services for tasks like zero-shot image classification, image/text retrieval, object detection, and question answering.
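One way to see the appeal of a shared encoder across languages and tasks is simple parameter arithmetic: most of the capacity is learned once and reused, with only small task-specific heads on top. The sizes below are hypothetical, chosen only to make the comparison concrete:

```python
def separate_models(n_tasks, encoder_params, head_params):
    """One full model per language/task pair: nothing is shared."""
    return n_tasks * (encoder_params + head_params)

def shared_encoder(n_tasks, encoder_params, head_params):
    """Multitask setup in the spirit of Z-code: one encoder trained
    across all tasks, plus a small head per task."""
    return encoder_params + n_tasks * head_params

# Hypothetical sizes: a 1B-parameter encoder, 10M-parameter heads,
# serving 100 language/task combinations.
alone = separate_models(100, 1_000_000_000, 10_000_000)
shared = shared_encoder(100, 1_000_000_000, 10_000_000)
```

Beyond the parameter savings, the shared encoder lets low-resource languages benefit from linguistic structure learned on high-resource ones, which is where the translation-quality gains come from.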
There are, of course, many other models and developments that push the boundaries of what is possible with ML models. Working with customers and partners on their AI projects, what I am always looking for is how we can apply all of that state-of-the-art research to real-life customer projects. Using a large-scale model has its challenges, from its size to inference costs, and many approaches are emerging to build sparser, more efficient models. As an example, the abovementioned Z-code models use a ‘mixture of experts’ approach, which means only a portion of the model is engaged to complete a given task. As a result, customers can make use of those powerful developments almost immediately after their introduction. Customers can today build applications leveraging Z-code models, use multilingual language models with Cognitive Services APIs, or even apply powerful large-scale models like those from OpenAI as a managed cloud service.
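The ‘mixture of experts’ idea can be sketched in a few lines: a small gating function scores the experts for each token and routes the token to the top-scoring one, so only that expert's parameters are exercised. The weights below are made up; real systems learn the gate and typically add load-balancing terms:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top1_gate(token_features, gate_weights):
    """Score every expert for this token, then route the token to
    the single best expert -- the rest of the model stays idle."""
    scores = [sum(w * f for w, f in zip(row, token_features))
              for row in gate_weights]
    probs = softmax(scores)
    expert = max(range(len(probs)), key=probs.__getitem__)
    return expert, probs[expert]

# Four experts over 3-dimensional token features (invented values).
gate_w = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0],
          [0.5, 0.5, 0.0]]
expert_id, confidence = top1_gate([0.2, 0.9, 0.1], gate_w)
```

With top-1 routing, total parameters can grow with the number of experts while per-token compute stays roughly constant, which is exactly the sparsity benefit described above.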
In general, this is probably the most impactful observation I have recorded over the past couple of years: not only do new models keep emerging, but this new generation of models is also significantly shortening the cycle from introduction to actual, applied implementations. This has obvious benefits, but it also imposes significant risks. For example, the latest speech generation services are available to anyone as easy-to-use APIs, along with many other AI services now reaching human parity. This general availability increases the need to take responsibility for how those services are applied. That is a separate topic by itself, as approaches to addressing those risks span cultural, technological, and policy dimensions. They include gating some of the services and reviewing the use case each time those services are deployed, helping to ensure that these powerful advancements are applied not only where they can be used, but where they actually should be.
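The gating approach mentioned above can be made concrete with a trivial policy check: sensitive capabilities are deployable only after their specific use case has passed a human review, while other services remain generally available. The capability names and rule below are entirely invented for illustration, not any provider's actual policy:

```python
# Hypothetical set of capabilities that require per-use-case review.
GATED_CAPABILITIES = {"neural-voice", "face-recognition"}

def deployment_allowed(capability: str, use_case_approved: bool) -> bool:
    """Gated capabilities ship only once their concrete use case has
    passed a human review; everything else is generally available."""
    if capability in GATED_CAPABILITIES:
        return use_case_approved
    return True

# A gated voice service without an approved use case is blocked.
blocked = deployment_allowed("neural-voice", use_case_approved=False)
```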