1. Multimodal AI
Multimodal AI goes beyond traditional single-mode data processing to encompass multiple input types, such as text, images and sound — a step toward mimicking the human ability to process diverse sensory information.
“The interfaces of the world are multimodal,” said Mark Chen, head of frontiers research at OpenAI, in a November 2023 presentation at the EmTech MIT conference. “We want our models to see what we see and hear what we hear, and we want them to also generate content that appeals to more than one of our senses.”
2. Agentic AI
Agentic AI marks a significant shift from reactive to proactive AI. AI agents are advanced systems that exhibit autonomy, proactivity and the ability to act independently. Unlike traditional AI systems, which mainly respond to user inputs and follow predetermined programming, AI agents are designed to understand their environment, set goals and act to achieve those objectives without direct human intervention.
For example, in environmental monitoring, an AI agent could be trained to collect data, analyze patterns and initiate preventive actions in response to hazards such as early signs of a forest fire. Likewise, a financial AI agent could actively manage an investment portfolio using adaptive strategies that react to changing market conditions in real time.
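The collect, analyze and act cycle in the environmental monitoring example can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real product: the `Reading` type, the threshold values and the action names are all hypothetical, and a production agent would likely replace the hand-written rules with a learned model or an LLM-driven planner.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One hypothetical sensor sample from a monitored forest area."""
    temperature_c: float
    smoke_ppm: float


def assess(reading, temp_limit=45.0, smoke_limit=150.0):
    """Analyze a reading and return a risk level (thresholds are made up)."""
    hot = reading.temperature_c >= temp_limit
    smoky = reading.smoke_ppm >= smoke_limit
    if hot and smoky:
        return "high"
    if hot or smoky:
        return "elevated"
    return "normal"


def choose_action(risk):
    """Map a risk level to a hypothetical preventive action."""
    actions = {
        "high": "alert_fire_service",
        "elevated": "increase_sampling",
        "normal": "wait",
    }
    return actions[risk]


def run_agent(readings):
    """Collect data, analyze it, and pick an action for each reading."""
    return [choose_action(assess(reading)) for reading in readings]


actions = run_agent([
    Reading(temperature_c=22.0, smoke_ppm=10.0),
    Reading(temperature_c=47.0, smoke_ppm=30.0),
    Reading(temperature_c=52.0, smoke_ppm=400.0),
])
print(actions)
```

The point of the sketch is the loop itself: the system decides what to do from its observations without waiting for a user prompt at each step.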
“2023 was the year of being able to chat with an AI,” wrote computer scientist Peter Norvig, a fellow at Stanford’s Human-Centered AI Institute, in a recent blog post. “In 2024, we’ll see the ability for agents to get stuff done for you. Make reservations, plan a trip, connect to other services.”
In addition, combining agentic and multimodal AI could open up new possibilities. In the aforementioned presentation, Chen gave the example of an application designed to identify the contents of an uploaded image. Previously, someone looking to build such an application would have needed to train their own image recognition model and then figure out how to deploy it. But with multimodal, agentic models, this could all be accomplished through natural language prompting.
“I really think that multimodal together with GPTs will open up the no-code development of computer vision applications, just in the same way that prompting opened up the no-code development of a lot of text-based applications,” Chen said.
3. Open source AI
Building large language models and other powerful generative AI systems is an expensive process that requires enormous amounts of compute and data. But using an open source model enables developers to build on top of others’ work, reducing costs and expanding AI access. Open source AI is publicly available, typically for free, enabling organizations and researchers to contribute to and build on existing code.
GitHub data from the past year shows a remarkable increase in developer engagement with AI, particularly generative AI. In 2023, generative AI projects entered the top 10 most popular projects across the code hosting platform for the first time, with projects such as Stable Diffusion and AutoGPT pulling in thousands of first-time contributors.
Early in the year, open source generative models were limited in number, and their performance often lagged behind proprietary options such as ChatGPT. But the landscape broadened significantly over the course of 2023 to include powerful open source contenders such as Meta’s Llama 2 and Mistral AI’s Mixtral models. This could shift the dynamics of the AI landscape in 2024 by providing smaller, less resourced entities with access to sophisticated AI models and tools that were previously out of reach.
“It gives everyone easy, fairly democratized access, and it’s great for experimentation and exploration,” Barrington said.
Open source approaches can also encourage transparency and ethical development, as more eyes on the code means a greater likelihood of identifying biases, bugs and security vulnerabilities. But experts have also expressed concerns about the misuse of open source AI to create disinformation and other harmful content. In addition, building and maintaining open source is difficult even for traditional software, let alone complex and compute-intensive AI models.
4. Retrieval-augmented generation
Although generative AI tools were widely adopted in 2023, they continue to be plagued by the problem of hallucinations: plausible-sounding but incorrect responses to users’ queries. This limitation has presented a roadblock to enterprise adoption, where hallucinations in business-critical or customer-facing scenarios could be catastrophic. Retrieval-augmented generation (RAG) has emerged as a technique for reducing hallucinations, with potentially profound implications for enterprise AI adoption.
RAG blends text generation with information retrieval to enhance the accuracy and relevance of AI-generated content. At query time, the system retrieves relevant passages from an external source and supplies them to the LLM alongside the user's question, helping the model produce more accurate and contextually aware responses. Because the model no longer has to store all knowledge in its own parameters, RAG can also make smaller models viable, which can increase speed and lower costs.
“You can use RAG to go gather a ton of unstructured information, documents, etc., [and] feed it into a model without having to fine-tune or custom-train a model,” Barrington said.
These benefits are particularly enticing for enterprise applications where up-to-date factual knowledge is crucial. For example, businesses can use RAG with foundation models to create more efficient and informative chatbots and virtual assistants.
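To make the retrieve-then-generate pattern concrete, here is a minimal Python sketch. It assumes a toy keyword-overlap retriever in place of the embedding-based vector search most real systems use, and it stops at building the prompt: the call to an actual LLM is left out because it depends on the provider. The document set, function names and prompt wording are all illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents that share vocabulary with the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Combine retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical knowledge base standing in for enterprise documents.
DOCS = [
    "Refunds are accepted within 30 days of purchase.",
    "Support is available by email on weekdays.",
    "Shipping to Canada takes five business days.",
]

query = "Are refunds accepted after purchase?"
passages = retrieve(query, DOCS, k=1)
prompt = build_prompt(query, passages)
print(prompt)  # This string would then be sent to whichever LLM is in use.
```

Updating the chatbot's knowledge in this setup means editing `DOCS`, not retraining the model, which is the property that makes the approach attractive when facts change often.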
5. Customized enterprise generative AI models
Massive, general-purpose tools such as Midjourney and ChatGPT have attracted the most attention among consumers exploring generative AI. But for business use cases, smaller, narrow-purpose models could prove to have the most staying power, driven by the growing demand for AI systems that can meet niche requirements.
While creating a new model from scratch is a possibility, it’s a resource-intensive proposition that will be out of reach for many organizations. To build customized generative AI, most organizations instead modify existing AI models — for example, tweaking their architecture or fine-tuning on a domain-specific data set. This can be cheaper than either building a new model from the ground up or relying on API calls to a public LLM.
“Calls to GPT-4 as an API, just as an example, are very expensive, both in terms of cost and in terms of latency — how long it can actually take to return a result,” said Shane Luke, vice president of AI and machine learning at Workday. “We are working a lot … on optimizing so that we have the same capability, but it’s very targeted and specific. And so it can be a much smaller model that’s more manageable.”
The key advantage of customized generative AI models is their ability to cater to niche markets and user needs. Tailored generative AI tools can be built for almost any scenario, from customer support to supply chain management to document review. This is especially relevant for sectors with highly specialized terminology and practices, such as healthcare, finance and legal.
In many business use cases, the most massive LLMs are overkill. Although ChatGPT might be the state of the art for a consumer-facing chatbot designed to handle any query, “it’s not the state of the art for smaller enterprise applications,” Luke said.
Barrington expects to see enterprises exploring a more diverse range of models in the coming year as AI developers’ capabilities begin to converge. “We’re expecting, over the next year or two, for there to be a much higher degree of parity across the models — and that’s a good thing,” he said.
On a smaller scale, Luke has seen a similar scenario play out at Workday, which provides a set of AI services for teams to experiment with internally. Although employees started out using mostly OpenAI services, Luke said, he’s gradually seen a shift toward a mix of models from various providers, including Google and AWS.
Building a customized model rather than using an off-the-shelf public tool often also improves privacy and security, as it gives organizations greater control over their data. Luke gave the example of building a model for Workday tasks that involve handling sensitive personal data, such as disability status and health history. “Those aren’t things that we’re going to want to send out to a third party,” he said. “Our customers generally wouldn’t be comfortable with that.”
In light of these privacy and security benefits, stricter AI regulation in the coming years could push organizations to focus their energies on proprietary models, explained Gillian Crossan, risk advisory principal and global technology sector leader at Deloitte.
“It’s going to encourage enterprises to focus more on private models that are proprietary, that are domain-specific, rather than focus on these large language models that are trained with data from all over the internet and everything that that brings with it,” she said.
6. Need for AI and machine learning talent
Designing, training and testing a machine learning model is no easy feat, and pushing it to production and maintaining it in a complex organizational IT environment is harder still. It’s no surprise, then, that the growing need for AI and machine learning talent is expected to continue into 2024 and beyond.
“The market is still really hot around talent,” Luke said. “It’s very easy to get a job in this space.”
In particular, as AI and machine learning become more integrated into business operations, there’s a growing need for professionals who can bridge the gap between theory and practice. This requires the ability to deploy, monitor and maintain AI systems in real-world settings — a discipline often referred to as MLOps, short for machine learning operations.
In a recent O’Reilly report, respondents cited AI programming, data analysis and statistics, and operations for AI and machine learning as the top three skills their organizations needed for generative AI projects. These types of skills, however, are in short supply. “That’s going to be one of the challenges around AI — to be able to have the talent readily available,” Crossan said.
In 2024, look for organizations of all kinds, not just big tech companies, to seek out talent with these skills. With IT and data nearly ubiquitous as business functions and AI initiatives rising in popularity, building internal AI and machine learning capabilities is poised to be the next stage in digital transformation.
Crossan also emphasized the importance of diversity in AI initiatives at every level, from technical teams building models up to the board. “One of the big issues with AI and the public models is the amount of bias that exists in the training data,” she said. “And unless you have that diverse team within your organization that is challenging the results and challenging what you see, you are going to potentially end up in a worse place than you were before AI.”
7. Shadow AI
As employees across job functions become interested in generative AI, organizations are facing the issue of shadow AI: use of AI within an organization without explicit approval or oversight from the IT department. This trend is becoming increasingly prevalent as AI becomes more accessible, enabling even nontechnical workers to use it independently.
Shadow AI typically arises when employees need quick solutions to a problem or want to explore new technology faster than official channels allow. This is especially common for easy-to-use AI chatbots, which employees can try out in their web browsers with little difficulty — without going through IT review and approval processes.
On the plus side, exploring ways to use these emerging technologies evinces a proactive, innovative spirit. But it also carries risk, since end users often lack relevant information on security, data privacy and compliance. For example, a user might feed trade secrets into a public-facing LLM without realizing that doing so exposes that sensitive information to third parties.
“Once something gets out into these public models, you cannot pull it back,” Barrington said. “So there’s a bit of a fear factor and risk angle that’s appropriate for most enterprises, regardless of sector, to think through.”