7 architecture considerations for generative AI
Create the right foundation for scaling generative AI securely, responsibly, and cost-effectively, in a way that delivers real business value.
June 05, 2023
The latest generation of generative AI (Gen AI) applications has taken the world by storm. User adoption is exploding. New services are coming to market almost daily. And business leaders are understandably eager to tap into the power of this new technology. This is having a truly democratizing effect. Gen AI promises to empower every kind of business, including smaller companies and those that have historically lagged in tech maturity.
Why? Because much of the hard AI development work has already been done in pre-training the foundation models. So the emphasis shifts from data science to domain expertise. And that’s something every successful business has in abundance.
It means, whatever the size of your company, you now have the opportunity to create extremely powerful and differentiating Gen AI solutions by applying foundation models to your unique data and expert business know-how.
We've seen the possibilities for our clients evolve rapidly. I recently sat down with Accenture’s Chief AI Architect, Atish Ray, to discuss what companies need to consider when getting their enterprises generative-AI ready.
Like any new foundational technology, you need to make sure you can scale Gen AI securely, responsibly, cost-effectively and in a way that delivers real value to the business.
– Atish Ray, Chief AI Architect
An example: Consider all the different AI models your business has to build and manage today. If you’re a large organization, they probably number in the hundreds. But when you start using pre-trained foundation models, you can bring that number down radically.
Instead of having to create a new model for each new use case, you can fine-tune a small collection of pre-trained foundation models to achieve the same result. At the same time, instead of always embedding domain intelligence inside the ML models themselves, you have the option to manage it outside the models while using the pre-trained models to generate outputs. And that fundamentally changes the way you need to think about your architecture.
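One way to picture this shift: instead of maintaining hundreds of bespoke models, you keep a few shared base models and manage the domain intelligence outside them as versioned data. The sketch below is illustrative only; the model call is a stub standing in for a real foundation-model API, and the use-case names are invented.

```python
def base_model(prompt: str) -> str:
    # Stand-in stub for a call to a pre-trained foundation model.
    return f"[model output for: {prompt}]"

# Domain intelligence lives outside the model, as per-use-case context
# that can be governed and versioned like any other data asset.
USE_CASES = {
    "claims_summary":  "Summarize this insurance claim for an adjuster:\n{text}",
    "contract_review": "Flag unusual clauses in this contract:\n{text}",
    "support_reply":   "Draft a polite support reply to:\n{text}",
}

def run_use_case(name: str, text: str) -> str:
    template = USE_CASES[name]  # domain knowledge, managed outside the model
    return base_model(template.format(text=text))

print(run_use_case("claims_summary", "Water damage, 12 May"))
```

Adding a new use case here means adding a new entry of configuration, not training and operating a new model.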
So, what should business leaders do now to get Gen AI-ready? We believe there are seven key questions to ask.
The number of Gen AI models—and vendors—is growing all the time. Pure-play vendors like OpenAI and Cohere are offering next-generation models as a service, developed through fundamental research and trained on large corpora of publicly available data.
At the same time, a significant number of mature open-source models are now available via hubs like Hugging Face. Cloud hyperscalers are also getting into the game by partnering with the pure-plays, adopting open-source models, pre-training their own models and providing full-stack services. We're seeing more and more models-as-a-service that are pre-trained on specialized domain knowledge become available.
The good news is that all options are increasingly viable. Lately, the introduction of smaller and lower-cost foundation models (such as Databricks’ Dolly) is making building or customizing Gen AI increasingly accessible. However, all options demand careful consideration to ensure they fit your organization’s needs.
Atish explains the two principal approaches businesses can take to access Gen AI models:
“The first is to go for ‘full control’ by deploying the models on your own public cloud (e.g., cloud hyperscalers) or private infrastructure (e.g., private cloud, data centers). The second is to opt for speed and simplicity by accessing Gen AI as a managed cloud service from an external vendor. Both options have their merits. But if you choose full control, you need to be aware there are several additional factors to consider.”
These factors include identifying and managing the right infrastructure for these models, version-controlling the models, developing associated talent and skills, developing full-stack services for easier adoption and more. There are also specialized infrastructure choices, such as GPUs from Nvidia or integrated appliances from emerging players like SambaNova, optimized across next-generation processors and model-computation workloads.
Having dedicated infrastructure does give you better cost predictability for Gen AI models, but it comes with additional complexity and effort to achieve the right performance at enterprise scale.
Getting maximum business value from Gen AI often depends on leveraging your proprietary data to boost accuracy, performance and usefulness within the enterprise. There are several ways you can adapt pre-trained models with your own data for consumption within the organization. We like to summarize this as “buy, build or boost.”
Of course, to do this at speed and scale, you first need a modern data foundation, as part of the enterprise digital core, that makes it easier to consume data through the foundation models. This is a prerequisite for extracting accelerated and exponential value with Gen AI.
Clearly, it’s important that foundation models meet the overall security, reliability and responsibility requirements of the enterprise. Integration and interoperability frameworks are also key considerations for enabling full-stack solutions with foundation models in the enterprise.
But for the AI to be enterprise-ready, organizations have to trust the AI. And that raises all sorts of considerations. Companies must carefully consider the Responsible AI implications of adopting this technology for sensitive business functions. Built-in capabilities from Gen AI vendors are maturing, but for now, you need to look at developing your own controls and mitigation techniques as appropriate.
There are several practical actions companies can take to ensure Gen AI doesn’t threaten enterprise security. Adopting Gen AI is an ideal time to review your overall AI governance standards and operating models.
Although they come pre-trained, foundation models can still have significant energy requirements during adaptation and fine-tuning. If you are considering pre-training your own model or building your own model from the ground up, this becomes very significant. There are different implications depending on the approach taken to buy, boost or build the foundation models.
Left unchecked, this will negatively impact the organization’s carbon footprint, especially when applications based on Gen AI are scaled up across the enterprise. So, the potential environmental impact needs to be considered up front in making the right choices about the available options. Accenture has developed tools like Green Cloud Advisor, which can support this process.
Having chosen and deployed a foundation model, companies then need to consider what new frameworks may be required to industrialize and accelerate application development. Vector databases or domain knowledge graphs that capture your business data and broader knowledge (such as how business concepts are structured) also become important for developing valuable applications with Gen AI. Consider vectorization capabilities offered by some emerging players and cloud service providers.
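The retrieval pattern behind vector databases can be sketched simply: embed your documents, then fetch the nearest ones to ground a Gen AI prompt in your own data. This is a minimal illustration only; a real system would use learned embeddings and a dedicated vector store, whereas here a toy bag-of-words embedding keeps the example self-contained.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. Real systems use learned vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

DOCS = [
    "refund policy for damaged goods",
    "how to reset an employee password",
    "quarterly revenue reporting procedure",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]  # the "vector database"

def retrieve(query: str, k: int = 1) -> list:
    # Rank indexed documents by similarity to the query and return top-k.
    qv = embed(query)
    ranked = sorted(INDEX, key=lambda d: cosine(qv, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(retrieve("password reset help"))
```

The retrieved passages would then be inserted into the model prompt, which is how proprietary knowledge reaches a pre-trained model without retraining it.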
Prompt engineering techniques are fast becoming a differentiating capability. And by industrializing the process, you can build up a corpus of efficient, well-designed prompts and templates that are aligned to specific business functions or domains. Look to incorporate enterprise frameworks to scale collaboration and management around them.
An orchestration framework is key for application enablement, as stitching together a Gen AI application involves coordinating multiple components, services and steps. Several frameworks and services are emerging but are still nascent. These include workflow components that incorporate the critical human-in-the-loop process flows that many usage scenarios demand.
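The orchestration idea can be sketched as a sequence of steps, some automated and one gated on a human reviewer. The callables below are stubs standing in for real services; production frameworks would add retries, state management and audit trails.

```python
def generate_draft(request: str) -> str:
    # Stub for a foundation-model call that produces a first draft.
    return f"draft answer to: {request}"

def human_review(draft: str, approve) -> str:
    # Human-in-the-loop gate: the flow only proceeds on approval.
    if not approve(draft):
        raise RuntimeError("draft rejected; route back for revision")
    return draft

def publish(draft: str) -> str:
    # Stub for the downstream delivery step.
    return f"published: {draft}"

def run_pipeline(request: str, approve=lambda d: True) -> str:
    # Orchestrate the steps, threading state from one to the next.
    steps = [generate_draft, lambda d: human_review(d, approve), publish]
    state = request
    for step in steps:
        state = step(state)
    return state

print(run_pipeline("summarize policy change"))
```

The value of an orchestration layer is that the review gate, and any other step, can be added or reordered without rewriting the application around it.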
Similarly, industrializing the process of getting feedback from domain experts can be a key accelerator. Emerging solutions like Scale.AI now offer reinforcement learning with human feedback, making it easier for domain experts to label data, model edge cases and so on.
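Industrialized expert feedback often takes the form of pairwise preferences: a domain expert compares two model outputs and records the better one, producing the comparison data that RLHF-style training consumes. The schema below is an illustrative sketch, not any vendor's format.

```python
# Illustrative store of pairwise preference judgments from domain experts.
PREFERENCES = []

def record_preference(prompt: str, output_a: str, output_b: str, winner: str):
    # An expert picks the better of two candidate outputs ("a" or "b").
    assert winner in ("a", "b")
    PREFERENCES.append(
        {"prompt": prompt, "a": output_a, "b": output_b, "winner": winner}
    )

record_preference(
    "Summarize the claim",
    "Terse one-liner.",
    "Clear two-sentence summary with key facts.",
    winner="b",
)
print(len(PREFERENCES))
```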
“As your Gen AI applications are up and running, you should consider the impact on your operability," says Atish. Some companies have already developed an MLOps framework to productize ML applications. Those standards require a thorough review to incorporate LLMOps and Gen AIOps considerations and to accommodate changes in DevOps, CI/CD/CT, model management, model monitoring, prompt management and data/knowledge management in both pre-production and production environments. The MLOps approach will have to evolve for the world of foundation models, considering processes across the whole application lifecycle.
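One concrete LLMOps practice is wrapping every model call in a monitoring hook that logs prompt version, latency and token counts, so drift and cost can be tracked in production. This is a minimal sketch with illustrative field names; the model here is a stub and the token counts are a crude word-split proxy.

```python
import time

CALL_LOG: list = []

def monitored_call(model, prompt: str, prompt_version: str) -> str:
    # Time the call and record metadata alongside the result.
    start = time.perf_counter()
    output = model(prompt)
    CALL_LOG.append({
        "prompt_version": prompt_version,
        "latency_s": time.perf_counter() - start,
        "prompt_tokens": len(prompt.split()),   # crude token proxy
        "output_tokens": len(output.split()),
    })
    return output

# Stub model for the sketch.
out = monitored_call(lambda p: f"answer: {p}", "explain the new policy", "v3")
print(CALL_LOG[-1]["prompt_version"])
```

Aggregating this log over time is what lets operations teams spot latency regressions, cost creep, or a prompt version that underperforms.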
Moreover, as Gen AI evolves toward autonomous agents like AutoGPT, where AI powers much more of the end-to-end process, we’ll see AI driving an operations architecture that automates productionizing, monitoring and calibrating these models and model interactions to continue to deliver on business SLAs.
By answering these architecture questions, organizations can position themselves to scale Gen AI with maximum efficiency and effectiveness and foster successful adoption across the enterprise.
But it’s important to bear in mind one additional point: Success with Gen AI is not just about getting the right architecture or even the right technology. It’s also about the people—both technical and non-technical—their skills and capabilities, their data and AI literacy, and the way they would work with the new technologies day-to-day. Managing the impact of that change on affected people is also critical for success.
This also demands a lot of flexibility from business and technology leaders. Because the technology’s moving so fast, it’s impossible to know for certain how the next few years will play out as an ecosystem of capabilities emerges around the foundation models. Adaptiveness and responsiveness—in the organization as well as IT—are also going to be essential in capturing the full value of this exciting step-change in AI capability.