The pre-processing layer in an LLM architecture plays a critical role in handling data. Its responsibilities include collecting and consolidating structured and unstructured data into a single container and employing optical character recognition (OCR) to convert non-text inputs into text. It is also responsible for ranking relevant chunks of content so that the most useful ones fit within the token limit, the maximum prompt length measured in tokens (a token is the fundamental unit of text that a language model reads and processes). Furthermore, it may detect custom personally identifiable information (PII) patterns and mask them to protect sensitive information.
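The chunk-ranking and PII-masking responsibilities described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the token counter is a crude whitespace approximation (a real system would use the model's own tokenizer), and the PII patterns cover only simple email and phone formats. All function names here are hypothetical.

```python
import re

def count_tokens(text: str) -> int:
    # Crude approximation: whitespace-split words stand in for real tokens.
    return len(text.split())

def mask_pii(text: str) -> str:
    # Mask simple email and US-style phone patterns with placeholder tags.
    text = re.sub(r"[\w.+-]+@[\w-]+\.\w+", "[EMAIL]", text)
    text = re.sub(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b", "[PHONE]", text)
    return text

def select_chunks(chunks: list[tuple[float, str]], token_limit: int) -> list[str]:
    # Take highest-scoring chunks first until the token budget is exhausted,
    # masking PII in everything that gets sent to the model.
    selected, used = [], 0
    for score, chunk in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = count_tokens(chunk)
        if used + cost <= token_limit:
            selected.append(mask_pii(chunk))
            used += cost
    return selected

chunks = [
    (0.9, "Refund policy: contact support@example.com within 30 days."),
    (0.4, "Our office phone is 555-123-4567 for general inquiries."),
    (0.7, "Shipping takes five to seven business days."),
]
print(select_chunks(chunks, token_limit=20))
```

With a 20-token budget, the two highest-scoring chunks fit and the third is dropped; the email address in the top chunk is replaced with a placeholder before the prompt is assembled.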
The middleware layer facilitates seamless interaction between the operating system and various applications. It supports a wide range of programming languages, including Python, .NET and Java, which enables compatibility and smooth communication across different platforms.
The post-processing layer refines the LLM’s output by using prompt engineering to frame queries and by offering a fine-tuning application programming interface (API) for customization on domain-specific data (such as curating financial data for training). It then consolidates and evaluates the results for correctness, addressing bias and drift with targeted mitigation strategies, to improve output consistency, understandability and quality.
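One way the consolidation-and-evaluation step can work is to sample several completions for the same query, normalize them, and vote for the most consistent answer, flagging low-agreement cases for human review. The sketch below assumes this self-consistency-style approach; the threshold and normalization rules are illustrative choices, not a prescribed method.

```python
from collections import Counter

def normalize(answer: str) -> str:
    # Fold trivial formatting differences so equivalent answers vote together.
    return answer.strip().lower().rstrip(".")

def consolidate(completions: list[str], min_agreement: float = 0.5) -> dict:
    # Majority vote over normalized completions; low agreement is a crude
    # signal of inconsistency worth routing to review or drift monitoring.
    votes = Counter(normalize(c) for c in completions)
    best, count = votes.most_common(1)[0]
    agreement = count / len(completions)
    return {"answer": best, "agreement": agreement,
            "needs_review": agreement < min_agreement}

samples = ["Paris.", "paris", "Paris", "Lyon"]
print(consolidate(samples))
```

Here three of four samples agree after normalization, so the consolidated answer is accepted with 75% agreement and no review flag.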
4) Assess cost, data ownership options and resources
The choice of an LLM implementation approach impacts the complexity and costs, including those associated with:
- Training
- Data collection, ingestion and cleansing
- Hiring data scientists
- Maintaining the model in production
The selection also greatly affects how much control a company retains over its proprietary data. Proprietary data matters because it can help a company differentiate its product in ways competitors cannot easily replicate, potentially yielding a competitive advantage. It can also be crucial for addressing narrow, business-specific use cases.
There are also regulatory and ethical reasons for retaining control. For example, depending on the data being stored and processed, regulators may require secure storage and auditability. In addition, uncontrolled language models may generate misleading or inaccurate advice. Implementing control measures can help address these issues, for instance by preventing the spread of false information and potential harm to individuals seeking medical guidance.
Typically, there are three ways to implement an LLM — an API, platform as a service (PaaS) or self-hosted — each of which presents different considerations.
Off-the-shelf model via API
Using an API can spare a company from maintaining a sizable team of data scientists and from maintaining the language model itself, which involves handling updates, bug fixes and improvements. An API shifts much of this maintenance burden to the provider, allowing a company to focus on its core functionality. In addition, an API can enable on-demand access to the LLM, which is essential for applications that require immediate responses to user queries or interactions.
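In practice, on-demand access usually means sending an HTTP request with a prompt and reading back a completion. The sketch below shows the general shape using only the Python standard library; the endpoint URL, payload fields and authorization scheme are placeholders, since each provider defines its own API, so consult the provider's reference documentation before adapting this.

```python
import json
import urllib.request

# Hypothetical provider endpoint and credentials; real APIs differ.
API_URL = "https://api.example.com/v1/completions"
API_KEY = "YOUR_API_KEY"  # placeholder; never hard-code real keys

def build_request(prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    # Package the prompt and generation settings as a JSON POST request.
    payload = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode()
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_request("Summarize our refund policy in one sentence.")
print(req.full_url, req.get_method())
# Actually sending the request requires a real endpoint and key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Keeping request construction in one small function makes it easier to swap providers later, since only the URL, headers and payload shape need to change.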
When a company uses an LLM API, it typically shares data with the API provider. It’s important to review and understand the data usage policies and terms of service to confirm they align with a company’s privacy and compliance requirements. The ownership of data also depends on the terms and conditions of the provider. In many cases, while companies will retain ownership of their data, they will also grant the provider certain usage rights for processing it. It’s beneficial for companies to clarify data ownership in their provider contracts before investing.
PaaS
With PaaS, a provider gives companies access to its LLM as part of a broader platform offering, allowing customers to operate LLMs without managing the underlying application infrastructure, middleware or hardware. However, companies may incur higher model costs associated with purchasing the rights to build on top of the LLM with their own data, as well as with enabling domain specificity and model customization during deployment. This approach also lets companies control their data and reduces time to value and cost compared with self-hosting. On the flip side, auditability of the data and the ability to provide comprehensive explanations for results can pose challenges, since PaaS providers do not share the underlying data. In addition, PaaS can carry a greater total cost of ownership for the LLM and can be more complex than using an API.
Self-hosting an LLM
This is the most expensive approach because it means building the entire model from scratch, and it requires mature data processes to fully train, operationalize and deploy an LLM. Furthermore, upgrading the underlying model in a self-hosted implementation is more intensive than a typical software upgrade. On the other hand, it provides maximum control, since the company owns the LLM, and the ability to customize extensively.