AI Middleware

Serverless Deployment of AI Middleware, LiteLLM, with Google Cloud Run

AI middleware is an emerging term for the layer of the technology stack that connects end-user AI applications to the Large Language Models and GPU-accelerated machines that drive them. Here are the major sub-categories of this…
Read More
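
As a concrete illustration of that middleware layer, here is a minimal sketch of an application calling a LiteLLM proxy, which exposes an OpenAI-compatible API, through the OpenAI Python client. The Cloud Run URL and API key below are placeholders, not values from the article.

```python
from openai import OpenAI

# Point the standard OpenAI client at a LiteLLM proxy instead of api.openai.com.
# The Cloud Run URL and key are hypothetical placeholders.
client = OpenAI(
    base_url="https://litellm-proxy-example.a.run.app",
    api_key="sk-litellm-virtual-key",
)

# The proxy forwards this OpenAI-compatible request to whichever backend
# model its configuration maps the model name to.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What does AI middleware do?"}],
)
print(response.choices[0].message.content)
```

Because the proxy speaks the same API as the upstream providers, the application code stays unchanged when the backend model or host is swapped out.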

Proxies & Load Balancers for AI LLM Models (AI Middleware)

The Cambrianesque explosion of capable, open Large Language Models represents an opportunity to extend virtually any application with AI capabilities, but it also makes a strategy for managing multiple AI endpoints essential. Hosting open models in your own environment requires…
Read More
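
To make the endpoint-management problem concrete, the toy sketch below round-robins requests across several OpenAI-compatible endpoints. The URLs, keys, and model names are illustrative assumptions; the articles cover more capable approaches (dedicated proxies and load balancers) than this simple client-side rotation.

```python
import itertools
from openai import OpenAI

# Hypothetical OpenAI-compatible endpoints: one hosted API plus two
# self-hosted open-model servers. All values are placeholders.
ENDPOINTS = [
    {"base_url": "https://api.openai.com/v1", "api_key": "sk-hosted", "model": "gpt-4o-mini"},
    {"base_url": "http://10.0.0.5:8000/v1", "api_key": "none", "model": "llama-3-8b-instruct"},
    {"base_url": "http://10.0.0.6:8000/v1", "api_key": "none", "model": "mistral-7b-instruct"},
]

_cycle = itertools.cycle(ENDPOINTS)

def complete(prompt: str) -> str:
    """Send a chat completion to the next endpoint in round-robin order."""
    target = next(_cycle)
    client = OpenAI(base_url=target["base_url"], api_key=target["api_key"])
    response = client.chat.completions.create(
        model=target["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(complete("Summarize what AI middleware does in one sentence."))
```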