Integrator of Open Source Software & Open Access AI Models

Generative AI & RAG

Chat with an LLM and get answers enriched by the documents and data you already have, using Retrieval-Augmented Generation (RAG).
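The RAG flow above can be sketched in a few lines: retrieve the documents most similar to the user's question, then prepend them to the prompt. This is a toy sketch in plain Python; the bag-of-words "embedding," sample documents, and function names are illustrative stand-ins for a real embedding model and vector store.

```python
import math
import re
from collections import Counter

# Toy corpus standing in for "the documents and data you already have".
DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm CET.",
    "Enterprise plans include single sign-on and audit logs.",
]

def embed(text: str) -> Counter:
    """Bag-of-words 'embedding' -- a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Augment the prompt with retrieved context before sending it to the LLM."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("What is the refund policy?")
print(prompt)
```

In production the retrieval step would query a vector database (such as pgVector or Neo4j, below) rather than scanning an in-memory list.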

Databases & Data Pipelines

Extract, transform, and load (ETL) data from various sources into a knowledge graph or vector embeddings for use with AI models.
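The transform step in such a pipeline typically splits extracted text into overlapping chunks before embedding. A minimal sketch, assuming simple character windows (the sizes and function name are illustrative; real pipelines often chunk on sentence or token boundaries):

```python
def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split extracted text into overlapping character windows for embedding.

    Overlap keeps context that straddles a chunk boundary retrievable
    from at least one chunk.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# A 500-character document becomes three chunks that share 50-character seams.
parts = chunk("x" * 500, size=200, overlap=50)
print(len(parts))
```

Each chunk would then be embedded and loaded into a vector database in the load step.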

On-Prem & Cloud Infrastructure

Host complete AI-powered applications, or AI endpoints for existing applications in secure, privacy-respecting environments.

AI Technology Stack

Frontend / AI Middleware

  • Bionic-GPT
  • LibreChat
  • Chatbot UI
  • LangChain
  • Ollama / HuggingFace TGI / LiteLLM
  • n8n Data Pipeline Automation

Databases for AI Applications

  • PostgreSQL & pgVector Extension
  • MongoDB & MongoDB Atlas
  • Supabase Platform
  • Neo4j Graph Database
  • Redis Cache / Redis as Vector Database
  • Elasticsearch for Vector Search

Compute Infrastructure & AIaaS

  • Amazon Web Services (Bedrock)
  • Google Cloud (Vertex AI)
  • Azure AI Services (AI Studio)
  • Paperspace by DigitalOcean
  • OctoAI Media & Text Gen Solution
  • Docker, Swarm, & Kubernetes

Foundational Open Source AI Models

Meta Llama 3 and Llama 2

Llama 3 and Llama 2 are a family of AI models by Meta that includes chat- and instruction-tuned variants, Llama Guard, and Code Llama.

Mistral AI

Mistral AI develops open models, including variants fine-tuned for chat applications; the Paris-based company was the first after OpenAI to partner with Microsoft.

Falcon LLM

Falcon LLM was among the first models released for both research & commercial use, developed by the Technology Innovation Institute under Abu Dhabi’s Advanced Technology Research Council.

Google Gemma

Google Gemma is a family of open access models from Google, built on the research behind its flagship Gemini models and optimized for lightweight and edge deployment.

Microsoft Phi-3

Microsoft Phi-3 is a 3.8B-parameter small language model (SLM) that can be deployed & fine-tuned with low resource usage while delivering high-quality results.

Run at the Edge, in the Datacenter, or Cloud

Self-Hosted AI Models

Integrate applications, such as chatbots, with local endpoints so that your users’ prompts and confidential data never leave your environment. Enjoy the predictable, flat cost of provisioning as many GPU instances as you require. Best for sensitive use cases where data sovereignty is a concern.

AI Models as a Service (AI MaaS)

Integrate with external AI endpoints where you “pay as you go” with simple, per-token pricing and no up-front hardware cost. Take advantage of open models that are several times more cost-effective than OpenAI’s GPT models. Best for general use cases and application prototyping.

Benefits of Composable AI Infrastructure

Generative AI (GenAI)

Generate text and images from prompts using Large Language Models (LLMs) and image generation models.

Flat or Per-Token Pricing

Provision the number of GPUs you desire, or get started with no upfront hardware cost.

OpenAI Compatible API

Expose Hugging Face AI models with an OpenAI-compatible endpoint via LiteLLM Proxy.
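A minimal sketch of what such a LiteLLM proxy configuration can look like, assuming a locally hosted Hugging Face TGI server; the model names, port, and paths below are illustrative assumptions, not a verified deployment:

```yaml
# config.yaml -- started with: litellm --config config.yaml
model_list:
  - model_name: gpt-3.5-turbo          # alias that OpenAI clients request
    litellm_params:
      model: huggingface/mistralai/Mistral-7B-Instruct-v0.2
      api_base: http://localhost:8080  # local TGI server (assumed port)
```

Clients then call the proxy exactly as they would call OpenAI, and LiteLLM routes requests to the local model.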

Edge AI

Run AI models locally on the device where the application is used, or on IoT devices.

Secure by Default

Manage costs and ensure appropriate use of AI applications with audit logs and SSO integration.

Containerized Architecture

Deploy & scale with composable AI architecture, using pre-built stacks on Docker Swarm and Kubernetes.
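As an illustration of such a pre-built stack, here is a minimal Compose sketch pairing a local model server with an OpenAI-compatible proxy; the image tags, ports, and volume paths are assumptions to adapt, not a production manifest:

```yaml
# docker-compose.yml -- illustrative sketch of a composable AI stack
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"              # Ollama's default API port
    volumes:
      - ollama-models:/root/.ollama  # persist downloaded model weights
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    command: ["--config", "/app/config.yaml"]
    ports:
      - "4000:4000"                # LiteLLM's default proxy port
    depends_on:
      - ollama
volumes:
  ollama-models:
```

The same services can be promoted from a single host to Docker Swarm or Kubernetes as inferencing demand grows.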

Private AI

Ensure your proprietary data remains confidential and prevent leaks of personally identifiable information (PII).

No Internet Connection Required

Enjoy the benefits of AI inferencing without transmitting any prompts or embedding data to an external provider.

Scalable Load Balancing

Scale out with multiple compute nodes for your AI endpoints as your inferencing demands grow.

Built on Open Software

Integrate an open source stack with open access AI models to build complete AI solutions with no licensing cost.

Use any Large Language Model

“Plug and play” AI models from repositories such as Hugging Face with one unified technology stack.

Integrate with Existing Apps

Use open AI models as a “drop-in replacement” for OpenAI’s API, with minimal to no refactoring.
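The “drop-in replacement” works because self-hosted endpoints speak the same chat-completions wire format as OpenAI, so only the base URL changes. A sketch using only the standard library, assuming a local OpenAI-compatible proxy; the URL, model name, and placeholder key are illustrative assumptions:

```python
import json
from urllib import request

# Point at a self-hosted OpenAI-compatible endpoint instead of api.openai.com.
API_BASE = "http://localhost:4000/v1"  # assumed local proxy address

def build_chat_request(model: str, messages: list[dict]) -> request.Request:
    """Build the same POST an OpenAI client would send, aimed at a local endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return request.Request(
        f"{API_BASE}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer sk-local",  # placeholder key
        },
        method="POST",
    )

req = build_chat_request(
    "mistral-7b-instruct",
    [{"role": "user", "content": "Hello"}],
)
# urllib.request.urlopen(req) would send it -- once an endpoint is running.
```

In practice, existing code using an OpenAI client library usually only needs its base URL pointed at the self-hosted endpoint.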