AI Stack

The TAPPaaS AI Stack provides local AI capabilities, giving you control over your AI infrastructure while maintaining privacy and reducing costs.

Components

Component     Purpose
OpenWebUI     Web interface for interacting with LLMs
LiteLLM       Unified API gateway for multiple LLM providers
Ollama/vLLM   Local LLM serving engines

Architecture

graph TB
    User[User] --> OWU[OpenWebUI]
    OWU --> LLM[LiteLLM Proxy]
    LLM --> Local[Local LLM<br/>Ollama/vLLM]
    LLM --> Cloud[Cloud APIs<br/>OpenAI/Anthropic]
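
As a rough illustration of this routing layer, the sketch below uses LiteLLM's Python Router to map two named models onto a local Ollama backend and a cloud provider. The model names, the Ollama address, and the choice of gpt-4 are illustrative placeholders, not TAPPaaS defaults:

    from litellm import Router  # pip install litellm

    # Mirror the diagram: one gateway, two backends (local Ollama + cloud API).
    # All names and endpoints are illustrative placeholders.
    router = Router(model_list=[
        {
            "model_name": "local-chat",
            "litellm_params": {
                "model": "ollama/llama3",              # assumes the model is pulled
                "api_base": "http://localhost:11434",  # default Ollama endpoint
            },
        },
        {
            "model_name": "cloud-chat",
            "litellm_params": {"model": "gpt-4"},      # needs OPENAI_API_KEY set
        },
    ])

    resp = router.completion(
        model="local-chat",
        messages=[{"role": "user", "content": "Hello from the gateway"}],
    )
    print(resp.choices[0].message.content)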

Prerequisites

  • Foundation installed
  • Sufficient RAM (minimum 8 GB for AI workloads)
  • GPU recommended for local LLM inference

Hardware Requirements

Configuration   RAM      GPU           Use Case
Minimal         8 GB     None          API proxy only
Standard        16 GB    Optional      Small models (7B)
Performance     32+ GB   Recommended   Large models (13B+)
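
As a rough rule of thumb, a 4-bit-quantized model needs on the order of half its parameter count in gigabytes of memory (a 7B model fits in roughly 4-5 GB), plus headroom for the OS and context; the tiers above follow from that.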

Quick Start

For a basic AI setup (a minimal code sketch follows the steps):

  1. Deploy LiteLLM as your API gateway
  2. Configure cloud provider API keys
  3. Deploy OpenWebUI for user interface
  4. Optionally add a local LLM with Ollama
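
A minimal sketch of steps 2 and 4, assuming the litellm Python package is installed, an OPENAI_API_KEY is set in the environment, and Ollama is running locally with a llama3 model pulled (all placeholders, not TAPPaaS-specific values):

    import litellm

    # Steps 2 and 4 in miniature: one call signature for both a cloud model
    # and a local one; the "ollama/" prefix routes to a local Ollama server.
    for model in ("gpt-4", "ollama/llama3"):
        resp = litellm.completion(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
        )
        print(model, "->", resp.choices[0].message.content)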

Stack Options

Cloud-Only Setup

Use LiteLLM as a proxy to cloud providers (a sample client call follows the list):

  • Low resource requirements
  • Access to latest models
  • Pay-per-use pricing
  • Requires internet connectivity
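
Because the LiteLLM proxy exposes an OpenAI-compatible API, any standard client can talk to it. A sketch, assuming the proxy runs on its default port 4000 with gpt-4 in its model list; the key is a placeholder:

    from openai import OpenAI  # pip install openai

    # Point the standard OpenAI client at the LiteLLM proxy instead of
    # api.openai.com; the proxy forwards the request to the cloud provider.
    client = OpenAI(
        base_url="http://localhost:4000",  # assumed LiteLLM proxy address
        api_key="sk-placeholder",          # a key issued by your proxy
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello via the gateway"}],
    )
    print(resp.choices[0].message.content)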

Hybrid Setup

Combine local and cloud models (a fallback sketch follows the list):

  • Use local models for common tasks
  • Fall back to cloud for complex queries
  • Balance cost and capability
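
One way to express the fallback idea, sketched with litellm: try the local model first and retry against a cloud model on any failure. The model names and timeout are illustrative assumptions:

    import litellm

    def ask(prompt: str) -> str:
        """Prefer the local model; fall back to a cloud model on failure."""
        messages = [{"role": "user", "content": prompt}]
        try:
            # Local first: cheap and private (assumes Ollama serves llama3).
            resp = litellm.completion(model="ollama/llama3",
                                      messages=messages, timeout=30)
        except Exception:
            # Cloud fallback for outages or hard queries (needs OPENAI_API_KEY).
            resp = litellm.completion(model="gpt-4", messages=messages)
        return resp.choices[0].message.content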

Fully Local

Run everything on-premises (a direct local call is sketched after the list):

  • Complete data privacy
  • No ongoing API costs
  • Requires significant hardware
  • Limited to available open models
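
In a fully local setup nothing has to leave the host; for instance, Ollama's REST API can be queried directly. A sketch assuming a llama3 model is already pulled:

    import requests

    # Query Ollama's generate endpoint on its default port; stream=False
    # returns the whole completion as a single JSON object.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": "Why self-host?", "stream": False},
        timeout=120,
    )
    print(resp.json()["response"])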

Installation Guides

  • LiteLLM: Unified API gateway for managing multiple LLM providers
  • OpenWebUI: Feature-rich web interface for AI interactions
  • Ollama: Easy local LLM deployment and management

Cost Considerations

Cloud API Costs

Provider     Model      Approximate Cost
OpenAI       GPT-4      $0.03/1K tokens
Anthropic    Claude 3   $0.015/1K tokens
OpenRouter   Various    Varies
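
As a worked example: at $0.03/1K tokens, one million GPT-4 tokens a month costs about $30, and ten million about $300. Prices change often and usually differ for input and output tokens, so treat the table as a rough guide.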

Self-Hosted Costs

Component               Monthly Cost
Electricity             €20-50
Hardware amortization   €30-100
Internet                Existing (no additional cost)
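
Rough break-even arithmetic: at €50-150/month all-in, self-hosting pays for itself somewhere around 2-5 million GPT-4-class tokens per month at the cloud rates above; below that volume, a cloud-only setup is usually cheaper.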

Security Notes

  • Provider API keys are stored centrally in LiteLLM, so clients never handle them directly
  • Local inference keeps data on-premises
  • OpenWebUI supports user authentication
  • Consider network isolation for AI workloads