Course Outline

AI Sovereignty and LLM Local Deployment

  • Risks associated with cloud LLMs: data retention, training on user inputs, and exposure to foreign jurisdictions.
  • Ollama architecture: model server, registry, and OpenAI-compatible API.
  • Comparison with vLLM, llama.cpp, and Text Generation Inference.
  • Model licensing: terms for Llama, Mistral, Qwen, and Gemma.

Installation and Hardware Setup

  • Installing Ollama on Linux with CUDA and ROCm support.
  • CPU-only fallback and AVX/AVX2 optimization.
  • Docker deployment and persistent volume mapping.
  • Multi-GPU setup and VRAM allocation strategies.
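
The Docker deployment with persistent volume mapping can be sketched as a minimal docker-compose file (the service name, volume name, and GPU reservation are illustrative; the official image serves on port 11434):

```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama-models:/root/.ollama   # persist downloaded models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama-models:
```

Without the `deploy` block the same file works for CPU-only hosts.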

Model Management

  • Downloading models from the Ollama registry: ollama pull llama3.
  • Importing GGUF models from Hugging Face and TheBloke.
  • Quantization levels: trade-offs between Q4_K_M, Q5_K_M, and Q8_0.
  • Model switching and limits on concurrent model loading.
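
The quantization trade-off can be made concrete with a back-of-the-envelope VRAM estimate. The bits-per-weight figures below are rough averages for the GGUF k-quant formats (actual file sizes vary because different tensors use different quant types):

```python
# Approximate average bits per weight for common GGUF quantization levels.
# These are ballpark figures, not exact format specifications.
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q8_0": 8.5}

def model_size_gb(n_params: float, quant: str) -> float:
    """Estimated weight footprint in GB (excludes KV cache and runtime overhead)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

# An 8B-parameter model at each quantization level:
for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{model_size_gb(8e9, quant):.1f} GB")
```

This is why an 8B model fits comfortably on an 8 GB GPU at Q4_K_M but not at Q8_0.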

Custom Modelfiles

  • Writing Modelfile syntax: FROM, PARAMETER, SYSTEM, TEMPLATE.
  • Tuning temperature, top_p, and repeat_penalty.
  • System prompt engineering for role-specific behaviour.
  • Creating and publishing custom models to the local registry.
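
A minimal Modelfile illustrating the directives above (the base model, parameter values, and system prompt are examples, not recommendations):

```
FROM llama3
PARAMETER temperature 0.3
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
SYSTEM """You are a support engineer. Answer concisely and ask for logs before diagnosing."""
```

Build and run it locally with `ollama create support-bot -f Modelfile` followed by `ollama run support-bot`.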

API Integration

  • OpenAI-compatible /v1/chat/completions endpoint.
  • Streaming responses and JSON mode.
  • Integrating with LangChain, LlamaIndex, and custom applications.
  • Authentication and rate limiting using a reverse proxy.
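
Because the endpoint is OpenAI-compatible, no special SDK is required; a plain HTTP POST suffices. This sketch assumes a default local install serving on port 11434 (the model name is an example):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, user_msg: str, stream: bool = False) -> dict:
    """Assemble an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
        "stream": stream,
    }

def chat(model: str, user_msg: str) -> str:
    """POST the payload and return the first completion's text."""
    payload = build_chat_request(model, user_msg)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Setting `"stream": True` instead returns incremental chunks, which is what LangChain and LlamaIndex use under the hood.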

Performance Optimization

  • Context window sizing and KV cache management.
  • Batch inference and parallel request handling.
  • CPU thread allocation and NUMA awareness.
  • Monitoring GPU utilization and memory pressure.
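
KV cache growth is easy to underestimate: per token, the cache stores one key and one value vector per layer per KV head. A rough estimate, assuming Llama-3-8B-style dimensions (32 layers, 8 KV heads under grouped-query attention, head dimension 128, fp16 cache):

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_len: int, bytes_per_elem: int = 2) -> int:
    """Total KV cache size: keys + values, across all layers and all positions."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * ctx_len

# Llama-3-8B-like geometry at an 8192-token context window:
size = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, ctx_len=8192)
print(f"{size / 2**30:.1f} GiB")  # prints "1.0 GiB"
```

Doubling the context window or serving several parallel requests multiplies this figure accordingly, which is why context sizing belongs in VRAM budgeting alongside the model weights.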

Security and Compliance

  • Network isolation for model serving endpoints.
  • Input filtering and output moderation pipelines.
  • Audit logging of prompts and completions.
  • Model provenance and hash verification.
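
Hash verification reduces to comparing a SHA-256 digest of the downloaded model file against a published value (Ollama itself names model blobs by their sha256 digest). A minimal sketch:

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so multi-GB models fit in constant memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify_model(path: str, expected_digest: str) -> bool:
    """Compare the file's digest against a trusted published value."""
    return sha256_file(path) == expected_digest
```

Pairing this check with audit logging gives a verifiable chain from the published model artifact to what is actually being served.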

Requirements

  • Intermediate knowledge of Linux and container administration.
  • High-level understanding of machine learning and transformer models.
  • Familiarity with REST APIs and JSON.

Audience

  • AI engineers and developers looking to replace cloud LLM APIs.
  • Organisations with data sensitivity constraints that prohibit cloud model usage.
  • Government and defence teams requiring air-gapped language models.
Duration: 14 Hours

Custom Corporate Training

Training solutions designed exclusively for businesses.

  • Customized Content: We adapt the syllabus and practical exercises to the real goals and needs of your project.
  • Flexible Schedule: Dates and times adapted to your team's agenda.
  • Format: Online (live), In-company (at your offices), or Hybrid.

Investment

Price per private group, online live training, starting from 2600 € + VAT*

Contact us for an exact quote and to hear about our latest promotions.
