karpenter-optimizer

AGENTS.md - AI Assistant Guidelines

This document provides context and guidelines for AI coding assistants working on the Karpenter Optimizer project.

Project Overview

Karpenter Optimizer is a tool for optimizing Karpenter NodePool configurations based on actual cluster usage data. It analyzes Kubernetes workloads and node capacity to provide cost-optimized instance type recommendations.

Key Purpose

Architecture

Backend (Go)

Frontend (React)

Key Data Structures

NodePoolCapacityRecommendation (Primary Format)

type NodePoolCapacityRecommendation struct {
    NodePoolName             string   `json:"nodePoolName"`
    CurrentNodes             int      `json:"currentNodes"`
    CurrentInstanceTypes     []string `json:"currentInstanceTypes"` // Format: "m6g.2xlarge (27)"
    CurrentCPUUsed           float64  `json:"currentCPUUsed"`
    CurrentCPUCapacity       float64  `json:"currentCPUCapacity"`
    CurrentMemoryUsed        float64  `json:"currentMemoryUsed"`
    CurrentMemoryCapacity    float64  `json:"currentMemoryCapacity"`
    CurrentCost              float64  `json:"currentCost"`
    RecommendedNodes         int      `json:"recommendedNodes"`
    RecommendedInstanceTypes []string `json:"recommendedInstanceTypes"`
    RecommendedTotalCPU      float64  `json:"recommendedTotalCPU"`
    RecommendedTotalMemory   float64  `json:"recommendedTotalMemory"`
    RecommendedCost          float64  `json:"recommendedCost"`
    CostSavings              float64  `json:"costSavings"`
    CostSavingsPercent       float64  `json:"costSavingsPercent"`
    Reasoning                string   `json:"reasoning"` // AI-generated explanation
    Architecture             string   `json:"architecture"` // arm64 or amd64
    CapacityType             string   `json:"capacityType"` // spot or on-demand
}
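
The `CurrentInstanceTypes` entries pair an instance type with a node count, as in `"m6g.2xlarge (27)"`. A minimal sketch of how such strings could be produced follows. The helper name `formatInstanceTypeCounts` and the sort-by-name ordering are assumptions for illustration, not the project's code.

```go
package main

import (
	"fmt"
	"sort"
)

// formatInstanceTypeCounts is a hypothetical helper (not part of the project).
// It takes one instance type per node and returns entries such as
// "m6g.2xlarge (27)", sorted by instance type name for stable output.
func formatInstanceTypeCounts(instanceTypes []string) []string {
	counts := make(map[string]int)
	for _, t := range instanceTypes {
		counts[t]++
	}

	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names)

	out := make([]string, 0, len(names))
	for _, name := range names {
		out = append(out, fmt.Sprintf("%s (%d)", name, counts[name]))
	}
	return out
}

func main() {
	nodes := []string{"m6g.2xlarge", "m6g.xlarge", "m6g.2xlarge"}
	for _, entry := range formatInstanceTypeCounts(nodes) {
		fmt.Println(entry)
	}
}
```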

NodeInfo (from Kubernetes client)

type NodeInfo struct {
    Name          string
    InstanceType  string
    Architecture  string
    CapacityType  string // "spot" or "on-demand"
    NodePool      string
    CPUUsage      *ResourceUsage
    MemoryUsage   *ResourceUsage
    PodCount      int
    // ...
}

API Endpoints

Primary Recommendation Endpoints

Data Endpoints

Important Patterns and Conventions

1. Recommendation Generation Flow

1. Fetch NodePools with actual nodes (ListNodePools)
2. For each NodePool:
   a. Calculate current capacity (CPU/Memory allocatable from nodes)
   b. Calculate current cost (based on actual instance types and capacity types)
   c. Find optimal instance types (findOptimalInstanceTypesWithCapacityType)
   d. Try both spot and on-demand to find best cost
   e. Only recommend if cost savings > 0
3. Enhance with Ollama explanations (if Ollama available)
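
The flow above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the `pool`, `candidate`, and `recommendation` types and the `recommend` and `cheapest` functions are hypothetical stand-ins, `findCandidates` stands in for `findOptimalInstanceTypesWithCapacityType` (whose real signature is not shown here), and the Ollama step is omitted.

```go
package main

import "fmt"

// pool is a simplified stand-in for a NodePool with its current cost.
type pool struct {
	Name        string
	CurrentCost float64
}

// candidate is one priced option for a pool, per capacity type.
type candidate struct {
	CapacityType string // "spot" or "on-demand"
	Cost         float64
}

// recommendation carries a small subset of the fields of the real struct.
type recommendation struct {
	NodePoolName    string
	CapacityType    string
	CurrentCost     float64
	RecommendedCost float64
	CostSavings     float64
}

// cheapest returns the lowest-cost candidate (step 2d).
func cheapest(cands []candidate) (candidate, bool) {
	if len(cands) == 0 {
		return candidate{}, false
	}
	best := cands[0]
	for _, c := range cands[1:] {
		if c.Cost < best.Cost {
			best = c
		}
	}
	return best, true
}

// recommend walks the pools and keeps only results that save money (step 2e).
func recommend(pools []pool, findCandidates func(pool) []candidate) []recommendation {
	var recs []recommendation
	for _, p := range pools {
		best, ok := cheapest(findCandidates(p))
		if !ok {
			continue // nothing to compare against
		}
		savings := p.CurrentCost - best.Cost
		if savings <= 0 {
			continue // only recommend if cost savings > 0
		}
		recs = append(recs, recommendation{
			NodePoolName:    p.Name,
			CapacityType:    best.CapacityType,
			CurrentCost:     p.CurrentCost,
			RecommendedCost: best.Cost,
			CostSavings:     savings,
		})
	}
	return recs
}

func main() {
	// Illustrative numbers only.
	pools := []pool{{Name: "general", CurrentCost: 10}, {Name: "batch", CurrentCost: 2}}
	find := func(p pool) []candidate {
		return []candidate{
			{CapacityType: "on-demand", Cost: 8},
			{CapacityType: "spot", Cost: 3},
		}
	}
	for _, r := range recommend(pools, find) {
		fmt.Printf("%s: switch to %s, saves %.2f\n", r.NodePoolName, r.CapacityType, r.CostSavings)
	}
}
```

In this example only the `general` pool yields a recommendation; the `batch` pool is already cheaper than any candidate, so it is skipped.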

2. Capacity Type Handling

3. Cost Calculation

4. Instance Type Selection

5. Node Usage Calculation

6. Frontend Format Support

Key Files and Their Purposes

Backend Core Files

Frontend Core Files

Important Notes

Dependencies

Cost Calculation Rules

  1. Spot instances: 25% of on-demand price (75% discount)
  2. On-demand instances: Full AWS Pricing API price
  3. Unknown capacity type: Default to on-demand
  4. Cost validation: Skip recommendations that increase cost by >10%
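
As a rough illustration of these rules, the sketch below applies the spot factor, falls back to on-demand for unknown capacity types, and flags recommendations that raise cost by more than 10%. The function and constant names are hypothetical, and the project's actual `estimateCost()` may be structured differently.

```go
package main

import "fmt"

const (
	spotPriceFactor = 0.25 // spot is 25% of the on-demand price (75% discount)
	maxCostIncrease = 0.10 // recommendations above +10% are skipped
)

// effectivePrice applies the capacity-type rules to an on-demand price.
// Hypothetical helper, not the project's API.
func effectivePrice(onDemandPrice float64, capacityType string) float64 {
	if capacityType == "spot" {
		return onDemandPrice * spotPriceFactor
	}
	// "on-demand" and any unknown capacity type use the full price.
	return onDemandPrice
}

// exceedsCostLimit reports whether a recommendation increases cost by more than 10%.
func exceedsCostLimit(currentCost, recommendedCost float64) bool {
	return recommendedCost > currentCost*(1+maxCostIncrease)
}

func main() {
	// Illustrative price, not real AWS pricing.
	fmt.Println(effectivePrice(1.0, "spot"))      // spot factor applied
	fmt.Println(effectivePrice(1.0, "on-demand")) // full price
	fmt.Println(effectivePrice(1.0, ""))          // unknown: treated as on-demand
	fmt.Println(exceedsCostLimit(100, 111))       // more than 10% higher
}
```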

Instance Type Rules

  1. Query the AWS API first for available types, so new types are picked up without code changes
  2. Filter GPU instances unless workloads require GPU
  3. Respect architecture (arm64 vs amd64)
  4. Limit combinations to 1-3 instance types
  5. Sort by cost efficiency before selecting
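
A minimal sketch of rules 2 to 5 is shown below. It assumes "cost efficiency" means price per vCPU and simplifies the 1-3 type limit to keeping at most three candidates; both are assumptions, as are the `instanceType` struct and the `shortlist` function. The prices are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// instanceType is a simplified, hypothetical description of an instance type.
type instanceType struct {
	Name         string
	Architecture string // "arm64" or "amd64"
	VCPU         float64
	Price        float64
	HasGPU       bool
}

const maxTypesPerCombination = 3

// shortlist filters by architecture and GPU need, sorts by price per vCPU
// (an assumed efficiency metric), and keeps at most three types.
func shortlist(types []instanceType, arch string, needGPU bool) []instanceType {
	var eligible []instanceType
	for _, t := range types {
		if t.Architecture != arch {
			continue // respect architecture
		}
		if t.HasGPU && !needGPU {
			continue // filter GPU instances unless workloads require GPU
		}
		eligible = append(eligible, t)
	}
	sort.SliceStable(eligible, func(i, j int) bool {
		return eligible[i].Price/eligible[i].VCPU < eligible[j].Price/eligible[j].VCPU
	})
	if len(eligible) > maxTypesPerCombination {
		eligible = eligible[:maxTypesPerCombination]
	}
	return eligible
}

func main() {
	// Made-up prices; not real AWS pricing.
	types := []instanceType{
		{Name: "m6g.large", Architecture: "arm64", VCPU: 2, Price: 0.10},
		{Name: "m6g.xlarge", Architecture: "arm64", VCPU: 4, Price: 0.16},
		{Name: "m5.large", Architecture: "amd64", VCPU: 2, Price: 0.12},
		{Name: "g5g.xlarge", Architecture: "arm64", VCPU: 4, Price: 0.50, HasGPU: true},
	}
	for _, t := range shortlist(types, "arm64", false) {
		fmt.Println(t.Name)
	}
}
```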

Node Usage Rules

  1. Use requests only (not limits)
  2. Exclude init containers
  3. Only scheduled pods (with nodeName)
  4. Exclude terminating pods

Frontend Display Rules

  1. Support both formats (old and new)
  2. Show NodePool name prominently
  3. Show current vs recommended nodes, costs, CPU/Memory
  4. Display AI explanation when available
  5. Show cost savings prominently

Common Tasks

Adding a New Instance Type Family

  1. Add to estimateInstanceCapacity() in recommender.go
  2. Add pricing to onDemandPrices map (if hardcoded)
  3. Add to estimateCostFromFamily() if needed
  4. AWS API will automatically discover new types

Modifying Recommendation Logic

  1. Primary location: internal/recommender/node_pool_recommender.go
  2. Cost calculation: internal/recommender/recommender.go → estimateCost()
  3. Instance selection: findOptimalInstanceTypesWithCapacityType()

Adding a New API Endpoint

  1. Add route in internal/api/server.go → setupRoutes()
  2. Add handler function
  3. Use GenerateRecommendationsFromNodePools() for recommendations
  4. Pass progressCallback for SSE endpoints

Updating Frontend Display

  1. Check NodePoolCard.js for format detection
  2. Support both old and new formats
  3. Test with both recommendation endpoints

Testing Considerations

Environment Variables

Code Style

Important Warnings

⚠️ DO NOT:

✅ DO:

Recent Changes