πŸ§ͺ E2E Testing Guide

Learn how the MLOS End-to-End testing system works and how to extend it for new models.

πŸ“Š Live Test Reports: View the latest E2E validation report

Overview

The MLOS E2E testing system validates the complete stack from model installation through inference. Tests run automatically on GitHub Actions and publish results to a live dashboard.

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                        E2E Test Pipeline                            β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                     β”‚
β”‚  1. Download Releases          2. Install Models                    β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚  β”‚  Axon v3.1.3     β”‚         β”‚  Hugging Face    β”‚                 β”‚
β”‚  β”‚  Core v3.2.8     β”‚         β”‚  Models β†’ ONNX   β”‚                 β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚           β”‚                            β”‚                            β”‚
β”‚           β–Ό                            β–Ό                            β”‚
β”‚  3. Start MLOS Core           4. Register Models                    β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚  β”‚  HTTP API :18080 │◀────────│  axon register   β”‚                 β”‚
β”‚  β”‚  ONNX Runtime    β”‚         β”‚  model.onnx      β”‚                 β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚           β”‚                                                         β”‚
β”‚           β–Ό                                                         β”‚
β”‚  5. Run Inference Tests       6. Generate Report                    β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚  β”‚  POST /inference │────────▢│  HTML Report     β”‚                 β”‚
β”‚  β”‚  Validate Output β”‚         β”‚  GitHub Pages    β”‚                 β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚                                                                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Configuration Files

1. Model Configuration (config/models.yaml)

Defines which models to test:

models:
  gpt2:
    enabled: true
    category: nlp
    axon_id: "hf/distilgpt2@latest"
    description: "DistilGPT2 - Text generation"
    
  resnet:
    enabled: true
    category: vision
    axon_id: "hf/microsoft/resnet-50@latest"
    description: "ResNet-50 - Image classification"

2. Test Input Configuration (config/test-inputs.yaml)

Defines test inputs for each model:

models:
  gpt2:
    category: nlp
    tokenizer: "distilgpt2"
    test_text:
      small: "Hello, I am a language model."
      large: "Machine learning has transformed..."
    max_length:
      small: 16
      large: 128
    required_inputs: ["input_ids"]  # ONNX model inputs
    
  resnet:
    category: vision
    input_name: "pixel_values"
    image_size: 224
    normalization:
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]

⚠️ Important: The required_inputs must match exactly what the ONNX model expects. Different models have different input requirements!
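
To catch a mismatch early, you can compare the configured required_inputs against the model's actual graph inputs. A minimal sketch, assuming the onnx and PyYAML packages and a locally downloaded model.onnx (paths are illustrative):

# sketch: verify required_inputs matches the ONNX graph (assumes onnx + PyYAML)
import onnx
import yaml

model = onnx.load("model.onnx")  # path to the converted model (illustrative)
actual = {inp.name for inp in model.graph.input}

with open("config/test-inputs.yaml") as f:
    cfg = yaml.safe_load(f)["models"]["gpt2"]  # pick the model under test

missing = set(cfg.get("required_inputs", [])) - actual
if missing:
    raise SystemExit(f"Configured inputs not in ONNX graph: {missing}")
print(f"OK: graph inputs {sorted(actual)} cover required_inputs")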

Currently Tested Models

| Category | Model              | Status     | Required Inputs                            |
|----------|--------------------|------------|--------------------------------------------|
| NLP      | GPT-2 (DistilGPT2) | βœ… Passing | input_ids                                  |
| NLP      | BERT               | βœ… Passing | input_ids, attention_mask, token_type_ids  |
| NLP      | RoBERTa            | βœ… Passing | input_ids                                  |
| Vision   | ResNet-50          | βœ… Passing | pixel_values                               |
| Vision   | ViT                | βœ… Passing | pixel_values                               |
| Vision   | ConvNeXt           | βœ… Passing | pixel_values                               |
| Vision   | MobileNetV2        | βœ… Passing | pixel_values                               |
| Vision   | DeiT               | βœ… Passing | pixel_values                               |

Adding a New Model

Step 1: Check ONNX Model Inputs

First, determine what inputs your ONNX model expects:

# Using Docker to inspect model
docker run --rm --entrypoint="" \
  -v ~/.axon/cache/models/hf/your-model:/model \
  ghcr.io/mlos-foundation/axon-converter:latest \
  python3 -c "
import onnx
model = onnx.load('/model/model.onnx')
print('Model inputs:')
for inp in model.graph.input:
    print(f'  - {inp.name}')
"

Step 2: Add to config/models.yaml

models:
  my_new_model:
    enabled: true
    category: nlp  # or vision, multimodal
    axon_id: "hf/organization/model-name@latest"
    description: "My Model - Task description"

Step 3: Add to config/test-inputs.yaml

models:
  my_new_model:
    category: nlp
    tokenizer: "organization/model-name"
    test_text:
      small: "Test sentence."
      large: "Longer test text..."
    max_length:
      small: 16
      large: 128
    required_inputs: ["input_ids"]  # Match ONNX inputs!
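
To sanity-check the tokenizer and test_text values, you can encode the small input yourself. A sketch assuming the transformers package; the tokenizer name is the placeholder from the config above:

# sketch: preview the input_ids the harness would generate (assumes transformers)
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("organization/model-name")  # your tokenizer
ids = tok.encode("Test sentence.", max_length=16, truncation=True)
print({"input_ids": ids})  # flat payload shape expected by Core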

Step 4: Test Locally

# Generate test input
python3 scripts/generate-test-input.py my_new_model small

# Install and test
axon install hf/organization/model-name@latest
axon register hf/organization/model-name@latest

# Run inference
curl -X POST http://localhost:18080/models/my_new_model/inference \
  -H "Content-Type: application/json" \
  -d "$(python3 scripts/generate-test-input.py my_new_model small)"

Core API Input Format

MLOS Core expects a flat JSON format where keys match ONNX input names:

# NLP model (single input)
{"input_ids": [101, 7592, 102]}

# NLP model (multiple inputs - e.g., BERT)
{
  "input_ids": [101, 7592, 102],
  "attention_mask": [1, 1, 1],
  "token_type_ids": [0, 0, 0]
}

# Vision model (flat array of pixel values)
{"pixel_values": [0.1, 0.2, 0.3, ...]}

πŸ’‘ Tip: Use python3 scripts/generate-test-input.py <model> --pretty to see formatted JSON output for debugging.
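
For vision models, the flat pixel_values array is typically produced by resizing, normalizing, and flattening an image. A sketch assuming Pillow and NumPy, using the ResNet-50 normalization values from config/test-inputs.yaml (the image path and CHW layout are assumptions; verify against your model):

# sketch: build a flat pixel_values payload (assumes Pillow + NumPy)
import json

import numpy as np
from PIL import Image

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

img = Image.open("cat.jpg").convert("RGB").resize((224, 224))  # illustrative path
arr = (np.asarray(img) / 255.0 - mean) / std   # HWC, normalized per channel
arr = arr.transpose(2, 0, 1)                   # CHW, the usual ONNX layout
payload = {"pixel_values": arr.flatten().tolist()}
print(json.dumps(payload)[:80], "...")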

Running Tests

Locally

# Full E2E test
./scripts/test-release-e2e.sh

# With specific versions
AXON_VERSION=v3.1.3 CORE_VERSION=3.2.8-alpha ./scripts/test-release-e2e.sh

GitHub Actions

# Trigger manually
gh workflow run e2e-test.yml \
  -f axon_version=v3.1.3 \
  -f core_version=3.2.8-alpha

# View results
gh run list --workflow=e2e-test.yml

Troubleshooting

"Inference execution failed"

"Model not found"

Resources