> 📊 **Live Test Reports:** View the latest E2E validation report
## Overview
The MLOS E2E testing system validates the complete stack from model installation through inference. Tests run automatically on GitHub Actions and publish results to a live dashboard.
```
┌─────────────────────────────────────────────────────────────────┐
│                        E2E Test Pipeline                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. Download Releases        2. Install Models                  │
│  ┌──────────────────┐        ┌──────────────────┐               │
│  │ Axon v3.1.3      │        │ Hugging Face     │               │
│  │ Core v3.2.8      │        │ Models → ONNX    │               │
│  └──────────────────┘        └──────────────────┘               │
│           │                           │                         │
│           ▼                           ▼                         │
│  3. Start MLOS Core          4. Register Models                 │
│  ┌──────────────────┐        ┌──────────────────┐               │
│  │ HTTP API :18080  │◄───────│ axon register    │               │
│  │ ONNX Runtime     │        │ model.onnx       │               │
│  └──────────────────┘        └──────────────────┘               │
│           │                                                     │
│           ▼                                                     │
│  5. Run Inference Tests      6. Generate Report                 │
│  ┌──────────────────┐        ┌──────────────────┐               │
│  │ POST /inference  │───────►│ HTML Report      │               │
│  │ Validate Output  │        │ GitHub Pages     │               │
│  └──────────────────┘        └──────────────────┘               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
## Configuration Files

### 1. Model Configuration (`config/models.yaml`)

Defines which models to test:

```yaml
models:
  gpt2:
    enabled: true
    category: nlp
    axon_id: "hf/distilgpt2@latest"
    description: "DistilGPT2 - Text generation"
  resnet:
    enabled: true
    category: vision
    axon_id: "hf/microsoft/resnet-50@latest"
    description: "ResNet-50 - Image classification"
```
### 2. Test Input Configuration (`config/test-inputs.yaml`)

Defines test inputs for each model:

```yaml
models:
  gpt2:
    category: nlp
    tokenizer: "distilgpt2"
    test_text:
      small: "Hello, I am a language model."
      large: "Machine learning has transformed..."
    max_length:
      small: 16
      large: 128
    required_inputs: ["input_ids"]  # ONNX model inputs
  resnet:
    category: vision
    input_name: "pixel_values"
    image_size: 224
    normalization:
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
```
> ⚠️ **Important:** The `required_inputs` must match exactly what the ONNX model expects. Different models have different input requirements!
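
For illustration, here is a rough Python sketch of how a vision test input could be built from the normalization values above. The NCHW layout, the flattening step, and the `cat.jpg` path are assumptions for this sketch, not the harness's actual code (that lives in `scripts/generate-test-input.py`):

```python
# Sketch: turn an image into the flat pixel_values array the Core API
# expects, using the mean/std from test-inputs.yaml. Assumes Pillow and
# NumPy; the CHW layout and flattening are assumptions here.
import numpy as np
from PIL import Image

MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

img = Image.open("cat.jpg").convert("RGB").resize((224, 224))
arr = np.asarray(img, dtype=np.float32) / 255.0  # HWC, scaled to [0, 1]
arr = (arr - MEAN) / STD                         # normalize per channel
arr = arr.transpose(2, 0, 1)                     # HWC -> CHW
pixel_values = arr.flatten().tolist()            # flat list for JSON
```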
## Currently Tested Models

| Category | Model | Status | Required Inputs |
|---|---|---|---|
| NLP | GPT-2 (DistilGPT2) | ✅ Passing | input_ids |
| NLP | BERT | ✅ Passing | input_ids, attention_mask, token_type_ids |
| NLP | RoBERTa | ✅ Passing | input_ids |
| Vision | ResNet-50 | ✅ Passing | pixel_values |
| Vision | ViT | ✅ Passing | pixel_values |
| Vision | ConvNeXt | ✅ Passing | pixel_values |
| Vision | MobileNetV2 | ✅ Passing | pixel_values |
| Vision | DeiT | ✅ Passing | pixel_values |
## Adding a New Model

### Step 1: Check ONNX Model Inputs

First, determine what inputs your ONNX model expects:

```bash
# Use Docker to inspect the model
docker run --rm --entrypoint="" \
  -v ~/.axon/cache/models/hf/your-model:/model \
  ghcr.io/mlos-foundation/axon-converter:latest \
  python3 -c "
import onnx
model = onnx.load('/model/model.onnx')
print('Model inputs:')
for inp in model.graph.input:
    print(f' - {inp.name}')
"
```
### Step 2: Add to `config/models.yaml`

```yaml
models:
  my_new_model:
    enabled: true
    category: nlp  # or vision, multimodal
    axon_id: "hf/organization/model-name@latest"
    description: "My Model - Task description"
```
### Step 3: Add to `config/test-inputs.yaml`

```yaml
models:
  my_new_model:
    category: nlp
    tokenizer: "organization/model-name"
    test_text:
      small: "Test sentence."
      large: "Longer test text..."
    max_length:
      small: 16
      large: 128
    required_inputs: ["input_ids"]  # Match ONNX inputs!
```
### Step 4: Test Locally

```bash
# Generate test input
python3 scripts/generate-test-input.py my_new_model small

# Install and test
axon install hf/organization/model-name@latest
axon register hf/organization/model-name@latest

# Run inference
curl -X POST http://localhost:18080/models/my_new_model/inference \
  -H "Content-Type: application/json" \
  -d "$(python3 scripts/generate-test-input.py my_new_model small)"
```
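
The same check can be driven from Python. This sketch (assuming the `requests` package) reuses `scripts/generate-test-input.py` to build the payload and does a basic sanity check on the response:

```python
# Sketch: call the inference endpoint from Python and sanity-check the
# response. The response's exact JSON structure is not assumed here --
# inspect the printed output for your model.
import json
import subprocess

import requests

payload = json.loads(
    subprocess.check_output(
        ["python3", "scripts/generate-test-input.py", "my_new_model", "small"]
    )
)
resp = requests.post(
    "http://localhost:18080/models/my_new_model/inference",
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(json.dumps(resp.json(), indent=2)[:500])  # peek at the first part
```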
## Core API Input Format

MLOS Core expects a flat JSON format where keys match ONNX input names:

```
# NLP model (single input)
{"input_ids": [101, 7592, 102]}

# NLP model (multiple inputs - e.g., BERT)
{
  "input_ids": [101, 7592, 102],
  "attention_mask": [1, 1, 1],
  "token_type_ids": [0, 0, 0]
}

# Vision model (flat array of pixel values)
{"pixel_values": [0.1, 0.2, 0.3, ...]}
```
> 💡 **Tip:** Use `python3 scripts/generate-test-input.py <model> --pretty` to see formatted JSON output for debugging.
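
To see where these keys come from for an NLP model, here is a hedged sketch (assuming the `transformers` package) that builds the flat payload from a tokenizer, mirroring the `required_inputs` idea from `config/test-inputs.yaml`:

```python
# Sketch: build the flat JSON payload for an NLP model from its
# tokenizer. Assumes the transformers package; adjust required_inputs
# to match what your ONNX model actually declares.
import json

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
encoded = tokenizer("Hello, I am a language model.", max_length=16, truncation=True)

required_inputs = ["input_ids"]  # from config/test-inputs.yaml
payload = {name: encoded[name] for name in required_inputs}
print(json.dumps(payload))
```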
## Running Tests

### Locally

```bash
# Full E2E test
./scripts/test-release-e2e.sh.bash

# With specific versions
AXON_VERSION=v3.1.3 CORE_VERSION=3.2.8-alpha ./scripts/test-release-e2e.sh.bash
```
### GitHub Actions

```bash
# Trigger manually
gh workflow run e2e-test.yml \
  -f axon_version=v3.1.3 \
  -f core_version=3.2.8-alpha

# View results
gh run list --workflow=e2e-test.yml
```
## Troubleshooting

### "Inference execution failed"

- Check that the JSON keys match the ONNX model's input names exactly
- Verify you're not sending extra inputs the model doesn't expect
- Use the ONNX inspection command from Step 1 to see the actual model inputs, and compare keys as in the sketch below
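
A quick way to catch a key mismatch is to diff the payload keys against the model's declared inputs, along these lines (the file paths are placeholders):

```python
# Sketch: compare the JSON payload keys against the ONNX model's
# declared inputs to spot missing or extra keys. Paths are examples.
import json

import onnx

model = onnx.load("model.onnx")
expected = {inp.name for inp in model.graph.input}
sent = set(json.load(open("payload.json")).keys())

print("missing:", expected - sent)
print("extra:  ", sent - expected)
```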
"Model not found"
- Ensure model is installed:
axon list - Ensure model is registered with Core