✨ Latest Release

Universal Model Inference is Here!

MLOS now supports universal ONNX conversion and multi-type tensor inference across all major ML repositories

🚀 Major Enhancements

Axon: Universal Conversion

Multi-framework ONNX conversion with repository-specific strategies

  • Hugging Face (GPT-2, BERT, T5)
  • PyTorch Hub (ResNet, VGG)
  • TensorFlow (SavedModel, Keras)
  • ModelScope (Multimodal)
  • Auto-optimization
```bash
# One command, any model!
axon install hf/gpt2@latest
axon install pytorch/resnet50@latest
axon install tfhub/bert@latest
```

🧠 MLOS Core: Multi-Type Tensors

Enhanced ONNX plugin with comprehensive tensor type support

  • int64 for NLP token IDs
  • float32 for vision models
  • int32 for TensorFlow
  • bool for attention masks
  • Multi-input models (BERT)
  • Named inputs parsing
```bash
# Multi-input inference
curl -X POST /models/bert/inference \
  -d '{"input_ids": [101, 7592, 102], "attention_mask": [1, 1, 1]}'
```

🔗 Seamless Integration

Complete E2E workflow from any repository to kernel-level inference

  • Zero API changes
  • Backward compatible
  • ~2-8ms inference
  • Dynamic shapes
  • Automatic type detection
  • Multi-strategy fallbacks
```bash
# Complete workflow
axon install hf/gpt2@latest
axon register hf/gpt2@latest
# Ready for inference!
```
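
Because shapes are dynamic, a registered model accepts variable-length inputs without reconversion or re-registration. A minimal sketch, assuming the inference route mirrors the BERT example above and reusing the GPT-2 token IDs shown on this page:

```bash
# Same registered model, two different sequence lengths
curl -X POST /models/gpt2/inference -d '{"input_ids": [15496, 11]}'
curl -X POST /models/gpt2/inference \
  -d '{"input_ids": [15496, 11, 337, 43, 48, 2640, 0]}'
```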

📐 End-to-End Architecture

[Diagram: MLOS E2E Architecture]

📊 By the Numbers

| Metric | Value |
|---|---|
| Data types supported | 4 |
| Model repositories | 4 |
| Inference time | ~2-8ms |
| Models available | 100K+ |
| API changes | 0 |
| Backward compatible | 100% |

🔬 Technical Highlights

Repository-Specific Conversion Strategies

Axon now intelligently routes models to the best converter for their source repository:

  • Hugging Face: optimum → torch.onnx.export → transformers
  • PyTorch Hub: TorchScript → torch → torchvision
  • TensorFlow Hub: SavedModel → Keras H5 → tf2onnx
  • ModelScope: Auto-detect → Framework-specific
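
For a sense of what these strategy chains automate, here are rough manual equivalents using the underlying exporters. The tools and flags below are real (optimum-cli, torch.onnx.export, tf2onnx); how Axon sequences them internally is inferred from the list above, not quoted from its source:

```bash
# Hugging Face: Optimum's ONNX exporter
optimum-cli export onnx --model distilgpt2 distilgpt2-onnx/

# PyTorch Hub: torch.onnx.export on a torchvision model
python - <<'EOF'
import torch, torchvision
model = torchvision.models.resnet50(weights=None).eval()  # random weights; fine for a conversion sketch
dummy = torch.randn(1, 3, 224, 224)                       # NCHW image batch
torch.onnx.export(model, dummy, "resnet50.onnx")
EOF

# TensorFlow Hub: tf2onnx on a SavedModel directory
python -m tf2onnx.convert --saved-model ./bert_savedmodel --output bert.onnx
```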

Enhanced Tensor Parsing

MLOS Core plugin now parses JSON inputs with full type support:

```jsonc
// Single input (GPT-2)
{"input_ids": [15496, 11, 337, 43, 48, 2640, 0]}

// Multi-input (BERT)
{
  "input_ids": [101, 7592, 1010, 1045, 2572, 102],
  "attention_mask": [1, 1, 1, 1, 1, 1],
  "token_type_ids": [0, 0, 0, 0, 0, 0]
}
```

Zero-Cost Abstraction

All of these enhancements flow through the existing generic `void*` API, evidence that the abstraction was designed right from the start. No breaking changes, just more capabilities!
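
Concretely, "zero API changes" means a client written against the pre-release float32 path keeps working unchanged, and the new typed payloads travel over the same endpoint shape. A hedged sketch (the resnet50 route and flat pixel payload are assumptions for illustration, not from the docs):

```bash
# Pre-existing float32 client call: unchanged by this release
curl -X POST /models/resnet50/inference \
  -d '{"input": [0.12, 0.34, 0.56, 0.78]}'

# New int64 multi-input call: same endpoint shape, richer payload
curl -X POST /models/bert/inference \
  -d '{"input_ids": [101, 7592, 102], "attention_mask": [1, 1, 1]}'
```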

✅ Tested & Verified

🤗 NLP Models

  • GPT-2 (DistilGPT-2)
  • BERT (base-uncased)
  • RoBERTa
  • T5

Status: ✅ Passing

🔥 Vision Models

  • ResNet (50, 101, 152)
  • VGG (16, 19)
  • AlexNet
  • ViT (coming soon)

Status: ⏳ Ready (not tested)

🎨 Multi-Modal

  • CLIP (text + image)
  • Wav2Vec2 (audio)
  • ModelScope models

Status: ⏳ Ready (not tested)

Ready to Try MLOS?

Start running models from any repository with kernel-level performance today

View on GitHub →