Our review
This skill helps migrate LLM training code from frameworks like verl, TRL, or OpenRLHF to MinT.
Strengths
- Provides detailed concept mappings between source frameworks and MinT
- Includes before/after code examples for common migration patterns
- Warns about common pitfalls like token alignment and LoRA learning rates
Limitations
- Requires the user to already be familiar with the source framework
- Does not cover all possible frameworks (e.g., DeepSpeed, NeMo)
Use this skill when you need to adapt an existing LLM training pipeline to the MinT infrastructure.
Do not use it if you are starting from scratch with no pre-existing code to migrate.
Security analysis
Safe: The skill provides pure guidance for code migration, with no executable tools, network requests, or destructive operations. It does not instruct the AI to run shell commands, alter system files, or exfiltrate data. The Python snippets are illustrative and not harmful.
No concerns found
Examples
- I want to migrate my TRL PPO training script to MinT. My code uses PPOTrainer, AutoModelForCausalLM, and a HuggingFace dataset. Can you show me the mapping and a before/after code example?
- I have a verl pipeline with RolloutWorker and PPOTrainer. How do I convert it to use MinT's SamplingClient and TrainingClient? Give a concrete example.
- In my current code I use a custom DPO loss. How do I do the same thing with forward_backward_custom() in MinT?
description: Help migrate LLM training code from verl, TRL, or similar frameworks to MinT
argument-hint: [framework_name or migration_question]
MinT Migration Assistant
Help users migrate their LLM training code to MinT from verl, TRL, OpenRLHF, or similar frameworks.
Instructions
When invoked:
- Identify the source framework (verl, TRL, OpenRLHF, custom PyTorch)
- Map concepts from source framework to MinT equivalents
- Provide before/after code examples for the specific migration pattern
- Highlight key differences (distributed training, checkpointing, loss functions)
- Warn about common pitfalls (token alignment, learning rate scaling, async patterns)
Reference
Read mint_api_reference.txt in this directory for complete MinT API documentation.
Concept Mapping
verl to MinT
| verl | MinT |
|------|------|
| RolloutWorker | SamplingClient |
| ActorRolloutRefWorker | TrainingClient + SamplingClient |
| RewardManager | User-defined reward function |
| PPOTrainer | forward_backward(loss_fn="ppo") + optim_step() |
| DataProto | types.Datum |
| vllm backend | MinT handles inference internally |
| fsdp/megatron sharding | MinT handles distributed training internally |
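The RewardManager row above maps to a user-defined reward function. As a minimal sketch of what such a function might look like in plain Python: the `(completion, reference)` signature, the exact-match rule, and the helper names are all assumptions made for illustration, not part of the MinT API.

```python
from typing import Callable, List

# Hypothetical signature for illustration: (completion, reference) -> scalar reward.
RewardFn = Callable[[str, str], float]


def _normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial formatting differences do not matter."""
    return " ".join(text.lower().split())


def exact_match_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the completion equals the reference after normalisation, else 0.0."""
    return 1.0 if _normalise(completion) == _normalise(reference) else 0.0


def score_batch(reward_fn: RewardFn, completions: List[str], references: List[str]) -> List[float]:
    """Score sampled completions one by one; raises if the two lists differ in length."""
    if len(completions) != len(references):
        raise ValueError("completions and references must have the same length")
    return [reward_fn(c, r) for c, r in zip(completions, references)]


# Two sampled completions scored against their reference answers.
rewards = score_batch(exact_match_reward, ["  Paris ", "Lyon"], ["paris", "Paris"])
```

Keeping the reward as an ordinary function makes it easy to unit-test before wiring it into a sampling and training loop.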
TRL to MinT
| TRL | MinT |
|-----|------|
| SFTTrainer | forward_backward(loss_fn="cross_entropy") loop |
| PPOTrainer | forward_backward(loss_fn="ppo") loop |
| DPOTrainer | forward_backward_custom() with DPO loss |
| AutoModelForCausalLM | service_client.create_lora_training_client() |
| HuggingFace dataset | Convert to list[types.Datum] |
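The PPOTrainer and DPOTrainer rows refer to two objectives that are easy to confuse during a migration. The sketch below writes both in their standard textbook form using only the Python standard library. It is not MinT code: the function names, `clip_eps`, and `beta` defaults are assumptions, MinT's built-in "ppo" loss may differ in detail, and how a custom loss is passed to forward_backward_custom() should be checked against mint_api_reference.txt.

```python
import math


def ppo_clipped_objective(new_logp: float, old_logp: float, advantage: float,
                          clip_eps: float = 0.2) -> float:
    """Textbook PPO surrogate for one token: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    ratio = math.exp(new_logp - old_logp)  # importance ratio between new and old policy
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair: -log sigmoid(beta * margin)."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (policy_rejected_logp - ref_rejected_logp)
    # -log(sigmoid(z)) equals softplus(-z).
    return _softplus(-beta * margin)
```

The PPO objective is maximised (negate it to obtain a loss), while the DPO value is already a loss to minimise. The inputs to `dpo_loss` are sequence-level log-probabilities, so a migration needs log-probabilities from both the policy and a frozen reference model.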
Key Differences
- Distributed training: MinT handles sharding server-side. No FSDP/Megatron config needed.
- Model loading: Specify the model by name, not by local path.
      training_client = service_client.create_lora_training_client(base_model="Qwen/Qwen3-8B")
- Inference: Built-in SamplingClient instead of vLLM workers.
      sampling_client = training_client.save_weights_and_get_sampling_client(name="...")
- Checkpointing: Named checkpoints with save_state() / save_weights_for_sampler().
- Loss functions: String selector for built-in losses.
  - "cross_entropy" - SFT
  - "importance_sampling" - Basic policy gradient
  - "ppo" - Clipped policy gradient
  - "cispo" - Clipped importance sampling
  - "dro" - Direct reward optimization
Common Pitfalls
- Async pattern: Always call .result() on futures.
      training_client.forward_backward(data, loss_fn).result()
      training_client.optim_step(params).result()
- Token alignment: Next-token prediction format.
      input_tokens = all_tokens[:-1]
      target_tokens = all_tokens[1:]
      weights = weights[1:]  # Aligned with targets
- LoRA learning rate: 20-100x higher than full fine-tuning.
- Gradient accumulation: Multiple forward_backward calls before a single optim_step.
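The token-alignment, async, and gradient-accumulation pitfalls above can be rehearsed without a MinT connection. The sketch below uses a stand-in client built on `concurrent.futures.Future`. `StubTrainingClient`, `align_next_token`, `accumulate_and_step`, and the `{"learning_rate": ...}` argument are hypothetical names invented for this example. Only the method names forward_backward and optim_step and the .result() pattern come from the text above, and the real return values will differ.

```python
from concurrent.futures import Future
from typing import Any, Dict, List, Sequence, Tuple


def align_next_token(all_tokens: Sequence[int],
                     weights: Sequence[float]) -> Tuple[list, list, list]:
    """Shift tokens by one so position i of the input predicts token i + 1."""
    if len(all_tokens) != len(weights):
        raise ValueError("all_tokens and weights must have the same length")
    if len(all_tokens) < 2:
        raise ValueError("need at least two tokens to form an input/target pair")
    input_tokens = list(all_tokens[:-1])
    target_tokens = list(all_tokens[1:])
    target_weights = list(weights[1:])  # aligned with targets, not inputs
    return input_tokens, target_tokens, target_weights


class StubTrainingClient:
    """Stand-in for illustration only; NOT the MinT client."""

    def __init__(self) -> None:
        self.pending_examples = 0  # examples whose gradients are accumulated
        self.steps = 0

    def forward_backward(self, data: List[Any], loss_fn: str) -> Future:
        future: Future = Future()
        self.pending_examples += len(data)
        future.set_result({"loss_fn": loss_fn, "num_examples": len(data)})
        return future

    def optim_step(self, params: Dict[str, float]) -> Future:
        future: Future = Future()
        applied, self.pending_examples = self.pending_examples, 0
        self.steps += 1
        future.set_result({"examples_applied": applied})
        return future


def accumulate_and_step(client, micro_batches, loss_fn, params):
    """Several forward_backward calls, then one optim_step; every future is resolved."""
    for batch in micro_batches:
        client.forward_backward(batch, loss_fn).result()
    return client.optim_step(params).result()


inputs, targets, target_weights = align_next_token([10, 11, 12, 13], [0, 0, 1, 1])
client = StubTrainingClient()
summary = accumulate_and_step(client, [["a", "b"], ["c"]], "cross_entropy",
                              {"learning_rate": 1e-4})
```

In the alignment example the first two tokens act as the prompt (weight 0), so after shifting only the last two targets carry weight.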