AI Developer

Salvo Software Mexico

About Salvo Software

Salvo Software is a global technology company specializing in custom software development and advanced engineering solutions. With distributed teams across the US, LATAM, and India, we partner with clients to build high-performance, scalable systems that solve complex technical challenges. Our culture values innovation, ownership, and engineering excellence. We’re growing our AI capabilities and are looking for a backend-focused AI Developer to join our team.

Role Description

We are seeking a highly skilled AI Developer with a strong backend and machine learning engineering background to design, train, optimize, and deploy LLMs in on-prem and offline environments. This role is deeply technical and hands-on, requiring expertise across Python ML stacks, model optimization, local inference frameworks, and DevOps workflows tailored for offline systems. You will work closely with our engineering and product teams to build end-to-end LLM pipelines, including data preprocessing, supervised fine-tuning, model quantization, evaluation, and deployment on local or air-gapped infrastructure. If you enjoy working with cutting-edge open-source LLMs, optimizing models for constrained environments, and building reliable backend pipelines, this role is for you.

Responsibilities

Core LLM Development
• Train and fine-tune LLMs using supervised fine-tuning (SFT).
• Work with open-source models such as LLaMA, Mistral, Qwen, and similar architectures.
• Build LoRA / Q-LoRA pipelines for efficient fine-tuning.
• Implement and optimize data preprocessing workflows, including tokenization and long-context handling.
• Use and extend Hugging Face Transformers & Datasets for training and inference.
• Parse and process structured and semi-structured data, including XML/XSD files.
• Implement document parsing solutions for Office formats (python-docx, OpenXML).
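To give a flavor of the structured-data preprocessing work in this role, here is a minimal sketch of turning an XML payload into flat records using Python's standard-library `xml.etree.ElementTree`. The document shape, element names, and attributes below are hypothetical; real inputs would additionally be validated against their XSD schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload; real data would be validated against an XSD
# (e.g. with a schema-validation library) before parsing.
XML_DOC = """
<documents>
    <document id="d1" lang="en">
        <title>Quarterly Report</title>
        <body>Revenue grew in Q3.</body>
    </document>
    <document id="d2" lang="es">
        <title>Informe</title>
        <body>Resumen breve.</body>
    </document>
</documents>
"""

def extract_records(xml_text: str) -> list[dict]:
    """Parse XML into flat dicts suitable for an LLM preprocessing pipeline."""
    root = ET.fromstring(xml_text)
    records = []
    for doc in root.findall("document"):
        records.append({
            "id": doc.get("id"),
            "lang": doc.get("lang"),
            "title": doc.findtext("title", default="").strip(),
            "body": doc.findtext("body", default="").strip(),
        })
    return records

records = extract_records(XML_DOC)
```

The same record-per-element pattern extends naturally to Office formats: python-docx exposes paragraphs and tables that can be flattened into the same kind of dicts before tokenization.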

Offline / On-Prem Model Expertise
• Deploy, run, and maintain models fully offline and in air-gapped environments.
• Perform model optimization and quantization (GGUF, GPTQ, AWQ, bitsandbytes).
• Build and maintain inference systems using frameworks like vLLM, TGI, and Ollama.
• Optimize GPU usage (CUDA, cuDNN, VRAM-aware batching).
• Maintain local CI/CD pipelines for ML models without cloud dependencies.
• Manage local model registries, versioning, and artifacts.
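The VRAM-aware batching mentioned above comes down to capacity arithmetic: how many sequences fit on the card once the model weights and a safety margin are accounted for. The sketch below illustrates the idea; all memory figures are assumptions for illustration, not measurements of any particular model.

```python
def max_batch_size(vram_gb: float,
                   model_gb: float,
                   per_seq_mb: float,
                   headroom_gb: float = 1.0) -> int:
    """Estimate how many sequences fit in GPU memory at once.

    vram_gb:     total VRAM on the device
    model_gb:    memory occupied by the (possibly quantized) model weights
    per_seq_mb:  KV-cache + activation cost per sequence (assumed constant)
    headroom_gb: safety margin for CUDA context and fragmentation
    """
    free_mb = (vram_gb - model_gb - headroom_gb) * 1024
    return max(0, int(free_mb // per_seq_mb))

# Illustrative numbers: 24 GB card, 5 GB 4-bit quantized model,
# ~250 MB of KV cache per sequence at the target context length.
batch = max_batch_size(vram_gb=24, model_gb=5, per_seq_mb=250)
```

In practice, per-sequence cost grows with context length, so serving frameworks such as vLLM manage the KV cache dynamically rather than assuming a fixed per-sequence budget; this static estimate is only a starting point.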

Backend & DevOps
• Build backend services in Python for ML training and inference workflows.
• Work with relational databases (Postgres/MySQL).
• Use Docker and Git for reliable development and deployment pipelines.
• Use Azure DevOps for CI/CD (including local runners when applicable).
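Tying the database and artifact-management points together, here is a minimal sketch of a local model registry: each artifact is keyed by a content hash and an auto-incrementing version per model name. SQLite is used so the example is self-contained; in this role the backing store would typically be Postgres or MySQL, and the table and column names here are assumptions for illustration.

```python
import hashlib
import sqlite3

def init_registry(conn: sqlite3.Connection) -> None:
    """Create the registry table if it does not already exist."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS models (
            name    TEXT NOT NULL,
            version INTEGER NOT NULL,
            sha256  TEXT NOT NULL,
            PRIMARY KEY (name, version)
        )
    """)

def register_model(conn: sqlite3.Connection, name: str, artifact: bytes) -> int:
    """Record a new model version, identified by a content hash of its artifact."""
    digest = hashlib.sha256(artifact).hexdigest()
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM models WHERE name = ?", (name,)
    ).fetchone()
    version = row[0] + 1
    conn.execute(
        "INSERT INTO models (name, version, sha256) VALUES (?, ?, ?)",
        (name, version, digest),
    )
    return version

# Usage with an in-memory database and fake artifact bytes.
conn = sqlite3.connect(":memory:")
init_registry(conn)
v1 = register_model(conn, "llama-ft", b"fake-gguf-bytes-v1")
v2 = register_model(conn, "llama-ft", b"fake-gguf-bytes-v2")
```

Storing the hash alongside the version lets an offline CI/CD pipeline verify that a deployed artifact matches the registered one without any cloud dependency.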