Machine Learning Engineer

Codash Solutions
Noida, Uttar Pradesh

Job details

Job type

  • Full-time

Location

Noida, Uttar Pradesh

Full job description

Core Responsibilities

  • Design and implement traditional ML and LLM-based systems and applications
  • Optimize model inference performance and cost efficiency
  • Fine-tune foundation models for specific use cases and domains
  • Implement diverse prompt engineering strategies
  • Build robust backend infrastructure for AI-powered applications
  • Implement and maintain MLOps pipelines for AI lifecycle management
  • Design and implement comprehensive monitoring and evaluation systems for traditional ML models and LLMs
  • Develop automated testing frameworks for model quality and performance tracking

Required Technical Skills

LLM Expertise

  • Model Fine-tuning: Experience with parameter-efficient fine-tuning methods (LoRA, QLoRA, adapter layers)
  • Inference Optimization: Knowledge of quantization, pruning, caching strategies, and serving optimizations
  • Prompt Engineering: Prompt design, few-shot learning, chain-of-thought prompting, and retrieval-augmented generation (RAG)
  • Model Evaluation: Experience with AI evaluation frameworks and metrics for different use cases
  • Monitoring & Testing: Design of automated evaluation pipelines, A/B testing for models, and continuous monitoring systems

Backend Engineering

  • Languages: Proficiency in Python, with experience in FastAPI, Flask, or similar frameworks
  • APIs: Design and implementation of RESTful APIs and real-time systems
  • Databases: Experience with vector databases and traditional databases
  • Cloud Platforms: AWS, GCP, or Azure, with a focus on ML services

MLOps & Infrastructure

  • Deployment: Experience with model serving frameworks (vLLM, SGLang, TensorRT)
  • Containerization: Docker and Kubernetes for ML workloads
  • Monitoring: ML model monitoring, performance tracking, and alerting systems
  • Evaluation Systems: Building automated evaluation pipelines with custom metrics and benchmarks
  • CI/CD: MLOps pipelines for automated testing and deployment
  • Orchestration: Experience with workflow tools such as Airflow

Preferred Experience

  • LLM Frameworks: Hands-on experience with Transformers, LangChain, LlamaIndex, or similar
  • Monitoring Platforms: Knowledge of LLM-specific monitoring tools and general ML monitoring
  • Distributed Training and Inference: Experience with multi-GPU and distributed training and inference setups
  • Model Compression: Knowledge of techniques like distillation, quantization, and efficient architectures
  • Production Scale: Experience deploying models that meet high-throughput, low-latency requirements
  • Research Background: Familiarity with recent LLM research and ability to implement novel techniques

Tools & Technologies We Use

  • Frameworks: PyTorch, Transformers, TensorFlow
  • Serving: vLLM, TensorRT-LLM, SGLang, OpenAI API
  • Infrastructure: Kubernetes, Docker, AWS/GCP
  • Databases: PostgreSQL, Redis, Vector DBs

Application Question(s):

  • Do you have experience with parameter-efficient fine-tuning methods (LoRA, QLoRA, adapter layers)?
  • Do you have experience with model serving frameworks (vLLM, SGLang, TensorRT)?
  • Do you have 4+ years of experience in MLOps?

Work Location: Hybrid remote in Noida, Uttar Pradesh