ML / AI Engineer

Join a world-leading quantitative trading fund as it expands its next-generation machine learning research platform. You'll shape core infrastructure, partner closely with researchers, and drive high-impact engineering across large-scale, GPU-accelerated workloads.

Selby Jennings - Hong Kong - Full time

Salary: Negotiable

Responsibilities

Lead the design and development of a scalable, reliable, and reproducible machine learning research platform.
Build infrastructure to support large-scale experimentation, model training, and simulation across both on-premise high-performance compute environments and multi-cloud setups.
Work closely with researchers to understand evolving workflows and translate those needs into robust platform capabilities.
Architect and optimize distributed training pipelines for high-throughput, GPU-accelerated workloads.
Enhance experiment management, model versioning, artifact tracking, and data lineage to ensure transparent and repeatable research processes.
Develop tools and frameworks that improve feature engineering, dataset creation, and large-scale backtesting.
Drive initiatives to improve compute efficiency, resource allocation, and workload isolation across heterogeneous environments.
Enhance platform observability with improved metrics, logging, tracing, and debugging capabilities tailored to ML and distributed systems.
Support rapid iteration by delivering features and fixes quickly while maintaining strong engineering standards.
Contribute to long-term architectural planning to ensure the platform scales with growing data volumes and model complexity.

Qualifications

2+ years of experience designing and building distributed systems at scale, ideally supporting research or data-heavy workloads.
Strong programming skills in Python with a focus on clean, maintainable, high-performance code.
Experience running applications on Linux-based HPC clusters and/or cloud computing platforms.
Solid understanding of distributed computing, parallel processing, and resource management.
Hands-on experience with GPU workloads and familiarity with modern ML frameworks such as PyTorch, TensorFlow, or JAX.
Experience optimizing data pipelines and handling large structured and unstructured datasets.
Strong debugging skills with the ability to diagnose issues across multiple layers of the stack.
Comfortable working independently in a fast-paced, research-oriented environment.
Strong communication skills and experience collaborating directly with researchers or data-focused teams.

Preferred Attributes

Experience building internal ML platforms or research tooling at scale.
Familiarity with experiment-tracking tools, workflow orchestration systems, and model lifecycle management.
Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
Exposure to high-performance or latency-sensitive domains such as quantitative research, simulation systems, or large-scale distributed compute.

23975008
