Responsibilities
- Run, automate, and enhance the day-to-day operation of the AI Platform and Data Lakehouse, including deployment, monitoring, logging, and lifecycle management.
- Contribute to the design and development of AI platform components and services, supporting proprietary, open-source, and in-house applications.
- Develop backend services and APIs to integrate internal systems, external APIs, MCPs, and AI/LLM services.
- Support the deployment and integration of LLM-based solutions, including chatbots, RAG pipelines, and agent-based workflows in production applications.
- Work closely with AI Engineers and Data Engineers to enable data pipelines, model serving, orchestration, monitoring, and retraining workflows.
- Design and validate platform proof-of-concepts (PoCs), performance tests, and acceptance tests for AI, data, and LLM technologies.
- Evaluate new tools, frameworks, and technologies to improve performance, scalability, security, and maintainability of the AI platform.
- Provide clear visibility into platform structure, health, and performance through dashboards, logs, and operational metrics.
- Collaborate with Architects, Platform Engineers, and Product Owners in a continuous delivery environment.
- Independently manage assigned engineering deliverables while aligning with project timelines and stakeholder expectations.
Requirements
- Bachelor's degree in Computer Science, Software Engineering, Data Science, or a related field.
- 4+ years of hands-on experience in platform engineering, application development, or data/AI platform delivery.
- Strong programming and scripting skills in Python, Bash, and SQL.
- Hands-on experience working with Large Language Models (LLMs), including usage, deployment, or integration into applications.
- Practical experience with LLM-based systems, such as chatbots, knowledge assistants, or workflow-driven AI applications.
- Solid experience with Docker, Kubernetes, Terraform, and Helm for infrastructure and deployment.
- Experience designing and operating CI/CD pipelines using tools such as Jenkins, Git, and Ansible.
- Proven experience delivering AI platforms or data platforms in large-scale enterprise environments.
- Strong understanding of DataOps and MLOps, including model versioning, deployment, monitoring, and lifecycle management.
- Experience with public cloud platforms (AWS, Azure, and/or AliCloud); relevant certifications are an advantage.
If you're interested in this role, please forward your latest resume to cheryl.ng@hays.com.hk or contact Cheryl Ng at +852 2101 0081.