Salary: up to 250-275k SGD base
A high-frequency prop trading firm with offices worldwide is looking for a skilled Senior Site Reliability Engineer to join its High Performance Computing team, developing and supporting its large-scale compute and storage platform.
This platform is designed to solve demanding business and financial problems through computer modelling, simulation and analysis. You will be responsible for the deployment, operation and support of the HPC infrastructure (focusing on diverse, distributed on-prem and cloud storage), schedulers (e.g. HTCondor or SLURM) and the container orchestration platform (Kubernetes), as well as managing hardware and software vendor relationships.
The successful SRE will have excellent communication skills and previous exposure to at least one cloud platform.
- Solid Linux admin experience, ideally in a large-scale research infrastructure environment (scientific, financial, data analytics)
- Experience managing medium to large-scale platform environments, e.g. Kubernetes or Mesos
- Hands-on experience with at least one programming language (preferably Python)
- Degree (or equivalent) in Computer Science or related field
NB: Please don’t apply if you are a fresh graduate.
- Competitive salary + performance-based bonuses
- Generous benefits, including medical insurance and gym membership
- Collaborative and friendly environment with smart, highly engaged colleagues
- Relaxed, dress-down office culture, with breakfast, lunch and snacks provided