At CoolBitX, Site Reliability Engineers (SREs) are responsible for providing our customers with stable, high-performance, and secure backend services to transact cryptocurrency and share transmittal compliance data anytime, anywhere. Given the rapid development of the cryptocurrency industry, we are seeking experienced SREs to deliver the latest features. SREs work closely with backend developers to build high-availability systems that meet or exceed the SLO. SREs also participate in an on-call rotation to respond to production incidents and perform root cause analysis with cross-functional teams. We are looking for people who not only have solid infrastructure management experience but are also passionate about blockchain technology.
- Engage in the whole life cycle of service development, including system design, deployment, operation, capacity planning, and monitoring.
- Implement security best practices.
- Maintain live services by monitoring availability, latency, and error logging.
- Improve service reliability and scalability through automation and testing.
- Maintain and improve the CI/CD pipeline.
- Troubleshoot live service issues, perform root cause analysis, and prevent incidents from happening again.
- Proactively spot problems, opportunities for improvement, and performance bottlenecks.
- Experience in building and maintaining GCP or AWS infrastructure (serverless services, VM, K8s, IAM, networking, monitoring services).
- Experience with infrastructure as code (IaC) tools, such as Terraform or CDK.
- Experience with monitoring and logging tools like Prometheus, Loki, or ELK.
- Experience with DevOps tools like Git, Cloud Build, or CodeBuild.
- Experience with DNS and CDN management using services such as Cloudflare.
- Experience administering databases and data stores such as Redis, MySQL, and DynamoDB.