We are looking for an experienced Lead, Cloud Engineering to drive the design, automation, and reliability of our cloud infrastructure across global fintech operations.
You will lead a distributed team of cloud engineers responsible for ensuring our Google Cloud Platform (GCP) infrastructure is secure, scalable, and resilient — powering critical financial services and high-volume payment systems.
This role combines hands-on technical leadership with strategic oversight. You’ll work closely with backend, frontend, security, and SRE teams to strengthen our cloud foundations, optimize CI/CD processes, and ensure world-class uptime, compliance, and observability.
Responsibilities:
Cloud Infrastructure Design and Deployment:
- Design, deploy, and maintain cloud infrastructure solutions that follow best practices, industry standards, and security guidelines.
- Collaborate with cross-functional teams, including development, operations, and security, to assess requirements, design scalable cloud architectures, and resolve issues.
Monitoring and Performance Optimization:
- Implement and maintain monitoring solutions to track the performance, availability, and security of cloud infrastructure and applications.
- Proactively identify and address performance bottlenecks, scalability issues, and security vulnerabilities.
- Troubleshoot and resolve issues related to cloud infrastructure, network connectivity, and application deployments.
- Implement automation and orchestration techniques to streamline cloud operations and improve efficiency.
- Continuously evaluate and recommend improvements to optimize the cloud infrastructure and enhance system performance.
Infrastructure as Code (IaC):
- Use IaC and automation tools such as Terraform, CloudFormation, or Ansible to streamline the provisioning, configuration, and deployment of cloud resources.
- Develop and maintain reusable templates and scripts for provisioning and configuring cloud resources.
- Implement version control and change management practices for infrastructure code.
Security and Compliance:
- Implement robust security measures to protect cloud environments, data, and applications.
- Monitor and respond to security incidents, perform vulnerability assessments, and implement necessary remediation measures.
- Ensure compliance with industry regulations and standards, such as PCI DSS.
Collaboration and Documentation:
- Document infrastructure designs, technical processes, deployment processes, and operational procedures.
- Communicate complex concepts and ideas clearly, both verbally and in writing.
- Work effectively with cross-functional teams across engineering functions.
- Contribute to knowledge sharing initiatives and provide training to team members on DevOps and cloud technologies.
Requirements:
Education and Experience:
- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent work experience).
- Minimum of 5 years of experience as a Cloud Engineer or in a similar role, with a focus on automation and on designing and managing cloud infrastructure.
- Hands-on experience with public cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP).
Technical Skills:
- Strong understanding of cloud services, including compute, storage, networking, and security.
- Solid understanding of Jenkins, Git, and continuous integration and continuous delivery (CI/CD).
- Proficiency with monitoring tools (e.g., Prometheus, ELK stack).
- Familiarity with infrastructure-as-code (IaC) tools (e.g., Terraform, Pulumi, CloudFormation) and configuration management tools (e.g., Ansible, Puppet, Chef).
- Knowledge of scripting and programming languages, such as Go, Python, Bash, or PowerShell.
- Strong understanding of virtualization technologies, containerization (e.g., Docker, Docker Compose, Kubernetes), and microservices architecture.
- Knowledge of Apache Kafka.
- Knowledge of networking concepts, protocols (e.g., TCP/IP, DNS), and network security.
- Experience with web servers such as Nginx and Apache.
Analytical and Problem-Solving Skills:
- Ability to analyze complex problems, propose effective solutions, and implement them in a timely manner.
- Strong troubleshooting and debugging skills to identify and resolve issues in cloud environments.
- Capacity to anticipate potential bottlenecks, performance issues, and security vulnerabilities and proactively address them.
Collaboration and Communication:
- Excellent teamwork and collaboration skills to work effectively with cross-functional teams.
- Strong verbal and written communication skills to articulate complex concepts and ideas clearly.
- Ability to document technical processes, system designs, and operational procedures effectively.