Senior Site Reliability Engineer
We’re transforming the software industry. We’re Flexera. With more than 50,000 customers across the world, we’re achieving that goal. But we know we can’t do any of that without our team. Ready to help us re-imagine the industry during a time of substantial growth and ambitious plans? Come and see why we’re consistently recognized by Gartner, Forrester and IDC as a category leader in the marketplace.
Flexera delivers Technology Value Optimization solutions that enable some of the largest companies in the world to inform their IT so they can transform their IT. From on-prem to the cloud, companies can get the IT asset data needed to rightsize, reallocate spend, reduce risk and maximize ROI.
We are seeking a highly skilled and experienced Senior Site Reliability Engineer to join our team and play a crucial part in designing, building, and managing Flexera’s next-generation Kubernetes platform in AWS using Amazon Elastic Kubernetes Service (EKS). Our goal is to establish a strong Kubernetes infrastructure comprising dozens to hundreds of clusters that a small team of Site Reliability Engineers can manage easily. The ideal candidate brings hands-on expertise, gained in previous roles, with the modern Kubernetes tools and technologies used to create and manage a Kubernetes platform and its associated CI/CD pipeline.
Cluster Management: Create automation that builds, configures, and manages Kubernetes clusters within the platform, ensuring they meet performance and security standards.
Integration of Modern Tools: Implement and integrate modern Kubernetes and CI/CD tools and technologies as needed to streamline deployment, scaling, management, and monitoring of cluster health and microservices.
AWS Integration: Work closely with AWS services and resources to optimize Kubernetes clusters for performance and cost-effectiveness.
Security and Compliance: Manage and enhance the security of our Kubernetes infrastructure and maintain compliance with internal standards.
Automation and CI/CD: Implement automation in deployment, updates, and scaling processes, and integrate CI/CD pipelines with Kubernetes.
Monitoring and Troubleshooting: Set up monitoring, logging, and alerting systems for Kubernetes clusters and provide expertise in troubleshooting issues. Be part of an on-call rotation supporting our mission-critical production infrastructure.
Documentation: Maintain clear and comprehensive documentation of the Kubernetes infrastructure, configurations, and processes for the benefit of the SRE team.
Collaboration: Collaborate with cross-functional teams to understand application requirements and provide support to developers.
Hands-on Kubernetes Experience: Proven track record of deploying and managing multiple Kubernetes clusters in AWS running on EKS.
Kubernetes Networking Concepts: Knowledge of Kubernetes networking concepts, including pods, services, ingress controllers, and network policies. Proficiency in AWS networking services, such as Amazon VPC (Virtual Private Cloud), Route 53, ELB/ALB (Elastic Load Balancing/Application Load Balancer), and Network ACLs, as well as the gRPC protocol, is a must.
Automation Skills: Extensive hands-on experience from previous roles with infrastructure as code (IaC) and the tools used to automate Kubernetes infrastructure in AWS. This includes experience creating Terraform modules, Helm charts, and Kubernetes manifests from scratch.
Tool Proficiency: Proficiency with modern Kubernetes tools such as Amazon EKS, AWS CloudFormation, eksctl/kubectl, kubeconfig, Helm, Karpenter, and other related technologies.
AWS Expertise: Deep knowledge of AWS services and their integration with Kubernetes as demonstrated by hands-on expertise from previous jobs.
CI/CD Design and Automation: An understanding of CI/CD best practices and experience building easy-to-use CI/CD pipelines that leverage modern GitOps tools such as ArgoCD and/or Flux to deploy Helm charts.
GitOps Expertise: Proficiency in using Git for version control and a solid understanding of Git workflows, branching strategies, and repository management.
Security Focus: Understanding of Kubernetes security best practices and compliance requirements including role-based access controls (RBAC) and zero-trust service design models.
Problem-Solving: Excellent problem-solving skills with the ability to diagnose and resolve complex Kubernetes issues.
Documentation: Ability to maintain clear and organized documentation for reference.
Certifications: Relevant certifications in Kubernetes and AWS are a plus.
Education and Experience
- Bachelor's or higher degree in Computer Science, Information Technology, or a related field.
- At least 3 years of hands-on job experience in building and managing Kubernetes clusters in AWS.
- At least 5 years of hands-on job experience managing services in AWS.
- Bangalore-based candidates would need to be available for 1.5 hours in the evening twice a week, Monday through Thursday, for meetings with US-based staff:
  - March through October: 8:30 PM – 10:00 PM IST
  - November through March: 8:00 PM – 9:30 PM IST
- Candidates can flex their hours to cover after-hours activities.
Flexera is proud to be an equal opportunity employer. Qualified applicants will be considered for open roles regardless of age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by local/national laws, policies and/or regulations.
Flexera understands the value that results from employing a diverse, equitable, and inclusive workforce. We recognize that equity necessitates acknowledging past exclusion and that inclusion requires intentional effort. Our DEI (Diversity, Equity, and Inclusion) council is the driving force behind our commitment to championing policies and practices that foster a welcoming environment for all.