Manager, Site Reliability Engineering

Splash

This job is no longer accepting applications


Software Engineering

India · Gurugram, Haryana, India

Posted 6+ months ago

Overview:

Cvent is a leading meetings, events and hospitality technology provider with more than 4,800 employees and nearly 22,000 customers worldwide. Founded in 1999, the company delivers a comprehensive event marketing and management platform for event professionals and offers software solutions to hotels, special event venues and destinations to help them grow their group/MICE and corporate travel business.

The DNA of Cvent is our people, and our culture emphasizes fostering intrapreneurship, a system that encourages Cventers to think and act like individual entrepreneurs and empowers them to take action, embrace risk, and make decisions as if they had founded the company themselves. We foster an environment that promotes agility, which means we don’t have the luxury to wait for perfection. At Cvent, we value the diverse perspectives that each individual brings. Whether working with a team of colleagues or with clients, we ensure that we foster a culture that celebrates differences and builds on shared connections.

About the role:

Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems, along with strong leadership skills, this is a great fit for you.

As Manager, SRE, you will demonstrate both emerging and current technologies, methods, and processes that contribute to the evolution of software deployment processes, enhance security, reduce risk, and improve the overall end-user experience. As part of the Technology R&D Team, you will play an integral part in advancing DevOps maturity and be a part of a new culture of quality and site reliability. You will continually improve our CI/CD tools, processes, and procedures. You will also be responsible for regular reporting to Senior Technology Leaders and providing updates on organizational risk exposure and risk-related issues.

In This Role, You Will:

Set the direction and strategy for your team, and help shape the overall SRE program for the company
Support the company's growth by ensuring a robust, scalable, cloud-first infrastructure
Own site stability, performance and capacity planning
Participate early in the SDLC to ensure reliability is built in from the beginning, and create plans for successful implementations/launches
Foster a learning and ownership culture within the team and the larger Cvent organization
Ensure best engineering practices through automation, infrastructure as code, robust system monitoring, alerting, auto scaling, self-healing, etc.
Manage complex technical projects and a team of SREs
Recruit and develop staff; build a culture of excellence in site reliability and automation
Lead by example – roll up your sleeves by debugging and coding; participate in on-call rotation & occasional travel
Represent the technology perspective and priorities to leadership and other stakeholders by continuously communicating timeline, scope, risks, and technical road map

Here's What You Need:

10+ years of hands-on technical leadership and people management experience
3+ years of demonstrable experience leading site reliability and performance in large-scale, high-traffic environments
Strong leadership, communication and interpersonal skills geared to getting things done
Commitment to developing yourself and the talent within your charge, fostering and creating opportunities for the team
Architect-level understanding of one or more of the major public cloud services (AWS, GCP or Azure), using them to effectively design secure and scalable services
Strong understanding of SRE concepts and the DevOps culture, with a focus on leveraging software engineering tools, methodologies and concepts
In-depth understanding of automation and CI/CD processes, along with excellent reasoning and problem-solving skills
Experience with Unix/Linux environments with a deep grasp of system internals
Experience working on large-scale distributed systems, including multi-tiered architectures
Strong knowledge of modern platforms like Fargate, Docker, Kubernetes, etc.
Experience working with monitoring tools (Datadog, NewRelic, ELK stack, etc.) and database technologies (SQL Server, Postgres and Couchbase preferred)
Demonstrated breadth of understanding and experience developing solutions based on multiple technologies, including networking, cloud, database, and scripting languages
Experience in prompt engineering, building AI Agents, or MCP is a plus.
