Manager – Network Operations

NVIDIA

  • Full Time

To apply for this job, please visit nvidia.wd5.myworkdayjobs.com.

Job Title: Manager & Technical Lead – Network Support

About the Role:

NVIDIA is looking for a Manager and Technical Lead of Network Support to join our global IT Infrastructure team. In this key leadership role, you’ll oversee the operations of NVIDIA’s global network, ensuring it meets high standards of reliability, scalability, and efficiency. You’ll lead a team of network support engineers with a strong focus on data-driven operations, observability, and continuous improvement.

This role requires a hands-on leader who can translate strategic vision into measurable outcomes, build high-performing teams, and work cross-functionally with architecture, engineering, and operations groups across NVIDIA.

What You Will Be Doing:

  • Drive the health and performance of NVIDIA’s network through proactive monitoring and customer experience metrics.
  • Lead the evolution of the support model toward a more automated, data-driven SRE (Site Reliability Engineering) approach.
  • Provide strategic and technical direction to a team of network reliability experts.
  • Define and execute the vision and roadmap for network operations in collaboration with infrastructure and partner teams.
  • Develop and maintain runbooks, training programs, and operational best practices to support self-healing network design.
  • Conduct root cause analyses (RCAs) of incidents and collaborate with AI Ops to improve observability across the full stack, from network to applications.
  • Influence the architecture of both on-premises and cloud networks through operational insights and data.

What We Are Looking For:

  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience).
  • 10+ years of combined experience in network architecture, engineering, SRE, and operations, including at least 4 years in a leadership capacity.
  • Strong technical expertise in routing and switching protocols (e.g., EVPN, BGP, OSPF, VXLAN).
  • Proven experience managing globally distributed teams and aligning local execution with global standards.
  • Hands-on ability to dive deep into networking, systems, and infrastructure.
  • Strong communication and collaboration skills, with the ability to work effectively with senior leadership and technical teams.
  • Experience using tools like ServiceNow and Power BI for performance and capacity reporting.
  • Data-driven mindset with the ability to identify trends and drive scalable solutions across teams.

Bonus Points (Preferred Skills):

  • Experience operating large-scale, global networks.
  • Expertise with EVPN, MPLS-SR, Cumulus Linux, or equivalent networking technologies.
  • Knowledge of SRE principles—observability, SLIs/SLOs, alerting, logging, etc.
  • Experience with tools like NetBox/Nautobot, Prometheus, and Grafana for monitoring and automation.

Why NVIDIA:

NVIDIA is a world leader in accelerated computing and AI. You’ll be part of a team that’s building and supporting the infrastructure that powers innovation across gaming, AI, autonomous vehicles, and more. This is your opportunity to impact a global organization at scale.
