Location: Santa Clara, CA, USA
Product Manager – DGX Systems (Serviceability Focus)
NVIDIA is the defining technology company of the AI era. Our groundbreaking innovations in AI, accelerated computing, and full-stack solutions are redefining what's possible, and it's all driven by exceptional people. We are looking for a Product Manager to lead the Serviceability lifecycle for Enterprise AI factory platforms, including DGX systems and DGX SuperPOD.
In this role, you'll define product requirements, guide hardware/software integration across compute, networking, and storage, and lead execution from concept through deployment in customer environments.
What You'll Be Doing:
- Own and manage the serviceability aspects of the DGX product lifecycle, from development to obsolescence.
- Define product requirements and user stories by synthesizing feedback from customers, solution architects, and internal teams.
- Create technical content and collateral, including service guides, installation procedures, tools, and system labeling.
- Drive improvements to system packaging for ease of installation and secure shipping.
- Collaborate with engineering to enhance customer experiences in system deployment and firmware lifecycle management.
- Conduct and refine the Out-of-Box Experience (OOBE).
What We're Looking For:
- Strong understanding of system architectures, especially accelerated computing and networking.
- Experience with field service procedures and hands-on management of system components.
- Familiarity with NVIDIA tools and ecosystems (e.g., CUDA, DCGM, UFM).
- Practical experience in data center deployment, operation, and solution management.
- Proficiency in 3D design tools, such as CAD and rendering software.
- Demonstrated problem-solving ability and comfort navigating technical trade-offs.
- 12+ years of experience in high-tech, AI/ML, cloud, or infrastructure product roles.
- Bachelor's or Master's degree in Engineering, Computer Science, or a related technical field. MBA is a plus.
Preferred Qualifications (Nice to Have):
- Experience launching 0-to-1 products.
- Exposure to data center technologies, including server architecture and liquid cooling.
- Experience in large-scale system deployment, high-availability infrastructure, and clustered computing.
