Sustainable Talent is partnering with NVIDIA on the search for a Senior DevOps and Infrastructure Engineer to support the Cloud Infrastructure Team within NVIDIA's IPP (Infrastructure, Planning and Process) organization. IPP is a global organization within NVIDIA that works with groups such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence and Driverless Cars to meet their infrastructure needs. Its cloud services run almost half a million automated jobs per day on thousands of servers, supporting the productivity of thousands of NVIDIA's software engineers worldwide. The cloud hosts a heterogeneous mix of machines and devices with various operating systems (Windows/Linux/Android) and a multitude of hardware platforms, spanning both NVIDIA GPUs and Tegra processors. Are you passionate about distributed infrastructure and looking for sophisticated, critical issues to solve? Are you ready to build the next generation of cloud services, design creative solutions, and mine through data to uncover real problems and fix them? We are excited to onboard a fun-loving person like you.
What you'll be doing:
Work with NVIDIA product teams to understand new product requirements, including those for HPC and AI/ML products.
Find optimal solutions for deploying these products in a datacenter or lab environment using sophisticated design techniques, services and tools.
Assist in roll-out and deployment of new development features aimed at supporting the latest NVIDIA hardware and technologies.
Work closely with world-class engineers, architects, technical product managers and application developers to put the best strategies in place for a product launch.
Define and implement full-scale solutions for product onboarding into our hosted and private cloud environments.
Solve sophisticated problems involving multi-site deployments of NVIDIA products.
Collaborate with multi-functional teams, including system engineering, software engineering, mechanical/thermal engineering, operations, data center teams, external vendors, and other partners, to deliver a reliable and robust platform from concept to prototype to deployment.
Directly contribute to the overall quality of deployments and improve time to market for next-gen products.
Develop imaging pipelines and manage new OS deployments, including provisioning these services into the cloud.
What we need to see:
Bachelor's or Master's Degree in Computer Science or Software Engineering, or equivalent experience.
5+ years of relevant experience.
3+ years of Linux and scripting experience.
Solid background in image development and OS kernels.
A track record of quickly understanding new technologies outside of your domain expertise and deploying systems in sophisticated configurations from hardware through multiple layers of software in a fast-paced environment.
Strong technical skills and understanding of embedded systems, orchestration & automation systems, data centers and cloud architecture, as well as excellent communication and planning skills.
Strong problem-solving ability and experience in product engineering, failure analysis and debug, or hardware or test design.
Understanding of dense datacenter design, including compute, storage and networking.
Ways to stand out from the crowd:
Experience in large-scale QA environments for product bring-up.
Background in supporting GPUs, embedded device development, driver development and CUDA applications.
Special skills in large-scale and cluster computing (MPI) and in data center design, including design and/or management experience with high-speed interconnects (InfiniBand), cluster storage and scheduling.
Experience with converged and hyper-converged hardware and servers.
Background with Python.
Familiarity with Jenkins, Ansible and REST APIs.
Strong background in Windows and Linux administration.
Sustainable Talent is an M/F+, Disabled, and Veteran Equal Employment Opportunity and Affirmative Action employer.