Sr. System Engineer
Company: Support Revolution
Location: San Jose
Posted on: November 7, 2024
Job Description:
Location: San Jose, California, United States
About Supermicro:
Supermicro is a Top Tier provider of advanced server,
storage, and networking solutions for Data Center, Cloud Computing,
Enterprise IT, Hadoop / Big Data, Hyperscale, HPC and IoT / Embedded
customers worldwide. We are the #5 fastest growing company among
the Silicon Valley Top 50 technology firms. Our unprecedented
global expansion has provided us with the opportunity to offer a
large number of new positions to the technology community. We seek
talented, passionate, and committed engineers, technologists, and
business leaders to join us.
Job Summary:
As a Senior System
Engineer, you will work to port and optimize Digital Manufacturing
and general HPC / AI applications on Supermicro server hardware
platforms, enabling breakthroughs in this rapidly moving field. You
will create application notes and blog content, and work closely
with the engineering teams, customers and partners. You will also
act as a senior technical figure within our product support
organization, debugging customer issues and providing concise
summaries and recommended fixes to our core engineering teams. You
will join the team to help build out, benchmark, and troubleshoot
clusters for our customers, including in-house implementation
and on-site HPC / AI deployment and acceptance. You will be
part of a talented team of engineers who demonstrate superb
technical competency, delivering mission critical infrastructure
and ensuring the highest levels of availability, performance and
security.
Essential Duties and Responsibilities:
Includes the following essential duties and responsibilities (other duties may
also be assigned):
- Optimize the HPC/AI hardware platform
- Set up and configure complex test software or applications,
even when provided with incomplete steps or unclear
instructions.
- Analyze incomplete test setups, identify gaps, and
independently devise solutions to ensure successful execution.
- Troubleshoot installation and configuration issues, using
creative methods to resolve obstacles without all necessary
information
- Write and deploy custom scripts for ad-hoc tasks to meet
specific needs during onsite visits
- Develop strong technical relationships with our customers and
partners and achieve breakthroughs in HPC / AI performance
- Develop a deep understanding of the state-of-the-art in HPC /
AI domains and work with our customers and partners
- Become a recognized expert on HPC / AI applications and deliver
compelling training to our customers and partners
- Become a thought leader on HPC / AI applications
- Field and resolve challenging, complex customer support issues
- Build processes and procedures for the HPC / AI solutions
- Design and test proofs of concept and provide optimized benchmarks
for HPC / AI related applications in a timely manner
- Optimize BIOS settings and OS / network tuning, and develop
efficient configurations for various types of simulations and
workloads
- Provide on-site deployment service, customer acceptance
verification, and post-deployment level 1 and 2 support
- Draft and maintain technical documentation, including technical
notes, blog posts, drawings, and diagrams
- Develop, review and understand the HPC roadmap to be able to
plan future software and hardware upgrades and refresh cycles to
maintain outstanding HPC infrastructure
- Work with Product Management and Engineering to ensure a
good flow of customer feedback that can be incorporated into future
products
Qualifications:
- MS or higher in a related computationally intensive science or
engineering field.
- 8+ years of either AI/Deep Learning experience or related
experience writing and optimizing applications in HPC, scientific
libraries, compilers, digital signal processors or GPUs.
- Strong scripting and Linux OS internals knowledge.
- Solid grasp of networking, storage systems and batch
systems.
- Deep experience with C or Fortran, Shell / Python, and CUDA, and
in-depth knowledge of computer architectures, high performance
programming and parallel programming.
- Deep experience with HPC / AI application benchmarks, including at
least three from the following list: LS-DYNA; OpenFOAM; PowerFLOW;
STAR-CCM+; Ansys; WRF; NAMD; Amber; LAMMPS; TensorFlow; PyTorch;
MXNet; Keras; MLPerf; etc.
- Ability to multitask effectively in a fast-paced environment;
action-oriented with strong analytical and problem-solving
skills.
- Strong written and oral communication skills with the ability
to effectively interface with management and engineering.
- Comfortable in a customer-facing environment; strong
teamwork and excellent interpersonal skills.
- Work onsite at customer locations to complete the assignments
and projects within tight deadlines
- Travel is required, and the role may involve working outside of
regular business hours
Salary Range:
$140,000 - $158,000
The salary offered will depend on several factors, including your location,
level, education, training, specific skills, years of experience,
and comparison to other employees already in this role. In addition
to a comprehensive benefits package, candidates may be eligible for
other forms of compensation, such as participation in bonus and
equity award programs.
EEO Statement:
Supermicro is an Equal
Opportunity Employer and embraces diversity in our employee
population. It is the policy of Supermicro to provide equal
opportunity to all qualified applicants and employees without
regard to race, color, religion, sex, sexual orientation, gender
identity, national origin, age, disability, protected veteran
status or special disabled veteran, marital status, pregnancy,
genetic information, or any other legally protected status.