Datacenter Systems Architect

Applied Digital (NASDAQ: APLD)



About the job


Job Summary
The Data Center Systems Architect [HPC] will be a subject matter expert in High Performance Computing (HPC) infrastructure and storage, working with a team responsible for engineering, deploying, and supporting HPC-based clusters and data centers. You will provide technical guidance to the team and perform the activities required to design, build, support, and automate large, complex HPC data center server systems.
The Systems Architect will also own Applied Digital's data center IT systems for existing cryptocurrency mining operations, including design and engineering decisions. Additionally, this role will lead a team of systems engineers at data centers across North America, with both technical architecture and management responsibilities.
Primary Job Duties
  • Tend and observe equipment and machinery to verify efficient and safe operation.
  • Architect systems based on customer requirements, budgets, timelines, and parts availability
  • Design and implement scalable systems, software, and architectures
  • Support existing teams and operations
  • Enhance efficiency, robustness, and scalability
  • Lead capacity planning to determine compute and storage requirements
  • Own the job scheduler (such as SLURM), including configuration, optimization, and advanced features
  • Plan customer dataset storage and systems to support their requirements
  • Optimize and troubleshoot complex ML/AI jobs and pipelines
  • Apply in-depth HPC and Linux expertise to collaborate with stakeholders across IT and domain disciplines to expand HPC use cases
  • Evaluate, analyze, and integrate HPC technologies such as job schedulers, high performance interconnects, networked filesystems, cybersecurity, cluster management, virtualization, networking, performance tuning, and data center planning
  • Act as the senior engineer assessing innovative technologies and integrating existing commercial and open-source software solutions
  • Work closely with the Network team to define and design network requirements for systems environments

Desired Skills and Experience

Education and Experience


Minimum Bachelor of Science degree in Computer Science Engineering or a related field. Advanced degree preferred.


Authorized to work in the U.S.



  • Architecting, developing, deploying, and operating large-scale distributed systems

  • System, datacenter, or DevOps engineer in a complex HPC datacenter environment
  • Experience with job schedulers for High Performance Computing (HPC) systems, including consideration of resilience, memory, scalability, and central processing unit (CPU) footprint
  • Experience doing performance analysis studies of software and applications on HPC system architectures

