HPC Network Administrator / Computer Scientist (Scientist 2/3)

What You Will Do

The High Performance Computing (HPC) Division at Los Alamos National Laboratory provides scientific computing resources consisting of some of the largest and most innovative HPC systems in the world. Our HPC Network Team provides vanguard production support, research, and development for existing and future systems that feed and unleash the power of our supercomputers. The selected candidate will participate in a regularly scheduled on-call rotation in support of 24/7 production systems. In addition, some non-standard working hours may occasionally be required. Visit the HPC website to learn more: https://www.lanl.gov/org/ddste/aldsc/hpc/index.php

Innovators and builders at heart, the Network Team is seeking its next dynamic team member to help maintain, define, evaluate, develop, and deploy our existing and future high-speed networking environments.

This position will be filled at either the Computer Scientist 2 or Computer Scientist 3 level, depending on the skills of the selected candidate. Additional job responsibilities (outlined below) will be assigned if the candidate is hired at the higher level.

Scientist 2 $99,200 - $164,100
  • As a Scientist 2 HPC Network Administrator, you will participate in periodic 24/7 on-call responsibilities and work both independently and collaboratively with other members of the team or group. This includes contributing to the design, deployment, testing, analysis, verification, and validation of both existing networks and networks in development, including modifications and additions to Linux-based systems, configurations, and methods.
  • You will apply existing scientific resources to diagnose the root cause of system failures in collaboration with administrators of other HPC subsystems; bring up new hardware and test functionality; and document, design, and implement approaches for newer architectures.

Scientist 3 $119,200 - $201,100

In addition to the duties outlined above, a successful Scientist 3 candidate will be required to:
  • Work as a technical leader to develop innovative advanced concepts, theories, methods, techniques, and approaches to address specialized network problems, including proposing and implementing solutions to current problems and future HPC technologies in conjunction with junior and senior administrators and technical staff within and across teams.
  • Proactively examine our HPC network infrastructure through creation of experiments and tooling to validate solutions and to detect and diagnose hardware health issues; and analyze and share published research papers in the area of HPC and networking.
  • Work to influence organizational, project, and program strategies and directions related to networking, and make decisions and/or recommendations that influence the achievement of key programmatic objectives.
  • Mentoring students, junior staff, and peers in technical and professional growth activities is highly valued as is maintaining state-of-the-art technical expertise and knowledge within HPC networking and developing new skills in related disciplines.
  • In addition, the successful candidate will be expected to lead peer review of the work of others within HPC Division and participate in peer review across organizations or disciplines within the laboratory, develop ideas for new technical proposals and business development opportunities, contribute to the state-of-the-art in networking, develop new skills consistent with state-of-the-art, and present best practices and research results to national peers at conferences, workshops, and meetings, as well as participate in national strategic partnerships.

What You Need

Minimum Job Requirements:

  • Broad knowledge of network administration, including knowledge of TCP/IP, Ethernet, and network switch configuration/administration.
  • Demonstrated knowledge of building, configuring, and administering production Linux computer/support systems, including strong command line and service level Linux operating system skills, working knowledge of or experience with hardware and software security practices, and experience scripting in Bash, Perl, Python, or similar languages.
  • Broad knowledge of networking security concepts and practices, including best practices for network security, system hardware and software hardening, and working with network firewalls and/or network access-control lists (ACLs).
  • Demonstrated ability to work within a team environment.

Additional Job Requirements for Scientist 3:

In addition to the requirements outlined above, qualification at the higher level requires:
  • Demonstrated record of accomplishment and expertise in network administration, including demonstrated expertise in building, configuring, and managing data center networks, including Layer 2/Layer 3 TCP/IP networks, high-speed network interconnects (such as InfiniBand, Omni-Path, or Slingshot), and configuration of NICs, routers, and network firewalls.
  • Broad demonstrated knowledge of production HPC system management topics, including programming, file systems, operating systems, and configuration management, with depth in one or more areas.
  • Demonstrated ability to evaluate competing HPC subsystem technologies.
  • Demonstrated ability to develop ideas for new technical proposals, participate in peer review, and contribute to the state-of-the-art in the area of networking.
  • Experience interacting with vendors and colleagues within the industry, including presenting technical papers and/or technical work to peers locally and at conferences.
  • Demonstrated ability to initiate, design, and lead projects.

Education/Experience at lower level:

Position requires a bachelor's degree in a STEM field from an accredited college or university and 4 years of relevant experience or an equivalent combination of education and experience directly related to the occupation.

Education/Experience at higher level:

Position requires a master's degree in a STEM field from an accredited college or university and 6 years of relevant experience or an equivalent combination of education and experience directly related to the occupation.

Desired Qualifications:
  • Experience working in a production computing environment, preferably with HPC data centers, large topology systems, or at large scale.
  • Experience supporting a scientific user base and/or experience managing computers in a DOE or DOD classified environment.
  • Deep knowledge of and demonstrated experience with network protocols; significant experience with multiple VLANs, tagged and untagged, as well as LACP and other port channel protocols; practical experience with OSPF and other routing protocols; practical experience with Juniper or other firewall systems.
  • Deep knowledge of and demonstrated experience with high-speed networks; configuration and administration of large-scale InfiniBand, Omni-Path, or Slingshot fabrics; knowledge of RDMA resources and concepts; knowledge of parallel programming (MPI, etc.).
  • Experience with multiple Linux distributions; experience diagnosing system software problems; familiarity with Ansible, Cfengine, Chef, Puppet, Salt, or similar configuration and automation tools and practices; experience with revision control systems such as Git, Subversion, or RCS; and/or experience with low-level system administration tools such as iperf, strace, tcpdump, and vmstat.
  • Demonstrated ability to develop new methods, techniques, or approaches to address critical technical problems and/or develop new technical capabilities.

Work Location: The work location for this position is hybrid and is located in Los Alamos, NM. Hybrid is defined as working partially onsite/partially offsite but within 2 hours ground commute of this location. All work locations are at the discretion of management and can change at any time with appropriate notice.

Position commitment: Regular appointment employees are required to serve a period of continuous service in their current position in order to be eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the time required, they may only apply for Laboratory jobs with the documented approval of their Division Leader. The position commitment for this position is 1 year.

Note to Applicants:

This position includes an on-call rotation that lasts a full week at a time and recurs every couple of weeks. For more information, please reach out to Alex Wroblewski (alexwrob@lanl.gov).

Where You Will Work

Located in beautiful northern New Mexico, Los Alamos National Laboratory (LANL) is a multidisciplinary research institution engaged in strategic science on behalf of national security. Our generous benefits package includes:

  • PPO or High Deductible medical insurance with the same large nationwide network
  • Dental and vision insurance
  • Free basic life and disability insurance
  • Paid childbirth and parental leave
  • Award-winning 401(k) (6% matching plus 3.5% annually)
  • Learning opportunities and tuition assistance
  • Flexible schedules and time off (PTO and holidays)
  • Onsite gyms and wellness programs
  • Extensive relocation packages (outside a 50-mile radius)

Additional Details

Directive 206.2 - Employment with Triad requires a favorable decision by NNSA indicating employee is suitable under NNSA Supplemental Directive 206.2. Please note that this requirement applies only to citizens of the United States. Foreign nationals are subject to a similar requirement under DOE Order 142.3A.

Clearance: Q (Position will be cleared to this level). Selected applicants will be subject to a background investigation conducted by or on behalf of the Federal Government, and must meet eligibility requirements* for access to classified matter. This position requires a Q clearance, and obtaining such clearance requires US Citizenship except in extremely rare circumstances. Dependent upon the position, additional authorization to access classified information may be required, which may or may not be available to dual citizens. Receipt of a Q clearance and additional access authorization ultimately is a decision of the Federal Government and not of Triad.

*Eligibility requirements: To obtain a clearance, an individual must be at least 18 years of age; U.S. citizenship is required except in very limited circumstances. See DOE Order 472.2 for additional information.

New-Employment Drug Test: The Laboratory requires successful applicants to complete a new-employment drug test and maintains a substance abuse policy that includes random drug testing. Although New Mexico and other states have legalized the use of marijuana, use and possession of marijuana remain illegal under federal law. A positive drug test for marijuana will result in termination of employment, even if the use was pre-offer.

Regular position: Term status Laboratory employees applying for regular-status positions are converted to regular status.

Internal Applicants: Regular appointment employees who have served the required period of continuous service in their current position are eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the required period of continuous service, they may only apply for Laboratory jobs with the documented approval of their Division Leader. Please refer to Policy P701 for applicant eligibility requirements.

Equal Opportunity: Los Alamos National Laboratory is an equal opportunity employer and supports a diverse and inclusive workforce. All employment practices are based on qualification and merit, without regard to race, color, national origin, ancestry, religion, age, sex, gender identity, sexual orientation, marital status or spousal affiliation, physical or mental disability, medical conditions, pregnancy, status as a protected veteran, genetic information, or citizenship within the limits imposed by federal laws and regulations. The Laboratory is also committed to making our workplace accessible to individuals with disabilities and will provide reasonable accommodations, upon request, for individuals to participate in the application and hiring process. To request such an accommodation, please send an email to applyhelp@lanl.gov or call 1-505-664-6947 option 2.