Archive Administrator Scientist 2/3 | Los Alamos, NM | Los Alamos National Laboratory

What You Will Do

The High Performance Computing (HPC) Division at Los Alamos National Laboratory provides scientific computing resources consisting of some of the largest HPC systems in the world, including a large (19K+ node) Cray system called Trinity, as well as numerous large commodity cluster systems. The HPC Archival Storage Team within the HPC Systems Group (HPC-SYS) provides vanguard production monitoring, support, testing, and maintenance for existing archival storage systems and deployment support for future systems. Visit the HPC website to learn more: https://www.lanl.gov/org/ddste/aldsc/hpc/index.php

This role requires strong communication skills, as well as comprehensive troubleshooting and analytical skills. Team member duties include: designing, building, and maintaining world-class data movement and archival storage systems; evaluating and testing new technology and solutions; diagnosing, solving, and implementing solutions for various system operational problems; tuning tape storage systems to increase performance and reliability of services; process automation; interacting with vendors; and communicating and collaborating with other groups, teams, projects and sites. Specifically, the selected candidate will support Programmatic archival storage as well as Institutional backup systems in both production and forward-looking efforts.

The selected candidate will participate in a regularly scheduled rotation of on-call support of production systems, including some systems under 7x24 hour support. In addition, some non-standard working hours may occasionally be required. This position is full-time and is located at Los Alamos National Laboratory in Los Alamos, New Mexico.

This position will be filled at either the Scientist 2 or the Scientist 3 level, depending on the skills of the selected candidate. Additional job responsibilities (outlined below) will be assigned if the candidate is hired at the higher level.

Scientist 2 ($96,100 - $159,000)
  • Participate in periodic on-call responsibilities.
  • Work both independently and collaboratively with other members of the Archive and Institutional Backups team after receiving initial direction and requirements from technical project leads.
  • Troubleshoot and diagnose the root cause of system failures, and isolate the components and failure scenarios while working with internal and external stakeholders.
  • Develop and publish updates on resolutions and communicate findings internally.
  • Work with team members to make modifications and additions to existing systems, code, and methods.
  • Work with team to bring up new hardware and test functionality.
  • Participate in process improvement, including deep multi-system problem isolation and resolution often in collaboration with administrators of other HPC subsystems.
  • Work with team members to document, design, and implement new ideas and approaches for newer architectures and improve those for existing ones.
  • Present best practices, experience reports, and/or research results to managers and to peers locally or at conferences.

Scientist 3 ($115,500 - $194,900)

In addition to the duties outlined above, a successful Scientist 3 candidate will be required to:

  • Work as a technical leader/subject matter expert to propose and implement solutions to current problems and future deficiencies in our HPC archive storage environment in conjunction with junior and senior administrators and technical staff within and across teams.
  • Proactively create experiments and tooling to validate solutions and to detect and diagnose hardware health issues.
  • Analyze published research papers in the area of archive and data storage, summarize, and share implications and connections to ongoing work with team members.
  • Interact and/or collaborate with people from other teams, groups, divisions, directorates, and programs to develop, implement, and/or communicate technical solutions.
  • Enhance technical and professional expertise of other staff and students through active mentoring and training activities.
  • Contribute to peer review of the work of others across organizations or disciplines within the laboratory.
  • Present best practices and research results to national peers at conferences, workshops, and meetings, as well as participate in national strategic partnerships.

What You Need

Minimum Job Requirements:
  • Demonstrated knowledge of building, configuring, and administering production Linux computer/storage systems.
  • Experience managing relational database systems (Oracle, IBM DB2, etc.).
  • Experience with backup/archival software (Tivoli Storage Manager, Commvault, HPSS, Oracle HSM, etc.).
  • Experience deploying/managing storage systems including tape storage library infrastructure (Oracle, Quantum, IBM, SpectraLogic).
  • Practical experience scripting in Bash, Perl, Python, or similar languages.
  • Experience deploying and managing SAN infrastructure.
  • Working knowledge of networking concepts and practices.
  • Strong interpersonal and written and oral communication skills.
  • Demonstrated ability to work within a team environment.
  • Ability to mentor and lead individual junior team members and students.
  • Ability to acquire and maintain a DOE Q-level clearance.

Additional Job Requirements for Scientist 3:

In addition to the requirements outlined above, qualification at the higher level requires:
  • Broad demonstrated knowledge of production HPC system management topics, including networking, programming, file systems, operating systems, and configuration management, with depth in one or more areas.
  • Demonstrated programming experience including compiled languages and advanced scripting.
  • Ability to lead and mentor teams, students, or junior team members.
  • Demonstrated ability to initiate, design, and lead projects.
  • Demonstrated ability to evaluate competing HPC subsystem technologies.
  • Ability to analyze published research papers in the area of data storage, summarize research results, and share implications and connections to ongoing work with team members.
  • Demonstrated ability to develop ideas for new tech proposals, participate in peer review, and contribute to the state-of-the-art in the area of data storage.

Education/Experience at lower level: Position requires a Bachelor's degree in a STEM field from an accredited college or university and 4 years of related experience, typically with post-doctoral research experience at a university or national lab or equivalent experience directly related to the occupation.

Education/Experience at higher level: Position requires a Master's degree in a STEM field from an accredited college or university and 6 years of relevant experience or an equivalent combination of education and experience directly related to the occupation.

Desired Qualifications:
  • Knowledge of file systems such as ZFS, EXT, XFS.
  • Experience working in a production computing environment, preferably with HPC data storage systems or at large scale.
  • Working knowledge of file system structures and algorithms.
  • Experience with Object storage and RESTful storage interfaces.
  • Experience diagnosing system software problems.
  • Experience supporting a scientific user base.
  • Experience with multiple network technologies (e.g., Ethernet, IB, OPA).
  • Experience with revision control systems such as RCS, Subversion, or Git.
  • Experience with low-level system administration tools such as perf, strace, tcpdump, and vmstat.
  • Experience managing computers in a DOE or DOD classified environment.
  • Familiarity with Cfengine, Chef, Puppet, Ansible, Salt, or similar configuration and automation tools and practices.
  • Contribution to open source or non-work-related projects.
  • An inquisitive nature.
  • Active DOE Q Clearance.

Location: This position will be located in Los Alamos, NM.

COVID Vaccine: The COVID vaccine is mandatory for all Laboratory employees, on-site contractors, and on-site subcontractors unless granted an accommodation under applicable state or federal law. This requirement will apply to those working on-site, those teleworking, and all new hires.

Position commitment: Regular appointment employees are required to serve a period of continuous service in their current position in order to be eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the time required, they may only apply for Laboratory jobs with the documented approval of their Division Leader. The position commitment for this position is 1 year.

Note to Applicants: For full consideration, please submit a comprehensive cover letter that addresses each key requirement of the position, along with a resume.

Where You Will Work

Located in beautiful northern New Mexico, Los Alamos National Laboratory (LANL) is a multidisciplinary research institution engaged in strategic science on behalf of national security. Our generous benefits package includes:
  • PPO or High Deductible medical insurance with the same large nationwide network
  • Dental and vision insurance
  • Free basic life and disability insurance
  • Paid maternity and parental leave
  • Award-winning 401(k) (6% matching plus 3.5% annually)
  • Learning opportunities and tuition assistance
  • Flexible schedules and time off (paid sick, vacation, and holidays)
  • Onsite gyms and wellness programs
  • Extensive relocation packages (outside a 50 mile radius)

Additional Details

Directive 206.2 - Employment with Triad requires a favorable decision by NNSA indicating employee is suitable under NNSA Supplemental Directive 206.2. Please note that this requirement applies only to citizens of the United States. Foreign nationals are subject to a similar requirement under DOE Order 142.3A.

Clearance: Q (position will be cleared to this level). Applicants selected will be subject to a Federal background investigation and must meet eligibility requirements* for access to classified matter. This position requires a Q clearance, which requires U.S. citizenship except in extremely rare circumstances. Depending on the position, additional authorization to access nuclear weapons information may be required, which may or may not be available to dual citizens depending on the circumstances.

*Eligibility requirements: To obtain a clearance, an individual must be at least 18 years of age; U.S. citizenship is required except in very limited circumstances. See DOE Order 472.2 for additional information.

New-Employment Drug Test: The Laboratory requires successful applicants to complete a new-employment drug test and maintains a substance abuse policy that includes random drug testing.

Regular position: Term status Laboratory employees applying for regular-status positions are converted to regular status.

Internal Applicants: Regular appointment employees who have served the required period of continuous service in their current position are eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the required period of continuous service, they may only apply for Laboratory jobs with the documented approval of their Division Leader. Please refer to Policy P701 for applicant eligibility requirements.

Equal Opportunity: Los Alamos National Laboratory is an equal opportunity employer and supports a diverse and inclusive workforce. All employment practices are based on qualification and merit, without regard to race, color, national origin, ancestry, religion, age, sex, gender identity, sexual orientation or preference, marital status or spousal affiliation, physical or mental disability, medical conditions, pregnancy, status as a protected veteran, genetic information, or citizenship within the limits imposed by federal laws and regulations. The Laboratory is also committed to making our workplace accessible to individuals with disabilities and will provide reasonable accommodations, upon request, for individuals to participate in the application and hiring process. To request such an accommodation, please send an email to applyhelp@lanl.gov or call 1-505-665-4444 option 1.
Employment Status: Full Time