Data Scientist and AIOps Developer/Administrator (Scientist 2/3)

What You Will Do

The High Performance Computing (HPC) Division at Los Alamos National Laboratory provides scientific computing resources consisting of some of the largest HPC systems in the world as well as numerous large commodity clusters. Our HPC Systems Group (HPC-SYS) is creating a new AIOps team to make better use of our vast amounts of HPC system and application data; this team will add AI solutions to our HPC infrastructure. The new team will work alongside the other teams in HPC-SYS: Monitoring, Web Services, and Cybersecurity. The Monitoring Team is responsible for collecting all data from our data centers, everything from facilities to clusters, and implementing operational dashboards, alerts, and reports using tools like Splunk. Our Web Services Team runs our admin- and user-facing websites, including user documentation, ticketing systems, and GitLab. Our Cybersecurity Team monitors and implements cybersecurity policies on our HPC systems.

The AIOps team will have three major focus areas: LLMs, System Data Analysis, and System Automation. We have a large set of HPC-specific documentation for both users and admins that will be integrated into LLMs; this team will be responsible for designing, building, and running the LLMs. We also have massive amounts of system data, and the team will apply ML and Data Science techniques to analyze it more deeply, improving performance analysis, event correlation, and anomaly detection. Finally, the team will investigate AI-driven workflows for task automation within the Data Center. You will work closely with other members of the AIOps team and Subject Matter Experts (SMEs) in different HPC areas to design and develop these tools using our on-prem analysis and Gen AI systems.

The successful candidate's scope will include monitoring and analyzing system performance to identify anomalies. You will analyze large volumes of data to identify patterns and trends using ML and Data Science techniques, with the goal of developing automation scripts and workflows to implement proactive measures. You and the team will maintain the AIOps platforms and tools, including the user-facing LLMs. The successful candidate will continue actively growing their technical skills and keeping up to date with the latest technologies in the field. In addition, the selected candidate will have the opportunity to develop technical products such as technical documentation, presentations, technical papers, and reports to communicate findings internally and at conferences. This position is full-time and is located at Los Alamos National Laboratory in Los Alamos, New Mexico.

This position will be filled at either the Scientist 2 or Scientist 3 level, depending on the skills of the selected candidate. Additional job responsibilities (outlined below) will be assigned if the candidate is hired at the higher level.

What You Need

Minimum Job Requirements:

Scientist 2: ($101,700 - $168,200)
  • Knowledge of Linux system administration, including command-line Linux operating system skills and knowledge of hardware and software security practices
  • Experience building, testing, evaluating and running LLMs
  • Knowledge of Machine Learning and Data Science techniques for data analysis
  • Strong knowledge of Python and AI frameworks
  • Ability to clean, preprocess, and analyze large datasets
  • Knowledge of containerization technologies such as Docker and Kubernetes

Additional Job Requirements for Scientist 3:

Scientist 3: ($122,300 - $206,300)

In addition to the Job Requirements outlined above, qualification at the higher level requires:
  • Extensive experience analyzing system log and metric data with strong statistical analysis skills and understanding of ML algorithms
  • Experience with Machine Learning and Data Science techniques for data analysis, anomaly detection and event correlation
  • Knowledge of anomaly detection techniques and time series analysis
  • Knowledge of how to safeguard LLMs with guardrails and prompt engineering
  • Experience fine-tuning Foundation Models

Education/Experience at lower level:

Position requires a Bachelor's degree in a STEM field from an accredited college or university and 4 years of relevant experience or an equivalent combination of education and experience directly related to the occupation.

Education/Experience at higher level:

Position requires a Master's degree in a STEM field from an accredited college or university and 6 years of relevant experience or an equivalent combination of education and experience directly related to the occupation.

Desired Qualifications:
  • Experience building and running RAG-based LLMs
  • Experience with implementing AIOps tools and workflows from ML Analysis to system automation and configuration management
  • Experience running workflows on NVIDIA DGX/HGX systems or pods
  • Experience using Git for version control
  • Experience integrating operational metrics into a monitoring system such as Splunk
  • Familiarity with monitoring and logging tools like Syslog, Telegraf, Prometheus, Grafana, etc.
  • Experience with deep learning frameworks such as TensorFlow or PyTorch
  • Demonstrated effective communication skills, including demonstrated ability to work productively with customers and vendors
  • High attention to detail including excellent organizational skills, analytical thinking, observational and problem-solving skills. Proven ability to independently multi-task and adjust to the workings of a dynamic and fast paced environment.
  • An Active DOE Q Clearance

Work Location:

This position will be located in Los Alamos, NM, with the potential for a hybrid work arrangement (60% onsite/40% offsite) from a location within 2 hours ground commute of this location. Reporting onsite will be required. Hybrid is at the discretion of management and can change at any time with appropriate notice.

Position commitment: Regular appointment employees are required to serve a period of continuous service in their current position in order to be eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the time required, they may only apply for Laboratory jobs with the documented approval of their Division Leader. The position commitment for this position is 1 year.

Note to Applicants:

For consideration, applicants should submit a cover letter addressing how their knowledge, skills and abilities meet the minimum requirements along with a resume.

Where You Will Work

Located in beautiful northern New Mexico, Los Alamos National Laboratory (LANL) is a multidisciplinary research institution engaged in strategic science on behalf of national security. Our generous benefits package includes:

  • PPO or High Deductible medical insurance with the same large nationwide network
  • Dental and vision insurance
  • Free basic life and disability insurance
  • Paid childbirth and parental leave
  • Award-winning 401(k) (6% matching plus 3.5% annually)
  • Learning opportunities and tuition assistance
  • Flexible schedules and time off (PTO and holidays)
  • Onsite gyms and wellness programs
  • Extensive relocation packages (outside a 50 mile radius)

Additional Details

Directive 206.2 - Employment with Triad requires a favorable decision by NNSA indicating employee is suitable under NNSA Supplemental Directive 206.2. Please note that this requirement applies only to citizens of the United States. Foreign nationals are subject to a similar requirement under DOE Order 142.3A.

Clearance: Q (Position will be cleared to this level). Selected applicants will be subject to a background investigation conducted by or on behalf of the Federal Government, and must meet eligibility requirements* for access to classified matter. This position requires a Q clearance, and obtaining such clearance requires US Citizenship except in extremely rare circumstances. Dependent upon the position, additional authorization to access classified information may be required, which may or may not be available to dual citizens. Receipt of a Q clearance and additional access authorization ultimately is a decision of the Federal Government and not of Triad.

*Eligibility requirements: To obtain a clearance, an individual must be at least 18 years of age; U.S. citizenship is required except in very limited circumstances. See DOE Order 472.2 for additional information.

New-Employment Drug Test: The Laboratory requires successful applicants to complete a new-employment drug test and maintains a substance abuse policy that includes random drug testing. Although New Mexico and other states have legalized the use of marijuana, use and possession of marijuana remain illegal under federal law. A positive drug test for marijuana will result in termination of employment, even if the use was pre-offer.

Regular position: Term status Laboratory employees applying for regular-status positions are converted to regular status.

Internal Applicants: Regular appointment employees who have served the required period of continuous service in their current position are eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the required period of continuous service, they may only apply for Laboratory jobs with the documented approval of their Division Leader. Please refer to Policy P701 for applicant eligibility requirements.

Equal Opportunity: Los Alamos National Laboratory is an equal opportunity employer and supports a diverse and inclusive workforce. All employment practices are based on qualification and merit, without regard to race, color, national origin, ancestry, religion, age, sex, gender identity, sexual orientation, marital status or spousal affiliation, physical or mental disability, medical conditions, pregnancy, status as a protected veteran, genetic information, or citizenship within the limits imposed by federal laws and regulations. The Laboratory is also committed to making our workplace accessible to individuals with disabilities and will provide reasonable accommodations, upon request, for individuals to participate in the application and hiring process. To request such an accommodation, please send an email to or call 1-505-664-6947 option 2.