Computing Systems Tec 3/4
- Req. Number: IRC135097
- Organization: HPC-OPS/High Performance Computing Operations Group
- City, State: Los Alamos, New Mexico
- Recruiter Name: Wroblewski, Alex Christopher
- Recruiter Email: alexwrob@lanl.gov
The High Performance Computing (HPC) Division provides production high performance computing systems services to the Laboratory. Our work starts with the early phases of acquisition, development, and production readiness of HPC platforms, and continues through the maintenance and operation of these systems and the facilities in which they are housed. HPC Division also manages the network, parallel file systems, storage, and visualization infrastructure associated with the HPC platforms. The Division directly supports the Laboratory's HPC user base and aids, at multiple levels, in the effective use of HPC resources to generate science. Additionally, we support selected research activities that we deem important to our mission.
The High Performance Computing Operations Group (HPC-OPS) of the High Performance Computing Division has the main responsibility of providing operational duties that span the HPC Division. There are several teams within the group that take responsibility for the broad range of HPC platforms for a large and diverse customer base: HPC Operations, HPC Vendor Liaison team, and HPC logistics. We provide support and services to many production platforms at a world-class computing facility to ensure customers can accomplish their research and mission at extreme scale.
HPC-OPS is seeking a highly organized, self-motivated, proficient multi-tasker with a strong desire and capacity to learn to join our team as a Computing Systems Tec (CST) 3/4. By joining our team you will discover a world of opportunity. As a Computer Systems Technician on the High-Performance Computing Operations team (TechOps), you will be a key member of the team that enables the Laboratory to accomplish its mission by providing strategic, 24x7, day-to-day technical support services for our High Performance Computing (HPC) capability. The TechOps team provides cutting-edge technical support for HPC clusters, including but not limited to system health, network infrastructure, filesystem and hardware monitoring and support to maintain capability and implement continuous capability improvements across a complex and heterogeneous computing environment. This is your chance to directly support our national security mission and continue to make LANL the best place to work as a member of a dynamic, team-oriented, and leading-edge technical support team.
The TechOps Team requires a technician who has experience with computing system support across a large production computing environment in a professional setting. The selected candidate will have the capacity to resolve a range of computer system issues and hardware support, complete specialized tasks, and apply full knowledge of a range of related disciplines.
This position will be filled at either the CST-3 or CST-4 level, depending on the skills of the selected candidate. Additional job responsibilities (outlined below) will be assigned if the candidate is hired at the higher level.
Computing Systems Tec 3 (CST-3) ($56,300 - $87,200)
Successful applicants will perform the following duties in this position:
- Be a key member of the team that enables the Laboratory to accomplish its mission by providing strategic, 24x7, day-to-day technical support services for LANL's High Performance Computing capability.
- Monitor the health of HPC clusters, including but not limited to system health, network infrastructure, filesystem and hardware monitoring and support to maintain capability and implement continuous capability improvements across a complex and heterogeneous computing environment.
Computing Systems Tec 4 (CST-4) ($67,800 - $107,700)
Successful applicants will perform the following duties in this position:
- Apply existing and demonstrated skill set and an intermediate level of experience to the HPC scope described above, including hardware expertise, system monitoring and response acumen, and system administration experience as detailed below.
What You Need
Minimum Job Requirements:
24/7 Strategic Support
Ability to work an assigned shift (i.e., day, swing, or grave) as well as off hours as assigned as part of the 24x7 Computer System Support Team. Off-hours assignments may include an 8-hour shift every other weekend, holidays as assigned, on-call, and coverage on short notice as needed. This position is considered essential personnel and as such, the incumbent will be required to report to work if scheduled during inclement weather and winter closures.
Hardware Expertise
Basic knowledge of complex heterogeneous computing systems to include basic experience in troubleshooting, diagnosing and repairing hardware failures to component level on servers.
System Monitoring and Response Acumen
Familiarity with ticket and issue tracking systems and ability to set up and monitor the operation of computer consoles and peripheral equipment to monitor equipment and production application jobs, to identify areas in need of diagnostic tests to isolate equipment malfunctions, and to identify failures and initiate proper recovery procedures.
System Administration Experience
Basic knowledge of LINUX and/or Microsoft computer operating systems.
Communication and Teaming Skills
Demonstrated effective communication skills (both verbal and written), including the ability to communicate technical information to both technical and non-technical personnel, provide assistance and knowledge to peers, and collaborate with Group members, other HPC Group personnel and vendor representatives, as required.
Additional Job Requirements for CST-4:
In addition to the Job Requirements outlined above, qualification at the CST-4 level requires:
Hardware Expertise
Demonstrated ability to apply knowledge of complex heterogeneous computing systems to provide full life-cycle hardware support to include installation and integration of equipment and systems, troubleshooting, diagnoses, performance of preventive maintenance tasks (including logistic duties which include inventory control, shipping and receiving of spare parts), repair of hardware failures to component level as needed, reporting of hardware failures on vendor supported systems, trend analysis on hardware failures and repairs, and decommissioning activities on equipment in multi-system environments residing on both secure and open networks, with minimal to no supervision. Assignments are usually focused while working within established priorities, procedures, processes, and requirements or specifications. Impact of work is usually limited to a well-defined area of a project or specific assignment.
Demonstrated ability to develop, implement and maintain policies and procedures to troubleshoot, diagnose and repair computer hardware.
System Administration Expertise
Demonstrated knowledge of LINUX and/or Microsoft computer operating systems at an intermediate level to work closely with technical leaders and on-call system administrators to develop and implement innovative technical solutions to complex software problems and to provide primary-level system administration to address routine software issues, as well as ability to apply demonstrated experience in modifying scripts in various languages such as shell, python, perl, etc. to provide system monitoring, administration, and support.
Education/Experience at CST-3:
Position requires a High School diploma or equivalency and 2-4 years directly related experience, or an equivalent combination of education and experience directly related to the occupation. Technical Institute graduation or AA degree may be preferred.
Education/Experience at CST-4:
Position requires a High School diploma or equivalency and 4-6 years directly related experience, or an equivalent combination of education and experience directly related to the occupation. Technical Institute graduation or AA degree may be preferred.
Desired Qualifications:
- Experience working in a production HPC environment
- Demonstrated intermediate knowledge of Linux operating system using the command line interface (CLI)
- Demonstrated experience supporting midrange and mainframe computers and use of job control languages to determine software or hardware issues and system performance.
- Experience in troubleshooting and maintaining a redundant array of independent disks (RAID) systems.
- Familiarity with classified electronic media handling.
- Ability to interpret facility monitoring error messages.
- Experience in configuring, troubleshooting and repairing network switches.
- Ability to script in one or more languages such as shell, python, perl, etc.
- Active DOE Q-clearance
- An active SCI
Work Location: The work location for this position is onsite and located in Los Alamos, NM. All work locations are at the discretion of management.
Position commitment: Regular appointment employees are required to serve a period of continuous service in their current position in order to be eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the time required, they may only apply for Laboratory jobs with the documented approval of their Division Leader. The position commitment for this position is 1 year.
Note to Applicants:
For full consideration, applicants should submit a cover letter addressing how their knowledge, skills and abilities meet the minimum requirements along with a resume.
Where You Will Work
Located in beautiful northern New Mexico, Los Alamos National Laboratory (LANL) is a multidisciplinary research institution engaged in strategic science on behalf of national security. Our generous benefits package includes:
- PPO or High Deductible medical insurance with the same large nationwide network
- Dental and vision insurance
- Free basic life and disability insurance
- Paid childbirth and parental leave
- Award-winning 401(k) (6% matching plus 3.5% annually)
- Learning opportunities and tuition assistance
- Flexible schedules and time off (PTO and holidays)
- Onsite gyms and wellness programs
- Extensive relocation packages (outside a 50 mile radius)
Additional Details
Directive 206.2 - Employment with Triad requires a favorable decision by NNSA indicating employee is suitable under NNSA Supplemental Directive 206.2. Please note that this requirement applies only to citizens of the United States. Foreign nationals are subject to a similar requirement under DOE Order 142.3A.
Clearance: Q (Position will be cleared to this level). Selected applicants will be subject to a background investigation conducted by or on behalf of the Federal Government, and must meet eligibility requirements* for access to classified matter. This position requires a Q clearance, and obtaining such clearance requires US Citizenship except in extremely rare circumstances. Dependent upon the position, additional authorization to access classified information may be required, which may or may not be available to dual citizens. Receipt of a Q clearance and additional access authorization ultimately is a decision of the Federal Government and not of Triad.
*Eligibility requirements: To obtain a clearance, an individual must be at least 18 years of age; U.S. citizenship is required except in very limited circumstances. See DOE Order 472.2 for additional information.
New-Employment Drug Test: The Laboratory requires successful applicants to complete a new-employment drug test and maintains a substance abuse policy that includes random drug testing. Although New Mexico and other states have legalized the use of marijuana, use and possession of marijuana remain illegal under federal law. A positive drug test for marijuana will result in termination of employment, even if the use was pre-offer.
Regular position: Term status Laboratory employees applying for regular-status positions are converted to regular status.
Internal Applicants: Regular appointment employees who have served the required period of continuous service in their current position are eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the required period of continuous service, they may only apply for Laboratory jobs with the documented approval of their Division Leader. Please refer to Policy P701 for applicant eligibility requirements.
Equal Opportunity: Los Alamos National Laboratory is an equal opportunity employer and supports a diverse and inclusive workforce. All employment practices are based on qualification and merit, without regard to race, color, national origin, ancestry, religion, age, sex, gender identity, sexual orientation, marital status or spousal affiliation, physical or mental disability, medical conditions, pregnancy, status as a protected veteran, genetic information, or citizenship within the limits imposed by federal laws and regulations. The Laboratory is also committed to making our workplace accessible to individuals with disabilities and will provide reasonable accommodations, upon request, for individuals to participate in the application and hiring process. To request such an accommodation, please send an email to applyhelp@lanl.gov or call 1-505-664-6947 option 2 and then option 3.