HPC System Administrator
1 day ago
Description and Requirements

We are Lenovo

We’re a leader in genuine innovation, dreaming up – and building – the technology and services that enable and inspire progress around the world. Our innovative high-quality PCs & Smart Devices, Data Centers, Mobile and Smart Office products are designed and built with the customer in mind. And it’s our people who make this all happen - we believe different is better and our strength lies in this diversity.

Lenovo is the Number 1 Supercomputing provider in the world measured by Top 500 entries, and we keep going. Our Data Center team is dedicated to fostering an environment that encourages entrepreneurism and ownership - a workplace where your talents can be challenged, and your efforts recognized and rewarded.

We are currently hiring for an HPC System Administrator to work onsite with one of our customers based in Rio de Janeiro, Brazil.

As part of the Lenovo Managed Services team your responsibilities will include:
- Monitoring, maintaining, and managing the physical infrastructure of a data center, ensuring its smooth operation, reliability, and security.
- Monitoring power and cooling systems and network connectivity.
- Hardware and system software debugging and troubleshooting.
- OS management.
- Addressing hardware and software issues.
- Responding to alerts, performing preventative maintenance, rolling out and upgrading firmware versions, and managing any issues that may arise to minimize downtime and optimize data availability.
- Becoming the customer’s Single Point of Contact (SPOC).
- Opening hardware trouble tickets against different vendors.
- Following up and reporting the progress on all issues.
- Responding to users and providing support to them on the daily operations of the cluster.
- Daily system administration tasks, including granting and deleting access.
- Investigating and correcting hardware defects in the cluster as reported, adhering to the service levels.
- Resolving errors through developing, testing and implementing changes to the system.
- Providing corrective and preventive maintenance, troubleshooting and isolating defects.
- Performing software and firmware testing for any fixes, upgrades, and security patches.
- Updating the customer’s documentation when and as necessary to reflect the changes made to the system.
- Compiling monthly reporting and taking part in monthly customer Service Reviews where required.

Working directly with the customer you will be responsible for:
- The installation, configuration, and support of services as required within the central customer Research Computing Services platform team.
- Work with vendors and the customer Technology Office to design, implement and upgrade services using change management and revision control processes to ensure that changes are properly tracked and available for audit when required.
- Analyze and troubleshoot system issues, defining and resolving complex issues.
- Develop innovative solutions to continuously improve HPC and address any shortfalls in provision.
- Work closely with other customer staff, including Infrastructure Technology, Security and Governance teams.
- Understand the importance of security and seek specialist security advice to secure systems.
- Maintain a knowledge of technical developments, tools, and ideas in HPC, attending seminars, conferences, technical briefings, and other community events.
- Work flexibly as a part of the customer Platform Team, supporting the group’s activities and undertaking individual projects.
- Write and maintain documentation on system design and management processes to ensure knowledge is accessible and disseminated appropriately within the customer team.
- Deliver a high-quality service through a collaborative approach and outstanding analytical skills.
- Take an active part in meetings, representing the customer, and facilitating collaboration between partners.
- Assist customer researchers to utilize the HPC resource, providing subject matter expertise support to the Customer Research Computing Analysts.

The role gives you a great deal of independence and opportunity to take the lead and advise. You will be expected to work effectively in providing technical services in the areas of Server, Storage, Network, Power and Cooling, OS, and cluster management software. The job responsibilities involve providing knowledge
-
System Administrator
4 weeks ago
São Paulo, Brazil | Metal Toad | Full-time
Metal Toad is an award-winning AWS Consulting Partner and AWS Managed Services provider based in the USA. We help customers with cloud adoption by providing architecture, migration, optimization, machine learning, and 24/7 support. We are a professional services firm offering services in...
-
High-Performance Computing Systems Specialist
3 weeks ago
São Paulo, Brazil | beBeeSupport | Full-time
Senior HPC Cluster Support Engineer. This role involves providing high-level technical support for large-scale production HPC environments running Bright Cluster Manager and Slurm. Main responsibilities include cluster administration, troubleshooting user issues, monitoring cluster health, and coordinating with vendors to resolve hardware and software...
-
Senior HPC Cluster Support Engineer
2 weeks ago
São Paulo, Brazil | Sky Systems, Inc. (SkySys) | Full-time
Role: HPC Cluster Support – CIBA 4 (Senior)
Position Type: Part-Time Contract (20 hrs/week)
Contract Duration: 6 months
Work Hours: EST or PST
Location: 100% Remote
We're seeking a Senior HPC Cluster Support Engineer to maintain and support large-scale production HPC environments running Bright Cluster Manager and Slurm. This role focuses on cluster...
-
Senior HPC Cluster Support Engineer
2 weeks ago
São Paulo, Brazil | Sky Systems, Inc. | Full-time
Role: HPC Cluster Support – CIBA 4 (Senior)
Position Type: Part-Time Contract (20 hrs/week)
Contract Duration: 6 months
Work Hours: EST or PST
Location: 100% Remote
We're seeking a Senior HPC Cluster Support Engineer to maintain and support large-scale production HPC environments running Bright Cluster Manager and Slurm. This role focuses on cluster...
-
Systems Administrator
2 weeks ago
São Paulo, Brazil | Kyndryl Brasil Serviços Limitada | Full-time
Why Kyndryl: Our world has never been more alive with opportunities and, at Kyndryl, we’re ready to seize them. We design, build, manage and modernize the mission-critical technology systems that the world depends on every day. Kyndryl is at the heart of progress, dedicated to helping companies and people grow strong. Our people are actively...
-
High-Performance Computing Systems Specialist
3 weeks ago
São Paulo, SP, Brazil | beBeeSupport | Full-time
Senior HPC Cluster Support Engineer. This role involves providing high-level technical support for large-scale production HPC environments running Bright Cluster Manager and Slurm. Main responsibilities include cluster administration, troubleshooting user issues, monitoring cluster health, and coordinating with vendors to resolve hardware and software...
-
Systems Administrator
4 weeks ago
São Paulo, Brazil | I-Yuno Malaysia Sdn Bhd | Full-time
Iyuno is currently seeking an experienced and capable Systems Administrator accustomed to working in a fast-paced environment. This role will be responsible for maintaining network, storage and server systems in a corporate production environment. The position will prioritize maintaining our network infrastructure and will assist with storage and server...
-
Systems Administrator
2 weeks ago
São Paulo, São Paulo, Brazil | Iyuno | Full-time
Iyuno is currently seeking an experienced and capable Systems Administrator accustomed to working in a fast-paced environment. This role will be responsible for maintaining network, storage and server systems in a corporate production environment. The position will prioritize maintaining our network infrastructure and will assist with storage and server...