Inference Systems Engineer
2 days ago
Remote
Infrastructure / Serving Systems
$5,651 - $6,469/month USD

Role Overview
As an Inference Systems Engineer, you will own the serving runtime that powers production LLM inference. This is a deeply technical role focused on system performance and stability: optimizing request lifecycle behavior, streaming correctness, batching/scheduling strategy, cache and memory behavior, and runtime execution efficiency.
You will ship changes that improve TTFT, p95/p99 latency, throughput, and cost efficiency while preserving correctness and reliability under multi-tenant load. You will collaborate closely with platform/infrastructure operations, networking, and API/control-plane teams to ensure the serving system behaves predictably in production and can be debugged quickly when incidents occur.
This role is for engineers who can reason about the entire inference pipeline, validate improvements with rigorous measurement, and operate with production-grade discipline.

Responsibilities
Own the end-to-end serving runtime behavior: request lifecycle, streaming semantics, cancellation, interaction with retries, timeouts, and consistent failure modes.
Design and implement batching and scheduling strategy: dynamic batching, admission control, fairness under mixed tenants, priority lanes, and backpressure mechanisms to prevent cascading failures.
Optimize performance at the systems level: reduce time-to-first-token, improve tail latency stability, increase tokens/sec throughput, and improve accelerator utilization under realistic workloads.
Improve memory behavior and cache efficiency: KV-cache policies, fragmentation control, eviction strategies, and safeguards against OOM cliffs and performance thrash.
Drive runtime execution optimizations: operator-level improvements, quantization integration, compilation/tuning paths where appropriate, and parameterization that produces stable performance across deployments.
Establish a performance measurement discipline: reproducible benchmarks, realistic traffic traces, profiling workflows, regression detection gates, and dashboards tied to production outcomes.
Build production readiness into the system: feature-flagged rollouts, canarying, safe configuration changes, and incident playbooks that reduce MTTR.
Partner with networking and infrastructure operations to align deployment topology, failure domains, and capacity constraints to performance and reliability goals.
Collaborate with product and API teams to ensure the serving layer's guarantees are reflected accurately in external interfaces and customer expectations.

Requirements
5+ years building high-performance systems (model serving, GPU systems, performance engineering, or low-latency distributed systems).
Strong understanding of LLM inference tradeoffs: batching vs latency, prefill vs decode dynamics, cache behavior, memory pressure, and tail latency causes.
Comfort working across Python/C++ stacks with production profiling and debugging tools.
Track record of shipping performance improvements that hold up under production variance and operational constraints.
Strong engineering hygiene: tests, instrumentation, documentation, and careful rollout discipline.
Ability to communicate clearly across teams and operate calmly during incidents.
-
Inference Systems Engineer
3 days ago
Manaus, Brazil | Noxx | Full-time
Inference Systems Engineer. Remote. Infrastructure / Serving Systems. $5,651 - $6,469/month USD. Role Overview: As an Inference Systems Engineer, you will own the serving runtime that powers production LLM inference. This is a deeply technical role focused on system performance and stability: optimizing request lifecycle behavior, streaming correctness,...
-
Systems Engineer
3 days ago
Manaus, Brazil | Moralis | Full-time
Systems Engineer (Fully Remote). We're looking for a Systems Engineer to help build, automate, and support development and test environments across a large-scale platform. You'll play a key role in designing, deploying, and optimising cloud and on-prem environments that support critical applications from build through to production support. What you'll...
-
Linux System Engineer
3 days ago
Manaus, Brazil | InComm Payments | Full-time
We are seeking a highly skilled and experienced Senior Linux System Engineer to join our InComm Operations team. Ideally, you will have a strong background in Red Hat and Oracle Linux system administration, automation with Ansible, as well as deep expertise in Linux patching, scripting, and Git version control. 100% Remote + CLT + Benefits (Health Insurance...
-
Ps Systems Engineer
4 weeks ago
Manaus, Brazil | Bradford Jacobs | Full-time
Bradford Jacobs has partnered with an award-winning Technology Solution Provider dedicated to protecting and empowering organizations across the central U.S. They deliver innovative, business-aligned solutions in cybersecurity, cloud, infrastructure, Microsoft, Salesforce, and support services. As a Systems Engineer, you'll play a key role in designing,...
-
Senior Full Stack Software Engineer
3 weeks ago
Manaus, Brazil | Georgiatek Systems Inc. | Full-time
Senior Full Stack Software Engineer (Java & Node.js)
Location: Remote – Brazil
Employment Type: Full-time
Salary: Negotiable
About the Role: We are seeking a highly skilled Senior Full Stack Software Engineer with strong experience in Java-based distributed systems, stream-based processing, and Node.js services. In this role, you will design, build, and...
-
Senior Full Stack Software Engineer
4 weeks ago
Manaus, Brazil | GeorgiaTEK Systems Inc. | Full-time
Senior Full Stack Software Engineer (Java & Node.js)
Location: Remote – Brazil
Employment Type: Full-time
Salary: Negotiable
About the Role: We are seeking a highly skilled Senior Full Stack Software Engineer with strong experience in Java-based distributed systems, stream-based processing, and Node.js services. In this role, you will design, build,...
-
Java Engineer ID45903
3 weeks ago
Manaus, Brazil | AgileEngine | Full-time
AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas such as application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work...
-
Cloud Infrastructure Engineer
3 days ago
Manaus, Brazil | Mtech Systems | Full-time
At MTech Systems, our company mission is to increase yield in protein production to help feed the growing world population without compromising animal welfare or damaging the planet. We aim to create software that delivers real-time data to the entire supply chain that allows producers to get better insight into what is happening on their farms and what they...
-
Java Engineer ID45903
3 days ago
Manaus, Brazil | AgileEngine | Full-time
Overview: AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
Why Join Us: If you're looking for a place to grow,...