Job description
Description
NeuReality is seeking a Lead System Architect to join our system architecture team and help define the next generation of our AI-SuperNIC scale-out chip.
AI scale-out communication is a critical element in modern data centers, and emerging standards such as Ultra Ethernet aim to address its demands. This role focuses on defining a high-performance SmartNIC architecture optimized for GPU-centric AI workloads, with emphasis on low-latency, high-bandwidth data movement.
You will work across hardware and software domains, collaborating closely with AI, platform, driver, and VLSI teams to design a competitive scale-out networking solution.
Responsibilities
Lead the software architecture and technical roadmap for NeuReality’s next-generation ultra-low-latency AI-SuperNIC software stack, including drivers, firmware, libfabric, and libibverbs.
Define the partitioning and interfaces between hardware, firmware, kernel drivers, user-space libraries, and AI frameworks.
Lead the design and implementation of high-performance networking, RDMA, and GPU-direct communication capabilities.
Drive software support for emerging technologies and standards such as UEC, UALink, MRC, and the RoCEv2 ecosystem.
Work closely with hardware, system architecture, and VLSI teams to optimize performance, scalability, and feature delivery.
Define performance goals and lead profiling, benchmarking, and optimization efforts for GenAI and distributed AI workloads.
Collaborate with customers, partners, and open-source communities to ensure ecosystem compatibility and adoption.
Mentor software engineers and provide technical leadership across firmware, driver, and networking software development.
Requirements
BSc/MSc in Computer Science, Electrical Engineering, or a related field.
7+ years of experience in software architecture, networking software, or system software development.
Strong experience developing Linux kernel drivers, firmware, and user-space networking software.
Deep understanding of data center networking, including Ethernet, TCP/IP, routing, switching, and congestion management.
Proven experience defining software architectures that span hardware, firmware, kernel, and user-space components.
Strong programming skills in C/C++ and experience with Linux-based development environments.
Experience leading cross-functional technical initiatives and collaborating with hardware and system architecture teams.
Excellent analytical, debugging, and performance optimization skills.
Nice to Have
Experience with RDMA technologies and low-latency networking architectures.
Experience with libfabric, libibverbs, rdma-core, DPDK, SPDK, or similar infrastructure software.
Familiarity with GPU communication technologies such as GPUDirect RDMA, NCCL, NVLink, or UALink.
Experience optimizing communication for distributed AI/ML workloads.
Contributions to open-source networking or Linux kernel projects.
Experience working on SmartNICs, DPUs, NICs, or networking ASICs.
Deep understanding of GenAI/ML infrastructure and distributed workloads.