Full Job Description
Cornelis Networks is seeking a talented Linux Kernel and Driver Developer to architect and optimize our next-generation High-Performance Computing (HPC) and Artificial Intelligence (AI) fabric software stack.
Your mission will center on the development, optimization, and upstream maintenance of host driver software, focusing on our open-source `hfi1` kernel driver and our high-performance user-space Omni-Path Express (OPX) libfabric provider. You will collaborate directly with silicon architects, hardware engineers, and the global open-source community to design software that scales to thousands of nodes with sub-microsecond latency. This is a remote role open to candidates based in the United States.
Key Responsibilities:
- Design & Optimize Device Drivers: Develop, maintain, and upstream the open-source `hfi1` kernel driver and related subsystems (such as InfiniBand verbs and RDMA core).
- Hardware-Software Co-Design: Partner closely with silicon architects and hardware developers to define register interfaces, MMIO pathways, command queues, and hardware-software contracts.
- Develop Zero-Copy Data Paths: Design and optimize low-latency, high-throughput DMA and RDMA transport engines, minimizing buffer copies and maximizing CPU-bypass capabilities.
- Debug Complex Kernel Concurrency: Identify and resolve intricate kernel-space race conditions, deadlocks, and memory issues under heavy multi-threaded, asynchronous networking workloads.
- Upstream & Community Engagement: Actively submit patches, participate in code reviews, and represent Cornelis on the Linux Kernel Mailing List (LKML) and in open-source networking communities.
- Package & Build Automation: Maintain and optimize system build environments, kernel-module build and packaging tooling (Kbuild, DKMS, RPM), and automated integration tests.
Required Qualifications:
- Education: BS, MS, or Ph.D. in Computer Science, Computer Engineering, or a related field (or equivalent practical experience).
- Kernel-Space Mastery: 3+ years of professional experience writing production-grade C code inside the Linux kernel (loadable kernel modules, memory management, or interrupt handlers).
- High-Speed Networking Protocol Knowledge: Direct experience with RDMA, InfiniBand (IB) Verbs, RoCE, or high-performance user-space bypass frameworks (such as libfabric / OFI or DPDK).
- Hardware Interface Fundamentals: Strong understanding of PCIe architectures, DMA engines, memory mapping (`mmap`), and MMIO.
- Advanced Kernel Debugging: Hands-on proficiency with kernel analysis tools including `KASAN`, `kmemleak`, `ftrace`, `tracepoints`, and `kprobes`, as well as kernel crash dump analysis.
- Scripting & Automation: Proficiency in scripting languages (e.g., Python, Bash) for automated testing and performance profiling.
Nice to Haves:
- Active track record of contributions to the upstream Linux kernel on `kernel.org` (specifically under `drivers/infiniband/` or `drivers/net/`).
- Familiarity with in-kernel storage and file system protocols (e.g., Lustre, NFS, SRP).
- Experience with GPU-direct communication technologies (e.g., GPUDirect RDMA, DMA-buf).