Bugcrowd

Reinforcement Learning Infrastructure (Cybersecurity)

Bugcrowd: $176K - $242K
US-Anywhere, Remote in United States
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of experience in reinforcement learning and AI development
  • Strong background in system architecture and engineering
  • Proficient in Python and C; Rust is a plus
  • Experience with DevOps practices, including CI/CD pipelines
  • Familiarity with vulnerability research and software exploitation techniques
  • Comfortable working within Linux environments and low-level debugging
  • Experience with large open-source codebases and benchmark environments

Responsibilities

  • Design and build systems for generating reinforcement learning environments
  • Develop pipelines that automate software project ingestion and analysis
  • Utilize Bugcrowd's Mayhem platform for vulnerability analysis
  • Create infrastructure for training next-generation AI models
  • Collaborate with AI labs like Anthropic and OpenAI for system enhancement
  • Focus on building sustainable, reproducible environments
  • Explore innovative methods for security challenge development

Benefits

  • 100% remote work environment
  • Supportive company culture that values diversity
  • Access to cutting-edge AI and cybersecurity projects
  • Opportunity for career growth in a pioneering tech company
  • Flexible working hours to promote work-life balance
Full Job Description
Job Summary

The Bugcrowd RL and Reasoning Team focuses on pushing the boundaries of autonomous cybersecurity by building authentic reinforcement learning environments for foundation model companies. As a Staff Engineer, you will advance the frontier of AI reinforcement learning development and delivery. You will build the infrastructure and tooling that transforms real-world vulnerability research into large-scale reinforcement learning environments used to train next-generation AI systems.

This role is unique. You will help create the training environments that teach AI systems how to hack and defend software. Your work will directly influence the capabilities of the next generation of AI models. Instead of building a single application, you will build the infrastructure that generates thousands of environments used to train frontier AI systems.

Our team works at the intersection of AI, security research, and systems engineering, building environments that allow models to learn skills such as vulnerability discovery, exploitation, and remediation.

Essential Duties and Responsibilities

If you enjoy building high-performance systems that power cutting-edge AI research, this role is for you.

This role focuses on building the systems that generate RL environments, not just the environments themselves. You will design pipelines that ingest software projects, analyze them with Bugcrowd's Mayhem platform, and automatically construct training environments used by frontier AI labs including Anthropic, OpenAI, and Cohere.

The ideal candidate is a strong systems engineer with:
  • An understanding of reinforcement learning workflows
  • Experience building clean, reproducible Linux ML environments (containers, MCP, etc.)
  • A system security background in binary exploitation, including buffer overflows, fuzzing, and x86/x86-64
  • Experience developing applications in Python and C, with Rust a plus

Education, Experience, Knowledge, Skills, and Abilities

  • Understanding of RL training workflows used by modern LLM systems
  • Experience with DevOps pipelines (e.g., GitHub Actions) and reproducible builds (Docker, BuildKit, Nix)
  • Proficiency in Python and C. Other languages (especially Rust) are a plus.
  • Understanding of software vulnerabilities, fuzzing, or program analysis
  • Experience with build systems and large open-source codebases
  • Comfort working with Linux systems and low-level debugging
  • Experience working with benchmark environments (CTFs, SWE-bench, security challenges, etc.) is a plus

Working Conditions and Physical Requirements

The ideal candidate must be able to complete all physical requirements of the job with or without reasonable accommodation.

Sitting and / or standing - Must be able to remain in a stationary position 50% of the time

Carrying and / or lifting - Must be able to carry / move laptop as needed throughout the work day.

Environment - remote, work-from-home 100% of the time.

Pay Range Disclosure

At Bugcrowd, we strive for fairness and equality, and to create an environment that allows our people to perform at their very best. Our compensation philosophy is to foster a collaborative community that rewards, attracts and retains the best possible talent. The provided salary details are based on US national averages and we retain the flexibility to tailor to the needs of the business.

The national estimate for the current base range for this position is $176,400 - $242,550.

This position may also be eligible to participate in a discretionary bonus program or commission plan, subject to the rules governing the program, whereby an award, if any, depends on various factors, including, without limitation, individual and organizational performance.

About Bugcrowd

Bugcrowd is a cybersecurity company that provides a platform for crowdsourced security testing. The company was founded in 2012 and is headquartered in San Francisco, California. Bugcrowd's platform allows companies to run bug bounty programs, which incentivize ethical hackers to find and report security vulnerabilities in their systems. Bugcrowd has raised over $80 million in funding and has over 200 employees.