Minimum qualifications:
- Bachelor's degree or equivalent practical experience.
- 5 years of experience designing, analyzing and troubleshooting large-scale distributed systems.
- 5 years of programming experience in Java, C or Golang.
- Experience developing with Spark, Hive, or similar engines.
- Experience in benchmarking and building custom benchmarks.
- Experience in developing cloud or software as a service (SaaS) products.
Preferred qualifications:
- Master's degree or PhD in Computer Science or a related technical field.
- Experience with data lake table formats such as Apache Iceberg, Apache Hudi, or Delta Lake.
- Experience with database optimizations, such as query and executor optimizations.
- Experience working with data science tools such as Jupyter notebooks.
- Experience with OpenTelemetry, JMX, and other monitoring solutions.
- Contributions to Apache or similar open-source projects such as Iceberg, Delta, Hudi, Spark, Presto, or Flink.
About the job
The US base salary range for this full-time position is $174,000-$252,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.
Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.
Responsibilities
- Build customer-facing features for Managed Service for Apache Spark (formerly Dataproc) to run Spark in the cloud.
- Drive technical design and execution for performance and lakehouse features and enhancements.
- Enhance Apache Spark and lakehouse technologies like Iceberg or Delta Lake for performance, reliability, security, and monitoring.
- Contribute to documentation or educational content based on product updates and user feedback, and extend open-source technologies like Apache Spark, Flink, Hive, and Trino to improve debuggability, observability, and supportability.
- Review code developed by other developers and provide feedback to ensure adherence to style guidelines, proper code check-in, accuracy, testability, and efficiency.