Full Job Description
Description
APPLE INC has the following available in Cupertino, California and various unanticipated locations throughout the USA.
Responsible for the design, development, and operation of large-scale distributed storage, retrieval, and data-processing systems that support real-time and high-performance AI/ML workloads. Architect and implement core infrastructure for an in-house key-value store with high availability, a vector database, a semantic caching orchestration pipeline, and multi-region data movement workflows. Build and maintain low-latency, fault-tolerant systems used to serve embeddings and similarity-search queries at scale, and data pipelines for AI/LLM applications. Design and implement new features of an in-house distributed key-value database, including data structures, storage formats, optimization strategies, and mechanisms for consistency and fault tolerance. Design and optimize a vector database engine supporting vector embeddings, similarity search algorithms, indexing algorithms, low-latency retrieval used for LLM inference, and ranking. Leverage popular ANN families (HNSW, IVF, PQ) to implement efficient, scalable, and high-performance similarity search solutions.
Architect and implement semantic caching components that automatically infer and store frequently accessed or semantically related data, such as embeddings, tokenized text, or LLM prompt contexts, to reduce compute and storage overhead. Implement LLM-grounded validation to ensure the accuracy and robustness of cached data. Build and monitor multi-region data-copying workflows, including geo-distributed replication strategies, change-data-capture pipelines, failure-recovery logic, and consistency guarantees. Create and execute performance benchmarking frameworks to measure throughput, latency, memory usage, and index-build efficiency for storage, search and retrieval, and caching systems.
Investigate and resolve production issues related to data correctness, memory/CPU hotspots, and distributed system failures. Implement and enforce observability features, including metrics, logging, tracing, and alerting, to ensure reliability and debuggability of core infrastructure components. Collaborate on architectural decisions and system interfaces to ensure alignment with engineering client teams, AI/ML researchers, product teams, and SRE teams, and seamless integration with cloud infrastructure components.
40 hours/week.
At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $226,138 and $272,100/yr, and your base pay will depend on your skills, qualifications, experience, and location.
PAY & BENEFITS: Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. You'll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits: https://www.apple.com/careers/us/benefits.html.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.
Minimum Qualifications
Master's degree or foreign equivalent in Information Technology - Mobility or related field and 3 years of experience in the job offered or related occupation.
3 years of experience with each of the following skills is required:
Architecting and implementing production-grade distributed databases, including sharding mechanisms, replication protocols, and query processing engines, supporting both exact-match queries and large-scale vector similarity search.
2 years of experience with each of the following skills is required:
Architecting cloud-native infrastructure including distributed processing frameworks (Spark or Flink), queueing systems (Kafka or Pulsar), and distributed databases (Apache Ignite or similar).
Customizing and implementing distributed consensus algorithms (Raft or Paxos) to achieve trade-offs between performance and consistency in production systems.
Designing and deploying infrastructure components of machine learning systems for both training (PyTorch) and inferencing workloads in production environments.
Implementing and enhancing vector similarity search algorithms using approximate nearest neighbor (ANN) algorithms (FAISS or HNSW), search indexing algorithms, and embedding generation pipelines for production machine learning applications.
Implementing search indexing systems including inverted indexes, relevance ranking algorithms (TF-IDF or BM25), and tokenization pipelines for text-based search in production database systems.
1 year of experience with each of the following skills is required:
Building and deploying supporting infrastructure for real-time inference workloads.
Modifying or extending foundational key-value stores (LevelDB or RocksDB) for production environments, including tuning performance, implementing customized features, or optimizing storage formats.
Architecting and optimizing vector search systems including customizing indexing algorithms (HNSW or IVF) and optimizing query execution to reduce latency in production environments.
Preferred Qualifications
N/A