As a Data Engineer within the IDEA organization, you will lead the design, implementation, and successful delivery of large-scale, critical and complex data architecture, storage and pipelines that improve the lives of tens of millions of people every single day. The IDEA organization applies software and data engineering that helps power long-term fundamental investing at > $1 trillion scale. In the Data Engineer role, you will work with experienced data engineers while making technical decisions around how best to serve our customers. If you are looking to join a team that offers autonomy, plenty of space to run with new ideas, working in a nurturing environment, and challenging problems to solve at a significant scale, this might be the job for you!
What you’ll be doing:
- Build large-scale distributed data processing systems, data lakes, and optimize for both computational and storage efficiency on cloud platforms like AWS.
- Design, implement, and automate data pipelines that source data from internal and external systems, transforming the data to meet the needs of downstream systems.
- Design data schema and operate cloud-based data warehouses and SQL/NoSQL/temporal database systems.
- Develop batch and streaming pipelines using technologies including Apache Spark and Kafka.
- Own the design, development and maintenance of ongoing metrics, reports, analyses, dashboards, etc. to drive key business decisions.
- Monitor and troubleshoot operational or data issues in the data pipelines.
- Drive architectural plans and implementation for future data storage, ETL, reporting, and analytic solutions.
- Provide insightful code reviews, receive feedback constructively, and take ownership of outcomes ("you ship it, you own it"), working efficiently and routinely delivering the right things.
Your background and who you are:
- You have a background in data and software engineering and a passion to learn.
- You've made mistakes in the past and have learned a lot from them. You apply this learning regularly.
- You believe there are generally multiple ways to solve a technical problem, each with different trade-offs.
- You approach projects, tasks, and unknowns with curiosity, and enjoy sharing what you know and what you learn with the people around you.
- You believe that a team is strongest when it is diverse and includes multiple perspectives.
- You are able to put yourself into your customer's shoes. You frequently immerse yourself in the customer experience to understand how you can better serve them.
- BS in Computer Science or related field, or an equivalent in relevant work experience.
- Experience implementing big data processing technology: Hadoop, Apache Spark, Kafka, etc.
- 5+ years of coding proficiency in at least one modern programming language (Scala, Python, or Java).
- Experience writing and optimizing advanced SQL queries in a business environment with large-scale, complex datasets.
- 3+ years' experience in cloud-first design, preferably AWS (S3, Kubernetes, dynamic autoscaling, etc.).
- Experience in data architecture, databases (e.g., SQL Server, Oracle, PostgreSQL), SQL, and DDD/ER/ORM design. Experience with NoSQL databases (such as graph databases, wide column stores, etc.) is a plus.
- Interest and curiosity in emerging web technologies like GraphQL, WebAssembly, MLaaS, etc.
- Knowledge of software engineering practices & best practices for the software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.