- Evaluate, design, build, and maintain the Media Cloud back-end server architecture and data pipeline, including the creation of technical specifications and test scenarios;
- Establish and clearly communicate to other staff a technical vision for Media Cloud's data architecture;
- Collaborate with other developers, designers, and system administrators to develop and implement a technical roadmap to meet research needs and complete responsibilities;
- Write new code, architected for scalability, so that systems can handle rapidly growing data requirements;
- Build, maintain, and upgrade systems within the existing codebase;
- Troubleshoot existing code, fix bugs, prioritize features, and improve back-end data tools to support high-volume usage;
- Communicate project status internally and externally;
- Serve as a technical resource for both the technical and research teams;
- Other duties as assigned.
This position is not eligible for visa sponsorship.
- At least seven years' experience working as a software engineer/architect on big data-related projects;
- Programming fluency in Python;
- Five or more years of experience working with text-based data systems (e.g., NLP), PostgreSQL databases, or Solr databases;
- Experience writing, maintaining, and optimizing SQL queries against large databases and scaling platforms to handle large data sets;
- Experience implementing and maintaining a production ETL pipeline;
- History of crafting, building, testing, and deploying robust code;
- Demonstrated ability to iterate quickly through prototypes;
- Demonstrated ability to use data to validate architectural decisions;
- Interest in working on issues related to hate speech, democracy, gender, race, or health;
- Experience working on diverse teams, with strong interpersonal skills to interact with different disciplines.
- Bachelor's degree;
- Programming fluency in Perl;
- Experience writing web crawlers or API scrapers;
- Demonstrated ability to scale platforms to handle more users;
- Real passion for solving difficult engineering and data problems;
- Knowledge of and interest in social sciences.