Shape the future of global data science infrastructure as a Site Reliability Engineer at EPAM. You'll architect, implement, and support cutting-edge platforms like Alteryx, Dataiku, and Azure Machine Learning that power critical business insights across wealth management, investment banking, and corporate functions.
At EPAM, you'll work with modern technologies, solve complex challenges, and help drive digital innovation. With access to continuous learning, mentorship, and global projects, you'll apply your expertise to deliver meaningful change.
RESPONSIBILITIES
- Design and implement robust infrastructure solutions to support enterprise-scale data science platforms across multiple global regions
- Provide expert-level production support for engineering teams and business stakeholders using MongoDB-based data science environments
- Develop automation frameworks that enhance system reliability, performance monitoring, and incident response capabilities
- Troubleshoot and resolve complex technical incidents as a Problem Manager, ensuring minimal disruption to business operations
- Collaborate with cross-functional teams to continuously improve core infrastructure and implement modern data science initiatives
- Ensure all platform changes and enhancements adhere to operational guidelines and compliance requirements across international jurisdictions
REQUIREMENTS
- 2+ years of hands-on administrative experience with data science platforms such as Alteryx Server, Dataiku, or Azure Machine Learning
- Strong MongoDB performance monitoring and optimization skills with a focus on automation and reliability
- Demonstrated proficiency in DevOps practices using Unix Shell, Python, PowerShell scripting, or other programming languages
- Proven ability to analyze complex problems, design effective solutions, and implement technical improvements at scale
- Experience influencing IT stakeholders and business partners in enterprise technology environments
- Willingness to participate in an occasional on-call rotation and weekend support for critical activities
- Unix and/or Windows administration experience (Optional)
WE OFFER
- Medical, Dental and Vision Insurance (Subsidized)
- Health Savings Account
- Flexible Spending Accounts (Healthcare, Dependent Care, Commuter)
- Short-Term and Long-Term Disability (Company Provided)
- Life and AD&D Insurance (Company Provided)
- Employee Assistance Program
- Unlimited access to LinkedIn Learning solutions
- Matched 401(k) Retirement Savings Plan
- Paid Time Off
- Legal Plan and Identity Theft Protection
- Accident Insurance
- Employee Discounts
- Pet Insurance
- Employee Stock Purchase Program