Responsibilities
- Proactively monitor the production environment to maintain 24×7×365 service uptime.
- Manage a secure and high-performance network to support daily transaction processing, ensuring systems remain up and running.
- Perform regular monitoring of alerts to proactively identify potential issues and optimize system and network performance.
- Collaborate with cross-functional teams to troubleshoot and resolve system and network issues; work with Card Networks when required.
- Conduct weekly server maintenance, including system upgrades and patch management on Linux servers.
- Create and maintain detailed documentation of system and network configurations, procedures, and troubleshooting guides.
- Participate in Incident Response and Disaster Recovery planning and testing to ensure continuity of operations during incidents.
- Collaborate closely with IT and Operations teams to align system and network security measures with company policies and regulatory standards.
- Stay current on industry trends, emerging technologies, and security threats to enhance system reliability and resilience.
- Participate in an on-call rotation to provide off-hours support and respond promptly to system or network issues.
- Monitor system performance and respond to alerts and incidents, performing troubleshooting, root cause identification, and escalation when necessary.
- Proactively restart systems or services as needed during outages.
- Update system status on the company website during outages or service disruptions.
- Notify internal stakeholders of outages and send bulletins when outage thresholds are exceeded.
- Serve as the first point of contact for incidents during on-call or graveyard shifts.
- Participate in daily audit updates and documentation activities during on-call shifts.
- Update internal alert pages and notify the team in case of anomalies.
- Create and maintain a centralized documentation page for all change tickets.
- Maintain an up-to-date log of change tickets, including ticket number, requester, and status.
- Perform daily and weekly scheduled tasks as defined by procedures and provide status updates to the team.
- Participate in on-call handovers and communicate effectively with team members.
Qualifications
- Bachelor's Degree in Information Technology, Computer Science, or a related field.
- Day shift, on-site in Palo Alto, CA.
- Previous experience as a Linux Administrator.
- In-depth knowledge of Linux (Ubuntu).
- Strong understanding of networking principles, protocols, and technologies.
- Hands-on experience with systems, routers, switches, firewalls, and related hardware.
- Relevant experience in system or network administration or a similar role.
- Willingness and availability to work on-call hours, including nights, weekends, and holidays.
- Strong analytical and problem-solving skills with attention to detail.
The compensation for this position is $100,000. However, base pay offered may vary depending on job-related knowledge, skills, and experience. A full range of medical, financial, and/or other benefits is also offered, dependent on the position.
Benefits
TabaPay offers the following benefits:
- 100% employer-paid health care insurance including medical, dental, vision, and life insurance (for employee only)
- Employer 401(k) Matching
- Generous and Flexible PTO