The Site Reliability Engineer is responsible for the development, implementation, and maintenance of the company's automated solutions to improve platform and application stability, performance, staff productivity, metrics, and reporting. The position requires deep and broad technical knowledge and works with cross-functional teams as they build, test, automate, maintain, troubleshoot, and document the company's engineered solutions.
Essential Job Duties:
1. Develop and oversee the engineering of automated solutions.
a. Improve platform and application stability
b. Increase solution performance
c. Improve staff productivity, metrics and reporting
2. Design, build, and maintain engineered solutions in collaboration with cross-functional teams and third-party vendors.
3. Manage the health of the service capacity through planning and demand forecasting, software performance analysis, and system tuning.
4. Perform advanced troubleshooting of incidents in mission-critical systems and participate in preventative problem management activities.
5. Partner with and influence operations sustainment to reduce risk in the ecosystem by driving toward automation and touchless operation as a design and architecture principle. Aggressively pursue and promote safety and soundness actions covering vulnerabilities, patching, end-of-life systems, and resiliency.
6. Work with cross-functional teams, such as the run and build teams, to automate builds and software deployments to testing and production environments.
7. Actively participate in organizational initiatives to define, measure, and report on automation scalability, performance and stability.
8. Conduct research and development on emerging technologies.
Essential Universal Job Duties:
• Improves self through certification programs or other methods to enhance job performance.
• Promotes the Company, its Mission, Core Values, programs, and achievements to the public and other employees.
o Core Values:
Continuous Improvement and Innovation
Sense of Urgency
• Functions as a team member by assisting, supporting, and encouraging other employees in any way possible.
• Performs related work as required, willingly and eagerly.
• Meets deadlines as required.
• Regular, predictable attendance is an essential function of this position.
• Is capable of working independently.
The above statements are intended to describe the general nature and level of work being performed by people assigned to this job. They are not to be construed as an exhaustive list of job duties performed by the personnel so classified.
Minimum Job Requirements:
Education: Bachelor’s degree in BCIS or Computer Science.
Related Experience: Two years working as a Site Reliability Engineer or in a related position.
• Skills and experience with CI/CD, Linux, containers, orchestration (Kubernetes), AWS, coding, and system capacity planning.
• Very effective written and oral communication skills as well as strong interpersonal skills. Must be able to communicate effectively with a wide variety of audiences, including software development teams, technical support, end users, and management.
• Outstanding organizational and time management skills.
• Full understanding of the Agile methodology (Scrum is preferred).
• Must be able to identify, analyze and solve complex problems in an efficient manner.
Desirable Training and Experience:
• Exposure to or experience with a process improvement methodology (e.g., Lean) preferred.
• Working knowledge of application lifecycle management tools.
• Additional related certifications.
Geo-Comm is an equal opportunity employer and does not discriminate in hiring or employment on the basis of race, creed, color, religion, sex, national origin, citizenship status, age, disability, marital status, familial status, sexual orientation, veteran status, public assistance status, or any other status protected by applicable law.
Geo-Comm Corporation provides a drug-free working environment.