About Us
Key Responsibilities:
- Identify operational inefficiencies and automation opportunities within monitoring workflows and infrastructure.
- Design and implement automated solutions for deployment, configuration, and scaling of monitoring tools using Infrastructure-as-Code (IaC) technologies such as Terraform, Ansible, Puppet, or similar.
- Leverage REST APIs of platforms like Zabbix, SolarWinds, Prometheus, and Grafana to streamline and standardize monitoring setup and management.
- Develop reusable automation assets—scripts, templates, and modules—to ensure consistent monitoring practices across diverse environments.
- Integrate monitoring systems with alerting, ticketing, and reporting platforms to enable seamless incident management and visibility.
- Establish tagging strategies and observability standards to ensure uniform data collection and traceability across services.
- Support incident response by building automated diagnostics and enriching telemetry data for faster root cause analysis.
- Collaborate cross-functionally with DevOps and SRE teams to align monitoring automation with CI/CD pipelines and operational goals.
Tech Skills:
Infrastructure as Code (IaC) & Automation
- Terraform
- Ansible
- Puppet
- Scripting languages: Python, Bash, PowerShell; remote execution over SSH
Monitoring & Observability Tools
- Zabbix
- SolarWinds
- Prometheus
- Grafana
- Datadog, New Relic, or Dynatrace (as alternatives or complementary tools)
API Integration & Automation
- Experience working with REST APIs for automation and integration
- Familiarity with JSON, YAML, and HTTP methods (GET, POST, PUT, DELETE)
CI/CD & DevOps Tooling
- Jenkins, GitLab CI, GitHub Actions, or similar
- Docker and Kubernetes (for containerized environments)
Alerting & Incident Management Integration
- ServiceNow, Jira, VictorOps, xMatters, or similar
- Knowledge of event correlation and automated diagnostics
Cloud Platforms (optional)
- AWS, Azure, or Google Cloud Platform
- Cloud-native monitoring tools like CloudWatch, Azure Monitor, or Google Cloud Operations Suite
Preferred Qualifications:
Soft Skills & Operational Mindset
- Strong problem-solving and gap analysis capabilities
- Ability to identify low-hanging fruit for automation
- Experience in cross-functional collaboration (DevOps, SRE, IT Ops)
- Understanding of observability principles and tagging strategies
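Commitment to Equality
Coupang has always been committed to equality among its employees. Our unprecedented success would not have been possible without the efforts of our globally diverse team.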