Systems Performance Engineer 3 in Phoenix, AZ at MDI Group

Date Posted: 6/10/2019

Job Description


Systems Performance Engineer 3

Location: Phoenix, AZ

The role:

As we continue to migrate from monolithic applications to containerized distributed services, we face increasing challenges observing the health and performance of our systems due to the increasing complexity of our hosting environment. Single-stack or siloed monitoring tools can no longer capture the full depth of the increasingly unpredictable and complex interactions between services. The maturity of the monitoring and observability tools available to our engineers will determine both the overall availability of our business services and our ability to correctly anticipate performance bottlenecks, identify anomalous system interactions, and properly diagnose the root cause of issues.

What you will do:

The SPG team’s mission is to build an integrated tool suite designed to proactively and intelligently monitor and alert on capacity, performance, and availability across cloud and on-premises environments. We are in the process of developing an enterprise platform for instrumenting, monitoring, alerting on, and visualizing the state of our systems through metrics, logging, and tracing. A successful candidate will be capable of implementing different tools and building integrations between application monitoring tools, team collaboration tools, and event management tools.

Skills you have:

  • 5 years of hands-on experience supporting the system performance and monitoring/alerting needs of application and infrastructure/networking teams. Working experience on a site reliability engineering (SRE) team.
  • Experience with the following cloud-native and DevOps concepts:
    • Infrastructure as Code
    • Continuous delivery pipelines (CI/CD)
  • Experience in one or more scripting languages (Python (preferred), Go, PowerShell, Perl, etc.)
  • Experience in any cloud technologies, preferably AWS
  • Experience administering one or more of the following tools (or similar tools):
  • White-box Monitoring Tools:
    • SolarWinds NPM, SAM, WPM, DPA, SRM, NCM, NTA, NetPath, PerfStack
    • AppDynamics (APM)
    • Dynatrace Synthetics
    • Prometheus (open source)
  • Black-box Monitoring Tools:
    • Log Aggregation (Elasticsearch, Logstash, Kibana)
    • Tracing (OpenTracing, Istio, Chrome DevTools, Zipkin)
    • Profiling (JProfiler)
    • Event relay (Kafka)
  • Experience with Web Services (REST and SOAP APIs, XML, JSON, HTML)
  • Proficiency with distributed source control (GitHub)
  • Ability to tackle complicated technical challenges
  • Ability to work collaboratively across teams to understand their requirements and deliver appropriate observability tools and best practices to empower application and infrastructure owners to ensure maximum availability of their applications with quick recovery capabilities.
  • Be a subject matter expert (SME) in the observability and monitoring/alerting space. Continue to develop your own knowledge and skills, stay informed on the latest developments in the field, and assist with the skill development of peers.
  • Exceptional analytical skills
  • Exceptional verbal, written and listening communication skills
  • Ability to uphold Values & Performance Principles of collaboration, performance excellence, sense of urgency, openness to new ideas, inclusion & diversity, integrity, customer focus, and respect.

