Our Grafana Cloud pipeline moves millions of data points, log lines, and traces per second from our customers' environments into a highly available, low-latency stack that processes and stores the data, and serves it to dashboards and alerting tools. We aim to grow this to hundreds of millions per second, and it's critical that as we grow, we improve our performance, increase our reliability, and do it all more efficiently.
Cloud roles at Grafana Labs require engineers who have a passion for performance and reliability and who enjoy taking projects from conception to production. Grafana Cloud hosts services in Kubernetes. Our Cloud Platform squad owns and maintains the platform that delivers Kubernetes and its required complementary services to Grafana Engineering, and it designs, implements, and maintains the virtual network infrastructure.
Because we deploy production services, we have on-call rotations to ensure the health of the system. We dogfood our own services, so being on call is an important way to understand our system and how to use the products we create.
Our culture is remote-first, and our engineering organization is largely remote. We provide guidance and meet regularly over video calls, and we need people who can work independently and communicate well. Even if you are located near one of our small offices, working from home is both common and encouraged. Our teams also plan in-person team-building meetups and gather to attend industry conferences.
We care deeply about open source, and our projects are generally open source. Check them out: https://github.com/grafana.
We primarily use Go and Jsonnet.
About the role:
We are looking for an experienced software or site reliability engineer to join the Grafana Labs R&D team. We are hiring for the Cloud Platform squad that provides the platform on which Grafana Cloud delivers its services.
Responsibilities:
- Maintain and improve Grafana Labs’ provisioning and release tools, allowing rapid deployment of infrastructure and services
- Maintain and improve Grafana Labs’ monitoring tools and best practices to maximise system uptime and health
- Provision and administer the core infrastructure platform, Kubernetes
- Provision and administer the required Cloud Service Provider resources
- Work with other engineering teams to help them deploy and run their software in production
Requirements:
- Commercial experience as a site reliability, network and/or software engineer in Cloud and Linux environments, especially with distributed architectures
- Programming experience -- we use Go, Python and Shell
- Experience with containers and orchestration -- we use Docker and Kubernetes
- Proficiency with infrastructure as code and/or configuration management -- we use Terraform and Tanka/Jsonnet
- Experience with dashboards and monitoring tools like Grafana and Prometheus
Nice to have:
- Commercial experience in designing and managing networking in a Virtual Private Cloud
- Commercial experience of network services, including load balancers, firewalls and DNS
- Commercial experience of layer 2 and layer 3 networking, including VLANs and VPNs
- Commercial experience of the IP protocol suite, including BGP and NAT
- Experience working in remote and/or distributed business environments, demonstrating self-motivation and communication skills
Benefits:
- Flexible hours
- The equipment you need to get the job done
- Generous vacation policy of 30 days per annum with national holidays in your country of residence on top
- Grafana operates in 32+ countries. We try to operate as one team and focus on global benefits that our whole team can enjoy. Inevitably there are some regional variations, and we discuss the benefits offered in your country of residence during our interview process.
- We offer a competitive healthcare plan (Medical, Dental & Vision) for our US-based employees via our co-employer JustWorks.
- We offer a 4% employer contribution match on our 401(k)/pension plans or a one-time 4% salary increase after 6 months' tenure, depending on your location
- In the United States, the Base (OTE for commission positions) compensation range for this role is $149,000 - $210,000. Actual compensation may vary based on level, experience, and skillset as assessed in the interview process. Benefits include equity, bonus (if applicable) and other benefits listed here.