Let's Do DevOps: Shard GitHub Actions Workload over Many Concurrent Builders
This blog series focuses on presenting complex DevOps projects as simple and approachable via plain language and lots of pictures. You can do it!
Hey all!
I've recently been building a tool I'm calling the GitHub Cop. It's a tool that processes all 1.5k+ repos in my GitHub Org and sets all the settings, branch protections, auto-link rules, permissions, etc. That's a lot of changes each night, and on every repo sequentially.
To do that, I've built several tools, including an API token circuit breaker to avoid running out of API tokens when doing tens of thousands of actions in a short time, and a way to paginate API calls when processing thousands of attributes. None of those tools, however, speeds up the single-threaded processing of thousands of repos.
What would help is having several threads working concurrently. Thankfully, GitHub Actions makes this pretty easy to do! I did have to teach bash some arithmetic, but I was able to make it work.
Let's talk about the theory, then how I implemented this change in my custom bash program, and finally the limitations I faced and how I solved them.
GitHub Actions Theory: Matrix
Sharding (a word not recognized by my computer's dictionary) is the practice of splitting a single workload into several parts, and then assigning each of those work "shards," or fragments, to one of several builders. It's usually assumed that the shards are then processed concurrently. It's a method of reducing the total time a workload runs by splitting it across several runners.
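To make the idea concrete, here is a minimal bash sketch of one possible way to split a list into shards: round-robin assignment by position, using modulo arithmetic. The function name shard_items and the repo names are hypothetical, chosen only for illustration, and this is not necessarily the scheme used later in this article.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: read items (one per line) on stdin and print only
# the ones that belong to shard $1 out of $2 total shards.
shard_items() {
  local shard_index=$1 shard_count=$2
  local i=0 item
  while IFS= read -r item; do
    # Round-robin: item number i goes to shard (i mod shard_count)
    if (( i % shard_count == shard_index )); then
      printf '%s\n' "$item"
    fi
    i=$(( i + 1 ))
  done
}

# Illustrative repo names, not real ones
repos=(repo-a repo-b repo-c repo-d repo-e)

# Show which repos each of 2 shards would process
for shard in 0 1; do
  assigned=$(printf '%s\n' "${repos[@]}" | shard_items "$shard" 2 | paste -sd ' ' -)
  echo "shard ${shard}: ${assigned}"
done
```

In a real run, each shard would be handled by a different builder at the same time, rather than by a loop on one machine as in this sketch.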
Because this is a relatively foundational topic of computing, GitHub Actions makes this relatively easy using the matrix strategy. In math, a "matrix" means a rectangular array or table where numbers are stored and often combined in a predictable way.
In a GitHub Actions context, a matrix is a set of values that are combined and handed off to the tasks you list on the job. If you have a single matrix value like attribute: [foo, bar], the job will be run twice, once where attribute is set to foo and once where attribute is set to bar.
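As a rough sketch, a workflow using that single-value matrix might look like the following. The workflow name, job name, trigger, and echo step are assumptions made for illustration; only the attribute: [foo, bar] matrix comes from the text above.

```yaml
# Hypothetical workflow: names and trigger are illustrative only
name: matrix-demo
on: workflow_dispatch

jobs:
  demo:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # One matrix key with two values, so this job runs twice
        attribute: [foo, bar]
    steps:
      # Each run sees its own value via the matrix context
      - run: echo "attribute is ${{ matrix.attribute }}"
```

GitHub Actions creates one copy of the job for each value, and by default those copies can run at the same time on separate runners, which is what makes the matrix useful for sharding.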