🔥Let’s Do DevOps: Shard GitHub Actions Workload over Many Concurrent Builders

Kyler Middleton
9 min read · Apr 24

This blog series focuses on presenting complex DevOps projects as simple and approachable via plain language and lots of pictures. You can do it!

Hey all!

I’ve recently been building a tool I’m calling the GitHub Cop. It processes all 1.5k+ repos in my GitHub Org and sets all the settings, branch protections, auto-link rules, permissions, etc. That’s a lot of changes each night, and every repo is processed sequentially.

To do that, I’ve built several tools, including an API token circuit breaker to avoid running out of API tokens when doing tens of thousands of actions in a short time, and a way to paginate API calls when processing thousands of attributes. None of those, though, help speed up the single-threaded processing of thousands of repos.

What would help is having several threads working concurrently. Thankfully, GitHub Actions makes this pretty easy to do! I did have to teach bash some arithmetic, but I was able to make it work.

Let’s talk about the theory, then how I implemented this change in my custom bash program, the limitations I faced, and how I solved them.

GitHub Actions Theory: Matrix

Sharding (a word not recognized by my computer’s dictionary) is the practice of splitting a single workload into several parts, then assigning each of those work “shards”, or fragments, to one of several builders. It’s usually assumed that the shards are then processed concurrently. It’s a method of reducing the total time a workload runs by splitting it across several runners.
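To make the idea concrete, here is a minimal bash sketch of one way to shard a list: round-robin by position. The function name `items_for_shard` and the repo names are hypothetical, made up for illustration, and this is not necessarily how the GitHub Cop splits its work.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print the items that belong to one shard.
# Usage: items_for_shard SHARD_INDEX SHARD_COUNT item...
items_for_shard() {
  local shard_index=$1 shard_count=$2
  shift 2
  local i=0 item
  for item in "$@"; do
    # Round-robin: the item at position i goes to shard (i mod shard_count).
    if (( i % shard_count == shard_index )); then
      printf '%s\n' "$item"
    fi
    i=$(( i + 1 ))
  done
}

# Example: five placeholder repo names split across two builders.
repos=(repo-a repo-b repo-c repo-d repo-e)
items_for_shard 0 2 "${repos[@]}"   # builder 0 gets repo-a, repo-c, repo-e
items_for_shard 1 2 "${repos[@]}"   # builder 1 gets repo-b, repo-d
```

Each builder only needs two numbers, its own index and the total shard count, to work out its slice of the list without talking to the other builders.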

Because this is a foundational topic in computing, GitHub Actions makes it relatively easy with the matrix strategy. In math, a “matrix” is a rectangular array or table where numbers are stored and often combined in a predictable way.

In a GitHub Actions context, a matrix is a set of values that are combined and handed off to the steps you list on the job. If you have a single matrix value like attribute: [foo, bar], the job will run twice: once with attribute set to foo and once with attribute set to bar.
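If you want to see that expansion for yourself, the bash sketch below writes a tiny demo workflow that uses the attribute: [foo, bar] matrix from the paragraph above. The file name `matrix-demo.yml`, the job name `demo`, the manual `workflow_dispatch` trigger, and the `WORKFLOW_DIR` override are all assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: create a minimal workflow that demonstrates the matrix strategy.
set -euo pipefail

# Workflows live in .github/workflows; WORKFLOW_DIR is an optional override for testing.
workflow_dir="${WORKFLOW_DIR:-.github/workflows}"
mkdir -p "$workflow_dir"

# The quoted 'EOF' keeps bash from touching the ${{ ... }} expression below.
cat > "$workflow_dir/matrix-demo.yml" <<'EOF'
name: matrix-demo
on: workflow_dispatch
jobs:
  demo:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        attribute: [foo, bar]
    steps:
      - name: Show this job's matrix value
        run: echo "attribute is ${{ matrix.attribute }}"
EOF
```

Once the file is committed and the workflow is triggered by hand, the run should show two copies of the demo job, one per matrix value, each on its own runner.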
