How to choose between mono-repo and poly-repo
January 21, 2021
Over the past year or so, conversations about "mono-repo/monorepo" and "poly-repo/polyrepo" have come up quite frequently with my clients in the custom application space. The conversations go in both directions: some clients are moving from poly-repo to mono-repo, while others are considering breaking up a mono-repo into poly-repos. Today, I'd like to share my point of view on this topic.
What are mono-repo and poly-repo?
When you work with a large application or system, you will most likely want to break it up into manageable components. You have probably heard of microservices (and their newer sibling, microfrontends). Mono-repo and poly-repo are the two most common approaches to storing and organizing that code. Mono-repo, as its name suggests, means storing the code of your entire application in a single repository (such as a Git repo). You put microservices, UI components, and even shared libraries into separate directories under the same Git repo. Poly-repo is the opposite: each microservice, UI app, and shared library has its own repository.
Which approach is better?
From my observation, there is no consensus in the community on what is "better." From my perspective, each approach has its own benefits and drawbacks, and you will have to evaluate and understand the implications for your situation. Here are a few areas for consideration:
Mono-repo tends to make it easier to onboard someone. When you clone the repo, you get every piece of code that the application requires to run. With poly-repo, you need to know which repos to clone and which branch to get the code from in each one. If the repos don't all use the same branching strategy, that's even tougher. With a mono-repo, new team members can also browse the code easily in their favorite development tools, since the source for every component is cloned at once.
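To make the poly-repo onboarding cost concrete, here is a minimal sketch of the kind of helper a team might write so that newcomers do not have to remember every repo and branch. The repository URLs, branch names, and the clone_commands helper are all hypothetical; the only real tool assumed is the standard "git clone --branch" command. The script only builds the commands and does not run them.

```python
# Hypothetical manifest: every repo a newcomer needs, and the branch to use.
# The URLs and branch names are made up for illustration.
REPOS = [
    {"url": "https://git.example.com/acme/orders-service.git", "branch": "main"},
    {"url": "https://git.example.com/acme/billing-service.git", "branch": "develop"},
    {"url": "https://git.example.com/acme/web-ui.git", "branch": "main"},
]


def clone_commands(repos, workspace="workspace"):
    """Build one `git clone` command (as an argument list) per repository."""
    commands = []
    for repo in repos:
        # Derive the target directory name from the last URL segment.
        name = repo["url"].rstrip("/").rsplit("/", 1)[-1]
        if name.endswith(".git"):
            name = name[: -len(".git")]
        commands.append(
            ["git", "clone", "--branch", repo["branch"], repo["url"], f"{workspace}/{name}"]
        )
    return commands


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for command in clone_commands(REPOS):
        print(" ".join(command))
```

With a mono-repo, this manifest collapses to a single clone, which is the onboarding advantage described above.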
Run the entire application locally
Note that, in many cases, this is not a requirement. Your engineers may only work with one component at a time, and wire up the local instance with other services in a shared dev environment.
Continuous Integration/Continuous Delivery pipeline
Poly-repo has a lower overhead in this area. On a Continuous Integration server, it is common to have a build and release pipeline dedicated to each component. (Remember, one of the microservice principles is that services are independently deployable.) With poly-repo, the entire repository holds one component, so any change pushed to that repo triggers the test, build, and release pipeline for that component only. In the mono-repo world, you have to configure each component's pipeline to trigger only when the changeset includes files belonging to that component. Many Continuous Integration servers support this capability, but not all of them. (At one point Jenkins did not support this out-of-the-box, and would require a special plug-in and custom code to do it.) If files outside the component directories are changed, you may need to consider building every component.
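The path-based triggering described here can be sketched in a few lines. This is only an illustration of the idea, not how any particular CI server implements it: the component names, directory layout, and the components_to_build function are assumptions made up for the example. It takes the conservative option of rebuilding everything when a changed file lives outside all component directories.

```python
from pathlib import PurePosixPath

# Hypothetical mono-repo layout: component name -> directory in the repo.
COMPONENT_DIRS = {
    "orders-service": "services/orders",
    "billing-service": "services/billing",
    "web-ui": "ui/web",
}


def components_to_build(changed_files, component_dirs=COMPONENT_DIRS):
    """Return the sorted names of components whose pipelines should run."""
    affected = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        owner = None
        for name, directory in component_dirs.items():
            dir_parts = PurePosixPath(directory).parts
            # Compare whole path segments so "services/orders-v2" does not
            # accidentally match "services/orders".
            if parts[: len(dir_parts)] == dir_parts:
                owner = name
                break
        if owner is None:
            # A file outside every component directory changed (for example
            # a root-level config), so conservatively build every component.
            return sorted(component_dirs)
        affected.add(owner)
    return sorted(affected)


if __name__ == "__main__":
    print(components_to_build(["services/orders/api.py"]))
    print(components_to_build(["services/orders/api.py", "ui/web/index.ts"]))
    print(components_to_build(["README.md"]))
```

In a poly-repo setup none of this mapping is needed, because a push to a repo can only ever affect the one component that lives there.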
Mono-repo encourages tightly coupled dependencies between components in the repo, because all components sit next to each other in the file structure. (Maybe saying "encourages" isn't fair: it simply doesn't stop you, and it makes coupling easy to do.) This can be good or bad.
This is not directly related to mono- or poly-repo. But since using mono-repo can result in tightly coupled dependencies between components, your components may not be 100% independently deployable. Your release coordination effort will go up. Note: this concern can be addressed with the proper governance/review process—making sure developers don't build any tightly coupled relationships.
It is best to evaluate these concerns against the situation your teams are in. Some of them may have bigger impacts than others. In my experience, for a smaller team, a mono-repo is always a safe and easy way to start. Large and distributed teams would benefit more from poly-repo.