this post was submitted on 17 Apr 2025
23 points (92.6% liked)

Selfhosted


Edit: it seems like my explanation turned out to be too confusing. In simple terms, my topology would look something like this:

I would have a reverse proxy in front of multiple Git server instances (say 5 for now). When a client pulls from or pushes to a repo, the request would go through the reverse proxy to one of the 5 instances. The changes would then be synced from that instance to the rest, giving me a highly available architecture.

Basically, I want a highly available git server. Is this possible?


I have been reading GitHub's blog on Spokes, their distributed system for Git. It's a great idea, except I can't find anywhere to download it so I can self-host it.

Any ideas on how I can run a distributed cluster of Git servers? I'd like to run it on 3+ VMs plus a VPS in the cloud, so that if something dies I still have a Git server running somewhere to pull from.

Thanks

[–] solrize@lemmy.world 1 points 1 year ago* (last edited 1 year ago) (1 children)

I see, fair enough. Replication is never instantaneous, so do you have definite bounds on how much latency you'll accept? Do you really want independent git servers online? Most HA systems have a primary and a failover, so users only see one server. If you want to use Ceph, in practice all servers would be in the same DC. Is that ok?

I think I'd look in one of the many git books out there to see what they say about replication schemes. This sounds like something that must have been done before.

[–] marauding_gibberish142@lemmy.dbzer0.com 1 points 1 year ago (1 children)

Well, it's a tougher question to answer for an active-active config than for a master-slave config, because the former needs the lowest possible latency as requests are bounced all over the place. For the latter, I'll probably set it up to pull every 5 minutes, so up to 5 minutes of latency (assuming no one tries to push right when the master node is going down).
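A minimal sketch of what that pull-every-5-minutes replica could look like, assuming the replica was created with `git clone --mirror` so that a plain fetch from `origin` updates every ref. The function names and the `run`, `sleep`, and `max_rounds` parameters are just illustrative (they make the loop easy to try without touching a real repo), not anything Git prescribes.

```python
import subprocess
import time

INTERVAL_SECONDS = 5 * 60  # the 5-minute pull interval mentioned above


def fetch_command(repo_dir):
    """Command that refreshes a mirror clone from its 'origin' remote."""
    # --prune also drops refs that were deleted on the master.
    return ["git", "-C", repo_dir, "fetch", "--prune", "origin"]


def sync_loop(repo_dir, interval=INTERVAL_SECONDS,
              run=subprocess.run, sleep=time.sleep, max_rounds=None):
    """Fetch, wait, repeat. Returns the number of failed fetches.

    max_rounds=None loops forever; a number stops after that many rounds.
    """
    rounds = 0
    failures = 0
    while max_rounds is None or rounds < max_rounds:
        result = run(fetch_command(repo_dir))
        if result.returncode != 0:
            # Keep going: the master may only be down temporarily.
            failures += 1
        rounds += 1
        sleep(interval)
    return failures
```

Calling `sync_loop("/path/to/replica.git")` on each replica (the path is a placeholder) would keep it at most one interval behind the master. A cron job or systemd timer running the same fetch would do the job just as well.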

I don't think the likes of GitHub run a master-slave configuration. They're probably on the active-active side of things for performance. I'm surprised I couldn't find anything on this from Codeberg, though. You'd think they had already solved this problem and might have published something. Maybe I missed it.

I didn't find anything in the official Git book either. Which one do you recommend?

[–] solrize@lemmy.world 2 points 1 year ago

Are you familiar with git hooks? See

https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks

Scroll to the part about server side hooks. The idea is to automatically propagate updates when you receive them. So git-level replication instead of rsync.
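As a rough sketch of that idea: a `post-receive` hook gets one `<old> <new> <ref>` line on stdin for every ref that was just updated, so it can replay each update onto the other servers with `git push`. The remote names in `MIRRORS` are made up and would have to exist already in the bare repo (via `git remote add`). Hooks can be written in any language, and Python is used here only for readability.

```python
#!/usr/bin/env python3
# Sketch of hooks/post-receive for a bare repo (remember to chmod +x).
import subprocess
import sys

# Hypothetical remote names, configured beforehand with `git remote add`.
MIRRORS = ["mirror1", "mirror2"]


def parse_updates(text):
    """Parse post-receive stdin: one '<old> <new> <ref>' triple per line."""
    updates = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            updates.append((parts[0], parts[1], parts[2]))
    return updates


def push_command(mirror, new, ref):
    """Build the git push that replays one ref update on one mirror."""
    if set(new) == {"0"}:
        # An all-zero new value means the ref was deleted.
        return ["git", "push", mirror, ":" + ref]
    # Plain refspec: non-fast-forward updates are refused by the mirror.
    # Prefixing the refspec with '+' would replay forced pushes as well.
    return ["git", "push", mirror, ref + ":" + ref]


def main():
    if sys.stdin.isatty():
        return  # run by hand with nothing piped in, so nothing to do
    for _old, new, ref in parse_updates(sys.stdin.read()):
        for mirror in MIRRORS:
            result = subprocess.run(push_command(mirror, new, ref))
            if result.returncode != 0:
                # Hook stderr is relayed to the client that pushed.
                print(f"replication to {mirror} failed for {ref}",
                      file=sys.stderr)


if __name__ == "__main__":
    main()
```

Note that `post-receive` runs after the refs are already updated, so a failed mirror push can't reject the original push. You would still want something (like a periodic fetch on the mirrors) to catch up after an outage.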