I know that Lemmy is open source and it can only get better from here on out, but I do wonder if any experts can weigh in whether the foundation is well written? Or are we building on top of 4 years worth of tech debt?
There are no good code bases, only less bad ones.
The only valid measurement of code quality: WTFs/minute
Is Lemmy less bad or more bad than a typical open source project?
From some comments I’ve read, it’s at least in better shape than kbin? A few people expressed interest in helping with that project and then went running for the hills after reading through the code.
It’s probably not the only reason, but Rust is a much more attractive language/platform for devs to work with than PHP. (Source: https://survey.stackoverflow.co/2023/#section-admired-and-desired-programming-scripting-and-markup-languages)
It’s also more scalable, because it’s a compiled multi-threaded technology, while PHP is interpreted and mono-threaded.
Mother. Of. God. Did they really write Kbin in PHP?
I may be talking shit because I’m not a PHP coder, but the times I’ve seen it, it was a nightmare.
Which… makes sense. The creator of PHP doesn’t like coding.
I actually hate programming, but I love solving problems. I really don’t like programming. I built this tool to program less so that I could just reuse code. PHP is about as exciting as your toothbrush.
So PHP was born out of a dislike of coding. In turn, the documentation is all over the place and inconsistent.
Makes sense. And JavaScript is born out of a dislike of coders🤣
To be fair, PHP has slowly been getting its shit together since PHP 7, and 8 seems to be in reasonably great shape compared to the horrors of 5.6.
It has become really solid over time.
But it will always be a mono-threaded and interpreted technology, and therefore never a good choice for a high-scale solution like a Fediverse application.
There are good PHP codes out there as well…
PHP is really old isn’t it? I remember using phpBB forums some twenty years ago. They worked really well, but that’s going pretty far back.
PHP is an interpreted language, which is inherently slow compared to a compiled language such as Rust, which is very fast. Modern PHP isn’t so bad, kinda, but I’m guessing a guy who hates programming and decided to start a new PHP project in 2023 isn’t really optimizing anything. Also, you’ll never get anyone to help you write PHP because gross. It’s a dead language with a small community of masochists and maintainers of legacy projects.
It would be like if you saw someone building a fighter jet and thought “hey, I can do better!” and then started getting your paper mache out to start playing air plane designer.
It is, but it aged pretty well.
Devs wanted to store state in objects, so it became object-oriented. It also gained a really solid full-fledged web framework with a strong community, with Symfony; and some strong micro-frameworks like Laravel.
But it will always be interpreted and mono-threaded, and therefore never a good choice for high-scale solutions. Facebook had to invent a brand new language (Hack) and runtime (HHVM) that was close enough to PHP that they could spend millions porting their PHP codebase to it, in order to make it compiled and multi-threaded, and make it scale.
That’s PHP for ya
I read from one admin that a Lemmy instance is a lot easier to set up and maintain than a kbin instance. Kbin is initially more complicated to set up, and updates are just a super headache to deal with. That sounds like a showstopper. I mean, kbin is not going to get too far if it’s that difficult to run and maintain an instance, no matter how good or bad the code.
From a user perspective kbin has a really nice looking interface, though Lemmy has more features. I’d like to see kbin do well. It’s younger than Lemmy so it’s going to be behind, but hopefully the overhead in running an instance can be resolved.
The best code base is the repo I just created and haven’t committed anything to.
Just clone this one. Guaranteed the best repo ever! https://github.com/kelseyhightower/nocode
Wait, so the answer is, “it depends?” 🌎👨🚀🔫👨🚀
No the answer is that it is written in a modern language, is in its infancy and needs a lot of work to be really great, but it’s based on a certified protocol ActivityPub, that Mastodon and other “fediverse” systems use. It’s going to be really great, eventually.
“It depends” is a reference to an inside joke between developers. I agree with you that it could be really great. Whether or not a code base is “good” or “bad” is just a complicated and highly subjective question to answer.
thanks - yes, I suspected it was. Lemmy is what it is - and agree the question is difficult to answer concisely. Understanding that interpretation of “good” vs “bad” codebases is subjective, there are plenty of production systems that are unambiguously “not good”. The great thing about lemmy isn’t the UI, it’s the threading and reddit-like communities built on the ActivityPub foundation. It’s the right foundation.
Yup, code is bad, more code is worse. And unparseable specialist code is a technical debt.
Hey, my code base is fantastic if you ignore all the stuff I had to inherit, did in a time crunch, or didn’t understand what I was doing!
If you think four years of technical debt is a lot, wait until you hear about Microsoft Windows.
I couldn’t imagine.
I love having 16-bit applications hidden in my 64-bit OS
Well it’s written in Rust. Doesn’t that make it automatically awesome and fast?
“It’s webscale”
Everyone knows relational databases don’t scale because they use joins.
Not sure if this is a serious blanket statement?
It’s from an old (god, now I feel old) meme / joke video. Here’s a link: https://www.youtube.com/watch?v=b2F-DItXtZs
There was a website / software called “Xtranormal” that let you write an actual script and pick some avatars / actors and set up camera angles, etc. while it played out. Someone used it to mock the mongodb fucktards in the early 2010s.
What does this mean?
It’s from this old meme video
It’s fine. Nothing impressive about it but nothing horrifying about it. Could use better testing and better documentation, it’s pretty weak on both fronts. It’s a pretty young/immature code base, hard to have much tech debt yet. Not like its core dependencies can be a decade out of date. But it’s easy to navigate and understand, relatively speaking.
Could use better testing and better documentation
I’ve seen one dev talk about documentation and it’s admittedly weak, but they’re pretty impacted by everything else. It’s on the burner and they’ll work on it at some point.
that’s pretty much the definition of tech debt
It’s decent, but it isn’t scalable, at least not yet.
Right now the entire Lemmy backend is one big “monolith”. One app does everything from logins and signups to posting and commenting. This makes it a little harder to scale, and I wouldn’t be surprised to see it split out into multiple micro services sooner rather than later so some of the bigger instances can scale better.
I’d love to know where the higher level dev stuff is being discussed and if they’ve made a decision on why or why not microservices.
There’s no reason that a monolith can’t scale. In fact you scale a monolith the same way you scale micro services.
The real reason to use micro services is because you can have individual teams own a small set of services. Lemmy isn’t built by a huge corporation though so that doesn’t really make sense.
I disagree that it being a monolith is immediately a problem, but also
In fact you scale a monolith the same way you scale micro services.
This is just not true. With microservices, it is easy to scale out individual services to multiple instances as demand requires them. Hosting a fleet of entire Lemmy instances is far more expensive than just small slices of it that may require the additional processing power.
What microservices would you split Lemmy into? The database, image hosting and the UI are already separate.
Well, I’m going to start by repeating that I don’t necessarily agree that it being monolithic is necessarily a problem right now.
The immediate thought in my mind would be all of the federation logic. That’s where all of the instances seem to be lagging behind, and it seems the common fix is “just increase the workers to one billion”. Apparently that does something meaningful, but the developer in me wants to know how a few cores can put so many workers to use.
Spinning federation off into a microservice means you could deploy it on something like Cloud Run or AWS ECS, and have it autoscale as the workload demands it. Seems like a pretty prime candidate to me.
To me this sounds like a code / DB problem more so than a monolith vs microservice issue. You can totally run only the worker part of a monolith inside AWS ECS and have it autoscale, this is not specific to microservices.
I’d split it out into a few systems
- Signup/Login/Account management
- Posting/Commenting/Voting
- Moderation Controls
- DB Readers and Writers in different services
- Community management (may get lumped into moderation controls, but separation of duties may be needed)
- Edit: Federation is a big one I keep forgetting about
The ultimate goal, and I don’t know how possible it is with rust, would be to have a way to run those as individual services or as one part of a larger monolith. Smaller instances would be able to run it as one binary, while larger instances like Lemmy.world or Lemmy.ml can run each part independently. That would allow easier scaling on large instances while (hopefully) leaving it just as simple to deploy on a small scale too.
There’s no reason to split it into 100 different services, but 4-5 mid sized ones might help.
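A rough sketch of that "one codebase, run it whole or in slices" idea, for what it's worth. The component names and the `start` function are hypothetical and only loosely follow the list above; this is not how Lemmy is actually laid out.

```python
# Hypothetical component names; Lemmy's real module layout may differ.
COMPONENTS = {
    "accounts": lambda: "accounts ready",
    "posts": lambda: "posts ready",
    "moderation": lambda: "moderation ready",
    "federation": lambda: "federation ready",
}


def start(roles=None):
    """Start the selected components; with no roles given, start all of them."""
    selected = list(COMPONENTS) if roles is None else list(roles)
    unknown = [role for role in selected if role not in COMPONENTS]
    if unknown:
        raise ValueError(f"unknown roles: {unknown}")
    return {role: COMPONENTS[role]() for role in selected}


small_instance = start()                   # everything in one process
federation_worker = start(["federation"])  # one slice, scaled on its own
```

A small instance would call `start()` with no arguments, while a big one could launch several processes that each pass a different role list.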
You can easily scale a monolith. You typically horizontally replicate any web server (monolith or not) to handle whatever traffic you’re getting. It shouldn’t really matter what type of traffic it is. Plenty of the world’s biggest websites run monoliths in production. You know how people used to say “rails doesn’t scale”? Well they were wrong because Rails monoliths are behind some huge companies like GitHub and Shopify.
The lemmy backend is also quite lightweight and parallel so it’s cheap and effective to replicate.
In my professional experience microservices are usually a dumpster fire from both the dev perspective and an ops perspective (I’m a Site Reliability Engineer).
This isn’t really contradictory to what I said. I only wished to express that the two don’t scale exactly the same (in terms of execution)
Lemmy’s backend is native code, not run in a virtual machine or interpreted from text like most backends (Java and PHP are still so, so popular, as is Python). You’re not going to pay much extra for an extra megabyte or 10 of RAM being used per instance for the extra code sitting idle. It certainly shouldn’t use much processing power when not in use.
I’d be less concerned with memory (of which Lemmy seems to use very little), and much more concerned with CPU core count. I touched on it in my other comment, but I don’t understand how a few cores is supposed to handle the ridiculous number of federation workers people are setting their instances to.
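One possible explanation, sketched under the assumption that federation workers spend most of their time waiting on network I/O rather than computing (I haven't verified that against the Lemmy code). The `deliver` function is a hypothetical stand-in that just sleeps where an outbound HTTP request would be.

```python
import asyncio
import time


async def deliver(activity_id):
    # Stand-in for an outbound HTTP request; the worker is idle while it waits.
    await asyncio.sleep(0.05)
    return activity_id


async def run_workers(count):
    # All workers share one thread and take turns whenever one is waiting.
    return await asyncio.gather(*(deliver(i) for i in range(count)))


start = time.perf_counter()
results = asyncio.run(run_workers(1000))
elapsed = time.perf_counter() - start
```

Run one after another, 1000 deliveries of 50 ms each would take about 50 seconds. Run concurrently like this, they overlap, so a single core can keep a very large number of mostly-idle workers busy.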
Does code that exists but isn’t being executed often actually impact CPU usage that much?
Linux kernel has like 30 million lines of code and you probably compile a fifth of it into your actual kernel binary by default. That’s still several million lines, but it doesn’t use much CPU at all. Rather, eliminating all the excess code keeps your size down so you can load it from disk in 0.0001 seconds instead of 0.0002.
Now if code that doesn’t need to run often, runs more often because we scaled the monolith horizontally - that IS a problem, but it’s not a problem inherent to the monolith design pattern, but rather a specific instance of bad design.
Again, my knowledge of the Lemmy codebase is very small, and we could possibly host the monolith in microservices style. The point I am making is this (when it comes to scaling monolith vs microservices):
If the federation logic were split out, we could configure it to run on super tiny docker instances on Google Cloud or AWS. Any time we needed it to, it would autoscale to handle the traffic. The configuration for these dockers could be super minimal memory, no storage, and multiple weak CPU cores. This would be super affordable while still being able to handle as much traffic from federation as we ask it to. One of the cool things with Google Cloud Run is that it handles load balancing between docker containers for you (just point the federation traffic at the necessary URL)
IF Lemmy has things like background services, scheduled tasks, etc, this would significantly muddy the water (we would need each service to be able to handle being run on a multitude of instances, or we would need to be able to disable each one instance by instance). And if we just scaled by spinning up more instances of Lemmy, we would also need to ensure that only federation traffic is heading to the weaker instances that we spun up for such purpose, or we would need to ensure that each spun up instance has enough resources to handle federation traffic along with the main application.
I feel like I need to state once more: I don’t necessarily think Lemmy needs to move to microservices. Only that scaling monolith vs microservices is not necessarily the same.
I suppose that’s completely fair - async workers tend to fare well as standalone services and are often split off even in monoliths. But I guess what I’m saying is that splitting it might not actually win you THAT much compared to just scaling the whole thing. Not until we’re talking like 100 runners to 1 API instance or something. It gives you a bit of additional flexibility, but won’t necessarily be a huge difference in total resource cost, is what I’m saying. But it is still a good idea because it results in cleaner code and, as outlined before, tinier docker images.
Also the thing about Google Cloud Run is that it’s probably not a good idea for many instance owners. Autoscaling can lead to unexpected costs if set up by an amateur. But that’s an unrelated can of worms.
You definitely can’t scale a monolith the same way you can scale a micro service
I can’t say I disagree… Poorly implemented microservice architecture is the bane of my existence. Well implemented, though, and it makes my job so much easier.
Granted, my SRE team has all public facing production infrastructure built using an IaC process, so if something causes too much trouble, it’s easier to quarantine and rebuild the offending node(s), which can be complete in under 10 minutes.
The biggest problem is far too many developers ignore the best practices and just shift existing code into smaller services. That will never give you either performance or stability benefits. Honestly, it will probably make any issues worse. Microservice architecture is a huge shift in thinking. The services need to be fairly independent of each other to really make any gains. To get to that point will always take a whole lot of work. That being said, there is nothing inherently wrong with some monoliths, but the benefits of splitting out as much of the higher traffic and resource intensive work should never be overlooked.
the same way
Microservices aren’t a silver bullet. There’s likely quite a lot that can be done until we need to split some parts out, and once that happens I expect that federation would be the thing to split out as that’s one of the more “active” parts of the app compared to logins and whatnot.
Definitely not a silver bullet, but should stop the app from locking up when one thing gets overloaded. I’m sure they have their reasons for how it’s designed now and I’m probably missing something that would explain it all.
I’m still not familiar enough with how federation works to speak to how easy that would be. Unfortunately this has all happened as I’ve started moving, and I haven’t gotten a chance to dive into the code like I’d want to.
It’s also not the only solution for high-availability systems. Multiple monoliths with load-balancing can be used as well.
Also, a lot of people are self-hosting. In this case, microservice won’t give them any scaling benefit.
The problem with scaling monoliths is you are scaling everything, including the pieces that have lower usage. The huge benefit you get from going to microservices is you only have to scale the pieces that need to be scaled. This allows horizontal scaling to use fewer compute resources. It also allows these compute resources to be spread out.
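To put toy numbers on that argument: every figure below is made up for illustration and is not a measurement of Lemmy, and the service names are hypothetical.

```python
# All figures are made up for illustration, not measurements of Lemmy.
FOOTPRINT_MB = {"api": 60, "federation": 40, "accounts": 20}


def monolith_memory(replicas):
    """Every replica carries every component."""
    return replicas * sum(FOOTPRINT_MB.values())


def split_memory(replicas_per_service):
    """Each service is replicated independently."""
    return sum(FOOTPRINT_MB[name] * count for name, count in replicas_per_service.items())


# Suppose federation needs 10 copies while everything else needs 1.
mono = monolith_memory(10)
split = split_memory({"api": 1, "federation": 10, "accounts": 1})
```

With these assumed footprints the monolith needs 1200 MB and the split deployment 480 MB. The gap shrinks quickly if the idle components are small compared to the busy one, which is the counterpoint others in this thread are making.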
A lot of the headaches can be removed by having an effective CI/CD strategy that is completely reusable with minimal effort.
The last headache would be observability. There you’re stuck either living with the nightmare of firefighting problems with 100 services in possibly 10 locations, rolling your own platform using FOSS tools or spending a whole lot of money on something like honeycomb, datadog or new relic.
But I’m an SRE, I live my life for scalability and DevOps processes. I know I’m biased.
The admin of lemm.ee is currently scaling things horizontally - it is possible.
Scaling monoliths still works fine though. Microservices are first and foremost an answer to an organizational problem, not a technical one. There is a very high chance that if you are doing microservices with less than 20 people, or let’s say even 50 people, you are doing it wrong.
Microservices introduce a ton of overhead in engineering effort required, which needs to be balanced with the benefit they provide.
Scaling shouldn’t be the first and only reason for doing microservices.
And yes I’ve worked in shops with a few thousand engineers and microservices made sense there. But it does not for Lemmy. If you look at how most of these large companies that do microservices started, it was by building a monolith and scaling it far higher than what Lemmy currently has to deliver.
I’m a software engineer myself, but not familiar with your field. How would your practice be applied to self-hosting? I’m assuming a bunch of people with their home servers wouldn’t want to just run OpenShift.
Personally, I wouldn’t touch OpenShift. As someone that has a kubernetes cluster hosted at my house on a mixture of RPis, a NAS and in VMs, I’m not one to say what anyone else would do :).
But that can be overcome; it’s all about designing your application for multiple different installs. You don’t have to have all your services running fully separately. You can containerize each service and deploy to an orchestration engine such as kubernetes or docker swarm, or you can have the multiple endpoints on a single machine with an install package that keeps them together. It’s all about architecting toward resiliency, not toward a pattern for no other reason.
Also, Google has some very good books on SRE for free.
I find that often the overhead from microservices is worse than any savings from dropping a megabyte worth of unused machine instructions from a binary.
When your microservices need to talk to each other (and I’m not sure how many services you could split out of Lemmy without them needing to talk to each other), you’re doing a bunch of HTTP requests that are way slower than just calling another function in your monolith.
I see this at work every day. We run a distributed monolith because someone thought microservices would be a good idea, but we can’t actually separate everything, so it’s usual for an incoming API call to make 2-3 more calls internally. It can get so, so slow.
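A back-of-the-envelope model of that slowdown. The latencies are hypothetical placeholders (real numbers depend on the network and the services), and the model assumes the internal calls happen one after another rather than in parallel.

```python
# Hypothetical latencies; real numbers depend on the network and the services.
FUNCTION_CALL_MS = 0.001  # in-process call inside a monolith
HTTP_HOP_MS = 5.0         # one internal HTTP request between services
WORK_MS = 10.0            # time spent doing the actual work


def monolith_latency(internal_calls):
    return WORK_MS + internal_calls * FUNCTION_CALL_MS


def microservice_latency(internal_calls):
    # Assumes the internal calls are sequential, not issued in parallel.
    return WORK_MS + internal_calls * HTTP_HOP_MS


calls = 3
mono_ms = monolith_latency(calls)
micro_ms = microservice_latency(calls)
```

With three internal calls, the assumed numbers give roughly 10 ms for the monolith and 25 ms for the split version, so the hops cost more than the work itself.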
Overly chatty microservices are definitely an issue.
Changing your mindset to a microservice oriented architecture is not an easy feat; it’s something that took a lot of time for me to fully grasp (back in my architect/developer days). Yes, you gain overhead that will need to be compensated for. But when do the benefits outweigh the disadvantages?
Here are some questions to ask during design: How much of this chattiness is because you are tightly coupling these services? How much should microservices be talking to each other? Can you implement an event bus to handle that chatter between services?
Designing an application using microservices but just replicating the monolith application will give you scalability, but will not give resilience. What can you do to overcome that single point of failure? First, no more synchronous calls to these APIs, toss an event over the fence and move on. Degrade your application if the failure is something you can’t overcome, but don’t just stop the application because one API is no longer responsive.
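A minimal sketch of the "toss an event over the fence and move on" pattern. The `EventBus` class is an in-memory stand-in for a real message broker, and `create_comment` and `flaky_notifier` are hypothetical names invented for the example.

```python
import queue


class EventBus:
    """In-memory stand-in for a message broker (hypothetical)."""

    def __init__(self):
        self._events = queue.Queue()

    def publish(self, topic, payload):
        # The caller returns right away; nothing waits for a consumer.
        self._events.put((topic, payload))

    def drain(self, handler):
        """Deliver queued events; a failing handler does not stop the rest."""
        delivered, failed = 0, 0
        while not self._events.empty():
            topic, payload = self._events.get()
            try:
                handler(topic, payload)
                delivered += 1
            except Exception:
                failed += 1  # a real system would retry or dead-letter here
        return delivered, failed


def create_comment(bus, text):
    """Save the comment, then announce it without waiting on other services."""
    comment = {"id": 1, "text": text}
    bus.publish("comment.created", comment)
    return comment


def flaky_notifier(topic, payload):
    raise RuntimeError("notification service is down")


bus = EventBus()
comment = create_comment(bus, "hello")
result = bus.drain(flaky_notifier)
```

The comment is created even though the notifier is down. The failure is counted on the consumer side instead of propagating back to the user, which is the degraded-but-running behaviour described above.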
Do you need everything to be a microservice? Probably not. The first thing you look at when moving from a monolith to microservice architecture is what makes the most sense to be moved. How much work can be offloaded to background jobs (using something like sidekiq)?
How do you handle installs? How many packages do we now have to create for this application to work?
There are a lot of questions that have to be answered before moving toward a microservice architecture. On top of that, there is a complete mindset change as to how the application works that needs to be accomplished. If you design your microservice application with a monolith application mindset, you’ll never realize any of the gains from making the move.
I’m not into this topic, but i do want to share a talk by Joe Armstrong (creator of erlang).
https://www.youtube.com/watch?v=cNICGEwmXLU
The whole channel is full of stuff like this.
Microservices can oftentimes cause more performance issues than they solve, as soon as they need to start talking to each other. Here’s someone with more experience than me explaining how it often goes wrong.
There’s nothing stopping you from putting a load balancer and running multiple instances of a monolith connected to one database. Then the database will also become a bottleneck, but that would still happen with microservices.
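A toy version of that setup, with round-robin as the assumed balancing strategy (real load balancers offer several). `Replica` and `RoundRobinBalancer` are hypothetical names, and the shared database is left out.

```python
from itertools import cycle


class Replica:
    """Stand-in for one full copy of the monolith (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.handled = 0

    def handle(self, request):
        self.handled += 1
        return f"{self.name} handled {request}"


class RoundRobinBalancer:
    """Hands each incoming request to the next replica in turn."""

    def __init__(self, replicas):
        self._next = cycle(list(replicas))

    def route(self, request):
        return next(self._next).handle(request)


replicas = [Replica(f"lemmy-{i}") for i in range(3)]
balancer = RoundRobinBalancer(replicas)
responses = [balancer.route(f"GET /post/{n}") for n in range(6)]
```

Six requests end up spread evenly, two per replica. This only works cleanly if the replicas keep no local state, which is why the shared database becomes the next bottleneck.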
Exactly, and nothing prevents a monolith from doing vertical slicing at the database level as long as the monolith is not crossing its boundaries. This is the only scaling part that is inherent to microservices. If the issue is horizontal scaling, microservices don’t solve anything in this case.
Also, specifically from what I understand of the Fediverse, you want something easy to host and monitor, since a lot of people will roll out their own instances, and hosting and monitoring are known pain points when running microservices.
This is a discussion I’m also interested in. Migrating a monolith to microservices is a big decision that can have serious performance, maintainability and development impact.
Microservices can be very complex and hard to maintain compared to a monolith. Just the deployment and monitoring could turn into a hassle for instance maintainers. Ease of deployment and maintenance is a big deal in a federated environment. Add too much complexity and people won’t want to be part of it.
I’ve seen some teams do hybrids. Like allowing the codebase to be a single artifact or allowing it to be broken by functionalities. That way people can deploy it the easy way or the performant way, as their needs change.
That’s what I’m thinking. Microservices could be a huge pain in the ass, but a hybrid approach would make things much better. Smaller instances wouldn’t be a problem, but the larger instances would be able to separate out components.
Keeping it possible to run monolithically would probably need a lot of work, but it’s possible to do and would probably be the best approach.
I have a friend that’s a lot more technical than me and he said that Lemmy’s codebase is kinda messy and relies on libraries that are still in beta and have not been tested well in the real world (since Rust is a relatively new language). This was a few months ago though, I’m not sure how much things changed since it’s been getting a lot more support and rewrote the front-end. The good news is it’ll get a lot better as more developers contribute to it.
I think a lot of people assume that because it’s written in Rust that means it has to be super stable but that really isn’t the case.
We need to rewrite it in Rust.
oah!!1
I’m porting it to COBOL so that 2 other people will know how to contribute.
Security by obscurity. Perfect.
It’s probably decent, but it is also worth noting that Lemmy was never really expecting the massive explosion of activity it currently has quite so soon.
The current code base was probably good for a small number of users/instances, but everything isn’t doing quite as well now that there are thousands, or even tens of thousands rattling about the place.
I think it will improve as more people get involved. The fundamentals seem to work fine. Haven’t looked at the repository yet but I am planning to do so and see whether I can make a (small) contribution somewhere. Probably in the form of cleaning up some technical debt.
Rust, babyyyyy
As long as the backend is stateless, it can be scaled to handle a huge number of users, at least in theory. IMO the main issue right now with Lemmy deployment is pictrs not being stateless. It uses a filesystem-based internal database called sled. Not only does this make pictrs not stateless, you can’t even run multiple replicas of pictrs on the same host because sled would crash if the database file lock is already acquired by another replica. Someone with some Rust skill should consider donating their time to add PostgreSQL support to pictrs soon, which will greatly help make Lemmy scalable. Too bad I know nothing about Rust.
An interesting choice of solution there for image hosting… I would have thought they would have gone with a simple proxy through to an object store like S3, GCS, Wasabi, insert other clone here. Or even picked an off the shelf BLOB capable system for self hosting like Mongo or Cassandra. Then your image hosting becomes stateless as you just give each image a flake ID, pop it in the storage system and give back a shortened URL. I’m sure they had their reasons though :-)
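For anyone wondering what a flake ID looks like, here is a minimal sketch assuming a Snowflake-style layout: millisecond timestamp in the high bits, then a 10-bit worker ID, then a 12-bit per-millisecond sequence. The bit widths and the `FlakeIdGenerator` class are assumptions for illustration, not anything pictrs does.

```python
import threading
import time


class FlakeIdGenerator:
    """Snowflake-style IDs: ms timestamp, 10-bit worker ID, 12-bit sequence."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self._worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (now << 22) | (self._worker_id << 12) | self._sequence


# A fixed clock keeps the demo repeatable; real use would rely on the default.
gen = FlakeIdGenerator(worker_id=3, clock=lambda: 1000)
first = gen.next_id()
second = gen.next_id()
```

Because each replica gets its own worker ID, any number of stateless upload handlers can mint IDs without coordinating, and the ID can double as the object key in the store.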
I think pictrs has a not very well documented object storage option. I’m struggling to find the link right now though.
Someone mentioned they had started out using websockets instead of http. I guess they’ve since migrated, but that design choice makes me wonder about the qualifications of the devs to make that kind of choice.
Websocket was deprecated with 0.18.0, which is the latest official release, but a lot of instances are on the release candidate for 0.18.1 because of some major improvements. My login instance is on that one, which is great for me because I’m using a desktop browser and it looks way nicer. A lot of fixes too.
Websocket is easier to implement so that’s probably why they started with it. It has heavy overhead and doesn’t scale well so that’s the down side. It wasn’t trivial for them to move to http. Websocket was probably a better starting point for them at the time, but they did realize its shortcomings and deprecate it in time to support growth. I don’t know if I’d hold that against them.
There is no documentation for the web API 😕
Someone mentioned they had started out using websockets instead of http. I guess they’ve since migrated, but that design choice makes me wonder about the qualifications of the devs to make that kind of choice.
Why’s that?
Web sockets are meant for applications where it’s important that you receive updates fast in a push fashion. E.g. collaborative editors like Google docs or a chat application. To scroll Lemmy or open a specific Lemmy post you don’t need that at all. You can just fetch the data once and have users refresh manually if for example they want to fetch the latest comments on a post. Using websockets for that type of application just puts unnecessary strain on the server.
They also originally thought they’d have it update in real time which was a bit of a mistake. When you’re running a small test instance it’s kinda neat if comments and posts pop up as they’re made, but the reality of that is a scroll jacking nightmare.
Maybe they wanted a different type of user experience. But yeah, maybe that’s something Reddit could pull off because of their infrastructure… And they don’t even do it because there’s not much user value to it.
Although reddit does use some websockets so you can see how many users are also seeing the post at the same time.
But not websockets for EVERYTHING.
I’d go for an Elixir alternative to Lemmy/Kbin, but I’m just a shill for that lang who’s easily sold on others’ experiences with its deployment
I’ve been waiting for an Elixir version for a while now
Haven’t seen much traction on the few projects I’ve been seeing, but still holding out hope. Maybe I should get off my ass and start actually contributing