The biggest issue would be data retention. Reddit serves as a real-world database that stores all the historical content, and search engines like Google make it searchable.
We’re talking about petabytes, and Lemmy hardly has a few gigabytes.
Who is going to store all this data? Even in a distributed environment, the bigger instances would have to store a few hundred terabytes (per year).
Text is very light and compresses very well. While instances may risk having scaling issues with photo and video, text should be very easy to archive forever.
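To put a rough number on how well text compresses, here is a minimal sketch using Python's standard `zlib` module. The `compression_ratio` helper and the sample comments are made up for illustration. Real ratios depend heavily on the content and on the compressor an instance or database actually uses.

```python
import zlib


def compression_ratio(text: str, level: int = 9) -> float:
    """Return (raw UTF-8 size) / (zlib-compressed size) for the given text."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    packed = zlib.compress(raw, level)
    return len(raw) / len(packed)


# Hypothetical sample data: a few comments joined into one blob,
# repeated to mimic a larger archive of similar text.
sample_comments = [
    "Text is very light and compresses very well.",
    "Who is going to store all this data?",
    "Archival of historical content should be handled separately.",
]
sample = "\n".join(sample_comments * 50)

raw_size = len(sample.encode("utf-8"))
print(f"{raw_size} bytes raw, about {compression_ratio(sample):.1f}x smaller compressed")
```

Repeated sample text like this compresses far better than a real archive would, so treat the printed ratio as an upper bound rather than an estimate.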
Media is only stored by the instance of the user that’s uploading it. If you want to upload tons of data, you’re going to end up having to self-host.
…and it’s not like links don’t break on Reddit all the time. Don’t worry about archiving; that’s what archive.org is for.
Personally I think it’s OK for instances to delete older posts to save space, provided that there are means to archive threads that users find valuable. For the fediverse to thrive, it should be as easy as possible for people to set up and manage instances without having to think about storage space too much.
Archival of historical content is something that I feel should be handled separately.
On the other hand, do we really need to store it? Sure, some posts will remain relevant, but many, maybe even most, posts on Reddit, forums, etc. are outdated. Maybe communities and mods should decide which posts are relevant and make them permanent, while the rest just get erased after a set period of time that the community sets.
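A per-community retention rule like that could look something like the sketch below. This is not Lemmy's actual schema or API. The `Post` class, the `pinned` flag, and `expired_posts` are hypothetical names, and it only picks which posts are eligible for deletion without deleting anything.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Post:
    id: int
    created: datetime
    pinned: bool = False  # hypothetical "keep forever" flag set by mods


def expired_posts(posts: list[Post], retention: timedelta, now: datetime) -> list[Post]:
    """Return posts older than the community's retention period that are not pinned."""
    cutoff = now - retention
    return [p for p in posts if not p.pinned and p.created < cutoff]


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
posts = [
    Post(1, now - timedelta(days=800)),               # old, not pinned
    Post(2, now - timedelta(days=800), pinned=True),  # old, but kept by mods
    Post(3, now - timedelta(days=30)),                # recent
]

# The community chooses its own retention period, here one year.
to_delete = expired_posts(posts, retention=timedelta(days=365), now=now)
print([p.id for p in to_delete])
```

Keeping the "what is eligible" decision separate from the actual deletion leaves room for an archive step in between.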
If stuff is deleted it can end up as a DenverCoder9 situation where you search the entire internet to find a solution to your specific problem, find someone who had it a decade ago and solved it, and discover they either never posted the fix or it was wiped.
I specifically use Reddit for the data retention and ease of finding “old” information, unlike basically every other social media platform, which scrubs it from any search even seconds after you looked at it (even if they still store the data).
Wouldn’t that take even more resources?