Yesterday, as part of the discussions about Lemmy’s current inability to delete all user content, I wrote a proposal: if enough people stepped up to help with funding, I’d extend my work on my Fediverser project (which already has an admin web tool that “knows” how to interface with Lemmy) to solve all the GDPR-specific issues that were raised by @[email protected]
The amount asked is, quite frankly, symbolic. I offered to work 10h/week on it if at least 20 people showed up to contribute via Github (which would be $4/month) or to sign up to my instance (access to which is given via a $29/year subscription). In other words, I’m saying “Give me $80/month and I will work 40 hours per month on this thing which so many of you are saying is critical to the project.”
So now that 24 hours, 58 upvotes and a handful of “that’s great!” responses have passed, let me tell you how that translated into actual supporters:
- Zero sponsors on Github
- Zero signups on Communick.
Don’t take this as me demanding anything. I’m writing this just to illustrate the following:
- The Tragedy of the Commons is real. I can bet that at least 30% of the 60+ thousand users on Lemmy are proud owners of a pricey iPhone, and most of them are okay with paying for an app to use on that pricey iPhone, but almost none of them will even consider throwing a few bucks per year the way of an open source developer.
- The Outrage Mill is not a “capitalist” or even “corporate” phenomenon. People were piling on the devs yesterday for completely ignoring “such a crucial piece of functionality”, but no one actually stepped up to offer (or gather) the resources needed to get this problem solved. It’s almost as if people get more out of discussing the problem than out of working through a solution.
- “Skin in the Game” is a powerful filter. No matter how much people tell you that something is important to them, the true test is seeing how many are willing to pay the asking price. If people are not willing to pay $2 per hour of work, then I can assume that this is not really important to them.
That seems futile to me. Once you post, your content is all over the instances, and admins have backups. The best you can do is guarantee GDPR compliance on your local instance, but the user has to go hunt down every other instance with a copy of it.
The fediverse can’t ever be properly GDPR compliant unless an EU bubble develops: instances with contracts between each other to be GDPR compliant, all federating only with each other. Federated Lemmy instances would count as subprocessors that you need to hold to GDPR standards, and that’s just not possible the way things work right now.
Wouldn’t that be similar to what is happening with websites preventing access from the EU to avoid GDPR?
Pretty much, although in this case I guess one can just make an account with one such instance. But it would definitely make it harder for people like me who run their own instances.
People think GDPR is some magic spell that can be used to stop bits from being transmitted around the Internet.
It’s not. It’s just a set of rules regarding what online services are supposed to do with the data of European users interacting directly with their servers. To be “GDPR compliant”, all instance admins need to be able to do is honor the standard data-subject requests: tell users what data is stored about them, export it on request, and delete (or anonymize) it when asked.
I’m reasonably certain that I can satisfy these regulations.
There is nothing in the law that says “if someone screams Gee-Dee-Pee-Arrr three times in front of their phone, their data becomes radioactive and must disappear from the Internet in 48 hours or the instance owner will pay 100 million euros + 3 pints of blood from their unborn first child”
Aren’t you also supposed to ensure that the third-party handling the PII is also GDPR compliant before the user consents to sharing it? Pretty sure my work training said so, but they could be erring on the safe side.
If not, that sounds like a giant loophole: you could just ask for consent, funnel all the data out of reach of the GDPR, and do all the analytics and profiling you want. Like, when Threads joins, what’s stopping them from swallowing all your users’ data? They can get it, they’re implicitly allowed to process it, and yet the data is now unencumbered by any further consent requests from the user. They don’t even have a way of knowing if the user is potentially from the EU.
Meta would of course be obligated to delete the data if the user goes to them and requests it to be deleted, but they might not even know Meta is processing their data, and there are a lot of privacy enthusiasts on Lemmy.
How can a user possibly consent to this properly, other than practically waiving their GDPR rights, which the law doesn’t allow?
Is there any new documentation around on that topic from actual lawyers analyzing the implications? It feels like everything GDPR-related I see is opinions and personal interpretations of the law, which may be biased towards “it’s probably okay”, as obviously we all want the fediverse to succeed.
In particular, ActivityPub pushes the data out for the most part, so one can’t argue “well, I can’t stop people scraping my site illegally”; one could argue that instance admins should vet new instances before opening the data firehose.
It feels very much like, depending on the case and who got harmed how, a judge could decide the admins should have put technical safeties in place. I mean, we’re in the era of holding porn sites responsible for letting minors access the site and demanding they ID everyone to make sure. Lawmakers barely understand technology, let alone something like the Fediverse. I could see things go sour real fast.
User-generated content != PII.
What’s stopping you (or anyone else) from just bypassing authorized fetch and swallowing the data stream from anyone?
Aren’t the usernames an identifier and therefore PII? As far as I understand you can’t even use a cookie or the user’s IP to determine unique visitors on a site because it identifies the user personally.
On the fediverse, every comment, every vote, every moderator action is completely public and tied to the username. Unless the username is a throwaway and the user never ties it to their real identity in any way, that builds a ridiculously detailed profile of the user’s habits online. And even then, the profile is detailed enough that I don’t doubt Google or Meta could connect it to your real identity easily unless you’re actively using a different persona.
It’s all completely public and available to anyone that wants it.
It’s even worse: images aren’t proxied right now, so you can actually tie a username to an IP rather easily if the user doesn’t use a VPN or block outside resources by default.
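As an illustration, an instance could in principle close this hole by rewriting remote image URLs so that readers fetch them through the local server instead of the origin. A minimal Python sketch; the `/api/v3/image_proxy` path and the function name are assumptions for illustration, not Lemmy’s actual API:

```python
from urllib.parse import quote

# Hypothetical sketch: rewrite a remote image URL so readers fetch it through
# the local instance rather than directly from the origin server. That way
# the origin only ever sees the instance's IP, not the reader's.
def proxy_image_url(remote_url: str, local_instance: str) -> str:
    """Return a local proxy URL; the reader's IP never reaches the origin."""
    return f"https://{local_instance}/api/v3/image_proxy?url={quote(remote_url, safe='')}"

print(proxy_image_url("https://tracker.example/pixel.png", "lemmy.example"))
# → https://lemmy.example/api/v3/image_proxy?url=https%3A%2F%2Ftracker.example%2Fpixel.png
```

The server-side handler would then fetch the original URL itself and stream the bytes back, which also gives it a natural place to cache or refuse images.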
Not exactly a new threat, to be fair, but really the only thing about the user not being broadcast everywhere is their email address.
I guess the best one can do is clearly inform the user about the risks involved and honor incoming deletion requests properly, but man, if a child gets abused on the fediverse and you can barely yank the content, I can see a judge ruling that the fediverse as a whole is reckless.
Exactly.
To my understanding, the key part is that you are supposed to disclose any type of information that you are sharing with third-parties through back channels.
If you set a third-party tracking cookie on your site, then yes, the third party can use the cookie to correlate users across different sites. But if you do what you just did and place an image that captures the viewer’s IP, how can any third party access this information? You have my IP and a request log, so what? Is there any way that another Lemmy instance can use this to identify me?
And distribution/collection of public information is not what the GDPR is trying to regulate!
Can you show where the GDPR excludes public information? Because if it doesn’t and can uniquely identify a person, then it’s still subject to this regulation.
Let’s say you go to a public forum and ask “please remove my PII”. To comply, they don’t need to remove your comments and posts, they just need to remove your username. Granted, the website owner might have a policy of deleting all the content, but you’ll have a hard time arguing in the legal system that they are not complying with the GDPR if they delete only the thing that really identifies you uniquely.
But what if some of my comments include information that can uniquely identify me?
That can be something like “message me on Matrix at …”
In the case of the article I wrote, the image never federated.
I read the article when it was first posted!
That was lucky, but my point is more that if you hadn’t noticed and posted it, it would be cached in everyone’s pictrs and it would have been a major nightmare. Can you imagine if you just accidentally uploaded the wrong file and didn’t double-check? It’s honestly terrifying, and there’s not much you could do other than hope that every instance you contact to delete it complies, and that no admin takes notice, makes a copy, and uploads it elsewhere in the process.
Tools for local admins and moderators definitely could be improved, but I just wanted to point out GDPR and Fediverse is… complicated, and it’s still unclear if it possibly can be truly GDPR compliant.
Maybe a more detailed GDPR compliance strategy and roadmap would be more convincing than “trust me bro, pay me and I’ll make Lemmy GDPR compliant”.
That said, Lemmy is definitely not super well designed: loads of questionable design decisions or, most likely, temporary shortcuts that stayed. How nobody thought of tying images to a post so they can be deleted is beyond me. It’s pretty obvious that anyone can use any Lemmy instance as an image hosting site without ever posting the image to Lemmy.
Got it. But I just wouldn’t say it’s futile. The case of a KYC Selfie is especially bad, but the case of a nude is a better example of the usefulness of implementing a federated delete request.
There’s so much porn on the fediverse. Yes, it’s conceivable that some admins will patch their instance to ignore (or pay special attention to) images that have received federated delete requests from other instances – but I don’t think there’s much incentive for them to do that for nudes when there’s already a firehose of other nudes incoming.
Even in the case where the image has already federated, I think that implementing better data privacy functionality for images (including federated delete requests) would significantly reduce the harm to users and instance admins in 99% of cases. It’s not futile. Reducing harm is important and worthwhile.
I’m completely onboard with making the fediverse as safe as possible. I agree with your point. We should do everything we can to improve the tools.
Honestly, the complete lack of tracking of anything that goes into pictrs is a bit baffling. We have everything in place to delete posts, entire communities, even entire accounts. But the images are completely untracked. If they were properly tied to the post, as they should be, this whole thing would have been a non-issue. We had that figured out on forums like 20 years ago.
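To illustrate what “properly tied to the post” could look like, here is a minimal sketch using a hypothetical schema (not Lemmy’s actual tables): each image row references its post, so deleting the post cascades to the image record and the underlying file can then be garbage-collected.

```python
import sqlite3

# Hypothetical minimal schema (not Lemmy's actual tables): each uploaded
# image references the post that uses it, so deleting the post removes the
# image record, leaving the file in pictrs free to be garbage-collected.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs per-connection
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("""
    CREATE TABLE image (
        id INTEGER PRIMARY KEY,
        pictrs_alias TEXT NOT NULL,  -- key into the pictrs store
        post_id INTEGER NOT NULL REFERENCES post(id) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO post (id, body) VALUES (1, 'hello')")
conn.execute("INSERT INTO image (pictrs_alias, post_id) VALUES ('abc123.png', 1)")

conn.execute("DELETE FROM post WHERE id = 1")  # user deletes the post
orphans = conn.execute("SELECT COUNT(*) FROM image").fetchone()[0]
print(orphans)  # → 0: the image row was deleted along with the post
```

With a link like this in place, a periodic job could also sweep pictrs for aliases that no image row references, which would close the “free image host” hole too.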
The reason I think it’s futile is the amount of work required to make it work reasonably well. Lemmy is prototype quality: none of this has been thought through at all and there are holes everywhere. It’s barely federating to Mastodon correctly. Then of course there’s the rest of the fediverse, which might not support deleting properly, or might but use a different protocol to do it, possibly with different semantics. Does Mastodon even understand that deleting a thread should delete all its children? We have Kbin, which isn’t federating moderation to Lemmy at all, and soon we’ll have Sublinks, which will hopefully implement that stuff better.
Deletes don’t even currently federate properly to defederated instances. And even if they were sent out, I don’t think they would get accepted either. So if A defederates B, and a user on A deletes their account, B won’t even know. And currently some admins are regularly purging the table that would make doing this properly possible, because it quickly grows by several GB monthly and outpaces even pictrs’s growth. In the case of instances leaving the fediverse, as Beehaw is considering, that also means user deletes wouldn’t federate at all to the months of data spread across the fediverse from before defederation. Yikes.
I think this might really need a new protocol as well. Because right now, it just federates a delete activity for every single post/comment, and that has killed a few servers a couple of times.
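For the sake of illustration, a batched form could carry many object IDs in a single Delete activity instead of one activity per post/comment. Activity Streams 2.0 allows “object” to be multi-valued, but existing servers would need to opt in to handling it this way; treat this Python sketch as a hypothetical proposal, not an existing protocol:

```python
import json

# Hypothetical sketch: one Delete activity carrying many object IDs, instead
# of thousands of individual activities hammering every federated server.
def batched_delete(actor: str, object_ids: list) -> str:
    activity = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Delete",
        "actor": actor,
        "object": object_ids,  # every post/comment in one payload
    }
    return json.dumps(activity)

payload = batched_delete(
    "https://lemmy.example/u/alice",
    [f"https://lemmy.example/comment/{i}" for i in range(1000)],
)
parsed = json.loads(payload)
print(parsed["type"], len(parsed["object"]))  # → Delete 1000
```

One HTTP delivery per inbox instead of a thousand would take most of the load off both sender and receiver, which is exactly the failure mode described above.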
It’s a huge undertaking that would take a ton of volunteers or a proper fundraiser to hire people full time to work on just that. And I feel like involving the GDPR in this adds a lot more rigid and legal requirements/expectations on top. This needs the whole fediverse to join forces and agree on some sort of standard on how to handle this universally.
IANAL, but fediverse instances need to find a way to automatically set up data processing agreements when initializing federation to be GDPR-compliant: https://gdpr.eu/what-is-data-processing-agreement/