  • It’s not the same issue at all.

    Piracy distributes power. It allows disenfranchised or marginalized people to access information and participate in culture, no matter where they live or how much money they have. It subverts a top-down read-only culture by enabling read-write access for anyone.

    Large-scale computing services like these so-called AIs consolidate power. They displace access to the original information and the headwaters of culture. They are for-profit services, tuned to the interests of specific American companies. They suppress read-write channels between author and audience.

    One gives power to the people. The other gives power to five massive corporations.




  • The artists (and the people who want to see them continue to have a livelihood, a distinct voice, and a healthy engaged fanbase) live in that society.

    “The platforms where the images are posted will be selling and brokering…”

    Isn’t this exactly the problem though?

    From books to radio to TV, movies, and the internet, there’s always:

    • One group of people who create valuable works
    • Another group of people who monopolize distribution of those works

    The distributors hijack ownership (or de facto ownership) of the work through one means or another (logistical superiority, financing requirements, or IP law fuckery) and exploit their position to make themselves the only channel for creators to reach their audience, and vice versa.

    That’s the precise pattern that OpenAI is following, and they’re doing it at a massive scale.

    It’s not new. Youtube, Reddit, Facebook, MySpace: all of these companies started with a public pitch about democratizing access to content. But a private pitch emerged underneath: becoming the main way that people access content. When it became feasible for them to turn against their users and liquidate them, they did.

    The difference is that they all had to wait for users to add the content over time. Imagine if Google knew they could’ve just seeded Google Video with every movie, episode, and clip ever aired or uploaded anywhere. Just say, “Mon Dieu! It’s impossible for us to run our service without including copyrighted materials! Woe is us!” and all is forgiven.

    But honestly, whichever way the courts decide, the legality of it doesn’t matter to me. It’s clearly a “Whose Line Is It Anyway?” situation where the rules are made up and ownership doesn’t matter. So I’m looking at “Does this consolidate power, or distribute it?” And OpenAI is pulling off perhaps the biggest power grab we’ve seen.

    Unrelated: I love that there’s a very distinct echo here of the previous era of tech grift, crypto. When confronted, the grifters would always say, “Well, there’s no way to undo it now! It’s on the blockchain!” There’s always this backup argument of “it’s inevitable, so you might as well let me do it.”



  • I’m dumbfounded that any Lemmy user supports OpenAI in this.

    We’re mostly refugees from Reddit, right?

    Reddit invited us to make stuff and share it with our peers, and that was great. Some posts were just links to the content’s real home: Youtube, a random Wordpress blog, a Github project, or whatever. The post text, the comments, and the replies only lived on Reddit. That wasn’t a huge problem, because that’s the part that was specific to Reddit. And besides, there were plenty of third-party apps to interact with those bits of content however you wanted to.

    But as Reddit started to dominate Google search results, it displaced results that might have linked to the “real home” of that content. And Reddit realized a tremendous opportunity: they now had a chokehold not just on user comments and text posts, but on anything that people dared to promote online.

    At the same time, Reddit slowly moved from a place where something might get posted by the author of the original thing to a place where you’ll only see the post if it came from a high-karma user or bot. Mutated or distorted copies of the original, reformatted to cut through the noise and win the favor of the algorithm. Re-posts of re-posts, with no reference back to the original, divorced from whatever context or commentary the original creator may have provided. No way for the audience to respond to the author in any meaningful way and start a dialogue.

    This is a miniature preview of the future brought to you by LLM vendors. A monetized portal to a dead internet. A one-way street. An incestuous ouroboros of re-posts of re-posts. Automated remixes of automated remixes.

    There are genuine problems with copyright law. Don’t get me wrong. Perhaps the most glaring is that many prominent creators don’t even own the copyright to the stuff they make. Copyright was invented to protect creators, but in practice this “protection” gets assigned to a publisher the moment the protected work comes into being.

    And then that copyright – the very same thing that was intended to protect creators – is used as a weapon against the creator and against their audience. Publishers insert a copyright chokepoint between the two, and they squeeze as hard as they like, wringing out every drop of profit and keeping creators and audiences far away from each other. Creators can’t speak out of turn. Fans can’t remix their favorite content and share it back with the community.

    This is a dysfunctional system. Audiences are denied the ability to access information or participate in culture if they can’t pay for admission. Creators are underpaid, and their creative ambitions are redirected to what’s popular. We end up with an auto-tuned culture – insular, uncritical, and predictable. Creativity reduced to a product.

    But.

    If the problem is that copyright law has severed the connection between creator and audience in order to set up a toll booth along the way, then we won’t solve it by giving OpenAI a free pass to do the exact same thing at massive scale.






    There are ways to watermark plaintext, but they’re relatively brittle: the watermark loses signal as the output is modified further, and you also need to know which specific LLM’s watermark you’re looking for.

    So it’s not a great solution on its own, but it could be part of something more comprehensive.
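
    To make that concrete, here’s a minimal sketch of the “green list” watermark-detection idea from the research literature (Kirchenbauer et al., 2023). The whitespace tokenizer, the hash, and the 50/50 split are simplifying assumptions for illustration, not any vendor’s actual scheme.

    ```python
    import hashlib

    def is_green(prev_token: str, token: str) -> bool:
        """A token is "green" if a hash seeded by the previous token puts it
        in the favored half of the vocabulary. A watermarking LLM biases its
        sampling toward green tokens; a detector only needs the same hash."""
        digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
        return digest[0] % 2 == 0  # ~50% of tokens are green by chance

    def green_fraction(text: str) -> float:
        """Fraction of green tokens, using naive whitespace tokenization."""
        tokens = text.split()
        if len(tokens) < 2:
            return 0.5
        hits = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
        return hits / (len(tokens) - 1)

    # Unwatermarked text hovers near 0.5; heavily watermarked text sits
    # well above it. Real detectors compute a z-score over the model's
    # actual token vocabulary rather than eyeballing a threshold.
    print(green_fraction("the quick brown fox jumps over the lazy dog"))
    ```

    Note how fragile this is: paraphrase a few words and the green fraction drifts back toward chance, and a detector using a different hash sees nothing at all.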

    As for non-plaintext file formats…

    A simple signature would indeed give us the source but not the method, but I think that’s probably 90% of what we care about when it comes to mass disinformation. If an article or an image is signed by Reuters, you can probably trust it. If it’s signed by OpenAI or Stability, you probably can’t. And if it’s not signed at all, or signed by some rando, you should remain skeptical.
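
    As a sketch of what that kind of signature could look like in practice, here’s Ed25519 signing via the pyca/cryptography library. The “news agency” framing and the placeholder bytes are assumptions for illustration, not any standard’s actual workflow.

    ```python
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # The publisher (say, a news agency) holds the private key and
    # publishes the public key for readers to verify against.
    private_key = Ed25519PrivateKey.generate()
    public_key = private_key.public_key()

    image_bytes = b"...raw bytes of the image file..."  # placeholder
    signature = private_key.sign(image_bytes)

    # Anyone can check that these exact bytes are what the publisher signed.
    try:
        public_key.verify(signature, image_bytes)
        print("signed by the holder of this key")
    except InvalidSignature:
        print("not signed by this key, or modified since signing")
    ```

    Verification says nothing about how the image was made, only who vouches for those exact bytes, which is exactly the source-but-not-method trade-off.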

    But there are efforts like C2PA that include a log of how the asset was changed over time, providing a much more detailed account of what was done by humans vs. by automated generative tools.
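
    C2PA defines its own signed manifest format, so purely to illustrate the underlying idea, here’s a toy hash-chained edit log where each entry commits to everything before it. The field names and actions are made up, not taken from the spec.

    ```python
    import hashlib
    import json

    def append_entry(log: list, action: str, tool: str) -> None:
        """Append an edit record that commits to the previous entry's hash,
        so earlier history can't be rewritten without breaking the chain."""
        prev_hash = log[-1]["hash"] if log else "genesis"
        entry = {"action": action, "tool": tool, "prev": prev_hash}
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        log.append(entry)

    log = []
    append_entry(log, "captured", "camera-firmware")       # human capture
    append_entry(log, "cropped", "photo-editor")           # human edit
    append_entry(log, "background-replaced", "gen-model")  # generative edit
    # Signing the final hash vouches for the entire edit history at once.
    print(log[-1]["hash"])
    ```

    Because every entry names the tool that made the change, a verifier can separate human edits from generative ones, and nobody can quietly rewrite earlier steps without breaking the chain.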

    I understand the concern about privacy, but it’s not like you have to use a format that supports proving an image is legit. If you want to prove that it is, though, then you have to provide something that grounds it in reality. It doesn’t have to be personally identifying. It could just be a key baked into your digital camera (assuming the resulting signature is strong enough that it’s computationally expensive to reverse-engineer the key and find out who bought the camera).

    If you think about it, it’s kind of crazy that we’ve made it this far with a trust model that’s no more sophisticated than “I can tell from the pixels and from seeing quite a few shops in my time”.