• MSids@lemmy.world
    4 months ago

    I was interested in Apple’s approach, where they would compare checksums of the images against checksums of known CSAM. It’s trivial to defeat by changing even a single pixel, but it’s the only acceptable way to implement this scanning. Any other method is an overreach and a huge invasion of privacy.
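    To make the single-pixel point concrete, here is a minimal sketch of exact-match checksum scanning. It assumes a plain cryptographic hash (SHA-256) as a stand-in; it is not a description of what Apple actually proposed, and the image bytes and the `known_bad` set are made up for illustration.

```python
import hashlib

def checksum(pixels: bytes) -> str:
    """Return the SHA-256 hex digest of raw image bytes."""
    return hashlib.sha256(pixels).hexdigest()

# Hypothetical 4x4 grayscale image, one byte per pixel.
original = bytes([10, 20, 30, 40] * 4)

# Same image with a single pixel nudged by one brightness level.
altered = bytearray(original)
altered[0] += 1
altered = bytes(altered)

# Stand-in for a list of known checksums (assumption, not a real database).
known_bad = {checksum(original)}

print(checksum(original) in known_bad)  # True: an exact copy matches
print(checksum(altered) in known_bad)   # False: one changed pixel evades an exact-match list
```

    With a cryptographic hash, any change to the input gives an unrelated digest, which is why exact matching is easy to evade.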

    • mox@lemmy.sdf.orgOP
      4 months ago

      It’s trivial to defeat

      Maybe, depending on the algorithm used. Some are designed to produce the same output given similar inputs.
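      As a rough illustration of "same output given similar inputs", here is a toy average hash written from scratch. It is only a sketch of the general idea behind perceptual hashing, not any vendor's algorithm, and the 4x4 pixel grids are invented for the example.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Toy average hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of bit positions where two hashes differ."""
    return bin(a ^ b).count("1")

# Hypothetical 4x4 grayscale image (a simple checkerboard of bright and dark blocks).
original = [
    [200, 200, 30, 30],
    [200, 200, 30, 30],
    [30, 30, 200, 200],
    [30, 30, 200, 200],
]

# Copy with a single pixel changed by one brightness level.
tweaked = [row[:] for row in original]
tweaked[0][0] = 201

# Copy with bright and dark swapped, i.e. a genuinely different picture.
inverted = [[230 - p for p in row] for row in original]

print(average_hash(original) == average_hash(tweaked))             # True
print(hamming(average_hash(original), average_hash(inverted)))     # 16
```

      The one-pixel tweak barely moves the mean, so every bit stays the same, while the inverted image flips all 16 bits.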

      It’s also easy to abuse systems like that in order to get someone falsely flagged, by generating a file with the same checksum as known CSAM.

      It’s also easy for someone in power (or with the right access) to add checksums of anything they don’t like, such as documents associated with opposing political or religious views.

      In other words, still invasive and dangerous.

      More thoughts here: https://www.eff.org/deeplinks/2019/11/why-adding-client-side-scanning-breaks-end-end-encryption

      • MSids@lemmy.world
        4 months ago

        Checksums wouldn’t work well for their purposes if they could easily be made to match any desired checksum. It’s one-way math.

        • mox@lemmy.sdf.orgOP
          4 months ago

          One-way math doesn’t preclude finding a collision.

          (And just to be clear, checksum in the context of this conversation is a generic term that includes cryptographic hashes and perceptual hashes.)

          Also, since we’re talking about a list of checksums, an attacker wouldn’t even have to find a collision with a specific one to get someone in trouble. This makes an attack far easier. See also: the birthday problem.
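          A small sketch of why a list makes the attacker's job easier. It assumes a deliberately tiny 16-bit digest (SHA-256 truncated) so the search finishes instantly; the "known file" and "candidate" names are made up, and real digests are far longer. Strictly speaking this is a multi-target search, which is related to the birthday problem rather than identical to it.

```python
import hashlib

BITS = 16  # toy digest size (assumption) so the brute-force search is quick

def tiny_hash(data: bytes) -> int:
    """First BITS bits of SHA-256, as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest()[: BITS // 8], "big")

def p_hit(targets: int, tries: int, bits: int = BITS) -> float:
    """Chance that at least one of `tries` random inputs lands on any of `targets` digests."""
    return 1 - (1 - targets / 2**bits) ** tries

def tries_until_hit(wanted: set[int]) -> int:
    """Count candidate inputs until one hashes into `wanted`."""
    n = 0
    while True:
        n += 1
        if tiny_hash(b"candidate-%d" % n) in wanted:
            return n

# Hypothetical flag list: digests of 100 made-up "known" files.
flag_list = {tiny_hash(b"known-file-%d" % i) for i in range(100)}
one_entry = {next(iter(flag_list))}

# Probability of a hit within 1000 tries: one target vs. the whole list.
print(p_hit(1, 1000), p_hit(len(flag_list), 1000))

# Brute-force attempts needed: one specific entry vs. any entry on the list.
print(tries_until_hit(one_entry), tries_until_hit(flag_list))
```

          Matching any entry can never take more attempts than matching one particular entry, and on average it takes roughly a factor of the list size fewer.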

    • ocassionallyaduck@lemmy.world
      4 months ago

      Even this method is overreach: who controls the database?

      A journalist has a scoop on a US civil rights violation? Well, not if it matters to the CIA, which slipped the PDF that was his evidence into the hash pool and had his phone silently rat him out as the one reporting.

      This hands ungodly power to those running that database. It’s blind, and it “only flags the bad things”. We all agree CSAM is bad, but if I were in that position I could easily ruin someone inconvenient to me just by ensuring some of his personal and unique photos get into the hash list. It’s a one-way process, so everyone would just believe definitively that this radical MLK guy is a horrible pedo because we got some images off his phone in a diner.

    • douglasg14b@lemmy.world
      4 months ago

      It’s not as easy to defeat as just changing a pixel…

      CSAM detection often uses existing image matching technology such as PhotoDNA by Microsoft. Similarly, both Facebook and Google have image matching algorithms and software that are used for CSAM detection.

      These are all hash-based image matching tools used for broad feature sets, such as reverse image search in Bing, and they are not defeated by simply changing a pixel, or even by redrawing parts of the image.

      You’re not just throwing MD5 or SHA at an image’s binary. It’s much more nuanced and complex than that; otherwise hash-based image matching would be essentially useless for anything of consequence.
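      For a feel of how this kind of matching can tolerate edits, here is a toy difference hash with a distance threshold. PhotoDNA and the other tools named above are proprietary and work differently; this is only a sketch under made-up assumptions (the pixel grids, the `matches` helper and its `max_distance` default are all hypothetical).

```python
def dhash(pixels: list[list[int]]) -> int:
    """Toy difference hash: one bit per horizontal neighbor pair, set when the left pixel is brighter."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def matches(h: int, known: set[int], max_distance: int = 2) -> bool:
    """True when `h` is within `max_distance` differing bits of any known hash."""
    return any(bin(h ^ k).count("1") <= max_distance for k in known)

# Hypothetical 4x5 grayscale image made of alternating gradients.
known_image = [
    [10, 50, 90, 130, 170],
    [170, 130, 90, 50, 10],
    [10, 50, 90, 130, 170],
    [170, 130, 90, 50, 10],
]
known_hashes = {dhash(known_image)}

# Whole image brightened: neighbor ordering is unchanged, so the hash is identical.
brightened = [[p + 20 for p in row] for row in known_image]

# One pixel painted over: only the comparisons touching that pixel can flip.
retouched = [row[:] for row in known_image]
retouched[0][2] = 0

# Mirrored image: every gradient runs the other way, so it is treated as unrelated.
unrelated = [row[::-1] for row in known_image]

print(matches(dhash(brightened), known_hashes))  # True
print(matches(dhash(retouched), known_hashes))   # True
print(matches(dhash(unrelated), known_hashes))   # False
```

      The threshold is the trade-off knob: a larger `max_distance` survives heavier edits but also raises the chance of flagging unrelated images.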