misk@sopuli.xyz to Technology@lemmy.world · 11 months ago
AI image training dataset found to include child sexual abuse imagery (www.theverge.com)
sir_reginald@lemmy.world · 11 months ago
Removing these images from the open web has been a headache for webmasters and admins for years on sites that host user-uploaded images. If the millions of images in the training data were automatically scraped from the internet, I don’t find it surprising that there was CSAM there.
Communist@lemmy.ml · 11 months ago
Don’t they need to label the data?
Not manually