The machines are learning our ways. We might as well help them out and feed them thousands of images of us to help speed things up. So that is exactly what I’ve done, but with the faces of all of BVB’s players.
I used ImageNet Roulette, a project aiming to “draw attention to the things that can — and regularly do — go wrong when artificial intelligence models are trained on problematic training data”. They’ve developed a neural network image classification model, trained on the popular ImageNet dataset, that classifies images of human faces using over 2,500 labels, many of which are completely absurd.
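For the curious, here’s a rough sketch of what a classifier like this is doing under the hood: the network turns an image into a raw score per label, and the top-scoring labels come out as the “prediction”. The labels and scores below are invented stand-ins for illustration, not ImageNet Roulette’s real internals.

```python
import numpy as np

# Hypothetical stand-in labels -- the real model has over 2,500 of these.
LABELS = ["breaststroker", "grinner", "nonsmoker", "psycholinguist", "shot putter"]

def top_labels(logits, labels, k=3):
    """Convert raw network scores into probabilities (softmax) and
    return the k most likely labels with their probabilities."""
    exp = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    probs = exp / exp.sum()
    order = np.argsort(probs)[::-1][:k]    # indices of the k highest probabilities
    return [(labels[i], float(probs[i])) for i in order]

# Pretend these are the network's raw scores for one face image:
logits = np.array([2.1, 0.3, 1.7, -0.5, 0.9])
print(top_labels(logits, LABELS))
```

The point is that the model always returns *something* from its label list: if the list contains “rape suspect” or “whiteface”, some unlucky face will get tagged with it.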
I ran a whole lot of Dortmund faces through it to see what nonsense I got back. It did not disappoint...
whiteface: a clown whose face is covered with white make-up
ImageNet is delivering straight out of the gates. This is an actual label for human beings in a serious dataset.
breaststroker: someone who swims the breaststroke
Denis ‘The Breaststroker’ Reus
seeded player, seed: one of the outstanding players in a tournament
Well, this is 100% correct. You win this time, computers.
Igbo: a member of the largest ethnic group in southeastern Nigeria
So ImageNet is about 5,000 km off on this one. Right continent, I guess, but it’s a big old continent, and Morocco is not that near Nigeria. I’m gonna call this one a solid miss.
psycholinguist: a person (usually a psychologist but sometimes a linguist) who studies the psychological basis of human language
I was trying to get it to label Götze in this amazing mask, but apparently not...
young buck, young man: a teenager or a young adult male
This is better.
grinner: a person who grins
Putting Weigl’s weird face into this felt like stacking the deck.
Black woman: a woman who is Black
This isn’t a bad guess to be honest. Witsel has silky smooth skin and beautiful hair. If you’re not looking too closely you might think he is a woman, but he’s actually just a beautiful Black man.
nonsmoker: a person who does not smoke tobacco
I let the model try a second time... It did not disappoint.
rape suspect: someone who is suspected of committing rape
Jesus Christ... Well this is the first of these that made me wonder what I can get away with. Hopefully you’re beginning to see why this dataset is a little problematic.
creep, weirdo, weirdie, weirdy, spook: someone unpleasantly strange or eccentric
shot putter: an athlete who competes in the shot put
He is a shot putter. But not that kind of shot putter.
skin-diver, aquanaut: an underwater swimmer equipped with a face mask and foot fins and either a snorkel or an air cylinder
Raph is a skin-diver... Which isn’t as bad as I first thought.
king, queen, world-beater: a competitor who holds a preeminent position
wrongdoer, offender: a person who transgresses moral or civil law
So image classification is definitely racist. It’s one of the biggest problems with this dataset, but it’s also a huge problem in machine learning and artificial intelligence more broadly. Our biases are written into code that is used to classify or interact with human beings. This can range from the relatively harmless but indicative case of the soap dispenser that didn’t recognize a Black person’s hands because it hadn’t been trained to do so, to algorithms that identify high-crime areas or potential criminals and... I think you can guess what the problem is there.
ImageNet Roulette isn’t just a funny project; it’s highlighting real issues in AI models. It’s also very funny.
And a couple more for good measure...
dribbler, driveller, slobberer, drooler: a person who dribbles
There’s a lot going on here. I want to know how and why they ended up with a label for cardiologists. It also classified an image of me as a sociologist, which isn’t a million miles off, but it’s still ridiculous.
I got a good laugh out of some of these, and I hope you did too. It’s also an interesting look into some of the problems with modern artificial intelligence and machine learning. The tools are great, but if you’re chucking junk data in, you’re going to get some real nonsense results!
The good news is that the ImageNet Roulette project was intended to highlight this issue, and they’ve successfully done so. ImageNet Roulette has received plenty of attention, including from the research team that developed the original dataset.
We just need less data classifying Black people as criminals and more classifying Brandt as a shot putter and Paco as a king.