
You've been training AI for free

This is gonna sound like a scam, but I found a way to make extremely small amounts of money online. So the one that caught my eye here is Pinterest. Which, we all know and love Pinterest. "Determine the topical relatedness between pieces of text." 40 cents. Wait, there's one more, so this one's "Find info from an email." For three cents. They're just gonna email me, and then I, like, find info from it? Which seems cool. It turns out Amazon runs an online marketplace that farms out basic tasks computer programs have a hard time with. It's called Mechanical Turk, named after a robot from the 1700s. But people just call it MTurk. So what I wanna do now is, Leonard Monteiro will pay me three cents to write the prices shown in an image. I'm not totally sure why this is, like, worth doing money, but okay, let's do it. It looks like a parking meter? I hope that this is for a good cause somehow, that this app is helping people rather than just sending bills to people. So I feel like on the other end of this task there's some sort of automated parking meter app. Which is weird, but it turns out stuff like this happens all the time. Nearly every successful AI product has human beings behind it. You just don't see them until you look at the big picture.


So what's going on here? Is there actually an app that claims to read parking meter fines, but it's actually humans doing it? Or am I helping train their AI, and these meters are just hypothetical examples? To figure that out, I asked a resident AI expert, James Vincent. - So, I did a little Googling here and yeah, sorry Russell, this is not a parking meter at all. This is actually a little gadget you put in supermarkets, and you scan barcodes to check the prices. Now, the bigger question is why does someone want you to write down all these prices? And I have two answers for you. In the first case, creating training data. Say you want to make a machine vision system that automatically does what you're doing. How does it actually know where to look in the picture? How does it know what the barcode scanner looks like?

To teach it that information, you need to feed it labeled data. You need to get a human to do that labeling, in this case, Russell. He labels the data, it goes into the system, and the system learns what these things look like. - (laughing) Oh no, there's so much more! - That is how you train an AI system, but sometimes these systems, they don't work, right? So, use case number two, what's that? Well, that might be where the AI system actually can't do what it says it can do. It might be that there's too much glare in the picture and it can't read the numbers on the screen very well. In those cases, you need to throw the data to someone who has the intelligence to work out what's going on. That's not a machine, that's a human, like Russell. And they will label the data for the machine and return it back to the end user. Sometimes companies are upfront about this, and sometimes they lie about it too.

Sometimes they will say, "Yes! We've got a whizzy AI system that's doing all this automatically." And actually, they don't. Turns out that that AI is a lot of low-paid workers on a system like Mechanical Turk, like Russell, providing this data in the background. - If you've ever filled out a CAPTCHA, you've probably done some of that work yourself. In theory, those tests are meant to verify that you're human, but Google has started using them to collect data for other products too. Typing out this blurry word could help the character recognition algorithm in Google Books. These skewed numbers are probably helping confirm an address in Google Street View. The most recent CAPTCHAs ask you to identify all the squares of a picture that have a car in them, at the same time that Google's Waymo branch is trying to train self-driving cars. Even a simple task like setting a timer with Google Assistant can require an army of contractors manually annotating the data, as a recent Guardian investigation showed. Sometimes users do the labeling themselves. Facebook has some of the best facial recognition data in the world, because they already have dozens of pictures of your face. You added them yourself.


Multiply that across billions of users and it's all the data you need to build a facial recognition system, which can then start automatically tagging your friends in the next set of pictures you upload. Suddenly, Facebook has one of the most advanced facial recognition systems in the world, and they didn't have to pay a dime for it. When researchers at Google were trying to build a depth-sensing camera, they went even further. What they really needed were a bunch of videos where mobile cameras explored static space from different angles. But where would they find that? (atmospheric music) Google downloaded 2,000 mannequin challenge videos, fed them into an algorithm, and a new kind of depth-sensing software was born. Think about it: every minute, 500 new hours of content are added to YouTube. If you're training an AI, that's a lot of video to draw on. And there are no copyright restrictions on what you can use for training data. The same goes for websites, images, Wikipedia pages; it's all just there for the taking.


This has been a huge driving force for the AI boom. These systems need lots of examples to recognize even the most basic patterns. That used to mean months of data entry, but now you can scrape everything you need from the internet in a matter of hours. And the people who made the mannequin challenge videos, they didn't think they were encoding depth information. If the researchers hadn't talked about their training system, it would feel like they'd done it all on their own. - The remarkable thing about AI systems is that even though they are built on a foundation of human intelligence, they regularly transcend that, and do something that surprises us or goes beyond what we thought was possible. One fantastic example of this is the AlphaGo program, which was designed by DeepMind, which was Google's AI lab here in London. And in 2016 and 2017 it played and beat the human champions of the ancient board game Go.


There's one particularly famous moment that is now known simply as move 37. It was a move that was so unusual, so counter to human expectations, that the match's commentators thought it was a mistake. But it wasn't. It was a beautiful play that completely undermined Lee's match, and led to AlphaGo winning the game. And it was something that humans couldn't teach. It was something that the machine had learnt by itself. Yes, it started from a foundation of human intelligence, but it went beyond that. This, I think, is where people get so excited by AI. We're a long, long way away from building computers that are as flexibly intelligent and sophisticated as humans, but we can still build algorithms and systems that exceed human intelligence, even in very specific domains. - But that's AI at its best. The flip side is when an app needs a description of what's in a photo, and the photo-recognizing algorithm just doesn't work.

So you get a human being to fill it in, usually through a post on Mechanical Turk. That's a very old trick, going all the way back to the machine that gave the site its name. The original Mechanical Turk was this guy, a master chess-playing robot. Hundreds of years before there was anything we would think of as a computer. The Turk could beat most chess players, playing so well that people thought it was a technological marvel. But really, it was just a trick. There was a human being inside, hiding under the table and directing the moves from below. It was a human being, dressed up as a machine. A trick no one had thought of until then. And as Amazon can tell you, the trick still works. Thanks for watching, I hope you liked the video. If you wanna know more about AI, we did a whole video about what these changes look like at a social scale, whether AI's destroying jobs, or gonna make everything free. So you can check that out here, or like and subscribe.
