You may have heard that machine learning technology has been able to beat all the world's greatest players of the ancient game of Go.
But can it book a flight on the web?
That's the intriguing prospect raised by the latest research from Google AI researchers.
In a new paper, the team describes training a neural network to understand the structure of web pages and the choices it can make when filling out forms in an airline ticket booker or interacting with a social media site.
The work broadly employs the same category of machine learning as Google's Go-winning AlphaZero software, what is known as "reinforcement learning." In RL, a neural network develops strategies of steps to take at each stage of trying to solve a problem as it receives rewards for good choices.
The researchers figured out a way to train a neural network without giving it human examples of how to navigate an online booking form. The approach makes the task of learning webpages and social media networks more "scalable," they write, in settings where the possible combinations of states and actions can reach into the tens of millions.
The point is not necessarily to actually book a flight; it's more an exercise in how a neural network can find solutions to a problem with numerous variables, where human guidance, or "supervision," in training is infeasible.
The research paper, "Learning To Navigate The Web," posted December 21st on the arXiv pre-print server, is authored by Izzeddin Gur, Ulrich Rueckert, Aleksandra Faust, and Dilek Hakkani-Tur, all associated with Google AI. The paper will be presented in a poster session at the upcoming International Conference on Learning Representations, taking place next May in New Orleans.
This is more than just bots to crawl the Web. The authors describe the problem as being intractable when "learning from large set of instructions" that can include fields of a web form that have to be filled out, and long lists of things in the kind of drop-down menu picker a person would encounter on a flight booking site.
"As an example, in the flight-booking environment the number of possible instructions/tasks can grow to more than 14 millions, with more than 1700 vocabulary words and approximately 100 web elements at each episode."
The work picks up where another left off, last year's "World of Bits," by Tianlin Shi and colleagues at Stanford University. That paper tested the ability of a computer to learn to carry out mouse clicks and keyboard strokes to complete tasks on the Web, based on demonstrations provided by people.
Like the authors of that paper, the Google folks employ reinforcement learning, in this case the "Deep Q-Network" approach, where the neural network adjusts its estimation of future rewards as it steps through problem tasks, making choices.
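To make the idea of "adjusting its estimation of future rewards" concrete, here is a minimal sketch of the update rule that Q-learning is built on. It is a simplified stand-in, not the paper's network: a plain lookup table replaces the deep neural network, and the state names, action names, learning rate and discount factor are all made up for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """Move Q(state, action) toward reward + discounted best future value."""
    # Best value the agent currently expects from the next state
    # (0.0 when the episode is over and no actions remain).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-step form: type a date, then click submit.
q = defaultdict(float)
q_update(q, "date_filled", "click_submit", 1.0, "done", [])
q_update(q, "empty_form", "type_date", 0.0, "date_filled", ["click_submit"])
```

After these two updates, the estimate for typing the date is already above zero even though that step earned no reward itself, because it leads to a state where submitting pays off. A Deep Q-Network applies the same kind of target, but uses a neural network to estimate the values instead of a table.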
But the Google researchers couldn't use human demonstrations, as in the World of Bits case, so they came up with what they assert are two "novel neural network architectures."
The first, "QWeb," is a Deep Q-Network that is enhanced by handing out rewards for each step in a travel booking exercise, such as entering the date of a flight, rather than only at the end. That tends to increase the rewards that the neural net receives as it goes along.
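One simple way to picture a per-step reward is to score how much closer each action brings the form to the requested booking. The sketch below is an assumption made for illustration, not necessarily the paper's exact formulation; the field names, the values and the helper functions `potential` and `step_reward` are hypothetical.

```python
def potential(form, goal):
    """Fraction of the goal's fields that the form currently has right."""
    matched = sum(1 for field, value in goal.items()
                  if form.get(field) == value)
    return matched / len(goal)


def step_reward(form_before, form_after, goal):
    """Reward for one action: the change in progress toward the goal."""
    return potential(form_after, goal) - potential(form_before, goal)


# Hypothetical booking instruction with three fields.
goal = {"from": "SFO", "to": "JFK", "date": "2019-05-06"}

good_step = step_reward({}, {"from": "SFO"}, goal)
wrong_step = step_reward({"from": "SFO"}, {"from": "SFO", "to": "LAX"}, goal)
undo_step = step_reward({"from": "SFO"}, {"from": "OAK"}, goal)
```

Here a correct entry earns a positive reward, a wrong entry earns nothing, and overwriting a correct field is penalized, so the agent gets feedback at every step instead of waiting until the form is submitted.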
The second, called "INET," is another Deep Q-Network that gets rewards as it properly generates instructions for QWeb to follow. It's the INET's job to digest the Web page, in the form of a "document-object model," or "DOM," and come up with the steps QWeb should take to make choices in the Web form, such as picking an airport code from a drop-down list of "destinations" in the form.
There are numerous other details where the authors tried things a little bit differently from previous approaches. For example, they used a technique called "curriculum learning" to break down big tasks into smaller ones, helping the neural net get through the multiple steps of a Web form.
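As a rough illustration of curriculum learning on a form, imagine handing the agent a form that is already partly filled in and then leaving more and more of it blank as training progresses. The sketch below assumes that setup; the function name `curriculum_start`, the field names and the staging scheme are hypothetical and are not taken from the paper.

```python
def curriculum_start(goal, fields_for_agent):
    """Pre-fill all but `fields_for_agent` fields of the goal.

    A small `fields_for_agent` is an easy stage; a value equal to
    len(goal) is the full task, starting from an empty form.
    """
    fields = list(goal)
    n_prefilled = max(0, len(fields) - fields_for_agent)
    return {field: goal[field] for field in fields[:n_prefilled]}


# Hypothetical booking instruction with three fields.
goal = {"from": "SFO", "to": "JFK", "date": "2019-05-06"}

# Stage 1 leaves one field to the agent, stage 3 leaves all of them.
stages = [curriculum_start(goal, k) for k in range(1, len(goal) + 1)]
```

Each stage is a smaller version of the same problem, so the network can collect rewards early on and carry what it learned into the harder stages.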
They also used what are known as "shallow encodings," to enhance the neural net's understanding of the webpage. That way, it doesn't just see a vast list of airport names, it also acquires some sense of the structure of the webpage it's on.
The authors report that on simple tasks, such as clicking on a dialogue box or logging a user in through a form, their network matched the results the Stanford group obtained with human demonstrations, without using any human demonstrations itself.
In more complex tasks, such as a benchmark developed by the Stanford group and referred to as "social-media-all," the computer must do things like block a given user on Twitter. The Google researchers relate that their enhanced neural network was able to succeed "where previous approaches failed to generate any successful episodes."
In the challenge of booking a flight, the little tricks, they report, such as shallow encoding, helped the neural network achieve success each time. Without those little tricks, they note, their network behaved in a fashion that sounds like a bored web surfer: "QWeb starts clicking submit button at first time step to get the least negative reward." Sounds just like an actual human experience of booking a flight online.
The authors write that they plan in future work to test their network in more complex environments with still more steps.
Perhaps they can teach it to solve captchas, since most humans often seem flummoxed by them.