A really long time ago, I read the dissertation written by Hofstadter and his associates regarding the program known as Metacat, whose life's ambition was to understand analogies and produce a new analogy following the rules given by an initial analogy. The catch was that the rules behind the first analogy were not explained to it: it had to "figure it out," as it were. It occasionally got it right, but what was more interesting was when it got it "wrong": technically, the analogies it came up with did follow a rule that fit the first analogy... it was just a different rule than the one the humans giving it the examples had thought of.
Teaching an AI is, I imagine, roughly similar to teaching a child: you go over the basic concept again and again, then give the AI a chance to demonstrate that it has learned what you've been teaching it. However, I suspect that there may be a better way, at least for AI, given just how differently they think from the way we do.
Instead of just giving the AI examples of what you want and seeing what happens, allow it to give the humans examples. Allow it to produce an analogy, then request that the humans teaching it produce a similar analogy.
For example, let's say that we give the AI these two analogy sets: "AA:BB, CC:??", where it needs to figure out the second half of the second analogy. Now, I can come up with at least two very reasonable responses: DD, following a +1 increment on letter position; or FF, following a doubling of the letter's position (A is 1, double is 2, so B; C is 3, double is 6, so F). Doubtless there are others, all assuredly involving more complicated maths or takes on what exactly is going on in these analogies.
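To make the ambiguity concrete, here is a minimal sketch of the two readings described above. The rule names (`increment`, `double`) and the helper functions are my own illustrative choices, not anything Metacat actually does; the point is only that both rules fit "AA:BB" and then disagree on "CC".

```python
from string import ascii_uppercase as LETTERS

def position(letter):
    """1-based alphabet position: A -> 1, B -> 2, and so on."""
    return LETTERS.index(letter) + 1

def letter_at(pos):
    """Inverse of position(); rejects positions outside A..Z."""
    if not 1 <= pos <= len(LETTERS):
        raise ValueError(f"position {pos} is outside the alphabet")
    return LETTERS[pos - 1]

# Hypothetical candidate rules, each mapping a letter position to a new one.
RULES = {
    "increment": lambda p: p + 1,
    "double": lambda p: p * 2,
}

def apply_rule(rule, word):
    """Apply a position rule to every letter of a word."""
    return "".join(letter_at(rule(position(ch))) for ch in word)

def consistent_rules(source, target):
    """Names of the rules that turn source into target."""
    return [name for name, rule in RULES.items()
            if apply_rule(rule, source) == target]

def candidate_answers(source, target, probe):
    """Answer the probe with every rule that fits the worked example."""
    return {name: apply_rule(RULES[name], probe)
            for name in consistent_rules(source, target)}

print(candidate_answers("AA", "BB", "CC"))
# {'increment': 'DD', 'double': 'FF'}
```

Both rules survive the first analogy, so the example alone cannot tell the learner which one the humans meant.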
The humans who gave this example have a clear idea in mind, but the AI has little clue which one is right. Humans are, for the moment, more sensitive to context and tend to think more simply at younger ages: a young child will, for example, almost assuredly give DD as the answer. Most people might not even consider FF as a possible answer, if only because performing unusual mathematical operations on letter position may not occur to them. But to an AI, this shit is all math, which leads it to do things that may - at first glance - seem preposterous.
This is why the idea of learning by doing - by making the AI give the humans an example and an unfinished analogy - would be so incredibly powerful. On the one hand, it gives the AI an opportunity to see how other people solve analogies: it can examine the concept of the analogy when it is not the one being tested, and see how other entities react to the sorts of information that it has been tested on. On the other, it gives us even greater insights into how a particular AI is thinking: what sorts of analogies is it creating? Are they human-easy or human-difficult? Are they long and complex, or are they short and simple? Through looking at what sorts of examples the AI creates, we get a better picture of its own understanding of analogies and how they are constructed.
Obviously, this concept can be expanded out to nearly anything, not just analogies. I think that one of our problems with teaching AIs to date is that we are not giving them the opportunity to learn through watching other entities work through the problems we're giving them: instead, we force them to solely rely upon their own experiential data and the information required to make decisions about the tasks we are making them go through.
Just as the presence of sensorimotor equipment may prove to be crucially important to the development of an AI, we also need to keep in mind the other myriad ways in which our present AI teaching methodologies are non-reflective of how human minds are taught and grow. The more alike we can make their environments and methodologies, the more likely we are, I think, to be able to arrive at strong AI.
Sunday, June 2, 2013
Friday, May 10, 2013
Agent Is As Agent Does
I've been thinking about this concept off and on for a couple months now, so I think I should type it down. Keep in mind that it's unpolished: I don't have a solid take on the precise execution of the concept, or on whether it is actually functionally different from anything else in the space, but it seems to me an interesting way to handle things. It also seems to be an interesting avenue for investigating cognitive science.
Anyway, on to the concept.
Let's say that you're playing chess. Now, in chess, you have the game state, and a variety of pieces, each with different abilities and functionality. The overall goal is to win by checkmating your opponent's king.
Now, as a bit of an AI guy, I would say that the best way for an AI to accomplish this would be reinforcement learning. The AI plays randomly until it wins a game, at which point it propagates a value back through the moves it made, making those moves preferable. So long as the state of the game exactly matches a position it has seen before, it will have a value associated with a given move that it has made previously. Throw in a chance for the AI to say, "eh, fuck it," and try something new, and... we're done here.
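For the curious, here is a rough sketch of that loop in Python. It is not a chess engine: the states and moves are placeholder strings, and the `step_size` and `discount` parameters are assumptions of mine, not part of the description above. It shows the two ingredients only: pushing the final result back through the moves of a game, and an epsilon-greedy "try something new" choice.

```python
import random
from collections import defaultdict

def choose_move(values, state, moves, epsilon, rng):
    """Epsilon-greedy: usually the best-known move, sometimes a random one."""
    if rng.random() < epsilon:
        return rng.choice(moves)
    # Values are keyed by the exact (state, move) pair, so an unseen
    # position falls back to the default value of 0.0 for every move.
    return max(moves, key=lambda m: values[(state, m)])

def backup(values, history, reward, step_size=0.5, discount=0.9):
    """Push the final reward back through the (state, move) pairs of one game."""
    target = reward
    for state, move in reversed(history):
        key = (state, move)
        values[key] += step_size * (target - values[key])
        target *= discount  # earlier moves get a smaller share of the credit

values = defaultdict(float)
# Placeholder labels standing in for real positions and moves.
history = [("start", "e4"), ("mid", "Nf3"), ("late", "Qh5")]
backup(values, history, reward=1.0)

rng = random.Random(0)
print(choose_move(values, "late", ["Qh5", "a3"], epsilon=0.0, rng=rng))  # Qh5
```

Because the table is keyed on exact positions, nothing learned in one position transfers to a nearly identical one, which is a big part of why this approach is impractical for chess.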
This is boring and impractical. Not only that, but it doesn't seem a very good model for how humans do it. The randomness aspect, yes, but not the overarching approach: we know, for instance, that the best humans don't do much look-ahead. While the AI in question isn't doing actual look-ahead, it is effectively doing it.
Now, I'm a big proponent of the idea that AI thinking doesn't need to mimic human thinking, so the idea that the AI does it differently than the human doesn't bother me too much. But anyway.
So what if, instead of approaching the board from an overview, RTS-style approach, the AI gives each of its pieces its own agency? In effect, rather than looking at the whole of the game state, each piece looks around itself, at the parts of the state that are relevant to it, and applies values to each potential move it could make. Apply a sort of Bayesian agency to the system as a whole at the AI's overview level, and you get something really interesting.
So let's say that this particular AI - let's call him Sam - has had much success in the past with his opening move as moving one of the four central-most pawns ahead two spaces. Sam doesn't have access to this information directly - rather, what is going on here is that the pawns each say, "hey, my value of moving ahead two spaces is 8, my value for one space ahead is 4, and my value for not moving is -2." The other four pawns say some useless things and return values that aren't as good, so they're immediately discounted by the Bayesian agency at the top level.
So the Bayesian agent here has a bit of a problem: he has four sub-agents all giving him the same value. However, as the sub-agents learn, so, too, does the B agent: it knows that listening to the fifth pawn from the left usually yields successes (thanks to RL), because rather than working with raw values alone, it has percentages associated with each sub-agent's responses. So it returns to Sam: move the fifth pawn from the left forward two spaces.
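Here is one way the tie-break could look, as a sketch under my own assumptions: "Bayesian" is read here as a simple win-count estimate per sub-agent (a uniform prior updated after each game), and the names `Proposal`, `pick`, and `update_trust` are invented for illustration. The pawn labels count from the left, so `pawn_5` is the fifth pawn.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proposal:
    agent: str    # which piece is speaking
    move: str     # the move it proposes
    value: float  # the piece's own learned value for that move

def pick(proposals, trust, default_trust=0.5):
    """Choose the proposal whose value, weighted by trust in its agent, is highest."""
    return max(proposals, key=lambda p: p.value * trust.get(p.agent, default_trust))

def update_trust(counts, agent, won):
    """Update [wins, games] for an agent and return its new success estimate.

    Starting from [1, 2] corresponds to a uniform prior: one win in two
    imaginary games, so an unknown agent starts at 50%.
    """
    wins, games = counts.get(agent, [1, 2])
    counts[agent] = [wins + (1 if won else 0), games + 1]
    return counts[agent][0] / counts[agent][1]

# The four central pawns all report the same raw value for moving two spaces.
proposals = [Proposal(f"pawn_{i}", "forward_two", 8.0) for i in (3, 4, 5, 6)]
trust = {"pawn_5": 0.7}  # listening to the fifth pawn has paid off before

print(pick(proposals, trust).agent)  # pawn_5
```

The raw values tie at 8, so the decision comes entirely from how much the top-level agent has learned to trust each speaker.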
In more complicated game states, the B agent could act as an intermediary between sub-agents. Say that two sub-agents, a queen and a knight, are both returning a 15 for a given move; to further complicate it, the B agent believes both have a 40% chance of success. All other moves are significantly weaker. The B agent might then have the authority to create a new sub-agent that takes both sub-agents' actions into account, analyzes the effects of both actions, and returns that information to the B agent, thus allowing for a cogent decision to be made. The level of success of that action can then be propagated through to the sub-agents and the B agent, potentially allowing for a more intelligent strategy to emerge in a way that wouldn't be possible otherwise.
I realize that, taking a step back, this looks like an overcomplication of standard RL methods. Why have a multitude of agents all arguing and clamoring for attention, when the end result winds up being roughly the same - potentially even mildly worse, given potential information loss - and is significantly more expensive computationally?
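A sketch of that escalation step, with heavy caveats: I haven't worked out what the joint analysis would actually be, so `analyse` below is a stand-in function supplied by the caller, and `top_ties` and `make_joint_agent` are hypothetical names. The move labels are placeholders too.

```python
def top_ties(scored, tolerance=1e-9):
    """Return every (agent, move, score) entry within tolerance of the best score."""
    best = max(score for _, _, score in scored)
    return [entry for entry in scored if best - entry[2] <= tolerance]

def make_joint_agent(tied, analyse):
    """Build a temporary agent that compares the tied moves with a deeper analysis.

    `analyse(state, move)` is assumed to return a number where higher is better.
    """
    def joint_agent(state):
        results = {(agent, move): analyse(state, move) for agent, move, _ in tied}
        return max(results, key=results.get)
    return joint_agent

# Queen and knight both report 15, and the B agent trusts each at 40%.
scored = [
    ("queen", "Qd5", 15 * 0.4),
    ("knight", "Nf6", 15 * 0.4),
    ("pawn", "a3", 3 * 0.5),
]
tied = top_ties(scored)

# Stand-in analysis: pretend a closer look favours the knight move.
fake_outcomes = {"Qd5": 0.2, "Nf6": 0.6}
joint = make_joint_agent(tied, lambda state, move: fake_outcomes[move])

print(joint("some position"))  # ('knight', 'Nf6')
```

Whatever the joint agent decides, the result of the move can then be fed back to the queen, the knight, and the B agent like any other outcome.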
Because the agents have different goals.
Consider this. Instead of a single agent, Sam, with the goal to take the opponent's king, you have a multitude of agents, all with different goals. Pawns want to get to the end of the board and get promoted. Kings want to avoid enemy pieces, with a stronger desire for that goal than any other piece. All pieces work to serve Sam's overarching goal, but recognize that they approach it differently, and this understanding of sub-agency allows Sam to reach his goal more efficiently.
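As a toy illustration of "different goals", each piece type could score its own moves with its own reward. The features (`rank`, `attackers`) and the weight of 5.0 on the king are assumptions I made up for the sketch, not tuned values.

```python
# Hypothetical per-piece goals, each comparing the piece's situation
# before and after a move. Higher is better for that piece.
GOALS = {
    # Pawns want to advance toward promotion.
    "pawn": lambda before, after: after["rank"] - before["rank"],
    # Kings want fewer enemy pieces attacking them, and care about it a lot.
    "king": lambda before, after: 5.0 * (before["attackers"] - after["attackers"]),
}

def local_reward(piece_type, before, after):
    """Score a move from the point of view of the piece that made it."""
    return GOALS[piece_type](before, after)

print(local_reward("pawn", {"rank": 2}, {"rank": 4}))            # 2
print(local_reward("king", {"attackers": 2}, {"attackers": 0}))  # 10.0
```

Sam's overall result (win or lose) would still be propagated to everyone; these local rewards only shape what each piece argues for.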
Think about how your mind works. While you might have a goal in mind at any one time, there are still a multitude of other goals, all clamoring for attention in your head: eat food, put more toner in the printer, make babies, mow the lawn, start dieting, punch your boss in the face. At any given moment, we are shuffling priorities, sometimes taking stock of our current situation and attempting to determine which goal is most pertinent at the time.
In our AI efforts to date, though, we don't seem to recognize this. Our agency is not the result of a singular agent, but a multitude of sub-agents, which ultimately serve a higher power - the self - which decides which sub-agent to indulge at the moment. Selfhood is the agent of agency, through which sub-agents attain their agency by making the agent's goal their goal. We even create new agents or destroy old ones, as we come to epiphanies about ourselves and our goals in our lives - the goal of the self changes, which is reflected in the hierarchy of sub-agents.
So, yes, the end result is somewhat messier. But it allows for a significantly different approach than standard AI approaches to date. I think it also helps clarify some things in human cognition, such as internal conflicts.