Professor Ben Polak: So
last time we saw this, we saw an example of a mixed
strategy which was to play 1/3, 1/3, 1/3 in our rock,
paper, scissors game. Today, we’re going to be
formal, we’re going to define mixed strategies and we’re going
to talk about them, and it’s going to take a while.
So let’s start with a formal definition: a mixed strategy
(and I’ll develop notation as I’m going along,
so let me call it P_i,
i being the person who’s playing it) P_i is a
randomization over i’s pure strategies.
So in particular, we’re going to use the notation
P_i(s_i) to be the probability that Player i plays
s_i given that he’s mixing using P_i.
So P_i(s_i) is the probability that P_i
assigns to the pure strategy s_i. Let’s immediately refer that
back to our example. So for example,
if I’m playing 1/3,1/3,1/3 in rock, paper, scissors then
P_i is 1/3,1/3,1/3 and P_i of rock–so
P_i(R)–is a 1/3. So without belaboring it,
that’s all I’m doing here, is developing some notation.
Let’s immediately encounter two things you might have questions
about. So the first is,
that in principle P_i(s_i) could be zero.
Just because I’m playing a mixed strategy,
it doesn’t mean I have to involve all of my strategies.
I could be playing a mixed strategy on two of my strategies
and leave the other one with zero probability.
So, for example, again in rock,
paper, scissors, we could think of the strategy
1/2,1/2,0. In this strategy I assign–I
play rock half the time, I play paper half the time,
but I never play scissors. So everyone understand that?
And while we’re here let’s look at the other extreme.
The probability assigned by my mixed strategy to a particular
s_i could be one. It could be that I assign all
of the probability to a particular strategy.
What would we call a mixed strategy that assigns
probability 1 to one of the pure strategies?
What’s a good name for that? That’s a “pure strategy.”
So notice that we can think of pure strategies as the special
case of a mixed strategy that assigns all the weight to a
particular pure strategy. So, for example,
if P_i(R) was 1, that’s equivalent to saying
that I’m playing the pure strategy rock,
i.e. a pure strategy.
So there’s nothing here. I’m just being a little bit
nerdy about developing notation and making sure that everything
is in place, and just to point out again,
one consequence of this is we’ve now got our pure
strategies embedded in our mixed strategies.
When I’ve got a mixed strategy I really am including in those
all of the pure strategies. So let’s proceed.
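The notation so far can be mirrored in a few lines of Python. This is only an illustrative sketch: representing a mixed strategy as a dictionary, and the helper names `is_valid_mix`, `support`, and `is_pure`, are assumptions made for this example and not part of the lecture.

```python
from fractions import Fraction

# A mixed strategy P_i maps each pure strategy s_i to its probability P_i(s_i).
# Using a dict for this is an assumption of the sketch.
uniform = {"R": Fraction(1, 3), "P": Fraction(1, 3), "S": Fraction(1, 3)}
no_scissors = {"R": Fraction(1, 2), "P": Fraction(1, 2), "S": Fraction(0)}
pure_rock = {"R": Fraction(1), "P": Fraction(0), "S": Fraction(0)}


def is_valid_mix(mix):
    """Probabilities must be non-negative and sum to exactly 1."""
    return all(p >= 0 for p in mix.values()) and sum(mix.values()) == 1


def support(mix):
    """Pure strategies that the mix plays with positive probability."""
    return {s for s, p in mix.items() if p > 0}


def is_pure(mix):
    """A pure strategy is the special case that puts probability 1 on one choice."""
    return any(p == 1 for p in mix.values())
```

Here `no_scissors` is the (1/2, 1/2, 0) example, whose support leaves out scissors, and `pure_rock` is the degenerate mix that is just the pure strategy rock.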
I’m going to push that up a little high, sorry.
So now I want to think about what are the payoffs that I get
from mixed strategies, and again, I’m going to go a
little slowly because it’s a little tricky at first and we’ll
get used to this, don’t panic,
we’ll get used to this as we go on and as you see them in
homework assignments and in class.
So let’s talk about the payoffs from a mixed strategy.
In particular, what we’re going to worry about
are expected payoffs. So the expected payoff of the
mixed strategy P, let’s be consistent and call it
P_i, the mixed strategy
P_i is what? It’s the weighted average–it’s
a weighted average or a weighted mixture if you like–of the
expected payoffs of each of the pure strategies in the mix.
So this is a long way of saying something again which I think is
a little bit obvious, but let me just say it again.
The way in which we figure out the expected payoff of a mixed
strategy is, we take the appropriately weighted average
of the expected payoffs I would get from the pure strategies
over which I’m mixing. So to make that less abstract
let’s immediately look at an example.
So here’s an example we’ll come back to several times,
but just once today, and this a game you’ve seen
before. Here is the game Battle of the
Sexes, in which Player I can choose A
and B, and Player II can choose a and
b, and what I want to do is I want to figure out the payoff
from particular strategies. So suppose that P is being
played by Player I and P is let’s say (1/5,4/5).
So what do I mean by that? I mean that Player I is
assigning 1/5 to playing A and 4/5 to playing B.
And suppose that Q–so I am going to use P and Q because
it’s convenient to do so rather than calling them P_1
and P_2. So suppose that Q is the
mixture that Player II is choosing and she’s choosing a
(1/2, 1/2), so she’s putting a probability
1/2 on a and a probability 1/2 on b.
Just to notice I switched notation on you a little bit,
for this example to keep life easy,
I’m going to use P to be row’s mixtures and Q to be column’s
mixtures. And the question I want to
answer is what is the expected payoff in this case of P?
What is P’s expected payoff? The way I’m going to do that
is, I’m first of all going to ask what is the expected payoff
of each of the pure strategies that P involves,
the pure strategies involved in P.
So to start off–so the first step is ask what is the expected
payoff for Player I of playing A against Q and what is the
expected payoff for Player I of playing B against Q?
That will be our first question and we’ll come back and
construct the payoff for P. So these are things we can do I
think. So the expected payoff of A
against Q is what? Well, half the time if you play
A you’re going to find your opponent is playing a,
in which case you’ll get 2, and half the time when you play
A you’ll find your opponent is playing b in which case you’ll
get 0. So let’s just write that up.
So I’m going to get 2 with probability 1/2 plus 0 with
probability 1/2. Everyone happy with that?
That gives me 1. Please correct my math in this.
It’s very easy at the board to make mistakes,
but I think that one is right. Conversely, what if I played B?
What’s the expected payoff for the row player of playing B
against Q, where Q is 1/2,1/2? So half the time when I play B,
I’ll meet a Player II playing a and I’ll get 0 and half the time
I’ll find Player II is playing b and I’ll get 1.
So let’s write that up. So I’ll get 0 half the time and
I’ll get 1 half the time for an average of 1/2.
That’s the first thing I ask. And now to finish the job,
I now want to figure out what is the expected payoff for
Player I of using P against Q? That was the question I really
wanted to start off with. What’s the way to think about
this? Well P is 1/5 of the
time–according to P, 1/5 of the time Player I is
playing A and 4/5 of the time Player I is playing B,
is that right? So to work out the expected
payoff what we’re going to do is we’re going to take 1/5 of the
time, in which case he’s playing
A and he’ll get the expected payoff he would have got from
playing A against Q, and 4/5 of the time he’s going
to be playing B in which case he’ll get the expected payoff
from playing B against Q. Now just plugging in some
numbers to that from above, so we’ve got 1/5 of the time
he’s doing the expected payoff from A against Q and that’s this
number we worked out already. So this number here can come
down here, 1. And 4/5 of the time he’s
playing B against Q, in which case his expected
payoff was 1/2, so this 1/2 comes in here.
Everyone okay so far, how I constructed it so far?
Is this podium in the way of you guys, are you okay?
Let me push it slightly. So the total here is what?
It’s going to be 1/5 of 1 plus 4/5 of 1/2.
4/5 of 1/2 is 2/5, so I’ve got a total of 3/5.
So the total here is 3/5. Everyone understand how I did
that? Now while it’s here let’s
notice something. When I played P,
some of the time I played A and some of the time I played B.
And when I ended up playing A, I got A’s expected payoff.
And when I played B, I got B’s expected payoff.
So the number I ended up with 3/5 must lie between the payoff
I would have got from A which is 1, and the payoff I would have
got from B which is 1/2. Is that right?
So 3/5 lies between 1/2 and 1. Everyone okay with that?
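The arithmetic on the board can be checked with a short Python sketch. It uses only Player I's payoffs as they were read out above (2 for A against a, 1 for B against b, 0 otherwise); the function names `pure_payoff` and `mixed_payoff` are made up for this illustration.

```python
from fractions import Fraction

# Player I's payoffs as read out in the lecture: u1[own_choice][opponent_choice].
u1 = {"A": {"a": 2, "b": 0}, "B": {"a": 0, "b": 1}}

p = {"A": Fraction(1, 5), "B": Fraction(4, 5)}  # Player I's mix P
q = {"a": Fraction(1, 2), "b": Fraction(1, 2)}  # Player II's mix Q


def pure_payoff(s, q):
    """Expected payoff to Player I of pure strategy s against the mix q."""
    return sum(q[t] * u1[s][t] for t in q)


def mixed_payoff(p, q):
    """Weighted average of the pure-strategy expected payoffs, weighted by p."""
    return sum(p[s] * pure_payoff(s, q) for s in p)


eu_A = pure_payoff("A", q)   # 2*(1/2) + 0*(1/2) = 1
eu_B = pure_payoff("B", q)   # 0*(1/2) + 1*(1/2) = 1/2
eu_P = mixed_payoff(p, q)    # (1/5)*1 + (4/5)*(1/2) = 3/5

# The mixed payoff lies between the payoffs of the pure strategies in the mix.
assert min(eu_A, eu_B) <= eu_P <= max(eu_A, eu_B)
print(eu_A, eu_B, eu_P)
```

Exact fractions are used rather than floats so that the comparison with 1/2 and 1 is not blurred by rounding.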
Now that’s a simple but very general and very useful idea it
turns out. The idea here is that the
payoff I’m going to get must lie between the expected payoffs I
would have got from the pure strategies.
Let me say it again. In general, when I play a mixed
strategy the expected payoff I get, is a weighted average of
the expected payoffs of each of the pure strategies in the mix,
and weighted averages always lie inside the payoffs that are
involved in the mix. So let me try and push that
simple idea a little harder. Suppose I was going to take the
average height in the class–average height in this
class. So let me just,
rather than use the class, let me just use some T.A.’s
here. So let me get these three
T.A.’s to stand up a second. Suppose I want to figure out
the average height of these three T.A.’s.
So stand up close together so I can at least see what’s going on
here. So I think, from where I’m
standing, I’ve got that Ale is the tallest and Myrto is the
smallest, is that right? So I don’t know instantaneously
what this average would be, but I claim that any weighted
average of their three heights, is going to give me a number
that’s somewhere between the smallest height of the three,
which is Myrto’s height, and the tallest height of the
three, which is Ale’s height, is that right?
Is that correct? So that’s a pretty general idea.
Thanks guys I’ll come back to you in a second.
Let’s think about this somewhere else,
let’s think about the batting average of a team.
The team batting average in baseball, let’s use the Yankees,
for example. We know that the team batting
average, the average batting average of the Yankees–I don’t
know what it is, I didn’t look it up this
morning–but I know it lies somewhere between the player who
has the highest batting average which I’m guessing is Jeter,
I’m guessing, and the lowest,
the person on the team who has the lowest batting average,
who is probably one of the pitchers who played,
who batted a few times in one of those inter-league games.
(It would have been better if I’d used the Mets but I feel I
should take pity on Mets fans this week and not mention them.)
So this is a very simple idea, it’s deceptively simple.
It says averages, weighted averages,
lie between the highest thing over which you’re averaging and
the lowest thing over which you’re averaging.
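Written out as a formula, the idea about averages might look like the following. The symbols are assumptions introduced for this sketch: x_k stands for the things being averaged (heights, batting averages, expected payoffs) and w_k for the weights.

```latex
\[
\min_k x_k \;\le\; \sum_{k=1}^{n} w_k x_k \;\le\; \max_k x_k,
\qquad w_k \ge 0,\quad \sum_{k=1}^{n} w_k = 1 .
\]
```

In the mixed-strategy setting the weights are the probabilities P_i(s_i) and the x_k are the expected payoffs of the pure strategies.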
Everyone okay with that idea? Now this very simple idea is
going to have an enormous consequence, and here’s the
enormous consequence. Simple idea, big consequence.
So there’s going to be a lesson that follows from this
incredibly simple idea and this is the lesson.
If a mixed strategy is a best response, so if a mixed strategy
is the best thing you can be doing,
then each of the pure strategies in the mix–I’m being
a little bit loose here but I mean assigned positive
probability in the mix, for those people who are nerdy
enough to worry about it–each of the pure strategies in the
mix must themselves be best responses.
So, in particular, each must yield the same
expected payoff. So here’s a big conclusion that
follows from that incredibly simple idea about averages lying
between the highest one and the lowest one.
Let’s draw ourselves from that lesson to this big conclusion.
What is the conclusion? The conclusion is if a mixed
strategy is a best response, if the best thing I can do is
to play a mixed strategy, then each of the pure
strategies which I’m playing in that mix, which I’m assigning
positive probability to in that mix,
must themselves be best responses.
In particular, each of them therefore must
yield the same expected payoff. So let’s go back to our example.
Can I steal my three T.A.’s again?
Suppose the game, suppose the thing I’m involved
in–I should have made this easier before,
let me come down a little bit. I’ll stand above here,
this is good. So suppose the game I’m
involved in, the payoff in the game is, a game in which I have
to choose the tallest group of my T.A.’s.
So my payoff is going to be the average height of whichever
subgroup of my T.A.’s I pick and these are my three choices.
So if I pick more than one of them I’m going to get a weighted
average, that’s a mixed strategy.
My aim here is to maximize the height of whatever subgroup I
pick. So in this game,
here’s my three pure strategies: my three pure
strategies are to pick Myrto; Ale;
or Jake. Those are my three pure
strategies. And my mixture,
I could mix these two, I could mix these two,
I could mix all three. But remember my payoff here is
to get the group, the average as high as I can.
So how am I going to get the average as high as I can?
To get the average as high as I can, I’m going to kick out
Myrto for a start because Myrto’s just bringing down the
average, is that right? Average height I should say,
there’s nothing–and actually I think I’m going to kick out Jake
as well I think, I’m probably going to kick out
Jake as well because that way I just have Ale.
So if it was the case that I was picking both of them,
it would have to be they were equally tall but since they’re
not equally tall, I should just pick the best one.
Let’s go back to my Yankees example, if I want to pick a
sub-team of the Yankees, I’m allowed to pick any number
of people, to have the highest average, batting average,
in that sub-team. The way to do it is to find the
Yankee who has the highest batting average and just pick
him. Let’s do one more example.
Let me use the front row of students here,
so here’s my, can I get this front row of
students to stand up a second? This is a part of the row.
And suppose my aim in life is to construct the highest average
GPA. I’m not going to embarrass
these guys and ask them what their GPA’s are.
So my aim in life here is to pick some sub-group of these
one, two, three, four,
five, six, seven, eight students,
such that the average GPA of that sub-group is as high as I
can make it. So what will I do here?
So this being Yale I’ll just find the people who have the 4.0
GPA’s and just pick them. Is that right?
You might think well why not include somebody who has a 3.9
GPA? That’s pretty good.
So why not? Because if there’s anybody in
this group who has a 4.0 GPA, I’d do better just to pick that
person. The 3.9 person would just be
pulling down the average. Now suppose there’s nobody with
a 4.0 GPA and suppose it’s the case that three of these people,
let’s say these three people have a 3.9 GPA.
So these three have 3.9 GPA, imagine that,
and these other people they’ve got horrible grades like B+
somewhere. These are our future law school
students and these are the people–who knows what they’re
going to end up doing–being President probably.
So to construct the group with the highest average GPA,
what am I going to do? Well first I’ll throw out all
these guys with low GPA’s, so they can all sit down and
I’ll look at these last three and these last three,
if they’re all in the group they better all have the same
GPA. Why on earth?
If I’m trying to maximize the average of my group,
if any of them had a lower GPA I should kick them out,
and if one of them has a higher GPA than the other two,
I should kick out both the other two.
So if I’m including all three of them, in my constructing of
the average all of them must have the same GPA,
which I’m going to assume is 3.9, to assume you can still
make it into law school. Everyone understand that?
Yeah? Okay, thanks guys.
So that’s the way I want to think about this.
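The GPA argument can be sketched in Python. The GPAs below are invented purely for illustration (the lecture deliberately does not reveal any), and `best_subgroup` and `average` are hypothetical helpers: the first keeps exactly the members tied for the highest value, since including anyone lower would pull the average down.

```python
from fractions import Fraction

# Hypothetical GPAs, made up for this sketch.
gpas = {
    "s1": Fraction("3.9"),
    "s2": Fraction("3.3"),
    "s3": Fraction("3.9"),
    "s4": Fraction("3.3"),
    "s5": Fraction("3.9"),
}


def average(values):
    """Plain arithmetic mean of a non-empty collection."""
    values = list(values)
    return sum(values) / len(values)


def best_subgroup(scores):
    """Members tied for the top score: the largest group attaining the maximum average."""
    top = max(scores.values())
    return {name for name, score in scores.items() if score == top}


group = best_subgroup(gpas)
group_average = average(gpas[name] for name in group)

# Everyone kept in the group has the same GPA ...
assert len({gpas[name] for name in group}) == 1
# ... and adding any excluded student would lower the group's average.
assert all(
    average([gpas[m] for m in group] + [gpas[other]]) < group_average
    for other in gpas
    if other not in group
)
```

The same logic is what drives the lesson about best responses: anything kept in the mix has to be as good as everything else kept in the mix.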
So the idea here is if I’m using a mixed strategy as a best
response, it must be the case that everything on which I’m
mixing is itself best. And the reason is,
if it wasn’t, kick out the thing that isn’t
best and my average will go up. So that leads us to the next
idea, but before I do just for formality, let me add a
definition. The definition is this,
a mixed strategy profile–what I’m going to do now is I’m going
to define Nash Equilibrium again,
just so we have it in our notes somewhere.
So a mixed strategy profile–there should be a
hyphen there–(P_1*, P_2*,
…all the way up to P_N*),
is a mixed strategy Nash Equilibrium if for each Player
i–so for each Player i–that player’s mixed strategy
P_i* is a best response for Player i to the
strategies everyone else is picking, P_-i*.
So I’m exploiting, by now, a well developed
notation for player strategies. So this definition of Nash
Equilibrium, it’s exactly the same as the definition of Nash
Equilibrium we’ve been using now for several weeks,
except everywhere where before we saw a pure strategy,
which was an S, I have replaced it with a P.
So the same definition except I’m using mixed strategies
instead of pure strategies. But an implication of our
lesson is what? It’s that if P_i* is
part of a Nash Equilibrium–so if P_i* is a best response to
what everyone else is doing, P_-i* –,
then each of the pure strategies involved in
P_i* must itself be a best response.
So an implication of the lesson is, the lesson implies the
following. If P_i* of a
particular strategy is positive, so in other words,
I’m using this strategy in my mix,
then that strategy is also a best response to what everyone
else is doing. Okay, so from a math point of
view this is the big idea of the day, this board.
If you’re having trouble reading this at the back,
trust me I’ve written that up on the handout that will appear
magically on the computer, at the end of class.
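For reference, the lesson on this board could be written compactly as below. The shorthand BR_i for Player i's set of best responses and Eu_i for Player i's expected payoff are notational assumptions of this sketch, not symbols defined in the lecture so far.

```latex
\[
P_i^*(s_i) > 0 \;\Longrightarrow\; s_i \in BR_i(P_{-i}^*),
\]
\[
\text{so}\quad P_i^*(s_i) > 0 \ \text{and}\ P_i^*(s_i') > 0
\;\Longrightarrow\;
E u_i(s_i, P_{-i}^*) = E u_i(s_i', P_{-i}^*).
\]
```

The second line is the form that gets used in practice: any two pure strategies played with positive probability must earn the same expected payoff.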
At the moment you’re staring at this, it’s all a bit new,
and as well as being new, you’re saying,
okay but so what, why do I care about this
seemingly mundane fact? The reason we’re going to turn
out to care about this seemingly mundane fact,
is that this fact is going to make it remarkably easy to find
Nash Equilibria. This fact, this lesson,
this idea that if I’m playing a pure strategy as part of the
mix, it must itself be a best
response, that’s going to be the trick we’re going to use in
finding mixed strategy Nash Equilibria.
The only way I can illustrate that to you is to do it,
so I’m going to spend the rest of today just doing that.
I’m going to look at a game and we’re going to go through this
game. We’ll discuss it a little bit
because it’s a fun game, and we’re going to find the
mixed-strategy equilibria of this game.
Everyone know where we’re going? I want to make sure before I go
on, are people looking very sort of deer in the headlamps?
That was a lot of formality to get through in a short period of
time. Does anyone want to ask a
question at this point? Are you okay?
Okay to go on? So just remember that the
conclusion here comes from this very simple idea.
The simple idea is, the payoff to a weighted
average must lie between the best and worst thing involved in
the average, and therefore if I’m including
things in there as part of a best response,
they must all be good. That’s the simple idea,
this is the dramatic conclusion.
So the only way to prove this to you and the only way to prove
to you that this is useful is to go ahead and do it.
So what I’m going to do is I’m going to clean these boards and
I’m going to start showing an example.
Again don’t panic, I think a lot of people at this
part of the class have a tendency to panic,
because it’s a new idea, it seems like a lot of math
around. None of it’s very hard math,
it’s all kind of arithmetic. It’s just this idea of not
panicking. So the example I want to look
at is going to be from tennis, and I’m going to consider a
game within a game, played by two tennis players,
and let’s call them Venus and Serena Williams.
So a couple of years ago we used to use Venus and Serena
Williams for this example, and then for a while I worried,
that you wouldn’t even remember who Venus and Serena Williams
were, and so we picked any two random
Russians, but now we’re back. Seems like we’re back to
picking Venus and Serena. So the game within the game is
this, suppose that they’re playing and Serena is at the net
and the ball is on Venus’ court, and Venus has reached the ball
and Venus has to decide whether to try to hit a passing shot
past Serena on Serena’s left or on Serena’s right.
Notice I’m going to exclude the possibility of throwing up a lob
for now, just to make this manageable.
So basically the choice facing Venus is should she try to pass
Serena to Serena’s left, which is Serena’s backhand side
or to Serena’s right, which is Serena’s forehand
side. People are familiar enough with
tennis to understand what I’m talking about?
So we’re going to assume this is Wimbledon,
otherwise no one would be at the net to start with I guess.
So this is at Wimbledon. Let’s try and put up some
payoffs here. So these are going to be the
payoffs. I think that this example is
originally due to Dixit, but it’s not a big deal.
I think this example is due to Dixit and Skeath.
So here’s some numbers and I’ll explain the numbers in a minute.
So this is 50,50, 80,20, 90,10 and 20,80.
So what are these numbers? So first of all let me just
explain what the strategies are, so I’m assuming the row player
is Venus and the column player is Serena.
I’m assuming that if Venus chooses L that means she
attempts to pass Serena to Serena’s left,
we’ll orient things from Serena’s point of view,
and if she hits right that means she’s attempting to pass
Serena on Serena’s right. If Serena chooses L that means
she cheats slightly towards her left: not cheats in the sense of
breaking the rules, but cheats in terms of where
she’s standing or leaning. And if she chooses right that
means she cheats slightly towards her right.
So this is cheating towards her backhand and this is cheating
towards her forehand, assuming she’s right handed,
which she in fact is. Okay, what do these numbers
mean? So let’s start with the easy
ones. So if Venus chooses left and
Serena chooses right, then Serena has guessed wrong.
Is that correct? In which case Venus wins the
point 80% of the time and Serena wins it 20% of the time.
Conversely, if Venus chooses right and Serena chooses left,
then again, Serena has guessed wrong and this time Venus wins
the point 90% of the time and Serena wins the point 10% of
the time. This should be a familiar idea
by now, but why is it the case these nineties and eighties are
not a 100%? Why is it the case that if
Serena guesses wrong Venus doesn’t win 100% of the time?
Anybody? Perhaps we can get a show of
hands, get some mikes up. Why isn’t it 100% here?
Wait for the mike. Student: Sometimes she
hits it out of bounds when she serves.
Professor Ben Polak: Right, this isn’t even a serve,
this is a passing shot but the same is true.
So sometimes you’re successfully going to hit it
past Serena but the ball is going to sail out.
So that happens 10% of the time here and 20% of the time here.
Look at the other two boxes, if Venus hits to Serena’s left
and Serena guesses left, then we’re going to assume that
Serena’s going to reach the ball and make a volley,
but her volley only manages to go in–go over the net and go
in–half the time, so the payoffs are (50,50).
Half the time Venus wins the point and half the time Serena
wins the point. Conversely, if Venus hits the
ball to Serena’s right and Serena guesses correctly and
chooses right, then we’re in this box.
Once again, Serena has guessed correctly and she’s going to
successfully reach the volley and this time she gets it in 80%
of the time, so Venus wins the point 20% of
the time and Serena wins it 80% of the time.
So just to finish up the description of the game here,
notice that we’re assuming that Serena is a little better at
volleying to her right than she is volleying to her left.
So this is her forehand volley and we’re going to assume that
that’s stronger than her backhand volley.
Conversely, we’re assuming that Venus’ passing shot is a little
better when she shoots it to Serena’s right than when she
shoots it to Serena’s left. This is her cross court passing
shot and this is her down the line passing shot.
So none of that fine detail matters a great deal,
but just if you’re interested that’s where the numbers come
from. I’m not claiming this is true
data by the way, I made up these numbers.
Actually I think Dixit made up these numbers,
I forget where I got them from. So okay, everyone understand
the game? So now imagine,
either imagine you are Venus or Serena, or imagine perhaps more
realistically, that you’ve become Venus or
Serena’s coach. Do I have any members of the
tennis team here? No.
Well imagine you’ve become their coach, so you take this
class and then you apply to replace their father as being
their coach. That’s a tough assignment I
would think. So an obvious question is,
you’re coaching Venus before Wimbledon, you know this
situation’s going to arise and you might want to coach Venus on
what should she do here? Should she try and pass Serena
down the line or should she try and hit the
cross court passing shot? Notice that this is a question
of should you, Venus, play to your strength
which is the cross court passing shot,
or should you play to Serena’s weakness, which would be to hit
it to Serena’s backhand. Playing to your strength is to
choose right and playing to Serena’s weakness is to choose
left. Conversely, for Serena,
should you lean towards your strength, which I guess is
leaning to the right or should you lean towards Venus’
weakness, which I guess is leaning left?
When you look at coaching manuals on this stuff,
or you listen to the terrible guys who commentate on tennis
for ESPN–oh no I’m getting in trouble again–very nice guys
who commentate on tennis for ESPN,
they say just incredibly dumb things at this point.
They say things like, you should always play to your
strengths and don’t worry about the other person’s weakness.
I think it won’t take much time today to figure out that’s not
great advice. But can people at least see
that this is a difficult problem, this is not an
immediately obvious problem, is that correct?
One reason it’s not immediately obvious is not only is no
strategy dominated here, but there is no pure strategy
Nash Equilibrium in this game, in this little sub game.
There is no pure strategy Nash Equilibrium–and notice that I
added the qualifier now. Previously I would just have
said Nash Equilibrium, but now that we have mixed
strategies in the picture, I’m going to talk about pure
strategy Nash Equilibria to be those that involve
only pure strategies. Okay, so why is there no pure
strategy Nash Equilibrium? Well let’s have a look.
So if Venus–If Serena thought that Venus was going to choose
left then her best response, not surprisingly,
is to lean left and if Serena thought that Venus was going to
choose right, then her best response is to
cheat to the right, so 50 is bigger than 20,
and 80 is bigger than 10. And conversely,
if Venus thought that Serena was cheating a bit to the left
then her best response is to hit it to Serena’s right,
and if Venus thought Serena was leaning to the right then Venus’
best response is to hit it to Serena’s left.
So I think that’s not at all surprising when you think about
it, not at all surprising, you’re going to get this little
cycle like this, but we can see immediately that
these best responses never coincide,
so there is no pure strategy equilibrium.
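The best-response cycle just described can be checked mechanically. The sketch below encodes the payoff table from the lecture (percent chance of winning the point, Venus first, Serena second) and searches every cell for a pair of mutual best responses; the variable and function names are assumptions made for this illustration.

```python
# Payoffs (Venus, Serena) as percent chance of winning the point.
# Keys are (Venus's shot, Serena's lean); directions are Serena's left and right.
payoffs = {
    ("L", "l"): (50, 50),
    ("L", "r"): (80, 20),
    ("R", "l"): (90, 10),
    ("R", "r"): (20, 80),
}
venus_moves = ("L", "R")
serena_moves = ("l", "r")


def venus_best_response(serena_move):
    """Venus's payoff-maximizing shot against a given lean by Serena."""
    return max(venus_moves, key=lambda v: payoffs[(v, serena_move)][0])


def serena_best_response(venus_move):
    """Serena's payoff-maximizing lean against a given shot by Venus."""
    return max(serena_moves, key=lambda s: payoffs[(venus_move, s)][1])


# A pure-strategy Nash Equilibrium is a cell where both choices are mutual best responses.
pure_equilibria = [
    (v, s)
    for v in venus_moves
    for s in serena_moves
    if venus_best_response(s) == v and serena_best_response(v) == s
]
print(pure_equilibria)
```

With these numbers the list comes back empty, which is the cycle in code form: Serena wants to match Venus's direction and Venus wants to mismatch Serena's.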
So that leaves us a bit stuck except I guess you know what the
next question’s going to be, and I shouldn’t leave it in too
much suspense. The next question’s going to
be, okay there’s no pure strategy Nash Equilibrium,
but we’ve just introduced a new idea which was what?
It was Nash Equilibrium in mixed strategies.
Maybe there’s going to be a mixed strategy Nash Equilibrium.
In fact, there is, there is going to be one.
So our exercise now is, let’s find a mixed strategy
Nash Equilibrium, and before we find it,
let’s just interpret what it’s going to mean.
A mixed strategy Nash Equilibrium in this game,
is going to be a mix for Venus between hitting the ball to
Serena’s left and Serena’s right,
and a mix for Serena between leaning left and leaning right,
such that each person’s mix, each person’s randomization is
a best response to the other person’s randomization.
Since these players are sisters and have played each other many,
many times, not just in competition but
probably in practice, it seems like a reasonable idea
that they might have arrived in playing each other,
at a mixed strategy Nash Equilibrium.
That’s what we’re going to try and do, now how are we going to
do that? So what we’re going to do is
we’re going to exploit the trick that we have here,
the lesson here. The lesson we have here says if
players are playing a mixed strategy as part of a Nash
Equilibrium, each of the pure strategies
involved in the mix, each of their pure strategies
must itself be a best response. We’re going to use that idea.
So let’s try and do that. So I’m hoping that by doing
this, I’m going to illustrate to you immediately,
that this idea is actually useful, at least useful if you
end up coaching the Williams sisters.
Alright, I want to keep this so you can still read it.
I’ll bring it down a bit. Can people still read it?
Okay, so what I want to do is, I want to find a mixture for
Serena and a mixture for Venus that are in equilibrium.
Having put it up there let me bring it down again.
This was not so intelligent of me.
I actually want to bring in some notation,
so as before, let’s assume that Serena’s mix
is, let’s use Q and (1-Q) to be
Serena’s mix and let’s use P and (1-P) to be Venus’ mix.
Let’s establish that notation. So here’s the trick.
So this is the slightly magic bit of the class,
so pay attention, I’m about to pull a rabbit out
of a hat. Trick, what should I do first,
to find Serena’s Nash Equilibrium mix,
so that’s (Q, (1-Q)), what I’m going to do is
I’m going to look at Venus’ payoffs.
So to find Serena’s Nash Equilibrium mix the trick is to
look at Venus’ payoffs, that’s going to be my magic
trick. Let’s try and see why.
So let’s look at Venus’ payoffs, Venus’ payoffs against
Q. So if Serena is choosing (Q,
1-Q), what are Venus’ payoffs? So if she chooses left then her
payoff is 50 with probability Q–and I’m going to use the
pointer here, and hope that the camera can
see this too. She gets 50 with probability Q
and she gets 80 with probability 1-Q.
If she chooses right then she gets 90 with probability Q and
she gets 20 with probability of 1-Q.
I meant to point to that. So what?
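Written as formulas, the two expected payoffs just pointed to on the board are as follows. The notation Eu_V for Venus's expected payoff is an assumption of this sketch; the numbers come straight from the payoff table.

```latex
\[
E u_V(L, Q) = 50\,Q + 80\,(1-Q), \qquad
E u_V(R, Q) = 90\,Q + 20\,(1-Q).
\]
```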
So what is this: we’re looking for a mixed
strategy Nash Equilibrium, so in particular,
not only Serena is mixing but in this case what we’re claiming
is, Venus is mixing as well. So if Venus is mixing as well,
that means that Venus is using the strategy left with some
probability P and using the strategy right with some
probability 1-P. Since Venus sometimes chooses
left and sometimes chooses right as her best response to Q,
her best response to Serena, what must be true of the payoff
to left and the payoff to right? Let’s go through it again,
so we’re going to assume that Venus is mixing.
So sometimes she chooses left and sometimes she chooses right
and she’s going to be, she’s in a Nash Equilibrium,
so she’s choosing a best response.
So whatever that mix P, 1-P is, it’s a best response.
Since she’s playing a best response of P and that sometimes
involves choosing left and sometimes involves choosing
right, it must be the case that what?
It must be the case that left itself and right itself are
both themselves best responses. If she’s mixing between them,
it must be that both choosing left or choosing right are
themselves best responses. If they weren’t she should just
drop them out of the mix, that would raise her average
payoff. Right, just like we dropped out
the short T.A.’s to get a high height and we dropped out the
failing Yale students to get a high GPA.
So if Venus is mixing in this Nash Equilibrium then the payoff
to left and to right must be equal,
they must both be best responses, both left and right
must be a best response, so in particular,
the expected payoffs must be the same.
Is that right, is that correct? So what does that allow me to
do? It allows me to put an equals
sign in here. Since left is a best response
and right is a best response, since they’re both best
responses, they must yield the same expected payoff.
Here’s their expected payoffs, they must be the same.
Now, I’ve got one equation and one unknown, and now I’m down to
algebra. So let me do the algebra.
I claim this expression is equal to that expression,
so simplifying a bit I’m going to get–you should just watch to
make sure I don’t get this wrong–I’m going to get 40Q,
so this implies 40Q is equal to 60(1-Q).
So I took this 50 onto this side and this 20 onto that side,
so I have 40Q is equal to 60(1- Q) and that implies that Q is
equal to .6. So those last two steps were
just algebra. So what was the trick here?
The trick was I found Q, which is how Serena is mixing
by looking at Venus’ payoffs, knowing that Venus is mixing
and hence I can set Venus’ payoffs equal to one another.
Say that again, I found the way in which Serena
is mixing by knowing that if Venus is mixing,
her expected payoffs must be equal and I solved out for
Serena’s mix, this is Serena’s mix.
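
The algebra just described can be written out as a worked equation. This is only a restatement of the board work, with Q the probability that Serena leans left and Venus’ payoffs (50 and 80 for left, 90 and 20 for right) taken from the numbers read out above. The `align*` environment assumes the amsmath package.

```latex
\begin{align*}
\underbrace{50Q + 80(1-Q)}_{\text{Venus plays left}}
  &= \underbrace{90Q + 20(1-Q)}_{\text{Venus plays right}} \\
60(1-Q) &= 40Q \\
60 &= 100Q \\
Q &= 0.6
\end{align*}
```
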
Let’s do it again. Here I’m wishing I had another
board. I don’t want to lose those
numbers entirely, so I’m going to try and squeeze
in a bit. I know what I can do.
Let’s get rid of this one entirely.
There we go, that works. Let’s get rid of this one
entirely. I can still see my numbers.
Let’s do the converse. Let’s do the trick again,
this time what I’m going to do is I’m going to figure out how
Venus is mixing. I know how Serena is mixing
now, so now I’m going to work out how Venus is mixing.
Now, to figure out how Serena was mixing, I used Venus’
payoffs. So to find out how Venus is
mixing what am I going to do? I’m going to use Serena’s
payoffs. So to find Venus’ mix,
which is P, 1-P (let’s be careful, it’s her
Nash Equilibrium mix), use Serena’s payoffs.
Here we go, so these are Serena’s payoffs:
if Serena chooses l then her payoffs will be what?
So again, just watch to make sure I don’t get this wrong and
I’ll point to the things to try and help myself a bit.
So with probability P she’ll get 50.
So 50 with probability P, and with probability 1-P she’ll
get 10. And if she chooses to lean to
the right, to lean towards her forehand, then with probability
P she’ll get 20 and with probability 1-P she’ll get 80.
We know that Serena is mixing, so since Serena is mixing what
must be true of these two payoffs?
What must be true of the two payoffs?
The payoff to l and the payoff to r, what must be true about
them since Serena is using a mixture of these two strategies
in Nash Equilibrium? It must be the case that both l
is a best response and r is a best response,
in which case the payoff must be, someone shout it out,
equal, thank you. They must be equal,
these must be equal. They must be equal since Serena
is indifferent between choosing left or right and hence is
mixing over them. So again, using the fact that
they’re equal reduces this to algebra, and again,
I’ll probably get this wrong but let me try.
So I claim, let’s take 20 away from here, I’ve got 30P equals
70(1-P). I hope that’s right,
that looks right. Again, this is just algebra at
this point. So I took 20 away from here and
10 away from there, and this implies that P equals
.7. So I claim I have now found the
mixed strategy Nash Equilibrium. Here it is.
The Nash Equilibrium is as follows.
Let’s be careful, this is Venus’ mix.
So Venus is mixing .7, .3, that is .7 on left and .3 on right,
and Serena is mixing .6, .4, so this is Venus’ mix and
this is Serena’s mix. Venus is shooting to the left
of Serena with probability of .7 and Serena is leaning that way
with probability of .6. So we were able to find this
Nash Equilibrium by using the trick before.
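
The trick used twice above (find one player’s mix from the other player’s indifference) can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `mixed_equilibrium_2x2` and the matrix layout are assumptions made here, and the sketch assumes the game really has an equilibrium in which both players mix, so the denominators are nonzero.

```python
from fractions import Fraction

def mixed_equilibrium_2x2(row_payoffs, col_payoffs):
    """Solve a 2x2 game for its fully mixed equilibrium via indifference.

    row_payoffs[i][j] and col_payoffs[i][j] are the payoffs when the row
    player picks option i and the column player picks option j.
    Returns (p, q): the probability the row player puts on option 0 and
    the probability the column player puts on option 0.
    (Hypothetical helper; assumes an interior mixed equilibrium exists.)
    """
    (a, b), (c, d) = row_payoffs
    (e, f), (g, h) = col_payoffs
    # q makes the ROW player indifferent: a*q + b*(1-q) = c*q + d*(1-q)
    q = Fraction(d - b, (a - b) - (c - d))
    # p makes the COLUMN player indifferent: e*p + g*(1-p) = f*p + h*(1-p)
    p = Fraction(h - g, (e - g) - (f - h))
    return p, q

# Rows: Venus shoots L or R. Columns: Serena leans l or r.
venus = [[50, 80], [90, 20]]
serena = [[50, 20], [10, 80]]

p, q = mixed_equilibrium_2x2(venus, serena)
print(p, q)  # 7/10 3/5
```

Note that Venus’ probability p is computed only from Serena’s payoffs, and Serena’s q only from Venus’ payoffs, which is exactly the point of the trick.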
Now let’s just reinforce this a little bit by talking about it.
So suppose it were the case that Serena, instead of leaning
to the left .6 of the time leant to the left more than .6 of the
time. So suppose you’re Venus’ coach,
and suppose you know that Serena leans to the left more
than .6 of the time, what would you advise Venus to
do? Let me try it again.
So suppose you’re Venus’ coach and suppose you’ve observed the
fact that Serena leans to the left more than .6 of the time,
what would you advise Venus to do?
Pass to the right, shout out.
Student: Pass to the right.
Professor Ben Polak: Pass to the right, exactly.
So if Serena cheats to the left more than .6 of the time,
then Venus’ best response is always to shoot to the right.
That maximizes her chance of winning the point.
Conversely, if Serena leans to the left less than .6 of the
time, then Venus should do what? Shoot to the left all the time.
So if Serena doesn’t choose exactly this mix,
then Venus’ best response is actually a pure strategy.
Say it again, if Serena leans to the left too
often, more than .6, then Venus should just go right
and if Serena leans to the left too little,
then Venus should always go left.
We can do exactly the same the other way around.
If Venus shoots to the left, so that’s her cross hand
passing shot, more than .7 of the time,
and you’re Serena’s coach, what should you tell Serena to
do? Go that way all the time.
So if Venus is hitting it to Serena’s left more than .7 of
the time, Serena should just always go to her left,
and if Venus is hitting to the left less than .7 of the time,
so to the right more than .3 of the time,
then Serena should always go to the right.
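
The coaching advice above amounts to a best-response rule, which can be sketched as follows. The function names are illustrative assumptions, the payoffs are the original ones from the board, and exact fractions are used so that the knife-edge cases (.6 and .7) come out as exact ties.

```python
from fractions import Fraction

def venus_best_response(q):
    """Venus' best reply when Serena leans left with probability q."""
    left = 50 * q + 80 * (1 - q)
    right = 90 * q + 20 * (1 - q)
    if left > right:
        return "L"
    if right > left:
        return "R"
    return "indifferent"  # only here is Venus willing to mix

def serena_best_response(p):
    """Serena's best reply when Venus shoots left with probability p."""
    left = 50 * p + 10 * (1 - p)
    right = 20 * p + 80 * (1 - p)
    if left > right:
        return "l"
    if right > left:
        return "r"
    return "indifferent"  # only here is Serena willing to mix

print(venus_best_response(Fraction(7, 10)))   # R: Serena leans left too often
print(venus_best_response(Fraction(1, 2)))    # L: Serena leans left too little
print(serena_best_response(Fraction(8, 10)))  # l: Venus shoots left too often
```

Away from the equilibrium mix, the best response is always a pure strategy; the tie occurs only at exactly q = 3/5 and p = 7/10.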
So that’s how this kind of comes back into the sort of the
coaching manuals if you like. Okay, so how am I doing so far?
Have I lost everyone yet or are people still with me?
How many of you play tennis, ever?
So all your tennis is going to dramatically improve after
today, right? So now let’s make life more
interesting. Let’s go back to the start.
We’ve figured out this is an equilibrium, this is how Venus
and Serena play, Venus and Serena know each
other perfectly well, they know that they mix this
way, they’re going to best respond
to it, this is going to be where they end up.
But in the meantime, Serena hires a new coach and
Serena’s new coach is just very, very good at teaching Serena
how to play at the net, and in particular,
how to hit the backhand volley. So Serena’s new coach,
let’s say it’s Tony Roche or somebody, it’s just a brilliant
coach and Tony Roche is able to improve Serena’s backhand volley
and that changes these payoffs. So you should rewrite the whole
matrix but I’m going to cheat. So the new game is exactly the
same as it was everywhere else, except for now when Serena gets
to the backhand volley, she gets it in 70% of the time.
So there used to be 50,50 in that box and now it’s 30,70.
So the game has changed because Serena has got better at hitting
backhand volleys. We want to figure out how is
this going to affect play at Wimbledon?
Now it doesn’t take much to check that there is still no
pure strategy Nash Equilibrium. It’s still the case,
in fact even more so, that Serena’s best response to
Venus choosing left is to lean to the left.
So it’s still the case that the best responses do not coincide,
there is still no pure strategy equilibrium.
What we’re going to do of course is we’re going to find a
mixed strategy equilibrium, but before we do so,
let’s think about this intuitively.
Let’s see if we can intuit an answer.
I’m guessing we can’t, but let’s see if we can intuit
an answer. So Serena has improved her
backhand volley, and hence when she reaches it
she gets it in more often. So one effect,
you might think, is what we might want to call a
direct effect and I think there’s two effects here.
There are two effects, one of these I’m going to call
the direct effect, and by effect,
I mean in particular an effect on how Serena should play the
game. So since Serena has improved
her backhand volley, when she reaches that volley
she gets it in more often, so one might say in that
case (you’re Serena’s coach), in that case you should lean to the
left more often than you did before,
because at least when you get that backhand volley you’re
going to get it in more often. So the direct effect says
Serena should lean left more, in other words,
Q should go up. Is that right?
So Serena’s now better at playing this backhand volley,
so she may as well favor it a bit more and hence Q will go up.
So that’s the direct effect, but of course there’s a “but”
coming. What’s the but?
Again, let’s see my tennis players here,
raise your hands if you play tennis.
Suddenly nobody plays tennis, come on raise your hands okay.
What’s the but here? We think Serena’s backhand has
improved so she might be tempted to play towards her backhand a
bit more often, what’s the but?
So I claim the but is this–you tell me if I’m wrong–the but is
that Venus (she’s her sister after all,
right, so Venus knows that Serena’s backhand has improved)
so Venus is going to hit it to Serena’s left less often than
before. Is that right?
So since Serena’s backhand has improved, Venus is going to hit
it to Serena’s backhand less often than before,
and that might make Serena less inclined to cheat towards her
backhand because the ball is coming that way less often.
So this is an indirect or a strategic effect.
The strategic effect is Venus hits L less often,
so Serena should reduce the number of times that she leans
to the left because the ball is coming that way fewer times.
Now notice that these two effects go in opposite
directions, is that right? One of them tends to argue that
Q would go up, that’s the direct effect and
the other one is more subtle, it says we now think about not
just how my play has improved, but also how the other person’s
going to respond to knowing that my play has improved,
that’s the more subtle effect and that’s going to push Q down.
That’s going to make it less likely, that’s an argument
against leaning to the left. So imagine you’re going to be
Serena’s coach, which of these effects do you
think is going to win, let’s have a poll.
Which of these effects do you think is going to win?
The direct effect or the indirect effect?
The direct effect or the strategic effect?
Who thinks the direct effect? Who thinks Serena,
who’d advise Serena to play to her strength a bit more and lean
left a bit more, who thinks the direct effect?
Raise your hands, let’s have a poll.
Who thinks the indirect effect, the effect of Venus hitting it
that way less often is going to win?
Who’s abstaining and basically refusing to be a coach?
Quite a number of you, all right.
Well we’re going to find out by re-solving for the Nash
Equilibrium. What we’re going to do is redo
the calculation we did before starting with Serena.
So to find Serena’s mix, to find Serena’s new
equilibrium mix, what do we have to do?
The question is, in equilibrium,
is Serena going to lean to the
left more (so Q is going to go up)
or less (so Q’s going to go down).
So I need to find out what is Serena’s new equilibrium mix.
What’s the new Q? How do I go about finding
Serena’s equilibrium Q, what’s the trick here?
Shout it out. Use Venus’ payoffs.
So to find the new Q for Serena, use Venus’ payoffs.
Now let’s do that. So from Venus’ point of view,
if she chooses left then her payoffs are now,
and again I should use the pointer,
30 with probability Q, this is the new Q and 80 with
probability 1-Q, 30 with probability Q plus 80
with probability 1-Q. Again, this is the new Q,
I should really give it, put Q prime or something but I
won’t. If she chooses right then her
payoff is what? It’s going to be 90 with
probability Q and 20 with probability 1-Q.
What do we know about these two payoffs if Venus is mixing in
equilibrium? We know she’s mixing in
equilibrium because we saw there was no pure strategy
equilibrium, so what we do know about these
two payoffs since Venus is using both these strategies in
equilibrium? They must be the same.
Since she’s using both these strategies, these strategies
must be equally good. They must both be best
responses so these two payoffs are equal.
Since they’re equal all I have to do is solve out for Q,
so let’s do it. So I’m going to get 90 minus 30
is 60Q, is equal to 80 minus 20 which is 60(1-Q),
so Q equals .5. If I did the algebra too
quickly just trust me, I think I got it right.
From here on in, it was just algebra.
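
For anyone who wants the skipped steps, the same indifference condition for Venus can be written out with the new payoffs. The only change from the earlier calculation is that Venus’ payoff in the backhand box is now 30 instead of 50. As before, Q is the (new) probability that Serena leans left, and the `align*` environment assumes the amsmath package.

```latex
\begin{align*}
\underbrace{30Q + 80(1-Q)}_{\text{Venus plays left}}
  &= \underbrace{90Q + 20(1-Q)}_{\text{Venus plays right}} \\
60(1-Q) &= 60Q \\
60 &= 120Q \\
Q &= 0.5
\end{align*}
```
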
So what have I found out? Did Q go up or go down?
Well it used to be, Q used to be what?
.6 and now it’s .5, so let me ask what I think is
an easy question, did it go up or down?
It went down. Q went down,
the equilibrium Q went down. So which effect turned out to
be bigger? The direct effect of playing
more to your strength or the indirect effect of taking into
account that your opponent is going to play less often to your
strength. Which effect turned out to be
the bigger effect? The indirect effect,
the strategic effect. Of course I really did want the
strategic effect to be bigger because this is a course about
strategy, but the strategic effect actually won here.
The strategic effect, the indirect effect is bigger.
That’s good news for me because it says the slightly dumb coach
who didn’t bother to take Game Theory would have stopped at
this direct effect and they’d have told Serena to go the wrong
way, but the smart coach who takes
my class, and therefore somehow contributes to my salary,
in an extraordinarily indirect way, gets it right.
Now we can also solve out for Venus’ new mix and we’ll do it
in a second. But before I do it,
let me just point out that we actually, we really can now
intuit Venus’ effect. It may not be exact numbers but
we can intuit here. I claim that if we think
this through carefully, we know whether Venus is
shooting more to the left, than she was before,
or less to the left, than she was before.
Notice that in the new equilibrium Serena is going less
often to her left even though she’s better at hitting the
backhand, she’s better at hitting the
ball when she gets there. So since Serena is leaning left
less often what must be true about Venus in this new
equilibrium? It must be the case that Venus
is hitting the ball to the left less often.
Does that make sense? We have enough information
already on the board to tell us that, nevertheless,
let’s do the math. Let’s go and retrieve a board
to do the math. Just to complete this,
let’s figure out exactly what Venus does do.
So to figure out what Venus is going to do, what’s our trick?
I want to figure out how Venus is going to mix.
I’m going to find out Venus’ new P, how do I find out Venus’
new equilibrium mix? I look at Serena’s payoffs.
So if Serena chooses left, her payoff is,
and I’ll read it off quickly this time,
is 70P plus 10(1-P) and if Serena chooses right her payoff
is 20P plus 80(1-P) and I’m praying that the T.A.’s are
going to catch me if I make a mistake here,
and I know these have to be equal because I know that in
fact Venus is mixing–sorry, I know that Serena is mixing,
so I know these must be equal. So since they’re equal I can
solve out and hope that I’ve got this right, so I’ve got 50P
equals 70(1-P), so P is equal to 7/12.
So again, that’s just algebra, I rushed it a bit,
it’s just algebra. Same idea, just algebra.
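
Written out step by step, with P the (new) probability that Venus shoots to the left and Serena’s payoffs as just read off, the rushed algebra is the following. The `align*` environment assumes the amsmath package.

```latex
\begin{align*}
\underbrace{70P + 10(1-P)}_{\text{Serena leans left}}
  &= \underbrace{20P + 80(1-P)}_{\text{Serena leans right}} \\
50P &= 70(1-P) \\
120P &= 70 \\
P &= \tfrac{7}{12} \approx 0.583
\end{align*}
```
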
So 7/12 is indeed smaller than what it used to be,
because it used to be 7/10, so that confirms our result.
So the strategic effect dominated.
Venus shot to Serena’s backhand less often, and as a
consequence, so much so, that Serena actually found it
worthwhile going more to the right than she used to before.
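
The before-and-after comparison can be reproduced with a small sketch. The helper `indifference_prob` is a hypothetical name introduced here, not something from the lecture; it solves the single indifference equation used throughout, and the payoffs are the ones from the old and new boxes.

```python
from fractions import Fraction

def indifference_prob(a0, a1, b0, b1):
    """Return t solving a0*t + a1*(1-t) = b0*t + b1*(1-t).

    (a0, a1) are one player's payoffs from their first option against the
    opponent's first and second options; (b0, b1) likewise for their second
    option. t is the OPPONENT's probability on their first option.
    (Hypothetical helper; assumes the denominator is nonzero.)
    """
    return Fraction(b1 - a1, (a0 - a1) - (b0 - b1))

# Serena's mix Q comes from Venus' payoffs (left: vs l, vs r; right: vs l, vs r).
q_old = indifference_prob(50, 80, 90, 20)
q_new = indifference_prob(30, 80, 90, 20)  # backhand box changed from 50 to 30

# Venus' mix P comes from Serena's payoffs (l: vs L, vs R; r: vs L, vs R).
p_old = indifference_prob(50, 10, 20, 80)
p_new = indifference_prob(70, 10, 20, 80)  # backhand box changed from 50 to 70

print("Q:", q_old, "->", q_new)  # Q: 3/5 -> 1/2
print("P:", p_old, "->", p_new)  # P: 7/10 -> 7/12
```

Both probabilities on the left fall: Venus shoots left less often, and Serena leans left less often even though her backhand volley got better.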
Now let’s just talk this through one more time.
This was a comparative statics exercise.
We looked at a game, we found an equilibrium,
we changed something fundamental about the game,
and we looked again to look at the new equilibrium,
that’s called comparative statics.
Let’s talk through the intuition.
Before we made any changes Venus was indifferent.
She was indifferent between shooting to the left and
shooting to the right. Then we improved Serena’s
ability to hit the volley to her left, we improved her backhand
volley. If we had not changed the way
Serena played then what would Venus have done?
So suppose in fact Serena’s Q had not changed.
If Serena’s Q had not changed, remembering that Venus was
indifferent before, how would Venus have changed
her play? Somebody?
If we started from the old Q and then we improved Serena’s
ability to play the backhand volley, and if Q didn’t change,
what would Venus have done? She’d never,
ever have shot to the left anymore, she’d only have shot to
the right which can’t possibly be an equilibrium.
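
This disequilibrium argument, and its mirror image for Serena facing Venus’ old mix, can be checked numerically. The sketch below is an illustration under the lecture’s numbers; the helper name `expected` is an assumption introduced here.

```python
from fractions import Fraction

def expected(payoff_vs_first, payoff_vs_second, prob_first):
    """Expected payoff when the opponent plays their first option with prob_first."""
    return payoff_vs_first * prob_first + payoff_vs_second * (1 - prob_first)

old_q = Fraction(3, 5)   # Serena's OLD probability of leaning left
old_p = Fraction(7, 10)  # Venus' OLD probability of shooting left

# New game (backhand box is now 30,70), Venus facing the old q:
venus_left = expected(30, 80, old_q)    # 50
venus_right = expected(90, 20, old_q)   # 62, so Venus would only shoot right

# New game, Serena facing the old p:
serena_left = expected(70, 10, old_p)   # 52, so Serena would only lean left
serena_right = expected(20, 80, old_p)  # 38

print(venus_left, venus_right, serena_left, serena_right)
```

Neither player is indifferent at the old mixes, so neither old mix can be part of an equilibrium of the new game.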
So something about Serena’s play has to bring Venus back
into equilibrium, it brings Venus back into being
indifferent, and what was it? It was Serena moving to the
left less often and moving to the right more often.
To say it again, if we didn’t change Q,
Venus would only go to the right, so we need to reduce Q,
have Serena go to the right, to bring Venus back into
equilibrium. Conversely, if Venus hadn’t
changed her behavior, if Venus had gone on shooting
exactly the same as she was, P and 1-P as before,
then Serena would have only gone to the left and that can’t
be an equilibrium. So it must be something about
Venus’ play that brings Serena back into equilibrium,
and what is it? It’s that Venus starts shooting
to the right more often. So just two reminders,
before you leave two reminders. Wait, wait, wait.
First, in about five minutes time a handout will magically
appear on the website that goes through these arguments again,
all of them in two other games, so you can have a look at the
handout. Second thing,
a problem set has already appeared by magic on that
website that gives you lots of examples like this to work on.
Play tennis over the weekend for practice and we’ll see you