A prisoner's Dilemma competition.

Posted by Tycho for group 0
Tycho
GM, 1418 posts
Mon 26 May 2008
at 10:34
  • msg #1

A prisoner's Dilemma competition

Given all the discussion of (and confusion over!) the prisoner's dilemma problem in the politics thread, I thought it'd be fun and interesting to run a simulation of various strategies, and see how they all fare against each other.

For those not familiar, the prisoner's dilemma involves two players, each of whom must choose one of two possible actions.  We'll call the two actions just 'A' and 'B,' though you can think of them as 'defect' and 'cooperate,' or 'rat' and 'don't rat,' or any number of other descriptors.  Depending upon which actions the players choose, each will receive a certain pay-off of 'points.'  If both players pick A, they each get 1 point.  If they both pick B, they each get 3 points.  If one picks A and the other B, the player who picked A gets 5 points, and the player who picked B gets 0:


(rows: your pick; columns: your opponent's pick; entries: your points/opponent's points)

     A      B
A   1/1    5/0
B   0/5    3/3



The goal is to score as many points as possible over a series of games with a number of different opponents.
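
For anyone who finds code easier to read than a table, the pay-offs above can be written as a small lookup. This is only an illustrative sketch in Python; the names PAYOFFS and score_round are made up for the example and are not part of the competition rules.

```python
# Pay-off table from the post, keyed by (my_pick, opponent_pick).
# Each value is (my_points, opponent_points).
# PAYOFFS and score_round are hypothetical names used only for illustration.
PAYOFFS = {
    ("A", "A"): (1, 1),
    ("A", "B"): (5, 0),
    ("B", "A"): (0, 5),
    ("B", "B"): (3, 3),
}


def score_round(my_pick, opponent_pick):
    """Return (my_points, opponent_points) for a single game."""
    return PAYOFFS[(my_pick, opponent_pick)]


if __name__ == "__main__":
    # One player picks A while the other picks B.
    print(score_round("A", "B"))
```

Read this way, picking A always scores more than picking B in any single game, but two players who both pick B do better than two who both pick A.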

What I would like to do is to have people submit strategies to me, using PMs (so that other players won't know their strategies), and I'll run all the strategies against each other in a series of games to see how each one does.

For each* strategy submitted, I'll set it to play between 100 and 200 games (the number randomly determined) against each of the other strategies submitted.  I'll keep track of the total scores, and total number of games played, so we can find the average scores per game (thus, strategies that end up playing more games won't have an advantage).

*In order to make things more interesting, and to get a bigger data set, I will let people submit more than one entry.  However, your strategies will not play against other strategies submitted by you, as people could design sacrificial strategies which allow their other strategies to do better while doing poorly themselves.
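
The scoring described above could be sketched roughly as follows. This is not the actual simulation code, just one possible reading of the rules: it assumes each pair of strategies meets once, that a strategy is a function taking its own past picks, the opponent's past picks, and a random number generator, and that entries are (submitter, name, strategy) tuples. All of the names here (play_match, run_tournament, always_a, always_b) are invented for the example.

```python
import random
from itertools import combinations

# (my_pick, opponent_pick) -> (my_points, opponent_points), as in the post.
PAYOFFS = {
    ("A", "A"): (1, 1),
    ("A", "B"): (5, 0),
    ("B", "A"): (0, 5),
    ("B", "B"): (3, 3),
}


def always_a(my_picks, opponent_picks, rng):
    return "A"


def always_b(my_picks, opponent_picks, rng):
    return "B"


def play_match(strat1, strat2, rng):
    """Play a randomly chosen 100 to 200 games between two strategies."""
    rounds = rng.randint(100, 200)  # inclusive on both ends
    picks1, picks2 = [], []
    score1 = score2 = 0
    for _ in range(rounds):
        p1 = strat1(picks1, picks2, rng)
        p2 = strat2(picks2, picks1, rng)
        pts1, pts2 = PAYOFFS[(p1, p2)]
        score1 += pts1
        score2 += pts2
        picks1.append(p1)
        picks2.append(p2)
    return score1, score2, rounds


def run_tournament(entries, seed=None):
    """Return the average points per game for every entry that played.

    entries is a list of (submitter, name, strategy) tuples. Entries from
    the same submitter never play each other (assumed reading of the rule).
    """
    rng = random.Random(seed)
    totals = {name: [0, 0] for _, name, _ in entries}  # name -> [points, games]
    for (owner1, name1, s1), (owner2, name2, s2) in combinations(entries, 2):
        if owner1 == owner2:
            continue
        score1, score2, rounds = play_match(s1, s2, rng)
        totals[name1][0] += score1
        totals[name1][1] += rounds
        totals[name2][0] += score2
        totals[name2][1] += rounds
    return {n: pts / games for n, (pts, games) in totals.items() if games}
```

Dividing by games played is what keeps a strategy that happens to draw longer matches from gaining an advantage.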

You can submit strategies to me by posting here, and writing the strategy in a PM to me.  An example might look like:



Hey Tycho, here's my strategy!




Strategies can be as simple or complex as you want, but they can only involve the following information:
--All past picks made by you in this iterated game (i.e., only versus this opponent, not picks from games vs. other opponents)
--All past picks made by your opponent in this iterated game
--how many rounds have been played in this iterated game
--your total points for this iterated game
--your opponent's total points for this iterated game
--result of a random event (i.e., 'I'll pick A 50% of the time and B 50% of the time, chosen randomly').
Note that if your strategy depends on past picks, be sure to include how you would like to make your first pick.
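
To make the list above concrete, here is a rough sketch of how the permitted information might be bundled up and handed to a strategy. The GameView class and the two example strategies are hypothetical, written only to illustrate the rules; submissions themselves can be plain English.

```python
import random
from dataclasses import dataclass, field


@dataclass
class GameView:
    """Everything a strategy may look at (hypothetical container)."""
    my_picks: list = field(default_factory=list)        # my past picks vs. this opponent
    opponent_picks: list = field(default_factory=list)  # their past picks vs. me
    my_points: int = 0                                  # my total in this iterated game
    opponent_points: int = 0                            # their total in this iterated game
    rng: random.Random = field(default_factory=random.Random)  # source of random events

    @property
    def rounds_played(self):
        return len(self.my_picks)


def coin_flip(view):
    """Pick A 50% of the time and B 50% of the time, chosen randomly."""
    return "A" if view.rng.random() < 0.5 else "B"


def copy_last(view):
    """Copy the opponent's previous pick.

    A strategy that depends on past picks must say what it does first;
    opening with B here is an arbitrary choice for the example.
    """
    if view.rounds_played == 0:
        return "B"
    return view.opponent_picks[-1]
```

Anything not in the view, such as results from games against other opponents, is off limits.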

You can submit entries all this week, and I'll try to get the results posted sometime next week.  At that time, I'll reveal everyone's strategies, and we can see which types of them seem to be successful.

As I will be able to see everyone's strategies, I'll just include a few very basic entries, and one more complicated one which I've already thought up, and not add any more after that.  Everyone else, though, is free to keep coming up with more strategies throughout the week.  The main point here is to see what happens, so feel free to try out different things, even if you think they won't do so well.  I look forward to seeing what everyone comes up with!
Bart
player, 284 posts
LDS
Mon 26 May 2008
at 11:08
  • msg #2

Re: A prisoner's Dilemma competition

I have two strategies.

1. I get a number of people from ASWoT or a different RPoL game to join up here and we all post a strategy.  Knowing that there will be at least 100 games in a series, each strategy will involve an eight step "handshake" (perhaps ABAAABBA) to make certain that we're dealing with an ally, then all of the strategies but one will be programmed to lose continuously (always choosing B), with the odd program out always choosing A, thus giving the single remaining strategy near the maximum possible points.  Thus our team, by working together and subverting our opponents, wins.

2. Tit for tat, with forgiveness.  My program would always do exactly what the other program did in the last round.  If the other program nailed me, then I nail it.  If the other program worked with me, then I work with it.  This will likely be a fairly common strategy, as if another program always works with me, then we all compromise and we end up doing well.  If another program always tries to work me over, then I work it over and we both do as well as the other.  But, to correct for "getting off on the wrong foot", there's a continually decreasing chance (starting at approximately 5%) that my program would be randomly benevolent, forgiving a previous ding.
katisara
GM, 2950 posts
Conservative human
Antagonist
Mon 26 May 2008
at 11:53
  • msg #3

Re: A prisoner's Dilemma competition

I remember doing this in class :)

Just a note, for those of you who read about this elsewhere: generally when I've seen it written, 'A' is the 'not rat' option, while 'B' is the 'rat' option - the reverse of how Tycho wrote it in this case.  Of course, the game works the same either way, I just wanted to make sure that people who read about this somewhere else as well don't get confused.  If a write-up elsewhere says 'always choose A', here that would translate to 'always choose B'.
Tycho
GM, 1419 posts
Mon 26 May 2008
at 11:54
  • msg #4

Re: A prisoner's Dilemma competition

Bart:
I have two strategies.

1. I get a number of people from ASWoT or a different RPoL game to join up here and we all post a strategy.  Knowing that there will be at least 100 games in a series, each strategy will involve an eight step "handshake" (perhaps ABAAABBA) to make certain that we're dealing with an ally, then all of the strategies but one will be programmed to lose continuously (always choosing B), with the odd program out always choosing A, thus giving the single remaining strategy near the maximum possible points.  Thus our team, by working together and subverting our opponents, wins.

This is what I was hoping to avoid by not having strategies submitted by the same player compete with each other.  While there's nothing preventing you from making such plans with other players, I would ask that you don't do so, for this first implementation at least.  Perhaps in future versions we can allow that kind of thing, and see what kinds of meta-strategies show up, or perhaps add an evolutionary component, such that losing strategies die out, and thus can no longer help the successful ones.  For now, though, let's just stick to each person trying to win, rather than forming teams, and see what we learn.

Bart:
2. Tit for tat, with forgiveness.  My program would always do exactly what the other program did in the last round.  If the other program nailed me, then I nail it.  If the other program worked with me, then I work with it.  This will likely be a fairly common strategy, as if another program always works with me, then we all compromise and we end up doing well.  If another program always tries to work me over, then I work it over and we both do as well as the other.  But, to correct for "getting off on the wrong foot", there's a continually decreasing chance (starting at approximately 5%) that my program would be randomly benevolent, forgiving a previous ding.

A couple points here:
-First, you're free to post your strategies publicly like this, but doing so may give others an advantage, so you may want to submit them via private line.  All will be revealed at the end, so you'll still get to explain it.
-Second, as is, this isn't quite specific enough for me to code up, so it wouldn't quite be a valid entry.  For one thing, it lacks the all-important "what do I do in the first round" instruction, which is needed for strategies based on past actions.  And you'll need to be a bit more specific about the forgiveness mechanism.  Let me know, specifically, what chance of forgiveness you'd like, or how to compute it based on the available data.

Thanks for being the first to respond, though!  Your tit-for-tat with forgiveness should be fine with just a bit more info.
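
Purely as an illustration of the level of detail needed, here is one way the missing pieces could be filled in. Both gaps are plugged with guesses that are NOT part of Bart's entry: the first pick is assumed to be B, and the forgiveness chance is assumed to be 5% divided by the number of rounds played so far. The function name and parameters are invented for the example.

```python
import random


def forgiving_tit_for_tat(my_picks, opponent_picks, rng, start_chance=0.05):
    """Tit for tat with a shrinking chance of forgiving an opponent's A.

    Assumptions (not specified in the original entry):
    - the first pick is B;
    - the forgiveness chance is start_chance / rounds_played.
    """
    if not opponent_picks:
        return "B"  # assumed opening pick
    if opponent_picks[-1] == "B":
        return "B"  # they worked with me last round, so I work with them
    # They picked A last round: normally answer with A, but occasionally forgive.
    forgive_chance = start_chance / len(opponent_picks)
    return "B" if rng.random() < forgive_chance else "A"


if __name__ == "__main__":
    rng = random.Random()
    print(forgiving_tit_for_tat([], [], rng))          # opening pick
    print(forgiving_tit_for_tat(["B"], ["B"], rng))    # mirrors a B
```

Any other rule for the opening pick or the decay would do just as well, so long as it is stated explicitly and uses only the permitted information.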