Once upon a time I took a job as a Gaming Analyst and Mathematician at a big Online Gambling company. What can I say. The term “Data Science” did not exist, and this was the highest paying job around. During my tenure there I designed the math behind a few successful ($$$) video slots such as this, this, this, and many more. I am going to assume most of the people who will read this do not see the fascination in gambling, let alone “silly” video slots, in which there is absolutely no strategy, and a player’s job is simply to push the button. And I would be the first to agree. But the mathematics. Oh, the mathematics and psychology behind making a successful game. Now *that* was fun. If you wish to skip the following long intro and *get* to the fun, go to the “Going on a Diet” section.

# The Wizard of Odds

If I want to analyze/simulate a slot, it can’t be anything I invented. The designs behind games which are currently active are top secret, and I can find myself fighting a lawsuit faster than a bullet. So, I am going to use an open source slot! Michael Shackleford, a.k.a. The Wizard of Odds, is one of the most famous gambling experts in the world. Shackleford’s site is a source of inspiration for gaming mathematicians around the world, containing dozens of game designs, calculators and explanations, free of charge, all in the hopes of educating “gamblers if they’re going to gamble anyway”. The recurring theme:

if you read the site thoroughly, the inescapable conclusion is that with very few exceptions, there is no way to beat the house.

Among the many games broken down to bits on his site, Shackleford also published the full design of a 20-line slot he calls The Atkins Diet Slot. I am going to expand on Shackleford’s webpage, adding simulations and some math for you to play with. If you wish to play the slot itself, for free, take a few spins here.

# But First - Slots 101

### House Edge and RTP

It took me a while to see that “House Edge” is what people who know probability simply call “Expectation”. The term represents what percentage of a player’s bet will “stay”, *on average*, in the hands of the House. The Casino. The Boss. For example, take a simple game like the toss of a fair coin. You, the player, bet $1 on Tails. If the coin comes up Tails, the House gives you $2; if it comes up Heads, you lose your money. What is the House Edge? In a fair game it should be zero. Let \(X\) be the House earnings, then:

\(E(X) = P(House Wins) * House Earnings + P(House Loses) * House Loss =\) \(0.5 * (+1) + 0.5 * (-1) = 0\)

But games at the Casino aren’t fair. A more typical coin toss game would be: You bet $1 on Tails. If the coin comes up Tails, the House gives you $1.9, if it comes up Heads you lose. What is the House Edge?

\(E(X) = 0.5 * (+1) + 0.5 * (-0.9) = 0.05\)

So even though in a single game the House might pay you an extra $0.9 for your $1, *on average* it earns $0.05, or 5c for every $1 you bet, or simply 5%. Notice how *on average* is a key term here. Understand *on average* and live a happier life. Oh, and what’s RTP? Return To Player. 1 minus House Edge, or in our case 95%.

What is the House Edge and RTP of the following game: you bet $1 that the next result on your favorite slot would be “7 7 7”. This occurs every 10,000 rounds on average. If you do win, the House will give you $1,000. OK:

\(E(X) = 0.9999 * (+1) + 0.0001 * (-999) = 0.9\)

House Edge of 90%? RTP of just 10%? This has to be a non-realistic example! Actually, no. The thing is, in slots, you’re not betting on a single result, but a few. Each of them occurs with very low probability and gives you very low RTP. Together they sum up, usually to somewhere between 85% - 99% RTP, which means 1% - 15% House Edge. Last example is from the Atkins Diet slot itself. You see the probability for getting the highest prize - 5,000 times your bet for 5 “Atkins” symbols in a row - is 0.00000003, or 1 in ~33 million, which means:

\(E(X) = 0.99999997 * (+1) + 0.00000003 * (-4999) = 0.99985\)

House Edge of 99.99%! But the overall House Edge of this slot narrows down to only 3%.
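The four examples above can be recomputed in a few lines. This is a minimal sketch: `houseEdge` is a helper name made up here, and it assumes a bet that either pays `payout` (the total amount handed to the player, stake included) or loses the stake.

```r
# Hypothetical helper: House Edge of a single win-or-lose bet.
# payout is the total amount handed to the player on a win, stake included.
houseEdge <- function(pWin, payout, bet = 1) {
  houseWins <- (1 - pWin) * bet        # the House keeps the stake
  houseLoses <- pWin * (payout - bet)  # the House pays more than it took
  (houseWins - houseLoses) / bet
}

# Probabilities and payouts exactly as quoted in the text
examples <- data.frame(
  game = c("fair coin", "casino coin", "7 7 7", "5 Atkins"),
  pWin = c(0.5, 0.5, 0.0001, 0.00000003),
  payout = c(2, 1.9, 1000, 5000)
)
examples$houseEdge <- houseEdge(examples$pWin, examples$payout)
examples$RTP <- 1 - examples$houseEdge
print(examples)
```

The `RTP` column is the part to keep an eye on: each rare event contributes only a sliver of RTP on its own, and a slot’s overall RTP is the sum of many such slivers.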

### Multiple Lines Slots

Back in the old days, slot machines looked like this:

Those still exist. But nowadays, casinos make more money and players get more rush, from slots which look like this:

What is going on here? It’s actually pretty simple. These slots have multiple lines on a (usually) 3x5 grid of symbols, instead of a single line of 3 symbols. A winning line is (usually) a line in which a single symbol appears on 2-5 “reels” from left to right consecutively. The more lines the player wishes to play, the higher she must bet, the higher the chance *something* will happen (but will this increase the RTP? See later). Some multiple lines slots have 5 or 10 lines. Most have 20 or 25 - so if to play on a single line you bet $1, to play on all 25 lines you bet $25. But note: the multiplier listed in the paytable is “times” a single line bet! In this example, if a “x100” event occurred, you would win $1 x 100, not $25 x 100. Some slots even have 100 lines. And some, called “All Pays” slots, have lines in each possible direction, which means, for a 4x5 grid, 4^5 lines. 1024 lines! Though in this kind of slot the bet would probably be fixed for all lines, and you can’t really bet $1,024 for a single spin.
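To make the bet arithmetic concrete, here is a small sketch using the numbers from the paragraph above. The variable names are made up for illustration.

```r
betPerLine <- 1    # dollars bet on each line
nLines <- 25       # lines played
multiplier <- 100  # a "x100" event from the paytable

totalBet <- betPerLine * nLines  # what the spin costs: $25
win <- multiplier * betPerLine   # what the x100 event pays: $100, not $2,500
allPaysLines <- 4^5              # lines in an "All Pays" slot on a 4x5 grid

print(c(totalBet = totalBet, win = win, allPaysLines = allPaysLines))
```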

Why am I telling you this? The Atkins Diet Slot has 20 lines. And the Wizard never seems to mention this in his calculations. This is because we only need to calculate what’s happening on a single line! The same applies to all lines and since the user will linearly increase his bet as he chooses more lines - the House Edge and RTP will stay the same.

Notice this does not mean the lines are independent. Far from it. More on this and why this is the “secret” appeal to modern video slots - later.
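Why the RTP survives even though the lines are dependent can be sketched with linearity of expectation, which does not require independence. The symbols \(W_i\) (the win on line \(i\)), \(b\) (the bet per line) and \(n\) (the number of lines played) are notation introduced here, and the sketch assumes every payline has the same win distribution as a single line, which holds when each reel stop is sampled uniformly:

\(RTP_n = \frac{E(W_1 + ... + W_n)}{n * b} = \frac{E(W_1) + ... + E(W_n)}{n * b} = \frac{n * E(W_1)}{n * b} = \frac{E(W_1)}{b} = RTP_1\)

The expectation of a sum is the sum of expectations whether or not the \(W_i\) move together. What the dependence does change is the variance of the total win, not its mean.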

### Features: Wilds, Scatters and Jackpots

Last thing you need to know about modern slots - they have many features. A lot can happen.

Wilds: replace regular symbols, like a Joker. They can pay for themselves, or not. They can double or triple the pay. They can expand on an entire reel to give you more wins. They can “blow up” to throw more wilds on the grid. Countless variations.

Scatter and Bonus Symbols: these give you a bonus, usually when appearing anywhere on the grid. This bonus can simply multiply 10 times your total bet. It can be free spins (in which more free spins can be won recursively, like in the Atkins Diet Slot). It can be a simple external game in which you choose one of five mystery prizes, or a very intricate game with multiple stages.

Jackpots: probably *the* feature players like the most. A certain percentage of each player’s bet is “contributed” into a common prize pool, and “falls” into the hands of a single player either at random or when she gets a certain rare result on the reels. Some games have three Jackpots, small, medium and large!

Remember: as enticing as these features might seem to you, they are all part of a strict mathematical design, intended to give the House its precious Edge. Bombs may blow up. Free spins may trigger free spins which may trigger free spins. Money may fall from the sky. But the House *will* win.

# Going on a Diet

Let’s get the slot’s paytable:

```
library(htmltab)
library(plyr)
library(tidyverse)
library(magrittr)
url <- "https://wizardofodds.com/games/slots/atkins-diet/"
payTable <- htmltab(doc = url, which = "//th[text() = 'Symbol']/ancestor::table")
payTable %<>%
set_colnames(c("Symbol", "Five", "Four", "Three", "Two")) %>%
mutate(One = rep(0, nrow(payTable)),
SymbolGen = c("Wild", paste0("Sym", 1:9))) %>%
set_rownames(c("Wild", paste0("Sym", 1:9))) %>%
mutate_at(c("One", "Two", "Three", "Four", "Five"), as.numeric)
payTable
```

```
## Symbol Five Four Three Two One SymbolGen
## 1 Atkins 5000 500 50 5 0 Wild
## 2 Steak 1000 200 40 3 0 Sym1
## 3 Ham 500 150 30 2 0 Sym2
## 4 Buffalo Wings 300 100 25 2 0 Sym3
## 5 Sausage 200 75 20 0 0 Sym4
## 6 Eggs 200 75 20 0 0 Sym5
## 7 Butter 100 50 15 0 0 Sym6
## 8 Cheese 100 50 15 0 0 Sym7
## 9 Bacon 50 25 10 0 0 Sym8
## 10 Mayonnaise 50 25 10 0 0 Sym9
```

Notice I made 2 small changes: I added a `One` column to allow a future slot we’ll build to also pay for a single symbol appearing on the leftmost reel. And, I added a `SymbolGen` column, to give the Wizard’s symbols generic names such as “Wild” instead of “Atkins”. The other table we need is the reels table (yes, most slots I’ve worked with store in memory an actual reels table). In this case we have 5 reels, each has 32 symbols (never the same two in a row although it’s possible). In each spin we will sample one of the 32 locations, for each reel:

```
reels <- htmltab(doc = url, which = "//th[text() = 'Stop']/ancestor::table")
colnames(reels) <- c("Stop", paste0("Reel", 1:5))
reels[, 2:6] <- reels[, 2:6] %>%
map_df(mapvalues, c(payTable$Symbol, "Scale"), c(payTable$SymbolGen, "Scatter"))
reels
```

```
## Stop Reel1 Reel2 Reel3 Reel4 Reel5
## 2 1 Scatter Sym9 Sym2 Sym2 Sym8
## 3 2 Sym9 Sym3 Sym6 Sym7 Scatter
## 4 3 Sym2 Sym1 Sym5 Wild Sym1
## 5 4 Sym4 Sym4 Scatter Scatter Sym2
## 6 5 Sym8 Sym7 Sym7 Sym6 Sym7
## 7 6 Sym5 Sym9 Sym9 Sym8 Sym4
## 8 7 Sym7 Sym2 Sym6 Sym7 Sym6
## 9 8 Sym9 Sym6 Sym2 Sym4 Sym8
## 10 9 Sym4 Sym8 Sym4 Sym1 Sym3
## 11 10 Sym6 Sym1 Sym8 Sym5 Sym7
## 12 11 Sym3 Sym4 Sym1 Sym8 Sym4
## 13 12 Sym8 Sym9 Sym3 Sym9 Sym2
## 14 13 Sym5 Sym2 Sym6 Sym4 Sym6
## 15 14 Sym9 Wild Sym9 Sym7 Sym1
## 16 15 Sym1 Sym6 Sym7 Sym6 Sym9
## 17 16 Sym3 Sym5 Sym4 Sym2 Sym5
## 18 17 Sym6 Sym7 Sym5 Sym9 Sym4
## 19 18 Sym7 Sym8 Sym8 Sym8 Sym2
## 20 19 Sym5 Sym4 Sym9 Sym3 Wild
## 21 20 Wild Sym3 Sym3 Sym4 Sym6
## 22 21 Sym8 Scatter Sym2 Sym7 Sym3
## 23 22 Sym9 Sym9 Sym4 Sym5 Sym9
## 24 23 Sym2 Sym6 Sym8 Sym6 Sym5
## 25 24 Sym7 Sym7 Sym7 Sym3 Sym2
## 26 25 Sym5 Sym8 Sym5 Sym8 Sym8
## 27 26 Scatter Sym5 Wild Sym9 Sym6
## 28 27 Sym6 Sym3 Sym3 Sym5 Sym1
## 29 28 Sym8 Sym9 Sym8 Sym2 Sym9
## 30 29 Sym4 Sym1 Sym6 Sym4 Sym4
## 31 30 Sym3 Sym2 Sym7 Sym1 Sym5
## 32 31 Sym1 Sym7 Sym9 Sym9 Sym7
## 33 32 Sym6 Sym8 Sym1 Sym8 Sym3
```

Again, I replaced the Wizard’s symbol names with generic names. Next let’s get the Scatter Pays table:

```
scatterPaysTable <- htmltab(doc = url, which = 9)[1:3, 1:2]
scatterPaysTable %<>% mutate_all(as.numeric)
```

Next, let’s get the 20 paylines (this info can only be manually copied following the highlighted lines when hovering over the actual game here):

```
paylines <- matrix(0, nrow = 20, ncol = 5)
paylines[1, ] <- c(2, 2, 2, 2, 2)
paylines[2, ] <- c(1, 1, 1, 1, 1)
paylines[3, ] <- c(3, 3, 3, 3, 3)
paylines[4, ] <- c(1, 2, 3, 2, 1)
paylines[5, ] <- c(3, 2, 1, 2, 3)
paylines[6, ] <- c(2, 1, 1, 1, 2)
paylines[7, ] <- c(2, 3, 3, 3, 2)
paylines[8, ] <- c(1, 1, 2, 3, 3)
paylines[9, ] <- c(3, 3, 2, 1, 1)
paylines[10, ] <- c(2, 1, 2, 3, 2)
paylines[11, ] <- c(2, 3, 2, 1, 2)
paylines[12, ] <- c(1, 2, 2, 2, 1)
paylines[13, ] <- c(3, 2, 2, 2, 3)
paylines[14, ] <- c(1, 2, 1, 2, 1)
paylines[15, ] <- c(3, 2, 3, 2, 3)
paylines[16, ] <- c(2, 2, 1, 2, 2)
paylines[17, ] <- c(2, 2, 3, 2, 2)
paylines[18, ] <- c(1, 1, 3, 1, 1)
paylines[19, ] <- c(3, 3, 1, 3, 3)
paylines[20, ] <- c(1, 3, 3, 3, 1)
```

The first line, `c(2, 2, 2, 2, 2)`, means we’re looking at the second row of the 3x5 grid. The 20th line, `c(1, 3, 3, 3, 1)`, means we’re looking on Reel1 (the leftmost reel) at the first row symbol. Then on Reel2 at the third row symbol. Then again on Reel3 at the third row symbol. Etc.

Now, one more thing before we can simulate: a small subtlety of slots is the rule that “in case more than a single win event occurs, the highest win pays”. This means, for example, that if you get `Wild, Wild, Sym1, Sym2, Sym3` from left to right, although both the “2 Wilds in a row” and “3 Sym1’s in a row” events just occurred - the latter one pays more (x40 vs. x5), so this is the multiplier you’re gonna get. This means there is *order* here, and we need to sort the paytable before checking what we won:

```
longPayTable <- payTable %>%
select(Five, Four, Three, Two, One) %>%
t() %>%
set_colnames(payTable$SymbolGen) %>%
as_tibble() %>%
gather(SymbolGen, prize) %>%
mutate(SymbolGenN = rep(1:nrow(payTable), each = 5),
nConsecutive = rep(5:1, nrow(payTable))) %>%
arrange(-prize) %>%
filter(prize > 0)
longPayTable
```

```
## # A tibble: 34 x 4
## SymbolGen prize SymbolGenN nConsecutive
## <chr> <dbl> <int> <int>
## 1 Wild 5000 1 5
## 2 Sym1 1000 2 5
## 3 Wild 500 1 4
## 4 Sym2 500 3 5
## 5 Sym3 300 4 5
## 6 Sym1 200 2 4
## 7 Sym4 200 5 5
## 8 Sym5 200 6 5
## 9 Sym2 150 3 4
## 10 Sym3 100 4 4
## # ... with 24 more rows
```

As you can see, even when two events pay the same, there is order: `Wild, Wild, Wild, Wild, Sym2` will pay for “4 Wilds in a row” and not “5 Sym2 in a row” even though they both pay x500, because the former appears first.
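The “highest win pays, ties go to whoever is listed first” rule can be tried out on a cut-down paytable. This is a base R sketch with made-up names (`toyEvents`, `sortedEvents`, `firstWin`), keeping only the Wild, Sym1 and Sym2 rows from the table above. It relies on `order()` leaving tied values in their original order, which plays the same role as `arrange(-prize)` in the code above.

```r
# Cut-down paytable: only Wild, Sym1 and Sym2, prizes copied from above
toyEvents <- data.frame(
  symbol = rep(c("Wild", "Sym1", "Sym2"), each = 4),
  n = rep(5:2, times = 3),
  prize = c(5000, 500, 50, 5,  1000, 200, 40, 3,  500, 150, 30, 2),
  stringsAsFactors = FALSE
)
# Highest prize first; order() keeps tied prizes in their original order
sortedEvents <- toyEvents[order(-toyEvents$prize), ]

# Return the first event (in priority order) satisfied by a line, or NULL
firstWin <- function(lineSymbols) {
  for (i in seq_len(nrow(sortedEvents))) {
    firstN <- lineSymbols[seq_len(sortedEvents$n[i])]
    if (all(firstN %in% c(sortedEvents$symbol[i], "Wild"))) {
      return(sortedEvents[i, ])
    }
  }
  NULL
}

firstWin(c("Wild", "Wild", "Sym1", "Sym2", "Sym3"))  # 3 Sym1 in a row, x40
firstWin(c("Wild", "Wild", "Wild", "Wild", "Sym2"))  # 4 Wilds in a row, x500
```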

# On Our Way to Simulation Land

Let’s simulate a spin. This means one of 32 locations for each of the 5 reels:

```
spin <- sample(1:32, 5, replace = TRUE)
spin
```

`## [1] 1 15 6 15 26`

Wow, that was easy. If we agree the location sampled represents what will appear in the middle row, we can see what the grid would look like:

```
reelsMat <- t(as.matrix(reels[, 2:6]))
firstRowSpin <- ifelse(spin - 1 < 1, 32, spin - 1)
thirdRowSpin <- ifelse(spin + 1 > 32, 1, spin + 1)
resGrid <- rbind(reelsMat[cbind(seq_along(firstRowSpin), firstRowSpin)],
reelsMat[cbind(seq_along(spin), spin)],
reelsMat[cbind(seq_along(thirdRowSpin), thirdRowSpin)])
resGrid
```

```
## [,1] [,2] [,3] [,4] [,5]
## [1,] "Sym6" "Wild" "Sym7" "Sym7" "Sym8"
## [2,] "Scatter" "Sym6" "Sym9" "Sym6" "Sym6"
## [3,] "Sym9" "Sym5" "Sym6" "Sym2" "Sym1"
```
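The `ifelse` calls in the grid code above wrap around the ends of the reel strip. An equivalent way to write that wrap is modular arithmetic, sketched here with a hypothetical helper `wrapStop` (not part of the original code), which also keeps working if a shift is larger than one stop:

```r
# Hypothetical helper: map any integer position onto 1..nStops, wrapping around
wrapStop <- function(stop, nStops = 32) {
  ((stop - 1) %% nStops) + 1
}

spin <- c(1, 15, 6, 15, 32)      # middle-row stops, including both edge cases
firstRow <- wrapStop(spin - 1)   # the stop shown above the middle row
thirdRow <- wrapStop(spin + 1)   # the stop shown below the middle row
rbind(firstRow, spin, thirdRow)
```

Stop 1 gets stop 32 above it and stop 32 gets stop 1 below it, the same result the `ifelse` version gives.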

Note first that the same location can be sampled for more than one reel, that is why we use `replace = TRUE` when sampling. In other words the reels are completely independent. Secondly note that if the middle row holds location 1, the first row will hold the location 32. If the middle row is on location 32, the third row is on location 1. Hence the sense of a “reel”. So, can you spot our wins? Let’s do it bottom-up, and write a function which checks if we won on a single line, a single event:

```
checkWinSingleLineSingleEvent <- function(lineSymbols, nConsecutive, Sym) {
all(lineSymbols[1:nConsecutive] %in% c(Sym, "Wild"))
}
# did we win "3 Sym6 in a row" in the middle row?
checkWinSingleLineSingleEvent(resGrid[2,], 3, "Sym6")
```

`## [1] FALSE`

No. Now let’s write a function which will get a single payline and check whether we won any event:

```
checkWinSingleLine <- function(lineSymbols) {
  won <- FALSE
  prizeCounter <- 0
  while(!won & prizeCounter < nrow(longPayTable)) {
    prizeCounter <- prizeCounter + 1
    won <- checkWinSingleLineSingleEvent(lineSymbols,
                                         longPayTable$nConsecutive[prizeCounter],
                                         longPayTable$SymbolGen[prizeCounter])
  }
  if (won) {
    return(prizeCounter)
  }
  return(NA)
}
# did we win anything on payline 1 (which is the middle row)?
checkWinSingleLine(resGrid[2, ])
```

`## [1] NA`

No. Notice the critical use of a `while` loop here. We’re checking the prizes by a specific order discussed before, and if a prize has been won there’s no need to check the next prizes.

Let’s write a function which will receive a middle row `spin` and return a vector holding how many times each possible win event occurred (if you’re following, you can tell the length of such a vector would be `nrow(longPayTable)`, which is 34 here):

```
nPaylines <- 20
checkAllLinesAllEvents <- function(spin, nPaylines) {
  winVector <- rep(0, nrow(longPayTable))
  nPayLinesWon <- 0
  for (payline in 1:nPaylines) {
    paylineSpin <- spin - 2 + paylines[payline, ]
    paylineSpin <- ifelse(paylineSpin < 1, nrow(reels),
                          ifelse(paylineSpin > nrow(reels), 1, paylineSpin))
    lineSymbols <- reelsMat[cbind(seq_along(paylineSpin), paylineSpin)]
    prizeLoc <- checkWinSingleLine(lineSymbols)
    if (!is.na(prizeLoc)) {
      winVector[prizeLoc] <- winVector[prizeLoc] + 1
      nPayLinesWon <- nPayLinesWon + 1
    }
  }
  return(list(winVector = winVector, nPayLinesWon = nPayLinesWon))
}
# did we win anything this spin?!
winVector <- checkAllLinesAllEvents(spin, nPaylines)$winVector
longPayTable %>%
cbind(winVector) %>%
filter(winVector > 0)
```

```
## SymbolGen prize SymbolGenN nConsecutive winVector
## 1 Sym6 50 7 4 1
## 2 Sym6 15 7 3 1
```

Yes, we have “4 Sym6 in a row” on line 4 (`c(1, 2, 3, 2, 1)`), and “3 Sym6 in a row” on line 18 (`c(1, 1, 3, 1, 1)`) with the help of a “Wild”. Verify that you understand that.

But our round isn’t complete yet. We still have not handled the “Scatter” symbol (or “Scale” in the Wizard’s case). The player will get a prize multiplier of her *total bet* if at least 3 scatters appear anywhere on the grid. The player will also get 10 free spins (which can trigger more free spins!), at *the same* bet per line and number of paylines chosen, with all wins tripled!

```
checkScatterPaysMultiplier <- function(spin) {
  firstRowSpin <- ifelse(spin - 1 < 1, nrow(reels), spin - 1)
  thirdRowSpin <- ifelse(spin + 1 > nrow(reels), 1, spin + 1)
  resGrid <- rbind(reelsMat[cbind(seq_along(firstRowSpin), firstRowSpin)],
                   reelsMat[cbind(seq_along(spin), spin)],
                   reelsMat[cbind(seq_along(thirdRowSpin), thirdRowSpin)])
  nScatters <- sum(resGrid == "Scatter")
  scatterPaysMultiplier <- ifelse(nScatters < 3, 0,
                                  scatterPaysTable$Pays[scatterPaysTable$Scatters == nScatters])
  return(scatterPaysMultiplier)
}
```

We’ll leave the multiplying of the total bet and the free spins for the entire single round function. Which we’ll write right now:

```
nFreeSpins <- 10
freeSpinsMultiplier <- 3
oneRoundSimulation <- function(betPerLine, nPaylines, freeSpinMode = FALSE) {
  spin <- sample(1:nrow(reels), 5, replace = TRUE)
  baseResults <- checkAllLinesAllEvents(spin, nPaylines)
  winVector <- baseResults$winVector * ifelse(freeSpinMode, freeSpinsMultiplier, 1)
  nPaylinesWon <- baseResults$nPaylinesWon
  baseLineWins <- crossprod(longPayTable$prize * betPerLine, winVector)
  scatterPaysWins <- 0
  freeSpinsWins <- 0
  scatterPaysMultiplier <- checkScatterPaysMultiplier(spin)
  if (scatterPaysMultiplier > 0) {
    scatterPaysWins <- betPerLine * nPaylines * scatterPaysMultiplier *
      ifelse(freeSpinMode, freeSpinsMultiplier, 1)
    for (i in 1:nFreeSpins) {
      freeSpinsWins <- freeSpinsWins + oneRoundSimulation(betPerLine, nPaylines, TRUE)$totalWin
    }
  }
  totalWin <- baseLineWins + scatterPaysWins + freeSpinsWins
  return(list(baseLineWins = baseLineWins,
              nPaylinesWon = nPaylinesWon,
              scatterPaysWins = scatterPaysWins,
              freeSpinsWins = freeSpinsWins,
              totalWin = totalWin))
}
# play a few rounds for fun:
print(paste0("You played 20 lines with $1 per line and won: $", oneRoundSimulation(1, 20)$totalWin))
print(paste0("You played 10 lines with $0.1 per line and won: $", oneRoundSimulation(0.1, 10)$totalWin))
print(paste0("You played 5 lines with $5 per line and won: $", oneRoundSimulation(5, 5)$totalWin))
```

```
## [1] "You played 20 lines with $1 per line and won: $0"
## [1] "You played 10 lines with $0.1 per line and won: $0"
## [1] "You played 5 lines with $5 per line and won: $0"
```

Notice the use of recursion in the Free Spins part of the function. A round can call 10 free rounds, each can call 10 other free rounds ad infinitum. Practically the probability of “infinitum” is infinitesimally small.
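How small is “infinitesimally small”? A rough sketch, assuming free spins use the same reels as the base game (as the simulation above does) and using the Scatter counts per reel from the reels table (2, 1, 1, 1, 1). The names `pVisible`, `dist` and `pTrigger` are made up here. A Scatter shows on a reel when the sampled stop is the Scatter itself or one of its two neighbours, and the two Reel1 Scatters (stops 1 and 26) are too far apart to show together.

```r
scatterCounts <- c(2, 1, 1, 1, 1)  # Scatters on Reel1..Reel5
nStops <- 32

# Probability that a reel shows a Scatter somewhere in its 3 visible rows
pVisible <- 3 * scatterCounts / nStops

# Distribution of the number of visible Scatters (0..5):
# convolve the five independent yes/no reels one at a time
dist <- 1
for (p in pVisible) {
  dist <- c(dist * (1 - p), 0) + c(0, dist * p)
}
names(dist) <- 0:5

# Free spins trigger on 3 or more Scatters
pTrigger <- sum(dist[c("3", "4", "5")])

# Each trigger awards 10 spins, each of which can trigger again:
# T = 10 + 10 * pTrigger * T  =>  T = 10 / (1 - 10 * pTrigger)
nFreeSpins <- 10
expectedFreeSpins <- nFreeSpins / (1 - nFreeSpins * pTrigger)

print(c(pTrigger = pTrigger, expectedFreeSpins = expectedFreeSpins))
```

Under these assumptions a spin triggers the feature a bit more than 1% of the time, so each free spin spawns far fewer than one new free spin on average, and the recursion dies out quickly.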

Let’s play… 100K rounds, on a single line, and see whether indeed the RTP is around 97% for this slot:

```
winnings <- data.frame(baseLineWins = numeric(0), scatterPaysWins = numeric(0),
freeSpinsWins = numeric(0), totalWin = numeric(0))
nRounds <- 100000
betPerLine <- 1
nPaylines <- 1
for (i in 1:nRounds) {
winnings[i, ] <- unlist(oneRoundSimulation(betPerLine, nPaylines))
}
totalBetNRounds <- nRounds * betPerLine * nPaylines
print(paste0("RTP from base lines: ", sum(winnings$baseLineWins) / totalBetNRounds))
print(paste0("RTP from scatter pays: ", sum(winnings$scatterPaysWins) / totalBetNRounds))
print(paste0("RTP from free spins: ", sum(winnings$freeSpinsWins) / totalBetNRounds))
print(paste0("RTP Total: ", sum(winnings$totalWin) / totalBetNRounds))
```

```
## [1] "RTP from base lines: 0.64213"
## [1] "RTP from scatter pays: 0.0663"
## [1] "RTP from free spins: 0.25158"
## [1] "RTP Total: 0.96001"
```

OK, quite close to the Wizard’s specs. Notice the large portion of the free spins feature in the RTP - when players finally get such a feature, they want MONEY. Will this RTP change when playing 20 lines?

```
winnings20Lines <- data.frame(baseLineWins = numeric(0), scatterPaysWins = numeric(0),
freeSpinsWins = numeric(0), totalWin = numeric(0))
nRounds <- 100000
betPerLine <- 1
nPaylines <- 20
for (i in 1:nRounds) {
winnings20Lines[i, ] <- unlist(oneRoundSimulation(betPerLine, nPaylines))
}
totalBetNRounds <- nRounds * betPerLine * nPaylines
print(paste0("RTP from base lines: ", sum(winnings20Lines$baseLineWins) / totalBetNRounds))
print(paste0("RTP from scatter pays: ", sum(winnings20Lines$scatterPaysWins) / totalBetNRounds))
print(paste0("RTP from free spins: ", sum(winnings20Lines$freeSpinsWins) / totalBetNRounds))
print(paste0("RTP Total: ", sum(winnings20Lines$totalWin) / totalBetNRounds))
```

```
## [1] "RTP from base lines: 0.6390915"
## [1] "RTP from scatter pays: 0.0662"
## [1] "RTP from free spins: 0.2623215"
## [1] "RTP Total: 0.967613"
```

Of course not! I tried explaining why above, if you still don’t see why, that’s OK, we can email :)

But this doesn’t give us, as game designers, any sense of what this game is really like. VP Technology and VP Finance might be happy, but VP Product should be worried. To make an extreme example: suppose I remove the Scatter Pays feature, the free spins, and put zero in the entire payTable except for “5 Wilds in a row” for which I’d give a prize of - take a breath - $32 million! Guess what - this game would have an RTP of about 95% (32,000,000 / 33,554,432, verify!), in the same ballpark as the original 97%, but is it the same game?!
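The arithmetic behind this extreme game is short enough to check directly. This sketch assumes a $1 bet on a single line and a single winning combination out of 32^5; the variable names are made up here. It also computes the standard deviation of a single spin’s win, which is where the two games part ways.

```r
nCombos <- 32^5   # 33,554,432 equally likely stop combinations
prize <- 32e6     # the one and only prize, for a $1 bet
pWin <- 1 / nCombos

rtp <- pWin * prize                        # expected return per $1 bet
sdPerSpin <- sqrt(pWin * prize^2 - rtp^2)  # standard deviation of one spin's win

print(c(RTP = rtp, sdPerSpin = sdPerSpin))
```

The RTP lands within a couple of percentage points of the original game, while the per-spin standard deviation runs into thousands of dollars on a $1 bet. Similar average, a completely different ride.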

# Simulation Land

Let’s look at 3 measures, or graphs:

- RTP distribution: we know the general RTP is 97% but what’s it like for a single player?
- Total Win distribution: does the game give high prizes with low frequency, low prizes with high frequency? How often does a player win more than her total bet?
- The “Money Path”: the balance of the “top”, “bottom” and “median” RTP players as a function of time

```
# a single simulation in which we record all data we need
performSimulation <- function(nPlayers, nRoundsPerPlayer, betPerLine, nPaylines) {
  RTPs <- numeric(0)
  totalWins <- list()
  moneyPaths <- list()
  for (player in 1:nPlayers) {
    totalWinPlayer <- 0
    currentPlayerWins <- numeric(0)
    for (i in 1:nRoundsPerPlayer) {
      roundWin <- oneRoundSimulation(betPerLine, nPaylines)$totalWin
      currentPlayerWins[i] <- roundWin
      totalWinPlayer <- totalWinPlayer + roundWin
    }
    RTPs[player] <- totalWinPlayer / (nRoundsPerPlayer * betPerLine * nPaylines)
    totalWins[[player]] <- currentPlayerWins
    moneyPaths[[player]] <- cumsum(currentPlayerWins) /
      ((1:nRoundsPerPlayer) * betPerLine * nPaylines)
  }
  return(list(RTPs = RTPs, totalWins = totalWins, moneyPaths = moneyPaths))
}
nPlayers <- 100
nRoundsPerPlayer <- 100
nPaylines <- c(rep(1, 2), rep(20, 2))
betPerLine <- rep(c(0.01, 1), 2)
simulationResults <- list()
# again, 100 players each play 100 rounds, different bet-per-line and no. of paylines choice
for (i in 1:length(nPaylines)) {
simulationResults[[i]] <- performSimulation(nPlayers, nRoundsPerPlayer,
betPerLine[i], nPaylines[i])
}
```

### RTP distribution

```
library(ggplot2)
library(gridExtra)
# getting just the RTPs in data.frames because ggplot likes that
RTPsDFs <- lapply(simulationResults, function(x) data.frame(RTP = x$RTPs))
plotRTPsHist <- function(RTPsDF, title) {
ggplot(RTPsDF, aes(x = RTP)) +
geom_histogram(breaks = seq(0, 10, by = 1), color = "black", fill = "red", alpha = 0.5) +
labs(title = title, x = "RTP", y="#Players") +
theme(axis.text = element_text(size = 8),
axis.title = element_text(size = 10)) +
geom_vline(xintercept = 1) +
ylim(c(0, 100))
}
plotTitles <- paste0("$", betPerLine, " per line, ", nPaylines, " lines")
plotsList <- mapply(plotRTPsHist, RTPsDFs, plotTitles, SIMPLIFY = FALSE)
do.call(grid.arrange, plotsList)
```

Ah. See that in all four forms of play, most of the players (60-70) are at a loss after 100 rounds (below the black vertical line at RTP 100%). When playing 1 line there are a few players who reached RTP over 300%! But: (a) they had a really boring experience betting on a single line - the player who bet 1 cent on 1 line and got to 400% RTP actually profited… $3 (verify!) and (b) remember these are still very few players.
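The “(verify!)” above is a one-liner: profit is total winnings minus total bets, which is (RTP - 1) times the total amount bet. `playerProfit` is a hypothetical helper name used only for this sketch.

```r
# Hypothetical helper: profit in dollars for a player who ends at a given RTP
playerProfit <- function(rtp, nRounds, betPerLine, nPaylines) {
  totalBet <- nRounds * betPerLine * nPaylines
  (rtp - 1) * totalBet
}

playerProfit(rtp = 4, nRounds = 100, betPerLine = 0.01, nPaylines = 1)  # 1 cent, 1 line
playerProfit(rtp = 4, nRounds = 100, betPerLine = 1, nPaylines = 20)    # $1, 20 lines
```

The same 400% RTP is worth $3 to the first player and $6,000 to the second, which is why RTP alone says little about how a session feels.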

### Total Win distribution

```
library(scales)
totalWinsDFs <- lapply(simulationResults, function(x)
data.frame(totalWin = unlist(x$totalWins) + 1))
plotTotalWinHist <- function(totalWinsDF, title, totalBet) {
ggplot(totalWinsDF, aes(x = totalWin)) +
stat_density(aes(y=..count..), color="black", fill="green", alpha=0.5) +
scale_x_continuous(breaks=c(0,1,2,3,4,5,10,30,100,300,1000), trans="log1p", expand=c(0,0)) +
labs(title = title, x = "log(Total Win$ + 1)", y="#Rounds") +
theme(axis.text = element_text(size = 8),
axis.title = element_text(size = 10)) +
geom_vline(xintercept = totalBet + 1)
}
plotTitles <- paste0("$", betPerLine, " per line, ", nPaylines, " lines")
totalBets <- betPerLine * nPaylines
plotsList <- mapply(plotTotalWinHist, totalWinsDFs, plotTitles, totalBets, SIMPLIFY = FALSE)
do.call(grid.arrange, plotsList)
```

These plots are amazing considering they’re coming from the same game. Notice the x axis is on log(x + 1) scale, so “1” means “$0”, or no wins at all. Compare the two forms of playing 1 cent per line (top and bottom left): adding lines makes the game a lot more “volatile”. It increases the variability of results, making it much more interesting for the player, even though she gets the same RTP, and it’s more expensive. Same goes with playing $1 per line - it’s expensive but when playing 20 lines, you get a rush in the form of the “hill” in the bottom-right plot - when you win, you win big. This is where the fact that the lines are not independent, and actually a winning line tends to occur with other winning lines - comes into play. That “hill” is what causes addiction to these games.

### Money Path

```
getMinMaxMedianPlayers <- function(RTPs) {
which.median <- function(x) which.min(abs(x - median(x)))
list(minPlayer = which.min(RTPs),
maxPlayer = which.max(RTPs),
medianPlayer = which.median(RTPs))
}
minMaxMedianPlayers <- lapply(simulationResults, function(x) getMinMaxMedianPlayers(x$RTPs))
nRoundsPerPlayer <- 100
getMoneyPaths <- function(sim, minMaxMedianPlayers) {
moneyPaths <- data.frame(minMoneyPath = sim$moneyPaths[[minMaxMedianPlayers$minPlayer]],
maxMoneyPath = sim$moneyPaths[[minMaxMedianPlayers$maxPlayer]],
medianMoneyPath = sim$moneyPaths[[minMaxMedianPlayers$medianPlayer]])
moneyPaths %<>% gather() %>% mutate(round = rep(1:nRoundsPerPlayer, 3))
}
moneyPathsDFs <- mapply(getMoneyPaths, simulationResults, minMaxMedianPlayers, SIMPLIFY = FALSE)
plotMoneyPaths <- function(moneyPaths, title) {
ggplot(moneyPaths, aes(x = round, y = value, color = key)) +
geom_line() +
labs(title = title, x = "Round", y="RTP") +
theme(axis.text = element_text(size = 8),
axis.title = element_text(size = 10))
}
plotTitles <- paste0("$", betPerLine, " per line, ", nPaylines, " lines")
plotsList <- mapply(plotMoneyPaths, moneyPathsDFs, plotTitles, SIMPLIFY = FALSE)
do.call(grid.arrange, plotsList)
```

Nice. But the max players make it difficult to see what’s going on with the median and min players. It does look however as though when playing 20 lines a lot more happens. It’s not just about a single peak, which takes time to decrease because you’re betting so conservatively - there are a few peaks with 20 paylines, highs and lows. Lastly see that in all forms of play, the trend of the max player is going downwards, and she *will* lose eventually. Statistics has spoken.

# So You Think You Can Do Combinatorics

OK, so far so good. But what if we wanted to increase the chances of winning “5 Wilds in a row”? The VP Product wants this event to actually happen, not 1 in 33 million! Suppose we inserted another “Wild” in one of the reels. How would we know the expected House Edge or RTP?

When referring to the “Line Pay Combinations” table, the Wizard says this:

There are 32 × 32 × 32 × 32 × 32 = 33,554,432 possible outcomes in Atkins Diet. The following table shows how many of these combinations result in each possible win. These figures are the result of a program with five nested loops that tallied the total for each win for each possible combination.

I did in fact write such a program:

```
combTable <- matrix(0, nrow = nrow(payTable), ncol = 5)
colnames(combTable) <- c("One", "Two", "Three", "Four", "Five")
rownames(combTable) <- payTable$SymbolGen
nRowsReels <- nrow(reels)
for (r1 in 1:nRowsReels) {
  for (r2 in 1:nRowsReels) {
    for (r3 in 1:nRowsReels) {
      for (r4 in 1:nRowsReels) {
        for (r5 in 1:nRowsReels) {
          spin <- c(r1, r2, r3, r4, r5)
          res <- reelsMat[cbind(seq_along(spin),spin)]
          won <- FALSE
          prize_counter <- 0
          while(!won & prize_counter < nrow(longPayTable)) {
            prize_counter <- prize_counter + 1
            won <- checkWinSingleLineSingleEvent(res, longPayTable$nConsecutive[prize_counter],
                                                 longPayTable$SymbolGen[prize_counter])
          }
          if (won) {
            combTable[longPayTable$SymbolGenN[prize_counter],
                      longPayTable$nConsecutive[prize_counter]] <-
              combTable[longPayTable$SymbolGenN[prize_counter],
                        longPayTable$nConsecutive[prize_counter]] + 1
          }
        }
      }
    }
  }
}
```

The above monster will get you the Wizard’s combinations table. But on my laptop it takes ~11 to 12 hours to finish! Now, you can write this in a faster language, e.g. C or Java, you can parallelize the loops and perform them concurrently - but I think that even if you narrow down 11 hours to 11 minutes - it’s still way too long for small tweaks and changes, such as adding an extra “Wild” on reels. So why don’t we do it with, uhm, what you call it: math.

Ah, but this math thing can be pretty hard. Even this slot, which does not seem like much of a combinatorial^{1} challenge - can make you cry.

But before crying, let’s see that we get the Wizard’s “Symbol Distribution”:

```
reels[, 2:6] %>%
gather(key, Symbol) %>%
count(key, Symbol) %>%
spread(key, n, fill = 0)
```

```
## # A tibble: 11 x 6
## Symbol Reel1 Reel2 Reel3 Reel4 Reel5
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Scatter 2 1 1 1 1
## 2 Sym1 2 3 2 2 3
## 3 Sym2 2 3 3 3 4
## 4 Sym3 3 3 3 2 3
## 5 Sym4 3 3 3 4 4
## 6 Sym5 4 2 3 3 3
## 7 Sym6 4 3 4 3 4
## 8 Sym7 3 4 4 4 3
## 9 Sym8 4 4 4 5 3
## 10 Sym9 4 5 4 4 3
## 11 Wild 1 1 1 1 1
```

Let’s try to calculate the number of combinations to get “3 Wilds in a row” - should be simple, right?

\(Comb(3\space Wilds) = Comb(Wild\space Reel1) * Comb(Wild\space Reel2) * Comb(Wild\space Reel3) *\) \(Comb(Not\space Wild\space Reel4) * Comb(Whatever\space Reel5)\)

So that’s `1 * 1 * 1 * 31 * 32` which is `992`. But the Wizard has counted `513`. Oh right, duh, if we get `Wild, Wild, Wild, Sym1, Sym2` this could count as “4 Sym1’s in a row”, which pays x200, more than what “3 Wilds in a row” pays (x50). OK so we need to exclude on Reel4 all those symbols which pay more than our “3 Wilds in a row” when occurring in a “4 in a row” sequence. In this case these are Sym1, Sym2, Sym3, Sym4 and Sym5:

\(Comb(3\space Wilds) = Comb(Wild\space Reel1) * Comb(Wild\space Reel2) * Comb(Wild\space Reel3) *\) \(Comb(Not\space Wild/Sym1/Sym2/Sym3/Sym4/Sym5\space Reel4) * Comb(Whatever\space Reel5)\)

This would get us to: `1 * 1 * 1 * 17 * 32` which is `544`. But the Wizard has counted `513`. Oh, wait. What about `Wild, Wild, Wild, Sym6, Sym6`? That should count as “5 Sym6 in a row” which pays x100, not “3 Wilds in a row” which pays only x50! And we’re counting that too! So we need to subtract all symbols appearing twice, thus completing a “5 in a row” sequence which would pay more than x50. In this case those are Sym6 and Sym7 (we already got rid of Sym1 through Sym5!):

\(Comb(3\space Wilds) = Comb(Wild\space Reel1) * Comb(Wild\space Reel2) * Comb(Wild\space Reel3) *\) \([Comb(Not\space Wild/Sym1/Sym2/Sym3/Sym4/Sym5\space Reel4) * Comb(Whatever\space Reel5) -\) \(Comb(Sym6\space or\space Sym7\space on\space Reel4\space and\space Reel5)]\)

Well that’s `1 * 1 * 1 * (17 * 32 - 3*4 - 4*3)` which is `520`. So close! But still not `513`. Why are we counting 7 incorrect combinations?! I’ll tell you why: we’re also counting a pattern such as `Wild, Wild, Wild, Sym6, Wild`, which pays x100 for “5 Sym6 in a row”, higher than our “3 Wilds in a row”. This occurs with Sym6 and Sym7:

\(Comb(3\space Wilds) = Comb(Wild\space Reel1) * Comb(Wild\space Reel2) * Comb(Wild\space Reel3) *\) \([Comb(Not\space Wild/Sym1/Sym2/Sym3/Sym4/Sym5\space Reel4) * Comb(Whatever\space Reel5) -\) \(Comb(Sym6\space or\space Sym7\space on\space Reel4\space and\space Reel5) -\) \(Comb(Sym6 \space or \space Sym7 \space on \space Reel4 \space and \space Wild \space on \space Reel5)]\)

Which gives: `1 * 1 * 1 * (17 * 32 - 3*4 - 4*3 - 3*1 - 4*1)` which is `513`, finally. And this is just “3 Wilds in a row”! And this is without writing the R function to do this automatically! And…

What, we’re not done?! No. I am sorry. If you implement the function for calculating the number of “3 Wilds in a row” this way - you still might encounter mistakes. This can happen if the user of your software, deliberately or not, decides to enter, say, a x100 prize for “2 Wilds in a row”, keeping only a x50 prize for “3 Wilds in a row” - what, this seems contrived to you? Trust me, this *can* happen.

So using math to calculate these probabilities is possible. In fact the one feature which caused us so much grief here is the fact that the “Wild” symbol pays, and if we were to just eliminate this feature, the whole calculation would become much simpler. But a good practice is to *never* disrespect your combinatorics. And that is why, in some cases, even the Wizard would use a simulation to calculate the House Edge.
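One possible middle road between the five nested loops and the hand-derived formulas is to enumerate *symbols* rather than *stops*. On a single line only the symbol on each reel matters, so there are 11^5 = 161,051 symbol tuples to check instead of 32^5 stop combinations, each weighted by the number of stop combinations producing it. The sketch below is neither the Wizard’s method nor the one used in this post. It hard-codes the symbol distribution and paytable shown above, and the names `counts`, `events`, `grid`, `weight` and `best` are made up here.

```r
# Symbol counts per reel (columns Reel1..Reel5), from the symbol distribution
symbols <- c("Scatter", paste0("Sym", 1:9), "Wild")
counts <- cbind(c(2, 2, 2, 3, 3, 4, 4, 3, 4, 4, 1),
                c(1, 3, 3, 3, 3, 2, 3, 4, 4, 5, 1),
                c(1, 2, 3, 3, 3, 3, 4, 4, 4, 4, 1),
                c(1, 2, 3, 2, 4, 3, 3, 4, 5, 4, 1),
                c(1, 3, 4, 3, 4, 3, 4, 3, 3, 3, 1))
rownames(counts) <- symbols

# Paytable rows (Five, Four, Three, Two) in the order Wild, Sym1, ..., Sym9
pays <- rbind(c(5000, 500, 50, 5), c(1000, 200, 40, 3), c(500, 150, 30, 2),
              c(300, 100, 25, 2), c(200, 75, 20, 0), c(200, 75, 20, 0),
              c(100, 50, 15, 0), c(100, 50, 15, 0), c(50, 25, 10, 0),
              c(50, 25, 10, 0))
events <- data.frame(symbol = rep(c("Wild", paste0("Sym", 1:9)), each = 4),
                     n = rep(5:2, times = 10),
                     prize = as.vector(t(pays)),
                     stringsAsFactors = FALSE)
events <- events[events$prize > 0, ]
events <- events[order(-events$prize), ]  # ties keep their paytable order

# All 11^5 symbol tuples, and how many stop combinations produce each one
grid <- as.matrix(expand.grid(rep(list(symbols), 5), stringsAsFactors = FALSE))
weight <- Reduce(`*`, lapply(1:5, function(r) counts[grid[, r], r]))

# Give every tuple the first (highest priority) event it satisfies
best <- rep(NA_integer_, nrow(grid))
for (e in seq_len(nrow(events))) {
  firstN <- grid[, seq_len(events$n[e]), drop = FALSE]
  hit <- rowSums(firstN == events$symbol[e] | firstN == "Wild") == events$n[e]
  best[is.na(best) & hit] <- e
}
events$combinations <- sapply(seq_len(nrow(events)),
                              function(e) sum(weight[which(best == e)]))
linePaysRTP <- sum(events$prize * events$combinations) / sum(weight)

events[events$symbol == "Wild", ]
```

With these inputs the “3 Wilds in a row” row comes out at 513 combinations, the same figure reached by hand above, and the whole thing should finish in seconds rather than hours. Because the prizes are only consulted through the sorted `events` table, a contrived paytable (say x100 for 2 Wilds) would be handled by the same code, though that claim deserves its own testing.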

# Smells Like a Shiny App

The next step would be to return to where I left off and show you how I calculate the entire probabilities table, in R. This way we can tweak our slot, each time looking at our three measures to see what exactly we changed in terms of not just House Edge but also the player’s experience. Shiny app, in short. Hope I helped some struggling gaming mathematician out there, hope I convinced you that gambling can also be fascinating, not just addictive, and that designing even a “silly” slot machine can be a very creative and challenging task. Finally, if you have to gamble, and you happen to be on an RTP of 120% - stop. You beat the House. Make sure you know what you’re doing. Respect Statistics. And don’t drink and gamble.

^{1} might have made up a word