|
Post by Dashing Inventor on Jul 9, 2014 2:18:04 GMT -8
How many results are those numbers based off of?
|
|
|
Post by Arcanet on Jul 9, 2014 4:02:25 GMT -8
One million flips color vs color, so 16 million/table, I think.
|
|
Ziphion
Full Member
Resident Mathematician
Posts: 132
|
Post by Ziphion on Jul 9, 2014 6:12:22 GMT -8
Yeah, each percentage is the result of 1,000,000 opposed flips, so a 59.6% translates to about 596,000 wins for color A, 404,000 wins for color B. Originally I tried testing with only 300,000, but I was still getting a few instances of 49.9% and 50.1% when the two colors were identical, so I needed more data to smooth some of that randomness.
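A rough way to see why 300,000 flips still showed 49.9% and 50.1% for identical colors: the sketch below computes the standard error of an observed win rate. It assumes each opposed flip behaves like an independent win/loss trial, which is only approximately true when cards are drawn from a deck without replacement; the function name is made up for illustration.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Standard error of an observed win rate p over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


# Compare the two sample sizes mentioned in the post for a true 50% matchup.
for n in (300_000, 1_000_000):
    se_points = 100 * standard_error(0.5, n)
    print(f"{n:,} flips: one standard error is about {se_points:.2f} percentage points")
```

Under that assumption, one standard error is about 0.09 percentage points at 300,000 flips and about 0.05 at 1,000,000, so an occasional 49.9% or 50.1% is unsurprising at the smaller size and much rarer at the larger one.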
|
|
|
Post by Ziphion on Jul 9, 2014 7:17:11 GMT -8
Interesting side note: unopposed attempts. As an example, blue's chance of success on an unopposed flip is just its number of positive results (9) over every possible result (36), or 25%. This means that on the original demo deck and altered deck 1, blue's chance of success on an unopposed attempt is actually even lower than blue vs. red (27.4% and 26.8% respectively). In altered deck 2, however, it falls between blue vs. yellow and blue vs. red. Green's chance of success on an unopposed attempt (15/36 = 41.7%) falls between green vs. green and green vs. yellow in the original deck and altered deck 2, but between green vs. yellow and green vs. red on altered deck 1. In fact, let me just list the order of difficulty for each color's flips, from easiest to hardest:
Original demo deck
Blue: blue (50%), green (41.3%), yellow (33.3%), red (27.4%), unopposed (25%)
Green: blue (58.7%), green (50%), unopposed (41.7%), yellow (41.3%), red (34.2%)
Yellow: blue (66.7%), green (58.7%), unopposed (58.3%), yellow (50%), red (42.4%)
Red: unopposed (75%), blue (72.6%), green (65.8%), yellow (57.6%), red (50%)

Altered deck 1
Blue: blue (50%), green (41.7%), yellow (34.0%), red (26.8%), unopposed (25%)
Green: blue (58.3%), green (50%), yellow (41.9%), unopposed (41.7%), red (34.0%)
Yellow: blue (66.0%), unopposed (58.3%), green (58.1%), yellow (50%), red (41.6%)
Red: unopposed (75%), blue (73.2%), green (66.0%), yellow (58.3%), red (50%)

Altered deck 2
Blue: blue (50%), green (40.4%), yellow (31.5%), unopposed (25%), red (23.3%)
Green: blue (59.6%), green (50%), unopposed (41.7%), yellow (40.6%), red (31.5%)
Yellow: blue (68.4%), green (59.4%), unopposed (58.3%), yellow (50%), red (40.4%)
Red: blue (76.7%), unopposed (75%), green (68.5%), yellow (59.6%), red (50%)
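As a small illustration of how the orderings above can be built, the sketch below computes an unopposed chance as positive results over 36 and slots it into a color's opposed win rates. The counts for blue (9) and green (15) and the win rates are taken from the post; the function and variable names are hypothetical.

```python
from fractions import Fraction

DECK_SIZE = 36                              # "every possible result (36)"
POSITIVE_RESULTS = {"blue": 9, "green": 15}  # counts stated in the post


def unopposed_chance(color: str) -> float:
    """Unopposed success chance in percent: positive results / all results."""
    return float(100 * Fraction(POSITIVE_RESULTS[color], DECK_SIZE))


def difficulty_order(opposed: dict, unopposed: float) -> list:
    """Sort opponents (plus 'unopposed') from easiest to hardest."""
    chances = dict(opposed)
    chances["unopposed"] = unopposed
    return sorted(chances, key=chances.get, reverse=True)


# Blue's opposed win rates (%) from the post, original deck and altered deck 2.
blue_original = {"blue": 50.0, "green": 41.3, "yellow": 33.3, "red": 27.4}
blue_altered2 = {"blue": 50.0, "green": 40.4, "yellow": 31.5, "red": 23.3}

print(difficulty_order(blue_original, unopposed_chance("blue")))
print(difficulty_order(blue_altered2, unopposed_chance("blue")))
```

For blue this puts the unopposed attempt last on the original deck and between yellow and red on altered deck 2, matching the lists.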
|
|
|
Post by Dashing Inventor on Jul 10, 2014 17:53:47 GMT -8
I really love this discussion.
Just a note regarding design: When making changes to probabilities, I have to take into account how it will affect balance in all areas of the game. For example, if there are more high-returning flips (2 or 3 checks), that will significantly increase the amount of damage skilled characters can do, since checks are factored into damage. Since Simple System characters have a very limited number of life points, it could make combat unduly lethal. That's just an example of the ramifications these kinds of changes can have, and they need to be weighed carefully. It may at first seem like increasing the probability gap between colors is a good thing, when in reality it may serve to throw off balance. This can be counteracted by making adjustments to other areas of the game, but the bottom line is that making changes can be an involved process.
|
|
|
Post by directedbyme on Jul 11, 2014 9:10:52 GMT -8
What if you only used the difference between the successes? In other words, if there were 3 checks on a flip to attack and the other person had 1 check to defend, the damage added would only be +2. That would add a little bit of math, though.
|
|
|
Post by Ziphion on Jul 11, 2014 14:04:50 GMT -8
DI: That's a good point about damage, I hadn't thought about that. Although now that I'm looking at it, using the original deck or altered deck 1, the average damage dealt on a successful hit doesn't actually change from blue to green to yellow to red; all of them are 1.67 (edit: not including crits), with the exception of original yellow (1.72) and original red (1.71). That seems surprising, but if you think about it, the ratio of singles to doubles to triples is the same (or nearly the same) for those two decks no matter your color. Altered deck 2 actually gives you better average damage for rotating the deck: damage for blue is 1.67, green is 2, yellow is 2.11, and red is 2.17. The damage increases with diminishing returns for each rotation. I think I like that better than having it be constant.
directedbyme: A little more arithmetic wouldn't be so bad, since armor pips are already subtracted from damage, but I kind of like the fact that a defender's evasion and damage reduction are independent variables. That is, the defender can put on better armor to become more resistant to damage, or they can conceal themselves in thick foliage or smoke to become more evasive.
|
|
|
Post by Dashing Inventor on Jul 11, 2014 14:48:40 GMT -8
Yeah, I really like AltDeck2, the distribution is clean and even no matter how you look at it. It's faithful to the original (which has been extensively tested) but knocks off some of the rough edges, without throwing off the combat balance.
The issue with subtracting defense checks from damage checks is that it adds another step to the damage process, which I am always hesitant to do.
|
|
|
Post by Ziphion on Jul 13, 2014 12:08:22 GMT -8
Hello all! I've implemented multiple flips in my simulation, and I want to share my results with you.
Firstly, here's how my code works. I've gone over it many times and I'm confident it is accurately simulating my understanding of how this gameplay mechanic functions, but please let me know if my understanding is flawed.
When the “color” value is -1 (indicating double blue), flip two cards, look at the “color 0” (blue) values, and begin a tally of checks and strikes. If the first card is a critical success, treat it like a single success; if the first card is a critical failure, treat it like a single failure. If card 1 has checks, add the number of checks to the check tally; otherwise add strikes to the strike tally. Repeat for card 2. Return whichever is greater, the check tally or strike tally. If they are equal, flip another card, add its checks/strikes to the appropriate tally, and return whichever is greater after the tiebreaker. Reshuffle the deck when there are 4 or fewer cards left in it. The code is similar for colors -2 (triple blue), 4 (double red), and 5 (triple red).
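The tally procedure described above can be sketched roughly as follows. This is not the simulation's actual code: the `Deck` class, the signed-integer card model (+n for n checks, -n for n strikes) and all names are assumptions made for illustration. Criticals are assumed to be stored as +1 or -1 already, and any special handling of a critical on the last card is left out.

```python
import random

RESHUFFLE_THRESHOLD = 4  # from the post: reshuffle when 4 or fewer cards remain


class Deck:
    """Hypothetical deck: one signed result per card for a single color."""

    def __init__(self, results, seed=None):
        self._cards = list(results)
        self._rng = random.Random(seed)
        self._pile = []
        self.reshuffle()

    def reshuffle(self):
        self._pile = list(self._cards)
        self._rng.shuffle(self._pile)

    def cards_left(self):
        return len(self._pile)

    def flip(self):
        return self._pile.pop()


def resolve_multiflip(deck, n_cards):
    """Return +checks on a success or -strikes on a failure."""
    if deck.cards_left() <= RESHUFFLE_THRESHOLD:
        deck.reshuffle()
    flipped = [deck.flip() for _ in range(n_cards)]
    checks = sum(r for r in flipped if r > 0)
    strikes = sum(-r for r in flipped if r < 0)
    if checks == strikes:
        # Tie: flip one more card and add it to the matching tally.
        tiebreaker = deck.flip()
        if tiebreaker > 0:
            checks += tiebreaker
        else:
            strikes += -tiebreaker
    return checks if checks > strikes else -strikes
```

Because every card is assumed to carry a non-zero result, a single tiebreaker card always settles the outcome. The reshuffle check runs only before an attempt, so an attempt never reshuffles partway through.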
AltDeck1, Original Multiflip Scheme (only 100,000 opposed flips per value, so some of the values may be off by roughly 0.2%)
| row vs. column | 3 blue (-2) | 2 blue (-1) | blue (0) | green (1) | yellow (2) | red (3) | 2 red (4) | 3 red (5) |
| 3 blue (-2) | 50.0% | 36.4% | 25.9% | 25.0% | 24.5% | 23.9% | 22.4% | 20.5% |
| 2 blue (-1) | 63.9% | 49.9% | 31.3% | 29.4% | 27.8% | 26.0% | 23.7% | 22.3% |
| blue (0) | 74.4% | 68.6% | 50.0% | 41.6% | 34.0% | 26.9% | 26.0% | 23.8% |
| green (1) | 74.9% | 70.6% | 58.3% | 50.0% | 41.9% | 34.0% | 27.8% | 24.6% |
| yellow (2) | 75.8% | 72.4% | 66.1% | 58.1% | 50.0% | 41.6% | 29.4% | 25.0% |
| red (3) | 76.1% | 74.0% | 73.1% | 66.0% | 58.4% | 49.9% | 31.6% | 25.8% |
| 2 red (4) | 77.7% | 76.1% | 74.1% | 72.4% | 70.5% | 68.6% | 50.0% | 36.2% |
| 3 red (5) | 79.3% | 77.6% | 76.2% | 75.4% | 75.0% | 74.2% | 63.7% | 50.0% |
AltDeck2, Original Multiflip Scheme (1,000,000 opposed flips per value)
| row vs. column | 3 blue (-2) | 2 blue (-1) | blue (0) | green (1) | yellow (2) | red (3) | 2 red (4) | 3 red (5) |
| 3 blue (-2) | 50.0% | 34.4% | 22.9% | 22.2% | 21.6% | 20.9% | 18.9% | 17.6% |
| 2 blue (-1) | 65.6% | 50.0% | 28.9% | 26.4% | 24.0% | 21.7% | 19.9% | 19.1% |
| blue (0) | 77.1% | 71.1% | 50.0% | 40.4% | 31.5% | 23.3% | 21.8% | 20.9% |
| green (1) | 77.8% | 73.6% | 59.6% | 50.0% | 40.7% | 31.5% | 24.1% | 21.6% |
| yellow (2) | 78.5% | 75.9% | 68.4% | 59.4% | 50.0% | 40.4% | 26.4% | 22.3% |
| red (3) | 79.1% | 78.3% | 76.7% | 68.5% | 59.6% | 50.0% | 28.9% | 22.9% |
| 2 red (4) | 81.0% | 80.1% | 78.3% | 76.0% | 73.6% | 71.1% | 50.0% | 34.5% |
| 3 red (5) | 82.4% | 81.0% | 79.1% | 78.5% | 77.8% | 77.1% | 65.5% | 50.0% |
The way multiple flips work now, if you’re single blue (0) or higher and you’re facing a triple or double blue (-2 or -1), your chances of success are 74.8 ± 3.1% for AltDeck1, or 78.0 ± 3.2% for AltDeck2. That’s right - against the weakest or second-weakest possible opponent in the entire game, it doesn’t matter how strong you are, your chance to succeed is about 75%. Similarly, when facing a triple red, the strongest possible opponent, if you’re a single red or lower (even triple blue!), your chance of success is about 25%. Consider AltDeck2: A -2 flip against a +5 flip has an 18% chance, while a +3 flip against a +5 flip has a 23% chance (just 5 percentage points higher). That’s the difference between a 4 and a 5 on a twenty-sided die, after 6 level-ups (essentially an entire campaign). When considering the change in probabilities from rotating your deck, adding in multiple flips reduces the average from 9.1% down to 6.8 ± 6.2% (huge variance, by the way). Not to mention, the weakest possible character shouldn’t even be able to lay a scratch on the strongest possible character except in rare circumstances like critical hits; as it is, it’s almost as easy as flipping two coins and having both land on heads.
You can see above how the probabilities abruptly hit ~80% and ~20%, and stay there.
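The 74.8 ± 3.1% and 78.0 ± 3.2% figures can be reproduced from the two tables. The sketch below takes the twelve relevant cells (single blue through triple red, each against triple blue and double blue) and reports their mean and spread. The post doesn't say how the ± was computed; using the sample standard deviation is an assumption that happens to match.

```python
from statistics import mean, stdev

# Win rates (%) for blue (0) through 3 red (5), each listed as
# (vs. 3 blue, vs. 2 blue), copied from the tables in the post.
ALTDECK1 = [74.4, 68.6, 74.9, 70.6, 75.8, 72.4, 76.1, 74.0, 77.7, 76.1, 79.3, 77.6]
ALTDECK2 = [77.1, 71.1, 77.8, 73.6, 78.5, 75.9, 79.1, 78.3, 81.0, 80.1, 82.4, 81.0]


def summarize(values):
    """Return (mean, sample standard deviation), each rounded to 0.1."""
    return round(mean(values), 1), round(stdev(values), 1)


print("AltDeck1:", summarize(ALTDECK1))
print("AltDeck2:", summarize(ALTDECK2))
```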
I tested an alternative multiple-flip scheme, one that’s similar to an existing Simple System mechanic: complications and hero cards. For example, if you’re at -1 in a skill, you flip two blues, but instead of using the greatest number of strikes or checks for the result, you simply pick the worst result, exactly like a complication. If you’re at a -2, flip three blues, pick the worst result, like two complications. Similarly, if you’re at 4 in a skill, you flip two reds and pick the better result; if you’re at 5, flip three reds and pick the best, like using a hero point.
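A minimal sketch of that pick-the-worst / pick-the-best rule, assuming each flipped card is reduced to a signed integer (+n for n checks, -n for n strikes). The function names are hypothetical, and how criticals rank against ordinary results is not modeled here.

```python
def cards_to_flip(level: int) -> int:
    """Number of cards flipped at a skill level from -2 to 5."""
    if level < 0:
        return 1 - level   # -1 flips two blues, -2 flips three
    if level > 3:
        return level - 2   # 4 flips two reds, 5 flips three
    return 1               # core colors flip a single card


def resolve_herocomp(flipped, level: int) -> int:
    """Keep the worst card below blue, the best card above red."""
    if len(flipped) != cards_to_flip(level):
        raise ValueError("wrong number of cards for this level")
    if level < 0:
        return min(flipped)
    if level > 3:
        return max(flipped)
    return flipped[0]


# Double blue keeps the strike; double red keeps the double check.
print(resolve_herocomp([2, -1], -1), resolve_herocomp([2, -1], 4))
```

Compared with the tally scheme, no addition is involved: the outcome is always one of the cards actually on the table.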
The immediate benefit is that the probabilities are nicer, but there are other benefits to this as well. I only tested AltDeck2 for this since it seems to be a favorite here.
AltDeck2, "HeroComp" Multiflip Scheme (1,000,000 opposed flips per value)
| row vs. column | 3 blue (-2) | 2 blue (-1) | blue (0) | green (1) | yellow (2) | red (3) | 2 red (4) | 3 red (5) |
| 3 blue (-2) | 50.0% | 36.1% | 18.9% | 14.8% | 11.2% | 8.0% | 0.7% | 0.1% |
| 2 blue (-1) | 63.9% | 50.0% | 29.2% | 22.4% | 16.3% | 10.8% | 1.7% | 0.7% |
| blue (0) | 81.1% | 70.8% | 50.0% | 40.4% | 31.5% | 23.3% | 10.8% | 8.0% |
| green (1) | 85.2% | 77.7% | 59.6% | 50.0% | 40.6% | 31.6% | 16.3% | 11.2% |
| yellow (2) | 88.8% | 83.7% | 68.5% | 59.4% | 50.0% | 40.5% | 22.4% | 14.8% |
| red (3) | 92.0% | 89.1% | 76.7% | 68.5% | 59.6% | 50.0% | 29.2% | 18.9% |
| 2 red (4) | 99.3% | 98.3% | 89.1% | 83.7% | 77.7% | 70.9% | 50.0% | 36.1% |
| 3 red (5) | 99.9% | 99.3% | 92.0% | 88.8% | 85.2% | 81.0% | 63.8% | 50.0% |
Here, when you calculate the overall average change in probabilities from rotations, it’s 9.3% (very close to the single-flip 9.1%!), plus or minus 5.5%. You’ll notice that the probability change between double red and triple red vs. triple blue is negligible (0.6%), but that’s because we’re getting very close to 100%, a more sensible asymptotic limit than 80%, if you ask me. The strongest possible character has a 99.9% chance of success against the weakest possible character. We’re talking extremes here. In the unlikely event that the low extreme has an opportunity to go head-to-head against the high extreme, his chance of success is 0.1%, as opposed to the roughly 20% from the original multiflip scheme.
Another benefit to using this “HeroComp” system is that it’s simpler, in my opinion. Instead of doing any math at all, you just look down at your cards and pick the highest or lowest result. It also pairs nicely with actual hero points and complications. Flipping blue (0) with a complication? That’s exactly the same as flipping double blue (-1). Flipping triple red (5) with complication? Exactly the same as double red (4). Flipping red (3) with a hero point? Exactly the same as triple red (5).
Upgrading a skill from single red to double red (or from double blue to blue) gives you a more sizable boost to your odds than any of the core-color rotations. If you only have opponents that are flipping core colors (no multiples), then blue to green / green to yellow / yellow to red gives you a 9.1% boost (as discussed earlier), while red to double red gives you 16.6%. If you include multi-flipping opponents, the probabilities are 7.0% and 15.1%. Here’s a thought: Maybe advancing your character beyond red costs two stat points instead of just one? Something to consider.
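The boost figures in the previous paragraph can be recomputed from the "HeroComp" table. The sketch below averages the gain from moving up one level over a chosen set of opponents; the table values are copied from the post, while the function and constant names are made up for illustration.

```python
LEVELS = [-2, -1, 0, 1, 2, 3, 4, 5]

# "HeroComp" AltDeck2 win rates (%): HEROCOMP[row][column], row vs. column.
HEROCOMP = [
    [50.0, 36.1, 18.9, 14.8, 11.2, 8.0, 0.7, 0.1],
    [63.9, 50.0, 29.2, 22.4, 16.3, 10.8, 1.7, 0.7],
    [81.1, 70.8, 50.0, 40.4, 31.5, 23.3, 10.8, 8.0],
    [85.2, 77.7, 59.6, 50.0, 40.6, 31.6, 16.3, 11.2],
    [88.8, 83.7, 68.5, 59.4, 50.0, 40.5, 22.4, 14.8],
    [92.0, 89.1, 76.7, 68.5, 59.6, 50.0, 29.2, 18.9],
    [99.3, 98.3, 89.1, 83.7, 77.7, 70.9, 50.0, 36.1],
    [99.9, 99.3, 92.0, 88.8, 85.2, 81.0, 63.8, 50.0],
]

CORE = [0, 1, 2, 3]   # blue, green, yellow, red
ALL = LEVELS


def mean_boost(from_levels, opponents):
    """Average gain (percentage points) from advancing one level."""
    gains = []
    for level in from_levels:
        row = LEVELS.index(level)
        for opponent in opponents:
            col = LEVELS.index(opponent)
            gains.append(HEROCOMP[row + 1][col] - HEROCOMP[row][col])
    return sum(gains) / len(gains)


print(f"core rotation vs. core opponents: {mean_boost([0, 1, 2], CORE):.2f}")
print(f"core rotation vs. all opponents:  {mean_boost([0, 1, 2], ALL):.2f}")
print(f"red to 2 red vs. core opponents:  {mean_boost([3], CORE):.2f}")
print(f"red to 2 red vs. all opponents:   {mean_boost([3], ALL):.2f}")
```

The results land on or within rounding of the quoted 9.1%, 7.0%, 16.6% and 15.1%.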
What do you guys think?
|
|
|
Post by Dashing Inventor on Jul 13, 2014 19:34:38 GMT -8
Excellent work.
|
|
|
Post by directedbyme on Jul 14, 2014 8:59:53 GMT -8
Love the statistics for "HeroComp" Alt2 deck. Also I like the idea of advancing beyond red costing 2. But what about if it also costs 2 to go from double blue to blue?
|
|
|
Post by Arcanet on Jul 14, 2014 9:32:34 GMT -8
We already start at Blue by default, unless you use a custom race. This will likely result in changes to the baseline stats/costs for custom race attributes, but as we've not yet been told how to handle those, discussing them beyond theoretical what-if is hard.
Costing two per point for 3-Blue -> 2-Blue -> Blue could be a good idea too. That way if you really want you can play f.ex. an Ogre Wizard, but because the default Ogre is as bright as a muddy boot you'll need to sacrifice more attributes to Intellect.
On a related note, when making a character, you're only allowed to put one point of the four to an attribute. What if this only applied to Green(1)+, so that if you so wish you could bring your sub-Green attributes up to Green?
|
|
|
Post by Dashing Inventor on Jul 14, 2014 19:14:44 GMT -8
One of the features of Simple System is fast and easy character advancement; varying the cost of advancing your character from one level to the next would probably adversely affect that. By the time you advance enough in any one ability to go from +3 to +4, you've already made a significant investment in terms of your character's overall potential and therefore I don't consider the jump in potency it affords to be broken.
Changing the way multiflips are resolved would have one significant impact - the effect of shuffling your deck. Previously I had left it to the discretion of each game group to allow reshuffling at any time, or to require players to wait until their resolution deck had been exhausted to reshuffle. This would allow for some degree of player control over probabilities and therefore an acceptable degree of strategy. However, letting players who flip multiple cards choose critical results whenever they come up (as opposed to only counting the result as critical if it is the last card flipped in a multiflip) would give those players too strong an advantage. They would have a strong chance of flipping a critical in a fresh deck, and any time they flipped a critical they could simply reshuffle and have the same strong chance for their next attempt. Therefore I think it would be necessary to disallow reshuffling at any time.
|
|
|
Post by Ziphion on Jul 14, 2014 20:45:59 GMT -8
I'm actually testing different reshuffling strategies as we speak (using the "HeroComp" multiflips), and it looks like the best you can do to improve your odds by smartly reshuffling is about a 2% increase, approximately 1/5 of a deck rotation. Basically I've given each check result a "positivity" value, with criticals counting for a lot, and single checks counting for not so much. The deck's positivity goes up when you draw strikes, and it goes down when you draw checks. Once the deck's overall positivity (and therefore your chance of getting a favorable result) drops below that of a newly shuffled deck, the simulation goes ahead and shuffles it. I've been playing around with different positivity weights, and pitting two decks against each other, one with the smart shuffling strategy, and one just shuffling when it's almost out of cards. Like I said, the highest edge I've been able to get that way is about 2%.
Oh, I should mention - the shuffling I'm talking about only happens here before each flip attempt is made. The deck doesn't reshuffle after drawing one of three cards, or anything like that. I think it's a fair rule to say you can't reshuffle in the middle of an attempt, you must wait until you've flipped all cards that are part of the attempt. Otherwise it would get ridiculous very quickly.
Edit: To clarify, that 2% is an average over all colors. A triple blue that's shuffling smartly has a 3.8% edge vs a triple blue that's only shuffling when the deck runs out, but only a 1.7% improvement of their chances vs. a single blue, and no improvement at all vs a triple red. (Note that those numbers are only for 100,000 flips, so these numbers might be off by a tiny bit; the randomness evens out when averaging.) For the single-flip colors, smartly shuffling actually helps you more than for the multi-flippers; they can improve their chances by (on average) 2.4%, while multi-blues and multi-reds can only hope for an average of 1.5%.
These results are the highest improvement to a color's chances via shuffling strategy I could find yesterday; I simply took the check value and cubed it (check value ^ 3) and set that equal to the positivity. Criticals' check value I've set to 16, for reasons that don't really make sense anymore; I probably should lower that now that the highest possible check value is only 3 (more on that in a minute). Anyway, that means critical successes have positivity = 4096, triples = 27, doubles = 8, singles = 1, and equal and opposite values for failures. So the deck is always shuffled when a critical success is played, unless a critical failure or two or three came before. Other shuffling schemes I tried: check value + 5 (so almost equal weight to all successes, with a little bit of preference for the higher successes), check value +2, check value, check value ^2, ^4, ^5. Thus far ^3 seems the best. Do you guys have any other suggestions for simulating "smart shuffling"?
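Here is a minimal sketch of the cubed-positivity rule described above. It assumes signed check values (+n for checks, -n for strikes, ±16 for criticals as in the post) and interprets "overall positivity" as the sum over the cards still in the deck, so the deck is worse than a fresh one exactly when the drawn cards sum to a positive number. That interpretation and the function names are assumptions.

```python
CRITICAL_VALUE = 16  # check value assigned to criticals in the post


def positivity(check_value: int, exponent: int = 3) -> int:
    """Signed weight of a result: sign(value) * |value| ** exponent."""
    sign = 1 if check_value > 0 else -1
    return sign * abs(check_value) ** exponent


def should_reshuffle(drawn, exponent: int = 3) -> bool:
    """Reshuffle once the remaining deck is worse than a fresh one.

    Good cards leaving the deck lower its positivity, so the remaining
    deck is below the fresh baseline when the drawn cards sum to > 0.
    """
    return sum(positivity(v, exponent) for v in drawn) > 0


print(positivity(CRITICAL_VALUE), positivity(3), positivity(2), positivity(1))
print(should_reshuffle([CRITICAL_VALUE]))                   # crit success drawn
print(should_reshuffle([-CRITICAL_VALUE, CRITICAL_VALUE]))  # crit failure came first
```

The other weightings mentioned (check value, ^2, ^4, ^5) can be tried by changing `exponent`; the "+5" and "+2" variants would need a different weight function.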
The fact that the highest check value is now only 3 may seem to have an undesirable impact on damage for double/triple red at first glance. But I think it's easy to get around that by letting players choose different cards for their success and their weapon damage. For example, if they have a yellow weapon, and they're flipping double red, and one card is a strike, and the other is a double check, but the strike card has more yellow pips than the double check card: how about they use the higher of the two check results and the higher of the two pip results, even if they aren't on the same card? That way their average damage goes up a little bit as they advance beyond red. In fact, if the gains in damage aren't as much as previous levels, all the better - that'll "even out" their gains in accuracy a bit.
|
|
|
Post by Ziphion on Jul 15, 2014 9:57:22 GMT -8
A friend and I were talking about crits during lunch today, and I realized that the increased chance of criticals when flipping multiples is a bigger deal than I thought. Flipping three cards with a full deck gives you a 23.6% critical chance. That's crazy.
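For reference, a figure like this comes from a draw-without-replacement calculation. The post doesn't say how many critical cards the deck holds; the sketch below assumes 3 criticals in a 36-card deck, a guess chosen only because it reproduces the 23.6% quoted above.

```python
from math import comb


def chance_of_at_least_one(special: int, deck_size: int, flips: int) -> float:
    """P(at least one special card among `flips` cards drawn without replacement)."""
    return 1 - comb(deck_size - special, flips) / comb(deck_size, flips)


# Assumed deck composition: 36 cards, 3 of them criticals (not stated in the post).
p = chance_of_at_least_one(special=3, deck_size=36, flips=3)
print(f"{100 * p:.1f}%")
```

The same function gives the single-flip chance with `flips=1`, which makes it easy to see how quickly the odds climb with each extra card.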
What if, when flipping multiples for red or blue, criticals only count if they're the last card flipped, like before? Sort of a combination of the two multiflip methods. Two benefits: 1) critical chance is equal for everyone; 2) hero points and complications feel more special, because they're the only way to change critical chance. If you've got +5 and you're flipping triple red and draw a critical success on card 2, tough luck; it's a single success. But if you've got +3 and you spend a hero point to flip three reds, you can use that second card as the critical it was meant to be. Same with complications: they make critical failures more common.
When I get home today I'll test out this system and see how the numbers look. I have a feeling it will look even smoother than "HeroComp".
|
|