|
Post by Arcanet on Jul 15, 2014 14:00:27 GMT -8
What would the successes/failures look like if non-last crits counted as triples instead of singles? That would make a first- or second-draw crit on a 3-red flip feel good instead of "aww bugger".
Essentially making crits Triple+, as opposed to Single+.
|
|
Ziphion
Full Member
Resident Mathematician
Posts: 132
|
Post by Ziphion on Jul 15, 2014 17:10:04 GMT -8
Did some more tests! These results are averaged from two more 8x8 tables of results like the others, with each value in those tables representing 100,000 opposed flips, shuffling when the deck's empty. So I guess these new numbers represent 12.8 million flips. "OHCH" stands for "Original-HeroComp-Hybrid" (I'm not great with names), "OHCH 1" means criticals that aren't the last card drawn count as singles, "OHCH 3" means they count as triples.
Read this table like "When moving from [color1] to [color2], my odds improve by [result (± uncertainty) %] against an average opponent." If the uncertainty is high, it means there's a wide variance in the result for different opponents.
| | Original | HeroComp | OHCH 3 | OHCH 1 |
| --- | --- | --- | --- | --- |
| 3 blue to 2 blue | 5.9 (± 6.2) % | 6.9 (± 5.4) % | 6.2 (± 5.1) % | 5.4 (± 4.6) % |
| 2 blue to 1 blue | 10.1 (± 8.2) % | 15.1 (± 5.1) % | 14.6 (± 4.1) % | 13.5 (± 2.8) % |
| blue to green | 5.3 (± 4.1) % | 7.0 (± 2.5) % | 7.2 (± 2.4) % | 7.7 (± 1.9) % |
| green to yellow | 5.3 (± 4.1) % | 7.0 (± 2.5) % | 7.2 (± 2.4) % | 7.7 (± 1.9) % |
| yellow to red | 5.3 (± 4.1) % | 7.0 (± 2.5) % | 7.2 (± 2.4) % | 7.7 (± 1.9) % |
| 1 red to 2 red | 10.1 (± 8.2) % | 15.1 (± 5.1) % | 14.6 (± 4.1) % | 13.5 (± 2.8) % |
| 2 red to 3 red | 5.9 (± 6.2) % | 6.9 (± 5.4) % | 6.2 (± 5.1) % | 5.4 (± 4.6) % |
| AVERAGE | 6.8 (± 6.2) % | 9.3 (± 5.5) % | 9.0 (± 5.1) % | 8.7 (± 4.4) % |
Just going by these numbers, my favorite of the four is the last column, OHCH 1. The uncertainty is a little lower, meaning there's more consistency in the performance gains you get from improving a stat. Also, I don't think treating crits as singles unless they're the last card is too "aww bugger"-y... that's just how games of chance go, and critical failures work the same way, and your opponent has the same luck that you do. Nothing like playing D&D and rolling 1's and 2's for your attacks, only to roll a natural 20 on something dumb like a History check.
|
|
|
Post by Dashing Inventor on Jul 15, 2014 17:23:46 GMT -8
Yeah, your chances of flipping a critical success are pretty strong if flipping 3+ cards on a fresh deck. I also like the idea that Hero Cards have an intrinsic advantage over straight flips, in that you could treat any critical cards you flip as such.
I will probably retain the single check/strike for criticals, as well as only counting them as criticals if they are the last card flipped.
Here is an alternative multiflip scheme: If you show any checks when flipping red, you succeed and count all checks, and if you show any strikes when flipping blue, you fail, counting all strikes. This works exactly the same as the herocomp scheme, and allows you to keep all your checks/strikes as in the original rules. This means that extremely skilled players still have the potential for greater damage, as well as the more satisfying probability for success against less skilled opponents.
|
|
|
Post by Ziphion on Jul 15, 2014 18:08:15 GMT -8
Oh, I like that a lot. I think it will be slightly different than the herocomp scheme, since if a double red flips ✓ and ✓✓✓, and his yellow opponent flips ✓✓✓, that would be a win for red in the new scheme, and a tie for herocomp. I'm assuming if a red flips only strikes, they can choose the best strike result? And for blue, if they get no strikes, they must take the worst check result? I'll test it out ASAP.
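A minimal sketch of that example, to make the difference between the two schemes concrete. It assumes checks are encoded as positive integers and strikes as negative ones, and it takes the "best strike if red flips only strikes" reading from the question above; both the encoding and the function names are illustrative assumptions, not the actual simulation code.

```python
def herocomp_result(flips):
    """HeroComp-style multiple-red result: keep only the best single card."""
    return max(flips)


def count_all_checks_result(flips):
    """Alternative scheme for multiple red: any check succeeds and all checks count.

    Assumption: if no checks show at all, red takes its best (least bad) strike.
    """
    checks = [f for f in flips if f > 0]
    return sum(checks) if checks else max(flips)


# Double red flips one check and three checks; yellow flips three checks.
double_red = [1, 3]
yellow = [3]

print(herocomp_result(double_red), "vs", herocomp_result(yellow))                  # 3 vs 3
print(count_all_checks_result(double_red), "vs", count_all_checks_result(yellow))  # 4 vs 3
```

Under these assumptions the same flips are a tie when only the best card counts and a win for red when all checks are summed.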
|
|
|
Post by Dashing Inventor on Jul 15, 2014 18:36:05 GMT -8
That sounds good. Looking forward to your results.
|
|
|
Post by Arcanet on Jul 16, 2014 1:42:30 GMT -8
What if a double red flips ✓ and ✓✓, and his yellow opponent flips ✓✓✓, should that be a win for yellow on the basis of ✓✓+✓< ✓✓✓?
|
|
|
Post by Dashing Inventor on Jul 16, 2014 1:50:12 GMT -8
Probably go off of the initiative numbers, using the top card in the case of a multiflip.
|
|
|
Post by Arcanet on Jul 16, 2014 2:42:12 GMT -8
Much easier and unambiguous, true.
I think handling all ties like that would be good, unless you happen to flip exactly the same card, in which case an extra tiebreaker is needed. I think that is already the case, though(?).
|
|
|
Post by Dashing Inventor on Jul 16, 2014 3:53:57 GMT -8
Yep.
|
|
|
Post by Ziphion on Jul 16, 2014 6:24:10 GMT -8
So, about tiebreakers... My code works like this:
    checkResult1 = 0
    checkResult2 = 0
    while checkResult1 == checkResult2 {
        checkResult1 = (function that draws cards and returns the check result for deck 1, then puts those cards in discard pile)
        checkResult2 = (same for deck 2)
    }
    if checkResult1 > checkResult2, deck 1 wins
    else deck 2 wins
So in the event of a tie, the program just discards all cards involved and tries again. That way ties are effectively removed from the equation. Am I supposed to just draw one card for tiebreakers involving multiple flips? If so, how exactly does that work?
If I go by initiative for tiebreakers instead, the chance of breaking a tie is 50/50 no matter what colors are involved. That will push all of the probability changes for deck rotations a bit closer to each other, which I think is undesirable.
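For anyone who wants to try the redraw-on-tie approach, here is a rough runnable sketch of the loop described above. The deck helper is a hypothetical stand-in (draw without replacement, reshuffle when empty) and the card values are placeholders, not the real deck.

```python
import random


def make_drawer(cards, rng):
    """Hypothetical deck: draws without replacement, reshuffles when empty."""
    pile = []

    def draw():
        if not pile:
            pile.extend(cards)
            rng.shuffle(pile)
        return pile.pop()

    return draw


def opposed_flip(draw1, draw2):
    """Flip for both sides until the results differ; return the winning deck (1 or 2)."""
    while True:
        result1, result2 = draw1(), draw2()
        if result1 != result2:
            return 1 if result1 > result2 else 2


rng = random.Random(0)
placeholder_cards = [-3, -2, -1, 1, 2, 3]  # assumed values, not the real distribution
deck1 = make_drawer(placeholder_cards, rng)
deck2 = make_drawer(placeholder_cards, rng)
wins = sum(opposed_flip(deck1, deck2) == 1 for _ in range(10_000))
print(f"deck 1 won {wins} of 10000 opposed flips")
```

Because tied flips are simply discarded and redrawn, every call returns a winner, which is what removes ties from the win percentages.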
P.S. Yay custom title!
|
|
|
Post by directedbyme on Jul 16, 2014 10:33:07 GMT -8
Thanks for all your mathematical magic, Ziphion!
|
|
|
Post by Ziphion on Jul 16, 2014 16:15:25 GMT -8
DI, it seems like the alternative multiflip scheme you suggested exhibits some of the unfortunate traits of the original multiflip scheme. Namely, if you're facing a triple red or triple blue, it really doesn't matter whether you're blue, green, yellow, or red, your chances are about the same (11% or 89% respectively, ±2%). Also, rotating from single color to single color gives you a 6.1% boost, but moving into double red or out of double blue gives you a whopping 17.1% boost, almost three times higher. Compare these two systems, and notice how much farther away from the "pack" the dotted lines are when counting all checks/strikes (this indicates big jumps in likelihood of success):
What about this: pick only the best check result on the card to determine success, but you can sum up all of the checks on all flipped cards for damage? (only for multiple reds, of course.) Would that be too complicated?
|
|
|
Post by Dashing Inventor on Jul 16, 2014 20:16:32 GMT -8
Yeah, that idea had some unexpected consequences I didn't consider at first. I thought about choosing best for your results and counting all checks as damage (as you mentioned above), which is slightly more complicated but probably the best compromise.
|
|
|
Post by Arcanet on Jul 17, 2014 2:37:15 GMT -8
I have a feeling it only seems complicated for us, as we're actively discussing probabilities and chances of success/failure. But once the rule text explaining the feature is formulated, the person reading the rules likely goes "3, 2 and 2 checks against his 2 check defense without armor, so that's 7 damage aww yes *Dance*".
|
|
|
Post by Ziphion on Jul 17, 2014 18:08:33 GMT -8
Guys guys guys... I made a mistake. I actually didn't implement criticals correctly in my posts about OHCH. When I ran those tests, multiple blue could literally never get a critical success, and multiple red could never get a critical fail, since it wasn't overriding the check result with the critical when it was the last card played. Meaning, even if double blue flipped two critical successes, the result would be 1, because any crit except for the last one counts as 1, and 1 < critical. I fixed it so that when the last card is a critical, the result is guaranteed to be a critical, and reran the sim (with only 200,000 flips per entry because I was eager to see the results).
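A sketch of the corrected rule, as far as it can be read from the description above. The encoding is an assumption: checks are positive integers, strikes negative, and criticals are represented as plus or minus infinity so they beat any ordinary card. This is not the actual simulation code.

```python
import math

CRIT_SUCCESS = math.inf   # assumed encoding: beats every ordinary check
CRIT_FAIL = -math.inf     # assumed encoding: loses to every ordinary strike


def resolve(flips, choose):
    """Resolve a flip of one or more cards, given in draw order.

    choose is max for multiple red (best result) or min for multiple blue (worst).
    """
    last = flips[-1]
    if last in (CRIT_SUCCESS, CRIT_FAIL):
        return last  # the fix: a critical on the last card always overrides
    # Criticals that are not the last card count as a single check or strike.
    downgraded = [
        1 if card == CRIT_SUCCESS else -1 if card == CRIT_FAIL else card
        for card in flips
    ]
    return choose(downgraded)


# Double blue flipping two critical successes is now a critical, not a 1.
print(resolve([CRIT_SUCCESS, CRIT_SUCCESS], min))  # inf
# A critical success drawn first, then a 2: double blue takes the worse result.
print(resolve([CRIT_SUCCESS, 2], min))             # 1
```

The early return is the whole fix: without it, multiple blue picks the downgraded 1 over the critical, and multiple red picks anything over a critical fail.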
OHCH1 with criticals properly implemented, AltDeck2:
| | 3 blue (-2) | 2 blue (-1) | blue (0) | green (1) | yellow (2) | red (3) | 2 red (4) | 3 red (5) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3 blue (-2) | 50.2% | 41.8% | 31.1% | 25.7% | 20.9% | 16.6% | 15.0% | 14.7% |
| 2 blue (-1) | 58.1% | 50.1% | 37.9% | 30.5% | 23.8% | 18.0% | 15.1% | 14.9% |
| blue (0) | 68.8% | 62.1% | 50.0% | 40.3% | 31.6% | 23.2% | 18.0% | 16.6% |
| green (1) | 74.4% | 69.5% | 59.6% | 49.9% | 40.6% | 31.6% | 23.8% | 20.9% |
| yellow (2) | 79.1% | 76.1% | 68.5% | 59.4% | 50.0% | 40.5% | 30.5% | 25.7% |
| red (3) | 83.4% | 82.0% | 76.7% | 68.4% | 59.6% | 50.1% | 38.1% | 31.3% |
| 2 red (4) | 85.1% | 84.8% | 82.0% | 76.2% | 69.6% | 62.2% | 50.3% | 41.7% |
| 3 red (5) | 85.3% | 85.1% | 83.4% | 79.0% | 74.4% | 68.7% | 58.3% | 49.9% |
When moving from [color1] to [color2], one's odds improve by [result (± uncertainty) %] against an average opponent:
3 blue to 2 blue, or 2 red to 3 red: 4.1 (± 3.4) %
2 blue to 1 blue, or 1 red to 2 red: 7.8 (± 4.1) %
single-flip color to single-flip color: 7.5 (± 2.0) %
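These step figures can be roughly reproduced from the rounded win table. The sketch below assumes each row is the flipping color and each column the opponent, and that the ± value is a sample standard deviation across the eight opponents; that last part is an inference from the numbers, not something stated in the thread.

```python
from statistics import mean, stdev

LEVELS = ["3 blue", "2 blue", "blue", "green", "yellow", "red", "2 red", "3 red"]

# WIN[row][col]: row color's win percentage against column color (OHCH1, AltDeck2).
WIN = [
    [50.2, 41.8, 31.1, 25.7, 20.9, 16.6, 15.0, 14.7],
    [58.1, 50.1, 37.9, 30.5, 23.8, 18.0, 15.1, 14.9],
    [68.8, 62.1, 50.0, 40.3, 31.6, 23.2, 18.0, 16.6],
    [74.4, 69.5, 59.6, 49.9, 40.6, 31.6, 23.8, 20.9],
    [79.1, 76.1, 68.5, 59.4, 50.0, 40.5, 30.5, 25.7],
    [83.4, 82.0, 76.7, 68.4, 59.6, 50.1, 38.1, 31.3],
    [85.1, 84.8, 82.0, 76.2, 69.6, 62.2, 50.3, 41.7],
    [85.3, 85.1, 83.4, 79.0, 74.4, 68.7, 58.3, 49.9],
]


def step_gain(frm, to):
    """Mean and sample std dev of the win-rate gain when moving from row frm to row to."""
    diffs = [WIN[to][col] - WIN[frm][col] for col in range(len(LEVELS))]
    return mean(diffs), stdev(diffs)


for frm in range(len(LEVELS) - 1):
    gain, spread = step_gain(frm, frm + 1)
    print(f"{LEVELS[frm]} to {LEVELS[frm + 1]}: {gain:.1f} (± {spread:.1f}) %")
```

Since the table entries are already rounded, the output lands close to the quoted figures rather than matching them to the last digit.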
Look at this beautiful graph.
Now, I know what I said earlier about my distaste for asymptotes that aren't 0% and 100%. But guys. Look at how pretty that graph is. It's like it was meant to be. The changes between colors are super smooth, no sharp corners. And the asymptote is much better than before (14.7% is about 1 in 7), so it's not so bad; the least powerful character in the game might get in a hit or two against the most powerful character, but he still needs to get in several hits and avoid/sustain several hits to defeat him, which is much less likely. So I'm actually pretty okay with the limits being brought in a bit. I have a good feeling that this is the best we can do with a 36-card deck. But if you have suggestions for tweaks, please let me know and I'll test them.
So to recap: This data was taken using AltDeck2 (see the check distribution from my post on July 8 at 10:54am), with the rule that when one of your skills drops to -1 or -2, you flip two or three cards respectively, choosing the worst result. Similarly, if your skill rises to 4 or 5, you flip two or three cards and choose the best result. In both cases, critical successes and failures override this if they are the last card drawn; otherwise they count as single successes and failures.
By the way, I realized I had made this mistake in my code while I was calculating average positive check values today for each color, which tells you your average attack damage, not including weapon bonuses etc. When you just use the number of checks you have on your successful attack card, not including criticals, and not summing up all the checks on all the cards for multiple red flips like we talked about, here are the average damages (calculated analytically, so no random error here, though human error is always a possibility):
3 blue: 1.058
2 blue: 1.185
blue: 1.667
green: 2.000
yellow: 2.111
red: 2.167
2 red: 2.388
3 red: 2.572
So you can see that damage follows a nice, somewhat linear increase from -2 up to +5, increasing by about .22 per level. This is because 1's are very common for successful attacks at -2, and 3's are very common at +5. What about if we have 2 red and 3 red sum up the checks on all flipped cards for damage?
2 red: 3.433
3 red: 4.850
Huge spike in damage. I don't think that should be implemented; I think it adds complication and hurts balance.
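A quick arithmetic check of those damage figures, using only the numbers quoted above. The skill levels come from the table headers (3 blue is -2, 3 red is 5); the variable names are just for illustration.

```python
# Average damage per skill level when only the chosen card's checks count.
AVG_DAMAGE = {-2: 1.058, -1: 1.185, 0: 1.667, 1: 2.000,
              2: 2.111, 3: 2.167, 4: 2.388, 5: 2.572}

# Average damage when 2 red and 3 red sum the checks on all flipped cards.
SUMMED_DAMAGE = {4: 3.433, 5: 4.850}

levels = sorted(AVG_DAMAGE)
avg_step = (AVG_DAMAGE[levels[-1]] - AVG_DAMAGE[levels[0]]) / (len(levels) - 1)
print(f"average increase per level: {avg_step:.3f}")

for level, summed in SUMMED_DAMAGE.items():
    ratio = summed / AVG_DAMAGE[level]
    print(f"level {level}: summed damage is {ratio:.2f}x the best-card damage")
```

The end-to-end slope works out to a bit over 0.21 per level, in line with the "about .22" estimate, while summing checks pushes 3 red to nearly double its best-card damage.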
Questions? Comments? Suggestions?
|
|