PUGing through Incarnate Trials 04/05/11 to 10/20/11


Arilou

 

Posted

Quote:
Originally Posted by Arilou
I do agree that the fact that we're assuming that drop rates are completely random (and not based on any factors within the trials, performance, AT etc.) is a major uncertainty factor.
Just to add in here, Second Measure did confirm that the later trials themselves are weighted to reward the higher rewards more frequently, so it does potentially cause dataset contamination wrt Keyes' results.

Second Measure on Later Trial Rewards


Let's Dance!

 

Posted

Quote:
Originally Posted by Arilou
But the chance of one sample in three being off by 2 sigma is much larger than 5%.
Quote:
I see. So what he actually has is:

04/05/11: 17 / 82 = 20.7% rare
04/26/11: 24 / 121 = 19.8% rare
06/28/11: 16 / 119 = 13.4% rare
09/13/11: 13 / 57 = 22.8% rare

What he is saying sounds about right: the third result is unusually low. By eye, it looks like a 2-sigma (i.e. 1-in-20; sigma is jargon for standard deviation) effect. However, keep in mind that these aren't actually that rare. If you take multiple sets of these samples, chances are you will run into one sooner or later. With 4 samples, the probability that at least 1 will be off by 2 sigma is about 17%, or 1-in-6.
So yes, it's unusually low; however, that's a pretty large margin of error (17%), considering we don't actually know the numbers we're deviating from...
I didn't see the quoted post in the thread, but nonetheless, I will argue against this interpretation. To say that it falls within two sigma is to beg the question: it already assumes that the drops are all coming from the same distribution. (As an aside, if they indeed were all from the same Bernoulli distribution, the best parameter estimate would be p=53/297; the s.d. of the Binomial distribution with N=119 (the 20110628 sample) is 4.18, the mean is 21.2, and the observed value of 16 rare drops is certainly within two s.d. of the mean.)
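For anyone who wants to check the arithmetic in that aside, here is a minimal Python sketch. It assumes the pooled estimate uses the 04/26, 06/28 and 09/13 samples (which is what 53/297 works out to), and it uses the normal two-sided tail for the "at least one of 4 samples off by 2 sigma" figure from the quoted post. The helper name `prob_any_outlier` is just for illustration.

```python
import math

# Rare-drop counts quoted in the thread: (rares, total drops) per build.
samples = {
    "04/26/11": (24, 121),
    "06/28/11": (16, 119),
    "09/13/11": (13, 57),
}

# Pooled estimate of the rare rate, assuming one shared Bernoulli rate.
rares = sum(r for r, _ in samples.values())
drops = sum(n for _, n in samples.values())
p_hat = rares / drops

# Binomial mean and standard deviation for the 06/28/11 sample size.
r_obs, n_obs = samples["06/28/11"]
mean = n_obs * p_hat
sd = math.sqrt(n_obs * p_hat * (1 - p_hat))
z = (r_obs - mean) / sd


def prob_any_outlier(k, z_cut=2.0):
    """Chance that at least one of k independent samples lies beyond
    z_cut standard deviations (two-sided, normal approximation)."""
    tail = math.erfc(z_cut / math.sqrt(2))
    return 1 - (1 - tail) ** k


print(f"p_hat = {p_hat:.4f}, mean = {mean:.1f}, sd = {sd:.2f}, z = {z:.2f}")
print(f"P(at least one of 4 beyond 2 sigma) = {prob_any_outlier(4):.3f}")
```

Note that this only reproduces the heuristic; it still takes the shared drop rate for granted, which is exactly the assumption in question.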

Before applying the s.d. heuristic, we have to confirm that the hypothesis that all the rare drop rates are the same is a reasonable one. Suppose the rare drop rate for 20110628 is u, and that the rare drop rates for 20110913 and 20110426 are the same at v. Let hypothesis I be: u and v are equal (the drop rate did not actually dip). Let hypothesis II be: u is v-0.05 (the drop rate for rares was 5 percentage points lower in the 20110628 build). We can apply Bayesian model selection (with u and v otherwise unconstrained) to see which is the better model.

Let n1=119 be the number of drops total in 20110628, r1=16 be the number of rares, and n2=178 the number of drops total in 20110913 and 20110426, and r2=37 the number of rares.
The marginal likelihood of I is:
(n1 choose r1)(n2 choose r2) Integral(u=0 to 1, u^r1.(1-u)^(n1-r1).v^r2.(1-v)^(n2-r2) du) (with v=u)
=0.000107.
The marginal likelihood of II is:
(n1 choose r1)(n2 choose r2) Integral(u=0 to 0.95, u^r1.(1-u)^(n1-r1).v^r2.(1-v)^(n2-r2) du) (with v=u+0.05)
=0.000393
The Bayes factor for II over I is the ratio of the two marginal likelihoods, = 3.66. A ratio above 3 is 'Substantial' evidence for the second hypothesis (I'm relying on Wikipedia here). We cannot safely assume that the drop rates are constant across the three releases, and there is substantial evidence that a 5% lower drop rate in the 20110628 release is a better explanation. (Actually any proposed difference between 4% and 10% gives a factor greater than 3.)
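The two integrals can be evaluated numerically with nothing but the standard library. The sketch below is one way to do it, not necessarily how the figures above were obtained: it assumes a flat, unnormalized prior on u over the integration range (matching the integrals as written) and uses a simple midpoint rule. The function name `marginal_likelihood` and the step count are my own choices.

```python
import math

n1, r1 = 119, 16   # 20110628: total drops, rares
n2, r2 = 178, 37   # 20110426 + 20110913 combined: total drops, rares

# Log of the two binomial coefficients, kept in log space to avoid
# multiplying very large and very small numbers directly.
LOG_COEF = math.log(math.comb(n1, r1)) + math.log(math.comb(n2, r2))


def marginal_likelihood(delta, steps=20000):
    """Integrate the joint binomial likelihood over u in [0, 1 - delta]
    with v = u + delta, using the midpoint rule."""
    upper = 1.0 - delta
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h          # midpoints never hit 0 or 1 exactly
        v = u + delta
        log_lik = (r1 * math.log(u) + (n1 - r1) * math.log(1 - u)
                   + r2 * math.log(v) + (n2 - r2) * math.log(1 - v))
        total += math.exp(LOG_COEF + log_lik)
    return total * h


m_equal = marginal_likelihood(0.0)    # hypothesis I: v = u
m_lower = marginal_likelihood(0.05)   # hypothesis II: v = u + 0.05
bayes_factor = m_lower / m_equal

print(f"marginal likelihood I  = {m_equal:.6f}")
print(f"marginal likelihood II = {m_lower:.6f}")
print(f"Bayes factor II over I = {bayes_factor:.2f}")
```

Calling `marginal_likelihood` with other values of `delta` is an easy way to see how the Bayes factor changes with the proposed difference.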

However, given
Quote:
Originally Posted by reiella
Just to add in here, Second Measure did confirm that the later trials themselves are weighted to reward the higher rewards more frequently, so it does potentially cause dataset contamination wrt to Keyes' results.
the fundamental assumptions behind the analysis have to be re-examined regardless!


 

Posted

Quote:
Originally Posted by GuyPerfect
For those nay-sayers who insist all the components have a fixed drop rate, feel free to explain this abundance of "uncommon" being more common than "common"
Apparently, collecting badge objectives gives the entire league a better chance at higher tier rewards. Considering how easy it is to get 1/2 of the badges each BAF run, this makes Uncommons drop more often.


"I accidently killed Synapse, do we need to restart the mission?" - The Oldest One on Lord Recluses Strike Force

 

Posted

After all that analysis, subjectively the drop rates are sodding miserable. Just got my sixth common in a row, suffering through trials which have become tedious through months of grinding, suffering multiple defeats in a Lambda trial which makes me feel anything but super.

Limited grindy content, dodgy lore, random rewards, and illogically and brutally overpowered foes. So, so unimpressed with the end-game vision at this point in time.


 

Posted

Quote:
Originally Posted by reiella
Just to add in here, Second Measure did confirm that the later trials themselves are weighted to reward the higher rewards more frequently, so it does potentially cause dataset contamination wrt to Keyes' results.

Second Measure on Later Trial Rewards
For some reason I don't believe him. Maybe it is because I have been tracking my progress on not only the overall trial drops, but have broken down the data per trial as well.

Quote:
Originally Posted by Snow Globe
1) Tell that to the player, who in my estimation, was an active participant who got the thread table in the last successful Underground Trial I was in.

2) I think you need to read this post about my trial results:
[Edit: pointed to this thread] I did 2 each of Keyes, BAF, and Lambda this weekend, so the numbers below are slightly different from the graphs/counts from that post by the following amounts:

BAF: +1 Common, +1 Rare.
Lambda: +2 Uncommons.
Keyes: +2 Uncommons.

Given that the Underground has a 2:1 fail rate, I don't perceive that as having a chance at higher rewards.

In fact, out of my Underground Trials (15 trials):
10 Fails (66.67%)
0 Threads (0.00%)
1 Common (6.67%)
2 Uncommons (13.33%)
1 Rare (6.67%)
1 Very Rare. (6.67%)

Without the fails
1 Common (20%)
2 Uncommons (40%)
1 Rare (20%)
1 Very Rare. (20%)

I'd have a larger sample set except no one, including myself, wants to run them very frequently with people they don't know.

Keyes on live since Issue 20.5 (50 successful trials, 3 fails):
18 Common (36% of Keyes runs since June 28, 2011)
21 Uncommons (42% of Keyes runs since June 28, 2011)
8 Rare (16% of Keyes runs since June 28, 2011)
3 Very Rare (6% of Keyes runs since June 28, 2011)

Compare that with BAF since 20.5 (68 successful trials, 7 fails):
22 Common (32.35% of BAF runs since June 28, 2011)
31 Uncommons (45.59% of BAF runs since June 28, 2011)
10 Rare (14.71% of BAF runs since June 28, 2011)
5 Very Rare (7.35% of BAF runs since June 28, 2011)

Or Lambda since 20.5 (59 successful trials, 6 fails):
25 Common (42.37% of Lambda runs since June 28, 2011)
17 Uncommons (28.81% of Lambda runs since June 28, 2011)
11 Rare (18.64% of Lambda runs since June 28, 2011)
6 Very Rare (10.17% of Lambda runs since June 28, 2011)

See much difference in the trends between the overall totals for these three graphs? I don't:





All three have an overall comparative curve to them.

Here are pie charts of the rarities for the three different trials (BAF, Lambda, Keyes) since June 28, 2011.


The end result is that while Second Measure (or any of the development team for that matter) says that X trials have higher chances at higher rewards, there is no difference when the rubber meets the road as far as a player can tell.
The distribution, contrary to what Second Measure said, seems to be fairly uniform among the trials. If Keyes was supposed to have a higher chance at Rares and Very Rares, that should be showing in the overall totals.
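One way to put a number on "fairly uniform" is a chi-square test of homogeneity across the three trials. The sketch below is only illustrative. It assumes each trial's counts add up to its stated number of successful runs, so it uses 22 Commons for BAF and 17 Uncommons for Lambda, which is what the stated percentages and run totals imply. The function name `chi_square_homogeneity` is made up for this example, and with a couple of expected counts under 5 the result should be read loosely.

```python
# Reward counts per successful trial since June 28, 2011, in the order
# (Common, Uncommon, Rare, Very Rare). Assumption: each row sums to the
# stated number of successful runs (50, 68 and 59).
counts = {
    "Keyes":  [18, 21, 8, 3],
    "BAF":    [22, 31, 10, 5],
    "Lambda": [25, 17, 11, 6],
}


def chi_square_homogeneity(table):
    """Return (statistic, degrees of freedom) for a test that every row
    of the table is drawn from the same distribution over the columns."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


stat, dof = chi_square_homogeneity(counts)

# 5% critical value for 6 degrees of freedom, from a standard table.
CRITICAL_5PCT_DOF6 = 12.59

print(f"chi-square = {stat:.2f} with {dof} degrees of freedom")
print("differs at the 5% level" if stat > CRITICAL_5PCT_DOF6
      else "no detectable difference at the 5% level")
```

A statistic below the critical value does not prove the trials share one reward table; it only says that samples of this size cannot tell them apart.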




Triumph: White Succubus: 50 Ill/Emp/PF Snow Globe: 50 Ice/FF/Ice Strobe: 50 PB Shi Otomi: 50 Ninja/Ninjistu/GW Stalker My other characters

 

Posted

And then the odd evening where I go through two Lam, two Baf, and get a common and three VR's. (Plus an Uncommon random drop in one of the baf's)

Screwy random generators.


My memory's not as sharp as it used to be.
Also, my memory's not as sharp as it used to be.

"The tip of a shoelace is called an aglet, its true purpose is sinister." The Question

 

Posted

I should have started doing this when I first started running trials, because rewards for me have been pretty dismal. I had 3x50's when I first started running the trials - a Dark/Dark/Psi Defender, a Merc/Traps MM, and a Fortuna. Between all 3 of them, I roughly estimate between 100-150 trials (all BAF/λ) before I took a long break from them to play on my lowbies.

I can extrapolate some rough numbers when I get home later today, but what comes to mind thus far:

  • Defender: Has had the best luck for Rares, about 5-7 Rares, plenty of commons and uncommons.
  • Mastermind: Has had terrible luck for Rares - only gaining 2, some commons and tons of Uncommons (she had so many uncommons I built a T2 and T3 Alpha with mostly UC's). Ironically, she is the only character to have earned a VRare.
  • Fortuna: She's not run as many trials as my previous two characters, but so far she's fallen in between them as far as results go. More Rares than my MM by far, but still lagging behind my Defender by a notable margin.
  • New BS/Regen Scrapper: She's only run trials - her first one (A BAF) yielded a Rare, and the second one (a speed λ) yielded a common.

I've not been on many failed trials, so I don't count them. I've also never received the "Threads-Only" booby prize. I also have yet to run any of the new trials (they were introduced when I took a month off from CoH, and another couple months playing lowbies). Now that I have a new L50, I'll start recording my data more closely, so as to give yet another list of data (the more data samples you have, the more accurate any conclusions are).