
Fantasy Baseball: Sample Size Questions

Let's talk about sample sizes and break down some common mistakes people make with the topic.


When is enough, enough? Is three cars enough or do you need one for each day of the week (I'm looking at you, hoops players)? Is that fifth Long Island enough to send you off to happy time or do you need to order one more on a Friday night? Is sleeping with enough people to fill up your hands and toes satisfactory, or do you need to shoot for triple-digit notches on your bedpost? Obviously everyone reading this has had plenty of “cuddling” in their lifetime, because what is more sexy than telling someone you're in nine fantasy baseball leagues? That alone makes up for not having a huge bank account. Those are real-world questions of when enough is enough. How does this apply to the world of fantasy baseball? The real center of this article, now that I've totally led you astray for an entire paragraph, is how to apply sample sizes to baseball players.

Sunday night on Fantasy Sports Tonight (my show on SiriusXM Fantasy Sports Radio, 7-10 PM EDT) I talked a lot about sample size issues. I've put those spoken words into written form. Here we go.

Some basics.


For batters I'd like to see 150 at-bats or roughly two months of work before I panic. In truth, this isn't nearly enough time to make any lasting decisions on players, but it's enough work that we can start to at least outline the picture we're attempting to construct. Of course, caveats.

* Players with an established track record, in most cases, get at least a half season of at-bats to prove themselves. Things tend to even out if the sample size is large enough.

* Rookies/youngsters get less rope. You stink for 150 at-bats and I'm looking to move on.

* Players that start off hot, more often than not, have strong seasons even if their final couple of months don't match the production they posted on a per at-bat basis earlier in the year.

* Players that start off cold, I'm talking really chilly, are unlikely to reach their full predicted value for the season. That doesn't mean they can't be picked up after the cold start and produce at their expected per game levels. It's just that their season long numbers won't reflect the in-season improvement.

* Be reasonable. No player is going to post a .425 BABIP for a season, and it's nearly impossible for an established hitter to go a whole year with a mere 13 percent line drive rate. Filter out the obviously unsustainable numbers and act accordingly.


I'd recommend you give pitchers about 10 starts or two months of starts before drawing any lasting conclusions.

* Same as with batters, if an established pitcher struggles out of the gate give him some leeway. A hurler who hasn't established himself, at least for me, carries a bit more risk as we just don't have a good feel as to how long it will take him to pull out of his slump or how much rope a club will give him.

* A study from BaseballHQ, from 2010-12, suggests that pitchers that start off hot have a good chance to end the year at better than expected levels. Over the three years of the study 88 percent of hurlers who had an April ERA that was two runs lower than their career mark ended that season with an ERA that was indeed under their career level. That same study suggested that 67 percent of pitchers who ended April with an ERA that was two runs higher than their career level finished that season with an ERA worse than their career mark.

* Avoid reading too much into the ratios of relievers, especially early in the year. An outing or two with poor results can have a devastating effect on a reliever's early-season numbers. In fact, one of those four-runs-while-recording-only-one-out outings can torpedo a reliever's season-long numbers a good deal.

* Be honest with the numbers. A top-25 arm isn't going to end the year with a BABIP of .352, is he? If the guy has a career walk rate of two per nine but is walking four per nine, that isn't likely to continue (unless he is hurt).



It's easy to say 'Player X sucks in April, just look at his career numbers. Let someone else draft him and make a move to trade for him when the calendar hits May.' In most cases this is an erroneous way to think.

(1) What is the difference between April 30th and May 12th? The answer is nothing.

(2) We often use small samples to extrapolate to a larger contention that isn't supported by the data. Let's say Player X is “known” as a huge early season power hitter. Let's say Player X hit 10 homers in two of the last three Aprils. Let us also posit that Player X hit 10 homers total in the other two Aprils of his career. Player X would average 7.5 homers per April over the last four seasons (10, 3, 10 and 7). Notice that in two of those Aprils Player X failed to reach his 7.5 April average. Not just that, one year he hit only three homers. Is he really a big time April homer hitter or is this a situation in which we really don't know because we just don't have enough data? You know what I think (I have written an article that hit on second half efforts of 2013 just for the heck of it).
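For anyone who wants to see the Player X math laid out, here is a minimal sketch in Python using the four April totals from the example. The sample standard deviation is my own addition as one rough way to show how noisy four data points are; it is not part of the original argument.

```python
from statistics import mean, stdev

# April home run totals for the hypothetical Player X (four seasons).
april_homers = [10, 3, 10, 7]

average = mean(april_homers)   # 30 homers over 4 Aprils
spread = stdev(april_homers)   # sample standard deviation (an added yardstick)
below_average = [hr for hr in april_homers if hr < average]

print(f"Average per April: {average}")
print(f"Sample standard deviation: {spread:.2f}")
print(f"Aprils below the average: {len(below_average)} of {len(april_homers)}")
```

With only four Aprils, the spread is close to half the size of the average itself, which is the point: the "7.5 per April" label hides a range that runs from three to 10.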

Another easy trap we fall into is 'Player X is a second half player.' Again, what's the difference between July 2nd and July 25th? Nothing. It appears logical to say a guy is a first/second half performer, a hot starter etc., but the truth is all of this is based around arbitrary end points. Not just that, we often don't have enough raw data to draw an unassailable conclusion.


Here are three examples looking back to make my point.

Ubaldo Jimenez had a 3.30 ERA and 1.33 WHIP last season. Those are good numbers for a guy who wasn't even drafted in mixed leagues. However, what happens if we remove April? Ubaldo posted a 7.13 ERA in April with a 1.33 WHIP. Obviously removing those five starts doesn't do a thing to change his WHIP, but his ERA over his final 27 starts was 2.72. Add in his May to October K/9 rate of 9.93 and you've got yourself a pretty damn good starting pitcher, even better than you likely thought (for more see his Player Profile).

Jake McGee had a 4.02 ERA and 1.18 WHIP last season for the Rays. The WHIP is fine for a middle reliever. The ERA obviously isn't. However, Jake's season was ruined by two outings. On April 2nd and May 1st he allowed 10 runs while recording four outs. Horrible, I know. Remove those two outings though and look what happens to McGee's numbers: 2.64 ERA, 1.03 WHIP. McGee goes from boring and who cares to a near must-add in deep leagues, especially when you toss in his 10.77 K/9 mark. Remember, sample size issues can really distort the outlook of a reliever.
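Here is a rough sketch of how two outings can swing a reliever's ERA that much. The season line (28 earned runs over 188 outs, or 62 2/3 innings) is an assumption chosen because it reproduces the 4.02 ERA quoted above; it is not stated in this article. The sketch also assumes all 10 runs in the two bad outings were earned.

```python
def era(earned_runs, outs):
    """ERA = 9 * earned runs / innings, with innings counted as outs / 3."""
    return 27 * earned_runs / outs

# ASSUMED season line, picked to match the quoted 4.02 ERA.
season_er, season_outs = 28, 188

# From the article: 10 runs over four outs (assumed here to be all earned).
bad_er, bad_outs = 10, 4

full = era(season_er, season_outs)
trimmed = era(season_er - bad_er, season_outs - bad_outs)

print(f"Full season ERA: {full:.2f}")
print(f"ERA without the two outings: {trimmed:.2f}")
print(f"Share of outs in the two outings: {bad_outs / season_outs:.1%}")
print(f"Share of earned runs in the two outings: {bad_er / season_er:.1%}")
```

Under those assumptions, about two percent of his outs account for more than a third of his earned runs, which is why a reliever's ratios need a closer look than a starter's.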

Domonic Brown was deemed the best prospect in the Phillies' organization each year from 2008-10. He failed, miserably, to produce when given a chance in 2010-12. He entered 2013 with very low expectations. He had a strong season including 27 homers. People really like him for 2014. Should they? Brown hit 12 homers in May of 2013, with nine of them coming in a 10-game stretch. That means...

Remove those 10 May games and Brown hit 18 homers in 129 games.
Brown hit 15 homers in 111 non-May games.
Brown hit six homers in his last 60 games.
Brown failed to homer in his last 28 games.

Honestly, Brown has been a moderate homer producer, minus one month, for his four year career. Which sample size are you gonna trust heading into 2014 (for more on Brown see his Player Profile)?
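To put Brown's splits on the same scale, here is a minimal sketch that converts each one to a homers-per-162-games pace. The homer and game counts come straight from the numbers above (139 total games is just 129 plus the 10 hot May games); the 162-game pace is my own choice of yardstick, not a projection.

```python
def pace_per_162(homers, games):
    """Scale a homer total to a full 162-game schedule."""
    return 162 * homers / games

# (homers, games) for each slice of Brown's 2013 season.
splits = {
    "full season": (27, 139),
    "minus the 10-game May stretch": (18, 129),
    "non-May games": (15, 111),
    "last 60 games": (6, 60),
}

paces = {label: pace_per_162(hr, g) for label, (hr, g) in splits.items()}

for label, pace in paces.items():
    print(f"{label}: {pace:.1f} homers per 162 games")
```

Every slice that leaves out the hot stretch lands well below the full-season pace, which is the question posed above: which sample do you trust?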

Hopefully this little discussion about how sample sizes can influence your thoughts about players will cause you to pause the next time you buy into a hot start or sell a cold one.

