Sorry about the gap between postings, everyone, especially since I left you hanging last time with the promise of more thoughts on park factors. The trade deadline hit me pretty hard and I've been focused on making sure the DFS engine is completely up to date. Plus, don't tell anyone, but I sneaked in some prep for fantasy football.

The lag also allowed me to crystallize my thoughts on the park factor conundrum. Instead of rambling without really making a cogent point, which is the direction I was headed, I now have a plan of attack to get my point across as well as tie it into both DFS and seasonal play.

For those that missed the piece referenced, you can catch up HERE, or just keep reading as I'll remind everyone where I'm coming from. Long story short: while I know park factors aren't perfect, I've accepted that using them is better than not using them. The formula, which I will discuss in a minute, is designed to wash out all the biases, but it fails to do so. We've developed a means to help reduce that bias, but even that is as much a mathematical treatment as it is a real solution.

There are so many things I can talk about with park factors, but I decided to stay on point and primarily address the original question that instigated this whole thing: at what point of the season, if at all, can we confidently determine a park is playing differently than expected and thus use that new expectation in our analysis?

We'll first focus on park factors for runs, then move on to homers. Runs are the simpler starting point since handedness isn't a consideration, though I can see how accounting for it could help with pitchers; that's something I'm going to look at in the offseason.

I use the method described in the Bill James Handbook, published by Baseball Info Solutions (BIS). Depending on the stat, the formula is a little different, so when we discuss homers I'll present that formula and discuss the differences. Here's the formula for a park runs index:

((Runs by Team at home + Opponents runs at home) / (2 x games played at home)) / ((Runs by Team away + Opponents runs away) / (2 x games played away))

The result is then scaled by 100, so a neutral park sits at 100. The quality-of-team bias is supposed to be mitigated since both elements of the team - hitting and pitching - are compared home and away.
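If you want to play along at home, here's a minimal sketch of that calculation in Python. The run and game totals are made up purely for illustration; the only assumption beyond the formula above is the 100 scaling, which is how the factors in the tables below are presented.

```python
# Minimal sketch of the BIS-style park runs index described above.
# The sample numbers are hypothetical, not any team's actual splits.

def park_runs_index(runs_home, opp_runs_home, games_home,
                    runs_away, opp_runs_away, games_away):
    """Runs park factor, scaled so 100 = neutral."""
    home_rate = (runs_home + opp_runs_home) / (2 * games_home)
    away_rate = (runs_away + opp_runs_away) / (2 * games_away)
    return 100 * home_rate / away_rate

# Hypothetical example: a team and its opponents combined for 750 runs in
# 81 home games and 690 runs in 81 road games.
print(round(park_runs_index(380, 370, 81, 350, 340, 81)))  # ~109, a hitter's park
```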

Here are the park factors for runs over the past five seasons (2010 through 2014). Included are the five-year average and standard deviation.

Team  2014  2013  2012  2011  2010   AVG  STDEV
ARI    116    97   117   115   105   110    8.7
ATL     94    96   104    95   101    98    4.3
BAL     93   106   117    99   111   105    9.5
BOS    107    96   121   117   108   110    9.7
CHC     93   119   102    93   117   105   12.6
CIN     96    99   111   108   101   103    6.3
CLE     95    93    90    96    95    94    2.4
COL    150   127   158   135   136   141   12.5
CWS    105   100   127    99   113   109   11.6
DET    100   114   107   106    98   105    6.3
HOU    101   107    94   110    86   100    9.8
KC     101   108   103    99   101   102    3.4
LAA     92    97    81    84    86    88    6.4
LAD     91    87    87    94    94    91    3.5
MIA    101   103   100    99   104   101    2.1
MIL    100   111   117   104   101   107    7.2
MIN    112   102   104    94    96   102    7.1
NYM     85    87    87    91    89    88    2.3
NYY     94   109    99   113   118   107    9.9
OAK    102    89    91    95    96    95    5.0
PHI     93   111    97   100    99   100    6.7
PIT     98    91    76    96   103    93   10.3
SD      83    83    85    82    88    84    2.4
SEA     82    99    69    85    81    83   10.7
SF      92    87    74    74    94    84    9.7
STL    110    89    98    90    94    96    8.5
TB     100    93    87    82    80    88    8.2
TEX    105    98   118   141   109   114   16.6
TOR    104   112   101   115   106   108    5.8
WSH    107   101   102    96    96   100    4.6

The higher the standard deviation, the more inconsistency is seen from that park. A project for another day is to look at which parks have a low standard deviation and which have a high one, to see if there are common characteristics we can identify. If there is a discernible pattern, perhaps we can flag some players as being more or less risky based on the stability of their park factor.
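As a rough illustration of what that project might look like, here's a quick Python sketch that computes the average and standard deviation for a handful of parks from the table above and tags them as stable or volatile. The 10-point cutoff is an arbitrary choice on my part, nothing official.

```python
import statistics

# Yearly run factors (2014 back to 2010) pulled from the table above.
factors = {
    "COL": [150, 127, 158, 135, 136],
    "CLE": [95, 93, 90, 96, 95],
    "TEX": [105, 98, 118, 141, 109],
    "SD":  [83, 83, 85, 82, 88],
}

for park, yearly in factors.items():
    sd = statistics.stdev(yearly)  # sample standard deviation, as in the table
    label = "volatile" if sd > 10 else "stable"  # 10 is an arbitrary cutoff
    print(f"{park}: avg {statistics.mean(yearly):.0f}, stdev {sd:.1f} -> {label}")
```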

Because of the above variability, most use a three-year average when applying park factors to analysis. The three-year factors provided by BIS will be shared in a moment. It needs to be pointed out they aren't straight arithmetic averages; they appear to be a weighted average or some sort of least squares fit. I spent a decent amount of time since we last met in this space attempting to reverse engineer the calculation and got close, but haven't cracked the code. I have an inquiry in to Baseball Info Solutions for help; we'll see if they respond. In addition, new parks have been built over the past six seasons, along with renovations to other venues. Some of the three-year averages reflect that by leaving out certain seasons in the calculation. For the purposes of this discussion, pinpointing which ones isn't necessary, as doing so won't change what we're looking for or how we'll go about accounting for it.
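To give a flavor of that reverse-engineering exercise, here's a small Python sketch that compares a few candidate blends of the single-season factors against a published three-year number. The weighting schemes are hypothetical guesses for illustration only; none of them is the confirmed BIS method.

```python
# Compare candidate weightings of single-season factors to the published
# three-year figure. Weights are hypothetical, NOT BIS's actual formula.

def blend(factors, weights):
    """factors: single-season park factors, most recent season first."""
    return sum(w * f for w, f in zip(weights, factors)) / sum(weights)

published_12_14 = 104            # BIS three-year factor for Wrigley (table below)
single_seasons = [93, 119, 102]  # Wrigley 2014, 2013, 2012 run factors (table above)

for weights in [(1, 1, 1), (3, 2, 1), (5, 4, 3)]:
    est = blend(single_seasons, weights)
    print(f"weights {weights}: {est:.1f} (published {published_12_14})")
```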

With that as a backdrop, here are the three-year averages as provided by BIS:

Team  12-14  11-13  10-12  09-11  08-10   AVG  STDEV
ARI     110    110    112    113    112   111    1.3
ATL      98     97    100     95     99    98    1.9
BAL     105    107    109    104    106   106    1.9
BOS     108    110    115    111    108   110    2.9
CHC     104    109    104    108    113   108    3.8
CIN     102    103    106    102    102   103    1.7
CLE      93     91     93     91     92    92    1.0
COL     145    136    143    132    124   136    8.5
CWS     110    109    113    106    111   110    2.6
DET     107    105    104    102    103   104    1.9
HOU     101     98     96     96     95    97    2.4
KC      104    104    101    104    101   103    1.6
LAA      90     90     84     91     97    90    4.6
LAD      88     89     91     91     88    89    1.5
MIA     102    102    100    105    103   102    1.8
MIL     109    103    107     97     94   102    6.4
MIN     106     99     98     95     96    99    4.3
NYM      86     87     87     91     92    89    2.7
NYY     101    107    110    108    106   106    3.4
OAK      94     93     94     96     95    94    1.1
PHI     100    102     99    101    102   101    1.3
PIT      88     93     91    100     98    94    4.9
SD       83     83     85     81     80    82    1.9
SEA      91     99     78     87     90    89    7.6
SF       84     86     80     91    101    88    8.1
STL      98     92     94     92     93    94    2.5
TB       93     88     83     87     92    89    4.0
TEX     107    115    122    119    111   115    6.0
TOR     105    105    107    104     98   104    3.4
WSH     103     99     98     98    100   100    2.1

The first thing to note is the decrease in standard deviations, meaning there's far less year-to-year fluctuation in the three-year averages we incorporate into our analysis. Obviously, the math looks better this way, but are the three-year averages truly an accurate expectation for how the park will play the following season?

To answer this question, I looked at some correlation data, lining up the expectation with what actually happened over the past several seasons. The closer the correlation coefficient is to 1.0, the better the three-year average did at predicting how the parks would play in total. Here's how the three-year averages did predicting the run factors for the 30 MLB venues since 2011:

Season       2011  2012  2013  2014
Correlation  0.66  0.82  0.65  0.71
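For anyone who wants to check the math, here's roughly how a number like the 2014 figure can be reproduced: pair each park's 2011-13 three-year factor with the run factor it actually posted in 2014 (both pulled from the tables above) and take the Pearson correlation across the 30 parks. A Python sketch of that calculation:

```python
# Pair each park's 2011-13 expectation with its 2014 actual run factor and
# compute the Pearson correlation across all 30 parks.
from statistics import correlation  # requires Python 3.10+

# (park, 2011-13 three-year factor, 2014 single-season factor)
parks = [
    ("ARI", 110, 116), ("ATL", 97, 94),   ("BAL", 107, 93),  ("BOS", 110, 107),
    ("CHC", 109, 93),  ("CIN", 103, 96),  ("CLE", 91, 95),   ("COL", 136, 150),
    ("CWS", 109, 105), ("DET", 105, 100), ("HOU", 98, 101),  ("KC", 104, 101),
    ("LAA", 90, 92),   ("LAD", 89, 91),   ("MIA", 102, 101), ("MIL", 103, 100),
    ("MIN", 99, 112),  ("NYM", 87, 85),   ("NYY", 107, 94),  ("OAK", 93, 102),
    ("PHI", 102, 93),  ("PIT", 93, 98),   ("SD", 83, 83),    ("SEA", 99, 82),
    ("SF", 86, 92),    ("STL", 92, 110),  ("TB", 88, 100),   ("TEX", 115, 105),
    ("TOR", 105, 104), ("WSH", 99, 107),
]

expected = [e for _, e, _ in parks]
actual = [a for _, _, a in parks]
print(round(correlation(expected, actual), 2))  # ~0.71 with these inputs
```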

There's definitely some correlation, but it's not perfect. What does this mean in practical terms? For an entire season, we made assumptions based on how a park would play. We benched or started hitters and pitchers based on how we thought a park would play, and it ended up playing differently.

ESPECIALLY in DFS, if we know WHEN to rejigger our evaluation to reflect how the park is playing at the present time, we can gain edges over those continuing to use the accepted, published park factor.

Let's take a look at last season and highlight some scenarios that we could have taken advantage of had we adjusted our park index.

Park  Expected  Actual
BAL        107      93
CHC        109      93
CIN        103      96
MIN         99     112
NYY        107      94
OAK         93     102
PHI        102      93
SEA         99      82
STL         92     110
TB          88     100
TEX        115     105
WSH         99     107

- Camden Yards, Wrigley Field, Yankee Stadium and Citizens Bank Park were all perceived to be hitter's parks, so not only did we expect good things from our hitters and end up disappointed, but we also reserved some pitchers when it wasn't actually necessary. In DFS, we missed some chances to be safely contrarian by using a starting pitcher in what was perceived to be a hitter's park. Others would have faded the hurler, keeping his usage rate down, which is important if you're trying to take down a GPP.

- Target Field, O.co Coliseum, Busch Stadium and Nationals Park were all thought to be pitcher's parks or neutral and ended up playing much more like hitter's parks. Here, we used pitchers we thought were protected by the park but were in fact exposing them to a better hitting environment. Again, this has seasonal applications when it comes to streaming, and in DFS we should have faded pitchers who weren't in as good a spot as we thought.

- Globe Life Park was expected to be an extreme hitter's park but played a lot closer to neutral.

- Safeco Field was assumed to be neutral since there was only one season of data after some renovations. It turns out the park was still a pitcher's park despite moving the fences in.

So here's where I'm at: I still have a ton of issues with the park factor calculation, but instead of using this space (and my time) to wax poetic about those feelings, I'm going to determine park factors over segmented portions of last season in an attempt to identify when, if ever, we could confidently alter our expectation and thus gain an edge over our competition. After that, based on the results, I'll highlight some parks that fit the bill THIS season to help us manage our seasonal teams and set our DFS lineups better down the stretch.
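As a preview of how that segmentation might look, here's a rough Python sketch: walk through a park's game log in chunks, recompute the runs index to date, and flag checkpoints where it has drifted well away from the published expectation. The game-log fields and the 10-point "actionable" gap are placeholders of my own, not settled methodology.

```python
# Sketch of the segmentation exercise. Each game log entry has 'home' (bool)
# and 'total_runs' (runs by both teams combined), so the factor of 2 in the
# formula cancels out of the home/away ratio.

def running_park_index(game_log, published_factor, chunk=20, min_home_games=10):
    """Return (games played, runs index to date, actionable?) at each checkpoint."""
    home_runs = away_runs = home_games = away_games = 0
    checkpoints = []
    for i, game in enumerate(game_log, start=1):
        if game["home"]:
            home_runs += game["total_runs"]
            home_games += 1
        else:
            away_runs += game["total_runs"]
            away_games += 1
        if i % chunk == 0 and home_games >= min_home_games and away_games:
            index = 100 * (home_runs / home_games) / (away_runs / away_games)
            actionable = abs(index - published_factor) >= 10  # placeholder threshold
            checkpoints.append((i, round(index), actionable))
    return checkpoints

# Toy usage with made-up games: a park running well above its published 103.
log = [{"home": i % 2 == 0, "total_runs": 10 if i % 2 == 0 else 8} for i in range(60)]
print(running_park_index(log, published_factor=103))
```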