Formula for Batter Runs...with some basic rationale
Skinpinch
Posts: 1,531
in Sports Talk
CUTTING TO THE CHASE. Below are some excerpts of the data used, and how it was obtained. The info is taken from the play-by-play data on RETROSHEET. The value of each event changes with how easy it was to score runs in a given era, so the same values can't just be used for all eras. There are no situational values for the pre-war years, since individual hitting splits with men on base aren't available yet.
>>>>>EXCERPTS..
"Explaining Linear Weights
Linear Weights is a powerful tool that assigns a run value for every batting event. A single is given a run value of about 0.5 runs, while a HR is given a run value of 1.4 runs. How do we know those are true?
There are 24 base-out situations that a batter faces (8 different combinations of men on base, and the 3 outs). Each of those 24 situations has a particular Run Expectancy (RE). For example, at the start of each inning, the average team will score on average 0.56 runs. This is simple enough to figure. If the average team scores 5 runs per 9 inning game, then the average R/I is 5/9 or 0.56. To fill out the rest of the matrix, you need play-by-play data, or a simulator. For example, with the bases loaded and 0 outs, the average team will score 2.4 runs from that point on, to the end of the inning.
Every Plate Appearance (PA) has a start state and end state. That is, before the PA, the batter is facing one of the 24 base-out states, and after the PA is over, there is (possibly) a new base-out state. Since each base-out state has its own RE, the difference in RE (plus any run scored) is the run impact of that batting event that caused that change in state.
For example, with a man on 2B and 0 outs, the RE for that situation (the start state) is 1.2 runs. If the batter hits a double, the RE for the end state is of course 1.2 runs as well, and a run scored. So, the run impact of this particular batting event is 1.0 runs. If you have a man on 1B with 1 out, the RE is 0.57 runs. A double-play brings us to the end of the inning, and an RE of 0 runs. The double-play in this case is worth -.57 runs.
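To make that arithmetic concrete, here is a minimal Python sketch. The function name run_impact is my own label, not something from the article; the RE numbers are the ones quoted above.

```python
def run_impact(re_start, re_end, runs_scored):
    """Run value of one event: change in run expectancy plus any runs that scored."""
    return re_end - re_start + runs_scored

# Double with a man on 2B and 0 outs: RE stays at 1.2 and one run scores.
double_value = run_impact(re_start=1.2, re_end=1.2, runs_scored=1)

# Double play with a man on 1B and 1 out: the inning ends, so RE drops to 0.
double_play_value = run_impact(re_start=0.57, re_end=0.0, runs_scored=0)

print(round(double_value, 2), round(double_play_value, 2))
```

The same function works for any of the 24 base-out states once you have the RE for the start and end state.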
Bringing Linear Weights together
You can create a run impact for each of the 24 states, for each of the batting events. To make matters more manageable I will consider only 8 states (the men on base states). Here is the Linear Weights value by Men On Base (mob-LWTS).
MOB    1b     2b     3b     hr     bb      k     out
---   0.29   0.49   0.68   1.00   0.29  -0.20  -0.20
x--   0.49   0.97   1.36   1.74   0.43  -0.32  -0.36
-x-   0.72   1.00   1.16   1.60   0.23  -0.39  -0.34
--x   0.72   0.86   1.00   1.51   0.21  -0.48  -0.29
xx-   0.93   1.54   1.94   2.38   0.56  -0.52  -0.48
x-x   0.88   0.93   1.77   2.22   0.38  -0.61  -0.46
-xx   1.17   1.46   1.62   2.07   0.23  -0.70  -0.56
xxx   1.38   2.00   2.40   2.86   1.00  -0.82  -0.68
ROB   0.73   1.14   1.49   1.92   0.42  -0.44  -0.42
avg   0.49   0.79   1.06   1.42   0.35  -0.31  -0.30
ROB: the weighted average of the 7 runner-on states per event. A quick and simple way.
avg: the weighted average of all 8 states. Use these numbers if NO on-base info is available.
A little explanation is in order. The "avg" line matches the standard Linear Weights components that is in the mainstream. It is based on the weighted average of the 8 MOB states. Since the most "popular" state is the no-men-on-base, those numbers are weighted more. The "ROB" is the weighted average of the 7 states with a runner on (everything except the first line). This is useful since most of the time we'll only have player stats with "no one on" or "men on base".
The "out" column is all non-K outs. The "MOB" column is the base that is occupied. "xx-" means runners on 1b and 2b.
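As a rough sketch of how the "no one on" and "ROB" lines could be applied to a player's split stats, here is a small Python example. The weights are copied from the table; the function name and the stat line are made up for illustration and do not belong to any real player.

```python
# Weights copied from the table: bases empty ("---") and runners on base ("ROB").
EMPTY = {"1b": 0.29, "2b": 0.49, "3b": 0.68, "hr": 1.00, "bb": 0.29, "k": -0.20, "out": -0.20}
ROB = {"1b": 0.73, "2b": 1.14, "3b": 1.49, "hr": 1.92, "bb": 0.42, "k": -0.44, "out": -0.42}

def batter_runs(empty_counts, rob_counts):
    """Sum the run values of a batter's events, split by bases empty vs. runners on."""
    total = sum(EMPTY[event] * n for event, n in empty_counts.items())
    total += sum(ROB[event] * n for event, n in rob_counts.items())
    return total

# Hypothetical split line (invented numbers, for illustration only).
empty_line = {"1b": 50, "2b": 15, "3b": 2, "hr": 10, "bb": 30, "k": 40, "out": 150}
rob_line = {"1b": 40, "2b": 10, "3b": 1, "hr": 8, "bb": 25, "k": 30, "out": 100}

print(round(batter_runs(empty_line, rob_line), 1))
```

With only overall totals and no splits, the "avg" line would take the place of both dictionaries.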
Interesting results of mob-LWTS
As you can guess, the more runners on base, the more the run impact of every batting event. There are of course some exceptions. Let's look at the strikeouts and the other outs. With no one on, both out values are worth the same (obviously). However, with a man on 3B, the K costs the offense .48 runs, while other outs only cost .29 runs. In fact the other out costs less with a runner on 3b than with a runner on 2b. This makes sense since an out can often score the runner from 3b (with less than 2 outs).
If we compare the HR and Triples, we see that the difference with MOB is always around 0.4-something runs. This again makes sense since we expect about 60% of runners from 3B to score. So, the "value of 3b to home plate" is 0.40 runs.
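A quick way to check the HR-versus-triple gap is to subtract the two columns of the table state by state. This is only a sketch; the dictionary name is my own, and the numbers are the 3b and hr columns from the chart above.

```python
# (triple, home run) run values for each men-on-base state, copied from the table.
MOB_LWTS = {
    "---": (0.68, 1.00),
    "x--": (1.36, 1.74),
    "-x-": (1.16, 1.60),
    "--x": (1.00, 1.51),
    "xx-": (1.94, 2.38),
    "x-x": (1.77, 2.22),
    "-xx": (1.62, 2.07),
    "xxx": (2.40, 2.86),
}

# Difference between a HR and a triple in each state, rounded to two decimals.
hr_minus_triple = {
    state: round(hr - triple, 2) for state, (triple, hr) in MOB_LWTS.items()
}

for state, diff in hr_minus_triple.items():
    print(state, diff)
```

With runners on, the gap runs from 0.38 to 0.51; with the bases empty it is 0.32.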
You'll notice also that walks have the least value when no one is on 1B, as again, you would expect. (Note: you will notice that the value of the walk is .29 runs with no one on base, but .21 runs with a man on 3b. They should of course have the same run value, except that there are far more walks with 1 and 2 outs and men on 3b. This chart is weighted by outs as well. If I presented the 24-state chart, you would not see any difference.)
You'll notice that the double is worth far more than the single when we have a runner on 1B. Again, this makes sense, since the most valuable base is the one from 3B to HP, and a single will not be able to score a runner from 1B. The HR is least valuable when you have a runner on 3B because there's already a good chance that that runner will score.
Putting mob-LWTS into practice
So, now we know how much each hitting event is worth in each of the 8 men on base situations. The reason that this becomes important is that not every batter faces those situations as often as the league average. A leadoff hitter will have far more "no men on" situations than a #3 hitter. So, the value of each hitting event changes based on these 8 situations. (Again, 24 is the correct number, but I have not seen anyone publish splits by these 24 states, and so, I will restrict myself to the 8 supersets.)">>>>>END EXCERPT.
That is the shortest explanation. Call me crazy, but I think it is a little more accurate than some 'fan' just estimating what he 'thinks' the composite value of BA, TB, SLG%, etc. is. You think? When some schmo tells you that so-and-so is just KILLING his team by striking out (as opposed to tapping out), just point him to the ACTUAL data and how much it REALLY matters, instead of just GUESSING.
You can stop here, or read further. Please note that the chart above does not break the values out by number of outs; those are on another chart, but this is going to be long enough.
FOR MORE DETAIL READ BELOW......
>>>>"Runners on, and runners over
Runs are created by getting runners on base, and moving them over. Most people look at the OBA as the first part, and the SLG as the second part. And the combination of the two should generate runs scored. It's not that simple.
Using common sense on an uncommon example
Have you ever played in a softball league where the typical team scores upwards of 20 runs in a 7-inning game? Such a team will send say 51 batters to the plate, 21 of which will be out, and 20 of the other 30 batters will score. If a runner that gets on base will score two-thirds of the time, how much more valuable is the home run compared to a single? In softball, because the run environment is so high, it is important just to be able to get on base, because you know that you have a good chance at scoring.
Take an even more extreme example. Imagine playing in a run environment with 100+ runs scored in every game. 90% of the baserunners end up scoring. In this environment, there is very little difference between a home run and a single. Just by virtue of getting on base, you are bound to score. There is far more value in getting on base, than of moving runners over.
Take an extreme example the other way. Pedro Martinez provides his opposition with a very low run environment. Getting on base is not enough. Very few of those runners will end up scoring. However, if you can hit a home run, you will be adding a lot of run potential to the runners on base. Not only that, but when you hit a home run, you are always guaranteed 1 run.
Don't like common sense? Let's try some math
Based on my analysis of the play-by-play data from 1974 to 1990 provided by Retrosheet and software provided by Ray Kerby, here is the likelihood of a runner scoring, based on which base he is on and the number of outs:
Chance of scoring, from each base/out state
      0 outs   1 out   2 outs
1B     .38      .25     .12
2B     .61      .41     .21
3B     .86      .68     .29
This simply means that if you have a runner at 3B with 0 outs, then he has an 86% chance of scoring (that's the "getting on base" value). If someone can drive him in, that batter will add .14 runs (to the already established .86 for a total of 1.00; this is the "driving him in" value).
As you can see the higher the chances of scoring, the less value there is in driving in a runner. Let's look at the run-driving value of the walk.
Run Driving value of the walk, from each base/out state
               0 outs   1 out   2 outs
1B (to 2B)      +.23     +.16    +.09
2B (to 3B)      +.25     +.27    +.08
3B (to home)    +.14     +.32    +.71
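The walk table above can be rebuilt from the "chance of scoring" table: the value of moving a runner up one base is just the scoring chance at the new base minus the scoring chance at the old one, with home counting as 1.00. A minimal sketch (the names SCORE and advance_value are mine, not the article's):

```python
# Chance of scoring from each base with 0, 1 and 2 outs (from the table above).
SCORE = {
    "1B": (0.38, 0.25, 0.12),
    "2B": (0.61, 0.41, 0.21),
    "3B": (0.86, 0.68, 0.29),
}
HOME = (1.0, 1.0, 1.0)  # a runner who reaches home has scored

def advance_value(from_base, to_base):
    """Gain in scoring chance from moving a runner up, for 0, 1 and 2 outs."""
    dest = HOME if to_base == "home" else SCORE[to_base]
    return tuple(round(d - s, 2) for s, d in zip(SCORE[from_base], dest))

print(advance_value("1B", "2B"))
print(advance_value("2B", "3B"))
print(advance_value("3B", "home"))
```

Each printed row matches the corresponding row of the run-driving table.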
Obviously, the most valuable walk in terms of moving runners over is the walk that scores the runner from 3B and 2 outs. This happens rarely, as the bases would be loaded. So, on top of this table, we need a "frequency" table that shows how often a walk occurs in each of the above scenarios.
Frequency of walk in moving runners over, from each base/out state
               0 outs   1 out   2 outs
1B (to 2B)     0.053    0.084   0.114
2B (to 3B)     0.012    0.027   0.042
3B (to home)   0.002    0.006   0.010
As you can see, walks are not given out in random fashion. A large portion of them occur with 2 outs, when they do the least damage. Multiplying these two tables will give you the "moving over" value of the walk. This works out to +.06 runs.
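Here is that multiplication spelled out as a short Python sketch, using the two tables exactly as printed (the variable names are mine):

```python
# Run-driving value of the walk, rows = runner on 1B, 2B, 3B; columns = 0, 1, 2 outs.
ADVANCE_VALUE = [
    (0.23, 0.16, 0.09),
    (0.25, 0.27, 0.08),
    (0.14, 0.32, 0.71),
]

# How often a walk moves a runner over from each base/out state (same layout).
WALK_FREQ = [
    (0.053, 0.084, 0.114),
    (0.012, 0.027, 0.042),
    (0.002, 0.006, 0.010),
]

# Multiply cell by cell and add everything up.
moving_over = sum(
    value * freq
    for value_row, freq_row in zip(ADVANCE_VALUE, WALK_FREQ)
    for value, freq in zip(value_row, freq_row)
)

print(round(moving_over, 3))  # about 0.059, which rounds to the +.06 in the text
```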
The "getting on" value of the walk can be determined using the "chance of scoring" table presented above, with the appropriate frequency at which walks occur in those states.
Frequency of walk occurring, by outs
0 outs   1 out   2 outs
0.316    0.326   0.359
Doing a similar multiplication, we see that the "getting on" value of the walk works out to +.24 runs. Adding the +.06 "moving over" value, the run value of the walk is therefore equal to +.30 runs.
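The "getting on" half can be sketched the same way: weight the chance of scoring from 1B by how often walks happen with 0, 1 and 2 outs. The variable names are mine, and the +.06 is the rounded moving-over figure from the text.

```python
# Chance of scoring from 1B with 0, 1 and 2 outs (from the "chance of scoring" table).
SCORE_FROM_1B = (0.38, 0.25, 0.12)

# Share of walks that occur with 0, 1 and 2 outs.
WALK_FREQ_BY_OUTS = (0.316, 0.326, 0.359)

getting_on = sum(p * f for p, f in zip(SCORE_FROM_1B, WALK_FREQ_BY_OUTS))

MOVING_OVER = 0.06  # rounded "moving over" value of the walk, as given in the text

walk_value = round(getting_on, 2) + MOVING_OVER

print(round(getting_on, 3), round(walk_value, 2))
```

The getting-on part comes to roughly .245, and the two pieces together give the +.30 quoted above.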
We could have performed this analysis in several other ways, each of which would yield the same result of +.30 runs. One is to look at the run expectancy (RE) before the walk and the RE after the walk, take the difference, and add the number of runs that score. Doing that, we get a run value of +.30 runs. Another way is to construct a simulator, insert a walk, and look at the difference. Given the run environment of 1974-1990, you will again get a run value of +.30 runs.
The important point to remember is that the run value of all the hitting events is dependent on the run environment. The walk is worth more today than in 1968. It is worth more in Coors than at the Astrodome.
Using the RE approach, here are the run values of all offensive events.
Run values, 1974-1990, using the RE approach
Batting events (safe):
Single                   0.460
Double                   0.750
Triple                   1.033
HR                       1.402
Walk                     0.303
IBB                      0.176
HBP                      0.330
Reached Base On Error    0.478
Interference             0.357
OtherSafe                0.631
Batting events (outs):
Sac                     -0.090
Strikeout               -0.269
Out                     -0.265
Baserunning events:
SB                       0.193
CS                      -0.437
Pickoff                 -0.228
Pickoff Error           -0.182
Balk                     0.250
PB                       0.276
WP                       0.278
DefensiveIndiff          0.132
OtherAdvance            -0.362
Another Interlude
Remember what the run values represent. They represent the MARGINAL effect of the offensive events GIVEN a specific environment. Remember that. Repeat that.
If you get an out in a run environment that scores 3 runs per INNING, that is very costly to your team. It has a negative effect only because of the expectations of future runs. The out is not very costly when Bob Gibson's 1.12 ERA is on the mound, simply because the expectation is low that a run would be scored at all. -.27 runs doesn't mean that you will score negative runs, but rather that your team's run potential has been decreased by .27, GIVEN the environment in which the out was created.
I will talk more about how to understand the out in the frame of reference of Runs Created and Linear Weights in my next article. And I will apply David Smyth's BaseRuns, a constructor that models reality in almost all run environments.">>>>>>>> END EXCERPT
You see that chart directly above? That is an example of the values for ALL OFFENSIVE EVENTS, which got some panties in a bunch. It isn't the situational chart, though. MORE ARTICLE BELOW.....
           Actual | "Theoretical breakup"
Event      Total  | Getting on   Moving over   Inning killer
--------------------------------------------------------------
single      .46   |    .25          .21           ---
double      .75   |    .41          .34           ---
triple     1.03   |    .61          .42           ---
homerun    1.40   |   1.00          .40           ---
walk        .30   |    .24          .06           ---
steal       .19   |    ---          .19           ---
CS         -.44   |   -.26         -.02          -.16
out        -.27   |   -.01         -.10          -.16
"The building blocks of run creation
Let's take these one at a time. We know that a random single will add an average of +.46 runs to a game. We also know that about 25% of the time, a player who reaches first on a single (or is replaced there by a pinch runner or a force play) will score. That leaves .21 of run potential that a single adds to the runners on base. (This can also be calculated exactly, as we have shown with the walk. It's a much more cumbersome calculation to be sure.) The double follows a similar pattern. The triple and HR are interesting in their differences. While we know for certain the getting on value of a HR (1 run), why would the "moving runners over" value of the HR and triple be different? Again, this relates to these values representing the weighted average of their moving over value. They have the exact same moving over value in each of the 24 base-out states, but the average HR occurs slightly more frequently with the bases empty than the triple does, and the triple occurs more often with men on base. The walk was discussed in the previous article.
The stolen base (and all of its brothers like balks, passed balls, etc) are interesting in that they have no "getting on" value. Their value lies entirely in moving runners over.
Outs and caught stealing
Here is where it gets interesting. The out. Let's study the caught stealing. Three very important things happen when a runner is caught stealing: (1) a runner is removed from the base path, (2) the run potential of all existing runners is reduced (i.e., a runner on 3B has a much greater chance of scoring with 1 out rather than 2 outs, which is what would happen if a runner is caught with 1 out), and (3) it gives the team 2 outs instead of 3, thereby reducing the potential number of future batters coming to the plate. Let's take these things one at a time.
A runner prior to attempting to steal, will have a 26% chance of scoring. If he is unsuccessful, his chances of scoring are reduced to zero. Therefore, he erases a runner from the bases (the "getting on" part) and that costs his team -.26 runs. Sometimes, a runner steals when there are other runners on base. A runner at 3B with 1 out has an excellent chance of scoring. With 2 outs, his chances of scoring are reduced greatly. While the impact of this is high, the frequency at which a CS occurs with runners on base is low, hence the small -.02 run value impact of "moving runners over".
The last part is the reduction in run potential of all future batters. Since the average game in 1974-1990 had 4.3 RPG, each inning produced .48 runs (4.3 / 9). With three outs per inning, each out therefore reduces the run potential of all future batters by .16 runs. That's the "inning killer" value. Add up all three values, and we get -.44 runs. What is very important to realize is that the -.44 value can be calculated in 2 independent ways (through a simulator, or by looking at the difference in run expectancy). We now have a third independent way (loosely related to the second way). We now have the building blocks of run creation.
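The "inning killer" arithmetic is short enough to sketch in Python. This follows the article's own simplification of spreading an inning's runs evenly over its three outs; the variable names are mine.

```python
RPG = 4.3  # average runs per game per team, 1974-1990, as given in the article

runs_per_inning = RPG / 9            # about .48
inning_killer = runs_per_inning / 3  # about .16 per out

# The three pieces of the caught stealing value, as negative run values.
getting_on = -0.26   # the runner who is erased from the bases
moving_over = -0.02  # the other runners now have one more out against them
cs_value = getting_on + moving_over - round(inning_killer, 2)

print(round(runs_per_inning, 2), round(inning_killer, 2), round(cs_value, 2))
```

A different run environment would change RPG, and with it the cost of the out.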
A regular out will sometimes remove a runner from the bases (like a GIDP), and in hindsight, I should have split up the out event into "out 1", "out 2", "out 3" to denote the type of out. In any case, the "getting on" value is a -.01. For the "moving runners over", while the out will sometimes move a runner from 2b to 3b, most of the time, the out will reduce the chances of the existing runners from scoring. That effect, which is probably the most complex calculation we have, is -.10 runs. The "inning killer" value has been discussed with the CS, and it is -.16 runs. The total comes in at -.27 runs.
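As a sanity check on the "theoretical breakup" chart, the three pieces of each event should add back up to its total. A small sketch, with the chart's numbers typed in by hand and "---" entered as 0 (the names are my own):

```python
# event: (total, getting on, moving over, inning killer), from the breakup chart.
BREAKUP = {
    "single":  (0.46, 0.25, 0.21, 0.0),
    "double":  (0.75, 0.41, 0.34, 0.0),
    "triple":  (1.03, 0.61, 0.42, 0.0),
    "homerun": (1.40, 1.00, 0.40, 0.0),
    "walk":    (0.30, 0.24, 0.06, 0.0),
    "steal":   (0.19, 0.0, 0.19, 0.0),
    "CS":      (-0.44, -0.26, -0.02, -0.16),
    "out":     (-0.27, -0.01, -0.10, -0.16),
}

# Gap between the listed total and the sum of its parts, per event.
residuals = {
    event: round(total - (on + over + killer), 2)
    for event, (total, on, over, killer) in BREAKUP.items()
}

print(residuals)
```

Every residual comes out to zero at two decimals, so the chart is internally consistent.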
Important note to remember: all of these values apply ONLY to the 4.3 RPG environment of 1974-1990. Every run environment (whether at an era, league, team, or batting spot level) will have its own run values for the hitting events. "
>>>>>>>> END EXCERPT
If you read this far, great. I would guess that people who refuse to believe in reality did not read this far, or they will just summarily dismiss it and stick to faulty truths so that they may continue to live in the "Matrix" of baseball reality.
>>>>>EXCERPTS..
"Explaining Linear Weights
Linear Weights is a powerful tool that assigns a run value for every batting event. A single is given a run value of about 0.5 runs, while a HR is given a run value of 1.4 runs. How do we know those are true?
There are 24 base-out situations that a batter faces (8 different combinations of men on base, and the 3 outs). Each of those 24 situations has a particular Run Expectancy (RE). For example, at the start of each inning, the average team will score on average 0.56 runs. This is simple enough to figure. If the average team scores 5 runs per 9 inning game, then the average R/I is 5/9 or 0.56. To fill out the rest of the matrix, you need play-by-play data, or a simulator. For example, with the bases loaded and 0 outs, the average team will score 2.4 runs from that point on, to the end of the inning.
Every Plate Appearance (PA) has a start state and end state. That is, before the PA, the batter is facing one of the 24 base-out states, and after the PA is over, there is (possibly) a new base-out state. Since each base-out state has its own RE, the difference in RE (plus any run scored) is the run impact of that batting event that caused that change in state.
For example, with a man on 2B, and 0 outs, the RE for that situation (the start state) is 1.2 runs . If the batter hits a double, the RE for the end state is of course 1.2 runs. As well, a run scored. So, the run impact of this particular batting event is 1.0 runs. If you have a man on 1B with 1 out, the RE is 0.57 runs. A double-play brings us to the end of the inning, and an RE of 0 runs. The double-play in this case is worth -.57 runs.
Bringing Linear Weights together
You can create a run impact for each of the 24 states, for each of the batting events. To make matters more manageable I will consider only 8 states (the men on base states). Here is the Linear Weights value by Men On Base (mob-LWTS).
MOB..1b...2b..3b...hr.....bb....k.....out
--- 0.29 0.49 0.68 1.00 0.29 -0.20 -0.20
x-- 0.49 0.97 1.36 1.74 0.43 -0.32 -0.36
-x- 0.72 1.00 1.16 1.60 0.23 -0.39 -0.34
--x 0.72 0.86 1.00 1.51 0.21 -0.48 -0.29
xx- 0.93 1.54 1.94 2.38 0.56 -0.52 -0.48
x-x 0.88 0.93 1.77 2.22 0.38 -0.61 -0.46
-xx 1.17 1.46 1.62 2.07 0.23 -0.70 -0.56
xxx 1.38 2.00 2.40 2.86 1.00 -0.82 -0.68
ROB 0.73 1.14 1.49 1.92 0.42 -0.44 -0.42----This is the average of all base states per event. A quick and simple way.
avg 0.49 0.79 1.06 1.42 0.35 -0.31 -0.30--------Use these number if NO On baes info is avalable. It is the average of all states.
A little explanation is in order. The "avg" line matches the standard Linear Weights components that is in the mainstream. It is based on the weighted average of the 8 MOB states. Since the most "popular" state is the no-men-on-base, those numbers are weighted more. The "ROB" is the weighted average of the 7 states with a runner on (everything except the first line). This is useful since most of the time we'll only have player stats with "no one on" or "men on base".
The "out" column is all non-K outs. The "MOB" column is the base that is occupied. "xx-" means runners on 1b and 2b.
Interesting results of mob-LWTS
As you can guess, the more runners on base, the more the run impact of every batting event. There are of course some exceptions. Let's look at the strikeouts and the other outs. With no one on, both out values are worth the same (obviously). However, with a man on 3B, the K costs the offense .48 runs, while other outs only cost .29 runs. In fact the other out costs less with a runner on 3b than a runner on 2b. This makes sense since often an out can score the runner from 3b (and less than 2 outs).
If we compare the HR and Triples, we see that the difference with MOB is always around 0.4-something runs. This again makes sense since we expect about 60% of runners from 3B to score. So, the "value of 3b to home plate" is 0.40 runs.
You'll notice also that walks have the least value when no one is on 1B, as again, you would expect. (Note: you will notice that the value of the walk is .29 runs with no one on base, but .21 runs with a man on 3b. They should of course have the same run value, except that there are far more walks with 1 and 2 outs and men on 3b. This chart is weighted by outs as well. If I presented the 24-state chart, you would not see any difference.)
You'll notice that the double is worth far more than the single when we have a runner on 1B. Again, this makes sense, since the most valuable base is the one from 3B to HP, and a single will not be able to score a runner from 1B. The HR is least valuable when you have a runner on 3B because there's already a good chance that that runner will score.
Putting mob-LWTS into practice
So, now we know how much each hitting event is worth in each of the 8 men on base situation. The reason that this becomes important is that not every batter faces those situations as often as the league average. A leadoff hitter will have far more "no men on" situations than a #3 hitter. So, the value of each hitting event changes based on these 8 situations. (Again, 24 is the correct number, but I have not seen anyone publish splits by these 24 states, and so, I will restrict myself to the 8 supersets.)">>>>>END EXCERPT.
That is the shortest explanation. Call me crazy, but I think it is a little more accurate than some 'fan' just estimating what he 'thinks' the composite value of a BA, TB, SLG% etc... is. You think? When some schmo tells you that so and so is just KILLING his team because of striking out(as opposed to tapping out), just point him to the ACTUAL data and how much it REALLY matters, instead of just GUESSING.
You can stop here, or read further. Please note, that the chart above does not include the value for number of OUT situations, those are on another chart, but this is going to be long enough.
FOR MORE DETAIL READ BELOW......
>>>>"Runners on, and runners over
Runs are created by getting runners on base, and moving them over. Most people look at the OBA as the first part, and the SLG as the second part. And the combination of the two should generate runs scored. It's not that simple.
Using common sense on an uncommon example
Have you ever played in a softball league where the typical team scores upwards of 20 runs in a 7-inning game? Such a team will send say 51 batters to the plate, 21 of which will be out, and 20 of the other 30 batters will score. If a runner that gets on base will score two-thirds of the time, how much more valuable is the home run compared to a single? In softball, because the run environment is so high, it is important just to be able to get on base, because you know that you have a good chance at scoring.
Take an even more extreme example. Imagine playing in a run environment with 100+ runs scored in every game. 90% of the baserunners end up scoring. In this environment, there is very little difference between a home run and a single. Just by virtue of getting on base, you are bound to score. There is far more value in getting on base, than of moving runners over.
Take an extreme example the other way. Pedro Martinez provides his opposition with a very low run environment. Getting on base is not enough. Very few of those runners will end up scoring. However, if you can hit a home run, you will be adding alot of run potential to the runners on base. Not only that, but when you hit a home run, you are always guaranteed 1 run.
Don't like common sense? Let's try some math
Based on my analysis of the play-by-play data from 1974 to 1990 provided by Retrosheet and software provided by Ray Kerby, here is the likelihood of a runner scoring, based on which base he is on, and the number of outs
Chance of scoring, from each base/out state
........0 outs 1 out 2 outs
1B .38 .25 .12
2B .61 .41 .21
3B .86 .68 .29
This simply means that if you have a runner at 3B with 0 outs, then he has an 86% chance of scoring (that's the "getting on base" value). If someone can drive him in, that batter will add .14 runs (to the already established .86 for a total of 1.00; this is the "driving him in" value).
As you can see the higher the chances of scoring, the less value there is in driving in a runner. Let's look at the run-driving value of the walk.
Run Driving value of the walk, from each base/out state
.........0 outs 1 out 2 outs
1B (to 2B) +.23 +.16 +.09
2B (to 3B) +.25 +.27 +.08
3B (to home) +.14 +.32 +.71
Obviously, the most valuable walk in terms of moving runners over is the walk that scores the runner from 3B and 2 outs. This happens rarely, as the bases would be loaded. So, on top of this table, we need a "frequency" table that shows how often a walk occurs in each of the above scenarios.
Frequency of walk in moving runners over, from each base/out state
........0 outs 1 out 2 outs
1B (to 2B) 0.053 0.084 0.114
2B (to 3B) 0.012 0.027 0.042
3B (to home) 0.002 0.006 0.010
As you can see, walks are not given out in random fashion. A large portion of them occur with 2 outs, when they do the least damage. Multiplying these two tables will give you the "moving over" value of the walk. This works out to +.06 runs.
The "getting on" value of the walk can be determined using the "chance of scoring" table presented above, with the appropriate frequency at which walks occurs in those states.
Frequency of walk occuring, by outs
0 outs 1 out 2 outs
0.316 0.326 0.359
Doing a similar multiplication, and we see that the "getting on" value of the walk works out to +.24 runs. The run value of the walk is therefore equal to +.30 runs.
We could have performed this analysis in several other ways, each of which would yield the same result of +.30 runs. One is to look at the run expectancy (RE) before the walk, the RE after the walk, take the difference, add the number of runs that score, and you get the run value of the walk. Doing that and we get a run value of +.30 runs. Another way is to construct a simulator, insert a walk, and look at the difference. You will find that given the run environment of 1974-1990 you will get a run value of +.30 runs.
The important point to remember is that the run value of all the hitting events is dependent on the run environment. The walk is worth more today than in 1968. It is worth more in Coors than at the Astrodome.
Using the RE approach, here is the run values of all offensive events.
Run values, 1974-1990, using the RE approach
Single Double Triple HR Walk IBB HBP Reached Base On Error Interference OtherSafe
0.460 0.750 1.033 1.402 0.303 0.176 0.330 0.478 0.357 0.631
Sac Strikeout Out
(0.090) (0.269) (0.265)
SB CS Pickoff Pickoff Error Balk PB WP DefensiveIndiff OtherAdvance
0.193 (0.437) (0.228) (0.182) 0.250 0.276 0.278 0.132 (0.362)
Another Interlude
Remember what the run values represent. They represent the MARGINAL effect of the offensive events GIVEN a specific environment. Remember that. Repeat that.
If you get an out in a run environment that scores 3 runs per INNING, that is very costly to your team. It has a negative effect only because of the expectations of future runs. The out is not very costly when Bob Gibson's 1.12 ERA is on the mound, simply because the expectation is low that a run would be scored at all. -.27 runs doesn't mean that you will score negative runs, but rather that your team's run potential has been decreased by .27, GIVEN the environment in which the out was created.
I will talk more about how to understand the out in the frame of reference of Runs Created and Linear Weights in my next article. And I will apply David Smyth's BaseRuns, a constructor that models reality in almost all run environments.">>>>>>>> END EXCERPT
You see that chart directly above, that is an example of ALL OFFENSIVE EVENTS, that got some panties in a bunch. Though that isn't the situational chart. MORE ARTICLE BELOW.....
<<<<<Actual | "Theoretical breakup"
|
Event Total |Getting on Moving over Inning killer
--------------------------------------------------------------
single .46 | .25 .21 ---
double .75 | .41 .34 ---
triple 1.03 | .61 .42 ---
homerun 1.40 | 1.00 .40 ---
walk .30 | .24 .06 ---
steal .19 | --- .19 ---
CS -.44 | -.26 -.02 -.16
out -.27 | -.01 -.10 -.16
"The building blocks of run creation
Let's take these one at a time. We know that a random single will add an average of +.46 runs to a game. We also know that about 25% of the time, a player who reaches first on a single (or is replaced there by a pinch runner or a force play) will score. That leaves .21 of run potential that a single adds to the runners on base. (This can also be calculated exactly, as we have shown with the walk. It's a much more cumbersome calculation to be sure.) The double follows a similar pattern. The triple and HR are interesting in their differences. While we know for certain the getting on value of a HR (1 run), why would the "moving runners over" value of the HR and triple be different? Again, this relates to these values representing the weighted average of their moving over value. While they have the exact same values in each of the 24 base-out states, because the average HR occurs slightly more frequently with bases empty than the triple does, the triple occurs more often with men on base. The walk was discussed in the previous article.
The stolen base (and all of its brothers like balks, passed balls, etc) are interesting in that they have no "getting on" value. Their value lies entirely in moving runners over.
Outs and caught stealing
Here is where it gets interesting. The out. Let's study the caught stealing. Three very important things happen when a runner is caught stealing: (1) a runner is removed from the base path, (2) the run potential of all existing runners is reduced (i.e., a runner on 3B has a much greater chance of scoring with 1 out rather than 2 outs, which is what would happen if a runner is caught with 1 out), and (3) it gives the team 2 outs instead of 3, thereby reducing the potential number of future batters coming to the plate. Let's take these things one at a time.
A runner prior to attempting to steal, will have a 26% chance of scoring. If he is unsuccessful, his chances of scoring are reduced to zero. Therefore, he erases a runner from the bases (the "getting on" part) and that costs his team -.26 runs. Sometimes, a runner steals when there are other runners on base. A runner at 3B with 1 out has an excellent chance of scoring. With 2 outs, his chances of scoring are reduced greatly. While the impact of this is high, the frequency at which a CS occurs with runners on base is low, hence the small -.02 run value impact of "moving runners over".
The last part is the reduction in run potential of all future batters. Since the average game in 1974-1990 had 4.3 RPG, this means that each inning produced .48 runs per inning. Therefore, each out reduces the run potential of all future batters by .16 runs. That's the "inning killer" value. Add up all three values, and we get -.44 runs. What is very important to realize is that -.44 value can be calculated in 2 independent ways (through a simulator, or looking at the difference in run expectancy). We now have a third independent way (loosely related to the second way). We now have the building blocks of run creation.
A regular out will sometimes remove a runner from the bases (like a GIDP), and in hindsight, I should have split up the out event into "out 1", "out 2", "out 3" to denote the type of out. In any case, the "getting on" value is a -.01. For the "moving runners over", while the out will sometimes move a runner from 2b to 3b, most of the time, the out will reduce the chances of the existing runners from scoring. That effect, which is probably the most complex calculation we have, is -.10 runs. The "inning killer" value has been discussed with the CS, and it is -.16 runs. The total comes in at -.27 runs.
Important note to remember: all of these values apply ONLY to the 4.3 RPG environment of 1974-1990. Every run environment (whether at an era, league, team, or batting spot level) will have its own run values for the hitting events. "
>>>>>>>> END EXCERPT
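The three-part breakdown in the excerpt above ("getting on", "moving runners over", "inning killer") can be checked with a few lines of Python. This is only a sketch of the arithmetic: the component values are copied from the excerpt, the function and variable names are made up for illustration, and the "inning killer" figure is simply recomputed from 4.3 RPG.

```python
# Component values quoted in the excerpt (1974-1990, 4.3 runs per game).
RUNS_PER_GAME = 4.3

def inning_killer(runs_per_game: float) -> float:
    """Run potential lost per out: runs per inning spread over 3 outs."""
    runs_per_inning = runs_per_game / 9      # about .48
    return -runs_per_inning / 3              # about -.16

def event_value(getting_on: float, moving_runners: float, inning_kill: float) -> float:
    """Total run value of an event as the sum of its three parts."""
    return getting_on + moving_runners + inning_kill

killer = round(inning_killer(RUNS_PER_GAME), 2)

# Caught stealing: runner erased, small hit to other runners, one out used up.
caught_stealing = event_value(-0.26, -0.02, killer)

# Regular out: rarely erases a runner, but hurts the runners already on base.
regular_out = event_value(-0.01, -0.10, killer)

print(round(caught_stealing, 2), round(regular_out, 2))
```

The totals match the -.44 and -.27 given in the excerpt. For a different run environment the inputs would all change, as the excerpt notes.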
IF you read this far, great. I would guess that people who refuse to believe in reality did not read this far, or they will just summarily dismiss it and stick to faulty truths so that they may continue to live in the "Matrix" of baseball reality.
But I do have one question for you........Are you not the biggest baseball buff going?
Outs    0     1     2
1B     .38   .25   .12
2B     .61   .41   .21
3B     .86   .68   .29
First, I'm not even close to the biggest baseball statistical buff at all. The authors of the book "The Book: Playing the Percentages in Baseball" have to be at the top right now. I'm simply a guy who loved to play baseball, who followed it more than most, happened to remember the players and results to a high degree, can apply a good degree of logic to findings, and then I simply poked and probed and debated to come to greater realizations.
Cliff Notes...
Situation     1B     2B     3B     HR     BB      k     out
ROB          0.73   1.14   1.49   1.92   0.42  -0.44  -0.42
Nobody on    0.29   0.49   0.68   1.00   0.29  -0.20  -0.20
This is all you really need for a modern player's hitting. You can find his men on hitting on ESPN.com. All you do is simply multiply each event he did by its value and then add them up.
For the ROB, multiply all singles x .73, all doubles x 1.14, 3B x 1.49, HR x 1.92, BB x .42, k x -.44, Other out x -.42...then add them together. Do the same thing for each event with the nobody on category, and add the two figures together.
The more detailed numbers that I posted (in all the players' values) go into greater detail and examine each base situation and out situation. This ROB category simply averages each base situation for each event. It is more general, but the two usually end up extremely close together. Since it is very easy to find ROB (men on) totals for any hitter, as opposed to totals for each situation, it is simpler.
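Here is a rough sketch of that multiply-and-add step in Python. The weights are the ones from the Cliff Notes table; the hitter's event counts are made up purely for illustration, and the names (WEIGHTS, batter_runs, sample) are hypothetical, not from any real stats package.

```python
# Run values from the Cliff Notes table: runners on base (ROB) vs. nobody on.
WEIGHTS = {
    "rob":   {"1B": 0.73, "2B": 1.14, "3B": 1.49, "HR": 1.92,
              "BB": 0.42, "K": -0.44, "OUT": -0.42},
    "empty": {"1B": 0.29, "2B": 0.49, "3B": 0.68, "HR": 1.00,
              "BB": 0.29, "K": -0.20, "OUT": -0.20},
}

def batter_runs(counts: dict) -> float:
    """Multiply each event count by its run value and add everything up."""
    total = 0.0
    for situation, events in counts.items():
        for event, n in events.items():
            total += WEIGHTS[situation][event] * n
    return total

# Made-up counts for a handful of plate appearances (not a real player).
sample = {
    "rob":   {"1B": 2, "HR": 1, "K": 1},
    "empty": {"1B": 1, "OUT": 2},
}

print(round(batter_runs(sample), 2))
```

With a real hitter you would fill in his full-season men-on and bases-empty totals in place of the sample counts.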
_______IMPORTANT: THIS IS HOW_____________
This is the way you get those figures. When a batter comes up with a man on second and one out, the average run expectancy is that you will score X amount of runs. If you make an out, you lower the chances of scoring a run. If you get a hit, you raise the chances. It is pretty logical and simple when you look at it like that. I think any fan would agree to that statement. The question is then a matter of HOW MUCH do you lower it, and HOW MUCH do you raise it? How much does a single advance the chances of scoring, as opposed to a double? Then how great of a chance does that single have of scoring himself, compared to the double of scoring himself? Part of that is in this chart.....
Outs    0     1     2
1B     .38   .25   .12
2B     .61   .41   .21
3B     .86   .68   .29
If you hit a single with nobody out, then you have a .38 (38 percent) chance of scoring, a double .61, a triple .86. Notice with two outs that the chances of scoring are way down for the obvious reason. So it isn't a mystery when fans try and compare the value of a singles hitter compared to a HR hitter. It used to be that fans thought more hits simply meant better, but as you can see in the singles chart, a LOT of singles don't lead to a hill of beans in terms of runs. It is almost a futile event with two outs.
This is done with every possible base/out situation and offensive event. Then the same thing is done with getting the runner in. A single has X amount of chance to drive the man from second base in. A double obviously has a greater chance to do that. Again, they are all figured.
Realize that this levels the playing field of the ability of players behind you. For instance, let's say two players each get a single with nobody out, ONE GUY IS STRANDED BY HIS TEAMMATES...the other gets driven in. In a traditional measure of RUNS scored, only one of those guys is getting credited for doing something positive, while the other guy is getting penalized because his teammates are no good. IN BR, they get the same value attached, because they did the same thing, BECAUSE GIVEN EQUAL TEAMS AND EQUAL LINEUPS, they should have scored 38 percent of the time. If one guy goes all year scoring ten percent of the time, and another one by good fortune scores 50 percent of the time, AND THEY HIT THE EXACT SAME AMOUNT OF SINGLES, THEN THEY ARE OF EQUAL ABILITY! Of course, if you check their baserunning and notice that it is their speed that is getting extra bases, then they should get credit for that.
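To put numbers on the stranded-runner example, here is a small Python sketch. The scoring chances come from the chart above; the two players, their ten singles each, and their actual runs scored are invented for illustration, and the names SCORE_PROB and expected_runs_scored are hypothetical.

```python
# Chance of scoring after reaching each base, indexed by outs (0, 1, 2).
SCORE_PROB = {
    "1B": (0.38, 0.25, 0.12),
    "2B": (0.61, 0.41, 0.21),
    "3B": (0.86, 0.68, 0.29),
}

def expected_runs_scored(times_on_base: list) -> float:
    """Sum the league-average scoring chance for each (base, outs) reached."""
    return sum(SCORE_PROB[base][outs] for base, outs in times_on_base)

# Two made-up players: ten singles each, all with nobody out.
player_a = [("1B", 0)] * 10   # actually scored once (bad teammates)
player_b = [("1B", 0)] * 10   # actually scored five times (good teammates)

credit_a = expected_runs_scored(player_a)
credit_b = expected_runs_scored(player_b)
print(round(credit_a, 2), round(credit_b, 2))
```

Both players get the same credit, about 3.8 expected runs, no matter how often their teammates actually drove them in.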
OUTS MADE. It amazes me that people have scoffed at DALLASACTUARY for bringing up the point on how many outs a batter makes. Let's use Jim Rice as an example. Someone brought up the number of times he led in certain offensive events, but they neglected to mention that he ALSO LED THE LEAGUE IN OUTS MADE THREE TIMES TOO! Looking at the big chart in the first post, notice the negative impact that an out made has in certain situations....
An out made with 1st and 2nd is a negative impact of -.46 runs. Remember, this isn't a guess, it is a result of looking at every single 1st and 2nd situation from over 40 seasons! So every time he makes an out, he is costing his team a chance to score runs...the more outs made, the more you cost. That event basically wipes out any value of one of his fly ball doubles with nobody on. And remember, Rice was very good at making TWO outs with a single at bat.
Now if you look at Eddie Murray vs. Jim Rice and you plug every one of their offensive events into the main chart, you will see how much they increased the team's chance of scoring, and how much they decreased it. REMEMBER, this isn't some fly by night guess work! What you then find are the situational batter runs that were posted in other posts. But you have to adjust for ballpark too!
KEY! Remember, the figures you get are a representation of runs over THE AVERAGE! A player who scores negative overall isn't changing the scoreboard backwards! He is preventing his team from scoring runs that an average player would be creating in his stead. IF Jim Rice is scoring NEGATIVE 10 runs in his last season, then that is simply ten runs below what a league average level hitter would be creating in his stead. If another player is +10 runs, then he is producing ten runs over what the league average would be in his stead, and 20 more than Rice.
Hitting wise, there really isn't any other factor to consider. Why would you still look at a simple batting average to make evaluations when you can see EXACTLY how a player is impacting the team? A good evaluator can look at the accumulation of all the traditional stats and make a pretty good estimate where he will fall among the Rolls Royce of measurements. But most fans and writers aren't as equipped to see this, so they often make gross misjudgements, based on old misconceptions and old measurements. These misconceptions are totally blown out of the water now that detailed information has been made available.
A player's true offensive value is VERY clear now. Defense? That is another story.
Tough to respond while watching the Bears-Rams game, but nevertheless...
Thanks for trying to provide a formula, for a stat WHICH MEASURES EVERY SINGLE OFFENSIVE EVENT IMAGINABLE, this must be that particular one, is it not?
It might be nice if you could prove the validity of some of the given percentages of certain frequency of events. You do state "based on MY analysis of Retrosheet data .... "
A quicker to calculate, or step by step, process might be a little easier to believe; however, the more complex and lengthy a dissertation is, the more one will be inclined to feel that so wordy and statistics-laden a piece must have some merit. All one must really do is read your sig line, a self-proclaimed "expert fantasy player and analyst." Such modesty is refreshing.
You state concepts like a homer with a man on 3rd is the least valuable, well in REALITY, not fantasy, the runs count equally, and runs are the currency of the game, and they do reflect in the final result in a major league baseball game with just as much value as if the runner was on 1st or 2nd, and they count or are worth the same value to the result of the game whether there are no outs two out, or one out.
You state a homer with bases empty is more frequent than a triple with bases empty. Well really now, a home run in any base runner situation is more frequent than a triple nowadays!
There is mention of top pitchers like Pedro M. and Bob Gibson, but where is the opposing pitcher factored into the formula ? I may have missed the calculation point.
Is there any measure of "clutch hitting"? A guy who gets a hit or drives in a run in the late innings of a close game may be a more valuable hitter than one who excels in the early part of one-sided games. Where do game winning RBIs place in BRs? Same factor as one in the fifth inning of a lop-sided game?? Do they impact the game EXACTLY the same, as you do claim BR shows "EXACTLY how a player is impacting the team"???
Although not posted in this particular thread, I must mention it to show why many cannot quickly agree with some things you believe: you stated batting average is at the bottom of some mythical pyramid which consists of every single imaginable offensive event, thus you feel times reached by error, singles, walks or any other imaginable offensive event is more accurate in evaluating a hitter than his batting average!! This is just cause for some doubt in your views.
No single stat is perfect or all inclusive, but to completely dismiss the value of some is a little harsh. It does appear BRs may be a decent statistic for evaluating a hitter, as many others are as well. Your use of terms such as "faulty truths, gross misjudgements, misconceptions," and others, do convey an attitude of smugness, to be polite, and do little to enhance acceptance of your discovery of old traditional stats weighted and adjusted via probability percentages.
The very best hitters do shine in traditional stats, they lead the league many times, they probably do well in BRs.
In 1942 Ted Williams won the triple crown of batting which uses the traditional stats of HRs, RBIs, and BA, however Joe Gordon won the MVP that year, perhaps he had a better BR ??
Again, the validity of the data is available via Retrosheet. If you want to dispute the percentages, you have access to every game log that others do. Go ahead and show that it is wrong. This is all published stuff and has been cross checked to death. It has been scrutinized to the highest degree already, and by some very bright people, not some guy who has no clue. Calling me smug? I'm not being smug. I can see why they say it is always tougher to debate an ignorant person rather than an informed one. Ignorance isn't an insult, just a level of knowledge regarding something.
Forget being pleasant, here is an example of just stupid stuff you are saying...
"In 1942 Ted Williams won the triple crown of batting which uses the traditional stats of HRs, RBIs, and BA, however Joe Gordon won the MVP that year, perhaps he had a better BR ??" -jaxxr
Do you have a freaking point that makes sense? What is your point with that statement? Is it that writers are stupid? Is it that writers discount the triple crown stats, thus not giving Ted Williams MVP? The point I see you are 'attempting' to make is that perhaps Joe Gordon won the MVP because he had better 'BR' than Ted Williams, thus the ignoring of Ted Williams' Triple Crown stats. Ted Williams was much higher than Gordon in more accurate measures. The fact that he didn't win the MVP has absolutely NOTHING to do with what we are talking about. It has much more to do with the writers' voting techniques. This is basically a dumb statement to support a flawed position you are trying to support.
What the heck does that even show? If ANYTHING IT SHOWS THE OPPOSITE OF WHAT YOU ARE TRYING TO PROVE!! The BR shows the better hitter was Ted Williams....AND THAT MORE INFORMED AND ACCURATE ANALYSIS WOULD HAVE GIVEN THE AWARD TO THE CORRECT PLAYER. AND THIS IS ONE OF THE REASONS WHY ONE SHOULD NOT USE VAGUE MEASUREMENTS WHEN PRECISE ONES ARE AVAILABLE!
MY Goodness.
Yeah, ok, I'm being 'smug' and high and mighty. I will call myself that for you. I deserve to be called that after this post, but I will gladly take that honor in light of the just plain stupid stuff I had to wade through the last few days...and the elementary logic that is seemingly lost. You are so darn hung up on "EVERY OFFENSIVE EVENT IMAGINABLE." For Pete's sake, do you want me to make a retraction and say, "No, it doesn't cover every offensive event imaginable," and instead say, "It covers every IMPORTANT offensive occurrence...ones that are important enough to make more than a negligible difference in the evaluating of players"? I will!! I was inaccurate in saying EVERY OFFENSIVE EVENT IMAGINABLE, and I change that statement to what I just wrote. There, does that make you feel better? Good lord, argue points that actually mean something. Logic goes a long way.
P.S. Bags, I hope you don't find it offensive that I shortened your screen name in my reply.
Thanks for finally admitting that you were incorrect in stating there was one single stat which measures every single offensive event imaginable.
We all make some errors in our judgements and it is noble to say so.
There are several other items I found questionable; however, if you can also own up to the incorrectness of just one more, I will feel you are open-minded and probably a reasonable fellow after all. You stated elsewhere that one's batting average is at the bottom of every single offensive event imaginable as an indication of a batter's worth. Can you admit that one's batting average, whether over a full career or just a full season, is a better stat than singles, times reached by error, and several other offensive events??
My comment about Ted Williams league leading BA, HRs, and RBIs in one season, does show how often even the most basic stats are sufficient to evaluate a hitter. In some cases, many many other stats are needed. It was mainly a joke, re sportswriters and baseball experts, they too make mistakes. Sorry you didn't get it.
Batting average is at the bottom of the offensive pyramid that includes at the top, the precise measurements, followed by OPS, SLG, OB, and then batting average. Total bases is above it somewhere. Singles? Reached on error? Boy, I don't ever recall saying or even thinking it is less useful than those. Those events are accounted for, but are NOT more telling than a batting average.
A career batting average certainly tells a story. In normal hitting eras, a guy who has a career .300 is almost certainly not a below average hitter. Even the common man knows that a .300 hitter is doing something right. The whole point is that when you know measurements that are much more precise, you really don't need to use it, as the components that make it up are already being accounted for, and then some!
I'm not always reasonable, but I try and stay objective.
I have to correct you on one thing...since the very first time I posted the situational batter runs, I have always said that the biggest flaw is in giving credit/lack of credit to the batter based on the base advancing of the runners ahead of them. That simply is not fair...but the difference between hitters is going to be small enough where I wouldn't bother looking at it unless guys were neck and neck. The super minor things that aren't measured really don't matter, as they may make the difference of 1/5 of a run for a season or something. So why bother.
Heck, based on the play by play data, there are guys that literally FIGHT tooth and nail to make sure the values are perfect. They will argue for days on whether the true value for an out is -.26 or -.27. They will do the same for the negative value of a caught stealing. That is how precise those events are.
I know there is a lot of info, but there is nothing better than knowing exactly how many times a guy will score after hitting a single. People used to guess, and now you don't have to. The same is done with ALL types of hits or ways of reaching base.
Hey, the info is there, you can choose to look at it, or you can choose to do it the way you wish. What is the difference? One is going to give you a strong general outline, and the other will be extremely precise. It isn't wrong to look at it the way you do, but it doesn't make much sense to use the strong general outline to try and knock off the proven precise one...that's all I'm getting at.
I have to correct you on one thing....
you say and I quote "there is nothing better than knowing exactly how many times a guy will score after hitting a single " That my friend is pure fantasy. We may know exactly how one did perform in the past, but how many times any event will occur in the future, that's for crystal balls and palm-readers.
I am sure there are many variances in the past's figures as to the future scoring likelihood as well: how many outs, who's next in the batting order, was a pinch-runner with more speed substituted, who's pitching, what's the score, what inning is it, and several others. One can NEVER know for sure what a hitter WILL do in any situation, only the potential likelihood.
You state you did not say batting average was at the bottom of all things which measure every single offensive event imaginable, just of one particular level of said events. Perhaps I misunderstood, sorry. Some of your all-encompassing terms and phrases do make it difficult.
BR is another measure of a hitter's worth; in most cases it will confirm, expand, amplify, or enhance other stats or combinations of stats. The one thing it cannot do, nor can any other stat, is measure every single offensive event imaginable. I believe no one has dismissed it, called it the Devil's work, or said it was totally incorrect.
May I suggest that anyone who might feel such elaborate detail, while useful, is merely not always needed, in many / most cases to evaluate a hitter,.... is most probably not a schlub, nor a moron, nor ignorant, nor misinformed.
Skin ??