Paved with Good Intentions

On Sunday, November 8, with 3:00 remaining in the fourth quarter, the Atlanta Falcons, trailing 17-13 with two remaining timeouts, faced fourth down and goal from the San Francisco 49ers’ two yard line. Rather than use their remaining down to attempt to score a touchdown and take the lead, the Falcons elected to take an essentially guaranteed three points by sending out Matt Bryant to kick a 19-yard field goal. From a variety of perspectives, ranging from intuitive (four is greater than three, after all) to quite rigorous, this decision was the wrong one, resulting in a significant reduction in the Falcons’ chance of winning the game. Here we attempt to address some questions:

  • To what extent was this decision counterproductive? More specifically, what are our best approximations for the Falcons’ two distinct win probabilities, respectively conditioned on their two available options? We utilize a quick, heuristic approach as well as an advanced model.
  • What factors contribute to this and similar errors? We focus on the persistence of traditionalist, “conservative” analytical fallacies.
  • As a fun thought experiment, can we approximate the maximum degree to which the aforementioned fallacies can negatively impact win probability, in any sport? In other words, what is the worst genuinely conceivable single analytical decision that a well-meaning party could make under the pretense of attempting to maximize win probability? In particular, does the example at hand approach this extreme?
  • In an era and climate where the competitive and financial stakes of high level sporting events are so high, and so much emphasis is placed on advanced analytics in other facets of operations, most notably personnel decisions, how and why do errors like this continue to be made with such frequency?

Disclaimer: This is not a hit piece. While this particular decision was presumably made by the Falcons’ head coach, with advice from coordinators and other trusted assistants, we choose not to specify names or assign blame, and will hereafter refer to the decision as made simply by the team as a whole. Even as we expand our discussion to other scenarios and other sports, it is not our goal to assert who is or is not good at their job, or who is or is not worthy of lucrative employment, a devolution that is rampant in such discussions. This analysis comes from a perspective not of frustration or criticism, but rather of genuine academic interest.

Crunching the Numbers: An Elementary Heuristic

Before we expand our gaze, let’s get down to business. How could one, with only an intermediate knowledge of football, estimate the two relevant win probabilities in real time? For this rough heuristic, we do not take into consideration the specific teams or personnel, so this is more like a discussion of what the correct decision is in this generic situation. We begin by defining the following parameters:

p = the probability of the Falcons scoring a touchdown on fourth and goal from the 2, should they attempt to do so

q = the probability of the 49ers regaining the lead and winning should the Falcons score a touchdown

r = the probability of the Falcons regaining possession and scoring a touchdown should they attempt and fail to do so initially

s = the probability of the Falcons regaining possession and scoring a field goal or touchdown should they choose to kick a field goal initially

We note that there is no defined parameter for the probability of the success of the field goal attempt, as that is an effective certainty. A field goal attempt snapped from the two yard line is equivalent to the traditional (pre-2015) extra point kick, and for some context, Matt Bryant was 222-222 on such kicks in his previous six seasons with the Falcons. A quick consideration reveals that the two relevant win probabilities are roughly P = p(1-q)+(1-p)r and s, as there would most likely be insufficient time for meaningful possessions beyond those indicated.

The most easily estimated value is p. In the NFL and NCAA, two point conversion tries are snapped from the two yard line, a spot determined to approximately balance the play’s expected value with the essentially guaranteed extra point kick. In practice, the conversion rate is slightly below 50%, but for our purposes that figure serves as a convenient and sufficient approximation. This yields the formula P = 0.5 + 0.5(r-q).

The individual approximations of r and q are highly nontrivial. However, given the limited time remaining, it is reasonable to assert that both probabilities are low, likely less than 0.25, and also that the probabilities are comparable to each other. To support the latter assertion, the scenario defining q features a team starting a drive with under three minutes to play, most likely from their own 20 yard line, whereas the scenario defining r features a team on defense, but in immensely favorable field position. Without getting into too much detail, the competing factors of possession and 80 yards of field position roughly counteract each other, though the limited remaining time adds additional value to possession. In summary, we assume that r-q is a relatively small, though likely negative, quantity. In particular, we arrive at a well-motivated estimate of P ranging from 0.4 to 0.5.
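For readers who prefer to see the arithmetic, the estimate can be sketched in a few lines of Python. The function name and the sample (q, r) pairs are illustrative assumptions, chosen only to respect the constraints argued above (both values at most 0.25, with r-q small and non-positive); they are not measured rates.

```python
def go_for_it_win_prob(p, q, r):
    """Heuristic win probability of going for it: P = p(1 - q) + (1 - p) r."""
    return p * (1 - q) + (1 - p) * r

p = 0.5  # approximate conversion rate from the 2 yard line, as argued above

# Hypothetical (q, r) pairs for illustration only, not measured values
for q, r in [(0.20, 0.20), (0.25, 0.15), (0.25, 0.05)]:
    P = go_for_it_win_prob(p, q, r)
    print(f"q={q:.2f}, r={r:.2f} -> P={P:.2f}")
```

With p = 0.5 the function reduces to 0.5 + 0.5(r-q), so these sample pairs span the 0.4 to 0.5 range quoted above.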

An estimate for s is more challenging. We first assume that the 49ers’ ensuing possession begins at the 20 yard line, as most NFL kickers can record touchbacks essentially at will, particularly since the kickoff placement was moved to the 35 yard line. This assumption results in, if anything, a slight overestimate for s. We note that given Atlanta’s retention of two timeouts, as well as the two minute warning, the 49ers must achieve two first downs to prevent Atlanta from ever regaining possession, and the achievement of one first down significantly limits Atlanta’s remaining time for an ensuing drive after a punt.

League wide, across all game situations, approximately 20% of all drives stall before achieving a first down. Assuming independence once a first down is achieved, this would indicate that the 49ers have approximately a 64% chance of winning the game without ever giving back the football, a 16% chance of punting with less than a minute remaining, and a 20% chance of punting with exactly two minutes remaining. However, built into these assumptions about the clock is that the 49ers will not throw any incomplete passes, which turn into de facto extra timeouts. Taking into account the 49ers’ likely conservative run-only play calling strategy, the argument could be made that the 20% figure should be significantly higher. Let’s liberally assume 40%, resulting in a distribution of probabilities of 36% game over, 24% punt with under a minute, and 40% punt with two minutes.

In the event of a punt, the Falcons must then drive the ball into field goal range, likely a distance of at least 30 yards, and successfully kick a field goal (or score a long touchdown) in the remaining time. Considering the aforementioned data and time constraints, it would be somewhat liberal to estimate the probability of achieving this goal (again, with no consideration of specific teams or personnel) as 50% with two minutes remaining and 25% with less than a minute remaining. The resulting estimate on s is (0.4)(0.5)+(0.24)(0.25)=0.26. Given the rough nature of these probability assignments, we estimate that s lies between 0.2 and 0.3.
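The estimate of s can be reproduced with a short sketch. The function names are our own, and the model is exactly the rough one described above: each set of downs stalls independently with the same probability, and the 50% and 25% scoring chances are the liberal guesses from the text rather than measured rates.

```python
def possession_outcomes(stall):
    """Split the 49ers' possession into three outcomes, assuming each
    set of downs stalls independently with probability `stall`."""
    punt_at_two_minutes = stall                # no first down at all
    punt_under_a_minute = (1 - stall) * stall  # one first down, then a stall
    game_over = (1 - stall) ** 2               # two first downs
    return game_over, punt_under_a_minute, punt_at_two_minutes

def field_goal_win_prob(stall, score_with_two_min=0.5, score_under_min=0.25):
    """Heuristic s: chance Atlanta gets the ball back and scores in time."""
    _, late_punt, early_punt = possession_outcomes(stall)
    return early_punt * score_with_two_min + late_punt * score_under_min

for stall in (0.2, 0.4):
    over, late, early = possession_outcomes(stall)
    s = field_goal_win_prob(stall)
    print(f"stall={stall:.0%}: game over {over:.0%}, late punt {late:.0%}, "
          f"early punt {early:.0%}, s={s:.2f}")
```

The 40% stall rate reproduces the 36/24/40 split and s = 0.26. The league-wide 20% rate, combined with the same scoring guesses, gives s = 0.14, which illustrates how generous the 40% assumption is to the field goal.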

The interested reader may inquire: Couldn’t the consideration of factors specific to these teams, their quality, personnel, playing style, etc., lead to an advanced analysis that supports the field goal as the optimal strategy? For example, wasn’t Blaine Gabbert playing quarterback for the 49ers? Hadn’t they barely moved the football at all in the second half?

In theory, such reversals are certainly possible, but in this particular case the differential is rather cavernous. Additionally, these refined considerations are likely to affect each win probability in a similar way. For example, the consideration of San Francisco’s offensive ineffectiveness would certainly lead to an increased likelihood of Atlanta regaining possession, and hence an increase in the value of s. However, the same consideration would also lead to a corresponding decrease in the value of q and increase in the value of r, hence an increase in P. More thoroughly, since both offensive and defensive situations are in consideration for both teams, any asserted advantage for the Falcons would result in (likely comparable) increases in both P and s, while any asserted advantage for the 49ers would result in (likely comparable) decreases in both P and s.

Bringing out the Big Guns

While the preceding section was a pleasant exploration of how a casual observer could approximate these particular win probabilities without assistance, it is a fortunate reality of the current sports analytics climate that we don’t have to rely on such rough, off the cuff calculations. In particular, a statistical model that is ideally suited to the current discussion is the Pro Football Reference (PFR) Win Probability Calculator. The formula takes as input the time remaining in the game, the current point differential for the team in possession, down, distance, and field position, as well as the game’s original point spread, and produces as output the win probability for the team in possession. A detailed explanation of the formula can be found here.

For generic teams, as assumed in our heuristic, we input 0 for the point spread, and the calculator produces the following probabilities:

If Atlanta goes for it: Atlanta wins 42.17% of the time.

If Atlanta kicks a FG (and kicks a touchback on the ensuing kickoff):  Atlanta wins 25.61% of the time.

Hey, we did pretty well!

Factoring in that the Falcons were actually 7.5 point favorites in the game, the results are below:

If Atlanta goes for it: Atlanta wins 52.55% of the time

If Atlanta kicks a FG (and kicks a touchback on the ensuing kickoff): Atlanta wins 33.33% of the time.

In both considerations, and for all point spreads in between, a team faced with Atlanta’s predicament is approximately 1.6 times as likely to win the game if they attempt to score a touchdown as opposed to kicking the short field goal. Phrased from the reciprocal perspective, the decision to kick left the Falcons about 40% less likely to win the game. (To clarify, we mean that the team sacrificed 40% of its winning outcomes, not that the difference in win probability was 40%. For example, if a decision dropped your win probability from 10% to 1%, we would say that while the difference in win probability is 9%, you are 90% less likely to win. As we continue our discussion, we will consider both ways of measuring the impact of a decision: difference in win probabilities and ratio of win probabilities.)
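The two ways of measuring a decision’s impact are easy to conflate, so here is a small sketch that computes both from the PFR figures quoted above. The function name and the scenario labels are our own.

```python
def compare(go, kick):
    """Return (difference in percentage points, ratio of win probabilities,
    portion of winning outcomes sacrificed by kicking)."""
    difference = go - kick
    ratio = go / kick
    sacrificed = 1 - kick / go
    return difference, ratio, sacrificed

scenarios = {
    "point spread 0": (42.17, 25.61),
    "Falcons favored by 7.5": (52.55, 33.33),
    "the 10% to 1% example": (10.0, 1.0),
}

for label, (go, kick) in scenarios.items():
    difference, ratio, sacrificed = compare(go, kick)
    print(f"{label}: difference {difference:.2f} points, "
          f"ratio {ratio:.2f}, sacrificed {sacrificed:.1%}")
```

The ratios for the two Falcons cases come out near 1.65 and 1.58, and the sacrificed portions near 39% and 37%, which is where the rounded figures of 1.6 times and about 40% come from.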

The Risk of the Sure Thing

If not informed by a straightforward consideration of conditional win probabilities, what could have led to such a counterproductive course of action for Atlanta?

Imagine a person is presented with the following game: he must either risk $2 to win $2 in a coin flip, or he must unconditionally give his opponent $1. Clearly, from a pure expected value perspective, the coin flip is the right choice, as in the long run his wins and losses should roughly balance out, and he would do much better than losing $1 on every round. However, if given the choice only once, he must take into account the volatility of the coin flip, and his personal utility in risking $2 for the sake of improved expected value versus the security of losing only $1. Maybe he really needs that second dollar. Long story short, even the shrewdest of statisticians could not declare his choice to surrender $1 as an objectively bad choice, as it may be the case that his personal utility function is highly intolerant of the increased variance of the coin flip. This is an example where the sacrifice of expected value for the sake of decreased variance is most certainly defensible.

However, imagine a second game, in which the player repeats the trial from the first game ten times, but instead of dollars, we just keep track of points. In each round, the player can either risk two points to win two points on a coin flip, or he can unconditionally surrender one point. After ten rounds, if the player has positive or zero points, he wins $100, but if he has a negative score, he gets nothing. It is very much still the case that for each round, the lower variance decision is to surrender a point. However, the game ultimately only has two outcomes, a win or a loss, and any “conservative” decisions made during the course of the game have no tangible benefit in defeat. Imagine a player who, in each round, resolutely declares, “I’m not much one for risk taking, I’ll just surrender a point.” After ten rounds and his inevitable defeat, he is no better off than he would have been in the remarkably unlikely event that he risked and lost every single coin flip. Sure, his score is -10 instead of a wildly unlucky -20, but that is not relevant to the conditions of the game. He is a loser, 100% of the time. He would be the worst player for this game imaginable.

For a less extreme example, suppose the player flips coins during his first seven rounds, winning four and losing three, leaving him with two points. He recognizes that 0 points is still a winning score, so he surrenders points in rounds 8 and 9, leaving the fate of his game to rest on a round 10 coin flip, a win probability of 50%. Had he instead just flipped all three coins in rounds 8-10, he would have only needed to win one of the three to ultimately win the game, a win probability of 87.5%. In other words, taking the “sure thing” for those two rounds is by far the RISKIER decision. The only conceivable benefit of surrendering points in rounds 8 and 9 was that he avoided his lowest possible score of -4 points that would result from three straight lost coin flips, but remember, avoiding a particularly bad score means absolutely nothing in the context of this game. There are only two true outcomes, and all that matters is optimizing win probability.

To clarify, surrendering the point isn’t always wrong. If the player flips coins in rounds 1-6 and wins four of them, then he stands with 4 points, and he can simply surrender points in rounds 7-10 and guarantee victory. At risk of broken record status, a low variance decision in a two outcome game is not inherently correct or incorrect. What matters, and ALL that matters, is how that decision impacts win probability.
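The win probabilities in the second game can be checked exactly with a short recursive sketch. The function name and the encoding of a plan as a string of F (flip) and S (surrender) are our own conventions for illustration.

```python
from fractions import Fraction

def win_prob(score, plan):
    """Exact probability of finishing with a non-negative score.

    `plan` is a string of remaining choices, one per round:
    'F' = flip a fair coin for +2 or -2 points, 'S' = surrender one point.
    """
    if not plan:
        return Fraction(int(score >= 0))
    choice, rest = plan[0], plan[1:]
    if choice == "S":
        return win_prob(score - 1, rest)
    return Fraction(1, 2) * (win_prob(score + 2, rest) + win_prob(score - 2, rest))

print(win_prob(0, "S" * 10))  # always surrender from the start
print(win_prob(2, "SSF"))     # surrender rounds 8 and 9, flip round 10
print(win_prob(2, "FFF"))     # flip all of rounds 8-10
print(win_prob(4, "SSSS"))    # 4 points after six rounds, then surrender
```

These four calls correspond to the four situations described above: a certain loss, 1/2, 7/8, and a certain win.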

The astute reader can likely predict where this discussion is headed. A game of football, or any other sport whose season is separated into discrete win/loss outcome events, is not like the first scenario. It would be closer to it if the standings at the end of the season were determined by total point differential, but they aren’t. A game of football is like the second scenario, ultimately a two outcome proposition. Those three points the Falcons scored with 3:00 to play in Week 9 are not deposited into some sort of account that could be in any way useful later. A great example of the opposite dynamic is professional golf, where tiered prize money is awarded to each player that makes the cut in each tournament, and it makes perfect sense for a player to make a variance-lowering decision based on personal preference at virtually any time. Football is not golf.

This flawed logic of “conservative” play rears its head most often, though not exclusively, when a team is trailing and considering a low variance decision that will leave them STILL trailing (for example: a field goal when trailing by more than three or a punt when trailing at all; we are not insinuating that these are never appropriate actions, only that these scenarios are disproportionately represented in the collection of bad decisions that teams make in reality). The explanation is simple: expected value being equal, if the variance of possible scores is lowered, then the lack of volatility makes it more likely for the lead to remain where it is. Put another way, if a team is trailing and is not in a situation where a low variance decision can gain them the lead (for example: down by 2, lining up a short field goal), then that team should actually take measures to RAISE variance. The “sure thing” is, in some sense, aptly named: the team is assuring themselves that they remain behind.

Aside from a misguided attraction to “traditional” or “old-school” strategy, a key psychological motivator for these crippling decisions appears to be a paralyzing aversion to a “fatal” outcome, an apparent preference for a slow, drawn out, more certain death over a potentially quick, yet less likely, instant execution. Even at the expense of a substantial portion of their win probability, teams often favor the route of “extending the game”. That phrase can be used to denote positive strategic techniques employed by a trailing team to maximize the utility of remaining time, by using timeouts, favoring passes to runs, getting out of bounds, etc., but that’s not how we mean it here. Rather, a trailing team faced with its mortality, perhaps in the form of a manageable fourth down situation, may be overwhelmed by the fear of the fact that if that single conversion is not made, then the game is effectively over, whereas a punt or a field goal is guaranteed to leave some conceivable path to victory, no matter how unlikely, still on the table.

Let’s go back to a coin-flipping scenario. Imagine a player is faced with two options. He can choose to flip one coin, heads he wins, tails he loses, for a clear 50% win probability. Alternatively, he can choose to flip ten coins, and he wins if at least six of the flips are heads, a win probability of 37.7%. In a pure win/loss scenario, the second option is clearly inferior, but it spares the player from staring down the barrel of an immediate determination of his fate. NFL teams take this second option a lot.
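The 37.7% figure is a binomial tail probability, and a minimal sketch (the function name is our own) confirms it:

```python
from math import comb

def at_least(k, n, p=0.5):
    """Probability of at least k heads in n independent flips with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

print(f"one flip, need one head:    {at_least(1, 1):.1%}")
print(f"ten flips, need six heads:  {at_least(6, 10):.1%}")
```

Of the 1024 equally likely sequences of ten flips, 386 contain six or more heads, and 386/1024 is approximately 37.7%.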

 

Does it Get Any Worse?

Here is a rough transcript of a conversation between me and Daniel Garver, close friend, EPA scientist, and sports analytics enthusiast, shortly after the Falcons-49ers game in question.

Daniel: I’m trying to think of comparable examples of this kind of decision in other sports…

Alex: Here’s one for basketball – my team is down by 3 with 7 seconds left, and I miraculously steal an inbounds pass and get a breakaway opportunity. Instead of pulling up for a wide open three, I go in and dunk.

Daniel: Yeah, that’s a good one. I was going to say it’s like having the bases loaded, 1 out, Miguel Cabrera at the plate, and running a squeeze play.

Alex: Exactly, except you’re forgetting some key details: It’s the bottom of the ninth and YOU’RE DOWN BY 2!

Daniel: These definitely seem like scenarios that have a greater negative impact than the Falcons play today, but we’ve already probably crossed into the realm of ‘things that would never actually happen’.

Daniel is likely correct about these examples, and some lengthy consideration should convince the reader that, at least among major team sports, the discrete nature of football and baseball leaves them more susceptible to win probability swings based purely on human decisions (as opposed to execution). Among those two, the time constraint and fourth down decision components of football contribute to an increased prevalence of the aforementioned “fatality risk” scenarios that can psychologically warp the competitors into counterproductive decision making. Putting all of these factors together, it is reasonable to think that the most extreme, genuinely conceivable examples of decision-based reductions in win probability are likely to occur in football. So what can we come up with? How bad can one decision really be?

For the remainder of this section, we will not consider specific teams or personnel, and all appeals to the PFR calculator will include a 0 point spread.

An Extremized Falcons-49ers Scenario: This entire post was inspired by the fact that the Falcons’ decision against the 49ers was in some sense perfectly ill-informed, making it inherently hard to top, even hypothetically. However, we can certainly take the core spirit of this scenario, and push some of the details to the extreme.

For example, would the Falcons’ decision-making process have been considerably different if they had been at the 1 yard line instead of the 2? Analytically speaking of course this is a big difference, but we have already established that the Falcons were clearly motivated by fallacious, “game-extending”, anti-analytic logic, and it is conceivable that this small perturbation would not swing them toward the light.

In a similar vein, what if there had been slightly less time? The Falcons likely took into account that they had the two minute warning ahead of them to supplement their two remaining timeouts. Is it possible that this perceived security blanket would have still felt sufficient if there had been, say, 2:10 remaining instead of 3:00? After all, that still leaves time for a field goal, a touchback, and a 49ers first down play before the two minute warning. Maybe the reduced time would change Atlanta’s mind, but it’s conceivable that it wouldn’t.

Here’s what the PFR calculator has to say:

Falcons down 4, 4th and goal from the SF 1, 2:10 remaining.

Go for it: win probability 50.19%

Kick a FG (and a touchback): win probability 13.53%

Win Probability Difference: 36.66%

Portion of Winning Outcomes Sacrificed: 73.04%

Note: The PFR calculator does not have a parameter for remaining timeouts, which is admittedly important for extremely late game scenarios, but not THAT important. In particular, it would still be the case that one first down for the 49ers ends the game.

Just Out of “Range”: Suppose Team A is down by two points, facing a fourth down and 1 on Team B’s 40 yard line. A potential go-ahead field goal would measure 57-58 yards, and a miss, or a failed fourth down conversion attempt, would result in a time, possession, and field position predicament that, even with a timeout or two and the two minute warning, could be perceived as nearly fatal. However, given those available time stoppages, pinning Team B deep into its own territory with a punt doesn’t seem quite as dire. The potential would remain to get a quick stop, regain possession with good field position, and set up an easier field goal try. This is a classic “extend the game” logical fallacy, and this is genuinely something that an NFL team might do.

Of course, whether the ideal course of action would be to kick a field goal or go for a first down, likely to set up a shorter field goal attempt upon success, is highly dependent on Team A’s kicker, but for the generic situation, the PFR calculator produces the following win probabilities for the various situations, depending on the success of Team A’s punt. What we find is that, in addition to a super deep punt being very difficult, it doesn’t make a huge impact in this scenario.

Team A: 4th and 1 on Team B 40, 2:10 remaining

Win Probability: 39.83%

Team A: Punts to the Team B 1

Win Probability: 12.41%

Team A: Punts to the Team B 10

Win Probability: 11.35%

Team A: Punts into the endzone for a touchback

Win Probability: 10.25%

Win Probability Difference: 27.42-29.58%

Portion of Winning Outcomes Sacrificed: 68.84-74.27%
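The quoted ranges follow directly from the calculator outputs listed above. As a quick check, this sketch recomputes the difference and the sacrificed portion for each punt outcome; the dictionary labels are our own shorthand.

```python
go = 39.83  # win probability (%) before the decision, 4th and 1 on the Team B 40

# Win probability (%) after each punt outcome, as quoted above
punts = {"Team B 1": 12.41, "Team B 10": 11.35, "touchback": 10.25}

# For each outcome: (difference in percentage points, portion of wins sacrificed)
results = {spot: (go - wp, (go - wp) / go) for spot, wp in punts.items()}

for spot, (difference, sacrificed) in results.items():
    print(f"punt to {spot}: difference {difference:.2f} points, "
          f"sacrificed {sacrificed:.2%}")
```

Even the best case, a punt downed at the 1, gives up more than two thirds of Team A’s winning outcomes.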

Note: In both of the scenarios outlined in this section, the negative impact of the decision would be significantly greater if made with less remaining time, but in trying to stay in the realm of feasibility we use the two-minute warning as a convenient delineation, as its presence often serves as fallacious bait for a “game-extending” decision.

While the parameters of this discussion are fairly nebulous, and the discussion remains widely open, it may indeed be the case that the decision made by the Falcons was not only an analytical error, but in fact quite close, just some small detail perturbations away, from the WORST POSSIBLE purely human decision-based mistake that a team in any sport could conceivably make, outside of intentional sabotage.

To contrast with what we are going for here, let’s discuss a recent, much-derided NFL decision, namely the Seahawks’ goal line play call in the closing moments of Super Bowl XLIX. In particular, there was no “outer layer” decision to be made in that scenario, as it was only second down, the Seahawks trailed by 4, and it was a foregone conclusion that the Seahawks would attempt to score a touchdown. As for the “inner layer” decision to call a pass play, nearly everyone, from the viewers to the announcers to the Seahawks players themselves, agreed that this was not optimal, particularly with two downs and a timeout remaining and an elite short yardage running back. However, the perception of this decision as one of the worst in NFL history is almost completely informed by the result of the play, an interception, which was by any objective measure unlikely. If one were to perform the (presumably difficult) analysis to determine the win probability for Seattle conditioned on calling a passing play on that second down, versus the win probability for Seattle conditioned on handing the ball to Marshawn Lynch on that second down, it is likely that both probabilities would have been quite high, well over 50%, and that their difference would be quite small. While the stage and the outcome magnified what was probably an error, it was not an analytical mistake to anywhere near the degree of the others outlined here.

How do these things still happen?

In case the point hasn’t been made clearly enough, this decision made by the Falcons is in no way isolated. NFL teams make these kinds of mistakes A LOT, most notably in fourth down situations. A great source of this data is the New York Times 4th Down Bot, which breaks down all of the possessing team’s options in every fourth down situation encountered, including discussions of probability of success for fourth down conversions and field goals, and win probabilities before and after each potential decision.

While we can continue to dissect the psychological and historical reasons for these errors, the question remains: How do these things still happen, with such frequency, in a post-analytics explosion 2015? Thousands of man hours, millions of dollars, and numerous graduate dissertations have been poured into the effort of perfectly evaluating personnel, projecting their performance and value, and building the ideal roster based on complicated financial constraints. If a major league baseball player is deemed to be worth one additional win over the course of a 162-game season, his market value salary could see a sharp increase. However, it may well be the case that simply having a reasonably statistically inclined intern with an iPhone on an NFL sideline is worth upwards of 1 win per 16-game season, proportionally the equivalent of 10 wins in major league baseball. Moreover, if teams went the extra mile in employing full-time in-game analytics experts, perhaps with their own proprietary, further advanced and nuanced statistical models, the same way teams do for personnel and other operations, the impact could be equivalent to that of a Pro Bowl level player, assuming the coaching staff would consistently heed their advice.

Football Podcasts

In a three part series, Alessandro Allegranzi and I dissected the many issues, medical, moral, financial, and otherwise, plaguing professional and collegiate football.

Part 1 (Posted 8/9) and Part 2 (Posted 9/14) concern the NFL, and Part 3 (Posted 10/20) focuses on NCAA football.

Warning: Lots of bad language.

NBA 2014-15 Season Wrap/Playoff Preview Podcast

Alessandro Allegranzi and I talked about basketball. A lot.

In Part 1, we discussed the regular season, including awards picks and a logical dismantling of the 76ers’ long-term tanking plan.

In Parts 2 and 3, we previewed the Eastern and Western Conference playoff pictures, respectively. Click here or search for “Impetuous Windmills” on iTunes.

 

“A man has exactly two children…”: A probabilistic attack on intuition and “relevance”

In each course I have taught with an enrollment under 75, I have conducted the first day of class in essentially the same way: a brief self-introduction, a discussion of course policies, and two closely related probability questions, designed as a proverbial “ice breaker” to facilitate rapport and stretch the students’ intuition. Typically, I write the questions on the blackboard and then exit the room for about ten minutes. I originally heard them from Spelman College Professor Colm Mulcahy in a talk honoring the late legendary puzzlemaster Martin Gardner, and they read as follows:

Part 1: A man has exactly two children, at least one of which is a boy. What is the chance he has two boys?

 

Part 2: A man has exactly two children, at least one of which is a boy born on a Tuesday. What is the chance he has two boys?

For the purposes of these discussions, we make the (not completely accurate) assumptions that every child is born either a boy or a girl, with equal likelihood, and that a child is born on each day of the week with equal likelihood. Since some people have difficulty conceptualizing the abstract probability concept of a random variable in such a concrete context (“What do you mean? It’s one guy, he either has two boys or he doesn’t…”), it is often useful to rephrase these and future questions as “What portion of ALL families with exactly two children, at least one of which is a boy, would you expect to have two boys?,” et cetera.  From a pure mathematical perspective, the questions are nothing even remotely special, completely appropriate as a quick homework exercise in a discrete math or first probability course. What distinguishes them, rather, is their lack of compliance with potentially blinding intuition.

After the allotted time elapses, I am usually pleased to reenter a loud room with multiple enduring discussions, and I solicit volunteers to offer an answer to Part 1. Whether it is in this context, or if I simply ask the question in casual conversation (several of my friends are quite sick of it by now, I’m sure), the first guess is almost invariably 50%. The offered logic?

“He has two children, one is a boy, the other is either a boy or a girl, each occurring 50% of the time. Therefore, the chance that he has two boys is 50%.”

When presented this seemingly sound logic, I diplomatically respond that I would not consider this an incorrect answer as much as a correct answer to a different question or questions. Most notably, if the question had begun “A man has two children, the OLDER of which is a boy…” or “A man has two children, the YOUNGER of which is a boy…”, then 50% is correct. So how is the question as stated any different?

Solution to Part 1: When two children are born, there are exactly four possible outcomes with respect to the children’s gender in chronological order, each occurring with equal probability: Boy/Boy, Boy/Girl, Girl/Boy, and Girl/Girl. The additional provided information that at least one of the children is a boy eliminates the Girl/Girl outcome, leaving three equally likely outcomes, one of which, Boy/Boy, qualifies as a success. Therefore, the correct probability is 1/3=33.33…%.

For the correct analysis of Part 1 with visual aid, along with discussion of the common flaws in logic yielding the answer of 50%, check out this diagram: Part 1 Analysis

To field another common objection before moving on, it is indeed correct to consider Boy/Girl and Girl/Boy as distinct outcomes, as they each occur 25% of the time when considering two random births. If, for whatever reason, you feel the need to identify these two outcomes as being the same, that is perfectly fine, but you then lose the convenient property that each distinct outcome occurs with equal likelihood (i.e. for two random births you would have three outcomes: 25% two boys, 25% two girls, and 50% one of each).
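For anyone who would rather count than argue, Part 1 can be settled by brute-force enumeration. This is a minimal sketch under the stated assumptions (each child independently a boy or a girl with equal likelihood); the variable names are our own.

```python
from fractions import Fraction
from itertools import product

# All (older, younger) gender pairs, each occurring with probability 1/4
families = list(product("BG", repeat=2))

# Condition as stated in Part 1: at least one boy
at_least_one_boy = [f for f in families if "B" in f]
two_boys = [f for f in at_least_one_boy if f == ("B", "B")]
p_at_least_one = Fraction(len(two_boys), len(at_least_one_boy))

# The different question that 50% correctly answers: the OLDER child is a boy
older_is_boy = [f for f in families if f[0] == "B"]
two_boys_older = [f for f in older_is_boy if f == ("B", "B")]
p_older = Fraction(len(two_boys_older), len(older_is_boy))

print(at_least_one_boy)  # three equally likely outcomes remain
print(p_at_least_one)    # 1/3
print(p_older)           # 1/2
```

Keeping Boy/Girl and Girl/Boy as separate tuples is what preserves the equal likelihood of each outcome, which is why simple counting works here.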

Now that everyone is on their toes, we strike at the common intuition’s jugular vein with Part 2.

Solution to Part 2: Now considering gender and day of birth, there are a total of 14 possible outcomes in which the older child is a boy born on a Tuesday, 7 of which contain two boys. Similarly, there are 14 outcomes in which the younger child is a boy born on a Tuesday, 7 of which contain two boys. These two components of the sample space overlap at only one outcome, when both children are boys born on Tuesdays, which in particular contains two boys. Therefore, the total number of possible outcomes is 14+14-1=27, each occurring with equal likelihood, and 7+7-1=13 of which contain two boys. Therefore, the answer is 13/27≈48.15%.

 

Once again, visualized analysis and discussion is provided here: Part 2 Analysis
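The 14+14-1 count can also be verified by listing every family explicitly. The sketch below assumes each child is equally likely to be either sex and to be born on any of the seven days; which index stands for Tuesday is an arbitrary labeling choice of mine.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)   # weekday indices 0..6
TUESDAY = 1       # arbitrary label; any single day gives the same counts

# One child = (sex, day of birth): 14 equally likely possibilities per child.
children = list(product("BG", DAYS))
families = list(product(children, repeat=2))  # (older, younger): 196 outcomes

def is_tuesday_boy(child):
    return child == ("B", TUESDAY)

older_hits = {f for f in families if is_tuesday_boy(f[0])}
younger_hits = {f for f in families if is_tuesday_boy(f[1])}
sample_space = older_hits | younger_hits  # inclusion-exclusion: 14 + 14 - 1

both_boys = {f for f in sample_space if f[0][0] == "B" and f[1][0] == "B"}
part2 = Fraction(len(both_boys), len(sample_space))

print(len(older_hits), len(younger_hits), len(older_hits & younger_hits))  # 14 14 1
print(len(sample_space), len(both_boys), part2)                            # 27 13 13/27
```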

The biggest expressed hangup with this solution? “How could the day of the week possibly be relevant? The kid has to be born on SOME day of the week, why would it matter which?” The big issue with that last sentence is the phrase “THE kid”, and to wrap your mind around this issue, check out the linked diagrams. In particular, we could consider seven different versions of the Part 2 question by running through each day of the week, thus dividing the full sample space from Part 1 into seven pieces. Restricted to each piece, the success rate of having two boys is 13/27, but that doesn’t mean the total success rate is 13/27, because those seven pieces OVERLAP WITH EACH OTHER. To use some mathematical terminology, there is a big difference between a decomposition of a set (simply expressing the set as a union of subsets) and a partition of a set (expressing the set as a union of DISJOINT subsets).
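To see the seven overlapping pieces concretely, the following sketch (same equal-likelihood assumptions as before, with illustrative names) builds one piece per weekday and compares the piece-by-piece rate with the overall rate.

```python
from fractions import Fraction
from itertools import product

children = list(product("BG", range(7)))      # (sex, weekday index)
families = list(product(children, repeat=2))  # 196 equally likely (older, younger) pairs

def two_boys(family):
    return family[0][0] == "B" and family[1][0] == "B"

# Piece d = families with at least one boy born on weekday d.
pieces = [{f for f in families if ("B", d) in f} for d in range(7)]
rates = [Fraction(sum(two_boys(f) for f in piece), len(piece)) for piece in pieces]

union = set().union(*pieces)  # every family with at least one boy
overall = Fraction(sum(two_boys(f) for f in union), len(union))

print(set(rates))                               # {Fraction(13, 27)}
print(sum(len(p) for p in pieces), len(union))  # 189 147
print(overall)                                  # 1/3
```

The piece sizes add up to 189, but the union contains only 147 families, so the pieces form a decomposition rather than a partition. The extra 42 are two-boy families counted twice, which is why the 13/27 rate does not carry over to the whole.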

For another example of this phenomenon, let’s turn to the world of sports: What if I told you that during the 2014 season, a AAA pitcher named John Smith earned a decision in every start, won 60% of his starts in games that began before 5pm, and won 60% of his starts that began after 2pm? Was Smith necessarily a winning pitcher? Nope, he could easily have gone 0-4 at 1pm, 6-0 at 4pm, and 0-4 at 7pm. All the given conditions are satisfied, yet he winds up a disappointing 6-8 on the year. OVERLAP IS IMPORTANT!
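The arithmetic behind the John Smith example is quick to confirm. This sketch simply encodes the hypothetical split given above (start times on a 24-hour clock) and computes the three winning percentages.

```python
from fractions import Fraction

# (wins, losses) by start hour; the hypothetical 1pm / 4pm / 7pm split from the text.
record = {13: (0, 4), 16: (6, 0), 19: (0, 4)}

def win_rate(hours):
    wins = sum(record[h][0] for h in hours)
    losses = sum(record[h][1] for h in hours)
    return Fraction(wins, wins + losses)

before_5pm = [h for h in record if h < 17]  # the 1pm and 4pm starts
after_2pm = [h for h in record if h > 14]   # the 4pm and 7pm starts

print(win_rate(before_5pm), win_rate(after_2pm), win_rate(record))  # 3/5 3/5 3/7
```

The 4pm starts sit in both groups, so the six wins are counted twice while each batch of four losses is counted once.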

To return to the more practical rephrasing of the two original questions, we have effectively changed the sample space from “ALL families with exactly two children, at least one of which is a boy” to “ALL families with exactly two children, at least one of which is a boy born on a Tuesday”, and whether your gut tells you that this change should affect what portion of these families have two boys, it is undeniable that at face value we are considering a different collection of families. Perhaps the theme of this piece is that when conducting any sort of analysis, regardless of intuition, any new information is potentially relevant.

An observation worth noting: the answer to Part 2 is much closer to the common gut reaction of 50% than the answer to Part 1 is. Is this just an odd occurrence or indicative of a more general phenomenon?

Definitely the latter. As discussed in the linked diagrams, the only thing keeping the answer from being 50% is the overlap between the cases when the older child satisfies the given condition and the cases when the younger child satisfies the given condition. The more unlikely the given condition is, the smaller that overlap will be compared to the entirety of the sample space, and hence the closer the answer will be to 50%. For example, suppose we posed the question “A man has exactly two children, at least one of which is a boy that was born on Christmas. What is the chance he has two boys?” As long as we agree that the probability of being born on Christmas is 1/365 (probably not actually true), then we can find the answer. Visualized analysis and discussion of a more generalized version are provided here: Generalized Analysis 1
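As a rough sketch of that generalized version (my own bookkeeping, which may differ from the linked diagram), suppose each boy independently satisfies the given condition with probability p, where p is greater than 0. The function name below is made up for illustration.

```python
from fractions import Fraction

def two_boys_given_condition(p):
    """P(two boys | at least one child is a boy satisfying the condition).

    Assumes each child is a boy with probability 1/2 and each boy
    independently satisfies the condition with probability p > 0.
    """
    p = Fraction(p)
    q = p / 2                                                # one child is a qualifying boy
    at_least_one_hit = 1 - (1 - q) ** 2                      # at least one qualifying boy
    both_boys_and_hit = Fraction(1, 4) * (1 - (1 - p) ** 2)  # two boys, at least one qualifies
    return both_boys_and_hit / at_least_one_hit

for label, p in [("any boy", 1), ("born on a Tuesday", Fraction(1, 7)), ("born on Christmas", Fraction(1, 365))]:
    answer = two_boys_given_condition(p)
    print(label, answer, float(answer))
```

Under these assumptions the ratio simplifies to (2 - p)/(4 - p), which gives 1/3 for p = 1, 13/27 for p = 1/7, and 729/1459 (just under 49.97%) for the Christmas question.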

While discussing these questions with my good friend and EPA scientist Daniel Garver, we had the following exchange:

Daniel: Ok, so what if you tell me that the man has exactly two children at least one of which is a boy and ask me the chance he has two boys, and I respond by asking you the day of the week the boy was born on, does the probability change just by you answering my question?

 

Alex: Well, you can’t say ‘the day of the week THE BOY was born on’, because that makes it seem like one particular child has been specified…

 

Daniel: Ok, ok, fine, I ask you to provide me with the day of the week on which A MALE CHILD belonging to this man was born, and you answer me. Does the probability change?

 

Alex: As strange as it sounds, yes.

 

Daniel: …Ok, but what if you ask me the original question, and I respond by requesting that you provide me with a PICTURE of A MALE CHILD belonging to this man. There’s only one of that exact kid, so does that make the answer exactly 50%?

 

Alex: Yes, because then it is totally sound logic to say that this man has two kids, one of which is THAT KID, and the other of which is either a boy or a girl with equal probability. They can’t both be THAT KID, so there is no overlap issue.

 

(slightly unsure pause)

 

You know, it might seem like nothing can change just by showing you a picture, but if you take the practical as opposed to the abstract approach, the picture really changes the situation as much as it could possibly be changed, because the sample space went from all families with exactly two children and at least one boy to JUST ONE FAMILY.

 

Daniel: Yeah, that’s right… weird.

 

It is weird, but it is right. Daniel’s question got me thinking a little more, and I realized that the question could be generalized beyond just conditions that occur in each child independently to include conditions which occur non-independently. After all, it’s ALL ABOUT THE OVERLAP, so as long as we know the conditional probability that BOTH the children are boys satisfying the given condition given that at least one is, we can find the answer. In particular, Daniel’s inquiry can be greatly simplified, because in order to make the answer dead on 50%, we don’t need a condition that only one child in the world satisfies, we just need a condition that two brothers CAN’T SATISFY SIMULTANEOUSLY. For example, if we rule out crazy people like George Foreman, we could just specify a particular name, like “A man has exactly two children, at least one of which is a boy named Jack. What is the chance he has two boys?” He presumably can’t have two children named Jack, so the answer is exactly 50%, and the initial gut reaction is finally right. Visualized analysis and discussion of this further generalized version relying only on overlap probability are provided here: Generalized Analysis 2
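Here is one hedged way to write that overlap-only version down; it is my own sketch and may not match the linked diagram. Let r be the probability that BOTH children are boys satisfying the condition, given that at least one is, and assume (as in the “OLDER” and “YOUNGER” variants of Part 1) that specifying which child satisfies the condition makes the answer exactly 1/2. Inclusion-exclusion then gives an answer of (1 - r)/2.

```python
from fractions import Fraction

def two_boys_given_overlap(r):
    """Answer as a function of the overlap probability r.

    r = P(both children are boys satisfying the condition | at least one is).
    Assumes P(two boys | a SPECIFIED child satisfies the condition) = 1/2.
    """
    r = Fraction(r)
    return (1 - r) / 2

print(two_boys_given_overlap(Fraction(1, 3)))   # Part 1 (overlap is 1 of 3 outcomes): 1/3
print(two_boys_given_overlap(Fraction(1, 27)))  # Part 2 (overlap is 1 of 27 outcomes): 13/27
print(two_boys_given_overlap(0))                # "named Jack" (no overlap possible): 1/2
```

The smaller the overlap, the closer the answer gets to the gut reaction of 50%, and it lands there exactly when the overlap is impossible.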

Once you make your peace with these questions and their solutions, try them on your friends, family, teachers, students, coworkers, et cetera. Again, they’re nothing Earth-shattering, but with the right crowd they tend to lead to engaging and stimulating discussions. I’ve gotten a lot of good mileage out of them over the years, and I figured it was time to pass them on. Enjoy!

Loser Talk: The denial, delusion, and desperation of being a sports fan

Preface

I very recently began maintaining the twitter account @Loser_Talk with the goal of highlighting some comical and extreme examples of certain universal phenomena in the reactions of sports fans to negative stimuli. As a companion to this twitter feed, I felt it would be informative to discuss the origin of the terminology, as well as a brief enumeration of genres and examples. Put more succinctly, I would like to answer the questions “What is ‘loser talk’, and where did the idea come from?”

I would of course be remiss if I claimed that the terminology was truly unique or original; it is after all a pairing of two exceedingly common English words that combine completely naturally in the context of sports. It would be like a football team trying to lay legal claim to “twelfth man” or something (oh wait, that happened…), and furthermore there is even an urban dictionary entry dedicated to the phrase (unbeknownst to me prior to composing this post). However, the first provided definition, essentially amounting to pessimism and lack of confidence, runs somewhat contrary to what I’m going for here, and the latter two, while in the right spirit, are either too specific, underdeveloped, or both.

Before delving into the details, I would like to include something of a disclaimer:

The @Loser_Talk twitter feed, and any accompanying discussions of loser talk here or elsewhere, are not meant to be genuinely critical or even remotely mean-spirited, quite the opposite actually. As I hope the reader will recognize, what makes loser talk so elemental is that, as sports fans, we are ALL guilty of it with astonishing frequency, and by delighting in the funniest and most extreme examples, we are not targeting specific offending parties and rejoicing in their idiocy and misfortune, but we are rather rejoicing in the utter absurdity and triviality of sports fandom in general and the neuroses and psychoses we share as a collective. Put briefly, we’re all crazy, and when we laugh at each other, we laugh at ourselves.

All that said, here’s how it started…

Origin Story

It was the evening of February 23, 2013, and the defending NBA champion Miami Heat, on their way to a repeat title, invaded the Wells Fargo Center in Philadelphia with an Eastern Conference-leading 38-14 record. They rode a dominant nine-game winning streak that would eventually balloon to 27 games, the second longest in league history.

In stark, depressing contrast, the hometown 76ers sat at 22-30, in the midst of an eventual seven-game losing streak that would cement them in second quartile NBA purgatory. After lofty hopes following the previous year’s playoff appearance, they would eventually finish in ninth place in the Eastern Conference, perhaps the least desirable outcome for a rebuilding franchise, the bitter taste of which sparked a multi-year tanking project beyond proportions ever previously attempted.

I was in my first of two years as a visiting professor in the mathematics department at Bucknell University, located in the tiny town of Lewisburg in central Pennsylvania, a hair under three hours northwest of Philly. Joseph Thaddeus Moss III, a close friend of over a decade, an acquaintance of nearly two, and a fellow sufferer of a debilitating NBA obsession, had circled this particular date on the calendar as the perfect bait for his first visit to my new quaint northeastern accommodations. I made the trek down the banks of the Susquehanna River, picked Joe up at the airport, and after a perfunctory driving tour of the city we made our way to the Central Casting image of the en vogue “sports complex”, with the Wells Fargo Center, Lincoln Financial Field, and Citizens Bank Park all towering within a choked-up seven iron of each other. We headed into the arena over an hour early, the building already abuzz as this was a rare sellout and the nightcap of an NHL/NBA doubleheader, and we took our seats in the upper level around the free throw line.

The Sixers competed admirably on their home floor in the first quarter, thanks in large part to twelve points in the first seven minutes from Nick Young, and the scoreboard flashed even at 24 with 26 seconds remaining in the period. The champs, however, succeeded in extracting the oxygen from the building before the first buzzer sounded, as LeBron James made the front end of a pair of free throws, and Shane Battier rebounded his miss on the back end. After resetting the offense, James dribbled out the remaining clock and barreled to the rim for two as time effectively expired. Miami’s lead was a slim 27-24 after one, but James and company had set a tone in the closing seconds that would reverberate.

Despite appearing in full control of the action throughout the second quarter, the Heat continued to fall short of blowing the game open. With Miami nursing a 53-47 lead with less than a minute remaining in the half, Jrue Holiday rebounded a Ray Allen miss, brought the ball down the floor to set up the Sixers offense, attempted a dribble drive to his right, and drew a quick whistle. The offender was Dwyane Wade, guilty of a hand check, finishing off an exceptionally efficient half on his way to a blistering final stat line. By intermission, Wade had 17 points on just one misfire from the field, and he would finish with 33 points on 14 of 18 (Side note: Joe vehemently predicted based on pregame warmups that Wade would go for 30+, despite having only done so five times in the first 52 games). He methodically dissected and consumed the young home team with a barrage of jumpers, floaters, and attacks of the basket, until their bones had been picked as clean as rotisserie chicken carcasses tossed aside and dismantled by an eager house cat (that’s not weird, everyone does that… right?). He was aggressive on offense and controlled on defense, with his only prior infraction a charge drawn by Jeremy Pargo early in the quarter (“probably while dunking on him,” as later noted by Daniel Garver). The call at hand was just his second personal, which at the game’s midpoint is far from even the most liberal definition of foul trouble. It was at this moment that our muse announced himself, in the form of a hopeful local fan trying to rouse his cohorts just an arm’s length away from us in the next row down. He offered the following verbal emission, which reeked so thoroughly of denial, delusion, and desperation as to form an almost blissfully sweet tortured sports fan perfume:

“Yeah, that’s right, get Wade into cheap foul trouble! That could really come into play in the second half!”

Our initial reaction was complete paralysis, followed by a slow, synchronized turn toward each other with matching wide-eyed, mouth agape expressions. After a solid five Mississippi, I broke the silence with one deliberately enunciated exclamation:

“Whoa.”

After the wave of incredulity washed over us, we collected ourselves and elaborated. “That is pure desperation,” I commented, “there has been a LOT of losing in this town since Iverson left.” Then, thanks to Joe’s channeling of the great Steven Brody Stevens (in case he hasn’t told you personally, he was in Hangover, Hangover 2, Due Date, cut out of Funny People, YES! YOOOU GOT IT!…), we got to the heart of the matter:

Joe: This entire building is LOSERVILLE.

Alex: And THAT is LOSER TALK.

It wasn’t just what he said, it was the way he said it. He really believed it, figuratively separating his shoulder as he reached so disturbingly far and clung so depressingly tight to the tiniest Fox News-spun hope in the face of inevitable short-term and long-term failure.

Ultimately, the Heat pulled away to win 114-90 behind Wade’s aforementioned brilliance and a thoroughly action-dictating triple-double from James, but more to the point, a concept was born.

What is Loser Talk?

As a mathematician, I place great value in well-formed questions and rigorous definitions, and while the concept of loser talk is relatively dynamic and nebulous compared to anything in the ZFC framework (I’m actually a bit of a non-believer in the axiom of choice myself), I feel compelled, despite temptation, to avoid resorting to the immortal philosophy coined by Supreme Court Justice Potter Stewart in reference to hardcore pornography.

Broadly speaking, loser talk is any response to a lack of success, on a short or long-term scale, of one’s preferred team or athlete (including, when applicable, oneself or own team) that runs either contrary or unrelated to the acceptance and digestion of this shortcoming and the team or athlete’s responsibility for it. These counterproductive or irrelevant efforts include, but are not limited to:

  • excuses or rationalizations for failure
  • unreasonable reaching for or outright fabrication of “silver linings”, often including speculation that current failure could serve future benefit
  • unreasonable clinging to or outright fabrication of “slivers of hope” in the face of what would be deemed inevitable failure by an objective observer
  • disproportionately fervent response to a small degree of success due to its contrast with preceding failure
  • retroactive invention of alternative goals as a means of consolation for failing to meet previously defined goals
  • schadenfreude, more specifically the taking of pleasure in or desire for past or future misfortune of other teams or athletes, rivals or otherwise

By twisting and dilating this vague, meandering attempt at a definition, one could fit almost any remotely subjective sports-related thought under the loser talk banner, outside of resigned self-deprecation or the celebration of an undeniably massive achievement. This lack of boundary may appear to constitute a fatal shortcoming of the whole endeavor. However, even as someone whose primary creative outlet requires airtight definition, it doesn’t especially bother me.

For one, the thesis here is NOT that loser talk should be eliminated, or even avoided, so knowing exactly what it is and what it isn’t becomes a lot less vital. To elaborate on the previous paragraph, it appears the only way to be a sports fan and completely avoid the Keon Clark-esque reach of loser talk is to quietly, internally hope for your team’s success, only release humble, gracious exultation if they achieve their ultimate goal, or bypass the stages of grief to objective, well-reasoned acceptance if they do not. Quite frankly, that sounds excruciating.

Furthermore, the goal is to highlight the CENTER of the loser talk umbrella, not the outskirts; if you wanted to study the deepest parts of the ocean floor, you wouldn’t really care where the beaches were, or even how much of the globe was covered by water. And above all, the goal is to be funny, though not deliberately in anything I compose myself, as I am rather lacking in that regard, but fortunately there is often nothing funnier than the absurdity of things said in earnest.

Examples and Analysis

Let’s just dive right in with a few favorites. Since explanation is the enemy of comedy, I will do my best (retroactive parenthetical: not that well) to resist my instinct to overanalyze, leave the reader to unearth the multiple layers of absurdity and contradiction, and let the content speak for itself.

The Mecca of tanking

On January 6, the New York Knicks had an NBA-worst record of 5-32 and were on a 12-game losing streak (they’ve added two more losses at the time of writing). They are somehow even worse than the 76ers, a team genuinely comprised of non-NBA players. The futility sparked this tweet and blog post from team reporter Adam Zagoria, the main idea of which can be paraphrased as:

The last time the Knicks lost this many games in a row, they eventually got the no. 1 draft pick and selected Patrick Ewing. This year, the consensus top pick is Duke’s Jahlil Okafor, who is ALSO A CENTER! THIS IS EXCITING!

At its core, this is nothing new. Taking solace in the fact that your team’s poor performance will lead to a high draft selection, or even that your team is doing so purposefully, is as old as the draft concept itself, particularly in the NBA and NFL, and the phenomenon is perched firmly on the loser talk Mount Rushmore. However, without spilling a few barrels of digital ink analyzing the numerous minuscule probabilities, irrelevancies, and gaping logical holes, I hope you’ll agree that this particular example has a little extra zest and raises the proverbial bar. I know it’s only January, but this tweet is the early leader in the clubhouse for 2015 Loser Talk of the Year, and Zagoria for Loser Talker of the Year (his twitter feed is sprinkled with similar gems). Although, as noted by Alessandro Allegranzi, targeting reporters or social media liaisons associated with a team in any official capacity has a bit of a “fish in a barrel” note to it, as spinning any and all team events toward the positive is literally part of their job.

Loser talk: the Braves’ warm blanket for late October chills

On October 17, 2014, weeks removed from the Atlanta Braves’ regular season elimination from playoff contention, MLB.com reporter Mark Bowman published this article, with the headline:

Braves boast connections to both World Series teams: Beloved Hudson pitching for SF; KC GM Moore came from Atlanta’s organization

How far we have fallen, Braves fans. Perhaps I wouldn’t have reacted so viscerally to this if the word “have” was used in place of “boast”. BOAST?! What are you talking about?!?! First of all, in the age of free agency and constant front office and managerial turnover, couldn’t you probably find a nontrivial connection between ANY two teams in major league baseball? Further, the article’s apparent suggestion that the supposed greatness of the Braves organization, who have not smelled the World Series in over 15 years, is somehow tangentially responsible for two teams having actual, tangible, current success, and that this should somehow be a source of PRIDE…

Ok, this is the kind of overanalysis I was talking about. Perhaps I’m too close to this one…

Portrait of a rivalry: loser talk from all directions

Nothing brings out the loser talk like a good rivalry, so for exposition purposes, let’s focus on one: Georgia-Georgia Tech football, also known as “Clean, Old-fashioned Hate”. If the last example was too personal, then maybe this one is an unwise choice, but ideally my intimate quarter-century first-hand knowledge will aid in the examination of the many shapes, sizes, flavors, and textures of loser talk that a single rivalry can produce. To that end, let’s start with a contribution from yours truly:

“If Georgia Tech played a football game against the Al-Qaeda All-Stars, I would root with all my heart for the Al-Qaeda All-Stars. I would root for Osama Bin Laden to throw six touchdown passes.”

I said this a number of times, probably between ages 16 and 21, mostly for comedic or shock value, but the core sentiment was genuine. Starting around age 4, I HATED Tech, Florida, and Tennessee (probably in that order), and I wished any and all misfortune upon them, including injuries, regardless of UGA’s involvement. This all somehow seemed reasonable as a child idolizing invulnerable, superheroic “adults”, but now, as a grown person observing players that I recognize as essentially children, anything resembling vitriol or ill will just seems heartless and perverse.

Of course, the other side is frequently guilty as well. Georgia Tech’s fight song contains the line “To Hell with Georgia”, echoed stadium-wide regardless of opponent, and as the annual matchup approaches, one encounters the rallying cry “To Hell with Georgia” or its abbreviation “THWG” somewhere between three and thirty-seven times as often as anything resembling “Go Jackets!”

Allow me to recount one particular related story from 2007. The Bulldogs defeated the Yellow Jackets 31-17 at Bobby Dodd Stadium in Atlanta, and in the closing minutes of the game, Georgia fans were scoreboard watching; if Kentucky defeated Tennessee that day, then Georgia would play LSU for the SEC championship. Kentucky fell just short, losing 52-50 in a wild quadruple overtime classic, but what we Georgia fans hadn’t even considered was that the Tech fans were scoreboard watching as well, hyperaware of their rival’s window for success. Upon the completion of Tennessee’s victory, the Tech PA announcer informed the crowd of the final score, and the stadium erupted. I have seen Tech beat Georgia once in Atlanta, in 2000, and the Jackets fans’ reaction to a game played 380 miles away in a conference not their own was at least twice as loud. It sounded like they had just won a national championship. The biggest game of your season is happening right in front of you! AND YOU’RE LOSING! The sheer volume that echoed through that cool Atlanta night is forever etched in loser talk lore.

Of course, these examples are tame and lighthearted compared to the full destructive power of the hatred inspired by sports rivalries, including horrific violence (see section 8) and bizarre, hurtful vandalism. I have repeatedly indicated that my goal is not to criticize or attempt to discourage the various genres of loser talk, but the schadenfreude component, particularly in its most extreme manifestations, may qualify as an exception. In nearly every other aspect of society, we recognize that our penchant for extracting warmth and joy from the suffering of others is among our most loathsome human impulses, one that we should suppress whenever possible, yet in the context of sports rivalries it remains socially acceptable and even encouraged. It would be really nice if we could just act like human beings and avoid using sports as a thinly-veiled outlet for all of our basal, reptilian brain desires to be absolutely despicable to one another.

Not to be too self-aggrandizing or judgmental, but after years of effort and evolution in this regard, I can say with a measure of confidence and pride that I have virtually eliminated schadenfreude from my sports fan repertoire (If you find this sentence ironic because of the nature of this entire endeavor, then you’ve badly missed the point, so badly that you probably don’t actually know what “ironic” means, rendering this entire parenthetical useless).

I rooted for Georgia Tech in this year’s Orange Bowl, and NOT because their win would “make Georgia’s loss look better” (hall of fame loser talk), but because I like the way they play football (I tend to lean old-school), I have many friends and family members who are Georgia Tech fans and alumni, and I otherwise had no skin in the game.

But enough preaching and self-congratulation, I promised an array of loser talk examples, and one rich source is the rivalry’s most recent incarnation, a 30-24 Yellow Jacket overtime victory in which both teams suffered costly turnovers and mental errors. Here is what a loser talk-free (and hence boring) Georgia fan reaction to the game would sound like:

We had a lot of success moving the ball early, and it looked like we had an opportunity to build a big lead, but we turned the ball over in big spots, Tech shut down our running game over the last 2.5 quarters, we made some bad decisions late in the game, coach and player alike, and most notably, we had no answer for Tech’s super-effective, well-executed triple option. Ultimately, we didn’t do enough to win, and they did. We lost a game to a better team.

It sounds like a bad post-game press conference, I know, but it’s all true, and it’s free of the early stages of grief and our powerful, child-like reaction instincts. Fortunately for our collective entertainment, actual Georgia fan reactions sounded more like this:

Tech shouldn’t have even been IN that game! We GAVE it to them with TWO fumbles on the one!!

 

Well, hey, I guess we can throw them a bone once a decade or so. So you’ve won two out of the last fourteen, nice job…

 

SQUIB KICK?!?!? Are you kidding me?!?!? That was the ONLY WAY they had a chance to tie the game!!

That last one may or may not be an essentially direct quote from me in the bleachers at the end of regulation. While the first and third reactions focus on the specifics of this particular contest (and each conveniently ignores that Georgia only had an opportunity to take the lead late due to an inexplicable fourth quarter fumble by Tech quarterback Justin Thomas, but I digress…), the second speaks to a more general and widespread rivalry phenomenon, nestled in the “moving the flagstick” category of loser talk characterized by retroactive goal amendment.

In particular, this approach seems to indicate that in a rivalry, a fan of the team that has had more historical success, or even just more recent success, can, when beneficial, reposition the focus from the competition at hand to some kind of nebulous, “big picture” competition. We’ve won twelve of fourteen? That’s interesting, because I thought we were talking about a football game between two football teams, not a best-of-23 series over multiple decades in a sport where players stay on the team for at most four years. If, as the more recently successful team, you can make this shift at will, then you essentially render yourself invulnerable in the rivalry, and that’s bullshit. Vulnerability is what sports and rivalries are about.

Another version of this goal-shifting consolation phenomenon is a frustrating mentality shared by many Tech fans, typified by a genuinely hurtful chant that I heard just about every two years following Bulldog victories at Bobby Dodd:

That’s alright, that’s ok, you will work for us some day.

As a lifelong Georgia fan, and eventual Bachelor’s and Ph.D. recipient, I always had a bit of a sore spot for the perceived academic and intellectual superiority expressed to me by Tech supporters. Today, as an adult who has spent a great deal of time at each school, I can comfortably and objectively analyze that while each school has its strengths and weaknesses, Tech is certainly more prestigious and highly-ranked based on traditional metrics and publications. However, both schools are elite public universities, and the “nerd versus dumb redneck” narrative is horribly outdated on both sides. Not only that, but the schools’ respective specialties are largely disjoint, and definitely would not result in especially frequent instances of UGA grads working directly subordinate to Tech grads. Even if this WERE the case, however, we all know that the NCAA student-athlete model is a farce, and that the academic reputations of the universities have little to no bearing on those of the football players, so the sentiment is pretty thoroughly ridiculous.

Ok, that got a little personal, perhaps, and it’s important to note that even if the chant weren’t a case of fatally flawed logic, it would still constitute a massive moving of the flagstick, repositioning the goal in question from a single football game to some measure of larger life success for the purpose of consolation after a loss, making it textbook loser talk.

Here’s one more thing:

The last three meetings of the 1990s each featured controversial officiating decisions that the losing teams use for consolation and argument to this day: a 1997 pass interference penalty against Tech that nullified an effectively game-ending interception and was followed by UGA’s go-ahead touchdown, a 1998 two-point conversion rushing attempt by UGA quarterback Quincy Carter that was called no good, and a 1999 goal-line play in which UGA running back Jasper Sanks was ruled to have fumbled.

I included this to highlight an as yet undiscussed topic, namely that any blaming of officiating for losses, regardless of egregiousness (the officials who called the Sanks fumble were suspended), is amongst the oldest and purest forms of loser talk.

I’ve said way too much, as is often my way, but I hope you get the idea. Follow @Loser_Talk on twitter for daily instances of loser talk from the sports world and occasionally beyond, and please tweet at me any good examples you read or overhear. Enjoy!

 

Why the Best Picture of 2014 doesn’t stand a chance.

As award season in the film industry barrels down upon us once again, so do the requisite predictions and, inevitably, the soberingly inconsequential yet still remarkably bitter declarations of travesty and injustice after the fact. While most born on this side of World War II recognize the Hollywood Foreign Press and the Golden Globes as little more than a punchline, the Academy Awards still maintain a semblance of gravity through tradition, like triple crown statistics in a post-sabermetric world of content consumption and analysis. While my thinly-veiled irreverence should indicate that I am not easily moved by an arbitrary distribution of miniature golden RoboCops, I find myself this year anticipating an intensely frustrating slight weeks ahead of time, namely with regard to the Best Picture category.

Since up to ten films can be nominated, and most predictive lists provide some spillover, I have compiled for the sake of discussion the complete collection of movies that appear anywhere in the top fifteen Best Picture candidates of any publication with a relatively high search engine presence, loosely ordered based on a non-scientific composite of provided rankings. Accompanying each title are four pieces of data recorded as of December 24 from the two most popular film criticism aggregators, rottentomatoes.com (denoted here by RT) and metacritic.com, presented as follows:

[RT percentage (number of reviews on RT), RT average rating, Metacritic score]

A dictionary for the data is provided below, followed by the list itself:

RT percentage: This is the percentage of all the reviews found and vetted by rottentomatoes.com that are perceived as positive reviews. In the event that the reviewer provides a quantitative rating for the film (e.g. 3/5 stars), this “thumbs up/thumbs down” determination is made based on whether that rating normalizes to at least 6/10.

RT average rating: This is the average of all quantitative ratings provided by critics found and vetted by rottentomatoes.com, with all scores normalized out of 10 (e.g. 3/4 stars is normalized to 7.5/10).

Metacritic score: This is the average of all quantitative ratings provided by critics found and vetted by metacritic.com, with all scores normalized out of 100. Note that the Metacritic score and ten times the RT average rating are, in principle, completely equivalent metrics, but metacritic.com pulls from a narrower collection of film critics.
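As a rough illustration of the three definitions above, here is a minimal sketch. The function names and the four sample ratings are hypothetical, and it covers only reviews that come with a numeric rating; how the sites handle unrated reviews is not modeled here.

```python
from fractions import Fraction

def normalize(score, scale, out_of=10):
    """Rescale a rating of `score` on a `scale`-point scale to `out_of` points."""
    return Fraction(str(score)) * out_of / Fraction(str(scale))

def is_positive(score, scale):
    """Thumbs up if the rating normalizes to at least 6/10."""
    return normalize(score, scale) >= 6

def rt_percentage(ratings):
    """Percentage of (score, scale) ratings that count as positive."""
    positive = sum(is_positive(score, scale) for score, scale in ratings)
    return Fraction(100 * positive, len(ratings))

def average_rating(ratings, out_of=10):
    """Mean of all ratings after normalizing each one to `out_of` points."""
    total = sum(normalize(score, scale, out_of) for score, scale in ratings)
    return total / len(ratings)

# Hypothetical sample: 3/5 stars, 3/4 stars, 2/4 stars, 9/10.
SAMPLE = [(3, 5), (3, 4), (2, 4), (9, 10)]

if __name__ == "__main__":
    print(float(rt_percentage(SAMPLE)))        # share of positive reviews
    print(float(average_rating(SAMPLE)))       # RT-style average, out of 10
    print(float(average_rating(SAMPLE, 100)))  # Metacritic-style, out of 100
```

On identical inputs the last two numbers differ only by a factor of ten, which is the sense in which the two averages are equivalent in principle; in practice the sites draw on different pools of critics.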

Potential Best Picture Nominees

  1. Boyhood [99% (213), 9.3, 100]
  2. Birdman [93% (174), 8.5, 89]
  3. Selma [100% (54), 9, 100]
  4. The Imitation Game [89% (158), 7.9, 72]
  5. The Theory of Everything [81% (154), 7.4, 72]
  6. The Grand Budapest Hotel [92% (228), 8.4, 88]
  7. Unbroken [49% (105), 6.1, 60]
  8. Foxcatcher [86% (158), 7.8, 83]
  9. Whiplash [96% (183), 8.6, 87]
  10. A Most Violent Year [95% (21), 8.3, 86]
  11. Into the Woods [72% (96), 6.7, 70]
  12. Gone Girl [88% (249), 8, 79]
  13. Nightcrawler [95% (192), 8.2, 76]
  14. Interstellar [73% (258), 7.1, 74]
  15. Wild [93% (150), 7.6, 75]
  16. American Sniper [69% (52), 6.7, 71]
  17. Mr. Turner [97% (105), 8.5, 93]

Upon even a brief skimming of the numbers, a few things jump out. For example, based purely on critical reaction, Unbroken appears to have no place on this list. What’s the deal with that? Is Angelina Jolie sitting on DNA evidence that Cheryl Boone Isaacs killed Hae Min Lee in Baltimore in 1999? (No, I did not know the name of the president of the Academy; I looked it up, because I was committed to the bit.)

For a more positive outlier, Boyhood is, by any number of reasonable metrics, the best-reviewed film of the last 70 years, and one of the best of all time (on RT, only The Wizard of Oz and Citizen Kane have the narrowest of edges in average rating, but with a much smaller sample size). I saw it, rather fittingly with my father, and it was undeniably tremendous. I was floored by the sheer thoroughness and enormity of the project, the realism with which the writing captured familial relationships, and the subtlety and poignancy of the acting performances. It was about three hours long, and it felt even longer, which would typically serve as an indictment but in this case felt quite appropriate. As great as Boyhood was, I thought Birdman, Whiplash, Grand Budapest, and Nightcrawler were as good if not better. Gone Girl and Interstellar were really good too, and while I did not find them to be on the same tier as the aforementioned group, there were a few unlisted movies this year (to be discussed later) that I did, and I look forward to seeing nearly all of the other films on this list. It was a great year for movies, which is good for me, because I go to the movies a lot, occasionally without much discrimination regarding the quality of the offering.

However, despite all of the historically significant artistic merit splattered all over the list above, my brain is refusing to allow me to digest this consensus in tacit agreement. I scroll up and down this powerhouse roster of cinematic brilliance, but can only hear an echo of Cate Blanchett’s dulcet tone as Lady Galadriel reminding us of the ring of power:

There was another…

Indeed, there was, a sparkling oasis way back in February, smack in the harsh, arid center of the traditionally barren wasteland of first quarter theatrical releases. This film captures the viewer with jaw-droppingly stunning and immersive visual effects, as well as intense and compelling plot and character development. Its most distinguishing characteristic, however, is transcendently brilliant, well-executed comedy. The movie is a 100-minute non-stop barrage of uproarious haymakers, with a jokes-per-minute rate for the ages and an almost unfathomable comedic batting average. In fact, to extend the time-honored baseball analogy, the comedic production level of this film would put Barry Bonds’ four-year chemically-enhanced peak, revered by stat nerds as an unattainable standard, to shame. In addition, this tour de force is accomplished without resorting to any perceptible level of inappropriateness for any age group, and the film straddles the line between complete universal appeal and razor-sharp, ultra-high IQ screenwriting as well as any movie I have ever seen.

My initial reaction to the film was passionate and unequivocal, but perhaps there was an element of relativity at work; it was February, after all, and there was likely a month-long buffer on either side without a noteworthy release. Over the next ten months, however, the idea did not fade. It grew, as if it had been implanted in a dream within a dream within a dream by Leo DiCaprio and company (friends know I am more of a Tom Hardy man, myself). The other previously discussed masterpieces came and went, and that potentially subversive thought, initially dismissed as temporary, was nourished by time, discourse, and repeated viewings, and it evolved into a Lovecraftian beast of an opinion that I now perceive as bordering on objective fact:

The LEGO Movie is the best movie of the year.

In my recent conversational experience, reactions to this opinion have been mixed. Those who haven’t seen the movie usually think I’m joking, and those who have seen it typically echo my adoration, but fall just short of accepting my assertion entirely. I have already outlined why I like the movie so much, but before entering into any further qualitative discussion, let’s begin with a pure face-value analysis of the aggregate criticism data. If it were to be appended to the above list as the eighteenth film (full disclosure: that is exactly how it was ranked by awardscircuit.com, so it has not been ignored entirely), The LEGO Movie [96% (201), 8.1, 82] would tie Whiplash for fourth in RT percentage, behind only Selma, Boyhood, and Mr. Turner, and would rank ninth in both RT average rating and Metacritic score. This discrepancy in ranking based on choice of metric may be a bit jarring at first glance, but it is actually quite easily anticipated for reasons to be discussed later. It suffices to say for now that The LEGO Movie ranks either near the middle or the top of the pack when compared quantitatively to the other potential nominees, and it is intensely and almost universally beloved by those lucky enough to have seen it.
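For anyone who wants to re-derive where the film would land on each metric, here is a minimal sketch. The data is copied from the list of potential nominees above; the function names are my own, and ties are resolved by competition ranking (tied films share the better position), which is an assumption about how one ought to count.

```python
# (title, RT percentage, RT average rating, Metacritic score),
# copied from the list of potential nominees above.
FILMS = [
    ("Boyhood", 99, 9.3, 100),
    ("Birdman", 93, 8.5, 89),
    ("Selma", 100, 9.0, 100),
    ("The Imitation Game", 89, 7.9, 72),
    ("The Theory of Everything", 81, 7.4, 72),
    ("The Grand Budapest Hotel", 92, 8.4, 88),
    ("Unbroken", 49, 6.1, 60),
    ("Foxcatcher", 86, 7.8, 83),
    ("Whiplash", 96, 8.6, 87),
    ("A Most Violent Year", 95, 8.3, 86),
    ("Into the Woods", 72, 6.7, 70),
    ("Gone Girl", 88, 8.0, 79),
    ("Nightcrawler", 95, 8.2, 76),
    ("Interstellar", 73, 7.1, 74),
    ("Wild", 93, 7.6, 75),
    ("American Sniper", 69, 6.7, 71),
    ("Mr. Turner", 97, 8.5, 93),
]
LEGO = ("The LEGO Movie", 96, 8.1, 82)
METRICS = ["RT percentage", "RT average rating", "Metacritic score"]

def competition_rank(value, others):
    """1 plus the number of strictly better values; ties share a position."""
    return 1 + sum(1 for other in others if other > value)

def lego_ranks():
    """Position The LEGO Movie would hold on each metric among the 18 films."""
    ranks = {}
    for column, metric in enumerate(METRICS, start=1):
        others = [film[column] for film in FILMS]
        ranks[metric] = competition_rank(LEGO[column], others)
    return ranks

if __name__ == "__main__":
    for metric, rank in lego_ranks().items():
        print(f"{metric}: {rank} of {len(FILMS) + 1}")
```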

Of course, despite my fervor, I maintain realistic expectations. I fully recognize and respect that no matter how strongly I state the case, especially given the quality of the competition, most will ultimately disagree with the full strength of my thesis here, that The LEGO Movie should WIN Best Picture. After all, splitting hairs between genuinely terrific and impactful works of art will always boil down to some degree of taste, and the choice is certainly nontraditional. But why shouldn’t it at least get nominated? Better yet, why, if the prognosticators are to be believed, does it not even have a genuine chance? There is a short answer to the latter question, the one I recently provided my friend’s son when presenting him with The LEGO Movie on Blu-ray for his ninth birthday:

The people that vote for the Oscars are old and lame.

I could probably stop there and go back to watching all five Christmas day NBA games, but while we’re here, let’s dig a little deeper and investigate a pair of institutional biases working against The LEGO Movie, plaguing the Best Picture category and beyond.

The first is rather simple, and that is the bias against animated films. In the previous 86 Academy Awards, only three animated films have been nominated for Best Picture: Beauty and the Beast (1991) [93% (103), 8.4, N/A], Up (2009) [98% (280), 8.7, 88], and Toy Story 3 (2010) [99% (279), 8.9, 92]. Since 2001, there has been a readily available and exceptionally lazy explanation for this disparity: Best Animated Feature is its own category. This is essentially identical to the sentiment that pitchers should not receive the Most Valuable Player award in baseball because of the existence of the Cy Young Award for each league’s top pitcher (only Justin Verlander and Clayton Kershaw have broken that barrier since 1992). This could potentially make some sense, except for the blindingly obvious issue that in each case, one of the awards maintains the title and prestige of being a completely open category (i.e. they didn’t change Best Picture to Best Live Action Picture), so the de facto exclusions remain completely absurd. I submit that an objective, unclouded analysis finds that animated films account for much of the best story writing, character development, and visual effects of the two decades since Pixar began producing feature-length films, and as a result it is rather short work to come up with recent animated films that probably deserved a Best Picture nomination. Here are a few:

  • Toy Story (1995) [100% (78), 9, 92]
  • Toy Story 2 (1999) [100% (163), 8.6, 88] (the most RT reviews of any film with a 100% RT percentage)
  • Finding Nemo (2003) [99% (238), 8.6, 90]
  • Ratatouille (2007) [96% (231), 8.4, 96]
  • How to Train Your Dragon (2010) [98% (200), 7.9, 74]

Now seems like a good time to revisit the previously observed phenomenon exhibited by The LEGO Movie, in which the film ranked significantly higher with respect to RT percentage as opposed to the other metrics, as How to Train Your Dragon is another prime example of what one could call “depolarization”. For the purpose of exposition, let’s compare How to Train Your Dragon with The Dark Knight (2008) [94% (316), 8.6, 82]. The gap in RT percentages may seem slim, but consider that the negative reviews for The Dark Knight outnumber those for How to Train Your Dragon twenty to four. If you genuinely dislike The Dark Knight, you could be perceived as a snob or a contrarian, but if you genuinely dislike How to Train Your Dragon, you are perceived as a monster. Since animated films are often geared toward children and feature cute, lovable characters, critics seem unwilling to be especially harsh on them, even the bad ones.
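The negative-review counts behind that comparison can be recovered, approximately, from the bracketed data. The sketch below is illustrative only: the function names are my own, and it assumes the published RT percentage is the exact percentage rounded half-up to a whole number, which I have not confirmed. Because of that rounding, each published figure is consistent with a small range of counts rather than a single one.

```python
def estimated_negative_reviews(rt_percentage, review_count):
    """Point estimate: the non-positive share of the reviews, rounded."""
    return round(review_count * (100 - rt_percentage) / 100)

def consistent_negative_counts(rt_percentage, review_count):
    """Every negative-review count that would display as `rt_percentage`.

    Assumes half-up rounding to a whole percent (an assumption), done in
    integer arithmetic to avoid floating-point edge cases.
    """
    counts = []
    for negative in range(review_count + 1):
        positive = review_count - negative
        shown = (200 * positive + review_count) // (2 * review_count)
        if shown == rt_percentage:
            counts.append(negative)
    return counts

if __name__ == "__main__":
    films = [("The Dark Knight", 94, 316), ("How to Train Your Dragon", 98, 200)]
    for title, pct, n in films:
        print(title, estimated_negative_reviews(pct, n),
              consistent_negative_counts(pct, n))
```

Both twenty and four fall inside the respective ranges this produces, so the ratio quoted above is compatible with the published percentages and review counts.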

On the flip side, the gaps between the films in the other two metrics, considering these are both beloved films with large critic sample sizes, are relatively cavernous, and they run in the opposite direction from the first gap. This could partially be explained by an unwillingness of some critics to entertain the possibility that an animated film could possess the same level of artistic merit as a live action classic, making them willing to rate the film highly, but not TOO highly, like a gymnastics routine with a less than maximum start value. The combined effect of these two forces is a massive accumulation of scores in the third quartile, above the middle but away from the top, leading to ultra-high RT percentages but relatively tame RT average ratings and Metacritic scores. One could interpret this phenomenon as a patronization of the animated films, patting them on the back while simultaneously denying them access to rarefied artistic air, and one can only assume that the phenomenon extends to award voters. A lifting of depolarization through total objectivity would undoubtedly drop the RT percentage of the more mediocre animated films, but would vault the elite into their rightful place amongst the masterworks.

Intuition might suggest that much of the outlined bias against animated films could also apply to live action films perceived as geared primarily toward children, and on a long term scale this appears to largely check out. A perusal of Best Picture nominees from the last few decades finds only two counterexamples: Hugo (2011) [94% (203), 8.3, 83] and Babe (1995) [97% (68), 8.3, 83]. These well-deserving, critical darling exceptions make one potential oncoming anomaly that much more perplexing: Why in the world is Into the Woods on the proverbial bubble of these projection rankings despite a decidedly lukewarm critical response? Any number of explanations are possible, and they segue nicely into the second bias to be explored. Into the Woods’ biggest draws are a legendary actress, an award-winning director, and beloved source material, all of which have huge pull with older voters, whereas The LEGO Movie’s biggest draws are its visual effects and, more to the point, that it is utterly freaking hilarious, and the latter just doesn’t seem to pull its weight in the awards world.

To that end, let me pose a question: In the last half century (and probably further, I stop recognizing the titles at a certain point), how many Best Picture winners, if you were forced to assign them to one film genre, would be classified as comedies?

Of course, film classification is more akin to a continuous, high-dimensional Euclidean space than it is a small, discrete collection of pigeonholes, so the answer is up to a certain degree of interpretation, but I submit that the most likely arrived at answer is ONE: Annie Hall (1977) [98% (60), 8.9, 82], with reasonable pivots in American Beauty (1999) [88% (168), 8.1, 86] and The Artist (2011) [98% (229), 8.8, 89], and super long stretches to Terms of Endearment (1983) [88% (41), 7.6, 79], Driving Miss Daisy (1989) [81% (52), 7.1, 81], Forrest Gump (1994) [71% (80), 7.1, 82], and Chicago (2002) [87% (226), 7.9, 82].

Using a very liberal interpretation of this same standard, I could reluctantly talk myself into identifying about twenty more “comedies” that were nominated for Best Picture since Annie Hall took home the Oscar. Amongst this collection, films like Little Miss Sunshine (2006) [91% (208), 7.1, 80] and Juno (2007) [94% (205), 8.1, 81], nestled relatively close to the positive comedy coordinate axis, are by far the exception rather than the rule. Almost all, in fact, would be more readily classified as a “comedic drama”, or a “black comedy thriller”, or a “science fiction romantic comedy-drama” (that last one is straight off of the Wikipedia page for Her (2013) [94% (229), 8.5, 90]). To summarize, in very few cases was a film nominated for Best Picture when its comedic value was its primary offering. One might observe that, according to the Academy, a film must possess some form of “substance” to serve as a nutritious protein to feature opposite the comedic dessert in order to be Oscar-worthy. The implicit message there is, of course, the root of the problem: despite its rich history, its powerful emotional impact, and its unequaled degree of difficulty, comedy, to many critics and voters, is not substance.

Just for fun, here is a cross-section of eight somewhat “purer” comedies spanning a few decades and comedy styles which have achieved some degree of legendary status, rank amongst my personal all-time favorites, and were deprived of a deserved nomination for Best Picture. There was effectively no upper bound as to how long I could have made this list – I kept digging and finding more – but I erred on the side of brevity:

  • Harold and Maude (1971) [86% (42), 7.6, N/A]
  • This is Spinal Tap (1984) [95% (61), 8.6, 85]
  • The Princess Bride (1987) [97% (63), 8.3, 77]
  • Defending Your Life (1990) [96% (28), 7.5, N/A]
  • The Big Lebowski (1998) [80% (86), 7.2, 69]
  • Office Space (1999) [79% (95), 6.8, 68]
  • Best in Show (2000) [95% (110), 7.5, 78]
  • The Royal Tenenbaums (2001) [80% (201), 7.5, 75]

Granted, some of these films took years to permeate the zeitgeist and establish the almost religious degree of reverence and popularity that they now enjoy, but for many other brilliant comedies this was not the case.

For example, the fact that The Grand Budapest Hotel appears primed for a nomination may appear to be a strike against my stipulated conclusions, but I would counter that it serves as an exception that proves the rule. More specifically, how is it that Wes Anderson has never had a film nominated for Best Picture before? While the real-time critical reception of The Royal Tenenbaums (which WAS nominated for Best Original Screenplay) was not that of typical Oscar bait, the same cannot be said for Rushmore (1999) [89% (102), 8.1, 86], Moonrise Kingdom (2012) [94% (224), 8.2, 84] (also nominated for Best Original Screenplay), and the uniquely pleasure-inducing Fantastic Mr. Fox (2009) [92% (225), 7.9, 83], which was nominated for Best Animated Feature, was definitely worthy of a place in the previous segment on animated films, and is perhaps the best approximation and precedent we have for The LEGO Movie. In recent years, Anderson has been the target of some backlash, with frequent contention that his films recycle a somewhat uniform cadence and aesthetic, and a smug, overt intellectualism. Each of these observations has a degree of validity, but it is a point of contention as to whether they should serve as criticisms or appreciated common threads. While the argument can be made that Anderson has been overly deified by a certain segment of hipster culture, he is undoubtedly amongst the most gifted and influential filmmakers to ever live, and the fact that he has yet to have a film nominated for Best Picture feels a bit criminal.

But I digress, and it is worth noting that The LEGO Movie is not the only casualty of these and similar biases this award season (remember I promised more top-tier unlisted movies earlier). Specifically, Guardians of the Galaxy [90% (251), 7.7, 76], Frank [93% (128), 7.5, 75], and The Skeleton Twins [87% (141), 7, 74] all belong on the short list of potential nominees. While each film has substantial drama and/or fantasy elements, they are each distinguished from their respective peer groups and springboarded into the year’s elite by how thoroughly and astonishingly funny they are. To distill the message down, when voters and critics literally or figuratively “score” these films in their minds, they are apparently not adding nearly enough points for the quantity and quality of Ph.D.-level comedic elements, because if they were, there is simply no way these would all be miles clear of the Best Picture radar screen.

So what are some potential solutions here? For one, while I previously lamented the relegation of animated films to their own minor league category, I suppose the addition of a Best Comedy category to the Academy Awards would be better than the status quo. Better still would be the division of the Best Picture category into Best Comedy and Best Drama, not dissimilar to the Golden Globes, but perhaps the ideal scenario would be genre-specific categories in addition to culminating the event with a completely open, objective, unbiased Best Picture category. Anything would be better than nothing, though, because in the current state of affairs, with regard to mainstream recognition for cinematic achievement, comedies are getting royally screwed.

But maybe it’s too much to ask for the Academy Awards to change. They are bound by tradition and set in their ways (you know, the things you say to dismiss all the bizarre, racist stuff your great-great aunt Ruth says at Thanksgiving). Maybe it’s better to recognize that the determination of “the best movie of the year” and the determination of “Best Picture at the Academy Awards” are distinct, unrelated endeavors, each with their own rules and criteria: “That’s a great movie, but it’s just not an Oscar movie.” Similar sentiments have been echoed for years about MVP awards and most notably the Heisman trophy: “Oh, c’mon, the Heisman isn’t the best player in college football, it’s the best upperclassman quarterback or running back on a national championship contender, everyone knows that.” Except, not everyone does know that. That’s not what the Downtown Athletic Club says it is, and that’s not what an overwhelming supermajority of Americans think it is. Somehow, despite all the peculiar trends, the Heisman carries nearly as much weight in the popular culture at large as it always has, and the same is true for Best Picture. When someone wants to browse through the history of cinema, or get a snapshot of what the film landscape looked like in a particular year, they are probably going to start by looking at the Best Picture nominees, so it would be nice if the movement to ignore them could gain some traction, or better yet, if the Academy could try a little harder to get them right.

Ok, time to get off my high horse, lighten up, and remember what this was all supposed to be about: The LEGO Movie! I would be remiss if I didn’t give the credit where it’s due, in case anyone doesn’t know. The LEGO Movie was written and directed by superduo Phil Lord and Christopher Miller. Their first movie, and first huge success, was Cloudy with a Chance of Meatballs (2009) [87% (139), 7.3, 66], which they followed up by defying widespread skepticism and knocking 21 Jump Street (2012) [85% (209), 7.2, 69] out of the stadium. And if making an animated comedy about plastic toys that may very well be better than Richard Linklater’s twelve-year-in-the-making life’s work isn’t impressive enough, they recently pulled off something potentially rarer: a genuinely original and thoroughly successful comedy sequel, 22 Jump Street (2014) [84% (197), 7, 71]. We’ll see if they can pull it off again; they are writing The LEGO Movie 2 to be released in 2018. Everything they’ve touched so far has turned to solid gold, so moving forward I’ll assume whatever they do is spectacular, mandatory viewing until proven otherwise.

If you haven’t seen The LEGO Movie, see it! If you’ve seen it, see it again! You might not agree with me that it’s the best movie of 2014, but if you agree with my core message, then spread the word. Tell people how funny it is, how beautiful it is, how great the characters and the story are. Shout from the rooftops that The LEGO Movie is a great movie – not a great kids movie, not a great animated movie, not a great comedy – a great movie. Better yet, tell them the whole truth about the film in three words:

EVERYTHING IS AWESOME!!!

(I’m sorry, I had to.)