Data Mining

1 Early Data Mining
I loved watching and playing sports as a boy. I began playing Farm League at age 8 and then Little League from age 10 to 12. I loved pitching because of the cognitive battle: nothing was more fun than figuring out a way to make the hitter look foolish by throwing an unexpected pitch. Over time, I began to notice patterns in individual hitters' approaches that could be exploited. My father and I would spend hours discussing hitters and how to get them out, including which pitch sequences would work best against each one. I did not think about it at the time, but in retrospect, this was data collection and predictive modeling in a rudimentary (but exciting!) form. The only thing I was missing was storing the transactional data in an Oracle database.

For me, an aspect of playing baseball that was almost as fun as pitching was computing the statistics after the game. I still have my game-by-game box scores, complete with cumulative batting average and earned run average (ERA). I thank my father for teaching me how to calculate these statistics. ERA is not an easy thing for a 9-year-old to compute, though it is a very nice statistic because it normalizes runs allowed to the length of a typical game. Tracking cumulative statistics helped give me a sense of how important the denominator is in computing statistics: the longer the period over which a rate is calculated, the harder it is to change its value.

This led to another childhood pastime that involved summary statistics: Strat-O-Matic Baseball. For those unfamiliar with this game, the Strat-O-Matic website describes it like this:
Since
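
The box-score arithmetic I describe above, ERA and cumulative batting average, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names, the `innings_per_game` parameter, and all of the numbers are hypothetical, not taken from my actual box scores. ERA is conventionally scaled to a 9-inning game; a Little League game is 6 innings, so the game length is left as a parameter.

```python
def era(earned_runs, innings_pitched, innings_per_game=9):
    """Earned runs allowed, scaled to one full-length game."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return earned_runs * innings_per_game / innings_pitched


def cumulative_average(hits_by_game, at_bats_by_game):
    """Running batting average after each game in the sequence."""
    averages = []
    total_hits = 0
    total_at_bats = 0
    for hits, at_bats in zip(hits_by_game, at_bats_by_game):
        total_hits += hits
        total_at_bats += at_bats
        # Avoid dividing by zero before the first official at-bat.
        averages.append(total_hits / total_at_bats if total_at_bats else 0.0)
    return averages


# Hypothetical numbers, for illustration only.
# 4 earned runs over 12 innings, scaled to a 6-inning game.
print(era(earned_runs=4, innings_pitched=12, innings_per_game=6))

# The same 3-for-3 game, early in a season and late in a season.
early = cumulative_average([1, 3], [4, 3])    # 1/4, then 4/7
late = cumulative_average([10, 3], [40, 3])   # 10/40, then 13/43
print(early[1] - early[0], late[1] - late[0])
```

Both hitters start at .250 and then go 3-for-3, but the jump is far smaller for the one with 40 at-bats behind him. That is the denominator effect: the more data already in the rate, the less any single game can move it.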
