|
|
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "**NAVIGATION**\n",
- "\n",
- "**Got Pandas? _Practical Data Wrangling with Pandas_**\n",
- "\n",
- "* [Introduction](./0_introduction.ipynb)\n",
- "1. [Data Structures](./1_data_structures.ipynb)\n",
- "2. [Importing Data](./2_importing_data.ipynb)\n",
- "3. **Manipulating DataFrames**\n",
- "4. [Wrap Up](./4_wrapping_up.ipynb)\n",
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "toc": "true"
- },
- "source": [
- "# Table of Contents\n",
- " <p><div class=\"lev1 toc-item\"><a href=\"#Manipulating-DataFrames\" data-toc-modified-id=\"Manipulating-DataFrames-1\"><span class=\"toc-item-num\">1 </span>Manipulating DataFrames</a></div><div class=\"lev2 toc-item\"><a href=\"#More-Selecting\" data-toc-modified-id=\"More-Selecting-11\"><span class=\"toc-item-num\">1.1 </span>More Selecting</a></div><div class=\"lev3 toc-item\"><a href=\"#The-convenient--[]-operator-(again)\" data-toc-modified-id=\"The-convenient--[]-operator-(again)-111\"><span class=\"toc-item-num\">1.1.1 </span>The convenient <code>[]</code> operator (<em>again</em>)</a></div><div class=\"lev3 toc-item\"><a href=\"#Selecting-data-by--.-selector-on-column-and-index-name\" data-toc-modified-id=\"Selecting-data-by--.-selector-on-column-and-index-name-112\"><span class=\"toc-item-num\">1.1.2 </span>Selecting data by <code>.</code> selector on column and index name</a></div><div class=\"lev3 toc-item\"><a href=\"#Boolean-selecting\" data-toc-modified-id=\"Boolean-selecting-113\"><span class=\"toc-item-num\">1.1.3 </span>Boolean selecting</a></div><div class=\"lev2 toc-item\"><a href=\"#Sorting\" data-toc-modified-id=\"Sorting-12\"><span class=\"toc-item-num\">1.2 </span>Sorting</a></div><div class=\"lev2 toc-item\"><a href=\"#DataFrame-manipulation\" data-toc-modified-id=\"DataFrame-manipulation-13\"><span class=\"toc-item-num\">1.3 </span>DataFrame manipulation</a></div><div class=\"lev3 toc-item\"><a href=\"#Adding-and-dropping-columns\" data-toc-modified-id=\"Adding-and-dropping-columns-131\"><span class=\"toc-item-num\">1.3.1 </span>Adding and dropping columns</a></div><div class=\"lev3 toc-item\"><a href=\"#Adding-and-dropping-rows\" data-toc-modified-id=\"Adding-and-dropping-rows-132\"><span class=\"toc-item-num\">1.3.2 </span>Adding and dropping rows</a></div><div class=\"lev2 toc-item\"><a href=\"#Advanced-indexing\" data-toc-modified-id=\"Advanced-indexing-14\"><span class=\"toc-item-num\">1.4 </span>Advanced indexing</a></div>"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Manipulating DataFrames"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Let's quickly review our terminology:\n",
- "* **index** : the column and row indices of your Series or DataFrame; the index for each of these may be hierarchical\n",
- "    * row index : the labels that identify each row (listed down the left-hand side), typically used as the primary index\n",
- "    * column index : the labels that identify each column (listed across the top)\n",
- " \n",
- " \n",
- "* **axis** : the numeric designation for the _row_ or _column_ indices; in Pandas `0` is the _row-axis_ (the `index`) and `1` is the _column-axis_ (the `columns`). When dealing with multi-indices, the hierarchies within an axis are referred to as _levels_ and are accessed similarly\n",
- " \n",
- " \n",
- "**NOTEBOOK OBJECTIVES**\n",
- "\n",
- "In this notebook we'll:\n",
- "\n",
- "* explore more complex slicing and selecting, \n",
- "* look at DataFrame concatenation and appending,\n",
- "* explore Multi-Indices / hierarchical indexing in Pandas."
- ]
- },
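- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As a quick check on the _axis_ terminology, here is a minimal sketch on a made-up two-row DataFrame (the `toy` name and its values are illustrative assumptions, not part of the Baseball data). Reducing along `axis=0` collapses the rows and leaves one value per column; reducing along `axis=1` collapses the columns and leaves one value per row.\n",
- "\n",
- "```python\n",
- "import pandas as pd\n",
- "\n",
- "# hypothetical toy data, used only to illustrate the axis argument\n",
- "toy = pd.DataFrame({'G': [10, 20], 'RBI': [1, 3]}, index=['a', 'b'])\n",
- "\n",
- "per_column = toy.sum(axis=0)  # one total per column label (G, RBI)\n",
- "per_row = toy.sum(axis=1)     # one total per row label (a, b)\n",
- "```"
- ]
- },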
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## More Selecting\n",
- "In the example for this section, we're going to go back to our Baseball data set and load the batting statistics into a DataFrame."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "\n",
- "# load the complete batting statistics table (all years and players in the file)\n",
- "df = pd.read_csv(\"./datasets/Batting.csv\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### The convenient `[]` operator (_again_)\n",
- "\n",
- "As before, basic slice selections can be made with a list-like syntax using the `[]` operator. For example, we can obtain the first 5 rows of our data, or the last 15."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>yearID</th>\n",
- " <th>stint</th>\n",
- " <th>teamID</th>\n",
- " <th>lgID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>R</th>\n",
- " <th>H</th>\n",
- " <th>2B</th>\n",
- " <th>...</th>\n",
- " <th>RBI</th>\n",
- " <th>SB</th>\n",
- " <th>CS</th>\n",
- " <th>BB</th>\n",
- " <th>SO</th>\n",
- " <th>IBB</th>\n",
- " <th>HBP</th>\n",
- " <th>SH</th>\n",
- " <th>SF</th>\n",
- " <th>GIDP</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>0</th>\n",
- " <td>abercda01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>TRO</td>\n",
- " <td>NaN</td>\n",
- " <td>1</td>\n",
- " <td>4</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>1</th>\n",
- " <td>addybo01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>RC1</td>\n",
- " <td>NaN</td>\n",
- " <td>25</td>\n",
- " <td>118</td>\n",
- " <td>30</td>\n",
- " <td>32</td>\n",
- " <td>6</td>\n",
- " <td>...</td>\n",
- " <td>13.0</td>\n",
- " <td>8.0</td>\n",
- " <td>1.0</td>\n",
- " <td>4</td>\n",
- " <td>0.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2</th>\n",
- " <td>allisar01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>CL1</td>\n",
- " <td>NaN</td>\n",
- " <td>29</td>\n",
- " <td>137</td>\n",
- " <td>28</td>\n",
- " <td>40</td>\n",
- " <td>4</td>\n",
- " <td>...</td>\n",
- " <td>19.0</td>\n",
- " <td>3.0</td>\n",
- " <td>1.0</td>\n",
- " <td>2</td>\n",
- " <td>5.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>3</th>\n",
- " <td>allisdo01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>WS3</td>\n",
- " <td>NaN</td>\n",
- " <td>27</td>\n",
- " <td>133</td>\n",
- " <td>28</td>\n",
- " <td>44</td>\n",
- " <td>10</td>\n",
- " <td>...</td>\n",
- " <td>27.0</td>\n",
- " <td>1.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0</td>\n",
- " <td>2.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>4</th>\n",
- " <td>ansonca01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>RC1</td>\n",
- " <td>NaN</td>\n",
- " <td>25</td>\n",
- " <td>120</td>\n",
- " <td>29</td>\n",
- " <td>39</td>\n",
- " <td>11</td>\n",
- " <td>...</td>\n",
- " <td>16.0</td>\n",
- " <td>6.0</td>\n",
- " <td>2.0</td>\n",
- " <td>2</td>\n",
- " <td>1.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "<p>5 rows × 22 columns</p>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID yearID stint teamID lgID G AB R H 2B ... RBI SB \\\n",
- "0 abercda01 1871 1 TRO NaN 1 4 0 0 0 ... 0.0 0.0 \n",
- "1 addybo01 1871 1 RC1 NaN 25 118 30 32 6 ... 13.0 8.0 \n",
- "2 allisar01 1871 1 CL1 NaN 29 137 28 40 4 ... 19.0 3.0 \n",
- "3 allisdo01 1871 1 WS3 NaN 27 133 28 44 10 ... 27.0 1.0 \n",
- "4 ansonca01 1871 1 RC1 NaN 25 120 29 39 11 ... 16.0 6.0 \n",
- "\n",
- " CS BB SO IBB HBP SH SF GIDP \n",
- "0 0.0 0 0.0 NaN NaN NaN NaN NaN \n",
- "1 1.0 4 0.0 NaN NaN NaN NaN NaN \n",
- "2 1.0 2 5.0 NaN NaN NaN NaN NaN \n",
- "3 1.0 0 2.0 NaN NaN NaN NaN NaN \n",
- "4 2.0 2 1.0 NaN NaN NaN NaN NaN \n",
- "\n",
- "[5 rows x 22 columns]"
- ]
- },
- "execution_count": 2,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df[:5]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>yearID</th>\n",
- " <th>stint</th>\n",
- " <th>teamID</th>\n",
- " <th>lgID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>R</th>\n",
- " <th>H</th>\n",
- " <th>2B</th>\n",
- " <th>...</th>\n",
- " <th>RBI</th>\n",
- " <th>SB</th>\n",
- " <th>CS</th>\n",
- " <th>BB</th>\n",
- " <th>SO</th>\n",
- " <th>IBB</th>\n",
- " <th>HBP</th>\n",
- " <th>SH</th>\n",
- " <th>SF</th>\n",
- " <th>GIDP</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>102801</th>\n",
- " <td>ynoaga01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>NYN</td>\n",
- " <td>NL</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102802</th>\n",
- " <td>ynoami01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>CHA</td>\n",
- " <td>AL</td>\n",
- " <td>23</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102803</th>\n",
- " <td>ynoara01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>COL</td>\n",
- " <td>NL</td>\n",
- " <td>3</td>\n",
- " <td>5</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102804</th>\n",
- " <td>youngch03</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>KCA</td>\n",
- " <td>AL</td>\n",
- " <td>34</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102805</th>\n",
- " <td>youngch04</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>BOS</td>\n",
- " <td>AL</td>\n",
- " <td>76</td>\n",
- " <td>203</td>\n",
- " <td>29</td>\n",
- " <td>56</td>\n",
- " <td>18</td>\n",
- " <td>...</td>\n",
- " <td>24.0</td>\n",
- " <td>4.0</td>\n",
- " <td>2.0</td>\n",
- " <td>21</td>\n",
- " <td>50.0</td>\n",
- " <td>0.0</td>\n",
- " <td>3.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>4.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102806</th>\n",
- " <td>younger03</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>NYA</td>\n",
- " <td>AL</td>\n",
- " <td>6</td>\n",
- " <td>1</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102807</th>\n",
- " <td>youngma03</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>ATL</td>\n",
- " <td>NL</td>\n",
- " <td>8</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102808</th>\n",
- " <td>zastrro01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>CHN</td>\n",
- " <td>NL</td>\n",
- " <td>8</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102809</th>\n",
- " <td>zieglbr01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>ARI</td>\n",
- " <td>NL</td>\n",
- " <td>36</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102810</th>\n",
- " <td>zieglbr01</td>\n",
- " <td>2016</td>\n",
- " <td>2</td>\n",
- " <td>BOS</td>\n",
- " <td>AL</td>\n",
- " <td>33</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102811</th>\n",
- " <td>zimmejo02</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>DET</td>\n",
- " <td>AL</td>\n",
- " <td>19</td>\n",
- " <td>4</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102812</th>\n",
- " <td>zimmery01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>115</td>\n",
- " <td>427</td>\n",
- " <td>60</td>\n",
- " <td>93</td>\n",
- " <td>18</td>\n",
- " <td>...</td>\n",
- " <td>46.0</td>\n",
- " <td>4.0</td>\n",
- " <td>1.0</td>\n",
- " <td>29</td>\n",
- " <td>104.0</td>\n",
- " <td>1.0</td>\n",
- " <td>5.0</td>\n",
- " <td>0.0</td>\n",
- " <td>6.0</td>\n",
- " <td>12.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102813</th>\n",
- " <td>zobribe01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>CHN</td>\n",
- " <td>NL</td>\n",
- " <td>147</td>\n",
- " <td>523</td>\n",
- " <td>94</td>\n",
- " <td>142</td>\n",
- " <td>31</td>\n",
- " <td>...</td>\n",
- " <td>76.0</td>\n",
- " <td>6.0</td>\n",
- " <td>4.0</td>\n",
- " <td>96</td>\n",
- " <td>82.0</td>\n",
- " <td>6.0</td>\n",
- " <td>4.0</td>\n",
- " <td>4.0</td>\n",
- " <td>4.0</td>\n",
- " <td>17.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102814</th>\n",
- " <td>zuninmi01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>SEA</td>\n",
- " <td>AL</td>\n",
- " <td>55</td>\n",
- " <td>164</td>\n",
- " <td>16</td>\n",
- " <td>34</td>\n",
- " <td>7</td>\n",
- " <td>...</td>\n",
- " <td>31.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>21</td>\n",
- " <td>65.0</td>\n",
- " <td>0.0</td>\n",
- " <td>6.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>102815</th>\n",
- " <td>zychto01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>SEA</td>\n",
- " <td>AL</td>\n",
- " <td>12</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "<p>15 rows × 22 columns</p>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID yearID stint teamID lgID G AB R H 2B ... \\\n",
- "102801 ynoaga01 2016 1 NYN NL 10 3 0 0 0 ... \n",
- "102802 ynoami01 2016 1 CHA AL 23 0 0 0 0 ... \n",
- "102803 ynoara01 2016 1 COL NL 3 5 0 0 0 ... \n",
- "102804 youngch03 2016 1 KCA AL 34 1 0 0 0 ... \n",
- "102805 youngch04 2016 1 BOS AL 76 203 29 56 18 ... \n",
- "102806 younger03 2016 1 NYA AL 6 1 2 0 0 ... \n",
- "102807 youngma03 2016 1 ATL NL 8 0 0 0 0 ... \n",
- "102808 zastrro01 2016 1 CHN NL 8 3 0 0 0 ... \n",
- "102809 zieglbr01 2016 1 ARI NL 36 0 0 0 0 ... \n",
- "102810 zieglbr01 2016 2 BOS AL 33 0 0 0 0 ... \n",
- "102811 zimmejo02 2016 1 DET AL 19 4 0 1 0 ... \n",
- "102812 zimmery01 2016 1 WAS NL 115 427 60 93 18 ... \n",
- "102813 zobribe01 2016 1 CHN NL 147 523 94 142 31 ... \n",
- "102814 zuninmi01 2016 1 SEA AL 55 164 16 34 7 ... \n",
- "102815 zychto01 2016 1 SEA AL 12 0 0 0 0 ... \n",
- "\n",
- " RBI SB CS BB SO IBB HBP SH SF GIDP \n",
- "102801 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "102802 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "102803 0.0 0.0 0.0 0 2.0 0.0 0.0 0.0 0.0 0.0 \n",
- "102804 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "102805 24.0 4.0 2.0 21 50.0 0.0 3.0 0.0 0.0 4.0 \n",
- "102806 0.0 1.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "102807 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "102808 0.0 0.0 0.0 0 2.0 0.0 0.0 0.0 0.0 0.0 \n",
- "102809 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "102810 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "102811 0.0 0.0 0.0 0 2.0 0.0 0.0 1.0 0.0 0.0 \n",
- "102812 46.0 4.0 1.0 29 104.0 1.0 5.0 0.0 6.0 12.0 \n",
- "102813 76.0 6.0 4.0 96 82.0 6.0 4.0 4.0 4.0 17.0 \n",
- "102814 31.0 0.0 0.0 21 65.0 0.0 6.0 0.0 1.0 0.0 \n",
- "102815 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "\n",
- "[15 rows x 22 columns]"
- ]
- },
- "execution_count": 3,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df[-15:]"
- ]
- },
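- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The same slices can also be written with `head` and `tail`, which some readers find easier to scan. A minimal sketch, assuming a small stand-in DataFrame (`toy` is hypothetical) rather than the full batting table:\n",
- "\n",
- "```python\n",
- "import pandas as pd\n",
- "\n",
- "# hypothetical stand-in for the batting data\n",
- "toy = pd.DataFrame({'G': range(20)})\n",
- "\n",
- "first_five = toy[:5]      # positional row slice, like a list\n",
- "last_fifteen = toy[-15:]  # a negative start counts from the end\n",
- "\n",
- "# head/tail return the same rows as the slices above\n",
- "same_head = first_five.equals(toy.head(5))\n",
- "same_tail = last_fifteen.equals(toy.tail(15))\n",
- "```"
- ]
- },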
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "So far we have mostly used the `[]` selector for _row slicing_, but if we pass a _column label_ or a **list** of column labels, say `RBI` and `G` (games played), we get mostly what we'd expect:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "df[\"RBI\"][:5]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "df[[\"RBI\", \"G\"]][:10]"
- ]
- },
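- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "One detail worth checking is the type of the result, which depends on what goes inside the brackets. In this sketch (the `toy` DataFrame is a made-up stand-in for the batting data), a single label gives back a Series, while a list of labels, even a one-element list, gives back a DataFrame.\n",
- "\n",
- "```python\n",
- "import pandas as pd\n",
- "\n",
- "# hypothetical toy data with the same column labels as the example\n",
- "toy = pd.DataFrame({'RBI': [0.0, 13.0, 19.0], 'G': [1, 25, 29]})\n",
- "\n",
- "one_column = toy['RBI']          # single label -> Series\n",
- "two_columns = toy[['RBI', 'G']]  # list of labels -> DataFrame\n",
- "still_a_frame = toy[['RBI']]     # one-element list -> one-column DataFrame\n",
- "```"
- ]
- },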
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Selecting data by `.` selector on column and index name\n",
- "\n",
- "We can obtain _column_ data by column labels (note that the column index was loaded for us when we read the file into the DataFrame). For example to get all the `RBI` data:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0 0.0\n",
- "1 13.0\n",
- "2 19.0\n",
- "3 27.0\n",
- "4 16.0\n",
- "5 5.0\n",
- "6 2.0\n",
- "7 34.0\n",
- "8 1.0\n",
- "9 11.0\n",
- "Name: RBI, dtype: float64"
- ]
- },
- "execution_count": 4,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df.RBI[:10]"
- ]
- },
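- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "One caveat with the `.` selector: it only works when the column label is a valid Python identifier and does not clash with an existing DataFrame attribute or method. The batting data has a column named `2B`, so an expression like `df.2B` would be a syntax error and `df['2B']` is needed instead. A minimal sketch on a made-up `toy` DataFrame:\n",
- "\n",
- "```python\n",
- "import pandas as pd\n",
- "\n",
- "# hypothetical toy frame: one identifier-like label and one that is not\n",
- "toy = pd.DataFrame({'RBI': [13.0, 19.0], '2B': [6, 4]})\n",
- "\n",
- "via_dot = toy.RBI          # attribute access works for 'RBI'\n",
- "via_brackets = toy['RBI']  # bracket access works for any label\n",
- "\n",
- "doubles = toy['2B']        # toy.2B would not parse, so use brackets\n",
- "```"
- ]
- },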
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The `.` selector returns a single column. To get several columns at once we go back to `[]` and pass a **list** of the columns we'd like, so let's get the `RBI` and `G` (games played) data:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>RBI</th>\n",
- " <th>G</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>0</th>\n",
- " <td>0.0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>1</th>\n",
- " <td>13.0</td>\n",
- " <td>25</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2</th>\n",
- " <td>19.0</td>\n",
- " <td>29</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>3</th>\n",
- " <td>27.0</td>\n",
- " <td>27</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>4</th>\n",
- " <td>16.0</td>\n",
- " <td>25</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>5</th>\n",
- " <td>5.0</td>\n",
- " <td>12</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>6</th>\n",
- " <td>2.0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>7</th>\n",
- " <td>34.0</td>\n",
- " <td>31</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>8</th>\n",
- " <td>1.0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>9</th>\n",
- " <td>11.0</td>\n",
- " <td>18</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " RBI G\n",
- "0 0.0 1\n",
- "1 13.0 25\n",
- "2 19.0 29\n",
- "3 27.0 27\n",
- "4 16.0 25\n",
- "5 5.0 12\n",
- "6 2.0 1\n",
- "7 34.0 31\n",
- "8 1.0 1\n",
- "9 11.0 18"
- ]
- },
- "execution_count": 5,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df[[\"RBI\", \"G\"]][:10]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Boolean selecting\n",
- "So far our selections have been based only on index values. Now we're ready to introduce selecting by boolean value. With this kind of selection, we ask Pandas for the Series or DataFrame of _boolean_ values that marks what we want, then let `loc` reduce the original Series or DataFrame to the matching entries. Let's see this in action.\n",
- "\n",
- "Say we want to find all items in our DataFrame where `yearID` is `2015`, that is\n",
- "\n",
- "```\n",
- "df.yearID == 2015\n",
- "```\n",
- "\n",
- "Let's first see what this does."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0 False\n",
- "1 False\n",
- "2 False\n",
- "3 False\n",
- "4 False\n",
- "5 False\n",
- "6 False\n",
- "7 False\n",
- "8 False\n",
- "9 False\n",
- "10 False\n",
- "11 False\n",
- "12 False\n",
- "13 False\n",
- "14 False\n",
- "15 False\n",
- "16 False\n",
- "17 False\n",
- "18 False\n",
- "19 False\n",
- "20 False\n",
- "21 False\n",
- "22 False\n",
- "23 False\n",
- "24 False\n",
- "25 False\n",
- "26 False\n",
- "27 False\n",
- "28 False\n",
- "29 False\n",
- " ... \n",
- "102786 False\n",
- "102787 False\n",
- "102788 False\n",
- "102789 False\n",
- "102790 False\n",
- "102791 False\n",
- "102792 False\n",
- "102793 False\n",
- "102794 False\n",
- "102795 False\n",
- "102796 False\n",
- "102797 False\n",
- "102798 False\n",
- "102799 False\n",
- "102800 False\n",
- "102801 False\n",
- "102802 False\n",
- "102803 False\n",
- "102804 False\n",
- "102805 False\n",
- "102806 False\n",
- "102807 False\n",
- "102808 False\n",
- "102809 False\n",
- "102810 False\n",
- "102811 False\n",
- "102812 False\n",
- "102813 False\n",
- "102814 False\n",
- "102815 False\n",
- "Name: yearID, Length: 102816, dtype: bool"
- ]
- },
- "execution_count": 6,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df.yearID == 2015"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We're returned a Series that contains `True` or `False` for each row, according to our _boolean_ query. We now need to pass this _boolean_ Series into `loc` to see the outcome."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>yearID</th>\n",
- " <th>stint</th>\n",
- " <th>teamID</th>\n",
- " <th>lgID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>R</th>\n",
- " <th>H</th>\n",
- " <th>2B</th>\n",
- " <th>...</th>\n",
- " <th>RBI</th>\n",
- " <th>SB</th>\n",
- " <th>CS</th>\n",
- " <th>BB</th>\n",
- " <th>SO</th>\n",
- " <th>IBB</th>\n",
- " <th>HBP</th>\n",
- " <th>SH</th>\n",
- " <th>SF</th>\n",
- " <th>GIDP</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>99847</th>\n",
- " <td>aardsda01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>ATL</td>\n",
- " <td>NL</td>\n",
- " <td>33</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99848</th>\n",
- " <td>abadfe01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>OAK</td>\n",
- " <td>AL</td>\n",
- " <td>62</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99849</th>\n",
- " <td>abreujo02</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>CHA</td>\n",
- " <td>AL</td>\n",
- " <td>154</td>\n",
- " <td>613</td>\n",
- " <td>88</td>\n",
- " <td>178</td>\n",
- " <td>34</td>\n",
- " <td>...</td>\n",
- " <td>101.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>39</td>\n",
- " <td>140.0</td>\n",
- " <td>11.0</td>\n",
- " <td>15.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>16.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99850</th>\n",
- " <td>achteaj01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>11</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99851</th>\n",
- " <td>ackledu01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>SEA</td>\n",
- " <td>AL</td>\n",
- " <td>85</td>\n",
- " <td>186</td>\n",
- " <td>22</td>\n",
- " <td>40</td>\n",
- " <td>8</td>\n",
- " <td>...</td>\n",
- " <td>19.0</td>\n",
- " <td>2.0</td>\n",
- " <td>2.0</td>\n",
- " <td>14</td>\n",
- " <td>38.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>3.0</td>\n",
- " <td>3.0</td>\n",
- " <td>3.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99852</th>\n",
- " <td>ackledu01</td>\n",
- " <td>2015</td>\n",
- " <td>2</td>\n",
- " <td>NYA</td>\n",
- " <td>AL</td>\n",
- " <td>23</td>\n",
- " <td>52</td>\n",
- " <td>6</td>\n",
- " <td>15</td>\n",
- " <td>3</td>\n",
- " <td>...</td>\n",
- " <td>11.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>4</td>\n",
- " <td>7.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99853</th>\n",
- " <td>adamecr01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>COL</td>\n",
- " <td>NL</td>\n",
- " <td>26</td>\n",
- " <td>53</td>\n",
- " <td>4</td>\n",
- " <td>13</td>\n",
- " <td>1</td>\n",
- " <td>...</td>\n",
- " <td>3.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>3</td>\n",
- " <td>11.0</td>\n",
- " <td>1.0</td>\n",
- " <td>1.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99854</th>\n",
- " <td>adamsau01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>CLE</td>\n",
- " <td>AL</td>\n",
- " <td>28</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99855</th>\n",
- " <td>adamsma01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>SLN</td>\n",
- " <td>NL</td>\n",
- " <td>60</td>\n",
- " <td>175</td>\n",
- " <td>14</td>\n",
- " <td>42</td>\n",
- " <td>9</td>\n",
- " <td>...</td>\n",
- " <td>24.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " <td>10</td>\n",
- " <td>41.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>1.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99856</th>\n",
- " <td>adcocna01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>CIN</td>\n",
- " <td>NL</td>\n",
- " <td>13</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "<p>10 rows × 22 columns</p>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID yearID stint teamID lgID G AB R H 2B ... \\\n",
- "99847 aardsda01 2015 1 ATL NL 33 1 0 0 0 ... \n",
- "99848 abadfe01 2015 1 OAK AL 62 0 0 0 0 ... \n",
- "99849 abreujo02 2015 1 CHA AL 154 613 88 178 34 ... \n",
- "99850 achteaj01 2015 1 MIN AL 11 0 0 0 0 ... \n",
- "99851 ackledu01 2015 1 SEA AL 85 186 22 40 8 ... \n",
- "99852 ackledu01 2015 2 NYA AL 23 52 6 15 3 ... \n",
- "99853 adamecr01 2015 1 COL NL 26 53 4 13 1 ... \n",
- "99854 adamsau01 2015 1 CLE AL 28 1 0 0 0 ... \n",
- "99855 adamsma01 2015 1 SLN NL 60 175 14 42 9 ... \n",
- "99856 adcocna01 2015 1 CIN NL 13 0 0 0 0 ... \n",
- "\n",
- " RBI SB CS BB SO IBB HBP SH SF GIDP \n",
- "99847 0.0 0.0 0.0 0 1.0 0.0 0.0 0.0 0.0 0.0 \n",
- "99848 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "99849 101.0 0.0 0.0 39 140.0 11.0 15.0 0.0 1.0 16.0 \n",
- "99850 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "99851 19.0 2.0 2.0 14 38.0 0.0 1.0 3.0 3.0 3.0 \n",
- "99852 11.0 0.0 0.0 4 7.0 0.0 0.0 0.0 1.0 0.0 \n",
- "99853 3.0 0.0 1.0 3 11.0 1.0 1.0 1.0 0.0 0.0 \n",
- "99854 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
- "99855 24.0 1.0 0.0 10 41.0 1.0 0.0 0.0 1.0 1.0 \n",
- "99856 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "\n",
- "[10 rows x 22 columns]"
- ]
- },
- "execution_count": 7,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df.loc[df.yearID == 2015][:10] # note: we restrict the output to just the first 10 rows"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Now what if we wanted to restrict this further by team? Say we wanted to see only the [Minnesota Twins](https://www.mlb.com/twins) player data for 2015. That is,\n",
- "\n",
- "```\n",
- "df.yearID == 2015\n",
- "AND\n",
- "df.teamID == \"MIN\"\n",
- "```\n",
- "\n",
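- "We simply wrap each condition in parentheses and combine them with the `&` operator. The parentheses are required because `&` binds more tightly than `==` in Python."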
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>yearID</th>\n",
- " <th>stint</th>\n",
- " <th>teamID</th>\n",
- " <th>lgID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>R</th>\n",
- " <th>H</th>\n",
- " <th>2B</th>\n",
- " <th>...</th>\n",
- " <th>RBI</th>\n",
- " <th>SB</th>\n",
- " <th>CS</th>\n",
- " <th>BB</th>\n",
- " <th>SO</th>\n",
- " <th>IBB</th>\n",
- " <th>HBP</th>\n",
- " <th>SH</th>\n",
- " <th>SF</th>\n",
- " <th>GIDP</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>99850</th>\n",
- " <td>achteaj01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>11</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99891</th>\n",
- " <td>arciaos01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>19</td>\n",
- " <td>58</td>\n",
- " <td>6</td>\n",
- " <td>16</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>8.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>4</td>\n",
- " <td>15.0</td>\n",
- " <td>4.0</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>2.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " <td>...</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1</td>\n",
- " <td>3.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99988</th>\n",
- " <td>boyerbl01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>68</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100030</th>\n",
- " <td>buxtoby01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>46</td>\n",
- " <td>129</td>\n",
- " <td>16</td>\n",
- " <td>27</td>\n",
- " <td>7</td>\n",
- " <td>...</td>\n",
- " <td>6.0</td>\n",
- " <td>2.0</td>\n",
- " <td>2.0</td>\n",
- " <td>6</td>\n",
- " <td>44.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100139</th>\n",
- " <td>cottsne01</td>\n",
- " <td>2015</td>\n",
- " <td>2</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>17</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>101</td>\n",
- " <td>148</td>\n",
- " <td>39</td>\n",
- " <td>...</td>\n",
- " <td>77.0</td>\n",
- " <td>12.0</td>\n",
- " <td>4.0</td>\n",
- " <td>61</td>\n",
- " <td>148.0</td>\n",
- " <td>2.0</td>\n",
- " <td>7.0</td>\n",
- " <td>0.0</td>\n",
- " <td>8.0</td>\n",
- " <td>10.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100221</th>\n",
- " <td>duensbr01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>55</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100222</th>\n",
- " <td>duffety01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>10</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100249</th>\n",
- " <td>escobed01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>MIN</td>\n",
- " <td>AL</td>\n",
- " <td>127</td>\n",
- " <td>409</td>\n",
- " <td>48</td>\n",
- " <td>107</td>\n",
- " <td>31</td>\n",
- " <td>...</td>\n",
- " <td>58.0</td>\n",
- " <td>2.0</td>\n",
- " <td>3.0</td>\n",
- " <td>28</td>\n",
- " <td>86.0</td>\n",
- " <td>1.0</td>\n",
- " <td>2.0</td>\n",
- " <td>2.0</td>\n",
- " <td>5.0</td>\n",
- " <td>7.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "<p>10 rows × 22 columns</p>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID yearID stint teamID lgID G AB R H 2B ... \\\n",
- "99850 achteaj01 2015 1 MIN AL 11 0 0 0 0 ... \n",
- "99891 arciaos01 2015 1 MIN AL 19 58 6 16 0 ... \n",
- "99954 bernido01 2015 1 MIN AL 4 5 1 1 1 ... \n",
- "99988 boyerbl01 2015 1 MIN AL 68 0 0 0 0 ... \n",
- "100030 buxtoby01 2015 1 MIN AL 46 129 16 27 7 ... \n",
- "100139 cottsne01 2015 2 MIN AL 17 0 0 0 0 ... \n",
- "100215 doziebr01 2015 1 MIN AL 157 628 101 148 39 ... \n",
- "100221 duensbr01 2015 1 MIN AL 55 1 0 0 0 ... \n",
- "100222 duffety01 2015 1 MIN AL 10 0 0 0 0 ... \n",
- "100249 escobed01 2015 1 MIN AL 127 409 48 107 31 ... \n",
- "\n",
- " RBI SB CS BB SO IBB HBP SH SF GIDP \n",
- "99850 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "99891 8.0 0.0 0.0 4 15.0 4.0 2.0 0.0 1.0 2.0 \n",
- "99954 2.0 0.0 0.0 1 3.0 0.0 0.0 0.0 0.0 0.0 \n",
- "99988 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "100030 6.0 2.0 2.0 6 44.0 0.0 1.0 2.0 0.0 1.0 \n",
- "100139 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "100215 77.0 12.0 4.0 61 148.0 2.0 7.0 0.0 8.0 10.0 \n",
- "100221 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "100222 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
- "100249 58.0 2.0 3.0 28 86.0 1.0 2.0 2.0 5.0 7.0 \n",
- "\n",
- "[10 rows x 22 columns]"
- ]
- },
- "execution_count": 8,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df.loc[(df.yearID == 2015) & (df.teamID == \"MIN\")].head(10)"
- ]
- },
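- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As a minimal sketch of how boolean masks combine, here is the same pattern on a tiny, made-up DataFrame (the `toy` frame and its values are hypothetical and not part of the batting data). Each comparison produces a boolean Series, and `&`, `|` and `~` combine those Series element-wise.\n",
- "\n",
- "```python\n",
- "import pandas as pd\n",
- "\n",
- "# Hypothetical toy data, only for illustration\n",
- "toy = pd.DataFrame({\n",
- "    'yearID': [2014, 2015, 2015, 2015],\n",
- "    'teamID': ['MIN', 'MIN', 'SEA', 'MIN'],\n",
- "    'G': [10, 20, 30, 40],\n",
- "})\n",
- "\n",
- "is_2015 = toy.yearID == 2015   # boolean Series, one value per row\n",
- "is_min = toy.teamID == 'MIN'\n",
- "\n",
- "both = toy.loc[is_2015 & is_min]     # AND: 2015 and MIN\n",
- "either = toy.loc[is_2015 | is_min]   # OR: 2015 or MIN (or both)\n",
- "not_min = toy.loc[~is_min]           # NOT: every team except MIN\n",
- "\n",
- "print(both.G.tolist())\n",
- "print(either.G.tolist())\n",
- "print(not_min.G.tolist())\n",
- "```\n",
- "\n",
- "Note that the Python keywords `and` and `or` do not work here: they try to reduce a whole Series to a single truth value, which raises a `ValueError`."
- ]
- },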
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Now what if we wanted to restrict the result to a subset of columns? This is easy with `loc[]`: we use our boolean expression as above for the _row selection_ and then pass a list of column names for the _column selection_ (in this case a much smaller subset of the data)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>99850</th>\n",
- " <td>achteaj01</td>\n",
- " <td>11</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99891</th>\n",
- " <td>arciaos01</td>\n",
- " <td>19</td>\n",
- " <td>58</td>\n",
- " <td>16</td>\n",
- " <td>2</td>\n",
- " <td>8.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99988</th>\n",
- " <td>boyerbl01</td>\n",
- " <td>68</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100030</th>\n",
- " <td>buxtoby01</td>\n",
- " <td>46</td>\n",
- " <td>129</td>\n",
- " <td>27</td>\n",
- " <td>2</td>\n",
- " <td>6.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100139</th>\n",
- " <td>cottsne01</td>\n",
- " <td>17</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100221</th>\n",
- " <td>duensbr01</td>\n",
- " <td>55</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100222</th>\n",
- " <td>duffety01</td>\n",
- " <td>10</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100249</th>\n",
- " <td>escobed01</td>\n",
- " <td>127</td>\n",
- " <td>409</td>\n",
- " <td>107</td>\n",
- " <td>12</td>\n",
- " <td>58.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100270</th>\n",
- " <td>fienca01</td>\n",
- " <td>62</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100302</th>\n",
- " <td>fryerer01</td>\n",
- " <td>15</td>\n",
- " <td>22</td>\n",
- " <td>5</td>\n",
- " <td>0</td>\n",
- " <td>2.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100333</th>\n",
- " <td>gibsoky01</td>\n",
- " <td>32</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100373</th>\n",
- " <td>grahajr01</td>\n",
- " <td>39</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100455</th>\n",
- " <td>herrmch01</td>\n",
- " <td>45</td>\n",
- " <td>103</td>\n",
- " <td>15</td>\n",
- " <td>2</td>\n",
- " <td>10.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100459</th>\n",
- " <td>hicksaa01</td>\n",
- " <td>97</td>\n",
- " <td>352</td>\n",
- " <td>90</td>\n",
- " <td>11</td>\n",
- " <td>33.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100486</th>\n",
- " <td>hugheph01</td>\n",
- " <td>27</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100521</th>\n",
- " <td>jepseke01</td>\n",
- " <td>29</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100701</th>\n",
- " <td>maytr01</td>\n",
- " <td>48</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100737</th>\n",
- " <td>milonto01</td>\n",
- " <td>24</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100807</th>\n",
- " <td>nolasri01</td>\n",
- " <td>9</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100816</th>\n",
- " <td>nunezed02</td>\n",
- " <td>72</td>\n",
- " <td>188</td>\n",
- " <td>53</td>\n",
- " <td>4</td>\n",
- " <td>20.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100837</th>\n",
- " <td>orourry01</td>\n",
- " <td>28</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100872</th>\n",
- " <td>pelfrmi01</td>\n",
- " <td>30</td>\n",
- " <td>3</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100895</th>\n",
- " <td>perkigl01</td>\n",
- " <td>60</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100925</th>\n",
- " <td>pressry01</td>\n",
- " <td>27</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100994</th>\n",
- " <td>robinsh01</td>\n",
- " <td>83</td>\n",
- " <td>180</td>\n",
- " <td>45</td>\n",
- " <td>0</td>\n",
- " <td>16.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101023</th>\n",
- " <td>rosared01</td>\n",
- " <td>122</td>\n",
- " <td>453</td>\n",
- " <td>121</td>\n",
- " <td>13</td>\n",
- " <td>50.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101067</th>\n",
- " <td>sanomi01</td>\n",
- " <td>80</td>\n",
- " <td>279</td>\n",
- " <td>75</td>\n",
- " <td>18</td>\n",
- " <td>52.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101069</th>\n",
- " <td>santada01</td>\n",
- " <td>91</td>\n",
- " <td>261</td>\n",
- " <td>56</td>\n",
- " <td>0</td>\n",
- " <td>21.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101072</th>\n",
- " <td>santaer01</td>\n",
- " <td>17</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101079</th>\n",
- " <td>schafjo02</td>\n",
- " <td>27</td>\n",
- " <td>69</td>\n",
- " <td>15</td>\n",
- " <td>0</td>\n",
- " <td>5.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101144</th>\n",
- " <td>staufti01</td>\n",
- " <td>13</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101189</th>\n",
- " <td>thielca01</td>\n",
- " <td>6</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101193</th>\n",
- " <td>thompaa01</td>\n",
- " <td>41</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101203</th>\n",
- " <td>tonkimi01</td>\n",
- " <td>26</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101240</th>\n",
- " <td>vargake01</td>\n",
- " <td>58</td>\n",
- " <td>175</td>\n",
- " <td>42</td>\n",
- " <td>5</td>\n",
- " <td>17.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "99850 achteaj01 11 0 0 0 0.0\n",
- "99891 arciaos01 19 58 16 2 8.0\n",
- "99954 bernido01 4 5 1 0 2.0\n",
- "99988 boyerbl01 68 0 0 0 0.0\n",
- "100030 buxtoby01 46 129 27 2 6.0\n",
- "100139 cottsne01 17 0 0 0 0.0\n",
- "100215 doziebr01 157 628 148 28 77.0\n",
- "100221 duensbr01 55 1 0 0 0.0\n",
- "100222 duffety01 10 0 0 0 0.0\n",
- "100249 escobed01 127 409 107 12 58.0\n",
- "100270 fienca01 62 0 0 0 0.0\n",
- "100302 fryerer01 15 22 5 0 2.0\n",
- "100333 gibsoky01 32 5 1 0 0.0\n",
- "100373 grahajr01 39 0 0 0 0.0\n",
- "100455 herrmch01 45 103 15 2 10.0\n",
- "100459 hicksaa01 97 352 90 11 33.0\n",
- "100486 hugheph01 27 3 0 0 0.0\n",
- "100488 hunteto01 139 521 125 22 81.0\n",
- "100521 jepseke01 29 0 0 0 0.0\n",
- "100564 keplema01 3 7 1 0 0.0\n",
- "100696 mauerjo01 158 592 157 10 66.0\n",
- "100701 maytr01 48 3 0 0 0.0\n",
- "100729 meyeral01 2 0 0 0 0.0\n",
- "100737 milonto01 24 2 0 0 0.0\n",
- "100807 nolasri01 9 3 0 0 0.0\n",
- "100816 nunezed02 72 188 53 4 20.0\n",
- "100837 orourry01 28 0 0 0 0.0\n",
- "100872 pelfrmi01 30 3 2 0 0.0\n",
- "100895 perkigl01 60 0 0 0 0.0\n",
- "100915 plouftr01 152 573 140 22 86.0\n",
- "100917 polanjo01 4 10 3 0 1.0\n",
- "100925 pressry01 27 0 0 0 0.0\n",
- "100994 robinsh01 83 180 45 0 16.0\n",
- "101023 rosared01 122 453 121 13 50.0\n",
- "101067 sanomi01 80 279 75 18 52.0\n",
- "101069 santada01 91 261 56 0 21.0\n",
- "101072 santaer01 17 0 0 0 0.0\n",
- "101079 schafjo02 27 69 15 0 5.0\n",
- "101144 staufti01 13 0 0 0 0.0\n",
- "101164 suzukku01 131 433 104 5 50.0\n",
- "101189 thielca01 6 0 0 0 0.0\n",
- "101193 thompaa01 41 0 0 0 0.0\n",
- "101203 tonkimi01 26 0 0 0 0.0\n",
- "101240 vargake01 58 175 42 5 17.0"
- ]
- },
- "execution_count": 9,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df.loc[(df.yearID == 2015) & (df.teamID == \"MIN\"),\\\n",
- " ['playerID', 'G', 'AB', 'H', 'HR', 'RBI']]"
- ]
- },
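- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The `loc[row_selection, column_selection]` pattern can be sketched on a small, made-up frame (the `toy` data below is hypothetical). One detail worth knowing: a _list_ of column names returns a DataFrame, while a single column label returns a Series.\n",
- "\n",
- "```python\n",
- "import pandas as pd\n",
- "\n",
- "# Hypothetical toy data, only for illustration\n",
- "toy = pd.DataFrame({\n",
- "    'playerID': ['a01', 'b01', 'c01'],\n",
- "    'yearID': [2015, 2015, 2014],\n",
- "    'G': [158, 4, 90],\n",
- "    'HR': [10, 0, 7],\n",
- "})\n",
- "\n",
- "# Rows where yearID is 2015, restricted to two columns (a DataFrame)\n",
- "subset = toy.loc[toy.yearID == 2015, ['playerID', 'HR']]\n",
- "\n",
- "# Same rows, but a single column label gives back a Series\n",
- "hr = toy.loc[toy.yearID == 2015, 'HR']\n",
- "\n",
- "print(subset.shape)\n",
- "print(type(hr).__name__)\n",
- "```"
- ]
- },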
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Sorting"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Sorting is handled by the `sort_values()` method. By default, sorting is done in _ascending order_; specify the parameter `ascending=False` to get descending order."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100249</th>\n",
- " <td>escobed01</td>\n",
- " <td>127</td>\n",
- " <td>409</td>\n",
- " <td>107</td>\n",
- " <td>12</td>\n",
- " <td>58.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101023</th>\n",
- " <td>rosared01</td>\n",
- " <td>122</td>\n",
- " <td>453</td>\n",
- " <td>121</td>\n",
- " <td>13</td>\n",
- " <td>50.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100459</th>\n",
- " <td>hicksaa01</td>\n",
- " <td>97</td>\n",
- " <td>352</td>\n",
- " <td>90</td>\n",
- " <td>11</td>\n",
- " <td>33.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101069</th>\n",
- " <td>santada01</td>\n",
- " <td>91</td>\n",
- " <td>261</td>\n",
- " <td>56</td>\n",
- " <td>0</td>\n",
- " <td>21.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100994</th>\n",
- " <td>robinsh01</td>\n",
- " <td>83</td>\n",
- " <td>180</td>\n",
- " <td>45</td>\n",
- " <td>0</td>\n",
- " <td>16.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101067</th>\n",
- " <td>sanomi01</td>\n",
- " <td>80</td>\n",
- " <td>279</td>\n",
- " <td>75</td>\n",
- " <td>18</td>\n",
- " <td>52.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100816</th>\n",
- " <td>nunezed02</td>\n",
- " <td>72</td>\n",
- " <td>188</td>\n",
- " <td>53</td>\n",
- " <td>4</td>\n",
- " <td>20.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99988</th>\n",
- " <td>boyerbl01</td>\n",
- " <td>68</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100270</th>\n",
- " <td>fienca01</td>\n",
- " <td>62</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100895</th>\n",
- " <td>perkigl01</td>\n",
- " <td>60</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101240</th>\n",
- " <td>vargake01</td>\n",
- " <td>58</td>\n",
- " <td>175</td>\n",
- " <td>42</td>\n",
- " <td>5</td>\n",
- " <td>17.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100221</th>\n",
- " <td>duensbr01</td>\n",
- " <td>55</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100701</th>\n",
- " <td>maytr01</td>\n",
- " <td>48</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100030</th>\n",
- " <td>buxtoby01</td>\n",
- " <td>46</td>\n",
- " <td>129</td>\n",
- " <td>27</td>\n",
- " <td>2</td>\n",
- " <td>6.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100455</th>\n",
- " <td>herrmch01</td>\n",
- " <td>45</td>\n",
- " <td>103</td>\n",
- " <td>15</td>\n",
- " <td>2</td>\n",
- " <td>10.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "100696 mauerjo01 158 592 157 10 66.0\n",
- "100215 doziebr01 157 628 148 28 77.0\n",
- "100915 plouftr01 152 573 140 22 86.0\n",
- "100488 hunteto01 139 521 125 22 81.0\n",
- "101164 suzukku01 131 433 104 5 50.0\n",
- "100249 escobed01 127 409 107 12 58.0\n",
- "101023 rosared01 122 453 121 13 50.0\n",
- "100459 hicksaa01 97 352 90 11 33.0\n",
- "101069 santada01 91 261 56 0 21.0\n",
- "100994 robinsh01 83 180 45 0 16.0\n",
- "101067 sanomi01 80 279 75 18 52.0\n",
- "100816 nunezed02 72 188 53 4 20.0\n",
- "99988 boyerbl01 68 0 0 0 0.0\n",
- "100270 fienca01 62 0 0 0 0.0\n",
- "100895 perkigl01 60 0 0 0 0.0\n",
- "101240 vargake01 58 175 42 5 17.0\n",
- "100221 duensbr01 55 1 0 0 0.0\n",
- "100701 maytr01 48 3 0 0 0.0\n",
- "100030 buxtoby01 46 129 27 2 6.0\n",
- "100455 herrmch01 45 103 15 2 10.0"
- ]
- },
- "execution_count": 10,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015 = df.loc[(df.yearID == 2015) & (df.teamID == \"MIN\"),\\\n",
- " ['playerID', 'G', 'AB', 'H', 'HR', 'RBI']]\\\n",
- " .sort_values('G', ascending=False)\n",
- "df_min_2015.head(20)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We may also do a _multi-sort_ by passing in a list of the _columns_ we want sorted. Rows are sorted by the first column, and ties are broken by each subsequent column in turn. For example,"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>101189</th>\n",
- " <td>thielca01</td>\n",
- " <td>6</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "101189 thielca01 6 0 0 0 0.0\n",
- "99954 bernido01 4 5 1 0 2.0\n",
- "100917 polanjo01 4 10 3 0 1.0\n",
- "100564 keplema01 3 7 1 0 0.0\n",
- "100729 meyeral01 2 0 0 0 0.0"
- ]
- },
- "execution_count": 11,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df.loc[(df.yearID == 2015) & (df.teamID == \"MIN\"),\\\n",
- " ['playerID', 'G', 'AB', 'H', 'HR', 'RBI']]\\\n",
- " .sort_values(['G', 'HR'], ascending=False).tail()"
- ]
- },
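- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The `ascending` parameter also accepts a list with one flag per sort column, so each column can have its own direction. A minimal sketch on made-up data (the `toy` frame is hypothetical):\n",
- "\n",
- "```python\n",
- "import pandas as pd\n",
- "\n",
- "# Hypothetical toy data, only for illustration\n",
- "toy = pd.DataFrame({\n",
- "    'playerID': ['a01', 'b01', 'c01', 'd01'],\n",
- "    'G': [4, 6, 4, 2],\n",
- "    'HR': [1, 0, 3, 0],\n",
- "})\n",
- "\n",
- "# G descending; ties in G are broken by HR ascending\n",
- "ranked = toy.sort_values(['G', 'HR'], ascending=[False, True])\n",
- "\n",
- "print(ranked.playerID.tolist())\n",
- "```\n",
- "\n",
- "`sort_values()` returns a new, sorted DataFrame and leaves `toy` itself in its original order."
- ]
- },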
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## DataFrame manipulation"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Adding and dropping columns"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " <th>HtoAB</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66.0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77.0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86.0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81.0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50.0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI HtoAB\n",
- "100696 mauerjo01 158 592 157 10 66.0 0\n",
- "100215 doziebr01 157 628 148 28 77.0 0\n",
- "100915 plouftr01 152 573 140 22 86.0 0\n",
- "100488 hunteto01 139 521 125 22 81.0 0\n",
- "101164 suzukku01 131 433 104 5 50.0 0"
- ]
- },
- "execution_count": 12,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015.loc[:,'HtoAB'] = 0\n",
- "df_min_2015.head()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "100696 mauerjo01 158 592 157 10 66.0\n",
- "100215 doziebr01 157 628 148 28 77.0\n",
- "100915 plouftr01 152 573 140 22 86.0\n",
- "100488 hunteto01 139 521 125 22 81.0\n",
- "101164 suzukku01 131 433 104 5 50.0"
- ]
- },
- "execution_count": 13,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015 = df_min_2015.drop('HtoAB', axis=1)\n",
- "df_min_2015.head()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "100696 157\n",
- "100215 148\n",
- "100915 140\n",
- "100488 125\n",
- "101164 104\n",
- "100249 107\n",
- "101023 121\n",
- "100459 90\n",
- "101069 56\n",
- "100994 45\n",
- "Name: H, dtype: int64"
- ]
- },
- "execution_count": 14,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015.H.head(10)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "df_min_2015.loc[:,'HtoAB'] = 0\n",
- "df_min_2015.loc[:,'HtoAB'] = [v.H/v.AB \n",
- " if v.AB > 0 else 0 \n",
- " for r, v in df_min_2015.iterrows()]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " <th>HtoAB</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66.0</td>\n",
- " <td>0.265203</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77.0</td>\n",
- " <td>0.235669</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86.0</td>\n",
- " <td>0.244328</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81.0</td>\n",
- " <td>0.239923</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50.0</td>\n",
- " <td>0.240185</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100249</th>\n",
- " <td>escobed01</td>\n",
- " <td>127</td>\n",
- " <td>409</td>\n",
- " <td>107</td>\n",
- " <td>12</td>\n",
- " <td>58.0</td>\n",
- " <td>0.261614</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101023</th>\n",
- " <td>rosared01</td>\n",
- " <td>122</td>\n",
- " <td>453</td>\n",
- " <td>121</td>\n",
- " <td>13</td>\n",
- " <td>50.0</td>\n",
- " <td>0.267108</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100459</th>\n",
- " <td>hicksaa01</td>\n",
- " <td>97</td>\n",
- " <td>352</td>\n",
- " <td>90</td>\n",
- " <td>11</td>\n",
- " <td>33.0</td>\n",
- " <td>0.255682</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101069</th>\n",
- " <td>santada01</td>\n",
- " <td>91</td>\n",
- " <td>261</td>\n",
- " <td>56</td>\n",
- " <td>0</td>\n",
- " <td>21.0</td>\n",
- " <td>0.214559</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100994</th>\n",
- " <td>robinsh01</td>\n",
- " <td>83</td>\n",
- " <td>180</td>\n",
- " <td>45</td>\n",
- " <td>0</td>\n",
- " <td>16.0</td>\n",
- " <td>0.250000</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI HtoAB\n",
- "100696 mauerjo01 158 592 157 10 66.0 0.265203\n",
- "100215 doziebr01 157 628 148 28 77.0 0.235669\n",
- "100915 plouftr01 152 573 140 22 86.0 0.244328\n",
- "100488 hunteto01 139 521 125 22 81.0 0.239923\n",
- "101164 suzukku01 131 433 104 5 50.0 0.240185\n",
- "100249 escobed01 127 409 107 12 58.0 0.261614\n",
- "101023 rosared01 122 453 121 13 50.0 0.267108\n",
- "100459 hicksaa01 97 352 90 11 33.0 0.255682\n",
- "101069 santada01 91 261 56 0 21.0 0.214559\n",
- "100994 robinsh01 83 180 45 0 16.0 0.250000"
- ]
- },
- "execution_count": 16,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015.head(10)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " <th>HtoAB</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>101023</th>\n",
- " <td>rosared01</td>\n",
- " <td>122</td>\n",
- " <td>453</td>\n",
- " <td>121</td>\n",
- " <td>13</td>\n",
- " <td>50.0</td>\n",
- " <td>0.267108</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66.0</td>\n",
- " <td>0.265203</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100249</th>\n",
- " <td>escobed01</td>\n",
- " <td>127</td>\n",
- " <td>409</td>\n",
- " <td>107</td>\n",
- " <td>12</td>\n",
- " <td>58.0</td>\n",
- " <td>0.261614</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100459</th>\n",
- " <td>hicksaa01</td>\n",
- " <td>97</td>\n",
- " <td>352</td>\n",
- " <td>90</td>\n",
- " <td>11</td>\n",
- " <td>33.0</td>\n",
- " <td>0.255682</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100994</th>\n",
- " <td>robinsh01</td>\n",
- " <td>83</td>\n",
- " <td>180</td>\n",
- " <td>45</td>\n",
- " <td>0</td>\n",
- " <td>16.0</td>\n",
- " <td>0.250000</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86.0</td>\n",
- " <td>0.244328</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50.0</td>\n",
- " <td>0.240185</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81.0</td>\n",
- " <td>0.239923</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77.0</td>\n",
- " <td>0.235669</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101069</th>\n",
- " <td>santada01</td>\n",
- " <td>91</td>\n",
- " <td>261</td>\n",
- " <td>56</td>\n",
- " <td>0</td>\n",
- " <td>21.0</td>\n",
- " <td>0.214559</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI HtoAB\n",
- "101023 rosared01 122 453 121 13 50.0 0.267108\n",
- "100696 mauerjo01 158 592 157 10 66.0 0.265203\n",
- "100249 escobed01 127 409 107 12 58.0 0.261614\n",
- "100459 hicksaa01 97 352 90 11 33.0 0.255682\n",
- "100994 robinsh01 83 180 45 0 16.0 0.250000\n",
- "100915 plouftr01 152 573 140 22 86.0 0.244328\n",
- "101164 suzukku01 131 433 104 5 50.0 0.240185\n",
- "100488 hunteto01 139 521 125 22 81.0 0.239923\n",
- "100215 doziebr01 157 628 148 28 77.0 0.235669\n",
- "101069 santada01 91 261 56 0 21.0 0.214559"
- ]
- },
- "execution_count": 17,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015[df_min_2015.G>80].sort_values('HtoAB', ascending=False)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>HtoAB</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " <th>G</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>0.265203</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66.0</td>\n",
- " <td>158</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>0.235669</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77.0</td>\n",
- " <td>157</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>0.244328</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86.0</td>\n",
- " <td>152</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>0.239923</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81.0</td>\n",
- " <td>139</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>0.240185</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50.0</td>\n",
- " <td>131</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID HtoAB AB H HR RBI G\n",
- "100696 mauerjo01 0.265203 592 157 10 66.0 158\n",
- "100215 doziebr01 0.235669 628 148 28 77.0 157\n",
- "100915 plouftr01 0.244328 573 140 22 86.0 152\n",
- "100488 hunteto01 0.239923 521 125 22 81.0 139\n",
- "101164 suzukku01 0.240185 433 104 5 50.0 131"
- ]
- },
- "execution_count": 18,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015 = df_min_2015.reindex(columns=['playerID', 'HtoAB', 'AB', 'H', 'HR', 'RBI', 'G'])\n",
- "df_min_2015.head()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Finally, we can return our DataFrame to its original columns (and order) by reindexing again. Notice also that leaving a column out of the `reindex()` list effectively performs a `drop()`, though the syntax is more verbose."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 19,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "100696 mauerjo01 158 592 157 10 66.0\n",
- "100215 doziebr01 157 628 148 28 77.0\n",
- "100915 plouftr01 152 573 140 22 86.0\n",
- "100488 hunteto01 139 521 125 22 81.0\n",
- "101164 suzukku01 131 433 104 5 50.0"
- ]
- },
- "execution_count": 19,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015 = df_min_2015.reindex(columns=['playerID', 'G', 'AB', 'H', 'HR', 'RBI'])\n",
- "df_min_2015.head()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Adding and dropping rows"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "A row can be added with `loc[]` by assigning a dictionary of values, keyed by column label, to a new index label."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 20,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>200000</th>\n",
- " <td>keith01</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "100917 polanjo01 4 10 3 0 1\n",
- "99954 bernido01 4 5 1 0 2\n",
- "100564 keplema01 3 7 1 0 0\n",
- "100729 meyeral01 2 0 0 0 0\n",
- "200000 keith01 0 0 0 0 0"
- ]
- },
- "execution_count": 20,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015.loc[200000] = \\\n",
- " { 'playerID': 'keith01',\n",
- " 'RBI': '0',\n",
- " 'G': '0',\n",
- " 'H': '0',\n",
- " 'HR': '0',\n",
- " 'AB': '0' }\n",
- " \n",
- "df_min_2015.tail()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The same works with lists and tuples, as long as the values are given in column order."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 21,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>200000</th>\n",
- " <td>keith01</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>200001</th>\n",
- " <td>keith02</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "99954 bernido01 4 5 1 0 2\n",
- "100564 keplema01 3 7 1 0 0\n",
- "100729 meyeral01 2 0 0 0 0\n",
- "200000 keith01 1 1 1 1 1\n",
- "200001 keith02 1 1 1 1 1"
- ]
- },
- "execution_count": 21,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015.loc[200000] = ('keith01', 1, 1, 1, 1, 1)\n",
- "df_min_2015.loc[200001] = ['keith02', 1, 1, 1, 1, 1]\n",
- "\n",
- "df_min_2015.tail()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Note that we can drop a number of rows at a time by passing a list of the indices we'd like dropped."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 22,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>101189</th>\n",
- " <td>thielca01</td>\n",
- " <td>6</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "101189 thielca01 6 0 0 0 0\n",
- "100917 polanjo01 4 10 3 0 1\n",
- "99954 bernido01 4 5 1 0 2\n",
- "100564 keplema01 3 7 1 0 0\n",
- "100729 meyeral01 2 0 0 0 0"
- ]
- },
- "execution_count": 22,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015 = df_min_2015.drop([200000, 200001], axis=0)\n",
- "df_min_2015.tail()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
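- "Similar results can be achieved using [`append()`](http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.append.html#pandas.DataFrame.append). With `append()`, you can append a Series, a DataFrame, or a list of these. Note that `DataFrame.append()` was deprecated in pandas 1.4 and removed in pandas 2.0, so on recent versions `pd.concat()` is the way to do this."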
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 23,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>200000</th>\n",
- " <td>keith01</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "100917 polanjo01 4 10 3 0 1\n",
- "99954 bernido01 4 5 1 0 2\n",
- "100564 keplema01 3 7 1 0 0\n",
- "100729 meyeral01 2 0 0 0 0\n",
- "200000 keith01 0 0 0 0 0"
- ]
- },
- "execution_count": 23,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015.append(\n",
- " pd.Series( \n",
- " {'playerID': 'keith01', \n",
- " 'G': 0, \n",
- " 'AB': 0, \n",
- " 'H':0, \n",
- " 'HR': 0, \n",
- " 'RBI': 0}, name=200000)).tail()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 24,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101189</th>\n",
- " <td>thielca01</td>\n",
- " <td>6</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "100696 mauerjo01 158 592 157 10 66\n",
- "100215 doziebr01 157 628 148 28 77\n",
- "100915 plouftr01 152 573 140 22 86\n",
- "100488 hunteto01 139 521 125 22 81\n",
- "101164 suzukku01 131 433 104 5 50\n",
- "101189 thielca01 6 0 0 0 0\n",
- "100917 polanjo01 4 10 3 0 1\n",
- "99954 bernido01 4 5 1 0 2\n",
- "100564 keplema01 3 7 1 0 0\n",
- "100729 meyeral01 2 0 0 0 0"
- ]
- },
- "execution_count": 24,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015[:5].append(df_min_2015[-5:])"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 25,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101067</th>\n",
- " <td>sanomi01</td>\n",
- " <td>80</td>\n",
- " <td>279</td>\n",
- " <td>75</td>\n",
- " <td>18</td>\n",
- " <td>52</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100816</th>\n",
- " <td>nunezed02</td>\n",
- " <td>72</td>\n",
- " <td>188</td>\n",
- " <td>53</td>\n",
- " <td>4</td>\n",
- " <td>20</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101189</th>\n",
- " <td>thielca01</td>\n",
- " <td>6</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "100696 mauerjo01 158 592 157 10 66\n",
- "100215 doziebr01 157 628 148 28 77\n",
- "100915 plouftr01 152 573 140 22 86\n",
- "100488 hunteto01 139 521 125 22 81\n",
- "101164 suzukku01 131 433 104 5 50\n",
- "101067 sanomi01 80 279 75 18 52\n",
- "100816 nunezed02 72 188 53 4 20\n",
- "101189 thielca01 6 0 0 0 0\n",
- "100917 polanjo01 4 10 3 0 1\n",
- "99954 bernido01 4 5 1 0 2\n",
- "100564 keplema01 3 7 1 0 0\n",
- "100729 meyeral01 2 0 0 0 0"
- ]
- },
- "execution_count": 25,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_min_2015[:5].append([df_min_2015[10:12], df_min_2015[-5:]])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The same result can be achieved with [`pd.concat()`](http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.concat.html#pandas.concat), where the default `axis` is `0`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 26,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101189</th>\n",
- " <td>thielca01</td>\n",
- " <td>6</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI\n",
- "100696 mauerjo01 158 592 157 10 66\n",
- "100215 doziebr01 157 628 148 28 77\n",
- "100915 plouftr01 152 573 140 22 86\n",
- "100488 hunteto01 139 521 125 22 81\n",
- "101164 suzukku01 131 433 104 5 50\n",
- "101189 thielca01 6 0 0 0 0\n",
- "100917 polanjo01 4 10 3 0 1\n",
- "99954 bernido01 4 5 1 0 2\n",
- "100564 keplema01 3 7 1 0 0\n",
- "100729 meyeral01 2 0 0 0 0"
- ]
- },
- "execution_count": 26,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "pd.concat([df_min_2015[:5], \n",
- " df_min_2015[-5:]], axis=0)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
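- "We can also use `concat()` to concatenate _column-wise_ by passing `axis=1`. Rows are aligned on the index, so an index label that appears in only one of the frames gets `NaN` in the other frame's columns."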
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 27,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " <th>playerID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101189</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>thielca01</td>\n",
- " <td>6</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID G AB H HR RBI playerID G AB H HR RBI\n",
- "99954 NaN NaN NaN NaN NaN NaN bernido01 4 5 1 0 2\n",
- "100215 doziebr01 157 628 148 28 77 NaN NaN NaN NaN NaN NaN\n",
- "100488 hunteto01 139 521 125 22 81 NaN NaN NaN NaN NaN NaN\n",
- "100564 NaN NaN NaN NaN NaN NaN keplema01 3 7 1 0 0\n",
- "100696 mauerjo01 158 592 157 10 66 NaN NaN NaN NaN NaN NaN\n",
- "100729 NaN NaN NaN NaN NaN NaN meyeral01 2 0 0 0 0\n",
- "100915 plouftr01 152 573 140 22 86 NaN NaN NaN NaN NaN NaN\n",
- "100917 NaN NaN NaN NaN NaN NaN polanjo01 4 10 3 0 1\n",
- "101164 suzukku01 131 433 104 5 50 NaN NaN NaN NaN NaN NaN\n",
- "101189 NaN NaN NaN NaN NaN NaN thielca01 6 0 0 0 0"
- ]
- },
- "execution_count": 27,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "pd.concat([df_min_2015[:5], \n",
- " df_min_2015[-5:]], axis=1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The result shows that `concat()` aligns the two DataFrames on their row indices: every index label from either input appears once, and cells with no matching data are filled with `NaN`. This behavior can be controlled via the `join` parameter, which we'll leave [for the reader to explore](http://pandas.pydata.org/pandas-docs/version/0.17.0/merging.html#concatenating-objects).\n",
- "\n",
- "One last thing we might want to do in an operation like this is to reset the index. With `axis=1`, passing `ignore_index=True` discards the column labels and replaces them with integers (`0` to `11` here), so we can set them to something more appropriate after the concatenation."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 28,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>0</th>\n",
- " <th>1</th>\n",
- " <th>2</th>\n",
- " <th>3</th>\n",
- " <th>4</th>\n",
- " <th>5</th>\n",
- " <th>6</th>\n",
- " <th>7</th>\n",
- " <th>8</th>\n",
- " <th>9</th>\n",
- " <th>10</th>\n",
- " <th>11</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>99954</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>bernido01</td>\n",
- " <td>4</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>2</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100215</th>\n",
- " <td>doziebr01</td>\n",
- " <td>157</td>\n",
- " <td>628</td>\n",
- " <td>148</td>\n",
- " <td>28</td>\n",
- " <td>77</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100488</th>\n",
- " <td>hunteto01</td>\n",
- " <td>139</td>\n",
- " <td>521</td>\n",
- " <td>125</td>\n",
- " <td>22</td>\n",
- " <td>81</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100564</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>keplema01</td>\n",
- " <td>3</td>\n",
- " <td>7</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100696</th>\n",
- " <td>mauerjo01</td>\n",
- " <td>158</td>\n",
- " <td>592</td>\n",
- " <td>157</td>\n",
- " <td>10</td>\n",
- " <td>66</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100729</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>meyeral01</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100915</th>\n",
- " <td>plouftr01</td>\n",
- " <td>152</td>\n",
- " <td>573</td>\n",
- " <td>140</td>\n",
- " <td>22</td>\n",
- " <td>86</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100917</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>polanjo01</td>\n",
- " <td>4</td>\n",
- " <td>10</td>\n",
- " <td>3</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101164</th>\n",
- " <td>suzukku01</td>\n",
- " <td>131</td>\n",
- " <td>433</td>\n",
- " <td>104</td>\n",
- " <td>5</td>\n",
- " <td>50</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>101189</th>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>thielca01</td>\n",
- " <td>6</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " 0 1 2 3 4 5 6 7 8 9 10 11\n",
- "99954 NaN NaN NaN NaN NaN NaN bernido01 4 5 1 0 2\n",
- "100215 doziebr01 157 628 148 28 77 NaN NaN NaN NaN NaN NaN\n",
- "100488 hunteto01 139 521 125 22 81 NaN NaN NaN NaN NaN NaN\n",
- "100564 NaN NaN NaN NaN NaN NaN keplema01 3 7 1 0 0\n",
- "100696 mauerjo01 158 592 157 10 66 NaN NaN NaN NaN NaN NaN\n",
- "100729 NaN NaN NaN NaN NaN NaN meyeral01 2 0 0 0 0\n",
- "100915 plouftr01 152 573 140 22 86 NaN NaN NaN NaN NaN NaN\n",
- "100917 NaN NaN NaN NaN NaN NaN polanjo01 4 10 3 0 1\n",
- "101164 suzukku01 131 433 104 5 50 NaN NaN NaN NaN NaN NaN\n",
- "101189 NaN NaN NaN NaN NaN NaN thielca01 6 0 0 0 0"
- ]
- },
- "execution_count": 28,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "pd.concat([df_min_2015[:5], \n",
- " df_min_2015[-5:]], axis=1, ignore_index=True)"
- ]
- },
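- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To make the `join` parameter mentioned above a little more concrete, here is a minimal sketch on two small made-up DataFrames. The names `left` and `right` and their index labels are purely illustrative and not part of the baseball data; the sketch only assumes that `pd.concat()` accepts `join='outer'` (the default) and `join='inner'`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "\n",
- "# two hypothetical frames whose row indices only partly overlap\n",
- "left = pd.DataFrame({'G': [158, 157, 152]}, index=[1, 2, 3])\n",
- "right = pd.DataFrame({'HR': [28, 22, 5]}, index=[2, 3, 4])\n",
- "\n",
- "# join='outer' (the default) keeps the union of the row labels and fills the gaps with NaN\n",
- "outer = pd.concat([left, right], axis=1, join='outer')\n",
- "\n",
- "# join='inner' keeps only the row labels present in both frames\n",
- "inner = pd.concat([left, right], axis=1, join='inner')\n",
- "\n",
- "print(outer)\n",
- "print(inner)"
- ]
- },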
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Advanced indexing\n",
- "Pandas provides the ability to build more complex indices allowing for highly flexible and natural data access.\n",
- "\n",
- "We will cover the basics through the [`MultiIndex`](http://pandas.pydata.org/pandas-docs/version/0.17.0/advanced.html#hierarchical-indexing-multiindex) object and leave the remaining exploration to the reader."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Let's get the Washington Nationals players who played 100 or more games in either the 2015 or the 2016 season."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 29,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "df_was = df[(df.yearID > 2014) & (df.teamID=='WAS') & (df.G > 99)]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 30,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>yearID</th>\n",
- " <th>stint</th>\n",
- " <th>teamID</th>\n",
- " <th>lgID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>R</th>\n",
- " <th>H</th>\n",
- " <th>2B</th>\n",
- " <th>...</th>\n",
- " <th>RBI</th>\n",
- " <th>SB</th>\n",
- " <th>CS</th>\n",
- " <th>BB</th>\n",
- " <th>SO</th>\n",
- " <th>IBB</th>\n",
- " <th>HBP</th>\n",
- " <th>SH</th>\n",
- " <th>SF</th>\n",
- " <th>GIDP</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>100193</th>\n",
- " <td>desmoia01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>156</td>\n",
- " <td>583</td>\n",
- " <td>69</td>\n",
- " <td>136</td>\n",
- " <td>27</td>\n",
- " <td>...</td>\n",
- " <td>62.0</td>\n",
- " <td>13.0</td>\n",
- " <td>5.0</td>\n",
- " <td>45</td>\n",
- " <td>187.0</td>\n",
- " <td>0.0</td>\n",
- " <td>3.0</td>\n",
- " <td>6.0</td>\n",
- " <td>4.0</td>\n",
- " <td>9.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100250</th>\n",
- " <td>escobyu01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>139</td>\n",
- " <td>535</td>\n",
- " <td>75</td>\n",
- " <td>168</td>\n",
- " <td>25</td>\n",
- " <td>...</td>\n",
- " <td>56.0</td>\n",
- " <td>2.0</td>\n",
- " <td>2.0</td>\n",
- " <td>45</td>\n",
- " <td>70.0</td>\n",
- " <td>0.0</td>\n",
- " <td>8.0</td>\n",
- " <td>1.0</td>\n",
- " <td>2.0</td>\n",
- " <td>24.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100251</th>\n",
- " <td>espinda01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>118</td>\n",
- " <td>367</td>\n",
- " <td>59</td>\n",
- " <td>88</td>\n",
- " <td>21</td>\n",
- " <td>...</td>\n",
- " <td>37.0</td>\n",
- " <td>5.0</td>\n",
- " <td>2.0</td>\n",
- " <td>33</td>\n",
- " <td>106.0</td>\n",
- " <td>5.0</td>\n",
- " <td>6.0</td>\n",
- " <td>3.0</td>\n",
- " <td>3.0</td>\n",
- " <td>6.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100422</th>\n",
- " <td>harpebr03</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>153</td>\n",
- " <td>521</td>\n",
- " <td>118</td>\n",
- " <td>172</td>\n",
- " <td>38</td>\n",
- " <td>...</td>\n",
- " <td>99.0</td>\n",
- " <td>6.0</td>\n",
- " <td>4.0</td>\n",
- " <td>124</td>\n",
- " <td>131.0</td>\n",
- " <td>15.0</td>\n",
- " <td>5.0</td>\n",
- " <td>0.0</td>\n",
- " <td>4.0</td>\n",
- " <td>15.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>100950</th>\n",
- " <td>ramoswi01</td>\n",
- " <td>2015</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>128</td>\n",
- " <td>475</td>\n",
- " <td>41</td>\n",
- " <td>109</td>\n",
- " <td>16</td>\n",
- " <td>...</td>\n",
- " <td>68.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>21</td>\n",
- " <td>101.0</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>8.0</td>\n",
- " <td>16.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "<p>5 rows × 22 columns</p>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID yearID stint teamID lgID G AB R H 2B ... \\\n",
- "100193 desmoia01 2015 1 WAS NL 156 583 69 136 27 ... \n",
- "100250 escobyu01 2015 1 WAS NL 139 535 75 168 25 ... \n",
- "100251 espinda01 2015 1 WAS NL 118 367 59 88 21 ... \n",
- "100422 harpebr03 2015 1 WAS NL 153 521 118 172 38 ... \n",
- "100950 ramoswi01 2015 1 WAS NL 128 475 41 109 16 ... \n",
- "\n",
- " RBI SB CS BB SO IBB HBP SH SF GIDP \n",
- "100193 62.0 13.0 5.0 45 187.0 0.0 3.0 6.0 4.0 9.0 \n",
- "100250 56.0 2.0 2.0 45 70.0 0.0 8.0 1.0 2.0 24.0 \n",
- "100251 37.0 5.0 2.0 33 106.0 5.0 6.0 3.0 3.0 6.0 \n",
- "100422 99.0 6.0 4.0 124 131.0 15.0 5.0 0.0 4.0 15.0 \n",
- "100950 68.0 0.0 0.0 21 101.0 2.0 0.0 0.0 8.0 16.0 \n",
- "\n",
- "[5 rows x 22 columns]"
- ]
- },
- "execution_count": 30,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_was.head()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "One obvious problem: if we want to access the data here by player and year, we have to build a much more involved query, and even more so if we need to exclude some of the data.\n",
- "\n",
- "We are going to create a _hierarchical index_, or _MultiIndex_, to solve this problem. We'll take the liberty of dropping the columns we don't need (`teamID`, `lgID`, `stint`) and reorganizing the index hierarchically.\n",
- "\n",
- "We will build the `MultiIndex` from a _tuple_ of (player, year) pairs, so that the index is organized first by _player_, then by _year_. To do this we'll just grab all the player IDs and `zip` them with the years. This will look something like this:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 31,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "(('desmoia01', 2015),\n",
- " ('escobyu01', 2015),\n",
- " ('espinda01', 2015),\n",
- " ('espinda01', 2016),\n",
- " ('harpebr03', 2015),\n",
- " ('harpebr03', 2016),\n",
- " ('murphda08', 2016),\n",
- " ('ramoswi01', 2015),\n",
- " ('ramoswi01', 2016),\n",
- " ('rendoan01', 2016),\n",
- " ('reverbe01', 2016),\n",
- " ('robincl01', 2015),\n",
- " ('robincl01', 2016),\n",
- " ('taylomi02', 2015),\n",
- " ('werthja01', 2016),\n",
- " ('zimmery01', 2016))"
- ]
- },
- "execution_count": 31,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "tuple(\n",
- "zip(\n",
- " df_was[['playerID','yearID']].sort_values(by='playerID')['playerID'],\n",
- " df_was[['playerID','yearID']].sort_values(by='playerID')['yearID']\n",
- ")\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 32,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "MultiIndex(levels=[['desmoia01', 'escobyu01', 'espinda01', 'harpebr03', 'murphda08', 'ramoswi01', 'rendoan01', 'reverbe01', 'robincl01', 'taylomi02', 'werthja01', 'zimmery01'], [2015, 2016]],\n",
- " labels=[[0, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 11], [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1]])"
- ]
- },
- "execution_count": 32,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# create an index to be used over the data we're interested in\n",
- "idx = \\\n",
- " pd.MultiIndex.from_tuples(\n",
- " tuple(\n",
- " zip(\n",
- " df_was[['playerID','yearID']].sort_values(by='playerID')['playerID'],\n",
- " df_was[['playerID','yearID']].sort_values(by='playerID')['yearID']))\n",
- " )\n",
- "idx"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Notice that the index now has two _levels_. We will use it on the _row axis_ (axis 0) to build the hierarchically indexed DataFrame."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 33,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th></th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>R</th>\n",
- " <th>H</th>\n",
- " <th>2B</th>\n",
- " <th>3B</th>\n",
- " <th>HR</th>\n",
- " <th>RBI</th>\n",
- " <th>SB</th>\n",
- " <th>CS</th>\n",
- " <th>BB</th>\n",
- " <th>SO</th>\n",
- " <th>IBB</th>\n",
- " <th>HBP</th>\n",
- " <th>SH</th>\n",
- " <th>SF</th>\n",
- " <th>GIDP</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>desmoia01</th>\n",
- " <th>2015</th>\n",
- " <td>156</td>\n",
- " <td>583</td>\n",
- " <td>69</td>\n",
- " <td>136</td>\n",
- " <td>27</td>\n",
- " <td>2</td>\n",
- " <td>19</td>\n",
- " <td>62.0</td>\n",
- " <td>13.0</td>\n",
- " <td>5.0</td>\n",
- " <td>45</td>\n",
- " <td>187.0</td>\n",
- " <td>0.0</td>\n",
- " <td>3.0</td>\n",
- " <td>6.0</td>\n",
- " <td>4.0</td>\n",
- " <td>9.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>escobyu01</th>\n",
- " <th>2015</th>\n",
- " <td>139</td>\n",
- " <td>535</td>\n",
- " <td>75</td>\n",
- " <td>168</td>\n",
- " <td>25</td>\n",
- " <td>1</td>\n",
- " <td>9</td>\n",
- " <td>56.0</td>\n",
- " <td>2.0</td>\n",
- " <td>2.0</td>\n",
- " <td>45</td>\n",
- " <td>70.0</td>\n",
- " <td>0.0</td>\n",
- " <td>8.0</td>\n",
- " <td>1.0</td>\n",
- " <td>2.0</td>\n",
- " <td>24.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th rowspan=\"2\" valign=\"top\">espinda01</th>\n",
- " <th>2015</th>\n",
- " <td>118</td>\n",
- " <td>367</td>\n",
- " <td>59</td>\n",
- " <td>88</td>\n",
- " <td>21</td>\n",
- " <td>1</td>\n",
- " <td>13</td>\n",
- " <td>37.0</td>\n",
- " <td>5.0</td>\n",
- " <td>2.0</td>\n",
- " <td>33</td>\n",
- " <td>106.0</td>\n",
- " <td>5.0</td>\n",
- " <td>6.0</td>\n",
- " <td>3.0</td>\n",
- " <td>3.0</td>\n",
- " <td>6.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2016</th>\n",
- " <td>157</td>\n",
- " <td>516</td>\n",
- " <td>66</td>\n",
- " <td>108</td>\n",
- " <td>15</td>\n",
- " <td>0</td>\n",
- " <td>24</td>\n",
- " <td>72.0</td>\n",
- " <td>9.0</td>\n",
- " <td>2.0</td>\n",
- " <td>54</td>\n",
- " <td>174.0</td>\n",
- " <td>12.0</td>\n",
- " <td>20.0</td>\n",
- " <td>7.0</td>\n",
- " <td>4.0</td>\n",
- " <td>4.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th rowspan=\"2\" valign=\"top\">harpebr03</th>\n",
- " <th>2015</th>\n",
- " <td>153</td>\n",
- " <td>521</td>\n",
- " <td>118</td>\n",
- " <td>172</td>\n",
- " <td>38</td>\n",
- " <td>1</td>\n",
- " <td>42</td>\n",
- " <td>99.0</td>\n",
- " <td>6.0</td>\n",
- " <td>4.0</td>\n",
- " <td>124</td>\n",
- " <td>131.0</td>\n",
- " <td>15.0</td>\n",
- " <td>5.0</td>\n",
- " <td>0.0</td>\n",
- " <td>4.0</td>\n",
- " <td>15.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2016</th>\n",
- " <td>147</td>\n",
- " <td>506</td>\n",
- " <td>84</td>\n",
- " <td>123</td>\n",
- " <td>24</td>\n",
- " <td>2</td>\n",
- " <td>24</td>\n",
- " <td>86.0</td>\n",
- " <td>21.0</td>\n",
- " <td>10.0</td>\n",
- " <td>108</td>\n",
- " <td>117.0</td>\n",
- " <td>20.0</td>\n",
- " <td>3.0</td>\n",
- " <td>0.0</td>\n",
- " <td>10.0</td>\n",
- " <td>11.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>murphda08</th>\n",
- " <th>2016</th>\n",
- " <td>142</td>\n",
- " <td>531</td>\n",
- " <td>88</td>\n",
- " <td>184</td>\n",
- " <td>47</td>\n",
- " <td>5</td>\n",
- " <td>25</td>\n",
- " <td>104.0</td>\n",
- " <td>5.0</td>\n",
- " <td>3.0</td>\n",
- " <td>35</td>\n",
- " <td>57.0</td>\n",
- " <td>10.0</td>\n",
- " <td>8.0</td>\n",
- " <td>0.0</td>\n",
- " <td>8.0</td>\n",
- " <td>4.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th rowspan=\"2\" valign=\"top\">ramoswi01</th>\n",
- " <th>2015</th>\n",
- " <td>128</td>\n",
- " <td>475</td>\n",
- " <td>41</td>\n",
- " <td>109</td>\n",
- " <td>16</td>\n",
- " <td>0</td>\n",
- " <td>15</td>\n",
- " <td>68.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>21</td>\n",
- " <td>101.0</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>8.0</td>\n",
- " <td>16.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2016</th>\n",
- " <td>131</td>\n",
- " <td>482</td>\n",
- " <td>58</td>\n",
- " <td>148</td>\n",
- " <td>25</td>\n",
- " <td>0</td>\n",
- " <td>22</td>\n",
- " <td>80.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>35</td>\n",
- " <td>79.0</td>\n",
- " <td>2.0</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>4.0</td>\n",
- " <td>17.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>rendoan01</th>\n",
- " <th>2016</th>\n",
- " <td>156</td>\n",
- " <td>567</td>\n",
- " <td>91</td>\n",
- " <td>153</td>\n",
- " <td>38</td>\n",
- " <td>2</td>\n",
- " <td>20</td>\n",
- " <td>85.0</td>\n",
- " <td>12.0</td>\n",
- " <td>6.0</td>\n",
- " <td>65</td>\n",
- " <td>117.0</td>\n",
- " <td>2.0</td>\n",
- " <td>7.0</td>\n",
- " <td>0.0</td>\n",
- " <td>8.0</td>\n",
- " <td>5.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>reverbe01</th>\n",
- " <th>2016</th>\n",
- " <td>103</td>\n",
- " <td>350</td>\n",
- " <td>44</td>\n",
- " <td>76</td>\n",
- " <td>9</td>\n",
- " <td>7</td>\n",
- " <td>2</td>\n",
- " <td>24.0</td>\n",
- " <td>14.0</td>\n",
- " <td>5.0</td>\n",
- " <td>18</td>\n",
- " <td>34.0</td>\n",
- " <td>0.0</td>\n",
- " <td>3.0</td>\n",
- " <td>2.0</td>\n",
- " <td>2.0</td>\n",
- " <td>12.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th rowspan=\"2\" valign=\"top\">robincl01</th>\n",
- " <th>2015</th>\n",
- " <td>126</td>\n",
- " <td>309</td>\n",
- " <td>44</td>\n",
- " <td>84</td>\n",
- " <td>15</td>\n",
- " <td>1</td>\n",
- " <td>10</td>\n",
- " <td>34.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>37</td>\n",
- " <td>52.0</td>\n",
- " <td>4.0</td>\n",
- " <td>5.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>6.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2016</th>\n",
- " <td>104</td>\n",
- " <td>196</td>\n",
- " <td>16</td>\n",
- " <td>46</td>\n",
- " <td>4</td>\n",
- " <td>0</td>\n",
- " <td>5</td>\n",
- " <td>26.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>20</td>\n",
- " <td>38.0</td>\n",
- " <td>0.0</td>\n",
- " <td>2.0</td>\n",
- " <td>1.0</td>\n",
- " <td>5.0</td>\n",
- " <td>4.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>taylomi02</th>\n",
- " <th>2015</th>\n",
- " <td>138</td>\n",
- " <td>472</td>\n",
- " <td>49</td>\n",
- " <td>108</td>\n",
- " <td>15</td>\n",
- " <td>2</td>\n",
- " <td>14</td>\n",
- " <td>63.0</td>\n",
- " <td>16.0</td>\n",
- " <td>3.0</td>\n",
- " <td>35</td>\n",
- " <td>158.0</td>\n",
- " <td>9.0</td>\n",
- " <td>1.0</td>\n",
- " <td>1.0</td>\n",
- " <td>2.0</td>\n",
- " <td>5.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>werthja01</th>\n",
- " <th>2016</th>\n",
- " <td>143</td>\n",
- " <td>525</td>\n",
- " <td>84</td>\n",
- " <td>128</td>\n",
- " <td>28</td>\n",
- " <td>0</td>\n",
- " <td>21</td>\n",
- " <td>69.0</td>\n",
- " <td>5.0</td>\n",
- " <td>1.0</td>\n",
- " <td>71</td>\n",
- " <td>139.0</td>\n",
- " <td>0.0</td>\n",
- " <td>4.0</td>\n",
- " <td>0.0</td>\n",
- " <td>6.0</td>\n",
- " <td>17.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>zimmery01</th>\n",
- " <th>2016</th>\n",
- " <td>115</td>\n",
- " <td>427</td>\n",
- " <td>60</td>\n",
- " <td>93</td>\n",
- " <td>18</td>\n",
- " <td>1</td>\n",
- " <td>15</td>\n",
- " <td>46.0</td>\n",
- " <td>4.0</td>\n",
- " <td>1.0</td>\n",
- " <td>29</td>\n",
- " <td>104.0</td>\n",
- " <td>1.0</td>\n",
- " <td>5.0</td>\n",
- " <td>0.0</td>\n",
- " <td>6.0</td>\n",
- " <td>12.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " G AB R H 2B 3B HR RBI SB CS BB SO \\\n",
- "desmoia01 2015 156 583 69 136 27 2 19 62.0 13.0 5.0 45 187.0 \n",
- "escobyu01 2015 139 535 75 168 25 1 9 56.0 2.0 2.0 45 70.0 \n",
- "espinda01 2015 118 367 59 88 21 1 13 37.0 5.0 2.0 33 106.0 \n",
- " 2016 157 516 66 108 15 0 24 72.0 9.0 2.0 54 174.0 \n",
- "harpebr03 2015 153 521 118 172 38 1 42 99.0 6.0 4.0 124 131.0 \n",
- " 2016 147 506 84 123 24 2 24 86.0 21.0 10.0 108 117.0 \n",
- "murphda08 2016 142 531 88 184 47 5 25 104.0 5.0 3.0 35 57.0 \n",
- "ramoswi01 2015 128 475 41 109 16 0 15 68.0 0.0 0.0 21 101.0 \n",
- " 2016 131 482 58 148 25 0 22 80.0 0.0 0.0 35 79.0 \n",
- "rendoan01 2016 156 567 91 153 38 2 20 85.0 12.0 6.0 65 117.0 \n",
- "reverbe01 2016 103 350 44 76 9 7 2 24.0 14.0 5.0 18 34.0 \n",
- "robincl01 2015 126 309 44 84 15 1 10 34.0 0.0 0.0 37 52.0 \n",
- " 2016 104 196 16 46 4 0 5 26.0 0.0 0.0 20 38.0 \n",
- "taylomi02 2015 138 472 49 108 15 2 14 63.0 16.0 3.0 35 158.0 \n",
- "werthja01 2016 143 525 84 128 28 0 21 69.0 5.0 1.0 71 139.0 \n",
- "zimmery01 2016 115 427 60 93 18 1 15 46.0 4.0 1.0 29 104.0 \n",
- "\n",
- " IBB HBP SH SF GIDP \n",
- "desmoia01 2015 0.0 3.0 6.0 4.0 9.0 \n",
- "escobyu01 2015 0.0 8.0 1.0 2.0 24.0 \n",
- "espinda01 2015 5.0 6.0 3.0 3.0 6.0 \n",
- " 2016 12.0 20.0 7.0 4.0 4.0 \n",
- "harpebr03 2015 15.0 5.0 0.0 4.0 15.0 \n",
- " 2016 20.0 3.0 0.0 10.0 11.0 \n",
- "murphda08 2016 10.0 8.0 0.0 8.0 4.0 \n",
- "ramoswi01 2015 2.0 0.0 0.0 8.0 16.0 \n",
- " 2016 2.0 2.0 0.0 4.0 17.0 \n",
- "rendoan01 2016 2.0 7.0 0.0 8.0 5.0 \n",
- "reverbe01 2016 0.0 3.0 2.0 2.0 12.0 \n",
- "robincl01 2015 4.0 5.0 0.0 1.0 6.0 \n",
- " 2016 0.0 2.0 1.0 5.0 4.0 \n",
- "taylomi02 2015 9.0 1.0 1.0 2.0 5.0 \n",
- "werthja01 2016 0.0 4.0 0.0 6.0 17.0 \n",
- "zimmery01 2016 1.0 5.0 0.0 6.0 12.0 "
- ]
- },
- "execution_count": 33,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# sort the rows by playerID, exactly as when building idx, so the index tuples line up with the data\n",
- "df_was = df_was.sort_values(by=['playerID']).\\\n",
- " set_index(idx).\\\n",
- " drop(['playerID', 'yearID', 'teamID', 'lgID', 'stint'], axis=1)\n",
- "df_was"
- ]
- },
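- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As an alternative sketch (not how the cells above were built): `set_index()` also accepts a list of column names and builds the `MultiIndex` for us, which avoids zipping tuples and keeping two sorts in sync. The frame `toy` below is a tiny hand-copied subset of the rows shown above, and the names `toy` and `toy_mi` are illustrative only."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "\n",
- "# a small, deliberately unsorted subset of the Nationals data shown above\n",
- "toy = pd.DataFrame({\n",
- "    'playerID': ['harpebr03', 'robincl01', 'harpebr03', 'robincl01'],\n",
- "    'yearID': [2016, 2015, 2015, 2016],\n",
- "    'G': [147, 126, 153, 104],\n",
- "    'H': [123, 84, 172, 46],\n",
- "})\n",
- "\n",
- "# move both columns into the row index, then sort by player and year\n",
- "toy_mi = toy.set_index(['playerID', 'yearID']).sort_index()\n",
- "toy_mi"
- ]
- },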
- {
- "cell_type": "code",
- "execution_count": 34,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>H</th>\n",
- " <th>SO</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>2015</th>\n",
- " <td>126</td>\n",
- " <td>309</td>\n",
- " <td>84</td>\n",
- " <td>52.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2016</th>\n",
- " <td>104</td>\n",
- " <td>196</td>\n",
- " <td>46</td>\n",
- " <td>38.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " G AB H SO\n",
- "2015 126 309 84 52.0\n",
- "2016 104 196 46 38.0"
- ]
- },
- "execution_count": 34,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_was.loc[('robincl01', ),['G', 'AB', 'H', 'SO']]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 35,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "G 104.0\n",
- "AB 196.0\n",
- "H 46.0\n",
- "SO 38.0\n",
- "Name: (robincl01, 2016), dtype: float64"
- ]
- },
- "execution_count": 35,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_was.loc[('robincl01', 2016),['G', 'AB', 'H', 'SO']]"
- ]
- },
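- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The two `.loc` lookups above select on the outer level (player) or on a full (player, year) pair. To select on the inner level instead, for example every player in a given year, one option is `xs()`. The sketch below uses a small hand-built frame named `toy` (an illustrative name, with values copied from the table above) and assumes the index levels are named `playerID` and `yearID`, which is not the case for the unnamed levels of `df_was`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "\n",
- "# hand-built toy frame with a named two-level row index\n",
- "toy = pd.DataFrame(\n",
- "    {'G': [153, 147, 126, 104], 'H': [172, 123, 84, 46]},\n",
- "    index=pd.MultiIndex.from_tuples(\n",
- "        [('harpebr03', 2015), ('harpebr03', 2016),\n",
- "         ('robincl01', 2015), ('robincl01', 2016)],\n",
- "        names=['playerID', 'yearID']),\n",
- ")\n",
- "\n",
- "# partial indexing on the outer level returns every year for that player\n",
- "one_player = toy.loc['robincl01']\n",
- "\n",
- "# xs() takes a cross-section on the inner level: every player in 2016\n",
- "year_2016 = toy.xs(2016, level='yearID')\n",
- "year_2016"
- ]
- },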
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "For the sake of the example, let's take the DataFrame for all rows of data past 2016 and create a multi-index using year, league, team and player as the groupings of the index."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 36,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>yearID</th>\n",
- " <th>stint</th>\n",
- " <th>teamID</th>\n",
- " <th>lgID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>R</th>\n",
- " <th>H</th>\n",
- " <th>2B</th>\n",
- " <th>...</th>\n",
- " <th>RBI</th>\n",
- " <th>SB</th>\n",
- " <th>CS</th>\n",
- " <th>BB</th>\n",
- " <th>SO</th>\n",
- " <th>IBB</th>\n",
- " <th>HBP</th>\n",
- " <th>SH</th>\n",
- " <th>SF</th>\n",
- " <th>GIDP</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>0</th>\n",
- " <td>abercda01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>TRO</td>\n",
- " <td>NaN</td>\n",
- " <td>1</td>\n",
- " <td>4</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>1</th>\n",
- " <td>addybo01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>RC1</td>\n",
- " <td>NaN</td>\n",
- " <td>25</td>\n",
- " <td>118</td>\n",
- " <td>30</td>\n",
- " <td>32</td>\n",
- " <td>6</td>\n",
- " <td>...</td>\n",
- " <td>13.0</td>\n",
- " <td>8.0</td>\n",
- " <td>1.0</td>\n",
- " <td>4</td>\n",
- " <td>0.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>2</th>\n",
- " <td>allisar01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>CL1</td>\n",
- " <td>NaN</td>\n",
- " <td>29</td>\n",
- " <td>137</td>\n",
- " <td>28</td>\n",
- " <td>40</td>\n",
- " <td>4</td>\n",
- " <td>...</td>\n",
- " <td>19.0</td>\n",
- " <td>3.0</td>\n",
- " <td>1.0</td>\n",
- " <td>2</td>\n",
- " <td>5.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>3</th>\n",
- " <td>allisdo01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>WS3</td>\n",
- " <td>NaN</td>\n",
- " <td>27</td>\n",
- " <td>133</td>\n",
- " <td>28</td>\n",
- " <td>44</td>\n",
- " <td>10</td>\n",
- " <td>...</td>\n",
- " <td>27.0</td>\n",
- " <td>1.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0</td>\n",
- " <td>2.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>4</th>\n",
- " <td>ansonca01</td>\n",
- " <td>1871</td>\n",
- " <td>1</td>\n",
- " <td>RC1</td>\n",
- " <td>NaN</td>\n",
- " <td>25</td>\n",
- " <td>120</td>\n",
- " <td>29</td>\n",
- " <td>39</td>\n",
- " <td>11</td>\n",
- " <td>...</td>\n",
- " <td>16.0</td>\n",
- " <td>6.0</td>\n",
- " <td>2.0</td>\n",
- " <td>2</td>\n",
- " <td>1.0</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " <td>NaN</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "<p>5 rows × 22 columns</p>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID yearID stint teamID lgID G AB R H 2B ... RBI SB \\\n",
- "0 abercda01 1871 1 TRO NaN 1 4 0 0 0 ... 0.0 0.0 \n",
- "1 addybo01 1871 1 RC1 NaN 25 118 30 32 6 ... 13.0 8.0 \n",
- "2 allisar01 1871 1 CL1 NaN 29 137 28 40 4 ... 19.0 3.0 \n",
- "3 allisdo01 1871 1 WS3 NaN 27 133 28 44 10 ... 27.0 1.0 \n",
- "4 ansonca01 1871 1 RC1 NaN 25 120 29 39 11 ... 16.0 6.0 \n",
- "\n",
- " CS BB SO IBB HBP SH SF GIDP \n",
- "0 0.0 0 0.0 NaN NaN NaN NaN NaN \n",
- "1 1.0 4 0.0 NaN NaN NaN NaN NaN \n",
- "2 1.0 2 5.0 NaN NaN NaN NaN NaN \n",
- "3 1.0 0 2.0 NaN NaN NaN NaN NaN \n",
- "4 2.0 2 1.0 NaN NaN NaN NaN NaN \n",
- "\n",
- "[5 rows x 22 columns]"
- ]
- },
- "execution_count": 36,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df.head()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 37,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "((2016, 'NL', 'WAS', 'rzepcma01'),\n",
- " (2016, 'NL', 'WAS', 'scherma01'),\n",
- " (2016, 'NL', 'WAS', 'severpe01'),\n",
- " (2016, 'NL', 'WAS', 'solissa01'),\n",
- " (2016, 'NL', 'WAS', 'strasst01'),\n",
- " (2016, 'NL', 'WAS', 'taylomi02'),\n",
- " (2016, 'NL', 'WAS', 'treinbl01'),\n",
- " (2016, 'NL', 'WAS', 'turnetr01'),\n",
- " (2016, 'NL', 'WAS', 'werthja01'),\n",
- " (2016, 'NL', 'WAS', 'zimmery01'))"
- ]
- },
- "execution_count": 37,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_mi = df[df.yearID>2006].copy()\n",
- "idx_labels = ['yearID', 'lgID', 'teamID', 'playerID']\n",
- "\n",
- "tuple(\n",
- " zip(\n",
- " df_mi[idx_labels]\\\n",
- " .sort_values(idx_labels)['yearID'],\n",
- "\n",
- " df_mi[idx_labels]\\\n",
- " .sort_values(idx_labels)['lgID'],\n",
- "\n",
- " df_mi[idx_labels]\\\n",
- " .sort_values(idx_labels)['teamID'],\n",
- "\n",
- " df_mi[idx_labels]\\\n",
- " .sort_values(idx_labels)['playerID']))[-10:]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 38,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "idx = \\\n",
- " pd.MultiIndex.from_tuples(\n",
- " tuple(\n",
- " zip(\n",
- " df_mi[idx_labels]\\\n",
- " .sort_values(idx_labels)['yearID'],\n",
- " \n",
- " df_mi[idx_labels]\\\n",
- " .sort_values(idx_labels)['lgID'],\n",
- " \n",
- " df_mi[idx_labels]\\\n",
- " .sort_values(idx_labels)['teamID'],\n",
- " \n",
- " df_mi[idx_labels]\\\n",
- " .sort_values(idx_labels)['playerID']))\n",
- " )"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 39,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "# Sort the rows by the same keys (and in the same order) used to build idx,\n",
- "# so that every row lines up with its own index tuple.\n",
- "df_mi = df_mi.sort_values(idx_labels).set_index(idx)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 40,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th></th>\n",
- " <th></th>\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>yearID</th>\n",
- " <th>stint</th>\n",
- " <th>teamID</th>\n",
- " <th>lgID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>R</th>\n",
- " <th>H</th>\n",
- " <th>2B</th>\n",
- " <th>...</th>\n",
- " <th>RBI</th>\n",
- " <th>SB</th>\n",
- " <th>CS</th>\n",
- " <th>BB</th>\n",
- " <th>SO</th>\n",
- " <th>IBB</th>\n",
- " <th>HBP</th>\n",
- " <th>SH</th>\n",
- " <th>SF</th>\n",
- " <th>GIDP</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th rowspan=\"5\" valign=\"top\">2007</th>\n",
- " <th rowspan=\"5\" valign=\"top\">AL</th>\n",
- " <th rowspan=\"5\" valign=\"top\">BAL</th>\n",
- " <th>baezda01</th>\n",
- " <td>bardebr01</td>\n",
- " <td>2007</td>\n",
- " <td>1</td>\n",
- " <td>ARI</td>\n",
- " <td>NL</td>\n",
- " <td>8</td>\n",
- " <td>12</td>\n",
- " <td>0</td>\n",
- " <td>1</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>3.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>bakopa01</th>\n",
- " <td>bonifem01</td>\n",
- " <td>2007</td>\n",
- " <td>1</td>\n",
- " <td>ARI</td>\n",
- " <td>NL</td>\n",
- " <td>11</td>\n",
- " <td>23</td>\n",
- " <td>2</td>\n",
- " <td>5</td>\n",
- " <td>1</td>\n",
- " <td>...</td>\n",
- " <td>2.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>4</td>\n",
- " <td>3.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>bedarer01</th>\n",
- " <td>byrneer01</td>\n",
- " <td>2007</td>\n",
- " <td>1</td>\n",
- " <td>ARI</td>\n",
- " <td>NL</td>\n",
- " <td>160</td>\n",
- " <td>626</td>\n",
- " <td>103</td>\n",
- " <td>179</td>\n",
- " <td>30</td>\n",
- " <td>...</td>\n",
- " <td>83.0</td>\n",
- " <td>50.0</td>\n",
- " <td>7.0</td>\n",
- " <td>57</td>\n",
- " <td>98.0</td>\n",
- " <td>5.0</td>\n",
- " <td>10.0</td>\n",
- " <td>1.0</td>\n",
- " <td>4.0</td>\n",
- " <td>12.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>bellro01</th>\n",
- " <td>callaal01</td>\n",
- " <td>2007</td>\n",
- " <td>1</td>\n",
- " <td>ARI</td>\n",
- " <td>NL</td>\n",
- " <td>56</td>\n",
- " <td>144</td>\n",
- " <td>10</td>\n",
- " <td>31</td>\n",
- " <td>8</td>\n",
- " <td>...</td>\n",
- " <td>7.0</td>\n",
- " <td>1.0</td>\n",
- " <td>1.0</td>\n",
- " <td>9</td>\n",
- " <td>14.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>1.0</td>\n",
- " <td>1.0</td>\n",
- " <td>8.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>birkiku01</th>\n",
- " <td>choatra01</td>\n",
- " <td>2007</td>\n",
- " <td>1</td>\n",
- " <td>ARI</td>\n",
- " <td>NL</td>\n",
- " <td>2</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "<p>5 rows × 22 columns</p>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID yearID stint teamID lgID G AB R \\\n",
- "2007 AL BAL baezda01 bardebr01 2007 1 ARI NL 8 12 0 \n",
- " bakopa01 bonifem01 2007 1 ARI NL 11 23 2 \n",
- " bedarer01 byrneer01 2007 1 ARI NL 160 626 103 \n",
- " bellro01 callaal01 2007 1 ARI NL 56 144 10 \n",
- " birkiku01 choatra01 2007 1 ARI NL 2 0 0 \n",
- "\n",
- " H 2B ... RBI SB CS BB SO IBB HBP \\\n",
- "2007 AL BAL baezda01 1 0 ... 0.0 0.0 0.0 0 3.0 0.0 0.0 \n",
- " bakopa01 5 1 ... 2.0 0.0 1.0 4 3.0 0.0 0.0 \n",
- " bedarer01 179 30 ... 83.0 50.0 7.0 57 98.0 5.0 10.0 \n",
- " bellro01 31 8 ... 7.0 1.0 1.0 9 14.0 0.0 1.0 \n",
- " birkiku01 0 0 ... 0.0 0.0 0.0 0 0.0 0.0 0.0 \n",
- "\n",
- " SH SF GIDP \n",
- "2007 AL BAL baezda01 0.0 0.0 0.0 \n",
- " bakopa01 0.0 0.0 0.0 \n",
- " bedarer01 1.0 4.0 12.0 \n",
- " bellro01 1.0 1.0 8.0 \n",
- " birkiku01 0.0 0.0 0.0 \n",
- "\n",
- "[5 rows x 22 columns]"
- ]
- },
- "execution_count": 40,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_mi.head()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 41,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th></th>\n",
- " <th></th>\n",
- " <th></th>\n",
- " <th>playerID</th>\n",
- " <th>yearID</th>\n",
- " <th>stint</th>\n",
- " <th>teamID</th>\n",
- " <th>lgID</th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " <th>R</th>\n",
- " <th>H</th>\n",
- " <th>2B</th>\n",
- " <th>...</th>\n",
- " <th>RBI</th>\n",
- " <th>SB</th>\n",
- " <th>CS</th>\n",
- " <th>BB</th>\n",
- " <th>SO</th>\n",
- " <th>IBB</th>\n",
- " <th>HBP</th>\n",
- " <th>SH</th>\n",
- " <th>SF</th>\n",
- " <th>GIDP</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th rowspan=\"5\" valign=\"top\">2016</th>\n",
- " <th rowspan=\"5\" valign=\"top\">NL</th>\n",
- " <th rowspan=\"5\" valign=\"top\">WAS</th>\n",
- " <th>taylomi02</th>\n",
- " <td>taylomi02</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>76</td>\n",
- " <td>221</td>\n",
- " <td>28</td>\n",
- " <td>51</td>\n",
- " <td>11</td>\n",
- " <td>...</td>\n",
- " <td>16.0</td>\n",
- " <td>14.0</td>\n",
- " <td>3.0</td>\n",
- " <td>14</td>\n",
- " <td>77.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>2.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>treinbl01</th>\n",
- " <td>treinbl01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>73</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>0</td>\n",
- " <td>...</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " <td>0.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>turnetr01</th>\n",
- " <td>turnetr01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>73</td>\n",
- " <td>307</td>\n",
- " <td>53</td>\n",
- " <td>105</td>\n",
- " <td>14</td>\n",
- " <td>...</td>\n",
- " <td>40.0</td>\n",
- " <td>33.0</td>\n",
- " <td>6.0</td>\n",
- " <td>14</td>\n",
- " <td>59.0</td>\n",
- " <td>0.0</td>\n",
- " <td>1.0</td>\n",
- " <td>0.0</td>\n",
- " <td>2.0</td>\n",
- " <td>1.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>werthja01</th>\n",
- " <td>werthja01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>143</td>\n",
- " <td>525</td>\n",
- " <td>84</td>\n",
- " <td>128</td>\n",
- " <td>28</td>\n",
- " <td>...</td>\n",
- " <td>69.0</td>\n",
- " <td>5.0</td>\n",
- " <td>1.0</td>\n",
- " <td>71</td>\n",
- " <td>139.0</td>\n",
- " <td>0.0</td>\n",
- " <td>4.0</td>\n",
- " <td>0.0</td>\n",
- " <td>6.0</td>\n",
- " <td>17.0</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>zimmery01</th>\n",
- " <td>zimmery01</td>\n",
- " <td>2016</td>\n",
- " <td>1</td>\n",
- " <td>WAS</td>\n",
- " <td>NL</td>\n",
- " <td>115</td>\n",
- " <td>427</td>\n",
- " <td>60</td>\n",
- " <td>93</td>\n",
- " <td>18</td>\n",
- " <td>...</td>\n",
- " <td>46.0</td>\n",
- " <td>4.0</td>\n",
- " <td>1.0</td>\n",
- " <td>29</td>\n",
- " <td>104.0</td>\n",
- " <td>1.0</td>\n",
- " <td>5.0</td>\n",
- " <td>0.0</td>\n",
- " <td>6.0</td>\n",
- " <td>12.0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "<p>5 rows × 22 columns</p>\n",
- "</div>"
- ],
- "text/plain": [
- " playerID yearID stint teamID lgID G AB R \\\n",
- "2016 NL WAS taylomi02 taylomi02 2016 1 WAS NL 76 221 28 \n",
- " treinbl01 treinbl01 2016 1 WAS NL 73 0 0 \n",
- " turnetr01 turnetr01 2016 1 WAS NL 73 307 53 \n",
- " werthja01 werthja01 2016 1 WAS NL 143 525 84 \n",
- " zimmery01 zimmery01 2016 1 WAS NL 115 427 60 \n",
- "\n",
- " H 2B ... RBI SB CS BB SO IBB HBP \\\n",
- "2016 NL WAS taylomi02 51 11 ... 16.0 14.0 3.0 14 77.0 0.0 1.0 \n",
- " treinbl01 0 0 ... 0.0 0.0 0.0 0 0.0 0.0 0.0 \n",
- " turnetr01 105 14 ... 40.0 33.0 6.0 14 59.0 0.0 1.0 \n",
- " werthja01 128 28 ... 69.0 5.0 1.0 71 139.0 0.0 4.0 \n",
- " zimmery01 93 18 ... 46.0 4.0 1.0 29 104.0 1.0 5.0 \n",
- "\n",
- " SH SF GIDP \n",
- "2016 NL WAS taylomi02 0.0 1.0 2.0 \n",
- " treinbl01 0.0 0.0 0.0 \n",
- " turnetr01 0.0 2.0 1.0 \n",
- " werthja01 0.0 6.0 17.0 \n",
- " zimmery01 0.0 6.0 12.0 \n",
- "\n",
- "[5 rows x 22 columns]"
- ]
- },
- "execution_count": 41,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_mi.tail()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Now we can use this multi-index to our advantage: pass `.loc` a tuple of the index values we want and restrict the columns to just the data of interest."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 42,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<style>\n",
- " .dataframe thead tr:only-child th {\n",
- " text-align: right;\n",
- " }\n",
- "\n",
- " .dataframe thead th {\n",
- " text-align: left;\n",
- " }\n",
- "\n",
- " .dataframe tbody tr th {\n",
- " vertical-align: top;\n",
- " }\n",
- "</style>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- " <thead>\n",
- " <tr style=\"text-align: right;\">\n",
- " <th></th>\n",
- " <th>G</th>\n",
- " <th>AB</th>\n",
- " </tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr>\n",
- " <th>accarje01</th>\n",
- " <td>152</td>\n",
- " <td>509</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>adamsru01</th>\n",
- " <td>62</td>\n",
- " <td>1</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>banksjo01</th>\n",
- " <td>26</td>\n",
- " <td>5</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>burneaj01</th>\n",
- " <td>8</td>\n",
- " <td>14</td>\n",
- " </tr>\n",
- " <tr>\n",
- " <th>chacigu01</th>\n",
- " <td>65</td>\n",
- " <td>0</td>\n",
- " </tr>\n",
- " </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " G AB\n",
- "accarje01 152 509\n",
- "adamsru01 62 1\n",
- "banksjo01 26 5\n",
- "burneaj01 8 14\n",
- "chacigu01 65 0"
- ]
- },
- "execution_count": 42,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_mi.loc[(2007, 'AL', 'TOR'), ['G', 'AB']].head()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Ξ"
- ]
- }
- ],
- "metadata": {
- "anaconda-cloud": {},
- "gist": {
- "data": {
- "description": "nb/2_dataframe_operations.ipynb",
- "public": false
- },
- "id": ""
- },
- "kernelspec": {
- "display_name": "Python [default]",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.6.1"
- },
- "toc": {
- "colors": {
- "hover_highlight": "#DAA520",
- "navigate_num": "#000000",
- "navigate_text": "#333333",
- "running_highlight": "#FF0000",
- "selected_highlight": "#FFD700",
- "sidebar_border": "#EEEEEE",
- "wrapper_background": "#FFFFFF"
- },
- "moveMenuLeft": true,
- "nav_menu": {
- "height": "211px",
- "width": "252px"
- },
- "navigate_menu": true,
- "number_sections": false,
- "sideBar": true,
- "threshold": 4,
- "toc_cell": true,
- "toc_section_display": "block",
- "toc_window_display": true,
- "widenNotebook": false
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
|