pandas: a package for data analysis and manipulation

Overview

pandas is a Python library for data analysis and manipulation. It provides data structures and operations for working with numerical tables and time series. It was created by Wes McKinney in 2008. The name "pandas" refers both to "Panel Data" and to "Python Data Analysis".

Its main data structure is the DataFrame, a rectangular arrangement of data organized in rows and columns.

Loading

# pandas is conventionally imported under the alias pd

import pandas as pd

Data structures

The two main pandas data structures are Series and DataFrame.

Series

A Series is a one-dimensional labeled array that can hold data of any type. It resembles a single column of a table.

# Definition of a series

primos = [2, 3, 5, 7, 11]
serie_primos = pd.Series(primos)

serie_primos
0     2
1     3
2     5
3     7
4    11
dtype: int64

By default, each element of a Series is labeled with its integer position, starting at 0.

# First element
print("First element:", serie_primos[0])

# Second element
print("Second element:", serie_primos[1])
First element: 2
Second element: 3

The index can also use custom labels.

# Index of a series with custom labels

serie_primos = pd.Series(primos, index = ["A", "B", "C", "D", "E"])

serie_primos
A     2
B     3
C     5
D     7
E    11
dtype: int64
# Element at index label "D"
print(serie_primos["D"])
7
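The difference between access by label and access by position can be sketched as follows. This is a minimal illustration that reuses the `serie_primos` data from above; the variable names `by_label`, `by_position` and `subset` are introduced here only for the example.

```python
import pandas as pd

# Same data as in the text: a Series of primes with custom labels
serie_primos = pd.Series([2, 3, 5, 7, 11], index=["A", "B", "C", "D", "E"])

# .loc selects by index label
by_label = serie_primos.loc["D"]

# .iloc selects by 0-based position, regardless of the labels
by_position = serie_primos.iloc[3]

# A list of labels returns a new Series
subset = serie_primos.loc[["A", "E"]]

print(by_label, by_position)  # 7 7
print(subset.tolist())        # [2, 11]
```

Using `.loc` and `.iloc` explicitly avoids ambiguity when the labels are themselves integers.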

DataFrames

DataFrames are two-dimensional structures. A Series can be seen as one column of a table and a DataFrame as the whole table. A DataFrame can be built from several Series.

# DataFrame built from a dictionary of two lists; each list becomes a column

datos = {
  "pais": ["PA", "CR", "NI"],
  "poblacion": [4.1, 5.0, 6.6]
}

paises = pd.DataFrame(datos)

paises
  pais  poblacion
0   PA        4.1
1   CR        5.0
2   NI        6.6
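The same table can also be assembled from actual Series objects, which is what the statement about building a DataFrame from several Series refers to. The sketch below assumes two Series named `codes` and `population` (names chosen for this example) that share the default integer index.

```python
import pandas as pd

# Two Series with the same default index (0, 1, 2)
codes = pd.Series(["PA", "CR", "NI"])
population = pd.Series([4.1, 5.0, 6.6])

# Each Series becomes one column; rows are aligned on the index
paises = pd.DataFrame({"pais": codes, "poblacion": population})

print(paises.shape)          # (3, 2)
print(list(paises.columns))  # ['pais', 'poblacion']
```

Because pandas aligns the columns on their index, Series with different indexes would produce missing values (NaN) in the rows where one of them has no entry.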

The loc indexer returns one or more rows of a DataFrame, selected by an index label or by a list of index labels.

# Second row (index label 1)
paises.loc[1]
pais          CR
poblacion    5.0
Name: 1, dtype: object
# Second and third rows
paises.loc[[1, 2]]
  pais  poblacion
1   CR        5.0
2   NI        6.6

The index of a DataFrame can also use custom labels:

paises = pd.DataFrame(datos, index=["pais0", "pais1", "pais2"])
paises
      pais  poblacion
pais0   PA        4.1
pais1   CR        5.0
pais2   NI        6.6
# Row labeled "pais0"
paises.loc["pais0"]
pais          PA
poblacion    4.1
Name: pais0, dtype: object
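Beyond whole rows, loc also accepts a row label together with a column label, and iloc offers the positional counterpart. The following sketch rebuilds the small `paises` table from above; the names `value`, `first_row` and `label_slice` are illustrative only.

```python
import pandas as pd

paises = pd.DataFrame(
    {"pais": ["PA", "CR", "NI"], "poblacion": [4.1, 5.0, 6.6]},
    index=["pais0", "pais1", "pais2"],
)

# Single cell: row label and column label
value = paises.loc["pais1", "poblacion"]

# First row by position, independent of the labels
first_row = paises.iloc[0]

# Label slices with loc include BOTH endpoints
label_slice = paises.loc["pais0":"pais1"]

print(value)             # 5.0
print(first_row["pais"])  # PA
print(len(label_slice))  # 2
```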

Basic operations

The following sections describe and illustrate some of the basic pandas functions.

The examples use a set of presence records of felines (family Felidae) from Costa Rica, obtained through a query to the GBIF portal.

read_csv() - loading data

felinos = pd.read_csv("https://raw.githubusercontent.com/pf3311-cienciadatosgeoespaciales/2021-iii/main/contenido/b/datos/felinos.csv", sep="\t")
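The `sep="\t"` argument tells read_csv() that the fields are separated by tabs rather than commas. A minimal offline sketch of the same call is shown below; it reads from an in-memory string instead of a URL, and the two records in `text` are a tiny made-up excerpt used only for illustration.

```python
import io

import pandas as pd

# Tab-separated text standing in for a file (illustrative data)
text = "species\tyear\nPuma concolor\t2021\nPanthera onca\t2018\n"

# read_csv accepts file paths, URLs and file-like objects
sample = pd.read_csv(io.StringIO(text), sep="\t")

print(sample.shape)  # (2, 2)
print(sample.dtypes)
```

The first line of the input is taken as the header, and the column types are inferred from the values.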

info() - general information about a dataset

felinos.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 50 columns):
 #   Column                            Non-Null Count  Dtype  
---  ------                            --------------  -----  
 0   gbifID                            150 non-null    int64  
 1   datasetKey                        150 non-null    object 
 2   occurrenceID                      147 non-null    object 
 3   kingdom                           150 non-null    object 
 4   phylum                            150 non-null    object 
 5   class                             150 non-null    object 
 6   order                             150 non-null    object 
 7   family                            150 non-null    object 
 8   genus                             150 non-null    object 
 9   species                           150 non-null    object 
 10  infraspecificEpithet              18 non-null     object 
 11  taxonRank                         150 non-null    object 
 12  scientificName                    150 non-null    object 
 13  verbatimScientificName            150 non-null    object 
 14  verbatimScientificNameAuthorship  16 non-null     object 
 15  countryCode                       150 non-null    object 
 16  locality                          41 non-null     object 
 17  stateProvince                     132 non-null    object 
 18  occurrenceStatus                  150 non-null    object 
 19  individualCount                   27 non-null     float64
 20  publishingOrgKey                  150 non-null    object 
 21  decimalLatitude                   150 non-null    float64
 22  decimalLongitude                  150 non-null    float64
 23  coordinateUncertaintyInMeters     129 non-null    float64
 24  coordinatePrecision               0 non-null      float64
 25  elevation                         3 non-null      float64
 26  elevationAccuracy                 3 non-null      float64
 27  depth                             3 non-null      float64
 28  depthAccuracy                     3 non-null      float64
 29  eventDate                         148 non-null    object 
 30  day                               147 non-null    float64
 31  month                             148 non-null    float64
 32  year                              148 non-null    float64
 33  taxonKey                          150 non-null    int64  
 34  speciesKey                        150 non-null    int64  
 35  basisOfRecord                     150 non-null    object 
 36  institutionCode                   137 non-null    object 
 37  collectionCode                    150 non-null    object 
 38  catalogNumber                     150 non-null    object 
 39  recordNumber                      4 non-null      object 
 40  identifiedBy                      121 non-null    object 
 41  dateIdentified                    109 non-null    object 
 42  license                           150 non-null    object 
 43  rightsHolder                      132 non-null    object 
 44  recordedBy                        135 non-null    object 
 45  typeStatus                        0 non-null      float64
 46  establishmentMeans                10 non-null     object 
 47  lastInterpreted                   150 non-null    object 
 48  mediaType                         102 non-null    object 
 49  issue                             137 non-null    object 
dtypes: float64(13), int64(3), object(34)
memory usage: 58.7+ KB
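The "Non-Null Count" column reported by info() can also be computed directly, which is convenient when the counts are needed as data. The sketch below uses a small made-up DataFrame named `sample` (not the felinos dataset) with some missing values.

```python
import pandas as pd

# Illustrative data with missing values (None becomes NaN/None in pandas)
sample = pd.DataFrame({
    "species": ["Puma concolor", "Panthera onca", "Leopardus pardalis"],
    "locality": ["locality A", None, None],
    "year": [2021.0, None, 1965.0],
})

# Number of non-missing values per column, as reported by info()
non_null = sample.notna().sum()

# Number of missing values per column
missing = sample.isna().sum()

print(non_null)
print(missing)
```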

head(), tail(), sample() - displaying rows of a dataset

# First 5 records (head() returns 5 rows by default)
felinos.head()
gbifID datasetKey occurrenceID kingdom phylum class order family genus species ... identifiedBy dateIdentified license rightsHolder recordedBy typeStatus establishmentMeans lastInterpreted mediaType issue
0 3337559907 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/90794984 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Marvin López M. 2021-08-11T17:02:57 CC_BY_NC_4_0 Marvin López M. Marvin López M. NaN NaN 2021-09-23T21:26:16.096Z StillImage COORDINATE_ROUNDED
1 3333401669 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/88270427 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Tiziano Luka Pesci Rubilar 2021-07-23T17:52:04 CC_BY_NC_4_0 Rebeca Quirós Rebeca Quirós NaN NaN 2021-09-23T21:15:51.507Z NaN COORDINATE_ROUNDED
2 3325502794 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/85490861 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... Sofía Pastor Parajeles 2021-07-03T16:59:46 CC_BY_NC_4_0 Sofía Pastor Parajeles Sofía Pastor Parajeles NaN NaN 2021-09-23T20:59:21.345Z StillImage COORDINATE_ROUNDED
3 3314547422 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/84224884 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... Sofía Pastor Parajeles 2021-06-23T20:39:47 CC_BY_NC_4_0 Sofía Pastor Parajeles Sofía Pastor Parajeles NaN NaN 2021-09-23T21:25:26.648Z StillImage NaN
4 3307298689 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/82810053 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus wiedii ... David 2021-06-13T09:10:50 CC_BY_NC_4_0 David David NaN NaN 2021-09-23T21:25:57.478Z StillImage COORDINATE_ROUNDED

5 rows × 50 columns

# Last 5 records (tail() returns 5 rows by default)
felinos.tail()
gbifID datasetKey occurrenceID kingdom phylum class order family genus species ... identifiedBy dateIdentified license rightsHolder recordedBy typeStatus establishmentMeans lastInterpreted mediaType issue
145 439779436 7e2989f0-f762-11e1-a439-00145eb45e9a NaN Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... NaN NaN CC_BY_4_0 NaN NaN NaN NaN 2021-09-24T03:32:37.674Z StillImage GEODETIC_DATUM_ASSUMED_WGS84;RECORDED_DATE_INV...
146 45869665 847e2306-f762-11e1-a439-00145eb45e9a urn:catalog:LSUMZ:Mammals:13378 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus tigrinus ... NaN NaN CC0_1_0 NaN Gardner, Alfred L. NaN NATIVE 2021-09-24T06:03:59.587Z NaN INSTITUTION_COLLECTION_MISMATCH
147 45869664 847e2306-f762-11e1-a439-00145eb45e9a urn:catalog:LSUMZ:Mammals:13377 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus tigrinus ... NaN NaN CC0_1_0 NaN Gardner, Alfred L. NaN NATIVE 2021-09-24T06:03:59.587Z NaN INSTITUTION_COLLECTION_MISMATCH
148 45869301 847e2306-f762-11e1-a439-00145eb45e9a urn:catalog:LSUMZ:Mammals:10219 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... NaN NaN CC0_1_0 NaN Arnold, Keith A. NaN NATIVE 2021-09-24T06:03:59.952Z NaN INSTITUTION_COLLECTION_MISMATCH
149 45869265 847e2306-f762-11e1-a439-00145eb45e9a urn:catalog:LSUMZ:Mammals:9289 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... NaN NaN CC0_1_0 NaN NaN NaN NATIVE 2021-09-24T06:03:59.933Z NaN INSTITUTION_COLLECTION_MISMATCH

5 rows × 50 columns

# 5 randomly selected records
felinos.sample(5)
gbifID datasetKey occurrenceID kingdom phylum class order family genus species ... identifiedBy dateIdentified license rightsHolder recordedBy typeStatus establishmentMeans lastInterpreted mediaType issue
126 1145891880 0daed095-478a-4af6-abf5-18acb790fbb2 http://arctos.database.museum/guid/MVZ:Mamm:95... Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... Museum of Vertebrate Zoology, University of Ca... 1999-01-27T00:00:00 CC0_1_0 NaN Collector(s): Harvey E. Stork NaN NaN 2021-09-25T18:06:59.268Z NaN OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COU...
29 2864895066 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/59866711 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Michele Chiacchio 2020-09-17T12:34:30 CC_BY_NC_4_0 dhfischer dhfischer NaN NaN 2021-09-23T21:14:11.862Z StillImage COORDINATE_ROUNDED
143 476843829 4bfac3ea-8763-4f4b-a71a-76a6f5f243d3 MCZ:Mamm:BOM-5358 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus wiedii ... [no agent data] NaN CC_BY_NC_4_0 President and Fellows of Harvard College William More Gabb NaN NaN 2021-10-01T01:54:06.578Z NaN OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COU...
74 2574073231 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/31417751 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... Michelle Monge-Velazquez 2019-08-24T18:15:37 CC_BY_NC_4_0 Michelle Monge-Velazquez Michelle Monge-Velazquez NaN NaN 2021-09-23T21:11:18.353Z StillImage COORDINATE_ROUNDED
97 2236857888 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/22419536 Animalia Chordata Mammalia Carnivora Felidae Panthera Panthera onca ... Lena Struwe 2019-04-13T17:26:49 CC_BY_NC_4_0 Lena Struwe Lena Struwe NaN NaN 2021-09-23T21:19:43.762Z StillImage COORDINATE_ROUNDED

5 rows × 50 columns
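head() and tail() accept the number of rows as an argument, and sample() can be made reproducible with `random_state`. A short sketch on a made-up DataFrame named `numbers` (chosen for this example) follows.

```python
import pandas as pd

# Illustrative DataFrame with 20 rows
numbers = pd.DataFrame({"n": range(20)})

first_ten = numbers.head(10)   # first 10 rows
last_three = numbers.tail(3)   # last 3 rows

# The same random_state yields the same random selection
sample_a = numbers.sample(5, random_state=42)
sample_b = numbers.sample(5, random_state=42)

print(len(first_ten))            # 10
print(last_three["n"].tolist())  # [17, 18, 19]
print(sample_a.equals(sample_b))  # True
```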

The contents of a DataFrame can also be displayed by typing its name in the Python console.

felinos
gbifID datasetKey occurrenceID kingdom phylum class order family genus species ... identifiedBy dateIdentified license rightsHolder recordedBy typeStatus establishmentMeans lastInterpreted mediaType issue
0 3337559907 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/90794984 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Marvin López M. 2021-08-11T17:02:57 CC_BY_NC_4_0 Marvin López M. Marvin López M. NaN NaN 2021-09-23T21:26:16.096Z StillImage COORDINATE_ROUNDED
1 3333401669 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/88270427 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Tiziano Luka Pesci Rubilar 2021-07-23T17:52:04 CC_BY_NC_4_0 Rebeca Quirós Rebeca Quirós NaN NaN 2021-09-23T21:15:51.507Z NaN COORDINATE_ROUNDED
2 3325502794 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/85490861 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... Sofía Pastor Parajeles 2021-07-03T16:59:46 CC_BY_NC_4_0 Sofía Pastor Parajeles Sofía Pastor Parajeles NaN NaN 2021-09-23T20:59:21.345Z StillImage COORDINATE_ROUNDED
3 3314547422 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/84224884 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... Sofía Pastor Parajeles 2021-06-23T20:39:47 CC_BY_NC_4_0 Sofía Pastor Parajeles Sofía Pastor Parajeles NaN NaN 2021-09-23T21:25:26.648Z StillImage NaN
4 3307298689 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/82810053 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus wiedii ... David 2021-06-13T09:10:50 CC_BY_NC_4_0 David David NaN NaN 2021-09-23T21:25:57.478Z StillImage COORDINATE_ROUNDED
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
145 439779436 7e2989f0-f762-11e1-a439-00145eb45e9a NaN Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... NaN NaN CC_BY_4_0 NaN NaN NaN NaN 2021-09-24T03:32:37.674Z StillImage GEODETIC_DATUM_ASSUMED_WGS84;RECORDED_DATE_INV...
146 45869665 847e2306-f762-11e1-a439-00145eb45e9a urn:catalog:LSUMZ:Mammals:13378 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus tigrinus ... NaN NaN CC0_1_0 NaN Gardner, Alfred L. NaN NATIVE 2021-09-24T06:03:59.587Z NaN INSTITUTION_COLLECTION_MISMATCH
147 45869664 847e2306-f762-11e1-a439-00145eb45e9a urn:catalog:LSUMZ:Mammals:13377 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus tigrinus ... NaN NaN CC0_1_0 NaN Gardner, Alfred L. NaN NATIVE 2021-09-24T06:03:59.587Z NaN INSTITUTION_COLLECTION_MISMATCH
148 45869301 847e2306-f762-11e1-a439-00145eb45e9a urn:catalog:LSUMZ:Mammals:10219 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... NaN NaN CC0_1_0 NaN Arnold, Keith A. NaN NATIVE 2021-09-24T06:03:59.952Z NaN INSTITUTION_COLLECTION_MISMATCH
149 45869265 847e2306-f762-11e1-a439-00145eb45e9a urn:catalog:LSUMZ:Mammals:9289 Animalia Chordata Mammalia Carnivora Felidae Leopardus Leopardus pardalis ... NaN NaN CC0_1_0 NaN NaN NaN NATIVE 2021-09-24T06:03:59.933Z NaN INSTITUTION_COLLECTION_MISMATCH

150 rows × 50 columns

Column selection

The columns to display can be specified with a list of column names.

# Display the columns with the scientific name, species, date, year, month and day

felinos[["scientificName", "species", "eventDate", "year", "month", "day"]]
                          scientificName             species            eventDate    year  month   day
0         Puma concolor (Linnaeus, 1771)       Puma concolor  2021-08-11T10:22:36  2021.0    8.0  11.0
1         Puma concolor (Linnaeus, 1771)       Puma concolor  2021-07-15T16:22:29  2021.0    7.0  15.0
2    Leopardus pardalis (Linnaeus, 1758)  Leopardus pardalis  2021-07-01T19:28:25  2021.0    7.0   1.0
3    Leopardus pardalis (Linnaeus, 1758)  Leopardus pardalis  2021-06-23T10:55:00  2021.0    6.0  23.0
4        Leopardus wiedii (Schinz, 1821)    Leopardus wiedii  2015-12-05T14:41:42  2015.0   12.0   5.0
...                                  ...                 ...                  ...     ...    ...   ...
145       Puma concolor (Linnaeus, 1771)       Puma concolor                  NaN     NaN    NaN   NaN
146  Leopardus tigrinus (Schreber, 1775)  Leopardus tigrinus  1967-05-15T00:00:00  1967.0    5.0  15.0
147  Leopardus tigrinus (Schreber, 1775)  Leopardus tigrinus  1967-02-01T00:00:00  1967.0    2.0   1.0
148  Leopardus pardalis (Linnaeus, 1758)  Leopardus pardalis  1965-06-28T00:00:00  1965.0    6.0  28.0
149  Leopardus pardalis (Linnaeus, 1758)  Leopardus pardalis  1963-02-03T00:00:00  1963.0    2.0   3.0

150 rows × 6 columns
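The type of the result depends on how the columns are requested: a single name in brackets returns a Series, while a list of names returns a DataFrame. The sketch below shows this on a small made-up DataFrame named `sample`; it is an illustration, not part of the felinos workflow.

```python
import pandas as pd

# Illustrative data with the same column names used above
sample = pd.DataFrame({
    "species": ["Puma concolor", "Leopardus wiedii"],
    "year": [2021, 2015],
    "month": [8, 12],
})

# One column name -> Series
one_column = sample["species"]

# List of column names (note the double brackets) -> DataFrame
table = sample[["species", "year"]]

print(type(one_column).__name__)  # Series
print(type(table).__name__)       # DataFrame
print(table.shape)                # (2, 2)
```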

Row selection

# Select the rows corresponding to jaguars (Panthera onca)
jaguares = felinos[felinos["species"] == "Panthera onca"]

# Display the first records
jaguares.head()
gbifID datasetKey occurrenceID kingdom phylum class order family genus species ... identifiedBy dateIdentified license rightsHolder recordedBy typeStatus establishmentMeans lastInterpreted mediaType issue
21 3008449314 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/15270189 Animalia Chordata Mammalia Carnivora Felidae Panthera Panthera onca ... mike_cove 2018-08-09T20:10:35 CC0_1_0 mike_cove mike_cove NaN NaN 2021-09-23T20:57:45.632Z NaN COORDINATE_ROUNDED
32 2860189171 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/58257138 Animalia Chordata Mammalia Carnivora Felidae Panthera Panthera onca ... Kate Rothra Fleming 2020-09-01T16:56:35 CC_BY_NC_4_0 Kate Rothra Fleming Kate Rothra Fleming NaN NaN 2021-09-23T21:14:12.219Z StillImage COORDINATE_ROUNDED
38 2850700339 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/55493722 Animalia Chordata Mammalia Carnivora Felidae Panthera Panthera onca ... Osa Conservation 2020-08-05T14:47:40 CC_BY_NC_4_0 Osa Conservation Osa Conservation NaN NaN 2021-09-23T21:08:25.102Z StillImage COORDINATE_ROUNDED
57 2802770349 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/15255648 Animalia Chordata Mammalia Carnivora Felidae Panthera Panthera onca ... James Telford 2018-08-09T07:22:08 CC_BY_4_0 James Telford James Telford NaN NaN 2021-09-23T20:58:14.603Z NaN COORDINATE_ROUNDED
61 2629043484 09d2da7b-4699-4e45-b0da-73c982660c98 urn:catalog:KU:KUM:145971:08353198cc55c65471bf... Animalia Chordata Mammalia Carnivora Felidae Panthera Panthera onca ... Consuelo Lorenzo & Jorge Bolaños NaN CC_BY_4_0 Comisión Nacional para el Conocimiento y Uso d... NO DISPONIBLE NaN NaN 2021-09-23T18:42:40.301Z NaN TYPE_STATUS_INVALID;OCCURRENCE_STATUS_INFERRED...

5 rows × 50 columns

# Select the rows corresponding to jaguars (Panthera onca) or pumas (Puma concolor)
jaguares_pumas = felinos[(felinos["species"] == "Panthera onca") | (felinos["species"] == "Puma concolor")]

# Display the first 10 records
jaguares_pumas.head(10)
gbifID datasetKey occurrenceID kingdom phylum class order family genus species ... identifiedBy dateIdentified license rightsHolder recordedBy typeStatus establishmentMeans lastInterpreted mediaType issue
0 3337559907 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/90794984 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Marvin López M. 2021-08-11T17:02:57 CC_BY_NC_4_0 Marvin López M. Marvin López M. NaN NaN 2021-09-23T21:26:16.096Z StillImage COORDINATE_ROUNDED
1 3333401669 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/88270427 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Tiziano Luka Pesci Rubilar 2021-07-23T17:52:04 CC_BY_NC_4_0 Rebeca Quirós Rebeca Quirós NaN NaN 2021-09-23T21:15:51.507Z NaN COORDINATE_ROUNDED
6 3302057398 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/81502744 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Michelle Monge-Velazquez 2021-06-04T00:10:47 CC_BY_NC_4_0 Michelle Monge-Velazquez Michelle Monge-Velazquez NaN NaN 2021-09-23T21:15:55.933Z StillImage COORDINATE_ROUNDED
10 3097275563 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/73407648 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Jaime Marcelo Aranda Sánchez 2021-04-09T22:42:23 CC_BY_NC_4_0 nubegris nubegris NaN NaN 2021-09-23T21:15:03.814Z StillImage COORDINATE_ROUNDED
11 3079910798 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/73053113 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... gernotkunz 2021-04-05T21:48:18 CC_BY_NC_4_0 gernotkunz gernotkunz NaN NaN 2021-09-23T21:24:37.211Z NaN COORDINATE_ROUNDED
12 3079872785 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/73053107 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... gernotkunz 2021-04-05T21:48:17 CC_BY_NC_4_0 gernotkunz gernotkunz NaN NaN 2021-09-23T21:23:35.478Z NaN COORDINATE_ROUNDED
13 3067612232 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/66718421 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Jeff Mollenhauer 2020-12-18T01:09:50 CC_BY_NC_4_0 Jeff Mollenhauer Jeff Mollenhauer NaN NaN 2021-09-23T21:09:44.468Z StillImage COORDINATE_ROUNDED
17 3031700803 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/68067200 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... Marian Paniagua 2021-01-14T20:04:58 CC_BY_NC_4_0 Marian Paniagua Marian Paniagua NaN NaN 2021-09-23T21:14:47.597Z StillImage COORDINATE_ROUNDED
20 3008566753 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/66811638 Animalia Chordata Mammalia Carnivora Felidae Puma Puma concolor ... jayras 2020-12-20T05:44:17 CC_BY_NC_4_0 Pacho Gutierrez Pacho Gutierrez NaN NaN 2021-09-23T21:14:42.430Z StillImage COORDINATE_ROUNDED
21 3008449314 50c9509d-22c7-4a22-a47d-8c48425ef4a7 https://www.inaturalist.org/observations/15270189 Animalia Chordata Mammalia Carnivora Felidae Panthera Panthera onca ... mike_cove 2018-08-09T20:10:35 CC0_1_0 mike_cove mike_cove NaN NaN 2021-09-23T20:57:45.632Z NaN COORDINATE_ROUNDED

10 rows × 50 columns
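Row filters like the ones above are boolean masks, so they can be combined with `|` (or) and `&` (and), with each condition wrapped in parentheses. When testing one column against several values, `isin()` is a more compact alternative. The sketch below uses a small made-up DataFrame named `sample`; the names `recent_jaguars` and the cutoff year are assumptions for the example.

```python
import pandas as pd

# Illustrative data
sample = pd.DataFrame({
    "species": ["Puma concolor", "Panthera onca", "Leopardus pardalis", "Panthera onca"],
    "year": [2021, 2018, 1965, 2020],
})

# Equivalent to chaining two == comparisons with |
jaguares_pumas = sample[sample["species"].isin(["Panthera onca", "Puma concolor"])]

# Two conditions that must both hold, combined with &
recent_jaguars = sample[(sample["species"] == "Panthera onca") & (sample["year"] >= 2020)]

print(len(jaguares_pumas))            # 3
print(recent_jaguars.index.tolist())  # [3]
```

Note that the filtered rows keep their original index labels, as in the outputs above.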

Analysis operations

Plotting

Loading libraries

import matplotlib.pyplot as plt # plotting library
# Jupyter magic command: render the plots inline in the notebook
%matplotlib inline

import calendar # calendar utilities (used below for month names)

Plot style

# Use the ggplot style for all plots
plt.style.use('ggplot')

Example plots

Distribution of presence records by year

# Convert the date field to the datetime type
felinos["eventDate"] = pd.to_datetime(felinos["eventDate"])

# Group the records by year
felinos_registros_x_anio = felinos.groupby(felinos['eventDate'].dt.year).count().eventDate

felinos_registros_x_anio
eventDate
1839.0     6
1928.0     2
1931.0     1
1932.0     2
1933.0     1
1939.0     1
1954.0     2
1958.0     1
1963.0     2
1964.0     1
1965.0     2
1967.0     2
1970.0     1
1993.0     2
2002.0     1
2005.0     1
2007.0     1
2008.0     1
2009.0     6
2010.0     3
2011.0     2
2012.0     6
2013.0     6
2014.0     3
2015.0    10
2016.0     8
2017.0    15
2018.0     4
2019.0    20
2020.0    23
2021.0    12
Name: eventDate, dtype: int64
# Type of the returned object
type(felinos_registros_x_anio)
pandas.core.series.Series
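The same kind of per-year count can be obtained with `value_counts()`, and a small example also shows why the years above appear as floats (1839.0, 1928.0, ...): records without a date become NaT, and the year of a column containing NaT is stored as a float. This is a sketch on a made-up Series named `dates`, not a replacement for the groupby call above.

```python
import pandas as pd

# Illustrative dates, including one missing value
dates = pd.to_datetime(pd.Series(["2021-08-11", "2021-07-15", "2015-12-05", None]))

# The missing date yields NaN, so the years are floats
years = dates.dt.year

# Count records per year; missing values are dropped by default
per_year = years.value_counts().sort_index()

print(years.dtype)        # float64
print(per_year.tolist())  # [1, 2]
```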
# Convert the Series to a DataFrame
felinos_registros_x_anio_df = pd.DataFrame({'anio': felinos_registros_x_anio.index, 'registros': felinos_registros_x_anio.values})

# Convert the year column from float to integer
felinos_registros_x_anio_df["anio"] = pd.to_numeric(felinos_registros_x_anio_df["anio"], downcast='integer')

felinos_registros_x_anio_df
    anio  registros
0   1839          6
1   1928          2
2   1931          1
3   1932          2
4   1933          1
5   1939          1
6   1954          2
7   1958          1
8   1963          2
9   1964          1
10  1965          2
11  1967          2
12  1970          1
13  1993          2
14  2002          1
15  2005          1
16  2007          1
17  2008          1
18  2009          6
19  2010          3
20  2011          2
21  2012          6
22  2013          6
23  2014          3
24  2015         10
25  2016          8
26  2017         15
27  2018          4
28  2019         20
29  2020         23
30  2021         12
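An alternative way to turn such a Series into a two-column DataFrame is `reset_index()`. Because the Series and its index share the name eventDate, both are renamed first; otherwise reset_index() would try to create two columns with the same name. The sketch below uses the first three counts from the output above, and the names `counts` and `table` are chosen for the example.

```python
import pandas as pd

# First three year counts from the output above
counts = pd.Series(
    [6, 2, 1],
    index=pd.Index([1839.0, 1928.0, 1931.0], name="eventDate"),
    name="eventDate",
)

# Rename the values and the index, then move the index into a column
table = counts.rename("registros").rename_axis("anio").reset_index()

# Years were stored as floats; convert them to integers
table["anio"] = table["anio"].astype(int)

print(list(table.columns))     # ['anio', 'registros']
print(table["anio"].tolist())  # [1839, 1928, 1931]
```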
# Bar chart
felinos_registros_x_anio_df.plot(x='anio', y='registros', kind='bar', figsize=(12,7), color='red')

# Title and axis labels
plt.title('Presence records of Felidae (felines) in Costa Rica by year', fontsize=20)
plt.xlabel('Year', fontsize=16)
plt.ylabel('Number of records', fontsize=16);
[Figure: bar chart of Felidae presence records in Costa Rica by year]

Distribution of presence records by month

# Group the records by month
felinos_registros_x_mes = felinos.groupby(felinos['eventDate'].dt.month).count().eventDate

felinos_registros_x_mes
eventDate
1.0     28
2.0     15
3.0     21
4.0      8
5.0      9
6.0     15
7.0     13
8.0      9
9.0      5
10.0     8
11.0     2
12.0    15
Name: eventDate, dtype: int64
# Replace the month number with the month name
felinos_registros_x_mes.index = [calendar.month_name[int(mes)] for mes in felinos_registros_x_mes.index]

felinos_registros_x_mes
January      28
February     15
March        21
April         8
May           9
June         15
July         13
August        9
September     5
October       8
November      2
December     15
Name: eventDate, dtype: int64
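In this dataset every month has at least one record. If some month had none, it would simply be absent from the grouped result. The following sketch shows one way to keep all twelve months by reindexing and filling the gaps with 0; the three counts in `counts` are taken from the output above and the remaining months are deliberately left out for illustration.

```python
import calendar

import pandas as pd

# Counts for only three months (January, February, November)
counts = pd.Series([28, 15, 2], index=[1.0, 2.0, 11.0])

# Use integer month numbers, then add the missing months with a count of 0
counts.index = counts.index.astype(int)
complete = counts.reindex(range(1, 13), fill_value=0)

# Replace each month number with its name
complete.index = [calendar.month_name[m] for m in complete.index]

print(len(complete))     # 12
print(complete.iloc[0])  # 28
print(complete.iloc[2])  # 0
```

The month names produced by `calendar.month_name` depend on the active locale; they are in English by default.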
# Bar chart
felinos_registros_x_mes.plot(kind='bar', figsize=(12,7), color='blue', alpha=0.5)

# Title and axis labels
plt.title('Presence records of Felidae (felines) in Costa Rica by month', fontsize=20)
plt.xlabel('Month', fontsize=16)
plt.ylabel('Number of records', fontsize=16);
[Figure: bar chart of Felidae presence records in Costa Rica by month]

Plotting on a timeline

# Group the records by date
registros_x_fecha = felinos.groupby(felinos['eventDate'].dt.date).count().eventDate

registros_x_fecha
eventDate
1839-01-01    6
1928-01-01    2
1931-05-29    1
1932-01-01    1
1932-06-01    1
             ..
2021-06-03    1
2021-06-23    1
2021-07-01    1
2021-07-15    1
2021-08-11    1
Name: eventDate, Length: 135, dtype: int64
# Line chart
registros_x_fecha.plot(figsize=(20,8), color='blue')

# Title, axis labels and legend
plt.title('Presence records of Felidae (felines) in Costa Rica by date', fontsize=20)
plt.xlabel('Date', fontsize=16)
plt.ylabel('Number of records', fontsize=16)
plt.legend();
[Figure: line chart of Felidae presence records in Costa Rica by date]
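Grouping by `dt.date` produces an index of plain Python date objects. If the counts are kept on a datetime index instead, they can be re-aggregated to coarser periods with `resample()`. The sketch below is one possible approach on a made-up Series named `dates`; it is not the method used above, and "MS" (month start) is just one of the available frequencies.

```python
import pandas as pd

# Illustrative event dates; one date appears twice
dates = pd.to_datetime(pd.Series(["2021-06-03", "2021-06-23", "2021-06-23", "2021-08-11"]))

# Count records per day, keeping a DatetimeIndex
per_day = dates.dt.floor("D").value_counts().sort_index()

# Re-aggregate to monthly totals; months without records get 0
per_month = per_day.resample("MS").sum()

print(per_day.tolist())    # [1, 2, 1]
print(per_month.tolist())  # [3, 0, 1]
```

A series aggregated this way can be plotted with the same `plot()` call used for `registros_x_fecha`.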