pandas: a package for data analysis and manipulation¶
Overview¶
pandas is a Python library for data analysis and manipulation. It provides data structures and operations for handling numerical tables and time series. It was created by Wes McKinney in 2008. The name "pandas" refers both to "Panel Data" and to "Python Data Analysis".
As its main structure, pandas implements the DataFrame, a rectangular array of data organized in rows and columns.
Loading¶
# pandas is conventionally imported with the alias pd
import pandas as pd
Data structures¶
The two main data structures in pandas are Series and DataFrame.
Series¶
Series are one-dimensional arrays that can hold data of any type. They resemble a single column of a table.
# Definition of a series
primos = [2, 3, 5, 7, 11]
serie_primos = pd.Series(primos)
serie_primos
0 2
1 3
2 5
3 7
4 11
dtype: int64
Each element of a series has an index (i.e., a position), starting at 0.
# First element
print("First element:", serie_primos[0])
# Second element
print("Second element:", serie_primos[1])
First element: 2
Second element: 3
Index entries can also carry custom labels.
# Index of a series with custom labels
serie_primos = pd.Series(primos, index = ["A", "B", "C", "D", "E"])
serie_primos
A 2
B 3
C 5
D 7
E 11
dtype: int64
# Element at index "D"
print(serie_primos["D"])
7
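Once a series has custom labels, it helps to distinguish access by label from access by position. The following is a minimal sketch that reuses the `primos` list from above; the variable names `por_etiqueta`, `por_posicion`, `rango_etiquetas` and `rango_posiciones` are introduced here only for illustration.

```python
import pandas as pd

# Same data as in the example above
primos = [2, 3, 5, 7, 11]
serie_primos = pd.Series(primos, index=["A", "B", "C", "D", "E"])

# Label-based access with .loc
por_etiqueta = serie_primos.loc["D"]

# Position-based access with .iloc (positions start at 0)
por_posicion = serie_primos.iloc[3]

# Label slices include both endpoints; position slices exclude the end
rango_etiquetas = serie_primos.loc["B":"D"]
rango_posiciones = serie_primos.iloc[1:3]

print(por_etiqueta, por_posicion)
print(rango_etiquetas.tolist())
print(rango_posiciones.tolist())
```

Using `.loc` and `.iloc` explicitly avoids ambiguity about whether a key is meant as a label or as a position.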
DataFrames¶
DataFrames are two-dimensional structures. A series can be seen as one column of a table, and a dataframe as the whole table. A dataframe can be built from several series.
# Dataframe built from two columns (given here as lists in a dictionary)
datos = {
"pais": ["PA", "CR", "NI"],
"poblacion": [4.1, 5.0, 6.6]
}
paises = pd.DataFrame(datos)
paises
| | pais | poblacion |
---|---|---|
0 | PA | 4.1 |
1 | CR | 5.0 |
2 | NI | 6.6 |
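The example above passes plain lists to the constructor. As a small sketch of the statement that a dataframe can be built from several series, the same table can be assembled from two `pd.Series` objects; the names `pais`, `poblacion_serie` and `paises_desde_series` are assumptions made for this illustration.

```python
import pandas as pd

# Two series sharing the same default index (0, 1, 2)
pais = pd.Series(["PA", "CR", "NI"])
poblacion_serie = pd.Series([4.1, 5.0, 6.6])

# Each dictionary key becomes a column name
paises_desde_series = pd.DataFrame({"pais": pais, "poblacion": poblacion_serie})

# Each column of the resulting dataframe is itself a Series
columna = paises_desde_series["poblacion"]
print(type(columna).__name__)
print(paises_desde_series.shape)
```

When the series have different indexes, pandas aligns them by label and fills the gaps with missing values, so it is worth checking the indexes before combining.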
The loc indexer returns one or more rows of a dataframe, given an index label or a list of index labels.
# Second row
paises.loc[1]
pais CR
poblacion 5.0
Name: 1, dtype: object
# Second and third rows
paises.loc[[1, 2]]
| | pais | poblacion |
---|---|---|
1 | CR | 5.0 |
2 | NI | 6.6 |
The index of a dataframe can also be labeled:
paises = pd.DataFrame(datos, index=["pais0", "pais1", "pais2"])
paises
| | pais | poblacion |
---|---|---|
pais0 | PA | 4.1 |
pais1 | CR | 5.0 |
pais2 | NI | 6.6 |
# Row at "pais0"
paises.loc["pais0"]
pais PA
poblacion 4.1
Name: pais0, dtype: object
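With a labeled index, rows can still be reached by position through `iloc`, and `loc` also accepts a column name to pick a single cell. This is a brief sketch under those assumptions; `fila_etiqueta`, `fila_posicion` and `poblacion_cr` are illustrative names.

```python
import pandas as pd

datos = {"pais": ["PA", "CR", "NI"], "poblacion": [4.1, 5.0, 6.6]}
paises = pd.DataFrame(datos, index=["pais0", "pais1", "pais2"])

# Row selected by its label
fila_etiqueta = paises.loc["pais1"]

# The same row selected by its position (second row, position 1)
fila_posicion = paises.iloc[1]

# A single cell: row label first, column name second
poblacion_cr = paises.loc["pais1", "poblacion"]

print(fila_etiqueta.equals(fila_posicion))
print(poblacion_cr)
```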
Basic operations¶
This section describes some of the basic pandas functions, with examples.
The examples use a dataset of presence records of felines (family Felidae) in Costa Rica, obtained through a query to the GBIF portal.
read_csv() - loading data¶
felinos = pd.read_csv("https://raw.githubusercontent.com/pf3311-cienciadatosgeoespaciales/2021-iii/main/contenido/b/datos/felinos.csv", sep="\t")
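The `sep="\t"` argument tells `read_csv()` that the fields are separated by tabs rather than commas. The sketch below shows the same call on a small in-memory text so that it runs without network access; the two records in `texto` are made-up placeholders, not rows from the GBIF file.

```python
import io

import pandas as pd

# Hypothetical tab-separated content standing in for a downloaded file
texto = (
    "gbifID\tspecies\tyear\n"
    "1\tPuma concolor\t2021\n"
    "2\tPanthera onca\t2018\n"
)

# read_csv() accepts file-like objects as well as paths and URLs
ejemplo = pd.read_csv(io.StringIO(texto), sep="\t")

print(ejemplo.shape)
print(ejemplo.columns.tolist())
```

If `sep` were left at its default (a comma), each line of a tab-separated file would be read as one single column.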
info() - general information about a dataset¶
felinos.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 50 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 gbifID 150 non-null int64
1 datasetKey 150 non-null object
2 occurrenceID 147 non-null object
3 kingdom 150 non-null object
4 phylum 150 non-null object
5 class 150 non-null object
6 order 150 non-null object
7 family 150 non-null object
8 genus 150 non-null object
9 species 150 non-null object
10 infraspecificEpithet 18 non-null object
11 taxonRank 150 non-null object
12 scientificName 150 non-null object
13 verbatimScientificName 150 non-null object
14 verbatimScientificNameAuthorship 16 non-null object
15 countryCode 150 non-null object
16 locality 41 non-null object
17 stateProvince 132 non-null object
18 occurrenceStatus 150 non-null object
19 individualCount 27 non-null float64
20 publishingOrgKey 150 non-null object
21 decimalLatitude 150 non-null float64
22 decimalLongitude 150 non-null float64
23 coordinateUncertaintyInMeters 129 non-null float64
24 coordinatePrecision 0 non-null float64
25 elevation 3 non-null float64
26 elevationAccuracy 3 non-null float64
27 depth 3 non-null float64
28 depthAccuracy 3 non-null float64
29 eventDate 148 non-null object
30 day 147 non-null float64
31 month 148 non-null float64
32 year 148 non-null float64
33 taxonKey 150 non-null int64
34 speciesKey 150 non-null int64
35 basisOfRecord 150 non-null object
36 institutionCode 137 non-null object
37 collectionCode 150 non-null object
38 catalogNumber 150 non-null object
39 recordNumber 4 non-null object
40 identifiedBy 121 non-null object
41 dateIdentified 109 non-null object
42 license 150 non-null object
43 rightsHolder 132 non-null object
44 recordedBy 135 non-null object
45 typeStatus 0 non-null float64
46 establishmentMeans 10 non-null object
47 lastInterpreted 150 non-null object
48 mediaType 102 non-null object
49 issue 137 non-null object
dtypes: float64(13), int64(3), object(34)
memory usage: 58.7+ KB
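The "Non-Null Count" column above shows how many values are present in each column. When the quantity of interest is the number of missing values, one possible approach is `isna().sum()`. The sketch below uses a small made-up dataframe (`ejemplo`) rather than the felines dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical records with some missing values
ejemplo = pd.DataFrame({
    "species": ["Puma concolor", "Panthera onca", "Leopardus pardalis"],
    "locality": ["La Selva", None, None],
    "elevation": [np.nan, 120.0, np.nan],
})

# isna() marks missing cells with True; sum() counts them per column
faltantes = ejemplo.isna().sum()
print(faltantes)
```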
head(), tail(), sample() - displaying rows of a dataset¶
# First 5 records (head() returns 5 rows by default)
felinos.head()
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3337559907 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/90794984 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Marvin López M. | 2021-08-11T17:02:57 | CC_BY_NC_4_0 | Marvin López M. | Marvin López M. | NaN | NaN | 2021-09-23T21:26:16.096Z | StillImage | COORDINATE_ROUNDED |
1 | 3333401669 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/88270427 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Tiziano Luka Pesci Rubilar | 2021-07-23T17:52:04 | CC_BY_NC_4_0 | Rebeca Quirós | Rebeca Quirós | NaN | NaN | 2021-09-23T21:15:51.507Z | NaN | COORDINATE_ROUNDED |
2 | 3325502794 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/85490861 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Sofía Pastor Parajeles | 2021-07-03T16:59:46 | CC_BY_NC_4_0 | Sofía Pastor Parajeles | Sofía Pastor Parajeles | NaN | NaN | 2021-09-23T20:59:21.345Z | StillImage | COORDINATE_ROUNDED |
3 | 3314547422 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/84224884 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Sofía Pastor Parajeles | 2021-06-23T20:39:47 | CC_BY_NC_4_0 | Sofía Pastor Parajeles | Sofía Pastor Parajeles | NaN | NaN | 2021-09-23T21:25:26.648Z | StillImage | NaN |
4 | 3307298689 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/82810053 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus wiedii | ... | David | 2021-06-13T09:10:50 | CC_BY_NC_4_0 | David | David | NaN | NaN | 2021-09-23T21:25:57.478Z | StillImage | COORDINATE_ROUNDED |
5 rows × 50 columns
# Last 5 records (tail() returns 5 rows by default)
felinos.tail()
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
145 | 439779436 | 7e2989f0-f762-11e1-a439-00145eb45e9a | NaN | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | NaN | NaN | CC_BY_4_0 | NaN | NaN | NaN | NaN | 2021-09-24T03:32:37.674Z | StillImage | GEODETIC_DATUM_ASSUMED_WGS84;RECORDED_DATE_INV... |
146 | 45869665 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:13378 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus tigrinus | ... | NaN | NaN | CC0_1_0 | NaN | Gardner, Alfred L. | NaN | NATIVE | 2021-09-24T06:03:59.587Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
147 | 45869664 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:13377 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus tigrinus | ... | NaN | NaN | CC0_1_0 | NaN | Gardner, Alfred L. | NaN | NATIVE | 2021-09-24T06:03:59.587Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
148 | 45869301 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:10219 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | NaN | NaN | CC0_1_0 | NaN | Arnold, Keith A. | NaN | NATIVE | 2021-09-24T06:03:59.952Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
149 | 45869265 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:9289 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | NaN | NaN | CC0_1_0 | NaN | NaN | NaN | NATIVE | 2021-09-24T06:03:59.933Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
5 rows × 50 columns
# 5 randomly selected records
felinos.sample(5)
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
126 | 1145891880 | 0daed095-478a-4af6-abf5-18acb790fbb2 | http://arctos.database.museum/guid/MVZ:Mamm:95... | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Museum of Vertebrate Zoology, University of Ca... | 1999-01-27T00:00:00 | CC0_1_0 | NaN | Collector(s): Harvey E. Stork | NaN | NaN | 2021-09-25T18:06:59.268Z | NaN | OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COU... |
29 | 2864895066 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/59866711 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Michele Chiacchio | 2020-09-17T12:34:30 | CC_BY_NC_4_0 | dhfischer | dhfischer | NaN | NaN | 2021-09-23T21:14:11.862Z | StillImage | COORDINATE_ROUNDED |
143 | 476843829 | 4bfac3ea-8763-4f4b-a71a-76a6f5f243d3 | MCZ:Mamm:BOM-5358 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus wiedii | ... | [no agent data] | NaN | CC_BY_NC_4_0 | President and Fellows of Harvard College | William More Gabb | NaN | NaN | 2021-10-01T01:54:06.578Z | NaN | OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COU... |
74 | 2574073231 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/31417751 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Michelle Monge-Velazquez | 2019-08-24T18:15:37 | CC_BY_NC_4_0 | Michelle Monge-Velazquez | Michelle Monge-Velazquez | NaN | NaN | 2021-09-23T21:11:18.353Z | StillImage | COORDINATE_ROUNDED |
97 | 2236857888 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/22419536 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | Lena Struwe | 2019-04-13T17:26:49 | CC_BY_NC_4_0 | Lena Struwe | Lena Struwe | NaN | NaN | 2021-09-23T21:19:43.762Z | StillImage | COORDINATE_ROUNDED |
5 rows × 50 columns
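`head()` and `tail()` take an optional number of rows, and `sample()` accepts a `random_state` seed so that the same random rows are returned on every run. The following sketch uses a hypothetical ten-row dataframe (`numeros`) to keep the result easy to check.

```python
import pandas as pd

# Hypothetical ten-row dataframe used only for this illustration
numeros = pd.DataFrame({"n": range(10)})

primeros_3 = numeros.head(3)  # first 3 rows
ultimos_2 = numeros.tail(2)   # last 2 rows

# With a fixed seed, sample() returns the same rows on every call
muestra_a = numeros.sample(3, random_state=42)
muestra_b = numeros.sample(3, random_state=42)

print(primeros_3["n"].tolist())
print(ultimos_2["n"].tolist())
print(muestra_a.equals(muestra_b))
```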
The contents of a dataframe can also be displayed by typing its name in an interactive Python session.
felinos
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3337559907 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/90794984 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Marvin López M. | 2021-08-11T17:02:57 | CC_BY_NC_4_0 | Marvin López M. | Marvin López M. | NaN | NaN | 2021-09-23T21:26:16.096Z | StillImage | COORDINATE_ROUNDED |
1 | 3333401669 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/88270427 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Tiziano Luka Pesci Rubilar | 2021-07-23T17:52:04 | CC_BY_NC_4_0 | Rebeca Quirós | Rebeca Quirós | NaN | NaN | 2021-09-23T21:15:51.507Z | NaN | COORDINATE_ROUNDED |
2 | 3325502794 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/85490861 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Sofía Pastor Parajeles | 2021-07-03T16:59:46 | CC_BY_NC_4_0 | Sofía Pastor Parajeles | Sofía Pastor Parajeles | NaN | NaN | 2021-09-23T20:59:21.345Z | StillImage | COORDINATE_ROUNDED |
3 | 3314547422 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/84224884 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Sofía Pastor Parajeles | 2021-06-23T20:39:47 | CC_BY_NC_4_0 | Sofía Pastor Parajeles | Sofía Pastor Parajeles | NaN | NaN | 2021-09-23T21:25:26.648Z | StillImage | NaN |
4 | 3307298689 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/82810053 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus wiedii | ... | David | 2021-06-13T09:10:50 | CC_BY_NC_4_0 | David | David | NaN | NaN | 2021-09-23T21:25:57.478Z | StillImage | COORDINATE_ROUNDED |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
145 | 439779436 | 7e2989f0-f762-11e1-a439-00145eb45e9a | NaN | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | NaN | NaN | CC_BY_4_0 | NaN | NaN | NaN | NaN | 2021-09-24T03:32:37.674Z | StillImage | GEODETIC_DATUM_ASSUMED_WGS84;RECORDED_DATE_INV... |
146 | 45869665 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:13378 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus tigrinus | ... | NaN | NaN | CC0_1_0 | NaN | Gardner, Alfred L. | NaN | NATIVE | 2021-09-24T06:03:59.587Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
147 | 45869664 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:13377 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus tigrinus | ... | NaN | NaN | CC0_1_0 | NaN | Gardner, Alfred L. | NaN | NATIVE | 2021-09-24T06:03:59.587Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
148 | 45869301 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:10219 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | NaN | NaN | CC0_1_0 | NaN | Arnold, Keith A. | NaN | NATIVE | 2021-09-24T06:03:59.952Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
149 | 45869265 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:9289 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | NaN | NaN | CC0_1_0 | NaN | NaN | NaN | NATIVE | 2021-09-24T06:03:59.933Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
150 rows × 50 columns
Column selection¶
The columns to display from a dataframe can be specified with a list of column names.
# Display the columns with the scientific name, the species, the date, the year, the month and the day
felinos[["scientificName", "species", "eventDate", "year", "month", "day"]]
| | scientificName | species | eventDate | year | month | day |
---|---|---|---|---|---|---|
0 | Puma concolor (Linnaeus, 1771) | Puma concolor | 2021-08-11T10:22:36 | 2021.0 | 8.0 | 11.0 |
1 | Puma concolor (Linnaeus, 1771) | Puma concolor | 2021-07-15T16:22:29 | 2021.0 | 7.0 | 15.0 |
2 | Leopardus pardalis (Linnaeus, 1758) | Leopardus pardalis | 2021-07-01T19:28:25 | 2021.0 | 7.0 | 1.0 |
3 | Leopardus pardalis (Linnaeus, 1758) | Leopardus pardalis | 2021-06-23T10:55:00 | 2021.0 | 6.0 | 23.0 |
4 | Leopardus wiedii (Schinz, 1821) | Leopardus wiedii | 2015-12-05T14:41:42 | 2015.0 | 12.0 | 5.0 |
... | ... | ... | ... | ... | ... | ... |
145 | Puma concolor (Linnaeus, 1771) | Puma concolor | NaN | NaN | NaN | NaN |
146 | Leopardus tigrinus (Schreber, 1775) | Leopardus tigrinus | 1967-05-15T00:00:00 | 1967.0 | 5.0 | 15.0 |
147 | Leopardus tigrinus (Schreber, 1775) | Leopardus tigrinus | 1967-02-01T00:00:00 | 1967.0 | 2.0 | 1.0 |
148 | Leopardus pardalis (Linnaeus, 1758) | Leopardus pardalis | 1965-06-28T00:00:00 | 1965.0 | 6.0 | 28.0 |
149 | Leopardus pardalis (Linnaeus, 1758) | Leopardus pardalis | 1963-02-03T00:00:00 | 1963.0 | 2.0 | 3.0 |
150 rows × 6 columns
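The double brackets in the selection above matter: a list of names returns a dataframe, whereas a single name in single brackets returns a series. A minimal sketch, using a made-up two-row dataframe (`ejemplo`):

```python
import pandas as pd

# Hypothetical records used only for this illustration
ejemplo = pd.DataFrame({
    "species": ["Puma concolor", "Panthera onca"],
    "year": [2021, 2018],
})

# A single column name returns a Series
una_columna = ejemplo["species"]

# A list of column names returns a DataFrame, even with one name
sub_tabla = ejemplo[["species"]]

print(type(una_columna).__name__)
print(type(sub_tabla).__name__)
```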
Row selection¶
# Select the rows corresponding to jaguars (Panthera onca)
jaguares = felinos[felinos["species"] == "Panthera onca"]
# Display the first 5 records
jaguares.head()
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
21 | 3008449314 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/15270189 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | mike_cove | 2018-08-09T20:10:35 | CC0_1_0 | mike_cove | mike_cove | NaN | NaN | 2021-09-23T20:57:45.632Z | NaN | COORDINATE_ROUNDED |
32 | 2860189171 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/58257138 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | Kate Rothra Fleming | 2020-09-01T16:56:35 | CC_BY_NC_4_0 | Kate Rothra Fleming | Kate Rothra Fleming | NaN | NaN | 2021-09-23T21:14:12.219Z | StillImage | COORDINATE_ROUNDED |
38 | 2850700339 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/55493722 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | Osa Conservation | 2020-08-05T14:47:40 | CC_BY_NC_4_0 | Osa Conservation | Osa Conservation | NaN | NaN | 2021-09-23T21:08:25.102Z | StillImage | COORDINATE_ROUNDED |
57 | 2802770349 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/15255648 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | James Telford | 2018-08-09T07:22:08 | CC_BY_4_0 | James Telford | James Telford | NaN | NaN | 2021-09-23T20:58:14.603Z | NaN | COORDINATE_ROUNDED |
61 | 2629043484 | 09d2da7b-4699-4e45-b0da-73c982660c98 | urn:catalog:KU:KUM:145971:08353198cc55c65471bf... | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | Consuelo Lorenzo & Jorge Bolaños | NaN | CC_BY_4_0 | Comisión Nacional para el Conocimiento y Uso d... | NO DISPONIBLE | NaN | NaN | 2021-09-23T18:42:40.301Z | NaN | TYPE_STATUS_INVALID;OCCURRENCE_STATUS_INFERRED... |
5 rows × 50 columns
# Select the rows corresponding to jaguars (Panthera onca) or pumas (Puma concolor)
jaguares_pumas = felinos[(felinos["species"] == "Panthera onca") | (felinos["species"] == "Puma concolor")]
# Display the first 10 records
jaguares_pumas.head(10)
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3337559907 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/90794984 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Marvin López M. | 2021-08-11T17:02:57 | CC_BY_NC_4_0 | Marvin López M. | Marvin López M. | NaN | NaN | 2021-09-23T21:26:16.096Z | StillImage | COORDINATE_ROUNDED |
1 | 3333401669 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/88270427 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Tiziano Luka Pesci Rubilar | 2021-07-23T17:52:04 | CC_BY_NC_4_0 | Rebeca Quirós | Rebeca Quirós | NaN | NaN | 2021-09-23T21:15:51.507Z | NaN | COORDINATE_ROUNDED |
6 | 3302057398 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/81502744 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Michelle Monge-Velazquez | 2021-06-04T00:10:47 | CC_BY_NC_4_0 | Michelle Monge-Velazquez | Michelle Monge-Velazquez | NaN | NaN | 2021-09-23T21:15:55.933Z | StillImage | COORDINATE_ROUNDED |
10 | 3097275563 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/73407648 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Jaime Marcelo Aranda Sánchez | 2021-04-09T22:42:23 | CC_BY_NC_4_0 | nubegris | nubegris | NaN | NaN | 2021-09-23T21:15:03.814Z | StillImage | COORDINATE_ROUNDED |
11 | 3079910798 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/73053113 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | gernotkunz | 2021-04-05T21:48:18 | CC_BY_NC_4_0 | gernotkunz | gernotkunz | NaN | NaN | 2021-09-23T21:24:37.211Z | NaN | COORDINATE_ROUNDED |
12 | 3079872785 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/73053107 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | gernotkunz | 2021-04-05T21:48:17 | CC_BY_NC_4_0 | gernotkunz | gernotkunz | NaN | NaN | 2021-09-23T21:23:35.478Z | NaN | COORDINATE_ROUNDED |
13 | 3067612232 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/66718421 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Jeff Mollenhauer | 2020-12-18T01:09:50 | CC_BY_NC_4_0 | Jeff Mollenhauer | Jeff Mollenhauer | NaN | NaN | 2021-09-23T21:09:44.468Z | StillImage | COORDINATE_ROUNDED |
17 | 3031700803 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/68067200 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Marian Paniagua | 2021-01-14T20:04:58 | CC_BY_NC_4_0 | Marian Paniagua | Marian Paniagua | NaN | NaN | 2021-09-23T21:14:47.597Z | StillImage | COORDINATE_ROUNDED |
20 | 3008566753 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/66811638 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | jayras | 2020-12-20T05:44:17 | CC_BY_NC_4_0 | Pacho Gutierrez | Pacho Gutierrez | NaN | NaN | 2021-09-23T21:14:42.430Z | StillImage | COORDINATE_ROUNDED |
21 | 3008449314 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/15270189 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | mike_cove | 2018-08-09T20:10:35 | CC0_1_0 | mike_cove | mike_cove | NaN | NaN | 2021-09-23T20:57:45.632Z | NaN | COORDINATE_ROUNDED |
10 rows × 50 columns
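When a filter compares one column against several values, chaining conditions with `|` becomes long. One alternative is `isin()`, sketched below on a small made-up dataframe (`ejemplo`); both filters are expected to select the same rows.

```python
import pandas as pd

# Hypothetical records used only for this illustration
ejemplo = pd.DataFrame({
    "species": [
        "Puma concolor",
        "Panthera onca",
        "Leopardus pardalis",
        "Puma concolor",
    ]
})

# Two conditions combined with | (each condition needs its own parentheses)
con_o = ejemplo[(ejemplo["species"] == "Panthera onca") | (ejemplo["species"] == "Puma concolor")]

# Equivalent filter with isin() and a list of accepted values
con_isin = ejemplo[ejemplo["species"].isin(["Panthera onca", "Puma concolor"])]

print(con_o.equals(con_isin))
print(len(con_isin))
```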
Analysis operations¶
Plotting¶
Loading libraries¶
import matplotlib.pyplot as plt # plotting library
%matplotlib inline
import calendar # calendar utilities (used here for month names)
Plot style¶
# Plot style
plt.style.use('ggplot')
Example plots¶
Distribution of presence records by year¶
# Convert the date field to a datetime type
felinos["eventDate"] = pd.to_datetime(felinos["eventDate"])
# Group the records by year
felinos_registros_x_anio = felinos.groupby(felinos['eventDate'].dt.year).count().eventDate
felinos_registros_x_anio
eventDate
1839.0 6
1928.0 2
1931.0 1
1932.0 2
1933.0 1
1939.0 1
1954.0 2
1958.0 1
1963.0 2
1964.0 1
1965.0 2
1967.0 2
1970.0 1
1993.0 2
2002.0 1
2005.0 1
2007.0 1
2008.0 1
2009.0 6
2010.0 3
2011.0 2
2012.0 6
2013.0 6
2014.0 3
2015.0 10
2016.0 8
2017.0 15
2018.0 4
2019.0 20
2020.0 23
2021.0 12
Name: eventDate, dtype: int64
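The years appear as floats (for example `2019.0`) because some records have no date, and `count()` only counts the non-missing dates in each group. The sketch below reproduces the grouping on four made-up dates, one of them missing, and compares it with `value_counts()` as a possible alternative; the names `fechas`, `por_anio_groupby` and `por_anio_value_counts` are illustrative.

```python
import pandas as pd

# Hypothetical dates; the last one is missing and becomes NaT
fechas = pd.to_datetime(pd.Series(["2019-04-13", "2019-08-24", "2020-09-17", None]))
ejemplo = pd.DataFrame({"eventDate": fechas})

# Same pattern as above: group by year and count non-missing dates
por_anio_groupby = ejemplo.groupby(ejemplo["eventDate"].dt.year).count().eventDate

# Alternative: count the occurrences of each year directly
por_anio_value_counts = ejemplo["eventDate"].dt.year.value_counts().sort_index()

print(por_anio_groupby.tolist())
print(por_anio_value_counts.tolist())
```

In both cases the record without a date is left out of the result.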
# Returned data type
type(felinos_registros_x_anio)
pandas.core.series.Series
# Convert the series to a dataframe
felinos_registros_x_anio_df = pd.DataFrame({'anio':felinos_registros_x_anio.index, 'registros':felinos_registros_x_anio.values})
# Convert the year column to an integer type
felinos_registros_x_anio_df["anio"] = pd.to_numeric(felinos_registros_x_anio_df["anio"], downcast='integer')
felinos_registros_x_anio_df
| | anio | registros |
---|---|---|
0 | 1839 | 6 |
1 | 1928 | 2 |
2 | 1931 | 1 |
3 | 1932 | 2 |
4 | 1933 | 1 |
5 | 1939 | 1 |
6 | 1954 | 2 |
7 | 1958 | 1 |
8 | 1963 | 2 |
9 | 1964 | 1 |
10 | 1965 | 2 |
11 | 1967 | 2 |
12 | 1970 | 1 |
13 | 1993 | 2 |
14 | 2002 | 1 |
15 | 2005 | 1 |
16 | 2007 | 1 |
17 | 2008 | 1 |
18 | 2009 | 6 |
19 | 2010 | 3 |
20 | 2011 | 2 |
21 | 2012 | 6 |
22 | 2013 | 6 |
23 | 2014 | 3 |
24 | 2015 | 10 |
25 | 2016 | 8 |
26 | 2017 | 15 |
27 | 2018 | 4 |
28 | 2019 | 20 |
29 | 2020 | 23 |
30 | 2021 | 12 |
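A series of counts can also be turned into a dataframe with `reset_index()`, which moves the index into a regular column. The sketch below is one possible variant, not the method used above; it starts from a three-value series (`conteo`) copied from the first rows of the result.

```python
import pandas as pd

# First three values of the yearly counts shown above
conteo = pd.Series(
    [6, 2, 1],
    index=pd.Index([1839.0, 1928.0, 1931.0], name="eventDate"),
    name="eventDate",
)

conteo_df = (
    conteo.rename_axis("anio")             # name the index "anio"
          .reset_index(name="registros")   # index -> column, values -> "registros"
          .astype({"anio": int})           # years as integers instead of floats
)

print(conteo_df.columns.tolist())
print(conteo_df["anio"].tolist())
```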
# Plotting
felinos_registros_x_anio_df.plot(x='anio', y='registros', kind='bar', figsize=(12,7), color='red')
# Title and axis labels
plt.title('Presence records of Felidae (felines) in Costa Rica by year', fontsize=20)
plt.xlabel('Year', fontsize=16)
plt.ylabel('Number of records', fontsize=16);
Distribution of presence records by month¶
# Group the records by month
felinos_registros_x_mes = felinos.groupby(felinos['eventDate'].dt.month).count().eventDate
felinos_registros_x_mes
eventDate
1.0 28
2.0 15
3.0 21
4.0 8
5.0 9
6.0 15
7.0 13
8.0 9
9.0 5
10.0 8
11.0 2
12.0 15
Name: eventDate, dtype: int64
# Replace each month number with the month name
felinos_registros_x_mes.index = [calendar.month_name[int(x)] for x in felinos_registros_x_mes.index]
felinos_registros_x_mes
January 28
February 15
March 21
April 8
May 9
June 15
July 13
August 9
September 5
October 8
November 2
December 15
Name: eventDate, dtype: int64
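In this dataset every month has at least one record. If a month had none, it would simply be absent from the grouped result. A possible way to keep all twelve months, sketched here on made-up counts (`conteo_mes`) where March is missing, is to reindex before assigning the names.

```python
import calendar

import pandas as pd

# Hypothetical monthly counts in which month 3 (March) has no records
conteo_mes = pd.Series([28, 15, 8], index=[1.0, 2.0, 4.0])

# Use integer month numbers so they can be matched against 1..12
conteo_mes.index = conteo_mes.index.astype(int)

# Add the missing months with a count of 0
completo = conteo_mes.reindex(range(1, 13), fill_value=0)

# Replace month numbers with month names
completo.index = [calendar.month_name[m] for m in completo.index]

print(completo["March"], completo["April"])
```

The month names come from `calendar.month_name`, which follows the current locale; under the default locale they are in English.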
# Bar chart
felinos_registros_x_mes.plot(kind='bar',figsize=(12,7), color='blue', alpha=0.5)
# Title and axis labels
plt.title('Presence records of Felidae (felines) in Costa Rica by month', fontsize=20)
plt.xlabel('Month', fontsize=16)
plt.ylabel('Number of records', fontsize=16);
Plotting on a timeline¶
# Group the records by date
registros_x_fecha = felinos.groupby(felinos['eventDate'].dt.date).count().eventDate
registros_x_fecha
eventDate
1839-01-01 6
1928-01-01 2
1931-05-29 1
1932-01-01 1
1932-06-01 1
..
2021-06-03 1
2021-06-23 1
2021-07-01 1
2021-07-15 1
2021-08-11 1
Name: eventDate, Length: 135, dtype: int64
# Line plot
registros_x_fecha.plot(figsize=(20,8), color='blue')
# Title and axis labels
plt.title('Presence records of Felidae (felines) in Costa Rica by date', fontsize=20)
plt.xlabel('Date',fontsize=16)
plt.ylabel('Number of records',fontsize=16)
plt.legend();