Python Pandas Series

A Series is a one-dimensional, ordered collection of values that share a single dtype and can be looked up by index label or by position.

The type is fully specified as pandas.core.series.Series.


Example

import pandas as pd

pd.Series(["foo", "bar", "baz"])  # 0   foo
                                  # 1   bar
                                  # 2   baz
                                  # dtype: object


Data Model

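A Series can be instantiated with most ordered iterables, such as a list, tuple, range, or numpy.ndarray. Unordered collections such as set are rejected with a TypeError.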
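As a minimal sketch of this, the following builds the same Series from three different iterables. The variable names are illustrative only.

```python
import pandas as pd

from_list = pd.Series([1, 2, 3])
from_tuple = pd.Series((1, 2, 3))
from_range = pd.Series(range(1, 4))

# All three hold the same values under the same default index.
print(from_list.equals(from_tuple))  # True
print(from_list.equals(from_range))  # True
```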

Index

By default, a Series is indexed by sequential integers beginning at 0.

Mappings such as dict are interpreted as pairs of index labels and values.

d = {"First": "foo", "Second": "bar", "Third": "baz"}
s = pd.Series(d)  # First    foo
                  # Second   bar
                  # Third    baz
                  # dtype: object

A second iterable can be passed as explicit index labels, either positionally or by keyword. The three constructor calls below are equivalent.

d = ["foo", "bar", "baz"]
i = ["First", "Second", "Third"]
s = pd.Series(d, i)
s = pd.Series(d, index=i)
s = pd.Series(data=d, index=i)

DType

A Series whose values do not share a consistent data type will initialize with a dtype of object. Otherwise a more specific dtype is inferred, such as int64, float64, or bool.
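A short sketch of dtype inference, assuming default pandas settings. The variable names are illustrative only.

```python
import pandas as pd

ints = pd.Series([1, 2, 3])
floats = pd.Series([1.0, 2.5, 3.0])
bools = pd.Series([True, False])
mixed = pd.Series([1, "two", 3.0])  # no common type

print(ints.dtype)    # int64
print(floats.dtype)  # float64
print(bools.dtype)   # bool
print(mixed.dtype)   # object
```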

Dunder Methods

Series objects support the dunder methods implied by a sequence, e.g. __len__ and __iter__, so built-in functions such as len() and sorted() work on them.

They also support mathematical operators as member-wise operations.

Note that these operations return a new Series, rather than mutating the data in-place.
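The following sketch illustrates these points on a small integer Series. The variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, 3])

print(len(s))  # 3

# sorted() iterates over the values and returns a plain list, not a Series.
descending = sorted(s, reverse=True)
print(type(descending).__name__)  # list

# Operators apply member-wise and return a new Series.
doubled = s * 2
print(doubled.tolist())  # [2, 4, 6]
print(s.tolist())        # [1, 2, 3]  (the original is unchanged)
```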


Attributes

Attribute   Meaning
axes        list containing the index attribute's value
iloc        indexable accessor of member values by integer position
index       Index containing the member labels (a RangeIndex by default)
is_unique   bool representing whether all member values are unique
hasnans     bool representing whether any member values are NaN
loc         indexable accessor of member values by index label
shape       tuple of 1 int representing the number of member values
size        int count of member values
values      numpy.ndarray containing the member values
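A brief sketch of several of these attributes, using a small labeled Series invented for illustration:

```python
import pandas as pd

s = pd.Series([10.0, 20.0, 20.0], index=["a", "b", "c"])

print(s.loc["b"])     # 20.0  (lookup by label)
print(s.iloc[0])      # 10.0  (lookup by position)
print(s.shape)        # (3,)
print(s.size)         # 3
print(s.is_unique)    # False (20.0 appears twice)
print(s.hasnans)      # False
print(list(s.index))  # ['a', 'b', 'c']
```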


Methods

Methods that reduce a Series to a single value (such as max, mean, and sum) return NumPy scalars, typically numpy.float64, unless otherwise specified.

Method        Meaning                                                        Example
add           return a new Series with N added to each member value          s.add(10)
apply         return a new Series with f applied to each member value        s.apply(len)
copy          return a copy of the Series
count         return an int count of non-missing member values
describe      return a new Series containing descriptive statistics
div           return a new Series with each member value divided by N        s.div(10)
get           return the value at an index, or a default                     s.get("foo", default=None)
head          return a Series of the first N member values                   s.head(5)
info          print information including types and null values
map           return a new Series with mapped values or NaN                  s.map({True: 1})
max           return greatest value
mean          return mean value
median        return median value
min           return least value
mode          return a new Series containing the modal value(s)
mul           return a new Series with each member value multiplied by N     s.mul(10)
product       return product from multiplying all member values
sort_index    return a new Series sorted by indices                          s.sort_index(ascending=True)
sort_values   return a new Series sorted by values                           s.sort_values(ascending=True)
std           return standard deviation of values
sub           return a new Series with N subtracted from each member value   s.sub(10)
sum           return sum from adding all member values
tail          return a Series of the last N member values                    s.tail(5)
value_counts  return a new Series containing counts of unique values
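A few of the methods above can be sketched as follows. The data and variable names are invented for illustration.

```python
import pandas as pd

s = pd.Series([3, 1, 2])

# Sorting returns a new Series; the original keeps its order.
ordered = s.sort_values()
print(ordered.tolist())  # [1, 2, 3]
print(s.tolist())        # [3, 1, 2]

print(s.add(10).tolist())  # [13, 11, 12]
print(s.sum())             # 6
print(s.mean())            # 2.0

words = pd.Series(["foo", "quux"])
print(words.apply(len).tolist())  # [3, 4]

# Values missing from the mapping become NaN.
mapped = words.map({"foo": 1})
print(mapped.isna().tolist())  # [False, True]
```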

Note that the get method can take a list of indices to look up. A new Series will be returned if all indices exist, and the singleton default will be returned otherwise.
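A minimal sketch of this behavior, using a small labeled Series invented for illustration:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

print(s.get("a"))             # 1
print(s.get("z", default=0))  # 0

# Every label exists, so a new Series is returned.
print(s.get(["a", "b"]).tolist())  # [1, 2]

# One label is missing, so the single default is returned instead.
print(s.get(["a", "z"], default="missing"))  # missing
```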

The describe method returns a Series indexed by statistic name, so individual statistics can be looked up by label.

s = pd.Series([1,2,3,4,5,6,7,8,9,10])
s.describe()  # count    10.00000
              # mean      5.50000
              # std       3.02765
              # min       1.00000
              # 25%       3.25000
              # 50%       5.50000
              # 75%       7.75000
              # max      10.00000
              # dtype: float64

s.describe().loc["75%"]  # 7.75

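The Series created by the value_counts method is indexed by the unique values and holds their counts (or proportions, with normalize=True), which makes it particularly useful for frequency analysis.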

text = "Call me Ishmael. Some years ago—never mind how long precisely—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world."
words = text.lower().replace('.', '').replace(',', '').replace('—', ' ').split()
s = pd.Series(words)

s.value_counts(ascending=True).head()  # call          1
                                       # nothing       1
                                       # particular    1
                                       # to            1
                                       # interest      1
                                       # Name: count, dtype: int64

s.value_counts(normalize=True).head()  # and       0.046512
                                       # the       0.046512
                                       # i         0.046512
                                       # little    0.046512
                                       # me        0.046512
                                       # Name: proportion, dtype: float64



Python/Pandas/Series (last edited 2024-01-16 03:42:53 by DominicRicottone)