Diff for "Python/Pandas/Series"

Differences between revisions 15 and 17 (spanning 2 versions)

-  ⇤ ← Revision 15 as of 2024-01-16 03:13:10 → 
  Size: 8722
  Editor: DominicRicottone
  Comment: Added method
+   ← Revision 17 as of 2025-12-23 05:06:57 → ⇥
  Size: 0
  Editor: DominicRicottone
  Comment:
= Python Pandas Series =

A '''`Series`''' is an ordered, one-dimensional collection of values that share a single dtype and can be looked up by index.

The [[Python/Pandas/Types|type]] is fully specified as `pandas.core.series.Series`.

<<TableOfContents>>

----



== Example ==

{{{
import pandas as pd

pd.Series(["foo", "bar", "baz"])  # 0   foo
                                  # 1   bar
                                  # 2   baz
                                  # dtype: object
}}}

----



== Data Model ==

A `Series` can be instantiated with any [[Python/Collections/Abc#Iterable|iterable]].
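
As a minimal sketch of this, the same values can be passed as a list, a tuple, or a `range`. The variable names are illustrative only.

```python
import pandas as pd

# Three different iterables that yield the same values.
from_list = pd.Series([1, 2, 3])
from_tuple = pd.Series((1, 2, 3))
from_range = pd.Series(range(1, 4))

# Series.equals compares values, dtype, and index.
print(from_list.equals(from_tuple))  # True
print(from_list.equals(from_range))  # True
```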



=== Index ===

By default, a series is indexed by sequential integers beginning at 0.

Mappings, such as a `dict`, are interpreted as pairs of indices and values: the keys become the index.

{{{
d = {"First": "foo", "Second": "bar", "Third": "baz"}
s = pd.Series(d)  # First    foo
                  # Second   bar
                  # Third    baz
                  # dtype: object
}}}

A second iterable can be specified as explicit indices.

{{{
d = ["foo", "bar", "baz"]
i = ["First", "Second", "Third"]
s = pd.Series(d, i)
s = pd.Series(d, index=i)
s = pd.Series(data=d, index=i)
}}}
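
Once a series has explicit indices, values can be looked up either by label or by position. The following sketch reuses the labels from above and relies on the `loc` and `iloc` accessors listed under Attributes; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series(["foo", "bar", "baz"], index=["First", "Second", "Third"])

by_label = s["Second"]     # label-based lookup
by_loc = s.loc["Second"]   # explicitly label-based lookup
by_position = s.iloc[1]    # position-based lookup (0-indexed)
labels = list(s.index)

print(by_label, by_loc, by_position)  # bar bar bar
print(labels)                         # ['First', 'Second', 'Third']
```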



=== DType ===

A series whose values cannot be represented by a more specific type (e.g. values of mixed types) will initialize with a [[Python/NumPy/Types#ObjectDType|dtype]] of `object`. Other common dtypes include:

 * `int64`
 * `float64`
 * `datetime64`
 * `bool`
 * `category`
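
The dtype is normally inferred from the data. The sketch below shows how the inference might play out for a few small inputs; the exact resolution of the datetime dtype depends on the pandas version, so it is not spelled out here.

```python
import pandas as pd

ints = pd.Series([1, 2, 3])
floats = pd.Series([1.5, 2, 3])          # one float promotes the whole series
bools = pd.Series([True, False])
mixed = pd.Series([1, "two", 3.0])       # no common type
dates = pd.to_datetime(pd.Series(["2024-01-16", "2025-12-23"]))
cats = pd.Series(["a", "b", "a"]).astype("category")

print(ints.dtype)    # int64
print(floats.dtype)  # float64
print(bools.dtype)   # bool
print(mixed.dtype)   # object
print(dates.dtype)   # a datetime64 dtype
print(cats.dtype)    # category
```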



=== Dunder Methods ===

`Series` objects support all of the [[Python/DunderMethod|dunder methods]] implied by a [[Python/Collections/Abc#Sequence|sequence]], so built-in functions such as `len()` and `sorted()` work on them.

They also support mathematical [[Python/Builtins/Operators|operators]] as member-wise operations.

 * `s + 10` adds 10 to each member value of `s`
 * `s - 10` subtracts 10
 * `s * 10` multiplies by 10
 * `s / 10` divides by 10
 * `s // 10` performs floor division by 10

Note that these operations return a new `Series`, rather than mutating the data in-place.
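
A short sketch of the behavior described in this section, using made-up values:

```python
import pandas as pd

s = pd.Series([25, 10, 40])

print(len(s))     # 3
print(sorted(s))  # a plain Python list of the values in ascending order

plus = s + 10
floored = s // 10
divided = s / 10

print(plus.tolist())     # [35, 20, 50]
print(floored.tolist())  # [2, 1, 4]
print(divided.tolist())  # [2.5, 1.0, 4.0]

# The original series is unchanged.
print(s.tolist())        # [25, 10, 40]
```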

----



== Attributes ==

||'''Attribute'''||'''Meaning'''                                                                                                    ||
||`axes`         ||[[Python/Builtins/Types#List|list]] containing the `index` attribute's value                                     ||
||`iloc`         ||[[Python/Pandas/Types#A_ILocIndexer|indexable accessor of member values]]                                        ||
||`index`        ||[[Python/Pandas/Types|Index]] of the member indices (a `RangeIndex` by default)                                  ||
||`is_unique`    ||[[Python/Builtins/Types#Bool|bool]] representing if all member values are unique                                 ||
||`hasnans`      ||`bool` representing if any member values are [[Python/Builtins/Types#Float|NaN]]                                 ||
||`loc`          ||[[Python/Pandas/Types#A_LocIndexer|indexable accessor of member values]]                                         ||
||`shape`        ||[[Python/Builtins/Types#Tuple|tuple]] of 1 [[Python/Builtins/Types#Int|int]] representing number of member values||
||`size`         ||`int` count of member values                                                                                     ||
||`values`       ||[[Python/NumPy/Types#NDArray|numpy.ndarray]] containing the member values                                        ||
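
As a rough illustration, the attributes can be inspected on a small series that contains a duplicate and a missing value (the data is made up):

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 2.0, float("nan")])

print(s.shape)                  # (4,)
print(s.size)                   # 4
print(s.is_unique)              # False (2.0 appears twice)
print(s.hasnans)                # True
print(s.index)                  # RangeIndex(start=0, stop=4, step=1)
print(type(s.values).__name__)  # ndarray
print(len(s.axes))              # 1 (a series has a single axis)
```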

----



== Methods ==

Unless otherwise specified, these methods return a single scalar value; for numeric data this is typically a [[Python/NumPy/Types|numpy.float64]] or a similar NumPy scalar.

||'''Method'''  ||'''Meaning'''                                                                  ||'''Example'''                   ||
||`add`         ||return a new `Series` with `N` added to each member value                      ||`s.add(10)`                     ||
||`apply`       ||return a new `Series` with `f` applied to each member value                    ||`s.apply(len)`                  ||
||`copy`        ||return a copy of the `Series`                                                  ||                                ||
||`count`       ||return a count of member non-missing values                                    ||                                ||
||`describe`    ||return a new `Series` containing descriptive statistics                        ||                                ||
||`div`         ||return a new `Series` with each member value divided by `N`                    ||`s.div(10)`                     ||
||`get`         ||return the value from an index or a default                                    ||`s.get("foo", default=None)`    ||
||`head`        ||return a `Series` of the first `N` member values                               ||`s.head(5)`                     ||
||`info`        ||print information including types and null values                              ||                                ||
||`map`         ||return a new `Series` with mapped values or [[Python/Builtins/Types#Float|NaN]]||`s.map({True: 1})`              ||
||`max`         ||return greatest value                                                          ||                                ||
||`mean`        ||return mean value                                                              ||                                ||
||`median`      ||return median value                                                            ||                                ||
||`min`         ||return least value                                                             ||                                ||
||`mode`        ||return a new `Series` of the modal value(s)                                    ||                                ||
||`mul`         ||return a new `Series` with each member value multiplied by `N`                 ||`s.mul(10)`                     ||
||`product`     ||return product from multiplying all member values                              ||                                ||
||`sort_index`  ||return a new `Series` sorted by indices                                        ||`s.sort_index(ascending=True)`  ||
||`sort_values` ||return a new `Series` sorted by values                                         ||`s.sort_values(ascending=True)` ||
||`std`         ||return standard deviation of values                                            ||                                ||
||`sub`         ||return a new `Series` with `N` subtracted from each member value               ||`s.sub(10)`                     ||
||`sum`         ||return sum from adding all member values                                       ||                                ||
||`tail`        ||return a `Series` of the last `N` member values                                ||`s.tail(5)`                     ||
||`value_counts`||return a new `Series` containing counts of unique values                       ||                                ||

Note that the `get` method can take a list of indices to look up. A new `Series` will be returned if all indices exist, and the singleton default will be returned otherwise.
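
The following sketch illustrates this behavior with a made-up series; the labels and default values are arbitrary.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

print(s.get("a"))              # 1
print(s.get("z", default=-1))  # -1

found = s.get(["a", "b"])      # every index exists: a new Series
missing = s.get(["a", "z"])    # one index is absent: the default (None)

print(found.tolist())          # [1, 2]
print(missing)                 # None
```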

The '''`describe`''' method returns a `Series` indexed by statistic name, so individual statistics ''can'' be looked up by label.

{{{
s = pd.Series([1,2,3,4,5,6,7,8,9,10])
s.describe()  # count    10.00000
              # mean      5.50000
              # std       3.02765
              # min       1.00000
              # 25%       3.25000
              # 50%       5.50000
              # 75%       7.75000
              # max      10.00000
              # dtype: float64

s.describe().loc["75%"]  # 7.75000
}}}

The `Series` created by the `value_counts` method ''is'' particularly useful.

{{{
text = "Call me Ishmael. Some years ago—never mind how long precisely—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world."

# Lowercase, strip punctuation, and split into words
words = text.lower().replace('.', '').replace(',', '').replace('—', ' ').split()
s = pd.Series(words)

s.value_counts(ascending=True).head()  # call          1
                                       # nothing       1
                                       # particular    1
                                       # to            1
                                       # interest      1
                                       # Name: count, dtype: int64

s.value_counts(normalize=True).head()  # and       0.046512
                                       # the       0.046512
                                       # i         0.046512
                                       # little    0.046512
                                       # me        0.046512
                                       # Name: proportion, dtype: float64
}}}



----
CategoryRicottone