= Python Pandas Series =

A '''`Series`''' is an ordered collection of somewhat-uniform data that can be indexed. The [[Python/Pandas/Types|type]] is fully specified as `pandas.core.series.Series`.

----

== Example ==

{{{
import pandas as pd

pd.Series(["foo", "bar", "baz"])
# 0    foo
# 1    bar
# 2    baz
# dtype: object
}}}

----

== Data Model ==

A `Series` can be instantiated with any [[Python/Collections/Abc#Iterable|iterable]].

=== Index ===

By default, a series is indexed by sequential integers (beginning at 0).

Mappings such as a `dict` are interpreted as pairs of indices and values.

{{{
d = {"First": "foo", "Second": "bar", "Third": "baz"}
s = pd.Series(d)
# First     foo
# Second    bar
# Third     baz
# dtype: object
}}}

A second iterable can be specified as explicit indices.

{{{
d = ["foo", "bar", "baz"]
i = ["First", "Second", "Third"]

# These three calls are equivalent.
s = pd.Series(d, i)
s = pd.Series(d, index=i)
s = pd.Series(data=d, index=i)
}}}

=== DType ===

A series whose values do not share a common type will initialize with a [[Python/NumPy/Types#ObjectDType|dtype]] of `object`. Alternatives include:

 * `int64`
 * `float64`
 * `datetime64`
 * `bool`
 * `category`

=== Dunder Methods ===

`Series` objects support all of the [[Python/DunderMethod|dunder methods]] implied by a [[Python/Collections/Abc#Sequence|sequence]], so built-in functions such as `len()` and `sorted()` work on them.

They also support mathematical [[Python/Builtins/Operators|operators]] as member-wise operations.

 * `s + 10` adds 10 to each member value of `s`
 * `s - 10` subtracts 10
 * `s * 10` multiplies by 10
 * `s / 10` divides by 10
 * `s // 10` performs integer division by 10

Note that these operations return a new `Series`, rather than mutating the data in-place.
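The member-wise behavior of the operators can be checked with a short sketch. The variable names and sample values here are only illustrative; the point is that each operator produces a separate `Series` and leaves the original untouched.

```python
import pandas as pd

s = pd.Series([10, 20, 30])

# Each operator builds a new Series; `s` itself is not modified.
t = s + 10
u = s // 7

print(t.tolist())  # [20, 30, 40]
print(u.tolist())  # [1, 2, 4]
print(s.tolist())  # [10, 20, 30]
print(t is s)      # False
print(len(s))      # 3
```

To keep the result under the original name, it has to be assigned back explicitly, e.g. `s = s + 10`.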
----

== Attributes ==

||'''Attribute'''||'''Meaning'''||
||`axes`||[[Python/Builtins/Types#List|list]] containing the `index` attribute's value||
||`iloc`||[[Python/Pandas/Types#A_ILocIndexer|indexable accessor of member values]]||
||`index`||[[Python/Pandas/Types|index]] containing the member indices (a `RangeIndex` by default)||
||`is_unique`||[[Python/Builtins/Types#Bool|bool]] representing if all member values are unique||
||`hasnans`||`bool` representing if any member values are [[Python/Builtins/Types#Float|NaN]]||
||`loc`||[[Python/Pandas/Types#A_LocIndexer|indexable accessor of member values]]||
||`shape`||[[Python/Builtins/Types#Tuple|tuple]] of 1 [[Python/Builtins/Types#Int|int]] representing number of member values||
||`size`||`int` count of member values||
||`values`||[[Python/NumPy/Types#NDArray|numpy.ndarray]] containing the member values||

----

== Methods ==

Aggregating methods return a scalar, typically a [[Python/NumPy/Types#Float64|numpy.float64]] for numeric data, unless otherwise specified.

||'''Method'''||'''Meaning'''||'''Example'''||
||`add`||return a new `Series` with `N` added to each member value||`s.add(10)`||
||`apply`||return a new `Series` with `f` applied to each member value||`s.apply(len)`||
||`copy`||return a copy of the `Series`|| ||
||`count`||return a count of non-missing member values|| ||
||`describe`||return a new `Series` containing descriptive statistics|| ||
||`div`||return a new `Series` with each member value divided by `N`||`s.div(10)`||
||`get`||return the value from an index or a default||`s.get("foo", default=None)`||
||`head`||return a `Series` of the first `N` member values||`s.head(5)`||
||`info`||print information including types and null values|| ||
||`map`||return a new `Series` with mapped values or [[Python/Builtins/Types#Float|NaN]]||`s.map({True: 1})`||
||`max`||return greatest value|| ||
||`mean`||return mean value|| ||
||`median`||return median value|| ||
||`min`||return least value|| ||
||`mode`||return a new `Series` containing the modal value(s)|| ||
||`mul`||return a new `Series` with each member value multiplied by `N`||`s.mul(10)`||
||`product`||return product from multiplying all member values|| ||
||`sort_index`||return a new `Series` sorted by indices||`s.sort_index(ascending=True)`||
||`sort_values`||return a new `Series` sorted by values||`s.sort_values(ascending=True)`||
||`std`||return standard deviation of values|| ||
||`sub`||return a new `Series` with `N` subtracted from each member value||`s.sub(10)`||
||`sum`||return sum from adding all member values|| ||
||`tail`||return a `Series` of the last `N` member values||`s.tail(5)`||
||`value_counts`||return a new `Series` containing counts of unique values|| ||

Note that the `get` method can take a list of indices to look up. A new `Series` will be returned if all indices exist, and the singleton default will be returned otherwise.

The '''`describe`''' method returns a `Series` indexed by statistic name, so an individual statistic can be looked up by its label.

{{{
s = pd.Series([1,2,3,4,5,6,7,8,9,10])
s.describe()
# count    10.00000
# mean      5.50000
# std       3.02765
# min       1.00000
# 25%       3.25000
# 50%       5.50000
# 75%       7.75000
# max      10.00000
# dtype: float64

s.describe().loc["75%"]
# 7.75000
}}}

The '''`value_counts`''' method returns a `Series` indexed by the unique values, sorted by descending count by default. This makes it particularly useful for frequency tables.

{{{
text = "Call me Ishmael. Some years ago—never mind how long precisely—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world."

# Lowercase, strip punctuation, and split into one word per member value.
s = pd.Series(text.lower().replace('.', '').replace(',', '').replace('—', ' ').split())

s.value_counts(ascending=True).head()
# call          1
# nothing       1
# particular    1
# to            1
# interest      1
# Name: count, dtype: int64

s.value_counts(normalize=True).head()
# and       0.046512
# the       0.046512
# i         0.046512
# little    0.046512
# me        0.046512
# Name: proportion, dtype: float64
}}}

----
CategoryRicottone