Four views of multisets
This post will define multisets and basic operations on multisets.
We'll view union, intersection, inclusion and sum each from four perspectives:
- Examples with words
- Example with prime factorization
- Using Python's multiset module
- Multisets as functions
A multiset is like a set, except each element may appear more than once. We say each element has a multiplicity.
Example with wordsFor example, the set of letters in the word mississippi is {i, m, p, s} but the multiset of letters would be
{i, i, i, i, m, p, p, s, s, s, s}
As with sets, order doesn't matter. We could write the multiset of letters in mississipi more compactly as
{i4, m1, p2, s4}
This looks a lot like a prime factorization, and indeed prime factorization will be our running example in this post.
Example with prime factorsThe factors of 2022 form an ordinary set because
2022 = 2 * 3 * 337
but the factors of 20221026 form a multiset because
20221026 = 2 * 3 * 7^2 * 109 * 631,
i.e. the factor 7 has multiplicity 2 because 20221026 is divisible by 49.
Python exampleThe factorint function in sympy returns a multiset, represented as a dictionary. The keys are the prime factors and the values are the multiplicities.
>>> from sympy import factorint >>> factorint(2022) {2: 1, 3: 1, 337: 1} >>> factorint(20221026) {2: 1, 3: 1, 7: 2, 109: 1, 631: 1}
Suppose the members of our multisets come from a universal set A. A multiset can be represented as a function f from A to the set of non-negative integers, mapping each element to its multiplicit. In the case of prime factorizations, A is the set of prime numbers and f(p) is the exponent of p in the prime factorization of some number.
We will use the Python library multiset to represent multisets. We could initialize a Multiset object with the return value of a call to factorint.
>>> from multiset import Multiset >>> M = Multiset(factorint(20221026)) >>> M Multiset({2: 1, 3: 1, 7: 2, 109: 1, 631: 1})Union
The elements in the union of two multisets are the union of the elements of in each of the multisets, with multiplicity equal to the maximum from each multiset. An example will make this clearer.
Example with wordsThe union of
{m, i, s, s, i, s, s, i, p, p, i}
and
{m, a, s, s, a, c, h, u, s, e, t, t, s}
would be
{a, a, c, e, h, i, i, i, i, m, p, p, s, s, s, s, t, t, u}
Multiset unions don't have to be ordered, but ordering them makes it easier to compute the union.
Example with PythonHere's Python code to repeat the example above.
>>> MS = Multiset("mississippi") >>> MA = Multiset("massachusetts") >>> MS.union(MA) Multiset({'m': 1, 'i': 4, 's': 4, 'p': 2, 'a': 2, 'c': 1, 'h': 1, 'u': 1, 'e': 1, 't': 2})Prime factor example
Let a and b be two non-negative integers. The union of as prime factorization with bs prime factorization is the prime factorization of the least common multiple of a and b.
For example, let a = 12 and b = 10.
lcm(2^2 * 3, 2 * 5) = 2^2 * 3 * 5.
Function perspectiveIf the function f represents the multiset A and g represents the multiset B then the function max(f, g) represents the union of A and B.
IntersectionTo form the intersection of two multisets, intersect the underlying sets, and set the multiplicities to the minimum of the multiplicities in each of the original multisets.
Example with wordsThe intersection of
{m, i, s, s, i, s, s, i, p, p, i}
and
{m, a, s, s, a, c, h, u, s, e, t, t, s}
would be
{m, s, s, s, s}
Example with PythonReusing MA and MS above, we have
>>> MS.intersection(MA) Multiset({'m': 1, 's': 4})Prime factorization
The intersection of the prime factorization of a and the prime factorization of b is the prime factorization of the greatest common factor of a and b.
For example,
gcd(2^3 * 3 * 5^3, 2 * 5^2 * 17) = 2 * 5^2
Function perspectiveIf the function f represents the multiset A and g represents the multiset B then the function min(f, g) represents the intersection of A and B.
InclusionA multiset A is included in a multiset B if every element of A is an element of B with the same or greater multiplicity.
Example with wordsThe multiset of letters in the word kale is included in the multiset of letters in kalman filter.
The multiset of letters in apple is not included in the multiset of letters in pale rider. The set
{a, p, l, e}
is contained in the set
{p, a, l, e, r, i, d}
but the multiset
{a, p, p, l, e}
is not contained in the multiset
{p, a, l, e, e, r, r, i, d}
because p' has multiplicity 2 in the former but multiplicity 1 in the latter.
Python exampleThe multiset module overloads <= for inclusion.
>>> Multiset("kale") <= Multiset("kalman filter") True >>> Multiset("apple") <= Multiset("pale rider") FalsePrime factorization
The multiset of prime factors of a is included in the multiset of prime factors of b if and only if a divides b.
Function perspectiveIf the function f represents the multiset A and g represents the multiset B then A is included in B if and only if f g.
SumThe union of two multisets may not be exactly what you expect. For example, maybe you expected the union of
{a, b, b}
with
{b, c}
to be
{a, b, b, b, c}
rather than
{a, b, b, c}.
There is an operation on multisets like this, but it's called sum rather than union. In the sum of two multisets, multiplicities add.
Example with wordsThe sum of the multiset of letters in harry and potter is the multiset of letters in harrypotter.
Example with Python>>> Multiset("harry") + Multiset("potter") Multiset({'h': 1, 'a': 1, 'r': 3, 'y': 1, 'p': 1, 'o': 1, 't': 2, 'e': 1})
Note that r' has multiplicity 3 in the sum because it has multiplicity 2 in the first multiset and 1 in the second.
Prime factorizationThe sum of the multiset of prime factors of a and the multiset of prime factors of b is the multiset of prime factors of ab.
Function perspectiveIf the function f represents the multiset A and g represents the multiset B then f + g represents the sum of A and B.
The post Four views of multisets first appeared on John D. Cook.