
Indicator function

Mathematical function characterizing set membership. This article is about the 0–1 indicator function; for the 0–∞ indicator function, see characteristic function (convex analysis).

A three-dimensional plot of an indicator function, shown over a square two-dimensional domain (set X): the "raised" portion overlays those two-dimensional points which are members of the "indicated" subset (A).

In mathematics, an indicator function or a characteristic function of a subset of a set is a function that maps elements of the subset to one, and all other elements to zero. That is, if A is a subset of some set X, then the indicator function of A is the function $\mathbf{1}_A$ defined by $\mathbf{1}_A(x) = 1$ if $x \in A$, and $\mathbf{1}_A(x) = 0$ otherwise. Other common notations are $\mathbb{1}_A$ and $\chi_A$.[a]

The indicator function of A is the Iverson bracket of the property of belonging to A; that is,

$$\mathbf{1}_A(x) = [\,x \in A\,].$$

For example, the Dirichlet function is the indicator function of the rational numbers as a subset of the real numbers.

Definition

Given an arbitrary set X, the indicator function of a subset A of X is the function $\mathbf{1}_A \colon X \to \{0, 1\}$ defined by

$$\mathbf{1}_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A. \end{cases}$$
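The definition translates directly into code. A minimal Python sketch (the subset `{2, 3, 5}` and the ambient range `1..6` are arbitrary choices for illustration):

```python
def indicator(A):
    """Return the indicator function 1_A of the set A."""
    return lambda x: 1 if x in A else 0

# Hypothetical example: A = {2, 3, 5} viewed inside X = {1, ..., 6}.
one_A = indicator({2, 3, 5})
print([one_A(x) for x in range(1, 7)])  # → [0, 1, 1, 0, 1, 0]
```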

The Iverson bracket provides the equivalent notation $[\,x \in A\,]$ or ⟦x ∈ A⟧, which can be used instead of $\mathbf{1}_A(x)$.

The function $\mathbf{1}_A$ is sometimes denoted $\mathbb{1}_A$, $I_A$, $\chi_A$[a] or even just $A$.[b]

Notation and terminology

The notation $\chi_A$ is also used to denote the characteristic function in convex analysis, which is a different function: it takes the value 0 on A and +∞ on its complement, rather than 1 and 0.

A related concept in statistics is that of a dummy variable. (This must not be confused with "dummy variables" as that term is usually used in mathematics, also called a bound variable.)

The term "characteristic function" has an unrelated meaning in classical probability theory. For this reason, traditional probabilists use the term indicator function for the function defined here almost exclusively, while mathematicians in other fields are more likely to use the term characteristic function to describe the function that indicates membership in a set.

In fuzzy logic and modern many-valued logic, predicates are modeled by generalized characteristic functions whose values are not restricted to {0, 1}. That is, the strict true/false valuation of the predicate is replaced by a quantity interpreted as the degree of truth.

Basic properties

The indicator or characteristic function of a subset A of some set X maps elements of X to the codomain $\{0, 1\}$.

This mapping is surjective only when A is a non-empty proper subset of X. If $A = X$, then $\mathbf{1}_A \equiv 1$. By a similar argument, if $A = \emptyset$ then $\mathbf{1}_A \equiv 0$.

If A and B are two subsets of X, then

$$\begin{aligned}
\mathbf{1}_{A \cap B}(x) &= \min\{\mathbf{1}_A(x),\, \mathbf{1}_B(x)\} = \mathbf{1}_A(x) \cdot \mathbf{1}_B(x), \\
\mathbf{1}_{A \cup B}(x) &= \max\{\mathbf{1}_A(x),\, \mathbf{1}_B(x)\} = \mathbf{1}_A(x) + \mathbf{1}_B(x) - \mathbf{1}_A(x) \cdot \mathbf{1}_B(x),
\end{aligned}$$

and the indicator function of the complement of A, i.e. $A^\complement$, is $\mathbf{1}_{A^\complement} = 1 - \mathbf{1}_A$.
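These identities are easy to check exhaustively; the sketch below verifies all three over a small ambient set (the sets `X`, `A`, `B` are arbitrary examples):

```python
def indicator(A):
    return lambda x: 1 if x in A else 0

X = set(range(10))                  # ambient set (chosen for illustration)
A, B = {1, 2, 3, 4}, {3, 4, 5, 6}   # arbitrary example subsets
iA, iB = indicator(A), indicator(B)

for x in X:
    # intersection: min coincides with the product
    assert indicator(A & B)(x) == min(iA(x), iB(x)) == iA(x) * iB(x)
    # union: max coincides with the inclusion-exclusion form
    assert indicator(A | B)(x) == max(iA(x), iB(x)) == iA(x) + iB(x) - iA(x) * iB(x)
    # complement taken within X
    assert indicator(X - A)(x) == 1 - iA(x)
print("identities verified")
```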

More generally, suppose $A_1, \dotsc, A_n$ is a collection of subsets of X. For any $x \in X$:

$$\prod_{k=1}^{n} \bigl(1 - \mathbf{1}_{A_k}(x)\bigr)$$

is a product of 0s and 1s. This product has the value 1 at precisely those $x \in X$ that belong to none of the sets $A_k$, and is 0 otherwise. That is,

$$\prod_{k=1}^{n} \bigl(1 - \mathbf{1}_{A_k}\bigr) = \mathbf{1}_{X - \bigcup_k A_k} = 1 - \mathbf{1}_{\bigcup_k A_k}.$$

Expanding the product on the left hand side,

$$\mathbf{1}_{\bigcup_k A_k} = 1 - \sum_{F \subseteq \{1,2,\dotsc,n\}} (-1)^{|F|}\, \mathbf{1}_{\bigcap_{k \in F} A_k} = \sum_{\emptyset \neq F \subseteq \{1,2,\dotsc,n\}} (-1)^{|F|+1}\, \mathbf{1}_{\bigcap_{k \in F} A_k},$$

where $|F|$ is the cardinality of F. This is one form of the principle of inclusion-exclusion.
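This inclusion-exclusion identity can be checked directly in code; the three subsets below are hypothetical examples:

```python
from itertools import combinations

def indicator(A):
    return lambda x: 1 if x in A else 0

# Hypothetical subsets A_1, A_2, A_3 of X = {0, ..., 11}.
X = set(range(12))
subsets = [{0, 1, 2, 3}, {2, 3, 4, 5}, {5, 6, 7}]
union = set().union(*subsets)

for x in X:
    lhs = indicator(union)(x)
    # sum over non-empty index sets F of (-1)^(|F|+1) times the indicator
    # of the corresponding intersection
    rhs = 0
    for r in range(1, len(subsets) + 1):
        for F in combinations(subsets, r):
            rhs += (-1) ** (r + 1) * indicator(set.intersection(*F))(x)
    assert lhs == rhs
print("inclusion-exclusion verified")
```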

As suggested by the previous example, the indicator function is a useful notational device in combinatorics. The notation is used in other places as well, for instance in probability theory: if X is a probability space with probability measure $\mathbb{P}$ and A is a measurable set, then $\mathbf{1}_A$ becomes a random variable whose expected value is equal to the probability of A:

$$\operatorname{E}_X\bigl\{\mathbf{1}_A(x)\bigr\} = \int_X \mathbf{1}_A(x)\, d\mathbb{P}(x) = \int_A d\mathbb{P}(x) = \mathbb{P}(A).$$

This identity is used in a simple proof of Markov's inequality.
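The identity $E[\mathbf{1}_A] = \mathbb{P}(A)$ also underlies Monte Carlo estimation: the sample mean of the indicator estimates the probability of the event. A sketch under an assumed toy model (X uniform on [0, 1), A = [0, 0.3], so P(A) = 0.3):

```python
import random

random.seed(0)                                  # fixed seed for reproducibility
n = 100_000
indicator_A = lambda x: 1 if x <= 0.3 else 0    # A = [0, 0.3] (assumed event)
samples = (random.random() for _ in range(n))   # draws from uniform [0, 1)
estimate = sum(indicator_A(x) for x in samples) / n
print(estimate)   # close to 0.3
```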

In many cases, such as order theory, the inverse of the indicator function may be defined. This is commonly called the generalized Möbius function, as a generalization of the inverse of the indicator function in elementary number theory, the Möbius function. (See paragraph below about the use of the inverse in classical recursion theory.)

Mean, variance and covariance

Given a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with $A \in \mathcal{F}$, the indicator random variable $\mathbf{1}_A \colon \Omega \to \mathbb{R}$ is defined by $\mathbf{1}_A(\omega) = 1$ if $\omega \in A$, otherwise $\mathbf{1}_A(\omega) = 0$.
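The mean, variance, and covariance identities for indicator random variables can be verified exactly on a small finite probability space; the uniform six-outcome space and the events A, B below are arbitrary choices for illustration:

```python
from fractions import Fraction

Omega = range(6)                  # uniform space with 6 outcomes (assumed)
A, B = {0, 1, 2}, {2, 3}          # arbitrary events
ind = lambda S: (lambda w: 1 if w in S else 0)

# exact expectation and probability via rational arithmetic
E = lambda f: sum((Fraction(f(w)) for w in Omega), Fraction(0)) / len(Omega)
P = lambda S: Fraction(len(S), len(Omega))

mean = E(ind(A))
var = E(lambda w: (ind(A)(w) - mean) ** 2)
cov = E(lambda w: ind(A)(w) * ind(B)(w)) - E(ind(A)) * E(ind(B))

assert mean == P(A)                    # E[1_A] = P(A), the "fundamental bridge"
assert var == P(A) * (1 - P(A))        # Var(1_A) = P(A)(1 - P(A))
assert cov == P(A & B) - P(A) * P(B)   # Cov(1_A, 1_B) = P(A∩B) - P(A)P(B)
print(mean, var, cov)  # → 1/2 1/4 0
```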

Mean: $\operatorname{E}(\mathbf{1}_A(\omega)) = \mathbb{P}(A)$ (also called the "fundamental bridge").

Variance: $\operatorname{Var}(\mathbf{1}_A(\omega)) = \mathbb{P}(A)\bigl(1 - \mathbb{P}(A)\bigr)$.

Covariance: $\operatorname{Cov}(\mathbf{1}_A(\omega), \mathbf{1}_B(\omega)) = \mathbb{P}(A \cap B) - \mathbb{P}(A)\,\mathbb{P}(B)$.

Characteristic function in recursion theory, Gödel's and Kleene's representing function

Kurt Gödel described the representing function in his 1934 paper "On undecidable propositions of formal mathematical systems" (the symbol "¬" indicates logical inversion, i.e. "NOT"):[1]: 42 

There shall correspond to each class or relation R a representing function $\phi(x_1, \ldots, x_n) = 0$ if $R(x_1, \ldots, x_n)$ and $\phi(x_1, \ldots, x_n) = 1$ if $\neg R(x_1, \ldots, x_n)$.

Kleene offers the same definition in the context of the primitive recursive functions: a function φ of a predicate P takes the value 0 if the predicate is true and 1 if the predicate is false.[2]

For example, because the product of characteristic functions $\phi_1 \cdot \phi_2 \cdots \phi_n = 0$ whenever any one of the functions equals 0, it plays the role of logical OR: IF $\phi_1 = 0$ OR $\phi_2 = 0$ OR ... OR $\phi_n = 0$ THEN their product is 0. What appears to the modern reader as the representing function's logical inversion, i.e. the representing function is 0 when the function R is "true" or "satisfied", plays a useful role in Kleene's definition of the logical functions OR, AND, and IMPLY,[2]: 228  the bounded-[2]: 228  and unbounded-[2]: 279ff  mu operators, and the CASE function.[2]: 229 
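The 0-for-true convention and the product-as-OR trick can be sketched as follows; the predicates `is_even` and `is_small` are hypothetical examples:

```python
from math import prod

def rep(pred):
    """Representing function: 0 when the predicate holds, 1 when it fails."""
    return lambda *args: 0 if pred(*args) else 1

is_even = rep(lambda n: n % 2 == 0)
is_small = rep(lambda n: n < 10)

def rep_or(*fs):
    # The product is 0 as soon as any factor is 0,
    # i.e. as soon as any predicate holds: logical OR.
    return lambda *args: prod(f(*args) for f in fs)

either = rep_or(is_even, is_small)
print(either(7), either(13))  # → 0 1  (7 is small, so OR holds; 13 satisfies neither)
```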

Characteristic function in fuzzy set theory

In classical mathematics, characteristic functions of sets only take values 1 (members) or 0 (non-members). In fuzzy set theory, characteristic functions are generalized to take values in the real unit interval [0, 1], or more generally, in some algebra or structure (usually required to be at least a poset or lattice). Such generalized characteristic functions are more usually called membership functions, and the corresponding "sets" are called fuzzy sets. Fuzzy sets model the gradual change in the membership degree seen in many real-world predicates like "tall", "warm", etc.
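A membership function for a vague predicate like "tall" might look like the following piecewise-linear ramp; the 160–190 cm breakpoints are arbitrary assumptions for illustration:

```python
def tall(height_cm):
    """Fuzzy membership in 'tall': 0 below 160 cm, 1 above 190 cm, linear in between."""
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30

print(tall(150), tall(175), tall(200))  # → 0.0 0.5 1.0
```

Unlike a crisp indicator, the degree of membership varies continuously between full non-membership and full membership.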

Smoothness

See also: Laplacian of the indicator

In general, the indicator function of a set is not smooth; it is continuous if and only if its support is a clopen set. In the algebraic geometry of finite fields, however, every affine variety admits a (Zariski) continuous indicator function.[3] Given a finite set of functions $f_\alpha \in \mathbb{F}_q[x_1, \ldots, x_n]$, let $V = \{x \in \mathbb{F}_q^n : f_\alpha(x) = 0\}$ be their common vanishing locus. Then the function $P(x) = \prod_\alpha \bigl(1 - f_\alpha(x)^{q-1}\bigr)$ acts as an indicator function for V. If $x \in V$ then $P(x) = 1$; otherwise, for some $f_\alpha$ we have $f_\alpha(x) \neq 0$, which implies (by Fermat's little theorem) that $f_\alpha(x)^{q-1} = 1$, hence $P(x) = 0$.
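This construction is easy to check exhaustively for a small example; here q = 5 and a single hypothetical polynomial f(x, y) = x + y:

```python
q = 5
f = lambda x, y: (x + y) % q     # one polynomial; V is its zero locus in F_5^2

def P(x, y):
    # product over the f_alpha of (1 - f_alpha(x)^(q-1)), reduced mod q;
    # here there is only one f_alpha
    return (1 - pow(f(x, y), q - 1, q)) % q

# exhaustive check over all of F_5^2
for x in range(q):
    for y in range(q):
        assert P(x, y) == (1 if f(x, y) == 0 else 0)
print("P is the indicator of V over F_5")
```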

Although indicator functions are not smooth, they admit weak derivatives. For example, consider the Heaviside step function $H(x) := \mathbf{1}(x > 0)$. The distributional derivative of the Heaviside step function is equal to the Dirac delta function, i.e.

$$\frac{dH(x)}{dx} = \delta(x),$$

and similarly the distributional derivative of $G(x) := \mathbf{1}(x < 0)$ is $\frac{dG(x)}{dx} = -\delta(x)$.
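As a purely numerical illustration (not a proof), one can smooth the step with a narrow tanh sigmoid and check that its derivative concentrates near 0 and integrates to approximately 1, as the Dirac delta does; the smoothing width and grid below are arbitrary choices:

```python
import math

eps = 1e-3                                     # smoothing width (assumed)
H_smooth = lambda x: 0.5 * (1 + math.tanh(x / (2 * eps)))   # smooth step

h = 1e-6
dH = lambda x: (H_smooth(x + h) - H_smooth(x - h)) / (2 * h)  # numerical derivative

# Riemann sum of dH over [-1, 1]; should be close to 1 = H(1) - H(-1).
n = 20001
step = 2 / (n - 1)
integral = sum(dH(-1 + k * step) for k in range(n)) * step
print(round(integral, 2))  # → 1.0
```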
