%% -*- mode: latex; mode: reftex; mode: flyspell; coding: utf-8; tex-command: "pdflatex.sh" -*-

%% Any copyright is dedicated to the Public Domain.
%% https://creativecommons.org/publicdomain/zero/1.0/
%% Written by Francois Fleuret

\documentclass[11pt,a4paper,oneside]{article}
\usepackage[paperheight=15cm,paperwidth=8cm,top=2mm,bottom=15mm,right=2mm,left=2mm]{geometry}
%\usepackage[a4paper,top=2.5cm,bottom=2cm,left=2.5cm,right=2.5cm]{geometry}
\usepackage[utf8]{inputenc}
\usepackage{amsmath,amssymb,dsfont}
\usepackage[pdftex]{graphicx}
\usepackage[colorlinks=true,linkcolor=blue,urlcolor=blue,citecolor=blue]{hyperref}
\usepackage{tikz}
\usetikzlibrary{arrows,arrows.meta,calc}
\usetikzlibrary{patterns,backgrounds}
\usetikzlibrary{positioning,fit}
\usetikzlibrary{shapes.geometric,shapes.multipart}
\usetikzlibrary{patterns.meta,decorations.pathreplacing,calligraphy}
\usetikzlibrary{tikzmark}
\usetikzlibrary{decorations.pathmorphing}
\usepackage[round]{natbib}
\usepackage[osf]{libertine}
\usepackage{microtype}

\usepackage{mleftright}

\newcommand{\setmuskip}[2]{#1=#2\relax}
\setmuskip{\thinmuskip}{1.5mu} % by default it is equal to 3 mu
\setmuskip{\medmuskip}{2mu} % by default it is equal to 4 mu
\setmuskip{\thickmuskip}{3.5mu} % by default it is equal to 5 mu

\setlength{\parindent}{0cm}
\setlength{\parskip}{1ex}
%\renewcommand{\baselinestretch}{1.3}
%\setlength{\tabcolsep}{0pt}
%\renewcommand{\arraystretch}{1.0}

\def\argmax{\operatornamewithlimits{argmax}}
\def\argmin{\operatornamewithlimits{argmin}}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\def\given{\,\middle\vert\,}
\def\proba{\operatorname{P}}
\newcommand{\seq}{{S}}
\newcommand{\expect}{\mathds{E}}
\newcommand{\variance}{\mathds{V}}
\newcommand{\empexpect}{\hat{\mathds{E}}}
\newcommand{\mutinf}{\mathds{I}}
\newcommand{\empmutinf}{\hat{\mathds{I}}}
\newcommand{\entropy}{\mathds{H}}
\newcommand{\empentropy}{\hat{\mathds{H}}}
\newcommand{\ganG}{\mathbf{G}}
\newcommand{\ganD}{\mathbf{D}}
\newcommand{\ganF}{\mathbf{F}}

\newcommand{\dkl}{\mathds{D}_{\mathsf{KL}}}
\newcommand{\djs}{\mathds{D}_{\mathsf{JS}}}

\newcommand*{\vertbar}{\rule[-1ex]{0.5pt}{2.5ex}}
\newcommand*{\horzbar}{\rule[.5ex]{2.5ex}{0.5pt}}

\def\positionalencoding{\operatorname{pos-enc}}
\def\concat{\operatorname{concat}}
\def\crossentropy{\LL_{\operatorname{ce}}}

\begin{document}

\vspace*{0ex}

\begin{center}
{\Large On Random Variables}

Fran\c cois Fleuret

\today

\vspace*{1ex}

\end{center}

\underline{Random variables} are central to any model of a random
process, but their mathematical definition is unclear to most. This is
an attempt at giving an intuitive understanding of their definition
and utility.

\section{Modeling randomness}

To formalize something ``random'', the natural strategy is to define a
distribution, that is, in the finite case, a list of value/probability
pairs. For instance, the head/tail result of a coin flip would be
%
\[
\{(H, 0.5), (T, 0.5)\}.
\]

This is perfectly fine, until you have several such objects. To model
two coins $A$ and $B$, it still seems intuitively okay: they have
nothing to do with each other, they are ``independent'', so defining
how they behave individually is sufficient.

\section{Non-independent variables}

The process that generates two random values can be such that they are
related. Consider for instance that $A$ is the result of flipping a
coin, and $B$ is \emph{the inverse value of $A$}.
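
To make this construction explicit, here is a small worked step. It
writes $P(\cdot)$ for the probability of an event, and uses the
notation $\bar{H} = T$ and $\bar{T} = H$, which is introduced here only
for illustration. Since $B = \bar{A}$, and assuming the coin is fair as
in the previous section, the distribution of $B$ follows directly from
that of $A$:
%
\[
P(B=H) = P(A=T) = 0.5, \quad P(B=T) = P(A=H) = 0.5.
\]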

Both $A$ and $B$ are legitimate random variables (RVs), and both have
the same distribution $\{(H, 0.5), (T, 0.5)\}$. So where is the
information that they are related?

With only models of the respective distributions of $A$ and $B$, it is
nowhere. This can be fixed, in a way, by specifying the distribution
of the pair $(A, B)$, which here would be
%
\[
\{(H/H, 0.0), (H/T, 0.5), (T/H, 0.5), (T/T, 0.0)\}.
\]

The distributions of $A$ and $B$ taken individually are called the
\underline{marginal} distributions, and the distribution of the pair
is the \underline{joint} distribution.

Note that the joint is a far richer object than the two marginals, and
in general many different joints are consistent with given marginals.
Here for instance, the marginals are the same as if $A$ and $B$ were
two independent coins, even though they are not.

Even though this could somehow work, the notion of an RV here is very
unclear: it is not simply a distribution, and every time a new one is
defined, it requires the specification of the joint with all the
variables already defined.

\section{Random Variables}

The actual definition of an RV is a bit technical. Intuitively, it
consists of first defining ``the source of all randomness'', and then
making every RV a deterministic function of it.

Formally, it relies first on the definition of a set $\Omega$ such
that its subsets can be measured by a function $\mu$ with all the
desirable properties, such as $\mu(\Omega)=1, \mu(\emptyset)=0$ and
$A \cap B = \emptyset \Rightarrow \mu(A \cup B) = \mu(A) + \mu(B)$.

There is a technical point: for some $\Omega$ it may be impossible to
define such a measure on all its subsets due to tricky
infinity-related pathologies. So the set $\Sigma$ of
\underline{measurable} subsets is explicitly specified and called a
$\sigma$-algebra. In any practical situation this technicality does
not matter, since $\Sigma$ contains everything needed.

The triplet $(\Omega, \Sigma, \mu)$ is a \underline{measured set}.
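
As a minimal illustration of such a triplet, assuming a finite
$\Omega$ chosen here only as an example, one can take four equally
weighted outcomes:
%
\[
\Omega = \{1, 2, 3, 4\}, \quad
\Sigma = \text{all subsets of } \Omega, \quad
\mu(S) = \frac{|S|}{4}.
\]
%
The desirable properties can be checked directly: $\mu(\Omega) = 4/4 =
1$, $\mu(\emptyset) = 0$, and for the disjoint subsets $\{1\}$ and
$\{2, 3\}$, $\mu(\{1, 2, 3\}) = 3/4 = 1/4 + 2/4$.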

Given such a measured set, a \underline{random variable} $X$ is a
mapping from $\Omega$ into another set, and the
\underline{probability} that $X$ takes the value $x$ is the measure of
the subset of $\Omega$ where $X$ takes the value $x$:
%
\[
P(X=x) = \mu(X^{-1}(x)).
\]

You can imagine $\Omega$ as the square $[0,1]^2$ in $\mathbb{R}^2$
with the usual geometrical area for $\mu$.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

For instance, if the two coins $A$ and $B$ are flipped independently,
we could picture possible random variables with the proper
distribution as follows. In each square, the hatched region is where
$A$ is head and the red region is where $B$ is head.

\nopagebreak

\begin{tikzpicture}[scale=0.8]
\draw[pattern=north east lines] (0,0) rectangle ++(0.5,0.5);
\draw (0,0) rectangle ++(1,0.5);
\node at (2.5,0.2) {$A=\text{head}/\text{tail}$};

\draw[fill=red!50] (4.5, 0) rectangle ++(0.5,0.5);
\draw (4.5,0) rectangle ++(1,0.5);
\node at (7.0,0.2) {$B=\text{head}/\text{tail}$};
\end{tikzpicture}
%

\nopagebreak

\begin{tikzpicture}[scale=0.600]
\draw[fill=red!50,draw=none] (0, 0) rectangle (2, 4);
\draw[draw=none,pattern=north east lines] (0, 0) rectangle (4,2);
\draw (0,0) rectangle (4,4);

%% \draw[draw=green,thick] (0,0) rectangle ++(2,2);
%% \draw[draw=green,thick] (0.1,2.1) rectangle ++(1.8257,1.8257);
%% \draw[draw=green,thick] (2.1,0.1) rectangle ++(0.8165,0.8165);

\end{tikzpicture}
%
\hspace*{\stretch{1}}
%
\begin{tikzpicture}[scale=0.600]
\draw[fill=red!50,draw=none] (0, 0) rectangle ++(1, 4);
\draw[fill=red!50,draw=none] (1.5, 0) rectangle ++(1, 4);
\draw[draw=none,pattern=north east lines] (0, 0.25) rectangle ++(4,0.5);
\draw[draw=none,pattern=north east lines] (0, 1.25) rectangle ++(4,0.5);
\draw[draw=none,pattern=north east lines] (0, 2.)
rectangle ++(4,0.5);
\draw[draw=none,pattern=north east lines] (0, 2.5) rectangle ++(4,0.5);
\draw (0,0) rectangle (4,4);
\end{tikzpicture}
%
\hspace*{\stretch{1}}
%
\begin{tikzpicture}[scale=0.600]
\draw[fill=red!50,draw=none] (0, 0) rectangle (2, 2);
\draw[fill=red!50,draw=none] (0, 4)--(2,4)--(4,2)--(2,2)--cycle;
\draw[draw=none,pattern=north east lines] (0.5, 4)--(1.5,4)--(3.5,2)--(2.5,2)--cycle;
\draw[draw=none,pattern=north east lines] (3, 3) rectangle (4,4);
\draw[draw=none,pattern=north east lines] (0,4)--(1,3)--(0,2)--cycle;
\draw[draw=none,pattern=north east lines] (2.25,0) rectangle (3.25,2);
\draw[draw=none,pattern=north east lines] (0, 0) rectangle (2,1);
\draw (0,0) rectangle (4,4);
\end{tikzpicture}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

And if $A$ is flipped and $B$ is the inverse of $A$, possible RVs
would be

\nopagebreak

\begin{tikzpicture}[scale=0.8]
%% \node at (3.2, 1) {Flip A and B = inverse(A)};

\draw[pattern=north east lines] (0,0) rectangle ++(0.5,0.5);
\draw (0,0) rectangle ++(1,0.5);
\node at (2.5,0.2) {$A=\text{head}/\text{tail}$};

\draw[fill=red!50] (4.5, 0) rectangle ++(0.5,0.5);
\draw (4.5,0) rectangle ++(1,0.5);
\node at (7.0,0.2) {$B=\text{head}/\text{tail}$};
\end{tikzpicture}

\nopagebreak

\begin{tikzpicture}[scale=0.600]
\draw[fill=red!50] (0,0) rectangle (4,4);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (0, 0) rectangle (2,4);
\draw (0,0) rectangle (4,4);
\end{tikzpicture}
%
\hspace*{\stretch{1}}
%
\begin{tikzpicture}[scale=0.600]
\draw[fill=red!50] (0,0) rectangle (4,4);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (0, 0) rectangle ++(1,1);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (1, 0) rectangle ++(1,1);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (3, 0) rectangle ++(1,1);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (0, 1) rectangle
++(1,1);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (2, 1) rectangle ++(1,1);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (0, 2) rectangle ++(1,1);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (1, 3) rectangle ++(1,1);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (2, 3) rectangle ++(1,1);
\draw (0,0) rectangle (4,4);
\end{tikzpicture}
%
\hspace*{\stretch{1}}
%
\begin{tikzpicture}[scale=0.600]
\draw[fill=red!50] (0,0) rectangle (4,4);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (0, 0)--(1,1)--(3,1)--(3,4)--(0,1)--cycle;
\draw[preaction={fill=white},draw=none,pattern=north east lines] (0, 3) rectangle ++(2,1);
\draw[preaction={fill=white},draw=none,pattern=north east lines] (3,0) rectangle ++(1,1);
%% \draw (0,0) grid (4,4);
\draw (0,0) rectangle (4,4);
\end{tikzpicture}

%% Thanks to this definition, additional random variables can be defined
%% with dependency structures. For instance, if $A$ and $B$ are two
%% separate coin flips, and then a third variable $C$ is defined by
%% rolling a die and taking the value of $A$ if it gives $1$ and the
%% value of $B$ otherwise.

\end{document}