Kullback–Leibler in Pyspark

Definition

We’ll start by defining the Kullback–Leibler (KL) divergence. It is defined as: $$D_{KL}(P(x) \,\|\, Q(x))=\sum_{i=1}^B P\left(x_i\right) \cdot \ln \left(\frac{P\left(x_i\right)}{Q\left(x_i\right)}\right)$$ where $P$ is the original distribution and $Q$ is the current distribution. The KL divergence measures how one probability distribution, the actual one, diverges from a second, expected probability distribution. $B$ is the number of bins in the histogram, which in our application is the number of unique values in the distributions. ...
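To make the definition concrete, here is a minimal sketch in plain Python rather than PySpark, so it runs without a Spark session. The function name `kl_divergence` and the sample data are hypothetical and not taken from the post. It also assumes one convention the excerpt does not state: if a value occurs in $P$ but never in $Q$, the divergence is reported as infinite.

```python
import math
from collections import Counter


def kl_divergence(p_values, q_values):
    """Estimate D_KL(P || Q) from two samples of categorical values.

    Each unique value acts as one histogram bin, matching the
    definition of B above. (Hypothetical helper, for illustration.)
    """
    if not p_values or not q_values:
        raise ValueError("both samples must be non-empty")

    p_counts = Counter(p_values)
    q_counts = Counter(q_values)
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())

    total = 0.0
    # Only bins with P(x_i) > 0 contribute; bins with P(x_i) = 0 add nothing.
    for value, count in p_counts.items():
        p = count / p_total
        q = q_counts.get(value, 0) / q_total
        if q == 0:
            # Assumption: a bin present in P but absent from Q gives infinity.
            return math.inf
        total += p * math.log(p / q)
    return total


original = ["a", "a", "b", "b"]  # P: a = 0.5, b = 0.5
current = ["a", "b", "b", "b"]   # Q: a = 0.25, b = 0.75
print(kl_divergence(original, current))
```

Note that the measure is not symmetric: swapping the two arguments generally gives a different value, so it matters which sample is treated as $P$.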

April 6, 2025 · 5 min · Carl

Best Trail Running Routes in Copenhagen

Tisvilde North Coast Ultra 15k route

I recently participated in the NCU 15k race and thoroughly enjoyed the official route. The course takes you through picturesque areas throughout Tisvilde Hegn. However, since it’s a race route, it navigates over some hills instead of taking the easy trails that circumnavigate them. Of course, you are allowed to skip the last 700 m in sand, and finish the loop in the forest! :) ...

March 25, 2025 · 1 min · Carl