338 - Understanding the Benford's Law of Probability

DigitalSreeni

4 months ago

1,395 views

Benford's Law, also known as the first-digit law, is a statistical phenomenon observed in many sets of numerical data. It states that in certain naturally occurring datasets, smaller leading digits (1, 2, 3) occur more frequently than larger ones (4 through 9).

According to Benford's Law, the distribution of leading digits follows a logarithmic pattern: the probability that a number's first digit is d equals log10(1 + 1/d), so about 30% of values start with 1 and fewer than 5% start with 9. This counterintuitive property shows up in diverse datasets such as financial transactions, population numbers, and scientific data, which makes Benford's Law a useful tool for detecting anomalies and irregularities in numerical datasets.
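The logarithmic pattern described above can be tabulated in a few lines. This is a minimal sketch, not taken from the linked repository; the function name benford_expected is made up for illustration.

```python
import math

def benford_expected():
    """Return {digit: probability} for leading digits 1..9 under Benford's Law."""
    # P(d) = log10(1 + 1/d); the nine probabilities sum to log10(10) = 1.
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

expected = benford_expected()
for digit, p in expected.items():
    print(f"{digit}: {p:.3f}")
```

The probabilities fall steadily from digit 1 to digit 9, which is the shape the observed data is later compared against.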

In this tutorial, we analyze the distribution of leading digits in tax deductions, population, GDP, and COVID numbers, as well as pixel distributions in images, with the objective of verifying whether the data adheres to Benford's Law.

The observed frequencies of the leading digits are computed and compared against the expected frequencies predicted by Benford's Law.
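One way this comparison might look is sketched below. It is not the code from the repository: the helper names leading_digit and observed_frequencies are hypothetical, and the sample data (the first 200 powers of 2) is only a stand-in for the tax, population, GDP, or COVID figures used in the video.

```python
import math
from collections import Counter

def leading_digit(x):
    """First nonzero digit of a number, or None for zero and non-finite values."""
    x = abs(float(x))
    if x == 0 or not math.isfinite(x):
        return None
    # repr() gives the shortest exact text form, e.g. '0.0314' or '1.6e+60';
    # the first nonzero digit character is the leading digit.
    for ch in repr(x):
        if ch in "123456789":
            return int(ch)
    return None

def observed_frequencies(values):
    """Share of values starting with each digit 1..9."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    if not digits:
        raise ValueError("no nonzero values to analyze")
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

# Expected shares under Benford's Law: P(d) = log10(1 + 1/d)
expected = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Stand-in dataset (an assumption for this sketch): powers of 2
sample = [2 ** n for n in range(1, 201)]
observed = observed_frequencies(sample)

for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.3f}  expected {expected[d]:.3f}")

# Largest per-digit gap between observed and expected shares
max_gap = max(abs(observed[d] - expected[d]) for d in range(1, 10))
print(f"largest gap: {max_gap:.3f}")
```

The largest per-digit gap is only a rough indicator; a formal goodness-of-fit test such as chi-square would be a natural next step, though whether the video uses one is not stated here.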

Relevant Python code is available here: https://github.com/bnsreenu/python_for_microscopists/tree/master/323-Benfords%20Law

Additional Notes:
For the image data:
The code reads images in grayscale using the OpenCV library, computes the discrete cosine transform (DCT) coefficients, and plots the observed leading-digit distribution for each image.
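The image pipeline can be sketched without any dependencies as follows. This is an illustration under stated assumptions, not the repository's code: the DCT is written out by hand (unnormalized DCT-II) and a small random array stands in for a grayscale image. On a real image, one would presumably load it with cv2.imread(path, cv2.IMREAD_GRAYSCALE) and pass a float array to cv2.dct, whose scaling differs from the hand-written version here.

```python
import math
import random
from collections import Counter

def dct_1d(values):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]

def dct_2d(image):
    """2-D DCT: transform every row, then every column of the result."""
    rows = [dct_1d(row) for row in image]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def leading_digit(x):
    """First nonzero digit of x, or None when x is zero."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    return None

# Hypothetical stand-in for a grayscale image: 16x16 random 8-bit pixels
rng = random.Random(0)
image = [[rng.randint(0, 255) for _ in range(16)] for _ in range(16)]

# Flatten the coefficient grid and take the leading digit of each magnitude
coeffs = [c for row in dct_2d(image) for c in row]
digits = [d for d in map(leading_digit, coeffs) if d is not None]
counts = Counter(digits)
observed = {d: counts[d] / len(digits) for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: {observed[d]:.3f}")
```

Random noise is used only so the sketch runs on its own; it says nothing about how closely real photographs or microscopy images follow Benford's Law.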

In case you wonder why we go through the pain of converting pixel values to DCT coefficients...

Benford's Law expects leading digits to follow a logarithmic pattern, with smaller digits (1, 2, 3) occurring more frequently than larger ones (4 through 9). Confining values to a small range disrupts this pattern. In 8-bit images, for example, pixel values lie between 0 and 255, so any pixel of 100 or above can only have a leading digit of 1 or 2: a bright pixel (200 to 255) always starts with 2, and no three-digit value can start with 3 or greater. DCT coefficients are not confined to the 0 to 255 range and typically span several orders of magnitude.
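The range restriction is easy to see by counting how many of the possible 8-bit values start with each digit. The sketch below assumes every value from 1 to 255 is equally likely, which real images are not; it only illustrates the built-in skew of the value range.

```python
from collections import Counter

# Leading digit of every possible nonzero 8-bit pixel value (0 has no leading digit)
counts = Counter(int(str(value)[0]) for value in range(1, 256))

for digit in range(1, 10):
    share = counts[digit] / 255
    print(f"{digit}: {counts[digit]:3d} values ({share:.3f})")
```

Digits 1 and 2 own all of the three-digit values, while each digit from 3 to 9 is left with only its one-digit and two-digit values.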

Tags:

#microscopy #python #image_processing