Mathematical Information Loss

Posted on by James Thomas.

Reciprocal Distribution, Hamming, Mathematical Information Loss

How I skewed up

I intended for this blog post to be about rounding, but apparently that’s not even half the problem when it comes to information loss. I started my research with R. W. Hamming’s Numerical Methods for Scientists and Engineers. I thought this book would show me better alternatives to the equations learned in Calculus and Differential Equations courses, and how to reduce the error and bias produced by various forms of rounding. Instead, I learned what the practical problems with computing actually are.

Rounding

If you are looking for a basic understanding of rounding in a computational context, check out this article from EE Times. While that’s a good starting point, I’m going to expand on this particular topic and look at the issues that arise from information loss.

Mantissas

Computers use a discrete number system. The mantissa of a binary floating-point number can represent only a finite set of fractions because it has a fixed number of digits. As arithmetic is performed on these numbers, with the rounding, overflow, and subnormal results that come with it, the mantissas in the resulting data sets tend to follow the reciprocal distribution.
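To make the "limited digits" point concrete, here is a minimal Python sketch (my own illustration, not something taken from Hamming's book). It assumes the usual 64-bit IEEE 754 double that CPython uses for `float`, and it splits a number into its mantissa and binary exponent with the standard library's `math.frexp`.

```python
import math
import sys

# frexp splits x into (m, e) with x == m * 2**e and 0.5 <= |m| < 1.
m, e = math.frexp(0.1)
print("mantissa:", m, "exponent:", e)

# Number of binary digits available to the mantissa of a float.
print("mantissa bits:", sys.float_info.mant_dig)

# Because the digits are limited, the stored value is only the
# nearest representable fraction to 0.1, not 0.1 itself.
print(f"stored value of 0.1: {0.1:.25f}")

# The small representation errors show up as soon as we do arithmetic.
print("0.1 + 0.2 == 0.3 ?", 0.1 + 0.2 == 0.3)
```

The last line prints `False`: neither 0.1, 0.2, nor 0.3 has an exact binary representation, so the rounded sum and the rounded constant land on different neighbouring values.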

The Reciprocal Distribution

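In probability and statistics, when you do arithmetic on distributions that have a limiting distribution, the results of repeated operations are pulled toward that limiting distribution. In the case of computers, arithmetic on floating-point numbers (multiplication and division in particular) naturally tends to drive the mantissas toward the reciprocal distribution, whose density is proportional to 1/x. For base b and a mantissa x normalized to the interval [1/b, 1), the density is r(x) = 1/(x ln b).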
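A small simulation can make this tendency visible. The sketch below is my own illustration under stated assumptions, not Hamming's derivation: it works in base 10 rather than base 2 for readability, uses the fact that a reciprocal distribution of decimal mantissas gives a leading digit d with probability log10(1 + 1/d), and picks the sample size, the seed, and the number of factors arbitrarily.

```python
import math
import random

def leading_digit(x):
    """First significant decimal digit of a positive float."""
    return int(f"{x:.15e}"[0])

def digit_frequencies(samples):
    """Fraction of samples whose leading digit is 1, 2, ..., 9."""
    counts = [0] * 9
    for x in samples:
        counts[leading_digit(x) - 1] += 1
    return [c / len(samples) for c in counts]

def random_product(rng, factors):
    """Product of `factors` uniform draws from (0, 1]."""
    result = 1.0
    for _ in range(factors):
        result *= 1.0 - rng.random()  # 1 - u avoids an exact 0.0
    return result

rng = random.Random(2024)  # arbitrary seed, only for repeatability
n = 20000                  # arbitrary sample size

single = digit_frequencies([random_product(rng, 1) for _ in range(n)])
products = digit_frequencies([random_product(rng, 10) for _ in range(n)])
expected = [math.log10(1 + 1 / d) for d in range(1, 10)]

print("digit  one draw  10 factors  reciprocal")
for d in range(1, 10):
    print(f"{d:5d}  {single[d - 1]:8.3f}  {products[d - 1]:10.3f}  {expected[d - 1]:10.3f}")
```

A single uniform draw has roughly equal leading-digit frequencies, while the products of ten draws should sit close to the reciprocal column. Nothing about the inputs was reciprocal; the multiplication alone pushes the mantissas there.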

How to Solve the Problems

While the solutions to these problems are complex and I have not yet finished Hamming’s book, there will most likely not be one correct solution to a particular problem. However, you will have to understand the problem intuitively and be able to manipulate the equations you are using to reduce the errors in your calculations. So far the book has done an excellent job of teaching that.
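As one example of what manipulating an equation to reduce error can look like, here is a standard textbook illustration (I am not claiming it is the example Hamming uses). Subtracting two nearly equal square roots cancels most of the significant digits, while the algebraically identical form 1 / (sqrt(x + 1) + sqrt(x)) avoids the subtraction. The function names and the choice of x = 1e15 are mine, and the `decimal` module serves only as a high-precision reference.

```python
import math
from decimal import Decimal, localcontext

def naive(x):
    """sqrt(x + 1) - sqrt(x), computed directly (suffers cancellation)."""
    return math.sqrt(x + 1.0) - math.sqrt(x)

def rearranged(x):
    """Same quantity, rewritten so no nearly equal values are subtracted."""
    return 1.0 / (math.sqrt(x + 1.0) + math.sqrt(x))

def reference(x):
    """High-precision value of the same expression, used as ground truth."""
    with localcontext() as ctx:
        ctx.prec = 50
        return Decimal(x + 1.0).sqrt() - Decimal(x).sqrt()

def relative_error(value, exact):
    return float(abs(Decimal(value) - exact) / exact)

x = 1e15
exact = reference(x)
print("naive      :", naive(x), "relative error", relative_error(naive(x), exact))
print("rearranged :", rearranged(x), "relative error", relative_error(rearranged(x), exact))
```

Both functions compute the same mathematical quantity, but the naive form loses most of its digits at this x while the rearranged form stays accurate to roughly machine precision. No extra precision was needed, only a different arrangement of the same equation.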

Why this Article Took so Long

Upon hitting section 2.8 of Hamming’s book, I spent a significant amount of time trying to figure out how to explain what he was saying, and then tried to explain it in person several times to a good friend. I am positive I confused him at first, but he understood the main idea eventually. I may have to continue reading and write a few more blog posts before I am able to explain these concepts more clearly. That being said, if I did not explain something in this post correctly, please contact us and I will correct it. We will have commenting up soon.

At one point I took a break and fixed a Python module so that I could print out the graphs of the Theano optimizations of the Euler problem, an experiment I’m doing to understand the ideas behind Theano optimizations. That, however, was just productive procrastination.

Sources (in a very poor format to save time):

Hamming's Paper