Why it's worth studying: Hashing is one of those areas where the theory and practice are quite different,
and this particular line of research gives a theoretically rigorous explanation as to why this gap tends not
to cause too many problems in practice. If you're looking for a more theory-oriented project that could
potentially lead to some interesting implementation questions, this would be an excellent launching point.
The Johnson-Lindenstrauss Lemma
Many of the randomized data structures developed recently are based on a mathematical result called the
Johnson-Lindenstrauss lemma which states, intuitively, that you can take high-dimensional data and
project it into a relatively low-dimensional subspace without stretching the distances between those data
points too much. This lemma was (transitively) the inspiration for the count sketch data structure and can
be used as a building block for other techniques like locality sensitive hashing.
Do'stlaringiz bilan baham: |