Your task is to implement a Bloom filter, analyze its false postive rate, and conduct an experiment to empirically measure its false positive rate.
Bloom filters are described in the book (see exercise DLN 10.99) and on this site.
You are expected to modify these files:
BloomFilter.java
– where you implement your BloomFilterYou are expected to read, use, but not necessarily modify these files:
HashFamily.java
– interface for hash families (just one method)HashFamilyDumb.java
– a concrete implementation of a hash family. This implementation is dumb but is useful for testing.HashFamilySmart.java
– a somehwat smarter implementation of a hash familyIn the method called selectBestKParameter
you should analytically determine the best setting of $k$ for the given $m$ and $n$. Try all values of $k$ from 1 to 100 and see which one has the lowest false positive rate. You should compute the false positive rate using the formula that is derived in a series of exercises (see DLN 10.99 to 10.103).
In the method called runExperiment
you should do the following experiment.
String
, not Integer
.runExperiment
should return the false positive rate.