Hash
Guava's hash package provides:
- More flexible hash functions
- An implementation of the BloomFilter algorithm
Hash
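Using a Guava hash function takes the following three steps. Because the API is fluent, the three steps can be chained into a single line.
- Create a HashFunction, choosing from the available algorithms: HashFunction hashFunction = Hashing.md5();
- Put the raw values into a Hasher obtained from it and finish with hash(): HashCode hashCode = hashFunction.newHasher().putXxx(...).hash();
- Read the result from the HashCode: hashCode.hashCode(), or more explicitly hashCode.asInt() or hashCode.toString()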
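If the value to be hashed is an object, you have to state explicitly which of its fields take part in the hash. This is done by implementing a Funnel. A funnel for an object works much like a juicer: an apple goes in and juice comes out.
Example: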
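import com.google.common.hash.BloomFilter; // used by test3 further down
import com.google.common.hash.Funnel;
import com.google.common.hash.HashCode;
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hashing;
import com.google.common.hash.PrimitiveSink;
import java.nio.charset.StandardCharsets;
import org.junit.Before;
import org.junit.Test;

public class HashTest {
    static class Person {
        int id;
        String firstName;
        String lastName;
        int birthYear;

        public Person(int birthYear, String firstName, int id, String lastName) {
            this.birthYear = birthYear;
            this.firstName = firstName;
            this.id = id;
            this.lastName = lastName;
        }
    }

    private Funnel<Person> personFunnel;

    @Before
    public void before() {
        // The funnel decides which fields of a Person are fed into the hash, and in what order.
        personFunnel = new Funnel<Person>() {
            @Override
            public void funnel(Person from, PrimitiveSink into) {
                into.putString(from.firstName, StandardCharsets.UTF_8);
                into.putString(from.lastName, StandardCharsets.UTF_8);
                into.putInt(from.id);
                into.putInt(from.birthYear);
            }
        };
    }

    @Test
    public void test1() {
        // String.hashCode() depends on the content.
        System.out.println("a".hashCode());
        // StringBuffer does not override hashCode(), so this is an identity-based value.
        StringBuffer sb = new StringBuffer("a");
        System.out.println(sb.hashCode());
    }

    @Test
    public void test2() {
        // Hashing.md5() is deprecated in recent Guava versions; it is kept so the output below still matches.
        HashFunction hashFunction = Hashing.md5();
        HashCode hashCode = hashFunction.newHasher()
                .putLong(1)
                .putString("zhangsan", StandardCharsets.UTF_8)
                .putObject(new Person(1983, "zhang", 1, "san"), personFunnel)
                .hash();
        System.out.println(hashCode.hashCode());
        System.out.println(hashCode.toString()); // cff805d850adcf9e936d76019502153a
        System.out.println(hashCode.asInt());
        // d805f8cf: exactly the first 4 bytes (cf f8 05 d8), read from last to first
        System.out.println(Integer.toHexString(hashCode.asInt()));
    }
}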
Output:
-670697265
cff805d850adcf9e936d76019502153a
-670697265
d805f8cf
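The relationship between the last three output lines can be checked without Guava. The sketch below assumes, as the comment in test2 suggests, that asInt() reads the first four bytes of the hash in little-endian order. The class and method names are made up for illustration.

```java
class LittleEndianDemo {
    /** Interprets the first four bytes of a hash as a little-endian int. */
    static int firstFourBytesLittleEndian(byte[] hash) {
        return (hash[0] & 0xFF)
                | (hash[1] & 0xFF) << 8
                | (hash[2] & 0xFF) << 16
                | (hash[3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        // First four bytes of cff805d850adcf9e936d76019502153a
        byte[] prefix = {(byte) 0xcf, (byte) 0xf8, 0x05, (byte) 0xd8};
        int value = firstFourBytesLittleEndian(prefix);
        System.out.println(value);                      // -670697265
        System.out.println(Integer.toHexString(value)); // d805f8cf
    }
}
```

The byte cf ends up in the lowest position and d8 in the highest, which is why the hex form of the int reads as the first four bytes in reverse.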
BloomFilter
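The BloomFilter algorithm quickly decides whether an item is already present in a very large existing data set. Its characteristics:
- It spends only a very small amount of memory in exchange for fast lookups
- It tolerates a certain error rate: mightContain can report a false positive, but never a false negative
For the underlying principle, see: http://www.cnblogs.com/heaad/archive/2011/01/02/1924195.html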
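To make the principle concrete, here is a minimal sketch of a Bloom filter in plain Java. It is not Guava's implementation: the class TinyBloomFilter, the fixed String element type, and the way the k bit positions are derived from two simple hashes are all assumptions chosen for brevity.

```java
import java.util.BitSet;

class TinyBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    TinyBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // A second, FNV-1a-style hash over the characters (illustrative only).
    private static int fnv1a(String s) {
        int h = 0x811C9DC5;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x01000193;
        }
        return h;
    }

    // The i-th bit position is derived from two base hashes (double hashing).
    private int index(String item, int i) {
        return Math.floorMod(item.hashCode() + i * fnv1a(item), numBits);
    }

    /** Sets the k bits that belong to the item. */
    void put(String item) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(item, i));
        }
    }

    /** Returns false only if the item was definitely never added. */
    boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TinyBloomFilter filter = new TinyBloomFilter(1024, 3);
        filter.put("zhang");
        filter.put("li");
        System.out.println(filter.mightContain("zhang")); // true
        System.out.println(filter.mightContain("wang"));  // usually false, but a false positive is possible
    }
}
```

An added item always finds all of its bits set, so there are no false negatives. A false positive occurs when other items happen to have set every bit that a missing item maps to.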
/**
 * Demonstrates BloomFilter
 */
@Test
public void test3() {
    // The second argument is the number of objects the filter is expected to hold.
    // Always overestimate it: once the actual number of stored objects exceeds this value,
    // the error rate rises rapidly.
    BloomFilter<Person> personBloomFilter = BloomFilter.create(personFunnel, 10000000, 0.00001);
    personBloomFilter.put(new Person(24, "zhang", 22, "lisi"));
    personBloomFilter.put(new Person(25, "li", 21, "a"));
    personBloomFilter.put(new Person(21, "dd", 22, "lisi"));
    personBloomFilter.put(new Person(23, "fdsa", 21, "fdse"));
    personBloomFilter.put(new Person(26, "s", 22, "ac"));
    personBloomFilter.put(new Person(24, "yy", 12, "oi"));
    // Same field values as an element added above
    if (personBloomFilter.mightContain(new Person(25, "li", 21, "a"))) {
        System.out.println("contains");
    }
    // Differs only in id (20 instead of 21)
    if (personBloomFilter.mightContain(new Person(25, "li", 20, "a"))) {
        System.out.println("contains too");
    } else {
        System.out.println("not contain");
    }
}
Output:
contains
not contain
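Guava's BloomFilter is very simple to use: you only specify the expected number of elements and the acceptable error rate, and Guava automatically picks a suitable number of hash functions.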
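For a feel of what has to be worked out from those two parameters, the textbook Bloom filter sizing formulas can be evaluated directly: m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions, where n is the expected number of elements and p the error rate. This sketch assumes these standard formulas. The class BloomSizing is hypothetical, and Guava's internal computation may differ in detail.

```java
class BloomSizing {
    /** Number of bits m for n expected elements and false-positive rate p. */
    static long optimalBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /** Number of hash functions k for n expected elements and m bits. */
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 10_000_000L; // expected insertions, as in test3
        double p = 0.00001;   // error rate, as in test3
        long m = optimalBits(n, p);
        System.out.println("bits: " + m);
        System.out.println("hash functions: " + optimalHashes(n, m));
    }
}
```

For the parameters used in test3 this gives roughly 240 million bits, about 30 MB, and 17 hash functions.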