Class MurmurHash3

java.lang.Object
org.tribuo.util.MurmurHash3

public final class MurmurHash3 extends Object
The MurmurHash3 algorithm was created by Austin Appleby and placed in the public domain. This Java port was authored by Yonik Seeley and also placed into the public domain. The author hereby disclaims copyright to this source code.

This produces exactly the same hash values as the final C++ version of MurmurHash3 and is thus suitable for producing the same hash values across platforms.

The 32-bit x86 version of this hash (murmurhash3_x86_32) should be the fastest variant for relatively short keys such as ids. murmurhash3_x64_128 is a good choice for longer strings, or when you need more than 32 bits of hash.

Note - The x86 and x64 versions do _not_ produce the same results, as the algorithms are optimized for their respective platforms.

See http://github.com/yonik/java_util for future updates to this file.

  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static final class
    MurmurHash3.LongPair
    128 bits of state
  • Constructor Summary

    Constructors
    Constructor
    Description
    MurmurHash3()
  • Method Summary

    Modifier and Type
    Method
    Description
    static final int
    fmix32(int h)
    32-bit mixing function.
    static final long
    fmix64(long k)
    64-bit mixing function.
    static final long
    getLongLittleEndian(byte[] buf, int offset)
    Gets a long from a byte buffer in little endian byte order.
    static void
    murmurhash3_x64_128(byte[] key, int offset, int len, int seed, MurmurHash3.LongPair out)
    Computes the MurmurHash3_x64_128 hash, placing the result in "out".
    static int
    murmurhash3_x86_32(byte[] data, int offset, int len, int seed)
    Returns the MurmurHash3_x86_32 hash.
    static int
    murmurhash3_x86_32(CharSequence data, int offset, int len, int seed)
    Returns the MurmurHash3_x86_32 hash of the UTF-8 bytes of the given CharSequence without actually encoding it to a temporary buffer.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • MurmurHash3

      public MurmurHash3()
  • Method Details

    • fmix32

      public static final int fmix32(int h)
      32-bit mixing function.
      Parameters:
      h - Value to mix.
      Returns:
      Mixed value.
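
As an illustration of what a 32-bit mixing step does, here is a minimal sketch that follows the finalizer of the public-domain MurmurHash3 reference algorithm. The class name Fmix32Sketch is hypothetical, and the sketch is not taken from this class's source, although a faithful port should produce the same values.

```java
// Sketch of a 32-bit finalizer in the style of the MurmurHash3 reference code.
// Fmix32Sketch is a hypothetical name used only for this example.
final class Fmix32Sketch {
    static int fmix32(int h) {
        h ^= h >>> 16;      // fold the high 16 bits into the low 16 bits
        h *= 0x85ebca6b;    // multiply by an odd constant (wraps modulo 2^32)
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    public static void main(String[] args) {
        System.out.println(fmix32(0));                      // prints 0
        System.out.println(Integer.toHexString(fmix32(1))); // prints 514e28b7
    }
}
```

Every step is invertible (xor with a right shift, multiplication by an odd constant), so distinct inputs always map to distinct outputs.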
    • fmix64

      public static final long fmix64(long k)
      64-bit mixing function.
      Parameters:
      k - Value to mix.
      Returns:
      Mixed value.
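
The 64-bit mixing step can be sketched the same way. The version below follows the 64-bit finalizer of the public-domain MurmurHash3 reference algorithm; Fmix64Sketch is a hypothetical class name, and the code is an illustration rather than a copy of this class's source.

```java
// Sketch of a 64-bit finalizer in the style of the MurmurHash3 reference code.
// Fmix64Sketch is a hypothetical name used only for this example.
final class Fmix64Sketch {
    static long fmix64(long k) {
        k ^= k >>> 33;               // fold the high bits into the low bits
        k *= 0xff51afd7ed558ccdL;    // multiply by an odd constant (wraps modulo 2^64)
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    public static void main(String[] args) {
        System.out.println(fmix64(0L)); // prints 0
        // Count how many output bits differ when a single input bit changes.
        System.out.println(Long.bitCount(fmix64(1L) ^ fmix64(3L)));
    }
}
```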
    • getLongLittleEndian

      public static final long getLongLittleEndian(byte[] buf, int offset)
      Gets a long from a byte buffer in little endian byte order.
      Parameters:
      buf - The buffer to operate on.
      offset - The current offset into the buffer.
      Returns:
      The long assembled from the 8 bytes starting at offset, least significant byte first.
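
A minimal sketch of a little-endian read, assuming the buffer holds at least 8 bytes from offset onward. LittleEndianSketch is a hypothetical class name; the result is cross-checked against java.nio.ByteBuffer from the standard library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration of reading a little-endian long from a byte array.
final class LittleEndianSketch {
    static long getLongLittleEndian(byte[] buf, int offset) {
        long result = 0;
        // Start at the most significant byte (highest index) and shift down to the first.
        for (int i = 7; i >= 0; i--) {
            result = (result << 8) | (buf[offset + i] & 0xffL);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] buf = {1, 2, 3, 4, 5, 6, 7, 8};
        long value = getLongLittleEndian(buf, 0);
        System.out.println(Long.toHexString(value)); // prints 807060504030201
        long expected = ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).getLong(0);
        System.out.println(value == expected);       // prints true
    }
}
```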
    • murmurhash3_x86_32

      public static int murmurhash3_x86_32(byte[] data, int offset, int len, int seed)
      Returns the MurmurHash3_x86_32 hash.
      Parameters:
      data - The data to hash.
      offset - The offset into the data.
      len - The length of the data to hash.
      seed - The initial seed of the hash.
      Returns:
      The murmurhash3_x86_32 hash.
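
To show how the parameters fit together, here is a self-contained sketch of the x86_32 variant written from the public-domain reference algorithm. Murmur3x86Sketch is a hypothetical class name, and the code is an illustration rather than this class's source; the example values are the published MurmurHash3 x86_32 test vectors for the empty input with seed 1 and for "test" with seed 0.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical re-implementation of the x86_32 variant, for illustration only.
final class Murmur3x86Sketch {
    static int murmurhash3_x86_32(byte[] data, int offset, int len, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h1 = seed;
        int roundedEnd = offset + (len & ~3); // end of the last full 4-byte block

        for (int i = offset; i < roundedEnd; i += 4) {
            // Read one 4-byte block in little-endian order.
            int k = (data[i] & 0xff) | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16) | ((data[i + 3] & 0xff) << 24);
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h1 ^= k;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }

        // Mix in the 1 to 3 trailing bytes, if any.
        int k1 = 0;
        switch (len & 3) {
            case 3: k1 = (data[roundedEnd + 2] & 0xff) << 16; // fall through
            case 2: k1 |= (data[roundedEnd + 1] & 0xff) << 8; // fall through
            case 1:
                k1 |= data[roundedEnd] & 0xff;
                k1 *= c1;
                k1 = Integer.rotateLeft(k1, 15);
                k1 *= c2;
                h1 ^= k1;
        }

        // Finalization: mix in the length, then apply the 32-bit mixing steps.
        h1 ^= len;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    public static void main(String[] args) {
        byte[] bytes = "test".getBytes(StandardCharsets.UTF_8);
        int hash = murmurhash3_x86_32(bytes, 0, bytes.length, 0);
        System.out.println(Integer.toHexString(hash)); // prints ba6bd213
        System.out.println(Integer.toHexString(murmurhash3_x86_32(new byte[0], 0, 0, 1))); // prints 514e28b7
    }
}
```

The offset and len parameters let a caller hash a slice of a larger buffer without copying it first.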
    • murmurhash3_x86_32

      public static int murmurhash3_x86_32(CharSequence data, int offset, int len, int seed)
      Returns the MurmurHash3_x86_32 hash of the UTF-8 bytes of the given CharSequence without actually encoding it to a temporary buffer. This is more than 2x faster than hashing the result of String.getBytes().
      Parameters:
      data - The data to hash.
      offset - The offset into the data.
      len - The length of the data to hash.
      seed - The initial seed of the hash.
      Returns:
      The murmurhash3_x86_32 hash.
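
The idea of hashing UTF-8 bytes without a temporary buffer can be illustrated by producing those bytes one at a time directly from the chars. The sketch below is a hypothetical helper (Utf8StreamSketch, forEachUtf8Byte and utf8Bytes are names invented here, not part of this class); it assumes a hashing routine would consume each byte as it is produced, and it treats offset and len as char indices.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.IntConsumer;

// Hypothetical helper showing on-the-fly UTF-8 encoding of a CharSequence.
final class Utf8StreamSketch {
    /** Passes the UTF-8 bytes of chars [offset, offset + len) to sink, one at a time. */
    static void forEachUtf8Byte(CharSequence data, int offset, int len, IntConsumer sink) {
        int end = offset + len;
        for (int i = offset; i < end; i++) {
            char c = data.charAt(i);
            if (c < 0x80) {                       // 1-byte form (ASCII)
                sink.accept(c);
            } else if (c < 0x800) {               // 2-byte form
                sink.accept(0xC0 | (c >> 6));
                sink.accept(0x80 | (c & 0x3F));
            } else if (Character.isHighSurrogate(c) && i + 1 < end
                    && Character.isLowSurrogate(data.charAt(i + 1))) {
                // Surrogate pair: combine into one code point, 4-byte form.
                int cp = Character.toCodePoint(c, data.charAt(++i));
                sink.accept(0xF0 | (cp >> 18));
                sink.accept(0x80 | ((cp >> 12) & 0x3F));
                sink.accept(0x80 | ((cp >> 6) & 0x3F));
                sink.accept(0x80 | (cp & 0x3F));
            } else {
                // 3-byte form. Unpaired surrogates also land here, which differs
                // from String.getBytes(), where they are replaced with '?'.
                sink.accept(0xE0 | (c >> 12));
                sink.accept(0x80 | ((c >> 6) & 0x3F));
                sink.accept(0x80 | (c & 0x3F));
            }
        }
    }

    /** Collects the streamed bytes into an array, only so the output can be inspected. */
    static byte[] utf8Bytes(CharSequence data, int offset, int len) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        forEachUtf8Byte(data, offset, len, b -> out.write(b));
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo \u20ac";
        byte[] streamed = utf8Bytes(s, 0, s.length());
        System.out.println(Arrays.equals(streamed, s.getBytes(StandardCharsets.UTF_8))); // prints true
    }
}
```

For well-formed input the streamed bytes match String.getBytes(StandardCharsets.UTF_8), so a hash computed over them would match a hash of the encoded byte array.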
    • murmurhash3_x64_128

      public static void murmurhash3_x64_128(byte[] key, int offset, int len, int seed, MurmurHash3.LongPair out)
      Computes the MurmurHash3_x64_128 hash, placing the result in "out".
      Parameters:
      key - The data to hash.
      offset - The offset into the data.
      len - The length of the data to hash.
      seed - The initial seed of the hash.
      out - The object that receives the 128-bit hash value.
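
Because a Java method cannot return 128 bits as a primitive, the result is written into a caller-supplied object. The sketch below illustrates only that out-parameter pattern: Pair128 is a hypothetical stand-in for MurmurHash3.LongPair (its field names are assumptions), and fillToy128 is a toy function, not the MurmurHash3_x64_128 algorithm.

```java
// Illustration of the out-parameter pattern only; nothing here computes MurmurHash3.
final class OutParamSketch {
    /** Hypothetical stand-in for a 128-bit result holder: two longs. */
    static final class Pair128 {
        long val1;
        long val2;
    }

    /** Toy filler (NOT a real hash): shows a void method writing its result into 'out'. */
    static void fillToy128(long a, long b, Pair128 out) {
        out.val1 = a ^ Long.rotateLeft(b, 32);
        out.val2 = b ^ Long.rotateLeft(a, 32);
    }

    public static void main(String[] args) {
        Pair128 out = new Pair128(); // allocated once, reused for every call
        for (long i = 1; i <= 3; i++) {
            fillToy128(i, i + 1, out);
            System.out.println(Long.toHexString(out.val1) + " " + Long.toHexString(out.val2));
        }
    }
}
```

Reusing one holder across calls avoids allocating a new result object per hash, which is one plausible reason for this style of signature.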