Class Utf8.UnsafeProcessor

java.lang.Object
com.google.protobuf.Utf8.Processor
com.google.protobuf.Utf8.UnsafeProcessor
Enclosing class:
Utf8

static final class Utf8.UnsafeProcessor extends Utf8.Processor
Utf8.Processor that uses sun.misc.Unsafe where possible to improve performance.
  • Constructor Details

    • UnsafeProcessor

      UnsafeProcessor()
  • Method Details

    • isAvailable

      static boolean isAvailable()
      Indicates whether or not all required unsafe operations are supported on this platform.
    • isValidUtf8

      public boolean isValidUtf8(byte[] bytes, int index, int limit)
      Description copied from class: Utf8.Processor
      Returns true if the given byte array slice is a well-formed UTF-8 byte sequence. The range of bytes checked runs from index (inclusive) to limit (exclusive).
      Specified by:
      isValidUtf8 in class Utf8.Processor
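
      The index/limit contract can be illustrated with standard JDK classes only. The following is a minimal sketch, not the Unsafe-based implementation documented here: the class name Utf8SliceValidationSketch and the method isValidUtf8Slice are hypothetical, and the check is delegated to a strict java.nio.charset.CharsetDecoder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of protobuf.
final class Utf8SliceValidationSketch {

  /** Returns true if bytes[index, limit) is well-formed UTF-8. */
  static boolean isValidUtf8Slice(byte[] bytes, int index, int limit) {
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
    try {
      // ByteBuffer.wrap takes (array, offset, length), so convert the
      // exclusive limit into a length.
      decoder.decode(ByteBuffer.wrap(bytes, index, limit - index));
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // 'h', 0xC3, 0xA9, 'l', 'l', 'o': the second character takes two bytes.
    byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
    System.out.println(isValidUtf8Slice(data, 0, data.length)); // whole array
    System.out.println(isValidUtf8Slice(data, 0, 2)); // ends inside a 2-byte character
  }
}
```

      A slice that ends in the middle of a multi-byte character is reported as invalid, which is why limit must fall on a character boundary.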
    • isValidUtf8BufferDirect

      protected boolean isValidUtf8BufferDirect(ByteBuffer buffer, int index, int limit)
      Description copied from class: Utf8.Processor
      Must only be called on direct buffers. This exists as a separate method only so that UnsafeProcessor can optimize specially for that case.
      Overrides:
      isValidUtf8BufferDirect in class Utf8.Processor
    • decodeUtf8

      String decodeUtf8(byte[] bytes, int index, int size) throws InvalidProtocolBufferException
      Description copied from class: Utf8.Processor
      Decodes the given byte array slice into a String.
      Specified by:
      decodeUtf8 in class Utf8.Processor
      Throws:
      InvalidProtocolBufferException - if the byte array slice is not valid UTF-8
    • decodeUtf8Direct

      String decodeUtf8Direct(ByteBuffer buffer, int index, int size) throws InvalidProtocolBufferException
      Description copied from class: Utf8.Processor
      Decodes direct ByteBuffer instances into String.
      Specified by:
      decodeUtf8Direct in class Utf8.Processor
      Throws:
      InvalidProtocolBufferException - if the buffer contents are not valid UTF-8
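
      For readers unfamiliar with absolute indexing into a direct buffer, here is a rough sketch of strict decoding using only the JDK. It is an assumption-laden stand-in, not the Unsafe-based code path: DirectBufferDecodeSketch and decodeSlice are hypothetical names, and the sketch signals invalid input with CharacterCodingException instead of InvalidProtocolBufferException.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of protobuf.
final class DirectBufferDecodeSketch {

  /** Strictly decodes size bytes starting at absolute position index. */
  static String decodeSlice(ByteBuffer buffer, int index, int size)
      throws CharacterCodingException {
    // Work on a duplicate so the caller's position and limit stay untouched.
    ByteBuffer view = buffer.duplicate();
    view.limit(index + size);
    view.position(index);
    return StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT)
        .decode(view)
        .toString();
  }

  public static void main(String[] args) throws CharacterCodingException {
    ByteBuffer direct = ByteBuffer.allocateDirect(16);
    direct.put("##caf\u00e9##".getBytes(StandardCharsets.UTF_8));
    // Bytes 2..6 hold "caf" followed by a two-byte character.
    System.out.println(decodeSlice(direct, 2, 5));
  }
}
```

      Unlike new String(bytes, UTF_8), which silently substitutes U+FFFD for malformed input, a decoder configured with CodingErrorAction.REPORT fails on invalid bytes, which mirrors the throwing behavior described above.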
    • encodeUtf8

      int encodeUtf8(String in, byte[] out, int offset, int length)
      Description copied from class: Utf8.Processor
      Encodes an input character sequence (in) to UTF-8 in the target array (out). For a string, this method is functionally identical to
      byte[] a = in.getBytes(UTF_8);
      System.arraycopy(a, 0, out, offset, a.length);
      return offset + a.length;
      
      but may be implemented differently for efficiency purposes.

      Like String.getBytes(UTF_8), this method replaces unpaired surrogates with a replacement character.

      To ensure sufficient space in the output buffer, either call Utf8.encodedLength(String) to compute the exact amount needed, or leave room for Utf8.MAX_BYTES_PER_CHAR * in.length(), which is the largest possible number of bytes that any input can be encoded to.

      Specified by:
      encodeUtf8 in class Utf8.Processor
      Parameters:
      in - the input character sequence to be encoded
      out - the target array
      offset - the offset in out, in bytes, at which to start writing
      length - the number of bytes available for writing, starting from offset
      Returns:
      the new offset, equivalent to offset + Utf8.encodedLength(in)
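
      The reference behavior described above can be tried out directly with the JDK. This sketch only reproduces the getBytes/arraycopy equivalence; EncodeUtf8ReferenceSketch and encodeReference are hypothetical names, and the worst-case factor of 3 bytes per char is an assumption about the value of Utf8.MAX_BYTES_PER_CHAR.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of protobuf.
final class EncodeUtf8ReferenceSketch {

  /** Writes the UTF-8 form of in into out at offset and returns the new offset. */
  static int encodeReference(String in, byte[] out, int offset) {
    byte[] a = in.getBytes(StandardCharsets.UTF_8);
    System.arraycopy(a, 0, out, offset, a.length);
    return offset + a.length;
  }

  public static void main(String[] args) {
    String in = "a\u00e9\u20ac"; // characters that need 1, 2 and 3 bytes
    int offset = 2;
    // Worst-case sizing, assuming at most 3 bytes per UTF-16 char.
    byte[] out = new byte[offset + 3 * in.length()];
    int end = encodeReference(in, out, offset);
    System.out.println(end - offset); // number of bytes written

    // An unpaired surrogate is replaced, not rejected.
    byte[] single = new byte[3];
    int written = encodeReference("\uD800", single, 0);
    System.out.println(written + " byte(s), first = " + (char) single[0]);
  }
}
```

      Sizing by the exact encoded length avoids over-allocation, while the worst-case bound avoids a second pass over the input; which one is preferable depends on how large and how non-ASCII the strings are.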
    • encodeUtf8Internal

      protected void encodeUtf8Internal(String in, ByteBuffer out)
      Description copied from class: Utf8.Processor
      Encodes the input character sequence to a direct ByteBuffer instance.
      Specified by:
      encodeUtf8Internal in class Utf8.Processor
    • unsafeEstimateConsecutiveAscii

      private static int unsafeEstimateConsecutiveAscii(byte[] bytes, long offset, int maxChars)
      Counts (approximately) the number of consecutive ASCII characters starting from the given position, using the most efficient method available to the platform.
      Parameters:
      bytes - the array containing the character sequence
      offset - the raw memory offset of the starting position within the array (that is, index + arrayBaseOffset)
      maxChars - the maximum number of characters to count
      Returns:
      the number of ASCII characters found. The stopping position will be at or before the first non-ASCII byte.
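
      Why the count is only approximate can be seen in a word-at-a-time scan. The sketch below is one plausible approach written against a plain ByteBuffer view, not the actual Unsafe-based implementation: AsciiRunEstimateSketch, its method, and the use of an array index in place of a raw memory offset are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of protobuf.
final class AsciiRunEstimateSketch {

  // One high bit per byte: any bit set means some byte in the word is >= 0x80.
  private static final long HIGH_BITS = 0x8080808080808080L;

  /**
   * Returns a lower bound on the number of leading ASCII bytes in
   * bytes[index, index + maxChars). The result is always a multiple of 8.
   */
  static int estimateConsecutiveAscii(byte[] bytes, int index, int maxChars) {
    ByteBuffer view = ByteBuffer.wrap(bytes);
    int count = 0;
    // Test eight bytes per step and stop at the first word containing a
    // non-ASCII byte, without locating that byte exactly.
    while (count + 8 <= maxChars
        && (view.getLong(index + count) & HIGH_BITS) == 0) {
      count += 8;
    }
    return count;
  }

  public static void main(String[] args) {
    // Ten ASCII bytes, then a two-byte character, then three ASCII bytes.
    byte[] data = "0123456789\u00e9xyz".getBytes(StandardCharsets.UTF_8);
    System.out.println(estimateConsecutiveAscii(data, 0, data.length));
  }
}
```

      Because the scan stops at word granularity, the estimate can fall short of the true ASCII run length; a caller would typically continue byte by byte from the returned position.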
    • unsafeEstimateConsecutiveAscii

      private static int unsafeEstimateConsecutiveAscii(long address, int maxChars)
      Same as Utf8.estimateConsecutiveAscii(ByteBuffer, int, int) except that it uses the most efficient method available to the platform.