Class CreateIndexJson

  • All Implemented Interfaces:
    java.lang.Runnable, java.util.concurrent.Callable<java.lang.Object>

    public class CreateIndexJson
    extends CollecTorMain
    Create an index file called index.json containing metadata of all files in the indexed/ directory and update the htdocs/ directory to contain all files to be served via the web server.

    File metadata includes:

    • Path for downloading this file from the web server.
    • Size of the file in bytes.
    • Timestamp when the file was last modified.
    • Descriptor types as found in @type annotations of contained descriptors.
    • Earliest and latest publication timestamp of contained descriptors.
    • SHA-256 digest of the file.
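
As a rough illustration of three of the metadata items above (size, last-modified time, SHA-256 digest), the following sketch uses only standard JDK calls. The class `FileMetadataSketch` and its method names are hypothetical and not part of CollecTor, and the hex encoding of the digest is an assumption; the encoding actually used in index.json is not stated here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;

/** Hypothetical helper for illustration; not part of CollecTor. */
final class FileMetadataSketch {

  /** Returns the size of the file in bytes. */
  static long sizeInBytes(Path file) throws IOException {
    return Files.size(file);
  }

  /** Returns the time when the file was last modified. */
  static Instant lastModified(Path file) throws IOException {
    return Files.getLastModifiedTime(file).toInstant();
  }

  /** Returns the SHA-256 digest of the file content, hex-encoded (assumed encoding). */
  static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance("SHA-256");
    try (InputStream in = Files.newInputStream(file)) {
      byte[] buffer = new byte[8192];
      int read;
      // Stream the file so that large descriptor files need not fit in memory.
      while ((read = in.read(buffer)) != -1) {
        digest.update(buffer, 0, read);
      }
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest()) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }

  public static void main(String[] args) throws Exception {
    Path file = Files.createTempFile("descriptor", ".txt");
    Files.write(file, "abc".getBytes(StandardCharsets.US_ASCII));
    System.out.println("size: " + sizeInBytes(file));
    System.out.println("last modified: " + lastModified(file));
    System.out.println("sha256: " + sha256Hex(file));
    Files.delete(file);
  }
}
```

Streaming the file through the digest is the reason indexing can take a while for large files: every byte has to be read once.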

    This class maintains its own working directory htdocs/ with subdirectories like htdocs/archive/ or htdocs/recent/ and another subdirectory htdocs/index/. The first two subdirectories contain (hard) links created and deleted by this class; the third subdirectory contains the index.json file in uncompressed and compressed forms.
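
Writing the index in uncompressed and compressed forms could look roughly like the sketch below. The class `IndexWriterSketch`, the method `writeIndex`, and the choice of gzip are assumptions made for illustration; which compression formats the class actually produces is not stated in this description.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration; not part of CollecTor. */
final class IndexWriterSketch {

  /** Writes index.json and a gzip-compressed copy into the given directory. */
  static void writeIndex(Path indexDirectory, String json) throws IOException {
    Files.createDirectories(indexDirectory);
    byte[] bytes = json.getBytes(StandardCharsets.UTF_8);
    // Uncompressed form.
    Files.write(indexDirectory.resolve("index.json"), bytes);
    // Compressed form; gzip is an assumption chosen because it ships with the JDK.
    try (OutputStream out = new GZIPOutputStream(
        Files.newOutputStream(indexDirectory.resolve("index.json.gz")))) {
      out.write(bytes);
    }
  }

  public static void main(String[] args) throws IOException {
    Path htdocsIndex = Files.createTempDirectory("htdocs").resolve("index");
    writeIndex(htdocsIndex, "{\"files\":[]}");
    System.out.println("wrote index files to " + htdocsIndex);
  }
}
```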

    The main reason for having the htdocs/ directory is that indexing a large descriptor file can be time-consuming. New or updated files in indexed/ must first be indexed before their metadata can be included in index.json. Another reason is that files removed from indexed/ shall remain available for download for a limited period of time after disappearing from index.json.

    The reason for creating (hard) links in htdocs/, rather than copies, is that links do not consume additional disk space. For this to work, all directories must be located on the same file system. Storing symbolic links in htdocs/ would not work with replaced or deleted files in the original directories. Symbolic links in the original directories are allowed as long as their targets are on the same file system.
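
The behavior that makes hard links suitable here can be sketched with `java.nio.file.Files.createLink`. The class `HardLinkSketch`, the method `linkInto`, and the example paths are hypothetical; the sketch assumes a file system that supports hard links and is not taken from the CollecTor sources.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of CollecTor. */
final class HardLinkSketch {

  /** Creates linkPath as a hard link to the existing file, creating parent directories. */
  static void linkInto(Path existingFile, Path linkPath) throws IOException {
    Files.createDirectories(linkPath.getParent());
    // Both paths must be on the same file system, otherwise this call fails.
    Files.createLink(linkPath, existingFile);
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("collector-sketch");
    Path original = base.resolve("indexed/recent/example.txt");
    Files.createDirectories(original.getParent());
    Files.write(original, "descriptor".getBytes(StandardCharsets.UTF_8));

    Path link = base.resolve("htdocs/recent/example.txt");
    linkInto(original, link);

    // Deleting the original path leaves the content reachable through the link,
    // which a symbolic link would not guarantee.
    Files.delete(original);
    System.out.println(new String(Files.readAllBytes(link), StandardCharsets.UTF_8));
  }
}
```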

    This class does not write, modify, or delete any files in the indexed/ directory. Conversely, it expects that no other class writes, modifies, or deletes contents in the htdocs/ directory.

    • Constructor Summary

      Constructors 
      Constructor Description
      CreateIndexJson​(Configuration configuration)
      Initialize this class with the given configuration.
    • Method Summary

      Modifier and Type Method Description
      protected org.torproject.metrics.collector.indexer.IndexerTask createIndexerTask​(java.nio.file.Path fileToIndex)
      Create an indexer task for indexing the given file.
      java.lang.String module()
      Returns the module name for logging purposes.
      protected java.lang.String obtainBuildRevision()
      Obtain and return the build revision string that was generated during the build process with git rev-parse --short HEAD and written to collector.buildrevision.properties, or return null if the build revision string cannot be obtained.
      void startProcessing()
      Run the indexer by (1) adding new files from indexed/ to the index, (2) adding old files from htdocs/ for which only links exist to the index, (3) scheduling new tasks and updating links in htdocs/ to reflect what's contained in the in-memory index, and (4) writing new uncompressed and compressed index.json files to disk.
      protected void startProcessing​(java.time.Instant now)
      Helper method for startProcessing() that accepts the current execution time; used by tests.
      protected java.lang.String syncMarker()
      Returns property prefix/infix/postfix for Sync related properties.
      • Methods inherited from class org.torproject.metrics.collector.sync.SyncManager

        merge
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • CreateIndexJson

        public CreateIndexJson​(Configuration configuration)
        Initialize this class with the given configuration.
        Parameters:
        configuration - Configuration values.
    • Method Detail

      • module

        public java.lang.String module()
        Description copied from class: CollecTorMain
        Returns the module name for logging purposes.
        Specified by:
        module in class CollecTorMain
      • syncMarker

        protected java.lang.String syncMarker()
        Description copied from class: CollecTorMain
        Returns property prefix/infix/postfix for Sync related properties.
        Specified by:
        syncMarker in class CollecTorMain
      • startProcessing

        public void startProcessing()
        Run the indexer by (1) adding new files from indexed/ to the index, (2) adding old files from htdocs/ for which only links exist to the index, (3) scheduling new tasks and updating links in htdocs/ to reflect what's contained in the in-memory index, and (4) writing new uncompressed and compressed index.json files to disk.
        Specified by:
        startProcessing in class CollecTorMain
      • startProcessing

        protected void startProcessing​(java.time.Instant now)
        Helper method for startProcessing() that accepts the current execution time; used by tests.
        Parameters:
        now - Current execution time.
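
The general pattern of a public entry point delegating to an overload that takes the current time can be sketched as follows. Everything in this block is a hypothetical stand-in: the class `ClockInjectionSketch`, the `RETENTION` value, and `mayDeleteLink` are illustrative assumptions and do not reflect how CreateIndexJson is implemented.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical stand-in used only to illustrate the pattern. */
class ClockInjectionSketch {

  /** Assumed retention period for links to removed files; illustrative only. */
  static final Duration RETENTION = Duration.ofHours(2);

  /** Time of the most recent run, kept so that tests can inspect it. */
  Instant lastRun;

  /** Production entry point: reads the system clock exactly once. */
  public void startProcessing() {
    startProcessing(Instant.now());
  }

  /** Test-friendly variant: all time-dependent decisions use the given instant. */
  protected void startProcessing(Instant now) {
    this.lastRun = now;
  }

  /** Returns whether a link whose file disappeared at removedAt may be deleted at now. */
  boolean mayDeleteLink(Instant removedAt, Instant now) {
    return !now.isBefore(removedAt.plus(RETENTION));
  }
}
```

A test can then pass a fixed instant such as `Instant.parse("2020-01-01T12:00:00Z")` and get deterministic results without waiting for real time to pass.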
      • obtainBuildRevision

        protected java.lang.String obtainBuildRevision()
        Obtain and return the build revision string that was generated during the build process with git rev-parse --short HEAD and written to collector.buildrevision.properties, or return null if the build revision string cannot be obtained.
        Returns:
        Build revision string.
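
Reading a build revision from a properties file on the class path, with null as the fallback, might be sketched as below. The resource name comes from the description above; the property key `collector.build.revision`, the class `BuildRevisionSketch`, and the helper `readRevision` are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of CollecTor. */
final class BuildRevisionSketch {

  /** Resource name as given in the method description. */
  static final String RESOURCE = "collector.buildrevision.properties";

  /** Assumed property key; the real key is not stated in this documentation. */
  static final String KEY = "collector.build.revision";

  /** Returns the revision read from the stream, or null if it cannot be obtained. */
  static String readRevision(InputStream in) {
    if (in == null) {
      return null;
    }
    Properties properties = new Properties();
    try {
      properties.load(in);
    } catch (IOException e) {
      return null;
    }
    String revision = properties.getProperty(KEY);
    return (revision == null || revision.isEmpty()) ? null : revision;
  }

  /** Looks up the resource on the class path and delegates to readRevision. */
  static String obtainBuildRevision() {
    try (InputStream in = BuildRevisionSketch.class.getClassLoader()
        .getResourceAsStream(RESOURCE)) {
      return readRevision(in);
    } catch (IOException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println("build revision: " + obtainBuildRevision());
  }
}
```

Callers need to handle the null case, for example by omitting the revision from the written index.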
      • createIndexerTask

        protected org.torproject.metrics.collector.indexer.IndexerTask createIndexerTask​(java.nio.file.Path fileToIndex)
        Create an indexer task for indexing the given file.

        This is a separate method so that it can be overridden by tests that don't actually want to index files but instead provide their own index results.

        Parameters:
        fileToIndex - File to index.
        Returns:
        Indexer task.
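
The idea of overriding a factory method in tests can be sketched with stand-in types. The classes below are hypothetical: `Indexer` stands in for the class under test, and a plain `Callable<String>` replaces the real IndexerTask type, whose members are not described here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.Callable;

/** Hypothetical stand-ins used only to illustrate the pattern. */
final class FactoryMethodSketch {

  /** Stand-in for the class under test. */
  static class Indexer {

    /** Factory method that subclasses may override. */
    protected Callable<String> createIndexerTask(Path fileToIndex) {
      // A real task would read and index the file; this one only fakes a result.
      return () -> "indexed:" + fileToIndex.getFileName();
    }

    /** Runs the task produced by the factory method. */
    String index(Path file) throws Exception {
      return createIndexerTask(file).call();
    }
  }

  /** Test double that skips indexing and returns a canned result. */
  static class StubbedIndexer extends Indexer {
    @Override
    protected Callable<String> createIndexerTask(Path fileToIndex) {
      return () -> "stub-result";
    }
  }

  public static void main(String[] args) throws Exception {
    Path file = Paths.get("recent", "example.txt");
    System.out.println(new Indexer().index(file));
    System.out.println(new StubbedIndexer().index(file));
  }
}
```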