Class CreateIndexJson
java.lang.Object
  org.torproject.metrics.collector.sync.SyncManager
    org.torproject.metrics.collector.cron.CollecTorMain
      org.torproject.metrics.collector.indexer.CreateIndexJson

All Implemented Interfaces:
java.lang.Runnable, java.util.concurrent.Callable<java.lang.Object>
public class CreateIndexJson extends CollecTorMain
Create an index file called index.json containing metadata of all files in the indexed/ directory and update the htdocs/ directory to contain all files to be served via the web server.

File metadata includes:
- Path for downloading this file from the web server.
- Size of the file in bytes.
- Timestamp when the file was last modified.
- Descriptor types as found in @type annotations of contained descriptors.
- Earliest and latest publication timestamp of contained descriptors.
- SHA-256 digest of the file.
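As a rough illustration of three of the metadata items listed above (size, last-modified timestamp, SHA-256 digest), here is a minimal sketch using only the Java standard library. The class `FileMetadataSketch` and its nested `Metadata` holder are hypothetical names invented for this example and are not part of CollecTor. The hex encoding of the digest is also an assumption; the actual index.json may encode it differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;

/** Hypothetical helper for illustration only; not part of CollecTor. */
final class FileMetadataSketch {

  /** Hypothetical holder for size, last-modified time, and digest. */
  static final class Metadata {
    final long size;
    final Instant lastModified;
    final String sha256Hex;

    Metadata(long size, Instant lastModified, String sha256Hex) {
      this.size = size;
      this.lastModified = lastModified;
      this.sha256Hex = sha256Hex;
    }
  }

  /** Reads the file once in chunks to compute its SHA-256 digest. */
  static Metadata read(Path file)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance("SHA-256");
    byte[] buffer = new byte[8192];
    try (InputStream in = Files.newInputStream(file)) {
      int bytesRead;
      while ((bytesRead = in.read(buffer)) != -1) {
        digest.update(buffer, 0, bytesRead);
      }
    }
    // Assumption: lowercase hex is used here purely for readability.
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest()) {
      hex.append(String.format("%02x", b));
    }
    return new Metadata(Files.size(file),
        Files.getLastModifiedTime(file).toInstant(), hex.toString());
  }
}
```

Streaming the file through the digest in fixed-size chunks keeps memory use constant, which matters when descriptor files are large.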
This class maintains its own working directory htdocs/ with subdirectories like htdocs/archive/ or htdocs/recent/ and another subdirectory htdocs/index/. The first two subdirectories contain (hard) links created and deleted by this class; the third subdirectory contains the index.json file in uncompressed and compressed forms.

The main reason for having the htdocs/ directory is that indexing a large descriptor file can be time consuming. New or updated files in indexed/ first need to be indexed before their metadata can be included in index.json. Another reason is that files removed from indexed/ shall still be available for download for a limited period of time after disappearing from index.json.

The reason for creating (hard) links in htdocs/, rather than copies, is that links do not consume additional disk space. All directories must be located on the same file system. Storing symbolic links in htdocs/ would not have worked with replaced or deleted files in the original directories. Symbolic links in original directories are allowed as long as their targets are on the same file system.

This class does not write, modify, or delete any files in the indexed/ directory. At the same time it does not expect any other classes to write, modify, or delete contents in the htdocs/ directory.
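The hard-link behaviour described above can be sketched with `java.nio.file.Files.createLink`. This is only an illustration under stated assumptions: the class `HardLinkSketch` and the method `linkInto` are hypothetical names, and the actual class may create and remove its links differently. The sketch assumes both directories are on the same file system and that the file system supports hard links.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration only; not part of CollecTor. */
final class HardLinkSketch {

  /**
   * Creates a hard link to {@code source} inside {@code targetDir},
   * replacing any link of the same name that already exists there.
   */
  static Path linkInto(Path source, Path targetDir) throws IOException {
    Files.createDirectories(targetDir);
    Path link = targetDir.resolve(source.getFileName());
    // Drop a stale link first; createLink fails if the path exists.
    Files.deleteIfExists(link);
    // First argument is the new link, second the existing file.
    Files.createLink(link, source);
    return link;
  }
}
```

Because a hard link is a second directory entry for the same file content, the content stays readable through the link even after the original entry is deleted, and no extra disk space is used for it. A symbolic link would dangle in that situation.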
-
-
Field Summary
-
Fields inherited from class org.torproject.metrics.collector.cron.CollecTorMain
config, mapPathDescriptors, SOURCES
-
Fields inherited from class org.torproject.metrics.collector.sync.SyncManager
SYNCORIGINS
-
-
Constructor Summary
Constructors
CreateIndexJson(Configuration configuration)
Initialize this class with the given configuration.
-
Method Summary
protected org.torproject.metrics.collector.indexer.IndexerTask createIndexerTask(java.nio.file.Path fileToIndex)
Create an indexer task for indexing the given file.

java.lang.String module()
Returns the module name for logging purposes.

protected java.lang.String obtainBuildRevision()
Obtain and return the build revision string that was generated during the build process with git rev-parse --short HEAD and written to collector.buildrevision.properties, or return null if the build revision string cannot be obtained.

void startProcessing()
Run the indexer by (1) adding new files from indexed/ to the index, (2) adding old files from htdocs/ for which only links exist to the index, (3) scheduling new tasks and updating links in htdocs/ to reflect what's contained in the in-memory index, and (4) writing new uncompressed and compressed index.json files to disk.

protected void startProcessing(java.time.Instant now)
Helper method to startProcessing() that accepts the current execution time and which is used by tests.

protected java.lang.String syncMarker()
Returns property prefix/infix/postfix for Sync related properties.
Methods inherited from class org.torproject.metrics.collector.cron.CollecTorMain
call, checkAvailableSpace, readProcessedFiles, run, syncMapPathsDescriptors, writeProcessedFiles
-
Methods inherited from class org.torproject.metrics.collector.sync.SyncManager
merge
-
-
-
-
Constructor Detail
-
CreateIndexJson
public CreateIndexJson(Configuration configuration)
Initialize this class with the given configuration.
Parameters:
configuration - Configuration values.
-
-
Method Detail
-
module
public java.lang.String module()
Description copied from class: CollecTorMain
Returns the module name for logging purposes.
Specified by:
module in class CollecTorMain
-
syncMarker
protected java.lang.String syncMarker()
Description copied from class: CollecTorMain
Returns property prefix/infix/postfix for Sync related properties.
Specified by:
syncMarker in class CollecTorMain
-
startProcessing
public void startProcessing()
Run the indexer by (1) adding new files from indexed/ to the index, (2) adding old files from htdocs/ for which only links exist to the index, (3) scheduling new tasks and updating links in htdocs/ to reflect what's contained in the in-memory index, and (4) writing new uncompressed and compressed index.json files to disk.
Specified by:
startProcessing in class CollecTorMain
-
startProcessing
protected void startProcessing(java.time.Instant now)
Helper method to startProcessing() that accepts the current execution time and which is used by tests.
Parameters:
now - Current execution time.
-
obtainBuildRevision
protected java.lang.String obtainBuildRevision()
Obtain and return the build revision string that was generated during the build process with git rev-parse --short HEAD and written to collector.buildrevision.properties, or return null if the build revision string cannot be obtained.
Returns:
Build revision string.
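The "return null if it cannot be obtained" contract can be sketched as follows. This is not the actual implementation: the class `BuildRevisionSketch`, its methods, and in particular the property key `collector.build.revision` are hypothetical, since the documentation above names only the properties file, not the key stored inside it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

/** Hypothetical helper for illustration only; not part of CollecTor. */
final class BuildRevisionSketch {

  /** Hypothetical key; the real key in the properties file is not documented here. */
  static final String KEY = "collector.build.revision";

  /** Returns the trimmed revision string, or null if it cannot be read. */
  static String readRevision(InputStream in) {
    if (in == null) {
      return null;
    }
    Properties properties = new Properties();
    try {
      properties.load(in);
    } catch (IOException e) {
      return null;
    }
    String value = properties.getProperty(KEY);
    return (value == null || value.trim().isEmpty()) ? null : value.trim();
  }

  /** Looks up the named resource on the classpath; null if it is missing. */
  static String obtainFromClasspath(String resourceName) {
    // A null resource is permitted in try-with-resources and is not closed.
    try (InputStream in = BuildRevisionSketch.class.getClassLoader()
        .getResourceAsStream(resourceName)) {
      return readRevision(in);
    } catch (IOException e) {
      return null;
    }
  }
}
```

Returning null rather than throwing lets a caller omit the revision from its output when the properties file was not generated, for example in a build made outside a git checkout.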
-
createIndexerTask
protected org.torproject.metrics.collector.indexer.IndexerTask createIndexerTask(java.nio.file.Path fileToIndex)
Create an indexer task for indexing the given file.
This is a separate method so that it can be overridden by tests that don't actually want to index files but instead provide their own index results.
Parameters:
fileToIndex - File to index.
Returns:
Indexer task.
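The override-in-tests idea can be illustrated with a small, self-contained analogue. All names here (`IndexerSketch`, `StubbedIndexerSketch`, `createTask`, `process`) are hypothetical stand-ins, and the use of `Callable<String>` is an assumption made for brevity. The sketch does not reproduce the real `IndexerTask` type or its results.

```java
import java.nio.file.Path;
import java.util.concurrent.Callable;

/** Hypothetical stand-in for a class with an overridable task factory. */
class IndexerSketch {

  /** Factory method: production code builds a task that does real work. */
  protected Callable<String> createTask(Path fileToIndex) {
    return () -> "indexed:" + fileToIndex.getFileName();
  }

  /** Callers go through the factory, so subclasses can swap the task. */
  String process(Path fileToIndex) throws Exception {
    return createTask(fileToIndex).call();
  }
}

/** Test double: returns a canned result without touching the file. */
class StubbedIndexerSketch extends IndexerSketch {

  @Override
  protected Callable<String> createTask(Path fileToIndex) {
    return () -> "canned-result";
  }
}
```

Routing task creation through a protected method keeps the processing logic testable without reading large files from disk: a test subclass changes only what the task returns, not how results are handled.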
-
-