HugeGraph 配置项

Gremlin Server 配置项

对应配置文件gremlin-server.yaml

config option default value descrition
host 127.0.0.1 The host or ip of Gremlin Server.
port 8182 The listening port of Gremlin Server.
graphs hugegraph: conf/hugegraph.properties The map of graphs with name and config file path.
scriptEvaluationTimeout 30000 The timeout for gremlin script execution(millisecond).
channelizer org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer Indicates the protocol which the Gremlin Server provides service.
authentication authenticator: com.baidu.hugegraph.auth.StandardAuthenticator, config: {tokens: conf/rest-server.properties} The authenticator and config(contains tokens path) of authentication mechanism.

Rest Server & API 配置项

对应配置文件rest-server.properties

config option default value descrition
graphs [hugegraph:conf/hugegraph.properties] The map of graphs’ name and config file.
server.id server-1 The id of rest server, used for license verification.
server.role master The role of nodes in the cluster, available types are [master, worker, computer]
restserver.url http://127.0.0.1:8080 The url for listening of rest server.
ssl.keystore_file server.keystore The path of server keystore file used when https protocol is enabled.
ssl.keystore_password The password of the path of the server keystore file used when the https protocol is enabled.
restserver.max_worker_threads 2 * CPUs The maximum worker threads of rest server.
restserver.min_free_memory 64 The minimum free memory(MB) of rest server, requests will be rejected when the available memory of system is lower than this value.
restserver.request_timeout 30 The time in seconds within which a request must complete, -1 means no timeout.
restserver.connection_idle_timeout 30 The time in seconds to keep an inactive connection alive, -1 means no timeout.
restserver.connection_max_requests 256 The max number of HTTP requests allowed to be processed on one keep-alive connection, -1 means unlimited.
gremlinserver.url http://127.0.0.1:8182 The url of gremlin server.
gremlinserver.max_route 8 The max route number for gremlin server.
gremlinserver.timeout 30 The timeout in seconds of waiting for gremlin server.
batch.max_edges_per_batch 500 The maximum number of edges submitted per batch.
batch.max_vertices_per_batch 500 The maximum number of vertices submitted per batch.
batch.max_write_ratio 50 The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0.
batch.max_write_threads 0 The maximum threads for batch writing, if the value is 0, the actual value will be set to batch.max_write_ratio * restserver.max_worker_threads.
auth.authenticator The class path of authenticator implemention. e.g., com.baidu.hugegraph.auth.StandardAuthenticator, or com.baidu.hugegraph.auth.ConfigAuthenticator.
auth.admin_token 162f7848-0b6d-4faf-b557-3a0797869c55 Token for administrator operations, only for com.baidu.hugegraph.auth.ConfigAuthenticator.
auth.graph_store hugegraph The name of graph used to store authentication information, like users, only for com.baidu.hugegraph.auth.StandardAuthenticator.
auth.user_tokens [hugegraph:9fd95c9c-711b-415b-b85f-d4df46ba5c31] The map of user tokens with name and password, only for com.baidu.hugegraph.auth.ConfigAuthenticator.
auth.audit_log_rate 1000.0 The max rate of audit log output per user, default value is 1000 records per second.
auth.cache_capacity 10240 The max cache capacity of each auth cache item.
auth.cache_expire 600 The expiration time in seconds of vertex cache.
auth.remote_url If the address is empty, it provide auth service, otherwise it is auth client and also provide auth service through rpc forwarding. The remote url can be set to multiple addresses, which are concat by ‘,’.
auth.token_expire 86400 The expiration time in seconds after token created
auth.token_secret FXQXbJtbCLxODc6tGci732pkH1cyf8Qg Secret key of HS256 algorithm.
exception.allow_trace false Whether to allow exception trace stack.

基本配置项

基本配置项及后端配置项对应配置文件:{graph-name}.properties,如hugegraph.properties

config option default value descrition
gremlin.graph com.baidu.hugegraph.HugeFactory Gremlin entrance to create graph.
backend rocksdb The data store type, available values are [memory, rocksdb, cassandra, scylladb, hbase, mysql].
serializer binary The serializer for backend store, available values are [text, binary, cassandra, hbase, mysql].
store hugegraph The database name like Cassandra Keyspace.
store.connection_detect_interval 600 The interval in seconds for detecting connections, if the idle time of a connection exceeds this value, detect it and reconnect if needed before using, value 0 means detecting every time.
store.graph g The graph table name, which store vertex, edge and property.
store.schema m The schema table name, which store meta data.
store.system s The system table name, which store system data.
schema.illegal_name_regex .\s+$|~. The regex specified the illegal format for schema name.
schema.cache_capacity 10000 The max cache size(items) of schema cache.
vertex.cache_type l2 The type of vertex cache, allowed values are [l1, l2].
vertex.cache_capacity 10000000 The max cache size(items) of vertex cache.
vertex.cache_expire 600 The expire time in seconds of vertex cache.
vertex.check_customized_id_exist false Whether to check the vertices exist for those using customized id strategy.
vertex.default_label vertex The default vertex label.
vertex.tx_capacity 10000 The max size(items) of vertices(uncommitted) in transaction.
vertex.check_adjacent_vertex_exist false Whether to check the adjacent vertices of edges exist.
vertex.lazy_load_adjacent_vertex true Whether to lazy load adjacent vertices of edges.
vertex.part_edge_commit_size 5000 Whether to enable the mode to commit part of edges of vertex, enabled if commit size > 0, 0 means disabled.
vertex.encode_primary_key_number true Whether to encode number value of primary key in vertex id.
vertex.remove_left_index_at_overwrite false Whether remove left index at overwrite.
edge.cache_type l2 The type of edge cache, allowed values are [l1, l2].
edge.cache_capacity 1000000 The max cache size(items) of edge cache.
edge.cache_expire 600 The expiration time in seconds of edge cache.
edge.tx_capacity 10000 The max size(items) of edges(uncommitted) in transaction.
query.page_size 500 The size of each page when querying by paging.
query.batch_size 1000 The size of each batch when querying by batch.
query.ignore_invalid_data true Whether to ignore invalid data of vertex or edge.
query.index_intersect_threshold 1000 The maximum number of intermediate results to intersect indexes when querying by multiple single index properties.
query.ramtable_edges_capacity 20000000 The maximum number of edges in ramtable, include OUT and IN edges.
query.ramtable_enable false Whether to enable ramtable for query of adjacent edges.
query.ramtable_vertices_capacity 10000000 The maximum number of vertices in ramtable, generally the largest vertex id is used as capacity.
query.optimize_aggregate_by_index false Whether to optimize aggregate query(like count) by index.
oltp.concurrent_depth 10 The min depth to enable concurrent oltp algorithm.
oltp.concurrent_threads 10 Thread number to concurrently execute oltp algorithm.
oltp.collection_type EC The implementation type of collections used in oltp algorithm.
rate_limit.read 0 The max rate(times/s) to execute query of vertices/edges.
rate_limit.write 0 The max rate(items/s) to add/update/delete vertices/edges.
task.wait_timeout 10 Timeout in seconds for waiting for the task to complete,such as when truncating or clearing the backend.
task.input_size_limit 16777216 The job input size limit in bytes.
task.result_size_limit 16777216 The job result size limit in bytes.
task.sync_deletion false Whether to delete schema or expired data synchronously.
task.ttl_delete_batch 1 The batch size used to delete expired data.
computer.config /conf/computer.yaml The config file path of computer job.
search.text_analyzer ikanalyzer Choose a text analyzer for searching the vertex/edge properties, available type are [word, ansj, hanlp, smartcn, jieba, jcseg, mmseg4j, ikanalyzer].
search.text_analyzer_mode smart Specify the mode for the text analyzer, the available mode of analyzer are {word: [MaximumMatching, ReverseMaximumMatching, MinimumMatching, ReverseMinimumMatching, BidirectionalMaximumMatching, BidirectionalMinimumMatching, BidirectionalMaximumMinimumMatching, FullSegmentation, MinimalWordCount, MaxNgramScore, PureEnglish], ansj: [BaseAnalysis, IndexAnalysis, ToAnalysis, NlpAnalysis], hanlp: [standard, nlp, index, nShort, shortest, speed], smartcn: [], jieba: [SEARCH, INDEX], jcseg: [Simple, Complex], mmseg4j: [Simple, Complex, MaxWord], ikanalyzer: [smart, max_word]}.
snowflake.datecenter_id 0 The datacenter id of snowflake id generator.
snowflake.force_string false Whether to force the snowflake long id to be a string.
snowflake.worker_id 0 The worker id of snowflake id generator.
raft.mode false Whether the backend storage works in raft mode.
raft.safe_read false Whether to use linearly consistent read.
raft.use_snapshot false Whether to use snapshot.
raft.endpoint 127.0.0.1:8281 The peerid of current raft node.
raft.group_peers 127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283 The peers of current raft group.
raft.path ./raft-log The log path of current raft node.
raft.use_replicator_pipeline true Whether to use replicator line, when turned on it multiple logs can be sent in parallel, and the next log doesn’t have to wait for the ack message of the current log to be sent.
raft.election_timeout 10000 Timeout in milliseconds to launch a round of election.
raft.snapshot_interval 3600 The interval in seconds to trigger snapshot save.
raft.backend_threads current CPU vcores The thread number used to apply task to bakcend.
raft.read_index_threads 8 The thread number used to execute reading index.
raft.apply_batch 1 The apply batch size to trigger disruptor event handler.
raft.queue_size 16384 The disruptor buffers size for jraft RaftNode, StateMachine and LogManager.
raft.queue_publish_timeout 60 The timeout in second when publish event into disruptor.
raft.rpc_threads 80 The rpc threads for jraft RPC layer.
raft.rpc_connect_timeout 5000 The rpc connect timeout for jraft rpc.
raft.rpc_timeout 60000 The rpc timeout for jraft rpc.
raft.rpc_buf_low_water_mark 10485760 The ChannelOutboundBuffer’s low water mark of netty, when buffer size less than this size, the method ChannelOutboundBuffer.isWritable() will return true, it means that low downstream pressure or good network.
raft.rpc_buf_high_water_mark 20971520 The ChannelOutboundBuffer’s high water mark of netty, only when buffer size exceed this size, the method ChannelOutboundBuffer.isWritable() will return false, it means that the downstream pressure is too great to process the request or network is very congestion, upstream needs to limit rate at this time.
raft.read_strategy ReadOnlyLeaseBased The linearizability of read strategy.

RPC server 配置

config option default value descrition
rpc.client_connect_timeout 20 The timeout(in seconds) of rpc client connect to rpc server.
rpc.client_load_balancer consistentHash The rpc client uses a load-balancing algorithm to access multiple rpc servers in one cluster. Default value is ‘consistentHash’, means forwording by request parameters.
rpc.client_read_timeout 40 The timeout(in seconds) of rpc client read from rpc server.
rpc.client_reconnect_period 10 The period(in seconds) of rpc client reconnect to rpc server.
rpc.client_retries 3 Failed retry number of rpc client calls to rpc server.
rpc.config_order 999 Sofa rpc configuration file loading order, the larger the more later loading.
rpc.logger_impl com.alipay.sofa.rpc.log.SLF4JLoggerImpl Sofa rpc log implementation class.
rpc.protocol bolt Rpc communication protocol, client and server need to be specified the same value.
rpc.remote_url The remote urls of rpc peers, it can be set to multiple addresses, which are concat by ‘,’, empty value means not enabled.
rpc.server_adaptive_port false Whether the bound port is adaptive, if it’s enabled, when the port is in use, automatically +1 to detect the next available port. Note that this process is not atomic, so there may still be port conflicts.
rpc.server_host The hosts/ips bound by rpc server to provide services, empty value means not enabled.
rpc.server_port 8090 The port bound by rpc server to provide services.
rpc.server_timeout 30 The timeout(in seconds) of rpc server execution.

Cassandra 后端配置项

config option default value descrition
backend Must be set to cassandra.
serializer Must be set to cassandra.
cassandra.host localhost The seeds hostname or ip address of cassandra cluster.
cassandra.port 9042 The seeds port address of cassandra cluster.
cassandra.connect_timeout 5 The cassandra driver connect server timeout(seconds).
cassandra.read_timeout 20 The cassandra driver read from server timeout(seconds).
cassandra.keyspace.strategy SimpleStrategy The replication strategy of keyspace, valid value is SimpleStrategy or NetworkTopologyStrategy.
cassandra.keyspace.replication [3] The keyspace replication factor of SimpleStrategy, like ‘[3]’.Or replicas in each datacenter of NetworkTopologyStrategy, like ‘[dc1:2,dc2:1]’.
cassandra.username The username to use to login to cassandra cluster.
cassandra.password The password corresponding to cassandra.username.
cassandra.compression_type none The compression algorithm of cassandra transport: none/snappy/lz4.
cassandra.jmx_port=7199 7199 The port of JMX API service for cassandra.
cassandra.aggregation_timeout 43200 The timeout in seconds of waiting for aggregation.

ScyllaDB 后端配置项

config option default value descrition
backend Must be set to scylladb.
serializer Must be set to scylladb.

其它与 Cassandra 后端一致。

RocksDB 后端配置项

config option default value descrition
backend Must be set to rocksdb.
serializer Must be set to binary.
rocksdb.data_disks [] The optimized disks for storing data of RocksDB. The format of each element: STORE/TABLE: /path/disk.Allowed keys are [g/vertex, g/edge_out, g/edge_in, g/vertex_label_index, g/edge_label_index, g/range_int_index, g/range_float_index, g/range_long_index, g/range_double_index, g/secondary_index, g/search_index, g/shard_index, g/unique_index, g/olap]
rocksdb.data_path rocksdb-data The path for storing data of RocksDB.
rocksdb.wal_path rocksdb-data The path for storing WAL of RocksDB.
rocksdb.allow_mmap_reads false Allow the OS to mmap file for reading sst tables.
rocksdb.allow_mmap_writes false Allow the OS to mmap file for writing.
rocksdb.block_cache_capacity 8388608 The amount of block cache in bytes that will be used by RocksDB, 0 means no block cache.
rocksdb.bloom_filter_bits_per_key -1 The bits per key in bloom filter, a good value is 10, which yields a filter with ~ 1% false positive rate, -1 means no bloom filter.
rocksdb.bloom_filter_block_based_mode false Use block based filter rather than full filter.
rocksdb.bloom_filter_whole_key_filtering true True if place whole keys in the bloom filter, else place the prefix of keys.
rocksdb.bottommost_compression NO_COMPRESSION The compression algorithm for the bottommost level of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd.
rocksdb.bulkload_mode false Switch to the mode to bulk load data into RocksDB.
rocksdb.cache_index_and_filter_blocks false Indicating if we’d put index/filter blocks to the block cache.
rocksdb.compaction_style LEVEL Set compaction style for RocksDB: LEVEL/UNIVERSAL/FIFO.
rocksdb.compression SNAPPY_COMPRESSION The compression algorithm for compressing blocks of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd.
rocksdb.compression_per_level [NO_COMPRESSION, NO_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION] The compression algorithms for different levels of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd.
rocksdb.delayed_write_rate 16777216 The rate limit in bytes/s of user write requests when need to slow down if the compaction gets behind.
rocksdb.log_level INFO The info log level of RocksDB.
rocksdb.max_background_jobs 8 Maximum number of concurrent background jobs, including flushes and compactions.
rocksdb.level_compaction_dynamic_level_bytes false Whether to enable level_compaction_dynamic_level_bytes, if it’s enabled we give max_bytes_for_level_multiplier a priority against max_bytes_for_level_base, the bytes of base level is dynamic for a more predictable LSM tree, it is useful to limit worse case space amplification. Turning this feature on/off for an existing DB can cause unexpected LSM tree structure so it’s not recommended.
rocksdb.max_bytes_for_level_base 536870912 The upper-bound of the total size of level-1 files in bytes.
rocksdb.max_bytes_for_level_multiplier 10.0 The ratio between the total size of level (L+1) files and the total size of level L files for all L.
rocksdb.max_open_files -1 The maximum number of open files that can be cached by RocksDB, -1 means no limit.
rocksdb.max_subcompactions 4 The value represents the maximum number of threads per compaction job.
rocksdb.max_write_buffer_number 6 The maximum number of write buffers that are built up in memory.
rocksdb.max_write_buffer_number_to_maintain 0 The total maximum number of write buffers to maintain in memory.
rocksdb.min_write_buffer_number_to_merge 2 The minimum number of write buffers that will be merged together.
rocksdb.num_levels 7 Set the number of levels for this database.
rocksdb.optimize_filters_for_hits false This flag allows us to not store filters for the last level.
rocksdb.optimize_mode true Optimize for heavy workloads and big datasets.
rocksdb.pin_l0_filter_and_index_blocks_in_cache false Indicating if we’d put index/filter blocks to the block cache.
rocksdb.sst_path The path for ingesting SST file into RocksDB.
rocksdb.target_file_size_base 67108864 The target file size for compaction in bytes.
rocksdb.target_file_size_multiplier 1 The size ratio between a level L file and a level (L+1) file.
rocksdb.use_direct_io_for_flush_and_compaction false Enable the OS to use direct read/writes in flush and compaction.
rocksdb.use_direct_reads false Enable the OS to use direct I/O for reading sst tables.
rocksdb.write_buffer_size 134217728 Amount of data in bytes to build up in memory.
rocksdb.max_manifest_file_size 104857600 The max size of manifest file in bytes.
rocksdb.skip_stats_update_on_db_open false Whether to skip statistics update when opening the database, setting this flag true allows us to not update statistics.
rocksdb.max_file_opening_threads 16 The max number of threads used to open files.
rocksdb.max_total_wal_size 0 Total size of WAL files in bytes. Once WALs exceed this size, we will start forcing the flush of column families related, 0 means no limit.
rocksdb.db_write_buffer_size 0 Total size of write buffers in bytes across all column families, 0 means no limit.
rocksdb.delete_obsolete_files_period 21600 The periodicity in seconds when obsolete files get deleted, 0 means always do full purge.
rocksdb.hard_pending_compaction_bytes_limit 274877906944 The hard limit to impose on pending compaction in bytes.
rocksdb.level0_file_num_compaction_trigger 2 Number of files to trigger level-0 compaction.
rocksdb.level0_slowdown_writes_trigger 20 Soft limit on number of level-0 files for slowing down writes.
rocksdb.level0_stop_writes_trigger 36 Hard limit on number of level-0 files for stopping writes.
rocksdb.soft_pending_compaction_bytes_limit 68719476736 The soft limit to impose on pending compaction in bytes.

HBase 后端配置项

config option default value descrition
backend Must be set to hbase.
serializer Must be set to hbase.
hbase.hosts localhost The hostnames or ip addresses of HBase zookeeper, separated with commas.
hbase.port 2181 The port address of HBase zookeeper.
hbase.threads_max 64 The max threads num of hbase connections.
hbase.znode_parent /hbase The znode parent path of HBase zookeeper.
hbase.zk_retry 3 The recovery retry times of HBase zookeeper.
hbase.aggregation_timeout 43200 The timeout in seconds of waiting for aggregation.
hbase.kerberos_enable false Is Kerberos authentication enabled for HBase.
hbase.kerberos_keytab The HBase’s key tab file for kerberos authentication.
hbase.kerberos_principal The HBase’s principal for kerberos authentication.
hbase.krb5_conf etc/krb5.conf Kerberos configuration file, including KDC IP, default realm, etc.
hbase.hbase_site /etc/hbase/conf/hbase-site.xml The HBase’s configuration file

MySQL & PostgreSQL 后端配置项

config option default value descrition
backend Must be set to mysql.
serializer Must be set to mysql.
jdbc.driver com.mysql.jdbc.Driver The JDBC driver class to connect database.
jdbc.url jdbc:mysql://127.0.0.1:3306 The url of database in JDBC format.
jdbc.username root The username to login database.
jdbc.password ****** The password corresponding to jdbc.username.
jdbc.ssl_mode false The SSL mode of connections with database.
jdbc.reconnect_interval 3 The interval(seconds) between reconnections when the database connection fails.
jdbc.reconnect_max_times 3 The reconnect times when the database connection fails.
jdbc.storage_engine InnoDB The storage engine of backend store database, like InnoDB/MyISAM/RocksDB for MySQL.
jdbc.postgresql.connect_database template1 The database used to connect when init store, drop store or check store exist.

PostgreSQL 后端配置项

config option default value descrition
backend Must be set to postgresql.
serializer Must be set to postgresql.

其它与 MySQL 后端一致。

PostgreSQL 后端的 driver 和 url 应该设置为:

  • jdbc.driver=org.postgresql.Driver
  • jdbc.url=jdbc:postgresql://localhost:5432/

Last modified April 17, 2022: add en doc (aec74b1)