
Tuesday, 23 May 2023

Beringei, Facebook's open source in-memory database



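Beringei is an in-memory database open-sourced by Facebook under the BSD license. It already runs in Facebook's monitoring infrastructure, where it tracks the health and performance metrics of internal systems in real time. Beringei supports the real-time responses that monitoring systems need: data can be queried as soon as it arrives. The delay between writing data to Beringei and that data being available to read is about 300 microseconds, and the p95 server response time for a read request at Facebook is about 65 microseconds.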

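Features:

    Very fast in-memory storage, backed by disk for persistence. Queries to the storage engine are served from memory, which gives extremely fast query performance. The data is also written to disk, so the process can be restarted or migrated with very little downtime and no data loss.
    An extremely efficient streaming compression algorithm. It can compress real-world time series data by more than 90%. Beringei uses an efficient delta of delta compression algorithm.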

System requirements:

    Ubuntu 16.10 (recommended)


---------------------------------------------------------------------

Beringei 

A high performance, in memory time series storage engine.

In the fall of 2015, we published the paper “Gorilla: A Fast, Scalable, In-Memory Time Series Database” at VLDB 2015. Beringei is the open source representation of the ideas presented in this paper.

Beringei is a high performance time series storage engine. Time series are commonly used as a representation of statistics, gauges, and counters for monitoring performance and health of a system.

Features

Beringei has the following features:

  • Support for very fast, in-memory storage, backed by disk for persistence. Queries to the storage engine are always served out of memory for extremely fast query performance, but backed to disk so the process can be restarted or migrated with very little down time and no data loss.
  • Extremely efficient streaming compression algorithm. Our streaming compression algorithm is able to compress real world time series data by over 90%. The delta of delta compression algorithm used by Beringei is also fast - we see that a single machine is able to compress more than 1.5 million datapoints/second.
  • Reference sharded service implementation, including a client implementation.
  • Reference http service implementation that enables direct Grafana integration.
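
To illustrate the delta of delta idea mentioned in the feature list, here is a minimal Python sketch. It is not Beringei's implementation: the function names `delta_of_delta_encode` and `delta_of_delta_decode` are hypothetical, timestamps are assumed to be integers, and the step that packs the resulting small values into a few bits each is left out.

```python
from typing import List


def delta_of_delta_encode(timestamps: List[int]) -> List[int]:
    """Return [first timestamp, first delta, then the change in delta per point]."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    prev_ts = timestamps[0]
    prev_delta = 0  # assumption: the first delta is stored relative to zero
    for ts in timestamps[1:]:
        delta = ts - prev_ts
        encoded.append(delta - prev_delta)
        prev_ts, prev_delta = ts, delta
    return encoded


def delta_of_delta_decode(encoded: List[int]) -> List[int]:
    """Invert delta_of_delta_encode."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    prev_delta = 0
    for dod in encoded[1:]:
        delta = prev_delta + dod
        timestamps.append(timestamps[-1] + delta)
        prev_delta = delta
    return timestamps


# Points arriving roughly every 60 seconds, with a little jitter.
sample = [1000, 1060, 1120, 1180, 1241, 1300]
encoded = delta_of_delta_encode(sample)
print(encoded)  # [1000, 60, 0, 0, 1, -2]
```

When points arrive at a regular interval, most encoded values are zero or close to it, which is why this representation compresses well once a variable-length bit encoding is applied on top.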

How can I use Beringei?

Beringei can be used in one of two ways.

  1. We have created a simple, sharded service, and reference client implementation, that can store and serve time series query requests.
  2. You can use Beringei as an embedded library to handle the low-level details of efficiently storing time series data. Using Beringei in this way is similar to RocksDB - the Beringei library can be the high performance storage system underlying your performance monitoring solution.

Requirements

Beringei is tested and working on:

  • Ubuntu 16.10

We also depend on these open source projects:

Building Beringei

Our instructions are for Ubuntu 16.10, but you will probably be able to modify the install scripts and directions to work with other Linux distros.

  • Run sudo ./setup_ubuntu.sh.

  • Build beringei.

    mkdir build && cd build && cmake .. && make

  • Generate a beringei configuration file.

    ./beringei/tools/beringei_configuration_generator --host_names $(hostname) --file_path /tmp/beringei.json

  • Start beringei.

    ./beringei/service/beringei_main \
        -beringei_configuration_path /tmp/beringei.json \
        -create_directories \
        -sleep_between_bucket_finalization_secs 60 \
        -allowed_timestamp_behind 300 \
        -bucket_size 600 \
        -buckets $((86400/600)) \
        -logtostderr \
        -v=2

  • Send data.

    while [[ 1 ]]; do
        ./beringei/tools/beringei_put \
            -beringei_configuration_path /tmp/beringei.json \
            testkey ${RANDOM} \
            -logtostderr -v 3
        sleep 30
    done

  • Read the data back.

    ./beringei/tools/beringei_get \
        -beringei_configuration_path /tmp/beringei.json \
        testkey \
        -logtostderr -v 3
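
The `-bucket_size` and `-buckets` values in the start command are tied together by simple arithmetic. The sketch below assumes that `-bucket_size` is a duration in seconds and that `-buckets` is the number of buckets kept, which is how the `$((86400/600))` expression reads; the flag descriptions in the repository are the place to confirm this. The variable names are for illustration only.

```python
SECONDS_PER_DAY = 86400

# Values taken from the start command.
bucket_size_secs = 600
buckets = SECONDS_PER_DAY // bucket_size_secs  # same as $((86400/600)) in the shell

# Total time span covered if every bucket holds bucket_size_secs of data.
covered_hours = buckets * bucket_size_secs / 3600

print(f"{buckets} buckets of {bucket_size_secs // 60} minutes cover {covered_hours:g} hours")
# 144 buckets of 10 minutes cover 24 hours
```

Under that assumption, changing the bucket size without recomputing the bucket count would change the covered time span, which is presumably why the command derives one from the other.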
from https://github.com/facebookarchive/beringei 


