
Monday, 19 August 2019

Dragonfly

Dragonfly is an intelligent P2P based image and file distribution system

Join the chat at https://gitter.im/alibaba/Dragonfly
Note: The master branch may be in an unstable or even broken state during development. Please use releases instead of the master branch in order to get stable binaries.


Introduction

Dragonfly is an open source intelligent P2P based image and file distribution system. Its goal is to tackle all distribution problems in cloud native scenarios. Currently Dragonfly focuses on being:
  • Simple: well-defined user-facing API (HTTP), non-invasive to all container engines;
  • Efficient: CDN support, P2P based file distribution to save enterprise bandwidth;
  • Intelligent: host level speed limit, intelligent flow control based on host detection;
  • Secure: block transmission encryption, HTTPS connection support.
Dragonfly is now hosted by the Cloud Native Computing Foundation (CNCF) as a Sandbox Level Project. Originally it was born to solve all kinds of distribution at very large scales, such as application distribution, cache distribution, log distribution, image distribution, and so on.
Dragonfly has been refactored in Golang. Versions > 0.4.0 are written entirely in Golang, while those < 0.4.0 are in Java. We encourage adopters to try the Golang version first, since the Java versions will be out of support in the next few releases.

Features

In detail, Dragonfly has the following features:
  • P2P based file distribution: P2P technology is used for file transmission, which makes full use of each peer's bandwidth to improve download efficiency and saves a lot of cross-IDC bandwidth, especially costly cross-border bandwidth.
  • Non-invasive support for all kinds of container technologies: Dragonfly can seamlessly support various containers for distributing images.
  • Host level speed limit: Many download tools (wget/curl) only limit the rate of the current download task, but Dragonfly also provides a rate limit for the entire host.
  • Passive CDN: The CDN mechanism can avoid repetitive remote downloads.
  • Strong consistency: Dragonfly guarantees that all downloaded files are consistent even if users do not provide any checksum (MD5).
  • Disk protection and highly efficient I/O: Precheck disk space, delay synchronization, write file blocks in the best order, split net-read / disk-write, and so on.
  • High performance: Cluster Manager is completely closed-loop, which means it does not rely on any DB or distributed cache and processes requests with extremely high performance.
  • Exception auto isolation: Dragonfly automatically isolates abnormal nodes (peers or Cluster Managers) to improve download stability.
  • No pressure on file source: Generally, only a few Cluster Managers download files from the source.
  • Support for standard HTTP headers: Authentication information can be submitted through HTTP headers.
  • Effective concurrency control of Registry Auth: Reduce the pressure of the Registry Auth Service.
  • Simple and easy to use: Very few configurations are needed.

Comparison

For Dragonfly, the average download time stays almost constant no matter how many clients start downloading at once (12s in the experiment, meaning it takes only 12s in total for all clients to finish downloading the file/image).
For wget, the download time keeps increasing as more clients are added. When the number of wget clients reaches 1200 (in the following experiment), the file source crashes and can no longer serve any client.
The following table shows the testing environment and the graph shows the comparison result.
Test Environment      Statistics
Dragonfly server      2 * (24 cores, 64 GB, 2000 Mb/s)
File Source server    2 * (24 cores, 64 GB, 2000 Mb/s)
Client                4 cores, 8 GB, 200 Mb/s
Target file size      200 MB
from  https://github.com/dragonflyoss/Dragonfly
