Custom Storage
Author: | Arvid Norberg, arvid@rasterbar.com |
---|---|
Version: | 1.0.0 |
libtorrent provides a customization point for storage of data. By default (default_storage), downloaded files are saved to disk according to the general conventions of bittorrent clients, mimicking the original file layout when the torrent was created. The libtorrent user may define a custom storage to store piece data in a different way.
A custom storage implementation must derive from and implement the storage_interface. You must also provide a function that constructs the custom storage object and pass this function to the add_torrent() call via add_torrent_params, either by passing it to the constructor or by setting the add_torrent_params::storage field.
This is an example storage implementation that stores all pieces in a std::map, i.e. in RAM. It's not necessarily very useful in practice, but illustrates the basics of implementing a custom storage.
    struct temp_storage : storage_interface
    {
        temp_storage(file_storage const& fs) : m_files(fs) {}
        void set_file_priority(std::vector<boost::uint8_t> const& prio) {}
        virtual bool initialize(bool allocate_files) { return false; }
        virtual bool has_any_file() { return false; }
        virtual int read(char* buf, int slot, int offset, int size)
        {
            std::map<int, std::vector<char> >::const_iterator i = m_file_data.find(slot);
            if (i == m_file_data.end()) return 0;
            int available = i->second.size() - offset;
            if (available <= 0) return 0;
            if (available > size) available = size;
            std::memcpy(buf, &i->second[offset], available);
            return available;
        }
        virtual int write(const char* buf, int slot, int offset, int size)
        {
            std::vector<char>& data = m_file_data[slot];
            if (data.size() < offset + size) data.resize(offset + size);
            std::memcpy(&data[offset], buf, size);
            return size;
        }
        virtual bool rename_file(int file, std::string const& new_name)
        { assert(false); return false; }
        // signature matches storage_interface::move_storage(); 0 is no_error
        virtual int move_storage(std::string const& save_path, int flags) { return 0; }
        virtual bool verify_resume_data(lazy_entry const& rd, error_code& error)
        { return false; }
        virtual bool write_resume_data(entry& rd) const { return false; }
        virtual bool move_slot(int src_slot, int dst_slot) { assert(false); return false; }
        virtual bool swap_slots(int slot1, int slot2) { assert(false); return false; }
        virtual bool swap_slots3(int slot1, int slot2, int slot3) { assert(false); return false; }
        virtual size_type physical_offset(int slot, int offset)
        { return slot * m_files.piece_length() + offset; }
        virtual sha1_hash hash_for_slot(int slot, partial_hash& ph, int piece_size)
        {
            int left = piece_size - ph.offset;
            assert(left >= 0);
            if (left > 0)
            {
                std::vector<char>& data = m_file_data[slot];
                // if there are padding files, those blocks will be considered
                // completed even though they haven't been written to the storage.
                // in this case, just extend the piece buffer to its full size
                // and fill it with zeroes.
                if (data.size() < piece_size) data.resize(piece_size, 0);
                ph.h.update(&data[ph.offset], left);
            }
            return ph.h.final();
        }
        virtual bool release_files() { return false; }
        virtual bool delete_files() { return false; }

        std::map<int, std::vector<char> > m_file_data;
        file_storage m_files;
    };

    storage_interface* temp_storage_constructor(
        file_storage const& fs, file_storage const* mapped
        , std::string const& path, file_pool& fp
        , std::vector<boost::uint8_t> const& prio)
    {
        return new temp_storage(fs);
    }
file_pool
Declared in "libtorrent/file_pool.hpp"
This is an internal cache of open file handles. It's primarily used by storage_interface implementations. It provides only weak guarantees of not opening more file handles than specified: given multiple threads, each with the ability to lock a file handle (via a smart pointer), there may be windows where more file handles are open than the limit.

    struct file_pool : boost::noncopyable
    {
        ~file_pool ();
        file_pool (int size = 40);
        boost::intrusive_ptr<file> open_file (void* st, std::string const& p
            , int file_index, file_storage const& fs, int m, error_code& ec);
        void release (void* st);
        void release (void* st, int file_index);
        void resize (int size);
        int size_limit () const;
    };
~file_pool() file_pool()
~file_pool (); file_pool (int size = 40);
size specifies the number of file handles the pool is allowed to hold open at any given time.
open_file()
boost::intrusive_ptr<file> open_file (void* st, std::string const& p , int file_index, file_storage const& fs, int m, error_code& ec);
Returns an open file handle to the file at file_index in the file_storage fs, opened at save path p. m is the file open mode (see file::open_mode_t).
release()
void release (void* st); void release (void* st, int file_index);
Releases all files belonging to the specified storage_interface (st). The overload that takes file_index releases only the file with that index in storage st.
size_limit()
int size_limit () const;
Returns the current limit on the number of open file handles held by the file_pool.
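To illustrate what a size limit on a handle cache means, here is a minimal sketch of a bounded cache keyed by storage pointer and file index that evicts the least recently used entry when the limit is exceeded. The class handle_cache, its eviction policy and the num_open() helper are assumptions made for illustration; this is not libtorrent's file_pool implementation, and it tracks keys only rather than real file handles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <map>
#include <utility>

// Hypothetical stand-in for a bounded handle cache; NOT libtorrent's file_pool.
class handle_cache
{
public:
    explicit handle_cache(std::size_t limit = 40) : m_limit(limit) {}

    // Records that (st, file_index) is in use. Returns true on a cache hit.
    // On a miss the entry is added and, if the limit is exceeded, the least
    // recently used entry is evicted (a real pool would close its file here).
    bool open(void* st, int file_index)
    {
        key_t const k = make_key(st, file_index);
        index_t::iterator i = m_index.find(k);
        if (i != m_index.end())
        {
            m_lru.splice(m_lru.begin(), m_lru, i->second); // mark as most recent
            return true;
        }
        m_lru.push_front(k);
        m_index[k] = m_lru.begin();
        if (m_lru.size() > m_limit)
        {
            m_index.erase(m_lru.back());
            m_lru.pop_back();
        }
        return false;
    }

    // Drops every entry belonging to storage st.
    void release(void* st)
    {
        std::uintptr_t const s = reinterpret_cast<std::uintptr_t>(st);
        for (std::list<key_t>::iterator i = m_lru.begin(); i != m_lru.end();)
        {
            if (i->first != s) { ++i; continue; }
            m_index.erase(*i);
            i = m_lru.erase(i);
        }
    }

    // Drops only the entry for file_index in storage st.
    void release(void* st, int file_index)
    {
        index_t::iterator i = m_index.find(make_key(st, file_index));
        if (i == m_index.end()) return;
        m_lru.erase(i->second);
        m_index.erase(i);
    }

    std::size_t size_limit() const { return m_limit; }
    std::size_t num_open() const { return m_lru.size(); }

private:
    typedef std::pair<std::uintptr_t, int> key_t; // (storage, file index)
    typedef std::map<key_t, std::list<key_t>::iterator> index_t;

    static key_t make_key(void* st, int file_index)
    { return key_t(reinterpret_cast<std::uintptr_t>(st), file_index); }

    std::size_t m_limit;
    std::list<key_t> m_lru; // front = most recently used
    index_t m_index;
};
```

With a limit of 2, opening a third distinct file evicts whichever of the first two was touched least recently. The sketch is single-threaded, so it does not show the windows described above in which multiple threads can temporarily exceed the limit.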
storage_interface
Declared in "libtorrent/storage.hpp"
The storage interface is a pure virtual class that can be implemented to customize how and where data for a torrent is stored. The default storage implementation uses regular files in the filesystem, mapping the files in the torrent in the way one would assume a torrent is saved to disk. Implementing your own storage interface makes it possible to store all data in RAM, or in some optimized order on disk (the order the pieces are received for instance), or saving multifile torrents in a single file in order to be able to take advantage of optimized disk-I/O.
It is also possible to write a thin class that uses the default storage but modifies some particular behavior, for instance encrypting the data before it's written to disk, and decrypting it when it's read again.
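As a rough illustration of such a thin class, the sketch below wraps another storage and transforms the data on its way in and out. Everything here is a stand-in: simple_storage is a reduced, hypothetical interface with only read() and write(), ram_storage is an in-memory backing store, and the XOR transform is a placeholder for a real cipher. With libtorrent itself one would more likely derive from default_storage and override the relevant functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical, reduced interface; NOT libtorrent's storage_interface.
struct simple_storage
{
    virtual ~simple_storage() {}
    virtual int read(char* buf, int slot, int offset, int size) = 0;
    virtual int write(char const* buf, int slot, int offset, int size) = 0;
};

// In-memory backing store, one byte vector per slot.
struct ram_storage : simple_storage
{
    int read(char* buf, int slot, int offset, int size)
    {
        std::map<int, std::vector<char> >::const_iterator i = m_data.find(slot);
        if (i == m_data.end()) return 0;
        int available = int(i->second.size()) - offset;
        if (available <= 0) return 0;
        if (available > size) available = size;
        if (available > 0) std::memcpy(buf, &i->second[offset], available);
        return available;
    }
    int write(char const* buf, int slot, int offset, int size)
    {
        if (size <= 0) return 0;
        std::vector<char>& d = m_data[slot];
        if (d.size() < std::size_t(offset + size)) d.resize(offset + size);
        std::memcpy(&d[offset], buf, size);
        return size;
    }
    std::map<int, std::vector<char> > m_data;
};

// Thin wrapper: transforms data before writing and after reading.
// XOR with a single byte is a placeholder, not real encryption.
struct xor_storage : simple_storage
{
    xor_storage(simple_storage& inner, char key) : m_inner(inner), m_key(key) {}

    int read(char* buf, int slot, int offset, int size)
    {
        int const n = m_inner.read(buf, slot, offset, size);
        for (int i = 0; i < n; ++i) buf[i] ^= m_key;
        return n;
    }
    int write(char const* buf, int slot, int offset, int size)
    {
        if (size <= 0) return 0;
        std::vector<char> tmp(buf, buf + size); // never modify the caller's buffer
        for (int i = 0; i < size; ++i) tmp[i] ^= m_key;
        return m_inner.write(&tmp[0], slot, offset, size);
    }

    simple_storage& m_inner;
    char m_key;
};
```

Data written through the wrapper is stored transformed in the backing storage and comes back unchanged when read through the wrapper again.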
The storage interface is based on slots. Each slot is piece_size bytes. All access is done by writing and reading whole or partial slots. One slot is one piece in the torrent, but the data in the slot does not necessarily correspond to the piece with the same index (in compact allocation mode it won't).
libtorrent comes with two built-in storage implementations: default_storage and disabled_storage. Their constructor functions are called default_storage_constructor() and disabled_storage_constructor() respectively. The disabled storage does just what it sounds like. It throws away data that's written, and it reads garbage. It's useful mostly for benchmarking and profiling purposes.
    struct storage_interface
    {
        virtual bool initialize (bool allocate_files) = 0;
        virtual bool has_any_file () = 0;
        virtual void set_file_priority (std::vector<boost::uint8_t> const& prio) = 0;
        virtual int writev (file::iovec_t const* bufs, int slot, int offset
            , int num_bufs, int flags = file::random_access);
        virtual int readv (file::iovec_t const* bufs, int slot, int offset
            , int num_bufs, int flags = file::random_access);
        virtual void hint_read (int, int, int);
        virtual int read (char* buf, int slot, int offset, int size) = 0;
        virtual int write (const char* buf, int slot, int offset, int size) = 0;
        virtual size_type physical_offset (int slot, int offset) = 0;
        virtual int sparse_end (int start) const;
        virtual int move_storage (std::string const& save_path, int flags) = 0;
        virtual bool verify_resume_data (lazy_entry const& rd, error_code& error) = 0;
        virtual bool write_resume_data (entry& rd) const = 0;
        virtual bool move_slot (int src_slot, int dst_slot) = 0;
        virtual bool swap_slots (int slot1, int slot2) = 0;
        virtual bool swap_slots3 (int slot1, int slot2, int slot3) = 0;
        virtual bool release_files () = 0;
        virtual bool rename_file (int index, std::string const& new_filename) = 0;
        virtual bool delete_files () = 0;
        disk_buffer_pool* disk_pool ();
        session_settings const& settings () const;
        void set_error (std::string const& file, error_code const& ec) const;
        error_code const& error () const;
        std::string const& error_file () const;
        virtual void clear_error ();
    };
initialize()
virtual bool initialize (bool allocate_files) = 0;
This function is called when the storage is to be initialized. The default storage will create directories and empty files at this point. If allocate_files is true, it will also ftruncate all files to their target size.
Returning true indicates an error occurred.
has_any_file()
virtual bool has_any_file () = 0;
This function is called when first checking (or re-checking) the storage for a torrent. It should return true if any of the files that are used in this storage exist on disk. If so, the storage will be checked for existing pieces before starting the download.
set_file_priority()
virtual void set_file_priority (std::vector<boost::uint8_t> const& prio) = 0;
Changes the priorities of files.
writev() readv()
virtual int writev (file::iovec_t const* bufs, int slot, int offset, int num_bufs, int flags = file::random_access); virtual int readv (file::iovec_t const* bufs, int slot, int offset, int num_bufs, int flags = file::random_access);
These functions should read or write the data in or to the given slot at the given offset. They should read or write num_bufs buffers sequentially, where the size of each buffer is specified in the buffer array bufs. The file::iovec_t type has the following members:
struct iovec_t { void* iov_base; size_t iov_len; };
The return value is the number of bytes actually read or written, or -1 on failure. If it returns -1, the error code is expected to be set to
Every buffer in bufs can be assumed to be page aligned and be of a page aligned size, except for the last buffer of the torrent. The allocated buffer can be assumed to fit a fully page aligned number of bytes though. This is useful when reading and writing the last piece of a file in unbuffered mode.
The offset is aligned to 16 kiB boundaries most of the time, but there are rare exceptions when it's not. Specifically, if the read cache is disabled or full and a client requests unaligned data, or if the file itself is not aligned in the torrent. Most clients request aligned data.
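The sequential-buffer behaviour described for readv() can be sketched as a loop over the buffer array that advances the offset by the number of bytes transferred so far. The iovec_t struct below mirrors the members listed above; slot_buffer and readv_sketch() are hypothetical names covering a single slot held in memory, and the flags argument is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Mirrors the members of file::iovec_t listed above.
struct iovec_t { void* iov_base; std::size_t iov_len; };

// Hypothetical backing store holding the bytes of one slot.
struct slot_buffer
{
    std::vector<char> data;

    // Returns the number of bytes copied, 0 past the end, -1 on bad arguments.
    int read(char* buf, int offset, int size) const
    {
        if (offset < 0 || size < 0) return -1;
        int available = int(data.size()) - offset;
        if (available <= 0) return 0;
        if (available > size) available = size;
        if (available > 0) std::memcpy(buf, &data[offset], available);
        return available;
    }
};

// Fills the buffers in order, starting at offset within the slot.
// Returns the total number of bytes read, or -1 on failure.
int readv_sketch(slot_buffer const& s, iovec_t const* bufs, int offset, int num_bufs)
{
    int total = 0;
    for (int i = 0; i < num_bufs; ++i)
    {
        int const want = int(bufs[i].iov_len);
        int const got = s.read(static_cast<char*>(bufs[i].iov_base), offset + total, want);
        if (got < 0) return -1;
        total += got;
        if (got < want) break; // short read: the data ends inside this buffer
    }
    return total;
}
```

A writev() counterpart would follow the same pattern with the copy direction reversed.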
hint_read()
virtual void hint_read (int, int, int);
This function is called when a read job is queued. It gives the storage wrapper an opportunity to hint the operating system about this coming read. For instance, the storage may call posix_fadvise(POSIX_FADV_WILLNEED) or fcntl(F_RDADVISE).
read()
virtual int read (char* buf, int slot, int offset, int size) = 0;
negative return value indicates an error
write()
virtual int write (const char* buf, int slot, int offset, int size) = 0;
negative return value indicates an error
physical_offset()
virtual size_type physical_offset (int slot, int offset) = 0;
returns the offset on the physical storage medium for the byte at offset offset in slot slot.
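For a storage that lays its slots out back to back, as the temp_storage example does, the physical offset is simply the slot index times the piece length plus the offset within the slot. The sketch below assumes a 64-bit size_type and a hypothetical helper name, contiguous_offset(). It widens before multiplying, because the product of two ints can overflow for large torrents.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a 64-bit signed type stands in for libtorrent's size_type.
typedef std::int64_t size_type;

// Offset of a byte when slots are stored contiguously, one after another.
// The cast happens before the multiplication so the arithmetic is 64-bit.
size_type contiguous_offset(int slot, int offset, int piece_length)
{
    return size_type(slot) * piece_length + offset;
}
```

For example, byte 100 of slot 3 with 16 kiB pieces lands at 3 * 16384 + 100 = 49252. A storage that spreads data over several files would need to map the slot to a file and an offset within it instead.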
sparse_end()
virtual int sparse_end (int start) const;
This function is optional. It is supposed to return the first piece, starting at start, that is fully contained within a data region on disk (i.e. a non-sparse region). The purpose of this is to skip parts of files that are known to contain zeros when checking files.
move_storage()
virtual int move_storage (std::string const& save_path, int flags) = 0;
This function should move all the files belonging to the storage to the new save_path. The default storage moves the single file or the directory of the torrent.
Before moving the files, any open file handles may have to be closed, as release_files() does.
The function returns one of the following values:
name | value |
---|---|
no_error | 0 |
need_full_check | -1 |
fatal_disk_error | -2 |
file_exist | -4 |
verify_resume_data()
virtual bool verify_resume_data (lazy_entry const& rd, error_code& error) = 0;
This function should verify the resume data rd with the files on disk. If the resume data seems to be up-to-date, return true. If not, set error to a description of what mismatched and return false.
The default storage may compare file sizes and time stamps of the files.
Returning false indicates an error occurred.
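A check along the lines of comparing file sizes and time stamps could look like the sketch below. The file_stat struct, the resume_matches() function and the plain string used for the error description are assumptions made to keep the example self-contained; the real function receives a lazy_entry and reports through an error_code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ctime>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of what was saved for one file.
struct file_stat { std::int64_t size; std::time_t mtime; };

// Returns true if the recorded state matches what is on disk.
// On a mismatch, error describes the first difference and false is returned.
bool resume_matches(std::vector<file_stat> const& recorded
    , std::vector<file_stat> const& on_disk, std::string& error)
{
    if (recorded.size() != on_disk.size())
    {
        error = "mismatching number of files";
        return false;
    }
    for (std::size_t i = 0; i < recorded.size(); ++i)
    {
        std::ostringstream msg;
        if (recorded[i].size != on_disk[i].size)
            msg << "file " << i << ": size mismatch";
        else if (recorded[i].mtime != on_disk[i].mtime)
            msg << "file " << i << ": timestamp mismatch";
        else
            continue; // this file is unchanged
        error = msg.str();
        return false;
    }
    error.clear();
    return true;
}
```

The return convention follows the description above: true means the resume data looks up to date, false means it does not.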
write_resume_data()
virtual bool write_resume_data (entry& rd) const = 0;
This function should fill in resume data, the current state of the storage, in rd. The default storage adds file timestamps and sizes.
Returning true indicates an error occurred.
move_slot()
virtual bool move_slot (int src_slot, int dst_slot) = 0;
This function should copy or move the data in slot src_slot to the slot dst_slot. This is only used in compact mode.
If the storage caches slots, this could be implemented more efficiently than by reading and writing the data.
Returning true indicates an error occurred.
swap_slots()
virtual bool swap_slots (int slot1, int slot2) = 0;
This function should swap the data in slot1 and slot2. The default storage reads the data of one slot into a scratch buffer, then moves the other slot, and finally writes back the scratch buffer's data.
This is only used in compact mode.
Returning true indicates an error occurred.
swap_slots3()
virtual bool swap_slots3 (int slot1, int slot2, int slot3) = 0;
This function should do a 3-way swap, or shift of the slots. slot1 should move to slot2, which should be moved to slot3 which in turn should be moved to slot1.
This is only used in compact mode.
Returning true indicates an error occurred.
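For a storage that keeps each slot in memory, the 3-way shift can be done without copying any bytes by exchanging buffers through a scratch vector. The sketch below assumes the same std::map of byte vectors that the temp_storage example uses and three distinct slot indices. The function name rotate_slots3() is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <vector>

typedef std::map<int, std::vector<char> > slot_map;

// Moves slot1 -> slot2, slot2 -> slot3 and slot3 -> slot1.
// Assumes the three indices are distinct. Returns false on success,
// following the convention that true indicates an error.
bool rotate_slots3(slot_map& m, int slot1, int slot2, int slot3)
{
    std::vector<char> scratch;
    scratch.swap(m[slot3]);  // park slot3's data in the scratch buffer
    m[slot3].swap(m[slot2]); // slot2's data ends up in slot3
    m[slot2].swap(m[slot1]); // slot1's data ends up in slot2
    m[slot1].swap(scratch);  // the parked slot3 data ends up in slot1
    return false;
}
```

A file-backed storage cannot swap buffers like this and would read and write the data instead, which is what the scratch buffer approach described for swap_slots() amounts to.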
release_files()
virtual bool release_files () = 0;
This function should release all the file handles that it keeps open to files belonging to this storage. The default implementation just calls file_pool::release(this).
Returning true indicates an error occurred.
rename_file()
virtual bool rename_file (int index, std::string const& new_filename) = 0;
Renames the file with index index to the name new_filename. If there is an error, true should be returned.
delete_files()
virtual bool delete_files () = 0;
This function should delete all files and directories belonging to this storage.
Returning true indicates an error occurred.
The disk_buffer_pool is used to allocate and free disk buffers. It has the following members:
    struct disk_buffer_pool : boost::noncopyable
    {
        char* allocate_buffer(char const* category);
        void free_buffer(char* buf);
        char* allocate_buffers(int blocks, char const* category);
        void free_buffers(char* buf, int blocks);
        int block_size() const { return m_block_size; }
        void release_memory();
    };
disk_pool()
disk_buffer_pool* disk_pool ();
access global disk_buffer_pool, for allocating and freeing disk buffers
set_error()
void set_error (std::string const& file, error_code const& ec) const;
called by the storage implementation to set it into an error state. Typically whenever a critical file operation fails.
default_storage
Declared in "libtorrent/storage.hpp"
The default implementation of storage_interface. Behaves as a normal bittorrent client. It is possible to derive from this class in order to override some of its behavior, when implementing a custom storage.
    class default_storage : public storage_interface, boost::noncopyable
    {
        default_storage (file_storage const& fs, file_storage const* mapped
            , std::string const& path, file_pool& fp
            , std::vector<boost::uint8_t> const& file_prio);
        bool move_slot (int src_slot, int dst_slot);
        void hint_read (int slot, int offset, int len);
        bool rename_file (int index, std::string const& new_filename);
        void set_file_priority (std::vector<boost::uint8_t> const& prio);
        bool has_any_file ();
        int move_storage (std::string const& save_path, int flags);
        bool write_resume_data (entry& rd) const;
        int write (char const* buf, int slot, int offset, int size);
        int writev (file::iovec_t const* buf, int slot, int offset
            , int num_bufs, int flags = file::random_access);
        size_type physical_offset (int slot, int offset);
        bool release_files ();
        bool delete_files ();
        bool verify_resume_data (lazy_entry const& rd, error_code& error);
        int readv (file::iovec_t const* bufs, int slot, int offset
            , int num_bufs, int flags = file::random_access);
        bool swap_slots3 (int slot1, int slot2, int slot3);
        bool initialize (bool allocate_files);
        int read (char* buf, int slot, int offset, int size);
        bool swap_slots (int slot1, int slot2);
        int sparse_end (int start) const;
        file_storage const& files () const;
    };
default_storage()
default_storage (file_storage const& fs, file_storage const* mapped , std::string const& path, file_pool& fp , std::vector<boost::uint8_t> const& file_prio);
Constructs the default_storage based on the given file_storage (fs). mapped is an optional argument (it may be NULL). If non-NULL, it represents the file mappings that have been made to the torrent before adding it. That's where files are supposed to be saved and looked for on disk. path is the root save folder for this torrent. fp is the file_pool, the cache of file handles that the storage will use. All files it opens will ask the file_pool to open them. file_prio is a vector indicating the priority of files on startup. It may be an empty vector. Any file whose index is not represented by the vector (because the vector is too short) is assumed to have priority 1. This is used to treat files with priority 0 slightly differently.
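The rule for files that are not covered by the priority vector can be written as a small lookup. The helper name effective_priority() is hypothetical, and std::uint8_t stands in for boost::uint8_t so the sketch compiles without Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Priority of a file given a possibly short (or empty) priority vector.
// Files beyond the end of the vector are assumed to have priority 1.
int effective_priority(std::vector<std::uint8_t> const& prio, int file_index)
{
    assert(file_index >= 0);
    if (std::size_t(file_index) >= prio.size()) return 1;
    return prio[file_index];
}
```

With a vector holding the priorities 0 and 7, files 0 and 1 get those values and every later file gets 1.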
files()
file_storage const& files () const;
if the files in this storage are mapped, returns the mapped file_storage, otherwise returns the original file_storage object.
enum move_flags_t
Declared in "libtorrent/storage.hpp"
name | value | description |
---|---|---|
always_replace_files | 0 | replace any files in the destination when copying or moving the storage |
fail_if_exist | 1 | if any of the files that we want to copy exist in the destination, fail the whole operation and don't perform any copy or move. There is an inherent race condition in this mode: the files are checked for existence before the operation starts. In between the check and performing the copy, the destination files may be created, in which case they are replaced. |
dont_replace | 2 | if any file exists in the target, take that file instead of the one we may have in the source. |
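The three modes can be read as a per-file decision made before anything is moved. The sketch below repeats the enum values from the table and turns a list of "destination already exists" flags into a plan. The file_action enum and plan_move() are hypothetical helpers for illustration and do not touch the filesystem, so the race condition mentioned for fail_if_exist is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Values as listed in the table above.
enum move_flags_t { always_replace_files = 0, fail_if_exist = 1, dont_replace = 2 };

// Hypothetical per-file outcome of planning a move.
enum file_action { move_file, keep_destination };

// dest_exists[i] says whether file i already exists at the destination.
// Returns false if the whole operation should be abandoned (in this sketch
// false means failure), otherwise fills actions with one entry per file.
bool plan_move(std::vector<bool> const& dest_exists, move_flags_t flags
    , std::vector<file_action>& actions)
{
    actions.clear();
    if (flags == fail_if_exist)
    {
        // a single existing destination file aborts the whole operation
        for (std::size_t i = 0; i < dest_exists.size(); ++i)
            if (dest_exists[i]) return false;
    }
    for (std::size_t i = 0; i < dest_exists.size(); ++i)
    {
        bool const keep = flags == dont_replace && dest_exists[i];
        actions.push_back(keep ? keep_destination : move_file);
    }
    return true;
}
```

With one of two files already present at the destination, always_replace_files moves both, fail_if_exist plans nothing, and dont_replace moves only the file that is missing.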