Building a distributed bittorrent-SQL database

I have an idea for a distributed SQL database that uses the BitTorrent protocol for pulling and writing its data. For the sake of argument, let's say this is a messaging application, where thousands of users run a program that contains a message window and an input box for writing messages. Each message written does an INSERT into the user's own sqlite DB.

How it could be done

- Download a .torrent file that essentially contains the schema/DDL for creating the DB, and create the DB on the local machine.
- Any time a 'write' action happens (say a user wants to send a message), that INSERT line (which is kind of like a delta) does two things:
  - Writes to the user's own internal DB
  - Creates a .torrent file out of that line, named something like messaging-[my-ip]-[UTC_timestamp].torrent, and posts it to a tracker
- Everyone running the app continually scans the tracker for files with this naming pattern (possibly only those after a certain date), downloads each .torrent and hosts it, and runs the INSERT commands on their local DB.

What you'd then have is a ton of delta-files, all P2P hosted for redundancy, updating local .sqlite DBs on a lot of machines.
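The write path described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `messages` table, the helper names `open_db`, `write_message` and `apply_delta`, and the JSON payload format are all hypothetical. The sketch also ships the row values rather than the raw INSERT text, so that peers do not execute SQL received from strangers; that is a substitution, not part of the original idea. Creating the .torrent and posting it to a tracker are left out.

```python
import json
import sqlite3
import time

# Hypothetical schema; in the idea above it would arrive via the schema torrent.
SCHEMA = "CREATE TABLE IF NOT EXISTS messages (sender TEXT, sent_at INTEGER, body TEXT)"


def open_db(path=":memory:"):
    """Open (or create) the local sqlite DB and make sure the table exists."""
    db = sqlite3.connect(path)
    db.execute(SCHEMA)
    return db


def write_message(db, sender_ip, body, now=None):
    """Apply the INSERT locally and return (delta_name, delta_payload) to publish."""
    sent_at = int(time.time()) if now is None else now
    row = (sender_ip, sent_at, body)
    db.execute("INSERT INTO messages VALUES (?, ?, ?)", row)
    db.commit()
    # Name follows the messaging-[my-ip]-[UTC_timestamp] pattern from the list above.
    name = f"messaging-{sender_ip}-{sent_at}"
    payload = json.dumps(row).encode("utf-8")
    return name, payload


def apply_delta(db, payload):
    """Replay a delta received from a peer against the local DB."""
    sender_ip, sent_at, body = json.loads(payload.decode("utf-8"))
    db.execute("INSERT INTO messages VALUES (?, ?, ?)", (sender_ip, sent_at, body))
    db.commit()
```

A peer would call `write_message` on its own DB, publish the returned payload, and every other peer would feed that payload to `apply_delta`. Ordering, duplicates and conflicting writes are not handled here.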

Some issues I'm having

- How do I scrape for torrents of a certain file name? I've read through the HTTP BitTorrent tracker spec, but you only seem to be able to query torrents by their specific info hash. Is there no way to query for a group of files, or by file name?
- How do I download a .torrent file from a tracker? Will I need to host the files on a centralized server, or can I use the tracker to download the files in some way? And if I have to host the .torrent files myself...
  - Wouldn't this defeat the purpose of a decentralized DB, since if my website goes down, the application would stop getting updates?
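To make the first issue concrete, here is a minimal sketch of how a tracker scrape request is keyed, as far as I understand the convention: the tracker is asked about a 20-byte SHA-1 hash of the bencoded info dictionary, and the scrape URL is derived by swapping `announce` for `scrape` in the last path component. The helper names (`bencode`, `info_hash`, `scrape_url`), the piece length and the example tracker URL are assumptions for illustration only.

```python
import hashlib
from urllib.parse import quote


def bencode(value):
    """Minimal bencoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(name, data, piece_length=16384):
    """SHA-1 of the bencoded info dictionary of a single-file torrent."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {"name": name, "length": len(data),
            "piece length": piece_length, "pieces": pieces}
    return hashlib.sha1(bencode(info)).digest()


def scrape_url(announce_url, hashes):
    """Build a scrape URL; assumes the announce URL has no query string."""
    base, _, last = announce_url.rpartition("/")
    if not last.startswith("announce"):
        raise ValueError("tracker does not follow the scrape convention")
    query = "&".join("info_hash=" + quote(h, safe="") for h in hashes)
    return f"{base}/scrape{last[len('announce'):]}?{query}"
```

Because the name is part of the hashed dictionary, two deltas that differ only in file name get unrelated hashes, so the request has nothing to match a name pattern against.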

Thanks for the help in advance.

asked Jan 30 '15 17:01

thouliha

1 Answer

BitTorrent is designed for distributing immutable and fairly large data sets, and it has no operations that span multiple torrents. Databases are mostly about mutating relatively small chunks of data and performing operations over diverse subsets of them.

You will have little joy trying to shoehorn database semantics into BitTorrent.

At best you can use it for distributing snapshots of a database.
With a little tinkering, BitTorrent can be fairly good at recycling data from previous torrents if the new content only adds or removes files (again, of significant size) without modifying old ones.
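A toy sketch of why append-only snapshots recycle well while modifications do not, assuming fixed-size pieces hashed with SHA-1 over the concatenated content. The function names and the tiny piece length are made up for the illustration.

```python
import hashlib


def piece_hashes(data, piece_length):
    """SHA-1 of each fixed-size piece of the content."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def reusable_pieces(old, new, piece_length=4):
    """Count pieces at the same index whose hashes match in both snapshots."""
    old_hashes = piece_hashes(old, piece_length)
    new_hashes = piece_hashes(new, piece_length)
    return sum(1 for a, b in zip(old_hashes, new_hashes) if a == b)


old_snapshot = b"AAAABBBBCCCC"
print(reusable_pieces(old_snapshot, old_snapshot + b"DDDD"))  # appended data
print(reusable_pieces(old_snapshot, b"X" + old_snapshot))     # one byte inserted at the front
```

Appending leaves every existing piece intact, while a single inserted byte shifts all piece boundaries so nothing from the old snapshot can be reused as-is.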

Anything beyond that would require significant modifications to the protocol; it wouldn't really be vanilla BitTorrent anymore.

answered Sep 29 '22 11:09

the8472

Tags: sql, database, sqlite, bittorrent, bittorrent-sync