Technical Overview of OFF

From OFFWiki

The Technology

A technical document detailing the basis and implementation of OFF: The Owner-Free Filing System / Owner Free File System (we never did settle on one or the other). Thus, these two terms (OFF, OFF System) can and will be used interchangeably. Now, on with the show.

A Basic Introduction

Basic synopsis: The Owner Free Filesystem (OFF) is a system in which all files inserted are given several additional representations via multiple use encoding and exhibit a digital state of being that has depreciated the file’s status to “ownerless” and no longer belonging to one person. A file inserted into OFF will intrinsically become part of several other files, and a block’s (see technical details) representation depends solely on its coupling with other blocks. This is known as a ‘brightnet’ where nothing held within an “OFF Cache” can be considered copyright infringing, and thus can be shared and traded freely without fear of repercussions.

On Blocks

At its very core, everything within an “OFF Cache” is a block – a uniformly sized file of 128KB. This file is ownerless; it neither represents nor is a derived work of the original file from which it was spawned. Since all blocks within OFF are uniform, they can be coupled in an infinite number of ways to produce an infinite number of files.

Abstraction

This concept is akin to a bucket of building blocks of uniform size, building blocks that cannot be owned by any one person or persons.

While it is true that certain representations derived from these building blocks – the Mona Lisa or a wholly unique design – may be “owned” by one person, the actual blocks themselves cannot be owned. Thus, a representation may be depreciated – taking a sledgehammer to the aforementioned Mona Lisa – and its constituent parts may be traded and shared freely among others. This is in no way illegal.

However, given the correct “recipe,” you may be able to reconstruct an owned representation from the blocks someone gives you. In OFF, these recipes are known as descriptor blocks. However, since descriptor blocks are in no way different from any other block in the cache, it is impossible to know whether any block given to you represents a descriptor block as opposed to part of another design. In fact, it is highly possible that a given block will represent BOTH a descriptor block AND another competing design. This is multiple use encoding.

OFF applies a similar, digital mechanism to depreciate files into an ownerless format. See, ‘On File Insertion.’

On File Insertion.

Block Creation.

The process of depreciating a block in OFF is extremely simple. The targeted file to be inserted is first divided into 128KB blocks; if any block is less than 128KB in length, then it is to be padded with random data in whatever fashion the implementor sees fit.
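
The splitting step can be sketched as follows. This is only an illustration, not the reference implementation: the function name `split_into_blocks` is made up for this example, and padding with `os.urandom` is just one of the ways an implementor might choose to pad.

```python
import os
from typing import List

BLOCK_SIZE = 128 * 1024  # 128KB, the uniform OFF block size


def split_into_blocks(data: bytes) -> List[bytes]:
    """Divide data into 128KB Original Blocks, padding a short final block."""
    blocks = []
    for offset in range(0, len(data), BLOCK_SIZE):
        chunk = data[offset:offset + BLOCK_SIZE]
        if len(chunk) < BLOCK_SIZE:
            # The padding scheme is left to the implementor; random bytes
            # from the operating system are an assumption of this sketch.
            chunk += os.urandom(BLOCK_SIZE - len(chunk))
        blocks.append(chunk)
    return blocks
```

Because the padding is discarded again at retrieval time, the original file size in bytes has to be remembered alongside the blocks.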

These are the ‘original’ blocks. Although they have been separated, they still resemble the file from which they were derived. They will be referred to as “Original Blocks.”

Depreciation

Each block will now be depreciated using a method similar to the one-time pad cipher used during World War II. Using this method, any given block in an OFF Cache can be used in any given file in existence; it needs only to be paired with the proper constituent blocks needed to derive any given file. It is a simple concept that is detailed in a more robust fashion, here: http://thebighack.org/papers/CopyNumbCJ.pdf (! LINK IS NOT AVAILABLE)

In the implementor’s cache, the implementation will pick one block, preferably selected at random, to be referred to as “Random Block A.” The hash of this block will be stored for later use in assembling the descriptor block(s).

Next, a byte-for-byte XOR operation will be performed on Original Block against Random Block A. The derived block of data is Result A.

The implementation will now pick another block from its cache, preferably randomly selected. This will be Random Block B. The hash of Random Block B will be stored for later use in assembling the descriptor block(s).

Result A and Random Block B will be XOR’d against each other in the fashion described above. The result, Result B, will be hashed in memory and written to the filesystem under a filename consisting of its SHA-1 hash followed by the extension “.ofb” (OFF File Block).

The hash of Result B will be stored for later use in assembling the descriptor block(s).

The resulting file is placed into the cache.

Once the file insertion process has been completed, one or more descriptor blocks are created and inserted into the cache like any other block would be (see ‘On Descriptor Blocks’). The hashes of the constituent blocks needed to reconstruct the descriptor block (Random Block A, Random Block B, and Result B) and the original file size (in bytes) are then passed to the user in a preformatted form of the implementation’s choosing, for usability’s sake. NOTE: It does not matter what order the hashes are passed in.

The user will query your implementation with this preformatted form in order to retrieve the inserted file when ready.

This method of depreciating content creates a many-to-one encoding scheme; each block inserted into the cache will create a new interpretation for two existing blocks within the cache, and when the new block is used, it will be given another interpretation for another block, and so on, ad infinitum. Due to the uniformity of creation, any block can be used in any file if it is paired with the appropriate constituent blocks.

Here is an ASCII representation of the process:

Original Block (XOR) Random Block A
               |
            DERIVES
               |
    Result A (XOR) Random Block B
               |
            DERIVES
               |
    Result B (Final Result).
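
The process in the diagram can be sketched in a few lines. This is a minimal illustration under stated assumptions: the cache is modelled as an in-memory dict from SHA-1 hex digest to block bytes (on disk each entry would be a file named `<hash>.ofb`), the cache already holds at least two blocks, Random Block A and Random Block B are forced to be distinct (otherwise the two XORs would cancel out), and the names `xor_blocks` and `depreciate` are invented for this example.

```python
import hashlib
import random
from typing import Dict, Tuple

BLOCK_SIZE = 128 * 1024  # 128KB


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-for-byte XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    value = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return value.to_bytes(len(a), "big")


def depreciate(original: bytes, cache: Dict[str, bytes]) -> Tuple[str, str, str]:
    """Store Result B in the cache and return the 3-tuple of hashes."""
    # Pick two distinct blocks at random (SystemRandom is an assumption).
    hash_a, hash_b = random.SystemRandom().sample(sorted(cache), 2)
    result_a = xor_blocks(original, cache[hash_a])   # Original XOR Random Block A
    result_b = xor_blocks(result_a, cache[hash_b])   # Result A XOR Random Block B
    hash_r = hashlib.sha1(result_b).hexdigest()      # the block's name in the cache
    cache[hash_r] = result_b
    return (hash_a, hash_b, hash_r)
```

XOR-ing the three blocks named by the returned hashes, in any order, yields the Original Block again. The Original Block itself is never stored.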

On Descriptor Blocks

Descriptor blocks are unique blocks that are not unique. They are the same as every other block in OFF – they are useless. In fact, Descriptor Blocks only exist when they are derived from other blocks in the OFF Cache. There is no way to tell if a derived block of data is a descriptor block or not.

Once a Descriptor Block has been composed, it is inserted into the OFF Cache just like any other file would be, and immediately becomes depreciated, ownerless content. It is derived by passing its constituent blocks (the ones that are required to create it) to an implementation, which assumes that the result is a valid descriptor block. If it’s not, well, you’re screwed. Redundancy is probably in order, here.

Descriptor Blocks should in no way shape or form be permanent – they should always be transient, never saved to disk or written elsewhere.

Also, since a descriptor is made up entirely of hashes and random padding, it is impossible to tell one apart from gibberish without trying to dereference other blocks (hashes are essentially indistinguishable from random data). This makes it semi-resistant to brute force attacks.

Assembling

Assembling a Descriptor Block is a straightforward process, but keep a few things in mind: if a file is larger than 279,424KB (2183 blocks of 128KB each, roughly 272.9MB at 1024KB per MB), additional descriptor blocks must be created and daisy-chained. There is space reserved at the end of every Descriptor Block for a 3-tuple that may describe how to derive the next descriptor block in the daisy-chain, if there is one. To determine how many descriptor blocks a file will need, divide the number of 128KB blocks in the file by the 2183 blocks a single descriptor block can describe, and round up.

A Descriptor Block is filled with hashes, in 3-tuples. Groups of three. As you know, blocks in OFF are named after their SHA-1 hash. When referring to hashes within a Descriptor Block, the filenames of a block can easily be inferred simply by adding the extension (“.ofb”) to the end of any hash encountered.

There is a total possibility of 2184 3-tuples stored within one descriptor block. Note that only 2183 3-tuples are available to describe the file itself; there is one 3-tuple reserved at the very end of the file in the event that the total file size cannot be represented within one descriptor block.
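
The capacity figures can be checked with a little arithmetic. The sketch below assumes 128KB means 131,072 bytes and that each SHA-1 hash is stored as 20 raw bytes; the constant and function names are invented for this example, and treating an empty file as needing one descriptor block is likewise an assumption.

```python
BLOCK_SIZE = 128 * 1024                           # 131,072 bytes
HASH_SIZE = 20                                    # raw SHA-1 digest
TUPLE_SIZE = 3 * HASH_SIZE                        # 60 bytes per 3-tuple
TUPLES_PER_DESCRIPTOR = BLOCK_SIZE // TUPLE_SIZE  # 2184 slots in total
DATA_TUPLES = TUPLES_PER_DESCRIPTOR - 1           # 2183, one slot is reserved
MAX_BYTES_PER_DESCRIPTOR = DATA_TUPLES * BLOCK_SIZE  # 286,130,176 bytes


def descriptors_needed(file_size: int) -> int:
    """Number of daisy-chained descriptor blocks for a file of this size."""
    block_count = -(-file_size // BLOCK_SIZE)     # ceiling division
    return max(1, -(-block_count // DATA_TUPLES))
```

The 2184 slots occupy 131,040 bytes, which leaves 32 bytes of each descriptor block unused and available for padding.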

It does not matter what order hashes in a 3-tuple are written in, so long as all constituent hashes are written together in the 3-tuple.

Hashes written into a Descriptor Block must be written in their native binary format, not as hexadecimal text; this makes each SHA-1 hash 20 bytes long, and an entire 3-tuple of hashes 60 bytes.

Like any other block in OFF, a Descriptor Block must be padded if it is shorter than 128KB.

Always note that there is one 3-tuple reserved at the very end of a descriptor block. One 3-tuple must always be reserved, in the event it becomes necessary to describe a 3-tuple for deriving the next descriptor block in line.
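
One possible way to lay out a Descriptor Block is sketched below. Several details are assumptions of this example rather than anything fixed by this document: the reserved 3-tuple is placed in the 2184th slot (byte offset 130,980), unused slots and the trailing 32 bytes are filled with random padding, hashes are passed in as hex strings and stored as raw bytes, and the name `build_descriptor` is made up.

```python
import os
from typing import List, Optional, Tuple

BLOCK_SIZE = 128 * 1024
HASH_SIZE = 20
TUPLE_SIZE = 3 * HASH_SIZE
DATA_TUPLES = BLOCK_SIZE // TUPLE_SIZE - 1   # 2183 usable 3-tuples
NEXT_OFFSET = DATA_TUPLES * TUPLE_SIZE       # assumed position of the reserved 3-tuple

HashTuple = Tuple[str, str, str]             # three SHA-1 hex digests


def pack_tuple(hashes: HashTuple) -> bytes:
    """Convert three hex digests into 60 raw bytes."""
    return b"".join(bytes.fromhex(h) for h in hashes)


def build_descriptor(tuples: List[HashTuple],
                     next_tuple: Optional[HashTuple] = None) -> bytes:
    """Assemble one 128KB Descriptor Block from up to 2183 3-tuples."""
    if len(tuples) > DATA_TUPLES:
        raise ValueError("too many 3-tuples for one descriptor block")
    body = b"".join(pack_tuple(t) for t in tuples)
    body += os.urandom(NEXT_OFFSET - len(body))      # pad unused data slots
    # The reserved slot holds the next descriptor's 3-tuple, or padding.
    tail = pack_tuple(next_tuple) if next_tuple else os.urandom(TUPLE_SIZE)
    block = body + tail
    return block + os.urandom(BLOCK_SIZE - len(block))  # pad to 128KB
```

Since unused space holds random bytes, a reader can only tell where the real hashes stop by using the original file size supplied by the user.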

Daisy Chaining.

At the very end of each Descriptor Block there lies a reserved space for a 3-tuple to describe the next Descriptor Block in line, if any additional Descriptor Blocks are needed.

Daisy-chained descriptors must be inserted in reverse order: the hashes of a referenced Descriptor Block must exist before they can be stored in the block that refers to it.
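
The reverse-order requirement can be illustrated with a rough sketch. It assumes an in-memory cache (a dict from SHA-1 hex digest to block bytes) that already holds at least two blocks, a reserved slot located right after the 2183 data slots, and random padding; every function name here is hypothetical.

```python
import hashlib
import os
import random

BLOCK_SIZE = 128 * 1024
TUPLE_SIZE = 60
DATA_TUPLES = BLOCK_SIZE // TUPLE_SIZE - 1  # 2183


def xor_blocks(a, b):
    return (int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).to_bytes(len(a), "big")


def insert_block(block, cache):
    """Depreciate a block against two random cache blocks; return its 3-tuple."""
    hash_a, hash_b = random.sample(sorted(cache), 2)
    result = xor_blocks(xor_blocks(block, cache[hash_a]), cache[hash_b])
    hash_r = hashlib.sha1(result).hexdigest()
    cache[hash_r] = result
    return (hash_a, hash_b, hash_r)


def pack(hashes):
    return b"".join(bytes.fromhex(h) for h in hashes)


def build_descriptor(group, next_tuple):
    body = b"".join(pack(t) for t in group)
    body += os.urandom(DATA_TUPLES * TUPLE_SIZE - len(body))
    block = body + (pack(next_tuple) if next_tuple else b"")
    return block + os.urandom(BLOCK_SIZE - len(block))


def chain_descriptors(data_tuples, cache):
    """Insert descriptors last-to-first; return the first descriptor's 3-tuple."""
    groups = [data_tuples[i:i + DATA_TUPLES]
              for i in range(0, len(data_tuples), DATA_TUPLES)] or [[]]
    next_tuple = None
    for group in reversed(groups):
        # The later descriptor is inserted first, so its hashes are known
        # by the time the earlier descriptor needs to reference them.
        next_tuple = insert_block(build_descriptor(group, next_tuple), cache)
    return next_tuple
```

The 3-tuple returned at the end describes the first Descriptor Block in the chain, which is the one handed to the user.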

Parsing

Parsing a descriptor block is simple: hashes are stored in 3-tuples, left to right. Assuming the hashes A, B, and C were stored in the first 3-tuple within the Descriptor Block, fetch the three blocks they name and XOR them together to assemble the first Original Block:

A (XOR) B = PRODUCT; PRODUCT (XOR) C = ORIGINAL BLOCK.

There is space reserved for one 3-tuple at the very end of a Descriptor Block. If the constituent hashes of the original file are too big to fit within one Descriptor Block, another will be allocated and daisy-chained. Simply derive the next Descriptor Block in line from the last 3-tuple and continue on.

Use redundancy when parsing a Descriptor Block: since there are no unique identifiers or headers it is completely possible that the user will try to derive a descriptor block from a series of blocks that may not describe an actual descriptor block, and so the result will be a block that is seemingly full of “garbage.”
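
A parser along these lines might look as follows. This is a sketch under assumptions: the number of blocks still to be described is computed from the user-supplied file size, the reserved 3-tuple sits in the slot right after the 2183 data slots, and the membership check against the cache is just one cheap form of the redundancy suggested above. The names `parse_descriptor` and `all_present` are invented.

```python
from typing import Dict, List, Optional, Tuple

BLOCK_SIZE = 128 * 1024
HASH_SIZE = 20
TUPLE_SIZE = 3 * HASH_SIZE
DATA_TUPLES = BLOCK_SIZE // TUPLE_SIZE - 1  # 2183

HashTuple = Tuple[str, ...]


def read_tuple(block: bytes, offset: int) -> HashTuple:
    """Read three consecutive 20-byte hashes as hex digests."""
    return tuple(block[offset + i * HASH_SIZE: offset + (i + 1) * HASH_SIZE].hex()
                 for i in range(3))


def parse_descriptor(block: bytes,
                     blocks_remaining: int) -> Tuple[List[HashTuple], Optional[HashTuple]]:
    """Return the data 3-tuples and, if the chain continues, the next 3-tuple."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("not a 128KB block")
    count = min(blocks_remaining, DATA_TUPLES)
    tuples = [read_tuple(block, i * TUPLE_SIZE) for i in range(count)]
    next_tuple = None
    if blocks_remaining > DATA_TUPLES:
        next_tuple = read_tuple(block, DATA_TUPLES * TUPLE_SIZE)
    return tuples, next_tuple


def all_present(tuples: List[HashTuple], cache: Dict[str, bytes]) -> bool:
    """Sanity check: a block of garbage will rarely name blocks we can find."""
    return all(h in cache for t in tuples for h in t)
```

In a networked setting a missing hash may only mean the block has not been fetched yet, so a failed check is a hint rather than proof of a bad descriptor.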

When you’re done

After creation, a descriptor block should be inserted into the implementation’s cache, and the information needed to recreate it should be output in a preformatted form (this has its own section).

On File Retrieval

So, the user wants his file back. This is called ‘File Retrieval.’ The user supplies a preformatted link that describes the hashes of the three constituent blocks used to recreate the descriptor block, and the original file size, in bytes. The user may also supply a file name; this is what he wants the file to be saved as.

The supplied file-size will be used at the very end to crop any excess data from the derived file.

Deriving the Descriptor Block

The preformatted link will provide three hashes. Fetch the three blocks named by these hashes and XOR them left to right; the final output will be the descriptor block. Assuming that the hashes A, B, and C were provided in the link:

A (XOR) B = PRODUCT; PRODUCT (XOR) C = DESCRIPTOR BLOCK.

Parse the Descriptor Block and attain the list of hashes needed to reconstruct the entire file. The descriptor block should not be written to disk; its presence should be transient and disposed of when the requested file has been retrieved.

Deriving an original block – many to one decoding

To produce an original block, parse the first 3-tuple of hashes within the descriptor block(s), then perform a XOR transformation on the blocks named by these hashes, left to right; the final output will be the original block. Assuming the hashes A, B, and C were provided in a 3-tuple (geez, doesn’t this sound familiar yet?):

A (XOR) B = PRODUCT; PRODUCT (XOR) C = ORIGINAL BLOCK.

Append all Original Blocks together in the order that they were created, using the file size supplied by the user to shave off any excess garbage data that will be present at the very end of the file.
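
Decoding and cropping can be sketched as below. The cache is again modelled as a dict from hash to block bytes, which is an assumption of this example, as are the names `derive_block` and `rebuild_file`.

```python
from typing import Dict, Iterable, Tuple

BLOCK_SIZE = 128 * 1024


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-for-byte XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    value = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return value.to_bytes(len(a), "big")


def derive_block(hashes: Tuple[str, str, str], cache: Dict[str, bytes]) -> bytes:
    """A (XOR) B = PRODUCT; PRODUCT (XOR) C = ORIGINAL BLOCK."""
    a, b, c = (cache[h] for h in hashes)
    return xor_blocks(xor_blocks(a, b), c)


def rebuild_file(tuples: Iterable[Tuple[str, str, str]],
                 cache: Dict[str, bytes], file_size: int) -> bytes:
    """Append the derived blocks in order and crop the padding."""
    data = b"".join(derive_block(t, cache) for t in tuples)
    return data[:file_size]
```

Because XOR is commutative and associative, the three blocks of a 3-tuple can be combined in any order, which is why the order of hashes within a 3-tuple does not matter.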

Supply file to user

As you may have already noticed, there is no need to propagate the information the user supplies to other OFF Caches that may be available in a networked environment, as it would serve no point—all necessary information is either provided by the user or discerned by deriving a descriptor block.

On Keeping your OFF Cache Happy and Healthy

Although your OFF Cache may grow big and fat and seems in dire need of exercise via “del *.ofb”, please don’t – it is not healthy for your OFF Cache, nor for people storing files in your cache. Selectively trimming off weight on your OFF Cache should be calculated by keeping statistics: which blocks are important, which blocks haven’t been accessed in a long (LONG!) time, which blocks are brand-spanking-new. If you have to shave some fat off your OFF Cache, first try to liposuction the most “unused” and “unimportant” blocks. The implementation of this is up to you.

Feed your OFF Cache regularly! If the user is neglecting his network-aware cache (p2p, distributed, etc), then at predefined intervals it might be smart to grab one or two “popular” blocks from another OFF Cache nearby to keep your OFF Cache fat and happy.

Keep your OFF Cache safe from viruses and STDs! Since blocks in the OFF Cache are named by their SHA-1, if you receive a block whose filename does not match its SHA-1 hash, please dispose of it in a safe and timely manner. If your OFF implementation is network-aware, it would be wise to snub the host that gave you the corrupted block.

Keep your OFF Cache surprised! Try to pick blocks at random when you are inserting a file, as a pattern can lead to several blocks not being associated with another relevant interpretation and thus being “useless.”
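
The integrity check on received blocks might be sketched like this. The function name `is_valid_block` is hypothetical, and rejecting blocks that are not exactly 128KB is an extra check assumed here on the grounds that all OFF blocks are uniform in size.

```python
import hashlib
import os

BLOCK_SIZE = 128 * 1024


def is_valid_block(filename: str, data: bytes) -> bool:
    """True if the data is 128KB and its SHA-1 matches the .ofb filename."""
    stem, ext = os.path.splitext(os.path.basename(filename))
    if ext != ".ofb" or len(data) != BLOCK_SIZE:
        return False
    return hashlib.sha1(data).hexdigest() == stem.lower()
```

A network-aware implementation could run this on every incoming block and stop talking to any host that fails it.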

On Networking OFF

There is no predefined network-based vector for OFF, simply because there is no need for one. As long as blocks in OFF are uniformly constructed, as per this document, they can be transferred over any medium the implementation sees fit: piggy-backing over existing networks, using existing static webhosts, or developing a network solely for the transfer of OFF blocks. The very nature of OFF’s multi-use encoding makes this an inherently legal practice.

Therefore, there is no need for proxying and anonymity.

This is a ‘brightnet.’
