Private document collaboration


Create and share documents only among a selected group of accounts so that they can review and collaborate.

Alex Burdiyan, Julio, Gabo H Beaumont & horacio

11 October 2025, 07:13


We are breaking down this project into a few building blocks:

Private Documents: Phase 1

...

Definition

Users are eager to start collaborating on private documents. Whether the privacy is permanent (the document contains sensitive information) or transitory (a shared draft that needs to meet some quality standards before it gets published), we need to control who can view the document and how it is distributed.

Solution

Every private document is flagged on the document itself. This way the document's creator marks it as private in a backwards-compatible way. The rest of the nodes will see that flag and act accordingly. The hypermedia protocol version needs to be bumped so that updated nodes (those that recognize the new private flag on documents) don't exchange private information with old nodes.

This flagging occurs at the time of publishing the document (it does not make sense to do it before as drafts are private by nature).

If the user trusts a specific server, they can push to that server (even hyper.media) and the server will get the document. Servers will never render a private document, but they can help with distribution. Again, if the server itself has not upgraded to the new protocol, the document can't be pushed there, no matter how much the user trusts that server.

If the author does not trust any server then the document will only be distributed p2p.
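
The distribution rules above can be sketched as a small predicate. This is only an illustration under assumptions: `Peer`, `MIN_PRIVATE_PROTOCOL`, and `may_receive` are hypothetical names, and the version numbers are placeholders rather than real protocol versions.

```python
from dataclasses import dataclass

# Hypothetical: first protocol version that understands the private flag.
MIN_PRIVATE_PROTOCOL = 2


@dataclass(frozen=True)
class Peer:
    peer_id: str
    protocol_version: int
    is_server: bool = False


def may_receive(peer: Peer, doc_is_private: bool, trusted_servers: frozenset) -> bool:
    """Decide whether a document may be sent to `peer`."""
    if not doc_is_private:
        return True  # public documents are unaffected by the new flag
    if peer.protocol_version < MIN_PRIVATE_PROTOCOL:
        return False  # old nodes would ignore the flag, so never send to them
    if peer.is_server:
        return peer.peer_id in trusted_servers  # servers need explicit trust
    return True  # upgraded p2p peer; read capabilities are checked elsewhere
```

With an empty `trusted_servers` set, the document can only travel between upgraded peers, which matches the "p2p only" case.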

The user can republish at any time to a newly trusted server.
In order to avoid path collisions among private documents, the path should be obscure (random characters up to the first non-private directory). If the author wants to make the document public in the future, they can republish it to a clear path.

This also helps with anonymity since the document path could leak unintended information about the document itself.
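
A minimal sketch of generating such an obscure path, assuming the random part is a single URL-safe segment placed under the closest non-private directory. The function name `obscure_path` and the 16-byte size are assumptions made for illustration.

```python
import secrets


def obscure_path(public_parent: str = "", nbytes: int = 16) -> str:
    """Return a path whose private segment is random.

    `public_parent` is the closest non-private directory (may be empty).
    16 random bytes is an assumed size, large enough that collisions and
    guessing are both impractical.
    """
    segment = secrets.token_urlsafe(nbytes)  # URL-safe Base64, no padding
    return f"{public_parent.rstrip('/')}/{segment}"
```

For example, `obscure_path("/projects")` returns `/projects/` followed by 22 random characters, so the path itself says nothing about the document's content.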

Once an author has marked a document with this flag and the document is ready to be distributed (along with its associated resources, such as comments and images), the author needs to create a capability that grants read permission.

We could create a compound capability, which would include a list of accounts to be given access to a certain document. The problem with such a capability is that it still carries a single role, so if we want to give read access to Alice, Carol and Bob but write access just to Alice and Bob, we would need two capabilities.
Nevertheless, if we have a sorted list of delegates inside a compound capability, the syncing flow would be as follows:

1. Alice notices she has some private documents to sync with Bob.

2. Alice sends Bob a list of capability IDs for the documents she wants to sync to him.

3. Alice also sends Bob a nonce.

4. Bob learns that Alice must be behind one of the accounts that have access to those paths, since she is willing to share those documents with him. A shady Bob can search locally for capabilities that grant permission to those accounts and paths and then find the common denominator (if any), so that Bob learns one account controlled by Alice.

5. Bob looks up local capabilities matching the IDs Alice provided in step 2. (Alice could be making things up to learn about Bob's accounts, hence the lookup.)

6. For each capability ID, Bob computes a ring signature that includes the given nonce.

7. Alice checks that the ring signature is valid for the recipient set of each capability involved (step 2). Alice could learn which account is Bob's by finding the common denominator among capabilities. If this is important, we could add phantom accounts when we create a compound capability (random accounts, so that finding the common denominator is more difficult).
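
The "common denominator" inference mentioned in this flow is just a set intersection over the recipient sets of the capabilities involved. The sketch below uses made-up account names and a hypothetical helper `common_accounts`; it does not implement ring signatures.

```python
from functools import reduce


def common_accounts(recipient_sets):
    """Accounts present in every recipient set (the 'common denominator')."""
    sets = [set(s) for s in recipient_sets]
    return reduce(set.intersection, sets) if sets else set()


# Recipient sets of three capabilities that Alice offered to sync.
caps = [
    {"alice", "bob", "carol"},
    {"alice", "bob", "dave"},
    {"alice", "bob", "erin"},
]

# Bob removes himself and is left with a single candidate for Alice.
print(common_accounts(caps) - {"bob"})  # {'alice'}
```

The more capabilities two peers exchange, the smaller the intersection tends to get, which is why this leak matters in both directions.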

If leaking the list of all delegates is a problem (Alice can see that Bob, Carol and Dave are discussing something behind her back), we could keep using single capabilities (one per delegate), but encrypted for each delegate, so that delegates can present them to a syncing peer on request. These encrypted capabilities are freely distributed and stored by all peers in the network, like regular capabilities, but only the chosen delegate can actually open one and prove things with it. The flow would then be:

1. Alice notices she has some private documents to sync with Bob.

2. Alice sends Bob a list of document IDs (account + obfuscated path) for the documents she wants to sync to him.

3. Bob searches locally for capabilities he can decrypt and looks in those capabilities to see whether their account + obfuscated path is in the list Alice sent in step 2.

4. Bob presents the matching plain capabilities to Alice.

5. Alice syncs the blobs according to the capabilities presented to her. Alice learns that Bob has that capability for that document, but she is also in the group, so "no big deal" (of course, we won't leak any of this information to the frontend, so in practice Alice would never know it, but technically a malicious Alice running proprietary software could). If it is actually a big deal, then we could make it so that only a trusted site can distribute private content, and delegates would need to trust that site. That way delegates would not know about each other (a bit like bcc in email).
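
Bob's side of this flow can be sketched as follows. Everything here is hypothetical: the record shapes, the names, and especially `try_open`, which is a stand-in for real public-key decryption and only compares key identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Capability:
    account: str   # account that owns the document
    path: str      # obfuscated path
    delegate: str  # account being granted access
    role: str      # "read" or "write"


@dataclass(frozen=True)
class SealedCapability:
    # Placeholder for a ciphertext: in this sketch the payload simply
    # "opens" for the matching key ID instead of being encrypted.
    recipient_key: str
    payload: Capability


def try_open(sealed: SealedCapability, my_key: str) -> Optional[Capability]:
    """Stand-in for decryption; returns None when the key does not match."""
    return sealed.payload if sealed.recipient_key == my_key else None


def capabilities_to_present(store, my_key, offered_doc_ids):
    """Open what we can and keep capabilities for the offered documents."""
    offered = set(offered_doc_ids)
    opened = (try_open(s, my_key) for s in store)
    return [c for c in opened if c is not None and (c.account, c.path) in offered]


store = [
    SealedCapability("bob-key", Capability("alice", "x7Qp", "bob", "read")),
    SealedCapability("carol-key", Capability("alice", "x7Qp", "carol", "read")),
    SealedCapability("bob-key", Capability("alice", "k2Lm", "bob", "write")),
]
presented = capabilities_to_present(store, "bob-key", [("alice", "x7Qp")])
print([(c.delegate, c.role) for c in presented])  # [('bob', 'read')]
```

Bob stores Carol's sealed capability like any other blob but cannot open it, and he only reveals capabilities for documents Alice already named.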

This way, users who want to collaborate with chosen peers just need to add read capabilities for their accounts, and all devices under those accounts will have access to the document.

Relying on syncing rules alone is simple, but it limits what you can do in the system.

Things you can do with a private document:

You can grant access to chosen accounts. Write permission requires read permission, but read permission doesn't require write permission.

You can revoke access from accounts. Once their capabilities are revoked, removed accounts won't receive further updates of the document.
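
These two rules can be modelled with a small in-memory access list. This is a sketch only: `AccessList` and its methods are hypothetical, and real grants would be capability blobs rather than a dictionary.

```python
class AccessList:
    """Read/write grants for one private document (in-memory sketch)."""

    def __init__(self):
        self._roles = {}  # account -> set of roles

    def grant(self, account: str, role: str) -> None:
        if role not in ("read", "write"):
            raise ValueError(f"unknown role: {role}")
        roles = self._roles.setdefault(account, set())
        roles.add("read")  # every grant includes read, so write implies read
        if role == "write":
            roles.add("write")

    def revoke(self, account: str) -> None:
        """Drop all grants; the account stops receiving updates."""
        self._roles.pop(account, None)

    def can(self, account: str, role: str) -> bool:
        return role in self._roles.get(account, set())

    def sync_targets(self):
        """Accounts that should receive new versions of the document."""
        return sorted(a for a, r in self._roles.items() if "read" in r)


acl = AccessList()
acl.grant("alice", "write")
acl.grant("carol", "read")
acl.revoke("carol")
print(acl.sync_targets())  # ['alice']
```

Note that revoking only stops future syncing; anything Carol already received stays on her devices.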

Things you can't do with a private document:

You can't fork a private document, even if you have reading permissions on it.

You can't embed a private document in another document. You could theoretically sync only the embedding document and not the private embeds, although this needs to be handled gracefully by the frontend.

A document cannot be partially private (paywall preview) where you can read for example the beginning and leave the rest as private.

You can't publish a private document to a site.

The more advanced solution (compatible with the former) would be to encrypt the document so that only authorized devices can decrypt it. This would lift all of the restrictions listed above.

Rabbit Holes

Creating forks and branches. Creating a fork (although banned) is just creating a ref. Bad actors can create one, and then good actors would not know which content was the original. However

Should the flag expressing "requires read permission" be on the document, or on the Ref blob?

What about directory access?

Related: maybe private documents should have random auto-generated paths (to avoid naming collisions with other private or public documents), and not support directories (to simplify the recursive nature of things)?

We need to prevent private documents from syncing to unauthorized servers.

But how can we authorize a server, for better collaboration and syncing? Teams need a storage server for robust collaboration.

Peer/Account mismatch. Entities that are authorized for reading are Accounts, while entities participating in the syncing protocol are Peers. There's currently no link between peers and accounts.

When a peer requests a private document we need to check if they are authorized. For that they need to present their capability first. Our syncing protocol doesn't currently support that.

Maybe private/public should apply to the whole space, not per document? This would simplify things quite a bit, although it's not very user-friendly.

When we sync entire sites, or collections of sites, we don't want to reveal private documents to people who don't have access to them. Even revealing the existence of a private document to an unauthorized party can be a problem.

On the flip side, asking someone for a private document when we are not sure they have it can also be a problem, because it would reveal the existence of that private document to an unauthorized party.

This problem is often called Private Set Intersection, and it's a complex problem with many different tradeoffs.
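
To give a feel for what Private Set Intersection involves, here is a toy version of the classic Diffie-Hellman style construction: each side blinds hashed items with a secret exponent, and because exponentiation commutes, items both sides hold end up with the same double-blinded value. This is not a proposal for the protocol. The modulus, the hash-to-group step, and all function names are assumptions made for illustration, and a real deployment would use a vetted library and a prime-order group.

```python
import hashlib
import secrets

# Toy parameter (assumption): the prime 2**255 - 19 used as a modulus.
P = 2**255 - 19


def _to_group(item: str) -> int:
    """Hash an item to a non-zero quadratic residue modulo P."""
    h = int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P
    return pow(h or 1, 2, P)


def new_secret() -> int:
    return secrets.randbelow(P - 3) + 2  # in [2, P - 2]


def blind(items, secret):
    """First pass: H(x)^secret for each of our own items, order preserved."""
    return [pow(_to_group(x), secret, P) for x in items]


def reblind(values, secret):
    """Second pass: raise the other side's blinded values to our secret."""
    return [pow(v, secret, P) for v in values]


def intersect(my_items, mine_double_blinded, theirs_double_blinded):
    """Keep items whose double-blinded value also appears on the other side."""
    theirs = set(theirs_double_blinded)
    return [x for x, v in zip(my_items, mine_double_blinded) if v in theirs]


alice_docs = ["alice/x7Qp", "alice/k2Lm", "alice/9fTz"]
bob_docs = ["alice/k2Lm", "dave/44ab"]
a, b = new_secret(), new_secret()

# Alice sends blind(alice_docs, a); Bob re-blinds it and returns it in order.
alice_twice = reblind(blind(alice_docs, a), b)
# Bob sends blind(bob_docs, b); Alice re-blinds it herself.
bob_twice = reblind(blind(bob_docs, b), a)

print(intersect(alice_docs, alice_twice, bob_twice))  # ['alice/k2Lm']
```

Alice learns only which of her own documents Bob also knows about, and Bob's other document IDs stay hidden behind his secret exponent. The tradeoffs mentioned above (extra round trips, set sizes leaking, cost per item) are all visible even in this small version.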

If there's a private document that one of my accounts has access to, but I have currently selected another account that does not have access, should I be able to see the document?