Architectural decisions

2023-02-12
On Friday I discussed how to verify consent: with a better OAuth, by following a bot, or by adding a magic word to the timeline.

I also created a poll, which got 34 votes. The result was:

44% A bot to follow/unfollow
32% A magic word in the user profile
24% OAuth, but try to narrow

So the OAuth route was out. Later yesterday, a user reported having managed to reduce the scope, but the questions about access would remain. It would take a zero scope, just proof that the user owns the account, to get full user confidence.

During the day, I also did some research on Mastodon instances to implement the bot version. There are providers that offer managed Mastodon instances at a reasonable price. However, their terms of service exclude a bot for a search engine, so that route was blocked. That said, even if it had been accepted, there was a risk that the instance would be blocked by major Mastodon instances for reasons of their own.

This left me with the magic word route. It has some real advantages: without OAuth and the Mastodon codebase, there are very few dependencies and the codebase can be kept tiny. The codebase of version 1.1 is 119 KB of pure PHP with no external libraries. I will go that route for a while.

This is the procedure:

The user adds a magic word to the profile. I have chosen two possible words: tootfinder and tfr, which seems to me a rare English trigram. tootfinder may be part of a link to tootfinder.ch, but promoting the site is not required. tfr is more discreet.
The user then announces the username.
The server checks the profile page of the user. Mastodon is actually a web application, but the bio and links are in the head of the page, so they can be read directly from the page source.
If one of the magic words is in the head of the page, the user is valid.
The server starts to index.
From time to time, the server checks the profile again. If the magic word has disappeared, the server stops indexing and showing results.
There is a grace period, because the profile page could also just have returned an error. If the magic word comes back within the following 14 days, the user is indexed again.
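
The check and the grace period can be sketched roughly as follows. This is only an illustration in Python, not the production PHP code: the function names, the status values (active, paused, expired), whole-word matching, and the rule that an expired user has to announce the username again are all assumptions. The page source is expected to be fetched elsewhere; None stands for a fetch error.

```python
import re
from datetime import datetime, timedelta

MAGIC_WORDS = ("tootfinder", "tfr")   # the two words named above
GRACE_PERIOD = timedelta(days=14)     # grace period named above


def head_of(html):
    """Return everything before </head> (the whole page if there is none)."""
    end = html.lower().find("</head>")
    return html if end == -1 else html[:end]


def has_magic_word(html):
    """True if a magic word appears as a whole word in the head of the page.

    Whole-word matching is an assumption of this sketch.
    """
    head = head_of(html).lower()
    return any(
        re.search(r"\b" + re.escape(word) + r"\b", head)
        for word in MAGIC_WORDS
    )


def recheck(status, missing_since, html, now):
    """Periodic check of one user. Returns the new (status, missing_since).

    html is the profile page source, or None if fetching it failed.
    """
    if status == "expired":
        # Assumption: after the grace period the user must announce again.
        return status, missing_since
    if html is not None and has_magic_word(html):
        return "active", None
    # Word missing or page error: stop indexing, remember when it started.
    missing_since = missing_since or now
    if now - missing_since > GRACE_PERIOD:
        return "expired", missing_since
    return "paused", missing_since
```

A user whose page fails once goes from active to paused and returns to active on the next successful check, as long as that happens within the 14 days.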

It's implemented and in production.

There are two catches:

The verification is a two-step process.
Users might not want to mess up their bio or links.

If someone wants to propose other rare magic words, I will be happy to add them.

I had other options for magic words, but did not keep them, thinking they might be too techie:

Create a 3-digit hash based on the username that could be used (e.g. 478)
Allow one of the verified linked sites to host the magic word
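
For illustration, the first of these discarded options could look roughly like this. It is a Python sketch under assumptions of my own: SHA-256 as the hash, lowercasing the username first, and the function name are not from Tootfinder, which does not implement this option.

```python
import hashlib


def three_digit_code(username):
    """Derive a stable 3-digit code (000 to 999) from a username."""
    # Lowercase first so that the code does not depend on capitalization
    # (an assumption of this sketch).
    digest = hashlib.sha256(username.lower().encode("utf-8")).digest()
    # Use the first four bytes as a number and reduce it to three digits.
    number = int.from_bytes(digest[:4], "big") % 1000
    return f"{number:03d}"
```

With only 1000 possible values, such a code would not be secret. It would only be a marker that is unlikely to sit in a profile by accident.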

There is another architectural change I made for version 1.1. There was some discussion that a consenting user could expose another user by mentioning them in a post. This can become a problem, because someone might search for that user. Therefore, all user handles are now replaced by a placeholder @… so that these users are not exposed. I gave up on keeping the mentions of consenting users, because the replacement (which would have had to be done at query time) would have been too slow.
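
A rough sketch of such a replacement, applied once when a post is stored rather than at query time. It is written in Python for illustration only; the regular expression and the function name are assumptions, not the pattern used in production.

```python
import re

# Matches @user and @user@instance.tld. The lookbehind skips an "@" that
# follows a word character, so plain e-mail addresses stay untouched.
MENTION = re.compile(r"(?<![\w@])@\w+(?:@[\w.-]*\w)?")
PLACEHOLDER = "@…"


def mask_mentions(text):
    """Replace every user handle in a post with the placeholder."""
    return MENTION.sub(PLACEHOLDER, text)
```

Applied to "Hi @alice@example.social and @bob!", the function returns "Hi @… and @…!". Running it again on already masked text changes nothing, because the placeholder itself does not match the pattern.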