
What a day!

2023-02-09

Yesterday I published the search engine tootfinder.ch, announcing it with a short toot from my account @buercher@tooting.ch. Over the first day, 5 users opted in, and some of them made me panic because their toots were not visible (more on that later). At midnight, I got the first mentions, but when I woke up in the morning tootfinder.ch had 172 users and I had 50 mentions. Over the day, I wrote more than 90 answers and made some hot fixes, all while having a normal work day. As I am writing this, the site has 496 users and 5670 indexed toots, and has answered 3548 queries. It feels like living in the middle of an explosion.

It started as an academic exercise. I have been a Twitter user since 2011. It quickly became my platform of choice, mostly for political information and exchange, also with people I do not agree with, and my first source of information (first as in time). Swiss politicians were quite present on Twitter, so it was possible to have discussions on a level playing field. Twitter also made it possible to search for information when there were big developments, like the climate strike or the revolution in Iran, and the classical media were not covering the topic. I was aware that Twitter was also an awful space when it came to advertising and fake news, but as I used Tweetbot, I had the timeline I had chosen, and not the timeline the Twitter algorithm calculated. So it was viable until some person decided to buy this public space for himself and everything went downhill. For some time I was among the people arguing we should fight to keep the place, but it kept degrading. I switched to Mastodon, first as a backup; then, when the API went down and Tweetbot was no longer usable, I quit. Some API somewhere still tweets my Mastodon toots to Twitter, but that's not me, that's some API whose name I forgot after I switched.

I like my current place on Mastodon. There is a lot of interaction with different kinds of people, not so many from Switzerland yet, but people talk to each other. Mastodon also has some governance issues (see Das Problem der Moderation von Social-Media-Plattformen by philippe.wampfler@schulesocialmedia.com), but I am optimistic these problems can be solved over time.

Still, I thought there was a problem: it is a very dense bubble. I interact mostly with the inner circle and rarely get a different opinion. To find people to follow, I can only look at friends of friends, but I think getting another opinion is also helpful from time to time. And to get that, I thought, we need search. What is moving other people at the moment? Can we have something like a public space?

I then learnt quite quickly that the developers of Mastodon have decided not to do that because, as I understood it, making toots publicly available via search creates the side effects that come with popularity. It also exposes as "public" toots that might not have been meant that way when they were sent, and it may expose the people who wrote them and who do not want to be in the public space, for whatever reason. There was also a message between the lines: the refugees from birdland should respect the traditions of the native inhabitants. I think the latter is less serious, but the former, the privacy issue, is a real concern. I accept that nobody should be exposed in the public space against their will.

That said, there is a group of people who want to have a public space and consent to exposing themselves, because they believe that a public space is needed for a working democracy. With the immense challenges that lie before us, we as a society must find a way to create a public discourse on how we want to cope with them together and not one against another.

This consenting group of people needs a public space, and searchability is part of it. For some time, I advocated for an opt-in search, with limited echo. As is usual with open source projects, someone has to take the first step and propose a solution.

I have been working on different programming projects for 30 years, from dealing with timecode, color correction and subtitles to a CMS named SofaWiki. I have some experience dealing with protocols and data formats. When I learnt that each Mastodon user has an RSS feed by default, I thought there must be a way to build a prototype. I had also worked on full-text search for SofaWiki, so I could use the experience I had gained with SQLite and FTS3.

Building a prototype with my own toots was a trivial weekend project. I indexed the RSS feed periodically into the SQLite database via crontab and built a front end for the text search. The problem was getting consent from the users. How can I get the users to opt in?
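To make the idea concrete, here is a minimal sketch in Python of indexing an RSS feed into an SQLite full-text table. It is an illustration under assumptions, not the code that runs tootfinder.ch: the table name `toots`, its columns, the helper names and the sample feed are all hypothetical, and it assumes the SQLite build has FTS3 enabled. A real run would download the feed instead of using a constant, strip the HTML that feed descriptions typically carry, and be started periodically from crontab.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real run would download the user's RSS feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example user</title>
    <item>
      <link>https://example.social/@alice/1</link>
      <pubDate>Wed, 08 Feb 2023 10:00:00 +0000</pubDate>
      <description>Hello fediverse, this is my first toot</description>
    </item>
    <item>
      <link>https://example.social/@alice/2</link>
      <pubDate>Thu, 09 Feb 2023 08:30:00 +0000</pubDate>
      <description>Searching for a public space</description>
    </item>
  </channel>
</rss>"""


def open_index(path=":memory:"):
    """Open the database and create the full-text table if needed.

    Assumes the SQLite library was compiled with FTS3 support.
    """
    db = sqlite3.connect(path)
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS toots USING fts3(link, pubdate, body)"
    )
    return db


def index_rss(db, rss_text):
    """Insert every <item> of an RSS 2.0 document, skipping known links."""
    root = ET.fromstring(rss_text)
    added = 0
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        seen = db.execute(
            "SELECT 1 FROM toots WHERE link = ?", (link,)
        ).fetchone()
        if seen:
            continue  # already indexed during an earlier run
        db.execute(
            "INSERT INTO toots (link, pubdate, body) VALUES (?, ?, ?)",
            (
                link,
                item.findtext("pubDate", default=""),
                item.findtext("description", default=""),
            ),
        )
        added += 1
    db.commit()
    return added


def search(db, query):
    """Return the links of toots whose body matches the full-text query."""
    rows = db.execute("SELECT link FROM toots WHERE body MATCH ?", (query,))
    return [row[0] for row in rows]
```

Calling `index_rss(open_index(), SAMPLE_RSS)` twice on the same database adds the two sample toots only once, which is what makes a periodic run safe to repeat.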

The first idea was to create a bot and let the users follow that bot. If a user follows the bot, the server indexes their toots; if they stop following it, their toots are removed from the index. At this time, I also decided to limit the lifetime of the index. It should be a search for recent toots, not an archive, which also limits scalability problems. With a window of 1-2 weeks, a database of 1-2 GB should be able to handle 100 000 users.
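The two rules in this paragraph, a limited lifetime and removal when consent is withdrawn, can be sketched as two delete statements. This is a hedged illustration: the plain table `toots`, its columns `user` and `published` (a Unix timestamp), the function names and the 14-day default are assumptions made for the example.

```python
import sqlite3
import time

RETENTION_DAYS = 14  # assumed value; the text speaks of a window of 1-2 weeks


def make_db():
    """Create an in-memory database with a hypothetical toots table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE toots (user TEXT, published REAL, body TEXT)")
    return db


def prune(db, now=None, retention_days=RETENTION_DAYS):
    """Delete toots older than the retention window; return the number removed."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # 86400 seconds per day
    cur = db.execute("DELETE FROM toots WHERE published < ?", (cutoff,))
    db.commit()
    return cur.rowcount


def remove_user(db, user):
    """Delete all toots of a user who has withdrawn consent."""
    cur = db.execute("DELETE FROM toots WHERE user = ?", (user,))
    db.commit()
    return cur.rowcount
```

Running `prune` at the end of every indexing pass keeps the database size roughly proportional to the number of users times their activity within the window, rather than growing without bound.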

The first idea failed because I did not find an instance that would accept the bot. The first instance had ideological objections, the second had concerns about resources: if the bot were successful, it would create a lot of traffic for the instance.

So it was time to think of a plan B. I thought of creating a 3-number hash that the user would place on their Mastodon profile to prove consent. It would certainly work, but it is too complicated for normal users. I thought a public message containing a special text could also show consent. That was also too complicated to explain.

Then someone proposed using OAuth. I had used it on other projects, but it was not trivial. I gave it a try, and after some fiddling it worked (yes, I am aware that I will have to work on the scope, which is too wide). Furthermore, I spent some hours looking for a domain and ended up with tootfinder.ch. I set up the domain, wrote some basic CSS to change the look from HTML 1995 to Mastodon 2023, and wrote the toot to announce it.
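As a rough sketch of the first step of such an opt-in, the snippet below builds the URL that sends a user to their instance's OAuth authorization page. The function name, the client ID and the redirect URI are hypothetical placeholders, and the narrow scope `read:accounts` is only an assumption about what would be enough to identify the account; the scope actually requested by tootfinder.ch is not stated here.

```python
from urllib.parse import urlencode


def authorize_url(instance, client_id, redirect_uri, scope="read:accounts"):
    """Build the authorization URL for a Mastodon instance.

    client_id comes from registering the application with that instance;
    the default scope is an assumption for this example.
    """
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": scope,
        }
    )
    return f"https://{instance}/oauth/authorize?{query}"


# Hypothetical values, for illustration only.
url = authorize_url(
    "example.social", "my-client-id", "https://search.example/callback"
)
```

After the user approves, the instance redirects back to the redirect URI with a code that the server exchanges for a token. Requesting the narrowest scope that still identifies the user keeps the consent screen honest about what the search engine can do.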

That was Wednesday morning. Yesterday.

For some hours, no reaction.

At noon, 5 users. Great, but why do I not find them in the search? I explore the database. User 1 was me, searchable. User 2 had not made a toot in the last 7 days, so they were not searchable. I extend to 14 days. Here they are. But users 3-5 return empty RSS. I dig deeper; there is some XML, but it is not RSS. Panic. I had read that there were 2 formats, but thought the file extension let me choose. However, I was completely wrong. Some instances silently switch from RSS to Atom and leave you to handle the completely different format. It was manageable. After 2 hours, on a production server, I got Atom working as a second option. And visitors could search 5 users.
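One way to cope with both formats is to look at the root element of the XML instead of trusting the file extension. The sketch below is a minimal illustration of that idea in Python, not the fix that went live; the function name and the two sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample documents, one per format.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <item>
    <link>https://example.social/@alice/1</link>
    <description>Hello from RSS</description>
  </item>
</channel></rss>"""

SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example user</title>
  <entry>
    <link rel="alternate" href="https://example.social/@bob/7"/>
    <content type="html">&lt;p&gt;Hello from Atom&lt;/p&gt;</content>
  </entry>
</feed>"""


def parse_feed(xml_text):
    """Return (link, text) pairs from either an RSS 2.0 or an Atom document."""
    root = ET.fromstring(xml_text)
    entries = []
    if root.tag == "rss":
        # RSS: the link is element text, the toot is in <description>.
        for item in root.iter("item"):
            entries.append(
                (
                    item.findtext("link", default=""),
                    item.findtext("description", default=""),
                )
            )
    elif root.tag == ATOM_NS + "feed":
        # Atom: elements are namespaced and the link is an href attribute.
        for entry in root.iter(ATOM_NS + "entry"):
            link = entry.find(ATOM_NS + "link")
            href = link.get("href", "") if link is not None else ""
            text = entry.findtext(ATOM_NS + "content", default="")
            entries.append((href, text))
    else:
        raise ValueError(f"unsupported feed format: {root.tag}")
    return entries
```

Raising an error for an unknown root element, rather than returning an empty list, would have turned the "empty RSS" of users 3-5 into a visible failure.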

Then it was quiet until Thursday morning. This morning. I had 172 users and 50 mentions. All of them with questions. Some were very encouraging, praising something they had been waiting for forever, some could not opt in, and some showed me screenshots with nice PHP warnings and errors. And some were having a libertarian-socialist discussion about the opportunity and side effects of opt-in searches. I did not follow all of it; their thread lasted the entire day and included bird names, blocking, misunderstandings and relative peace.

I had to decide what to do, because I could not do everything at the same time and there was still my day job. So I made hot fixes for two severe errors, replied to whatever I could reply to, and created this wiki to collect the issues.

At this time, there are 15 issues. Not all are severe, but I will have some work over the next days. I also have to prepare to put the code on GitHub. That will bring more issues, but it will also make the project better.