Issue: Handle deleted posts
Priority 2 Created 2023-03-01 Resolved 2023-03-05
- Does tootfinder also respect the disappearing messages setting in Mastodon?
- Not yet. I thought they were immutable. I will have to look into it.
2023-03-01 It will be complicated because every query could trigger 100 cURL requests, one for each post. The server will not be able to do that and still return results in a reasonable time.
Two possible solutions:
- it is checked at crawling time, each time a user is checked
- the work is done by the client (JS checks for each post and removes if necessary)
In my experience, Mastodon servers themselves tolerate delays in syncing.
The following routine could remove at least recent deletions:
- reverse the list to get the oldest post in the feed and take its ID
- search all stored IDs for the user that are newer than that ID
- for each result that is not in the feed, delete it
- if it is in the feed, check whether the feed entry has edited_at set; if not, remove it from the feed (nothing to update)
- if it was edited, replace the stored post
- insert the rest of the feed
(If there is a gap, pagination could be followed to fetch the missing posts?)
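The routine above can be sketched as a pure function. This is only an illustration under assumptions: `reconcileRecent`, `stored` and `feed` are hypothetical names, posts are plain objects with an `id` and an optional `edited_at`, and IDs are assumed to be numeric strings that grow over time, which may not hold on every instance.

```javascript
// Sketch only: reconcileRecent, stored and feed are hypothetical names.
// Assumes ids are numeric strings that increase over time.
function reconcileRecent(stored, feed) {
  const result = { toDelete: [], toReplace: [], toInsert: [] };
  if (feed.length === 0) return result; // nothing to compare against

  // The oldest post in the feed marks the start of the window we can verify.
  const oldestId = feed
    .map((post) => BigInt(post.id))
    .reduce((min, id) => (id < min ? id : min));

  // Index the feed by id; entries are removed as they are matched.
  const remaining = new Map(feed.map((post) => [post.id, post]));

  for (const post of stored) {
    // Skip posts older than the window. The oldest feed post itself is kept
    // in the comparison so that it is matched and not inserted a second time.
    if (BigInt(post.id) < oldestId) continue;

    const current = remaining.get(post.id);
    if (current === undefined) {
      result.toDelete.push(post.id); // indexed, but gone from the feed
      continue;
    }
    if (current.edited_at) result.toReplace.push(current); // edited: replace
    remaining.delete(post.id); // handled, so it is not inserted again
  }

  // Whatever is left in the feed has not been indexed yet.
  result.toInsert = [...remaining.values()];
  return result;
}
```

Returning three lists instead of touching the database keeps the decision logic separate from storage, so it can be checked with a handful of fake posts.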
The following more expensive routine (because of multiple cURL requests) could remove all deletions:
- get the feed
- get subsequent feed pages until created_at is more than 14 days old (might be a lot)
- go to step 1
(This could at least be done for new users, to index the full 14 days.)
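A minimal sketch of the paging loop, assuming a hypothetical `fetchPage(maxId)` helper that resolves to an array of posts, newest first, all older than `maxId`. Mastodon's account statuses API accepts a `max_id` parameter that could serve this purpose, but the helper, the function name `collectFullWindow` and the page limit are illustrative assumptions.

```javascript
// Sketch only: collectFullWindow and fetchPage are hypothetical names.
// fetchPage(maxId) is assumed to resolve to an array of posts, newest first,
// all older than maxId (or the newest page when maxId is undefined).
const WINDOW_MS = 14 * 24 * 60 * 60 * 1000;

async function collectFullWindow(fetchPage, now = Date.now(), maxPages = 20) {
  const cutoff = now - WINDOW_MS;
  const posts = [];
  let maxId; // undefined requests the first (newest) page

  for (let page = 0; page < maxPages; page++) {
    const batch = await fetchPage(maxId);
    if (batch.length === 0) break; // no older posts left

    for (const post of batch) {
      // Stop as soon as a post falls outside the 14-day window.
      if (Date.parse(post.created_at) < cutoff) return posts;
      posts.push(post);
    }
    maxId = batch[batch.length - 1].id; // continue below the oldest post seen
  }
  return posts;
}
```

The `maxPages` cap bounds the number of requests per user, since the notes above warn that the number of pages "might be a lot".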
2023-03-03
Approach: filtering on the client, which will always be the most current.
fetch("'.$post['link'].'", { method: "HEAD", mode: "no-cors", redirect: "follow" })
.then((response) => { const node = document.getElementById("'.$post['link'].'"); node.style.display = "block"; node.style.color = "blue"; })
.catch((error) => { const node = document.getElementById("'.$post['link'].'"); node.style.display = "none"; })
This does not work: a CORS request is not allowed, and with mode "no-cors" fetch returns an opaque response, so it is not possible to tell whether the URL exists or not.
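To illustrate why the opaque response is useless here: with mode "no-cors", fetch resolves for any HTTP status and only rejects on network failures, and the resulting response has type "opaque" and status 0. The helper below is a hypothetical sketch (the name `classifyResponse` and the three labels are assumptions) of how a response would have to be interpreted.

```javascript
// Sketch only: classifyResponse is a hypothetical helper.
function classifyResponse(response) {
  // An opaque response hides status, headers and body: status is always 0,
  // so a deleted post and an existing post look exactly the same.
  if (response.type === "opaque") return "unknown";
  if (response.ok) return "exists";
  if (response.status === 404 || response.status === 410) return "gone";
  return "unknown"; // e.g. a server error says nothing about deletion
}
```

For every cross-origin post link fetched with "no-cors" this returns "unknown", which is why the client-side check cannot hide deleted posts.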
Approach: filtering at indexing time, as described above. Does not work. Two problems:
- IDs of links are not chronological on all instances, as some use MD5 hashes.
- PubDate seems to work better, but it gives false positives, e.g. links that are newer than the oldest post in the feed but missing from the feed. When these links are tested, they are valid and still public on the website.
- Therefore, each candidate would have to be tested individually instead of being deleted outright.
Implemented, but as a new list that is checked asynchronously. This seems to work and does not give false positives.
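The idea of testing each candidate individually could look roughly like the sketch below. It is not the actual implementation: `findDeleted` and `headStatus` are hypothetical names, `headStatus(url)` is assumed to resolve to the HTTP status code of a HEAD request made from the server, and treating only 404 and 410 as "deleted" is an assumption chosen to avoid false positives.

```javascript
// Sketch only: findDeleted and headStatus are hypothetical names.
// headStatus(url) is assumed to resolve to the HTTP status of a HEAD request.
async function findDeleted(candidateUrls, headStatus, batchSize = 5) {
  const deleted = [];
  for (let i = 0; i < candidateUrls.length; i += batchSize) {
    const batch = candidateUrls.slice(i, i + batchSize);
    // allSettled: one failing request must not abort the whole batch.
    const results = await Promise.allSettled(batch.map((url) => headStatus(url)));
    results.forEach((result, index) => {
      // Only a definite 404/410 counts; errors and other codes keep the post.
      const gone = result.status === "fulfilled" &&
        (result.value === 404 || result.value === 410);
      if (gone) deleted.push(batch[index]);
    });
  }
  return deleted;
}
```

Working in small batches keeps the number of simultaneous requests bounded, and a timeout or server error leaves the post in the index until a later run.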
2023-03-05 fixed