Some thoughts on the structure of the web and wikis
2022-04-16
I have a vision, new to me but quite possibly not new to some other people, of how the web could work in a different way, much in line with my ideas for wikis. Handling backlinks could be done in a very sophisticated way, probably by the web server itself rather than by something built in a scripting language. Naturally, though, prototypes would be built in a scripting language.
- All HTTP(S) requests are scanned for a referring page. If there is no referer (the canonical misspelling of ‘referrer’), obviously no backlink is made.
- If there is a referer, then the referring page is fetched and scanned to check that it really contains a link to the requested page.
- If there is no link there, then the referer must have been spoofed. A good side effect of this check is that the spoofed request is delayed. However, it also puts load on one's own server, making it more prone to DDoS attacks.
- Server load needs to be monitored. If a critical threshold is exceeded, then any unlisted referer is treated as a request with no referer.
- If the referer has a proper link, then it goes into a decision process (I really should draw this as a flowchart and write it in pseudocode; a rough sketch in code follows this list).
- There is a whitelist of whole domains that are trusted, and for each such domain a list of its pages that are already backlinked is kept. If the referer is already backlinked from the linked page, no action is needed; if it is already backlinked from another page, then the backlink is added to the linked page. If it is not already backlinked at all, the referring page is checked for a link, and if one is found the backlink is added automatically, with an appropriate relation. How this information is held depends in part on how backlink information is held by the site itself.
- There is a blacklist of malicious sites; any incoming request with a referer from such a site is redirected to a 404, or to a rejection page explaining what is wrong with the referring site.
- There is a greylist of sites that have not been white- or blacklisted, but from which at least one page has been accepted for backlinking. The accepted pages from each such site are kept in a list. If the referer is on this accepted list, the request is handled as for the whitelist. If it is not on the list, the page is flagged for human review.
- If the referer domain is not on the black or white list, or the referer page is in a greylist domain but not already backlinked, the linking page and/or site is held for human review.
- A site admin can add the whole domain to the black or white list. A new blacklist entry sets up a redirect as described above. A new whitelist entry also adds the page to the list of backlinks from that domain.
- A site admin can add a backlink to that page and add the site to the greylist.
- For each backlinked page, either a dated copy or a hash is kept. At any convenient time, or at convenient intervals, a check may be done on the current page (a sketch of this check also follows the list).
- If the page has changed, this is flagged for human review.
- If a copy has been kept, then the review is given as a diff.
- If storage is plentiful, old dated copies can be retained.
- If the page has gone away, backlinks are deleted.
- When a human reviews a page for potential backlinking, that is also recorded, to be transparent and accountable if problems arise.
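To make the decision process above concrete, here is a minimal sketch in Python, the kind of scripting-language prototype mentioned at the start. It compresses the whitelist, greylist and blacklist rules into a single function; the list names, the in-memory stores and the page-fetching helper are illustrative assumptions, not a definitive implementation.

# Hypothetical sketch of the referer decision flow described in the list above.
import urllib.request
from urllib.parse import urlsplit

WHITELIST = {"trusted.example.org"}           # whole domains that are trusted
BLACKLIST = {"malicious.example.net"}         # domains redirected to a 404 or rejection page
GREYLIST_ACCEPTED = {"https://grey.example.com/good-page"}   # individually accepted pages
KNOWN_BACKLINKS = {}                          # linked page -> set of referring pages

def referring_page_links_to(referer, target_url):
    """Fetch the referring page and check that it really contains a link to us."""
    try:
        with urllib.request.urlopen(referer, timeout=10) as response:
            return target_url in response.read().decode("utf-8", errors="replace")
    except OSError:
        return False

def handle_referer(referer, target_url):
    """Return one of 'ignore', 'reject', 'backlink' or 'review'."""
    if not referer:
        return "ignore"                       # no referer, so no backlink
    if not referring_page_links_to(referer, target_url):
        return "ignore"                       # spoofed referer: treat as if absent
    domain = urlsplit(referer).hostname or ""
    if domain in BLACKLIST:
        return "reject"                       # redirect to a 404 or rejection page
    if domain in WHITELIST or referer in GREYLIST_ACCEPTED:
        KNOWN_BACKLINKS.setdefault(target_url, set()).add(referer)
        return "backlink"
    return "review"                           # unknown, or greylisted but not yet accepted

The server-load monitoring mentioned above would sit around the call to referring_page_links_to: once the load threshold is exceeded, the fetch is skipped and the request is treated as having no referer.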
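The periodic check on backlinked pages could be sketched similarly. This version keeps only a hash rather than a dated copy, so a change can be flagged for review but not shown as a diff; the function and store names are again assumptions.

# Hypothetical sketch of the periodic check on pages that have been backlinked.
import hashlib
import urllib.request

PAGE_HASHES = {}    # referring page URL -> SHA-256 digest recorded when the backlink was accepted

def check_backlinked_page(referer):
    """Return 'unchanged', 'changed' (flag for human review) or 'gone' (delete the backlinks)."""
    try:
        with urllib.request.urlopen(referer, timeout=10) as response:
            digest = hashlib.sha256(response.read()).hexdigest()
    except OSError:
        return "gone"
    if PAGE_HASHES.get(referer) == digest:
        return "unchanged"
    return "changed"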
Referers: further information and spoofing
- See:
- Referer spoofing is easy, and effectively impossible to police. To make pages visible only to visitors arriving from selected sites, the referer URL itself would need to be hard to discover. Two layers of selective access might help.
It is also possible to have pages shown only when a particular referer is passed. But of course that might not be the real source of the request.
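As a rough illustration of such referer-gated visibility, here is a minimal sketch using Python's standard http.server module. The expected referer value is an assumption, and, as noted, the header is trivially spoofed, so this selects visibility rather than providing any security.

# Hypothetical sketch: serve a page only when a particular referer is presented.
from http.server import BaseHTTPRequestHandler, HTTPServer

EXPECTED_REFERER = "https://trusted.example.org/entry-page"   # assumed value

class RefererGatedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("Referer") == EXPECTED_REFERER:
            body = b"<p>Visible only via the expected referring page.</p>"
            self.send_response(200)
        else:
            body = b"Not found"
            self.send_response(404)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), RefererGatedHandler).serve_forever()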