Made fetch() into a plugin; we can now serve stuff from cache before a regular HTTP(S) fetch() goes out.
Next steps include:
- code cleanups
- reimplementing how we store and access the data on how each URL was retrieved
- implementing a retrieve-from-cache-but-keep-fetching-in-background strategy so that a blocked site "loads" immediately, but the user still gets the fresh version eventually.
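The "fetch() as a plugin" idea can be sketched roughly like this (a minimal sketch with hypothetical names, not actual Samizdat identifiers): every retrieval method, including plain fetch(), exposes the same promise-based interface, and plugins are tried in order until one delivers.

```javascript
// Each plugin exposes fetch(url) -> Promise; the first one to resolve wins.
// (Plugin names and the result shape here are illustrative only.)
const plugins = [
  { name: 'cache',    fetch: (url) => Promise.reject(new Error('cache miss')) },
  { name: 'fetch',    fetch: (url) => Promise.resolve({ source: 'fetch', url }) },
  { name: 'gun-ipfs', fetch: (url) => Promise.resolve({ source: 'gun-ipfs', url }) },
];

// Walk the plugin list, falling through to the next plugin on rejection.
async function getContent(url) {
  for (const plugin of plugins) {
    try {
      return await plugin.fetch(url);
    } catch (e) {
      // this plugin could not deliver; try the next one
    }
  }
  throw new Error('all plugins failed for ' + url);
}
```

Treating regular fetch() as just one plugin among others is what makes cache-before-HTTP(S) ordering a matter of list order rather than special-cased code.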
Also added some project status info and how to contact us on the landing page and in the README:
My #Samizdat ToDo list still includes moving the project to a public Gitlab instance (0xacab.org probably, since it hosts a bunch of related projects including @sutty).
I really need to do this soon, but it requires setting up the CI/CD pipeline in a new location (probably on my own server). And that's a bit of work.
Oooof, some serious code cleanups and rewrites in #Samizdat this weekend: https://git.occrp.org/libre/samizdat/compare/v0.0.2-post-mozfest...8079ed31
The tl;dr is:
- regular fetch() is now also a plugin, which opens a number of possibilities;
- any plugin that can locally cache requests and responses is now treated specially: the first such plugin is called after a successful content retrieval automagically;
- if content is retrieved from cache, Samizdat continues trying to get it from a "live" source (fetch(), Gun+IPFS) in the background.
This last bit was suggested by @tomasino and turned out to be simpler to implement than expected. So, yay!
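The stash-first, refresh-in-background strategy can be sketched like this (the `stash` and `live` objects and their methods are hypothetical stand-ins for Samizdat plugins): serve the locally kept copy immediately, while the live fetch keeps running so a fresher version is stored for later.

```javascript
// Serve from the stash right away; refresh the stash in the background.
function fetchContent(url, stash, live) {
  // start the live fetch unconditionally; on success, refresh the stash
  const background = live.fetch(url)
    .then((response) => { stash.put(url, response); return response; })
    .catch(() => null); // live source blocked; the stash is all we have

  // serve the stashed copy if present, otherwise fall back to the live fetch
  return stash.get(url).catch(() => background.then((response) => {
    if (response === null) throw new Error('all sources failed for ' + url);
    return response;
  }));
}
```

The point is that a blocked site "loads" instantly from the stash, and the background fetch only affects what the *next* load will see.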
To make #Samizdat a v1.0.0 that feels fully functional, we need to also:
- rewrite the SamizdatInfo (keeping the information on which resource was fetched and how) thing;
- add some fancy-shmancy UI displayed on any Samizdat-managed page;
- which will together allow us to inform the user "hey, we got this from cache, but there seems to be fresher version; reload please".
Here's the ticket for more context
Currently I am doing this by keeping the relevant information in Indexed DB. This has drawbacks:
- data is the same for all browser windows, leading to potential confusion if there is more than one tab open using the ServiceWorker
- there are no events to hook into to catch when Indexed DB data changes, so it's down to polling with setInterval(), which is fugly.
I *could* use Client API, specifically `postMessage` with `FetchEvent.clientId`, but clientId is not implemented on Safari (both Desktop and Mobile):
I *could* use MessageChannel API, but it requires setting up a channel between browser window and the SW, and there's no way to track which channel is used for which browser window.
Plus, the SW is quickly reaped: context destroyed, channel killed. On a new fetch() the ServiceWorker restarts but the channel no longer works, so a new channel would need to be set up.
But that can only happen from the browser window side, whereas only the ServiceWorker knows a fetch() has started.
I *still* could decide to use MessageChannel API, but would need to:
- keep track in the SW of which fetch is from which referrer (not sure that's even possible; probably available via Request.headers)
- keep track which channel is for which URL/referrer
- it would still get confusing if there are two tabs open with the same URL
- and I would still need to do polling in setInterval() on browser window side, kinda defeating the purpose of the channel.
So unless there is a way to hook an event in a browser window whenever a fetch() starts or when all fetch() events finish, the MessageChannel API doesn't seem any better than just using Indexed DB and polling it regularly with setInterval().
And so it doesn't seem it makes sense to use MessageChannel API at all, since either it's not effective, or clientId gets implemented in Safari soon and we should move to that.
But if I'm to re-implement SamizdatInfo on clientId now, I need a sane graceful degradation strategy for Safari.
But perhaps I am overthinking this? Perhaps the only event I need is onload. At that point I'll know already if the page is loaded from cache or not, and can display a relevant message to the user ("cache in use, try reloading"), perhaps after a sane timeout (letting the secondary fetch() in SW try to finish).
Or perhaps MDN is wrong and #Safari supports Client API?
Proof-of-Concept of the new signalling system done without removing the old one.
Can anyone test on Safari please? Open a new tab, open the JS console, and navigate here:
Then, reload (so that the service worker kicks in); you should see "ServiceWorker: yes" in orange.
Make sure that you see this commit ID in the console and in both places at the page bottom: c223b08c
If all of this is true, check if in the console you have messages saying: "SamizdatInfo received!"
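The window-side half of this could look roughly like the sketch below (the message shape is illustrative, not necessarily the exact Samizdat wire format): a handler for messages posted by the ServiceWorker, producing those "SamizdatInfo received!" console lines.

```javascript
// Handle a message event posted by the ServiceWorker; returns the info
// payload for SamizdatInfo messages, null for anything else.
function handleSamizdatMessage(event) {
  const message = event.data;
  if (message && message.type === 'SamizdatInfo') {
    console.log('SamizdatInfo received!', message.info);
    return message.info;
  }
  return null; // some other message; ignore it
}

// In the browser this would be registered as:
// navigator.serviceWorker.addEventListener('message', handleSamizdatMessage);
```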
Done some serious work on #Samizdat. Fixed some bugs, almost finished implementing the new messaging system (based on client.postMessage() in the end), ripped the old Indexed DB-based system out completely. Introduced new bugs to fix next.
Merge request here:
Still work in progress though.
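The ServiceWorker-side half of the client.postMessage() approach can be sketched like this (function and message names are hypothetical; the Clients API is passed in as `clientsApi` just to keep the logic testable): FetchEvent.clientId identifies the window that issued the fetch, so the SW can message exactly that window.

```javascript
// Send a SamizdatInfo message to the specific window identified by clientId.
// Returns true if the client still existed and the message was posted.
async function notifySamizdatInfo(clientsApi, clientId, info) {
  const client = await clientsApi.get(clientId); // undefined if the window is gone
  if (!client) {
    return false;
  }
  client.postMessage({ type: 'SamizdatInfo', info });
  return true;
}
```

Inside a real fetch handler this would be called as `notifySamizdatInfo(clients, event.clientId, info)`, with `clients` being the ServiceWorker global scope's Clients object.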
Merged! #Samizdat now uses message passing instead of Indexed DB for ServiceWorker to inform the window clients of things. I CAN HAZ nice things, liek:
- info that a resource was fetched from cache, but fetching it via Gun+IPFS is running in background;
- near-instant info on resources being fetched and status of that;
- info when all resources get initially fetched (in the future this is when "stuff fetched from cache, but newer versions available, reload please" message will be displayed).
The Merge Request of Doom:
You might need to reload the service worker (refer to browser docs). Automagic reloading of the service worker code will come... one day, inshallah!
Also, probably doesn't work on Safari, because crapple refuses to implement things. Graceful degradation will come... one day, inshallah!
So I guess the roadmap to #Samizdat 1.0-beta would be something along the lines of:
- fix the known issues (like caching plugin use being double-counted, or there being no indication of how/where resources were loaded from when reloading soon after a load);
- implement the "stuff loaded from cache but newer content available, reload to see" message;
- cleanup the browser window / UI side of things so that it's easy to include on any site.
A *lot* of work, but hey, now at least we kinda have a roadmap!
Ok, back to playing with #Samizdat after some traveling.
- caching plugin not double-counted anymore;
- finally there is a proper project website at https://samizdat.is/
Need to fix Gun+IPFS for the new domain, today is a good time.
Main project home still https://git.occrp.org/libre/samizdat/ for the time being, but hoping to move it to a public GitLab instance soon.
That means that now, when you load the site in Firefox, you should get the favicon. The favicon deliberately does not exist on the server, only in IPFS, precisely so we can test that everything works.
In Chrome/Chromium it should show up after a reload or two (take your time though, Chrome/Chromium caches things in weird ways).
Woo! That means our migration of Samizdat is complete. It's on its own domain, and on an open GitLab instance. 🎉 🎈
tl;dr: there needs to be a way to measure how many times Samizdat made it possible to circumvent censorship.
That's something that will have to run in the reader's browser, and so there are serious privacy considerations.
But without being able to show it works, it will be hard to convince people (and site admins) it does.
In the meantime, working on cache invalidation for #Samizdat. One of the Two Hard Problems in IT (cache invalidation, naming things, and off-by-one errors)!
Anyway, trying to keep some context in cache using "x-samizdat-*" headers. But the Cache API doesn't seem to cache all headers, just some:
Of course, there is no mention of it anywhere in the docs (or I have not found it after hours of looking).
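One possible workaround (a sketch, not necessarily what Samizdat ended up doing, and the helper name is made up): instead of trusting the Cache API to preserve custom headers, rebuild the Response with the x-samizdat-* headers set explicitly before handing it to cache.put().

```javascript
// Rebuild a Response with x-samizdat-* metadata headers added, so the
// metadata survives whatever header filtering the cache applies.
async function withSamizdatHeaders(response, meta) {
  const headers = new Headers(response.headers);
  for (const [key, value] of Object.entries(meta)) {
    headers.set('x-samizdat-' + key, value);
  }
  // a Response body can only be read once, so rebuild around a copy
  const body = await response.blob();
  return new Response(body, {
    status: response.status,
    statusText: response.statusText,
    headers: headers,
  });
}

// In the ServiceWorker this would feed cache.put(), e.g.:
// cache.put(request, await withSamizdatHeaders(resp, { method: 'fetch' }));
```

Note this only helps for responses whose headers are readable in the first place; opaque (no-cors) responses expose no headers to rebuild from.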
I *think* I figured out how to do cache invalidation in #Samizdat in a more-or-less sane way, *assuming that* only a single live plugin is in use.
I might have an idea how to do it across plugins too.
Relevant branch here:
Boom! Cache (or, rather, locally stashed version) invalidation implemented in #samizdat https://0xacab.org/rysiek/samizdat/merge_requests/14
From now on, if you visit the site once and load the current Service Worker, stuff gets stashed; then, when you happen to visit the site on a blocked connection, it is *assumed* that the Gun+IPFS version is fresher.
If you visit again, and have the Gun+IPFS version stashed, IPFS addresses are compared to check freshness.
If a fresh version is available, a message is displayed to the reader.
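The freshness check described above boils down to something like this sketch (field names are hypothetical): compare the IPFS address recorded with the stashed copy against the address currently published via Gun, and treat any mismatch as "a fresher version exists".

```javascript
// Decide whether the stashed copy should be considered stale.
function isStashStale(stashed, publishedIpfsAddress) {
  if (!stashed) {
    return true; // nothing stashed at all
  }
  if (!stashed.ipfsAddress) {
    return true; // stashed via plain fetch(); assume the Gun+IPFS version is fresher
  }
  return stashed.ipfsAddress !== publishedIpfsAddress;
}
```

This works because IPFS addresses are content-derived: a changed resource necessarily gets a new address, so an address comparison doubles as a freshness check.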
What's the difference between a "cached" and "stashed" resource in #Samizdat, you ask? Excellent question!
There can be multiple Samizdat plugins that implement the basic idea of keeping a version of a resource locally. One plugin currently implementing this is called "cache" and uses the Cache API:
So, to avoid confusion, whenever I'm talking in general about keeping versions locally, I will call it "stashing".
This will be made clear here: https://0xacab.org/rysiek/samizdat/blob/master/ARCHITECTURE.md
Worked on the documentation for #Samizdat a bit. Also, started working on implementing the standalone interface. MR: https://0xacab.org/rysiek/samizdat/merge_requests/15
The idea is to have the basic interface defined in samizdat.js so that all an admin needs to do is include that file. Currently the interface is tightly tied to index.html.
And we now have a standalone user UI in #Samizdat:
Check it out here:
Or here, to see it on a page that does not use the regular Samizdat CSS:
The UI only shows up if there are resources that seem to be unavailable via HTTPS (on samizdat.is that's the case with the favicon).
The only thing that needs to be included by website admins is a single JS file (samizdat.js).
Next step: creating a standalone admin UI.
Like measuring usage:
It *seems* like it's complicated, until it becomes clear that 3rd party tracking is not going to be affected by most website blocking scenarios. So the only thing that needs to be handled is when a website is using log analytics or their own tracker.
And the relevant merge request:
Did some code cleanup, and the samizdat-cli now can get a user's pubkey (will be needed later), and *almost* register a new Gun user.
More fun soon!
Working on implementing some basic user management in #Samizdat's samizdat-cli, as a necessary foundation for more sane deployment procedure. Relevant ticket and merge request:
Almost works, but for *some* reason users created using it are unusable. Specifically, it seems impossible to auth() as them. Moar debugging tomorrow. *sigh*
I have no clue what's wrong with my #Samizdat CLI code. When I create a user using samizdat-cli, it's impossible to auth() as that user (neither using the CLI, nor in a browser window):
But if I create a user using the same functions in a browser window, all works fine. I can then auth() as that user both in the browser window *and* via the CLI.
Relevant (fugly!) code here:
I've reported one bug already:
More to come.
Oh, did I write a test harness just for that? Yes. Yes I did:
(GitHub because Gun is hosted there; personally I prefer unofficial GitLab instances, obviously)
I see people like shiny things. 😛