People say, "The Internet is forever".

This is not true. So much on the Internet has disappeared. I would go further and say that more has disappeared from the Internet than currently exists on it.

This is one reason I wish that a P2P content delivery system was the default: because not only would it deliver information faster, information would depend less on WHERE it is and more on WHAT it is -- thereby creating an avenue for redundancies.


Imagine if YouTube was shut down.

It would be a literal tragedy for the human race. Billions of hours worth of creativity would disappear in an instant.

"That would never happen!" some believe.

But it already has.

Remember Google+? All of it's gone forever.

Or remember all that media stored on MySpace? It's vanished.

We must stop depending on Big Tech to archive our data. Their mandate is to profit off our data, not preserve it.

I hope PeerTube takes off for two reasons:

1. It's decentralized
2. It uses a P2P delivery system

That's one step in the right direction -- but I hope other social media networks follow their lead.

Actually, I kind of wish there was a universal P2P protocol that was a mixture of HTTP and BitTorrent.

That alone would fix so many problems with the Internet!

@anji I've used IPFS for three years, and I've yet to see broad adoption apart from crypto.

If something like archive.org were stored on IPFS, then that would be a game changer.

@atomicpoet @anji to use your example, neither PeerTube nor IPFS is a good archiving solution. Google can't be trusted long term, but IPFS forgets data as soon as nobody is pinning it, and PeerTube instances tend to shut down within months. I've been trying to use both and it's been very unreliable.

Disclaimer - I'm extremely pro-decentralization and pester everyone I know about it but I am also something of an archivist and I can't ignore the reality.

In the end, long-term data storage has to be done on offline media. That is also becoming a problem as more people switch to flash-based storage: if you leave it unpowered for a couple of years, the data will become corrupt and eventually useless much faster than on classic magnetic media.

The net will always be ephemeral in the long term unless we work up something insane like Xanadu. So for now the most stable online archiving option is with people who care - e.g. archive.org.

@polychrome @anji Okay, but hear me out. What if archive.org was decentralized, and every library and university in the world ran an instance?

@Konrad
In all seriousness, every state and territory should strive to archive. It's patently absurd to leave it to one entity.

@polychrome We also very much appreciate what you wrote about flash vs magnetic and this only bolsters our resolve that #realComputersHaveDiskDrives (ie. #bluray drives to the lay person). The current crop of so-called computers being produced are designed to be communication devices in our opinion.

@atomicpoet @anji

@atomicpoet Just to put some numbers on this...

The US has about 9,000 public libraries (administrative units) and another 3,000 or so academic libraries, for a total of 12,000 of both classes.

There is an estimated total of over 115,000 libraries in the country. (Many are public school libraries.)

web.archive.org/web/2018102617libguides.ala.org/numberoflibr

I'm going to assume that major-city public libraries and top academic libraries might be considered archival hubs. That's a hundred or so from each list, conservatively.

The US Library of Congress holds 40 million catalogued works (books, generally, a total of ~130 million items of various descriptions).

At 5 MB/book, those 40 million works come to about 200 TB, so total disk storage would run about $3,700 ($18.3/TB), for spinning rust. Other offline / nearline storage might be cheaper. I'm going to estimate a disk storage system at roughly 4x this cost, or just under $15,000. (This is probably high, I'm being conservative.)

That is, for $15,000, any library in the world could hold the entire works of the world's largest library, the Library of Congress.

For comparison, the Internet Archive budgets $2/GB for data in perpetuity. At 5 MB/book, that's $2 per 200 books or so.

Yes, "books" != "Internet data". But it's a comparison point.
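The back-of-envelope estimate above can be reproduced in a few lines. This is only a sketch using the figures quoted in the post (40 million works, 5 MB/book, $18.3/TB, a 4x system multiplier, $2/GB in perpetuity). The variable names are made up for illustration, decimal units (1 TB = 1,000,000 MB) are an assumption, and the perpetuity total is derived here rather than stated in the post.

```python
# Figures quoted in the post above.
BOOKS = 40_000_000          # catalogued works in the Library of Congress
MB_PER_BOOK = 5             # assumed average size of one digitized book
USD_PER_TB = 18.3           # raw spinning-disk price
SYSTEM_MULTIPLIER = 4       # rough markup for a full storage system
PERPETUITY_USD_PER_GB = 2   # Internet Archive's quoted perpetual-storage budget

# Assumption: decimal units, so 1 TB = 1,000,000 MB and 1 TB = 1,000 GB.
total_tb = BOOKS * MB_PER_BOOK / 1_000_000
raw_disk_usd = total_tb * USD_PER_TB
system_usd = raw_disk_usd * SYSTEM_MULTIPLIER
perpetuity_usd = total_tb * 1_000 * PERPETUITY_USD_PER_GB

print(f"Total size:           {total_tb:,.0f} TB")
print(f"Raw disk cost:        ${raw_disk_usd:,.0f}")
print(f"Storage system cost:  ${system_usd:,.0f}")
print(f"Perpetuity at $2/GB:  ${perpetuity_usd:,.0f}")
```

With these inputs the collection comes to about 200 TB, roughly $3,660 in raw disk and roughly $14,640 for a full system, which matches the "about $3,700" and "just under $15,000" figures. The same volume at the $2/GB perpetuity rate works out to about $400,000.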

@polychrome @anji

@polychrome @atomicpoet @anji it's true - you cannot trust anyone whose motives are not preservation to preserve your media. Archiving has been getting increasingly difficult and expensive over the years as the volume and diversity of media go up.

I'd go one further and say optical media - not necessarily CD/DVD, though - is the way to go, since the data is formed by irreversible chemical/mechanical processes. Tapes and disks are fine - but are erasable and so less durable.

@polychrome @atomicpoet @anji also - tape heads have a finite lifetime (in hours read). Many kinds of tape machines (and thus heads) which were once common are no longer manufactured: thus there is a finite supply of tape heads. There are archives in the world which have more hours of media stored in them than there are tape head hours in the world. So some of the archive is already lost - it's just that we have to decide which bit we don't recover.

@naxxfish
Many people are looking into DNA as a data archival solution atm. Likely this will be the way to go ~10yrs from now
@polychrome @atomicpoet @anji

@polychrome @atomicpoet @anji LTO Tape drives are quite popular for archival purposes, as they don't lose data easily when not in use for a long time.

@polychrome @atomicpoet @anji You can't expect a non-commercial instance to be reliable if you don't support it. Support an instance (Mastodon, PeerTube, etc.) or set up/share your own and it will last forever.

@polychrome @atomicpoet @anji

After researching this problem for myself, I settled on two offline storage media:

1) I buy used 1-TB 3.5" hard drives and offline-storage cases.
The 1-TB size is a good match to my needs, cheap to buy (especially used, and used is fine -- for offline use, they'll get little wear).

2) M-Disc optical disks, DVD-M or M-Disc BDR, which are much more durable than dye-based media (will probably outlast the magnetic media of the hard disks).

@polychrome @atomicpoet @anji

I plan to use the 3.5" HDs to store expanded source trees, software/distro, and the complete EXR-stream renders of my output.

The EXRs are "intermediates". Regenerating them is expensive, but it is an automatic process, once the software is running.

I'll use the optical M-Disc media to store source files, PNG streams, video renders, and software archives.

As for the volatility of PeerTube, that's why I'm running my own instance, now.

Hopefully this works. 🤞

@polychrome @atomicpoet @anji the recent threat by to oust users who wouldn't pay the ransom gave a new perspective on the net-archives, faced as I was with somehow finding new homes for 16 years worth of life-history data for 8 users.

I think we must accept that "Digital Archive" is a contradiction in terms, a transient transport from A to B. Digital 'artifacts' are a 'volatile' variable contained within a scope that will inevitably be garbage collected.

@polychrome @atomicpoet @anji any data that I have that is static, like pictures or archives of things, I generally will burn to a Bluray disk for long term storage. Convenient for me, but maybe not for others. But if someone wants the data, I can just copy it off and send it... Or just copy the disk and hand it to them.

@herag @polychrome @anji @atomicpoet

I think I recently heard about data tapes as an effective long term storage solution. Downside is you have to load the whole thing to get a bit of info off of it but for preservation purposes it's low energy and longer lasting. need to do more research on this!

@atomicpoet @anji It looks like archive.org is actually planning to use IPFS. Have to look up the source later, currently at work.

@atomicpoet @anji ok, here's a follow up to this. Looks like they removed any evidence for that, the only thing I could find was a cut version of the interview on archive.org. The whole interview isn't available anymore.

@anji @atomicpoet ipfs backs a lot of Library Genesis. It's working well enough for them. I think it's just a matter of time.

@atomicpoet IPFS is one of the distribution mechanisms used by Library Genesis.

@anji

@atomicpoet
Theoretically possible but it's only a theory and a number..

@atomicpoet Welcome to the #interpeer project, or at least where I hope it'll be in a few months.

@atomicpoet For something like archiving, IPFS may actually be a good choice. The goal for us is broader, and also support real-time scenarios such as live broad-/manycasting, as well as collaborative editing.

IPFS has a few features that mean it doesn't lend itself all that well to those scenarios where there are frequent updates to a resource.

@atomicpoet imo that would be great not only from a data preservation standpoint, but also just because it's more convenient. I can't even count how many times I had to waste time waiting for slow CDNs that are thousands of kilometers away to give me the data I want. Decentralization not only makes it more persistent, but also makes content delivery much faster. But I guess at this point in time the average person won't really care about that, unfortunately.

@atomicpoet
There is #ZeroNet, but it has some problems & now it does not seem to be very popular

@atomicpoet that's the kind of talk that floods your mentions with IPFS people

@atomicpoet I also like the aims of PeerTube, but being based on ActivityPub it seems to have some of the same limitations as Mastodon. Lack of global search, discovery, trends, etc. It would be nice to be able to easily search for and discover all content everywhere from any server… That’s part of what makes YouTube (and Twitter) so good.

@anji @atomicpoet I'm 99% sure it would be possible to make something like this with the existing videos on PeerTube based on ActivityPub. I'm 0% sure you can currently easily search for and discover all content currently on YouTube, let alone all the content that's been deleted because YouTube didn't like it.

@raphaelmorgan @atomicpoet If someone on your local server doesn’t follow or boost a remote post/video, it never appears and cannot be found in search… That’s a problem, I think.

Not arguing that a centralized solution is better than a distributed one! But the ActivityPub follow/boost model, which doesn’t proactively announce a server’s content to the entire network, frustrates me sometimes.

@atomicpoet stuff disappears off YouTube all the time, and it's often good content like music, recordings of old TV shows that "copyright feds" catch up with, or interesting things like the Bavarian greaser refurbishing and repainting alloy wheels for a modified VW Polo on a home made rig with near 0 PPE and a cig in his mouth, I can't find his channel anywhere (I hope nothing bad happened to him health-wise)

@atomicpoet one particularly cursed situation is when rights owners get decades old recordings taken down but won't even release legit sales of the items or have one long deleted DVD release, happens a lot with old British TV shows, particularly those aimed at younger generations (which often have very patchy archives to start with), mostly caused by arguments over rights and repeat fees/royalties..

@vfrmedia It happens a lot with music as well. This is one reason I record so much music from the Internet onto CDs and cassette tapes: they've never been available for physical release, and you never know when the rights holders might take them down.

@vfrmedia @atomicpoet @polychrome @anji One of the issues I have with IPFS as it currently stands is highlighted in that. Without anonymized transports being well-supported, copyright trolling is guaranteed to happen.

So while it solves synchronization, it requires active pinning that is vulnerable to the same copyright trolling & threatening as torrent seeding is, which effectively guarantees pinning will be insufficient.

@atomicpoet I think the loss of YouTube would be a net positive for the world, tbh.

@atomicpoet I think the amount of conspiracy-laden garbage far outweighs the occasional quality educational video. And many tens of thousands of hours of human consciousness have been completely blown watching Grammarly ads.

@ocdtrekkie That's missing the point.

If YouTube goes down, what happens to the music videos, the animations, the documentaries, the sketch comedy, the video game walkthroughs, etc.?

Like it or not, YouTube has become the place for archiving all audio and video.

Which is why it's also scary that this one site hosts so much data.

On a better Internet, YouTube would be redundant.

@atomicpoet Anyone competent has a local copy of their produced content. There are people who are not competent, but again, on average I'd say a net gain.

@ocdtrekkie Yeah, try and archive the entirety of YouTube onto your local hard drive. Then make everything you downloaded searchable.

Right now, that's a pipe dream.

@atomicpoet
He's saying one's own content should be backed up by you the creator of that content.

But yes, we do believe the world will be better off, much better off without #YouTube.

YouTube aggressively engages in #censorship. One of the single greatest #activist #comedians on the planet, #LeeCamp, had a show called #RedactedTonight. He spoke up against #GoogleChrome and in less than a month all 900 shows were purged!

YouTube is #internetCancer.

We must kill it.

#deleteCAGEFAM
@ocdtrekkie

@ocdtrekkie @atomicpoet there are many competent creators who don't. There are so many instances of TV shows or films being lost to the world because the studio didn't archive the media properly long term (it's an expense and easy to cut back on) - only to be found by some random person who "taped it off the telly".

@naxxfish @ocdtrekkie @atomicpoet This. Also it should be taken into consideration that there are young people who have only used cloud solutions during school. Most people are not power users and won't care about how their cloud infrastructure functions or about what the cloud is. If you ask them, they'll tell you that putting it in the cloud is keeping a spare copy.

@atomicpoet @ocdtrekkie

"If YouTube goes down, what happens to the music videos, the animations, the documentaries, the sketch comedy, the video game walkthroughs, etc.?"

Then the people who’ve made them will re-upload them on Vimeo, PeerTube, Dailymotion, etc.

@ocdtrekkie @atomicpoet You don't seem to be aware that propagandists will broadcast using the popular media that people frequent. If youtube disappears for whatever reason propaganda will be adapted for other distribution systems that would fill the void. It would be great if yt had any serious competition, until then it's for the best that it exists.

@alextofanel @atomicpoet The issue with YouTube is not that "propaganda exists". The issue is that YouTube's algorithm is purpose-built to "maximize engagement", even if that means radicalizing people into believing increasingly crazy things.

YouTube shouldn't exist, and it's not because it happens to store videos.

@ocdtrekkie @atomicpoet To be fair I haven't used youtube . com in years. I only interact with it via RSS. No ads, no suggestions, no algorithm, no autoplay (except for entries from rss links in my urls file). I find content mainly by recommendations from people I'm "subscribed" to via RSS, or people I personally know sending me stuff.

@atomicpoet I'm surprised you didn't mention the closest possible analog to your analogy: Google Video.

@vertigo I forgot about it. And that in itself demonstrates the danger of using Big Tech to archive data.

@atomicpoet a lot of people would lose work too. So many artists especially make a living off youtube. The fact that that could just be taken away at any moment is horrifying tbh

Mastodon

The original server operated by the Mastodon gGmbH non-profit