Inverting the Web
We use search engines because the Web does not support accessing documents by anything other than URL. This puts a huge amount of control in the hands of search engine companies and those who control the DNS hierarchy.
Given that search engine companies can barely keep up with the constant barrage of attacks, commonly known as "SEO", intended to lower the quality of their results, a distributed inverted index seems like it would be impossible to build.
@freakazoid What methods *other* than URL are you suggesting? Because it is simply a Uniform Resource Locator (or Identifier, as URI).
Not all online content is social / personal. I'm not understanding your suggestion well enough to criticise it, but it seems to have some ... capacious holes.
My read is that search engines are a necessity born of the Web's lack of any intrinsic indexing-and-forwarding capability; such a capability would render them unnecessary. THAT still has further issues (mostly around trust)...
@freakazoid ... and reputation.
But a mechanism in which:
1. Websites could self-index.
2. Indexes could be shared, aggregated, and forwarded.
3. Search could be distributed.
4. Auditing against false/misleading indexing was supported.
5. Original authorship / first-publication was known.
... might disrupt things a tad.
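To make 1 and 2 a bit more concrete, here's a purely illustrative sketch (all field names and the signing scheme are invented, not any existing standard) of what a self-published, mergeable site index might look like:

```python
# Hypothetical sketch only: a site self-publishes a small index document,
# and anyone can aggregate several of them into a shared inverted index.
# Field names ("docs", "terms", "sig") are invented for illustration.
import json

site_index = {
    "site": "https://example.org",
    "generated": "2020-01-22T00:00:00Z",
    "docs": [
        {
            "url": "https://example.org/posts/inverting-the-web",
            "title": "Inverting the Web",
            "terms": ["search", "distributed", "index", "dns"],
        },
    ],
    # Points 4 and 5 (auditing, authorship) might hang off a signature made
    # with the site's published key over the canonical index body.
    "sig": "<detached signature over the index body>",
}

def merge(indexes):
    """Aggregate self-published indexes into one term -> set-of-URLs map."""
    inverted = {}
    for idx in indexes:
        for doc in idx["docs"]:
            for term in doc["terms"]:
                inverted.setdefault(term, set()).add(doc["url"])
    return inverted

print(json.dumps(site_index, indent=2))
print(merge([site_index])["search"])
```

Distribution (point 3) would then be a question of who fetches, forwards, and queries these, which is exactly where the trust and reputation issues above come in.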
Somewhat more:
https://news.ycombinator.com/item?id=22093403
NB: the reputation bits might build off social / netgraph models.
But yes, I've been thinking on this.
@dredmorbius @freakazoid
Isn't YaCy a federated search engine? Maybe @drwho has input?
@enkiv2 I know SEARX is: https://en.wikipedia.org/wiki/Searx
Also YaCy as sean mentioned.
There's also something that is/was used for Firefox keyword search: OpenSearch, I think, a standard used by multiple sites, pioneered by Amazon.
Being dropped by Firefox BTW.
That provides a query API only, not a distributed index, though.
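For what it's worth, the query-API-only part is tiny from the client side. Something like the following rough sketch (the endpoint is a made-up example; {searchTerms} is the placeholder OpenSearch description documents use in their URL templates):

```python
# Illustrative only: an OpenSearch description document advertises a URL
# template; the client just substitutes the query into {searchTerms}.
# The template below is a made-up example, not any real site's endpoint.
from urllib.parse import quote

def build_query_url(template: str, query: str) -> str:
    """Expand an OpenSearch-style {searchTerms} template into a concrete URL."""
    return template.replace("{searchTerms}", quote(query))

template = "https://example.com/search?q={searchTerms}"
print(build_query_url(template, "distributed inverted index"))
# https://example.com/search?q=distributed%20inverted%20index
```

Handy, but the index itself still lives wholly with the site being queried.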
@kick HTTP isn't fully DNS-independent. For virtual hosts on the same IP, the webserver distinguishes between sites based on the Host header of the HTTP request.
If you request by IP, you'll get only the default / primary host on that IP address.
That's not _necessarily_ operating through DNS, but HTTP remains hostname-aware.
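A minimal sketch of what that looks like on the wire, assuming the usual example.com documentation host (the IP below is illustrative and can change): the hostname rides inside the request itself, however you obtained the address.

```python
# Sketch: connect by IP, but tell the server which virtual host we want.
# Without the Host header we'd get whatever default site that IP serves.
import http.client

conn = http.client.HTTPConnection("93.184.216.34", 80, timeout=10)  # illustrative IP
conn.request("GET", "/", headers={"Host": "example.com"})
resp = conn.getresponse()
print(resp.status, resp.reason)
conn.close()
```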
@dredmorbius @kick @enkiv2 IP is also worse in many ways than using DNS. If you have to change where you host the content, you can generally at least update your DNS to point at the new IP. But if you use IP and your ISP kicks you off or whatever, you're screwed; all your URLs are now invalid. Dat, IPFS, FreeNet, Tor hidden services, etc., don't have this issue. I suppose it's still technically a URL in some of these cases, but that's not my point.
@freakazoid Question: is there any inherent reason for a URL to be based on DNS hostnames (or IP addresses)?
Or could an alternate resolution protocol be specified?
If not, what changes would be required?
(I need to read the HTTP spec.)
@freakazoid Answering my own question: no, there's not:
"As far as HTTP is concerned, Uniform Resource Identifiers are simply formatted strings which identify--via name, location, or any other characteristic--a resource."
@dredmorbius @freakazoid @kick @enkiv2
Earlier RFCs had defined meanings for the parts of HTTP URLs, but vendors ignored the standards, so now URL paths are just arbitrary strings which could mean anything.
@dredmorbius @freakazoid @kick @enkiv2 Back even further, the plan was that the web would eventually use URIs, which would be dereferenced to fragile URLs. But the host-independent transport layer never happened because one-way links that break were "good enough". URIs only really survived in the DTDs.
@kick I was just listening to a #NBN interview with Safi Bahcall on "Loonshots"
https://traffic.megaphone.fm/LIT6321306965.mp3
https://us.macmillan.com/books/9781250185969
On org-behaviour phase-shifts. Why some organisations are creative, some hidebound.
You can think of this as coming from competing forces, much as with solid-liquid phase transitions (binding energy vs. entropy). And transitions can occur rapidly.
The motivators for creating _and_ adopting standards are likely similar.
@kick And it's not merely competence. Much of it is mastery across a range of skills, including marketing, organisational leadership, fundraising, fighting off (or neutralising) legal and business threats, etc.
"Capitalism as the engine of innovation" suffers massively from Texas Sharpshooter fallacy, and ignores many souls it destroyed or ignored. Aaron Swartz, Ian Murdoch, Ted Nelson, Doug Englebart, Paul Otlet, Rudolph Deisel, Nicola Tesla, Filo Farnsworth...
@kick "The State" is an extension of the capitalist arm, to an enormous extent. It always has been, and you can read much of Smith's"Wealth of Nations" as addressing that specifically. "Wealth, as Mr Hobbes says, is power."
Lessig's project of the past decade:
https://corpgov.law.harvard.edu/2019/12/06/ending-foreign-influenced-corporate-spending-in-u-s-elections/
@kick
See Jane Mayer's "Dark Money", or Orestes' "The Merchants of Doubt". There's the 1937 analysis of Establishment opposition to innovation, Bernhard J. Stern's "Resistances to the Adoption of Technological Innovation"
https://archive.org/details/technologicaltre1937unitrich/page/39
As Markdown, thanks to yours truly:
https://pastebin.com/raw/Bapu75is
@kick There's the extraordinarily long history of oppression of the Small by the Large through the instrument of Government: anti-abolition, anti-union, anti-suffrage, anti-worker-safety, anti-environmental-regulation, anti-public-domain, anti-free-software, anti-cryptography.
Not undertaken at the behest and pleas of the vast majority of the population.
So "state" and "capitalist" are not fully distinct.
@kick The specific question of *how* you create _and maintain_ a state to serve the greater public good is a complex one dating to the earliest written histories.
I'm not _against_ the state. I'm not an anarchist. That simply creates a vacuum for Power to move into.
This will _always be_ a constant struggle.
But an appropriately-structured system, with checks and balances, multiple entities, and strong checks on unlimited power, should be possible.
It's work.
@kick Ted Nelson may well suffer from technical and organisational handicaps. He certainly cops an attitude (and reminds me of a few adjacent conversational participants in this and some related threads).
But he has some Big Dreams, and dreams which have a long legacy (Paul Otlet, whom I've just discovered, being another early pioneer). The overall mission is one I tend to agree with, if perhaps not Xanadu's specific approach.
The goal has powerful enemies though.
@kick The cases of Murdock and Swartz are slightly different, but in general: people with a demonstrated enormous talent *and* a goal of direct social benefit were attacked and/or abandoned by the instruments of their own society.
Carmen Ortiz, Stephen Heymann, Michael Pickett, M.I.T., JSTOR, M.I.T. President L. Rafael Reif, and others in the prosecution chain of command are complicit in Swartz's murder. They drove him to it in all deliberation.
@kick And the proprietary academic publishing industry must be destroyed, in Swartz's name.
It will be.
@kick There's a huge back-archive that's still hard to find.
Though the situation's getting vastly better.
Eventually the (surviving) publishers will turn to a public-goods model, tax-supported, because it's the only way they can exist. And I'm talking about substantially _all_ publishing.
Academic: revert copyrights to authors, publish through Universities, as it was previously.
@kick Murdock also suffered mental health issues. He'd done well, but as with many technological pioneers, saw hugely uneven success.
At a time when he was in crisis, and quite evidently and obviously so, the system entirely failed him.
As it does so very, very, very, very many.
Sucks out all they've got to give, then spits them out.
@kick I'm (trying to) reread the MIT report on the incident.
That's also rage-inducing.
@kick @enkiv2 @dredmorbius @mathew @freakazoid
Having worked for Ted -- I would agree only in specific constrained ways. However, throughout the 80s the technical end of Xanadu was being run at Autodesk, with managerial control ultimately resting with John Walker (Ted was not in the picture), & everybody involved during that era was hyper-competent by 80s software dev standards. Drama over a late redesign by Mark Miller (now a VP of something at Google) kept xu88 from shipping on time.
@enkiv2 @kick @dredmorbius @mathew @freakazoid
Ted is not a programmer (but is really good at reasoning about algorithms & data structures). His ADHD makes him a less effective manager. Since 1990, everybody working under him has been a volunteer & no Xanadu-branded project has had a team of more than 2 devs except under XOC.
@dredmorbius @mathew @freakazoid @kick @enkiv2
A URL/URI distinction (with permanent URIs) would mean having static content at addresses & having that be guaranteed. There wasn't initial support for any guarantees built into the protocol, & commercial uses of web tech relied upon the very lack of stasis to make money: access control, personalized ads, periodically-updating content like blogs, web services (a way to productize open source code & protect proprietary code from disassembly).
@enkiv2 So, no, you _don't_ need content permanently at addresses.
You only need persistently accessible _gateways_ to URI-referenced content, much as you're already starting to see through nascent schemes such as DOI-based URIs for academic articles, e.g.:
doi://10.1007@978-3-319-47458-8
Web browsers don't yet know what to do with that. A DDG bang search, Sci-Hub, or https://doi.org should though.
Other content-based addressing methods likewise.
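A rough sketch of the gateway fallback, assuming the ad-hoc doi:// form above and https://doi.org as the resolver (the rewrite rule here is illustrative, not any spec):

```python
# Sketch: map a doi:/doi:// identifier onto a persistent HTTPS gateway.
def doi_to_gateway_url(uri: str, gateway: str = "https://doi.org/") -> str:
    doi = uri
    for prefix in ("doi://", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Some ad-hoc forms (as above) use '@' where the DOI itself uses '/'.
    return gateway + doi.replace("@", "/")

print(doi_to_gateway_url("doi://10.1007@978-3-319-47458-8"))
# https://doi.org/10.1007/978-3-319-47458-8
```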
@dredmorbius @enkiv2 @mathew @freakazoid @kick
This lets us keep HTTP for transport through a hack but I'm not sure how useful that is in a world where IPFS, DAT, and bittorrent magnet links all exist & are mature technologies. (Opera has supported bittorrent as transport for years, & there are plugins for IPFS and DAT along with fringe browsers like Brave that support them out of the box.) HTTP has already been replaced by HTTPS which has been replaced with QUIC in most cases now...
@enkiv2 @dredmorbius @mathew @freakazoid @kick
In other words, in terms of getting widespread support for a big protocol change, the killer isn't compatibility with or similarity to already-existing standards like HTTP but, basically, whether or not it ships with Chrome (and thus with every major browser other than Firefox).
@dredmorbius @freakazoid @kick @enkiv2
I think (a) it's hard to do, (b) if you do it right the user never notices it, unlike visible bells and whistles like <blink> and <marquee>, and (c) the web exploded really quickly and it was impossible to even get all browsers to render the same HTML, let alone all introduce a new transport layer.
@mathew More on "why" would be interesting.
Insufficient motivation?
Sufficient resistance?
Excess complexity?
Apathy?
@freakazoid @kick @enkiv2