In the new issue of our monthly newsletter:
• Detecting spam, and pages to protect
• Editors' intelligence test scores related to article quality
• Three papers on Wikipedia citations

In the new issue of our newsletter:
• Thanking editors makes them come back more, but not contribute more
• "Wikipedia's Network Bias" on abortion and other controversial topics
• The interests of designer drugs editors

"Age dataset: a structured general-purpose dataset on life, work, and death of 1.22 million distinguished people" inferring birth, death, gender & occupation for historical figures from Wikipedia in many languages

Why did the Gamepedia wiki become key for Fortnite players while Fandom wikis dominated Rocket League?

This paper shows the community able to initially attract the most users won, but they also needed to have a core of extremely active users who shape culture & police the wiki.

People collaborate online, but how unequal is their participation? Typically, it is said participation follows a power law distribution. However, our new @PeerJCompSci publication analyzing the largest 6,000 @getFANDOM wikis proves otherwise! 1/5

"Towards the Web of Embeddings: Integrating multiple knowledge graph embedding spaces with FedCoder"

➡️ Images have become an integral part of online media, but pose accessibility challenges. To address this, scholars introduced Concadia, a Wikipedia-based corpus made of 96,918 images with English-language descriptions, captions, and surrounding context.

In the latest issue of our monthly research newsletter:
• A century of rulemaking on Wikipedia analyzed
• Wikipedia and the Wikimedia movement as a self-organizing bureaucracy (1999-2017)

"Do the Math: Making Mathematics in Wikipedia
Computable" A spell checker for mathematical formulae, tested on Wikipedia math articles!

"The Democratization of Opportunity: The Effects of the U.S. High School Movement" Estimating the effect of the construction of US public high schools on short- and long-run outcomes, using notable people data from @wikidata.

📢 Calling all Wikimedians 📢

Are you aware of an initiative, tool, training, or other form of community engagement around misinformation, disinformation, or information integrity on a Wikimedia project?

If yes, let us know!


"SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation" a model for sentence transcription using data from LibriSpeech and @Wikipedia.

"An Analysis of Content Gaps versus User Needs
in the Wikidata Knowledge Graph", analyzing knowledge gaps by comparing @wikidata edit metrics with @Wikipedia pageviews.

I had great fun giving this public lecture in Newcastle @ThinkingDigital.
Also seized an opportunity to make a cheeky joke about AI and face recognition at airports which landed very well, I think!

"WShEx: A language to describe and validate Wikibase entities"

As a community science lab, CAT Lab is often asked how we create research questions & hypotheses with communities:

❓⭐Next week, the answer will be TRIVIA NIGHT⭐❓

If you're a Wikipedian or Wikipedia researcher, join our session & send questions!!!

"Developing a Dataset of Overridden Information in Wikipedia", and a detection task + model to determine whether a reference sentence has overridden a target sentence.

(Tsuchiya and Yokoi, 2022)

In "Leveraging Wikidata to Build Scholarly Profiles as Service," @mlemusrojas @jaireeo @miriancramirez and Lucille Frances Brys discuss using Wikidata to build scholarly profiles and amplify the work of @IUPUI faculty:

"Leveraging the Wikipedia Graph for Evaluating Word Embeddings" a similarity measure based on edges between articles in the @Wikipedia hyperlink graph.

"A Comparison of Source Distribution and Result
Overlap in Web Search Engines" While there are differences in the sources of the top search results between Google and alternative search engines, the most popular domain is @Wikipedia.

Thanks to all the new followers! I know many of you are in and - which my academic work sometimes intersects with. Here are a few of my recent publications dealing with those topics in relation to :

