Leszek<p><span class="h-card" translate="no"><a href="https://social.oevents.co.za/@kaasbaas" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>kaasbaas</span></a></span> <a href="https://genomic.social/tags/wikipedia" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>wikipedia</span></a> (English language) is only ~22 GB of <a href="https://genomic.social/tags/bzip2" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>bzip2</span></a>-compressed <a href="https://genomic.social/tags/xml" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>xml</span></a> (uncompressed size is ~86 GB). <br>Is it possible to access it without decompressing the whole file? I guess <a href="https://genomic.social/tags/random" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>random</span></a> <a href="https://genomic.social/tags/access" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>access</span></a> to .xml.bz2 should be a solved problem, right? <br>We routinely use block-compressed gzip (BGZF) with random access in <a href="https://genomic.social/tags/bioinformatics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>bioinformatics</span></a>, e.g. via <a href="https://genomic.social/tags/samtools" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>samtools</span></a> or <a href="https://genomic.social/tags/tabix" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>tabix</span></a> </p><p>EDIT: <br>The Wikipedia xml.bz2 dump does support random access in its multistream version. Does <span class="h-card" translate="no"><a href="https://mastodon.social/@kiwix" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>kiwix</span></a></span> or any other wiki reader support it? I couldn't find info on their website...</p>
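<p>For anyone wondering how random access into a multistream .bz2 can work in principle, here is a minimal Python sketch. It assumes the file is a concatenation of independent bzip2 streams and that the byte offset of each stream is known from a separate index. The helper names <code>build_multistream</code> and <code>read_stream_at</code> are made up for illustration, and the toy data is built in memory rather than taken from a real dump.</p>

```python
import bz2
import io


def build_multistream(chunks):
    """Compress each chunk as its own bz2 stream and concatenate them.

    Returns (blob, offsets), where offsets[i] is the byte offset at which
    stream i starts inside blob. This stands in for a dump plus its index.
    """
    blob = bytearray()
    offsets = []
    for chunk in chunks:
        offsets.append(len(blob))
        blob += bz2.compress(chunk)
    return bytes(blob), offsets


def read_stream_at(fileobj, offset, read_size=64 * 1024):
    """Seek to a known stream offset and decompress exactly one bz2 stream."""
    fileobj.seek(offset)
    decomp = bz2.BZ2Decompressor()  # handles a single stream, then sets .eof
    parts = []
    while not decomp.eof:
        data = fileobj.read(read_size)
        if not data:
            raise EOFError("file ended before the bz2 stream was complete")
        parts.append(decomp.decompress(data))
    return b"".join(parts)


# Toy "dump": three independent streams, each holding one fake page.
chunks = [b"<page>A</page>\n", b"<page>B</page>\n", b"<page>C</page>\n"]
blob, offsets = build_multistream(chunks)

# Jump straight to the second stream without touching the first one.
second = read_stream_at(io.BytesIO(blob), offsets[1])
print(second)
```

<p>With a real dump, the offsets would come from the accompanying index file instead of being computed locally, and <code>fileobj</code> would be the dump opened in binary mode. Only the stream that contains the wanted page gets decompressed.</p>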