[ cyb / tech / λ / layer ] [ zzz / drg / lit / diy / art ] [ w / rpg / r ] [ q ] [ / ] [ popular / ???? / rules / radio / $$ / news ] [ volafile / uboa / sushi / LainTV / lewd ]

cyb - cyberpunk

“There will come a time when it isn't "They're spying on me through my phone", anymore. Eventually, it will be, "My phone is spying on me.””

File: 1446076695485.jpg (165.41 KB, 1024x737, truman.jpg)

 No.18381

I'm fuarrrking scared, lainons. As I browse the internets daily looking for knowledge, trying to be the least bit productive in my lonely NEET life, more and more of the knowledge I seek is being stuck behind paywalls, subscription gags, and summaries and abstracts whereas the actual content is held hostage.

Fellow lainons, this is a call to arms! I want infinite access to scientific knowledge and current knowledge for everyone, and I believe we should accept nothing else! Therefore, I propose an initiative to liberate such informational content! We have the technology! I know we're not the smartest techies in the world, but I have a lot of faith in what we can collectively accomplish.

In this thread, I hope to discuss with fellow lainons how we can make this happen. Also, information liberation general.
>>

 No.18382

>>18381
But I'm busy watching K-ON!

>>

 No.18387

I will start with what I believe would be the most tedious, but most reliable way of doing this.

warcprox is a Python program that creates an HTTP/S proxy and records everything browsed through it to a standardized WARC file, which can then be opened and browsed later through various means, including webrecorder.io, webarchiveplayer[1], and many others[2].

So, the idea is that one might browse a list of requested articles through warcprox and share the WARC file created that way, or browse everything on the front page, or everything published that day, and release that. If there were a way to make the browser visit a list of sites and fully load each one, that would also make the process much smoother.

[1]: https://github.com/ikreymer/webarchiveplayer
[2]: http://archiveteam.org/index.php?title=WARC
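
The "visit a list of sites" part could be sketched roughly like this, with plain HTTP requests instead of a real browser. Assumptions on my part, not something confirmed in this thread: the recording proxy listens on http://localhost:8000, the URL list is a plain text file with one URL per line, and the function names are made up for illustration. A plain fetch like this only records the page itself, not the images and scripts a browser would pull in, so it does not "fully load" anything.

```python
import urllib.request


def load_url_list(text):
    """Return the URLs in a newline-separated list, skipping blanks and # comments."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def make_opener(proxy="http://localhost:8000"):
    """Build an opener that sends HTTP and HTTPS requests through the given proxy.

    The proxy address is an assumption; point it at wherever the recorder runs.
    """
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


def fetch_all(urls, opener, timeout=30):
    """Fetch every URL once and return {url: HTTP status or error text}."""
    results = {}
    for url in urls:
        try:
            with opener.open(url, timeout=timeout) as response:
                response.read()  # read the whole body so the proxy sees all of it
                results[url] = response.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[url] = repr(exc)
    return results
```

Usage would be something like `fetch_all(load_url_list(open("wanted.txt").read()), make_opener())` while the proxy is running. HTTPS through an intercepting proxy normally fails certificate checks unless the proxy's CA certificate is trusted, so expect errors there until that is set up.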

>>

 No.18400

The easiest way to get paywalled data is to just pay for it. If you do, make sure to strip any identifying metadata before distribution. The big problem is where does the money come from? A charitable individual is the best option but unless someone wants to volunteer... If you take donations they should be secure. You are funding piracy after all. Of course, people who have already bought or intend to buy these things are a big help.

Also, this must have been done at some scale before. Collecting together whatever we can find and making a way to index and distribute it (even if it's just a big torrent) could be worthwhile. A larger system could encourage the people who buy these things to share with the rest of us. Theoretically they're sciency types who vaguely agree that information should be free, and most people are fine with piracy anyway. There's a fair chance that all we really need to do is make it easy for them. The biggest problem with this would be stripping metadata, but this is a technical problem; I'll write something about it soonish. Also, I can't really say, but they're probably not doing serious steganography with this. Probably.

Most important is a source for the papers. It's nothing without them, and they're what will attract more people with papers to share. Collecting what we can find will be very useful in the long run, and it means that even if there is no long run we still get a nice torrent.

>>18387
It's not really relevant to what you said but that gives me an idea. Universities probably have a lot of people in them who've paid for these types of things and it isn't that hard to get onto a university network. Just food for thought.

>>

 No.18422

I agree OP. Especially academic journals. It's a pain and I don't really understand the necessity behind paywalling digital journals.

>>18400
There exists an invite-only site that is used by many in the academic and artist community to share essays and papers, most comprehensively on critical theory.

It's been taken down and brought back up a few times now. I don't think I should share its naaame - but if you know then you know.

>>

 No.18423

>>18422
That would be the idea - but make it accessible and free for all.
And if we ever get to creating our own network - that portaaal would be a good source of material too.

>>

 No.18424

File: 1446103596814.png (24.11 KB, 1068x479, sci-hub-front-page.png)


>>

 No.18428

File: 1446112474209.jpg (34.62 KB, 460x288, 1396516c.jpg)

we have a short memory span, don't we?

https://archive.org/stream/GuerillaOpenAccessManifesto/Goamjuly2008_djvu.txt

also, about researchers sharing PDFs from behind paywalls: http://www.bbc.com/news/blogs-trending-34572462 (don't mind the 'secret codeword' bullshit, just the concept)

I guess there's much more, but that's off the top of my head. have fun and share stuff.

>>

 No.18436

>>18424
I already know about LibGen, but as far as I know they don't have any scrapers, and the only way to access LibGen content right now is through BookZZ and BookSC. Those comply with DMCA requests (several times, books I've looked for have been deleted at the copyright holder's request) and they limit your downloads.

That, and they're not all-inclusive. I have very frequently been unable to find paywalled articles that I wanted from them, and #icanhazpdf is not an option for me because Twitter requires a phone number for registration now. For now, I think we should be seeing if any strangers will let us use their account information for such purposes, and report back what we have access to.

>>

 No.18437

File: 1446132618810.gif (2.87 MB, 352x198, time_for_piracy.gif)

>>18436
> the only way to access LibGen content right now is through BookZZ and BookSC
The link I've posted works for me just fine. Maybe try one of the mirrors:
http://libgen.io/scimag/
http://93.174.95.27/scimag/
http://u76v7ha6j4jmtz3k2lseaso5qy36lxs77klhovmptufwcodovatq.b32.i2p/
If you are in the US or the UK you might need to use a VPN.

If you use the scientific article search (/scimag in the URL) and they don't have the article, they will redirect you to Sci-Hub, which uses proxies in universities to try to download it for you.

If that's still not working, instead of Twitter you could use Reddit's /r/scholar board.

>>

 No.19651

>>18400
I think it would be a good idea to have a program go through and copy all of the text into a file type which actually stores the text characters rather than continue to use pdfs. There should also be people who go back and ensure that all of the characters copied over correctly, of course. Images could be redrawn or recreated by those with the ability when necessary and copied directly when it is not. This would help the documents to be more easily searched, copied, and redistributed. It also eliminates nearly all threats of identification.
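
For the "go back and check the characters" step, a small script could at least point proofreaders at the lines most likely to be broken. This is only a sketch under my own assumptions: the text has already been extracted to a plain string by some other tool, and "suspicious" means the Unicode replacement character, stray control characters, private-use or unassigned code points, and the Latin ligature block (U+FB00 to U+FB06) that PDF extraction often leaves behind. The function names are made up for illustration.

```python
import unicodedata


def suspicious_chars(line):
    """Return the characters in a line that usually indicate extraction damage."""
    found = []
    for ch in line:
        category = unicodedata.category(ch)
        if ch == "\ufffd":
            found.append(ch)  # replacement character: a byte failed to decode
        elif category in ("Co", "Cn"):
            found.append(ch)  # private-use or unassigned code point
        elif category == "Cc" and ch != "\t":
            found.append(ch)  # control character other than tab
        elif "\ufb00" <= ch <= "\ufb06":
            found.append(ch)  # ligature such as U+FB01 standing in for "fi"
    return found


def flag_lines(text):
    """Return (line number, line, suspicious characters) for lines needing review."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        bad = suspicious_chars(line)
        if bad:
            flagged.append((number, line, bad))
    return flagged
```

This does not catch wrong-but-valid characters (an "l" read as a "1", a mangled formula), so it narrows the proofreading work rather than replacing it.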

>>

 No.19653

I think volunteering for the Internet Archive (archive.org) is the best way. They've been doing it for more than 10 years and have more than 15PB of free-access data. IMO it would be good if all articles in sites like >>18424, arxiv, paperswelove, etc. were uploaded to IA.
Go to a library, see if you can find a way to get the paywalled article (contacting the creators, for example), then see if the process can be automated, and upload it to IA and wherever else.
Related thread: >>18747

>>19651
That's what Google Books is doing with the captchas, and Internet Archive has a campaign for mirroring them. They were sharing 900k books (as of March 1, 2014) taken from the Google stack.

>>

 No.19655

>>19651
PDFs are generally searchable. Also, what kind of papers are you even reading? Almost any formulae are impossible to render decently in anything but PDF (well, also browsers), and can't be automatically parsed.

The people behind SciHub and LibGen are doing a shitload of work. Seriously, they're the best. Don't come up with ridiculous things nobody will ever do just because you can't find a safe PDF reader.

>>

 No.19672

>>19653
We can't rely on the IA for anything but public-domain works, and whatever few privileges they get from being a legally licensed library, because they let themselves get cucked by intellectual property laws.

>>

 No.19673

The last guy I remember releasing academic information out to the public killed himself cause the feds came after him and wanted to lock him up for a long time.

Stay dumb, stay free.

>>

 No.19707

>>19673
If, at first, you don't succeed,

>>

 No.19709

>>19673
Probably because that guy was dumb enough to release them the day after he downloaded them, while physically staying in the country where he downloaded them, and that country happened to be the one that fuarrks all "leakers" imaginable the hardest. He could have just torrented it in 2016 through a VPN while sitting in a hotel in Belgrade. But no, that would be too hard.

>>

 No.19710

The fuarrrk are you talking about? A lot of knowledge is free, at least the good soykaf is. Just look at MIT's OpenCourseWare, the dozens of MOOCs, etc. Failing that, there are still the traditional means like PDFs, books, and torrents.

If you're talking about actual scholarly articles, many of them are bogus: either illegitimately peer-reviewed or with results that can't be replicated. Unless you can do it yourself, why bother?

>>

 No.19740

>>19710
All the soykaf you suggested is based off the information published in those peer-reviewed articles you pooh-poohed.

Are you really suggesting it's better if I can't read the source of an idea or discovery and decide whether it is right or wrong for myself? Don't forget that we only know about many published papers being frauds because of peer-reviewed papers on the subject of fraudulent scientific papers.

>>

 No.19741

My college has access to a pretty good number of databases.

Here's the full list: http://pastebin.com/9sCks71k

I don't mind writing a spider to go through all of these databases and rip everything I can. Or alternatively I've already hacked 15 or 20 usernames and passwords at my college, if another lainon was interested in helping catalog it I could give them a username and password that they could log in with.

Proxies + hacked accounts will also get us past the metadata question, because it will lead back to innocent people and proxy servers.

>>

 No.19742

>>19709
pretty sure he never actually released any of it

>>

 No.19746

It's pretty sad that scientific knowledge gets paywalled the way it does. I think what we need is a new publishing model. It might need to start as its own scientific journal, but I really feel we could make all that knowledge way more useful if it was all in one database.

One database, linked citations, then you start running automated data mining scripts on it and oh god we have so much knowledge and more is being generated exponentially with every new piece of research added to the database. It would be publicly accessible with multiple synchronizing repositories and potential for forking if things ever go south.
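
To make the "one database, linked citations" idea a bit more concrete, here is a toy sketch using SQLite. Everything in it is my own assumption for illustration: the two-table schema, the column names, the function names, and using "most cited" as the example of a data mining query. A real system would need far more (authors, versions, full text, replication between repositories).

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a database with a papers table and a citation link table."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # reject links to unknown papers
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS papers (
            id    TEXT PRIMARY KEY,
            title TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS citations (
            citing TEXT NOT NULL REFERENCES papers(id),
            cited  TEXT NOT NULL REFERENCES papers(id),
            PRIMARY KEY (citing, cited)
        );
        """
    )
    return conn


def add_paper(conn, paper_id, title):
    """Insert one paper record."""
    conn.execute("INSERT INTO papers (id, title) VALUES (?, ?)", (paper_id, title))


def add_citation(conn, citing, cited):
    """Record that one paper cites another; both must already exist."""
    conn.execute("INSERT INTO citations (citing, cited) VALUES (?, ?)", (citing, cited))


def most_cited(conn, limit=10):
    """Return (id, title, times cited) rows, most cited first, ties broken by id."""
    rows = conn.execute(
        """
        SELECT p.id, p.title, COUNT(c.citing) AS times_cited
        FROM papers AS p
        LEFT JOIN citations AS c ON c.cited = p.id
        GROUP BY p.id
        ORDER BY times_cited DESC, p.id
        LIMIT ?
        """,
        (limit,),
    )
    return rows.fetchall()
```

Since the whole thing lives in a single SQLite file, copying that file is the crudest possible form of mirroring or forking; actual synchronization between repositories is a separate problem this sketch does not touch.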


