Journals and papers?

This is perhaps a bit of a newbie question, but since I'm only just now getting interested in reading academic papers relating to programming languages, I was wondering how to find them. Frequently I see claims such as "XYZ is well discussed in the literature", but I'm not actually sure how to go about finding this literature. Are there online sources where new papers are frequently listed for, say, programming languages or other CS topics?

Sources and finding sources

Well, you haven't been too specific about what you wanted. The biggest resource is, well, Google (and now Google Scholar), but those often point to CiteSeer, which is wonderful.

A great LtU-esque resource is www.readscheme.org.

Typically, though, the hard part isn't finding resources, but finding topics and areas of research that are interesting to you. The choice of language can make a big difference to the kinds of things you come across (and in any case, having diverse experience with languages will help significantly). For me, Haskell prompted my theory renaissance, as (apparently) it was in the direction I wanted to go.

By and large, once you're reading about something you are interested in, following the references, citations, and authors will lead you to more things up that alley.

Finally, "well known" things in research or more academic fields are sometimes not really that well known in the common sense of the words.

(Also a warning, if you aren't already, you run the risk of becoming a research paper addict. [I'm not completely joking.])

Sources

The ACM Portal is an excellent resource, if you're willing to pay a small fee.

Also, I'm becoming a fan of CiteULike. Several LtU members post links to PL-related topics, and you can add authors, tags, etc. to your watchlist.

Grumble, grumble...

Ethan wrote: The ACM Portal is an excellent resource, if you're willing to pay a small fee.

...boycott the ACM, grumble, grumble, ...

Re: Grumble, grumble...

It's one thing to say that "academics should as far as possible see if they can do their work outside the ACM"; however, as far as access to papers goes, the ACM has a total monopoly on many of the papers in its digital library. For someone trying to come up to speed on the field, boycotting the ACM would be pretty counterproductive. Hence the grumbling, I suppose.

Right

POPL sort of dominates the field.

Re: Grumble, grumble...

Hmmm... Lots of researchers put lots of good papers online. Springer LNCS also has lots of nice stuff, and AFAIK everything older than two years is 100% accessible online.

There have been papers where CiteSeer showed me only that nice not-in-the-database page, but the abstracts didn't sound so great that I would abandon my principles.

Most good stuff is available outside the ACM, so I think those guys should get off their high horse. If a researcher thinks he has to publish a paper exclusively with an elitist, non-open "company", then he deserves not to be read, IMHO.

Bastards!

I think it's really horrible that the ACM are sitting on so much of the best computer science literature instead of opening it up and letting it flow through sites like LtU where a lot of new people would discover it. I boycott their digital library because it's just too depressing to read good stuff and then feel guilty about sharing it.

I really hope to see more grassroots events like the European Lisp Meeting 2005 taking over from ACM workshops too.

scholar.google.com feature request, free articles only

As a former US taxpayer it irks me that my taxes funded any research that I am not allowed to read.

As a self-employed programmer, I can't afford to subscribe to dead-tree journals, even though I could often use the information in my work.

I was thrilled about scholar.google.com but it's frustrating to use when some of the papers I find are inaccessible to me. I sent in a feature request asking for the ability to filter out subscription-required results.

My partial solution to these issues is to organize The Monad.Reader, a free online Haskell eZine.

As for grassroots events, I was one of the organizers of EuroHaskell in 2004, and we'll be doing it again this year.

We can fix these problems ourselves. If your favorite language doesn't have a free online magazine, start one! If you've already got one, write articles!


--Shae Erisson - ScannedInAvian.com

Tip

After you've found an interesting pay-per-download paper with Google Scholar, search again on plain Google. More often than not you'll find a freely available version on the author's homepage that for some reason wasn't catalogued on Google Scholar. I think Google Scholar is a little picky about what it considers to be good sources of scholarly material.

Absolutely

Amplifying what Shae said, there's a very important group out there: the "armchair computer scientist". These are people who are sophisticated but not necessarily in-depth, and are interested in good theory but don't have the time for a literature review or for chasing citations. There's a lot of scope for periodicals like The Monad.Reader, which make this stuff accessible.

(And no, I'm not just saying this because I have an article in the first issue!)

It could be worse

It would be nice to get free access to the ACM archive. However, as it stands, I'm happy to pay their student fees and get full access. Believe it or not, the ACM is relatively well priced in this regard - you can get full access (and a student membership) for around $42 a year (I forget the price for a regular membership). Compare that to the IEEE's $35 PER MONTH (on top of your regular membership fees) and 24-paper-per-month limit. The situation with the ACM, while not necessarily ideal, could be FAR worse.

An interesting question

Would it be ethical to write a paper which "paraphrases" another (giving proper credit, of course!) but which itself adds no knowledge to the field? By "paraphrase", I mean the following:

* Acquire (legally) a paper from the ACM or some other pay-through-the-nose academic publisher
* Distill the knowledge in the paper to its essence
* Re-write the paper so that it contains the same essential knowledge, but with prose and explanation written from scratch.
* Cite the original paper in the bibliography (also include all works cited in the original paper)
* Publish the result on CiteSeer or some other free digital library.
* Openly acknowledge, in the introduction to the paper, that the above procedure was done.

The rationale for the above is to make the knowledge contained in a non-free paper freely available, without running afoul of copyright laws. After all, the knowledge contained in an academic paper itself is not copyrightable; only the supporting prose.

The textbook industry is (somewhat of) a precedent for this; textbooks summarize the knowledge contained in papers as a matter of course. Some even do it free of charge.

The above procedure is intended to be a sort-of cleanroom methodology; I don't know if it is sufficient (or even excessive) to cleanse the re-written paper of any copyright impairment. The procedure is also intended to not run afoul of any academic traditions regarding proper credit and plagiarism.

(It might be good to secure the blessing of the original author. While said author will likely no longer have the copyright--having to surrender it to the journal in question as a condition of publication--his/her blessing for this would eliminate any question of plagiarism or other academic misconduct).

Has this ever been tried (for academic papers, that is)?

Translation?

Funny, the same idea came to me while reading the previous posts in the thread. I am not a lawyer (and I don't like the usual abbreviation either), but I suspect that given enough fire, err, lawyerpower, this procedure could be presented as a mere translation of the original work, which I believe is covered by copyright law. Or do translations have to be literal? It probably depends on the jurisdiction, but in general?

Probably not needed

In CS, a lot of papers start out as, end up as, or are copied into technical reports, so I don't think there is really a need to do that.

Problem of finding the stuff remains

Even though it is in fact possible to find much of the CS literature (well, at least that of the past 10 to 15 years) online for free, it is quite a bit harder that way than through, e.g., portal.acm.org. And for older stuff you're still up shit creek. I've been doing some research into Algol 68, and I was damn glad the university has full portal.acm.org access...

LNCS?

I used to download and read a lot from Springer's LNCS. It used to be free - no registration, nothing required - but articles only went online two years after publication.

I just looked, and it seems they have moved to an ACM-like pay-per-article model, with only the abstracts online.

Does anyone know what happened? Are all articles that are two years or older still freely available somewhere?

They realised that they could

They realised that they could get people to pay for the stuff?

Unfortunately, they now use the MetaPress platform, which is absolutely awful (for example, one cannot simply bookmark the pages one views - one has to manually copy the "bookmarkable" URLs they display, which aren't even shown as links).

As marco comments, I don't th

As marco comments, I don't think any such effort is necessary in most cases, because most papers whose published form isn't freely available online are available free elsewhere, with identical text. However, emphatically, there is nothing preventing you from summarising another's work, commenting on it, even to the point of reproducing all the results, as long as you don't try to pass it off as your own or use any of the actual text. (IANAL)

Lots of people do that.

This is done with lots of papers, theses, etc. It's very common for people to distill and re-write a number of papers and call the result a graduation thesis. It's also well accepted in academia; some digests have even more citations than the original paper.

The only problems are that you may still want to check what the original author said, and that those digests are normally not freely available either.

Remember, science is not only testing and proposing new hypotheses, but also doing this:
"* Acquire (legally) a paper from the ACM or some other pay-through-the-nose academic publisher
* Distill the knowledge in the paper to its essence
* Re-write the paper so that it contains the same essential knowledge, but with prose and explanation written from scratch.
* Cite the original paper in the bibliography (also include all works cited in the original paper)
* Publish the result on CiteSeer or some other free digital library.
* Openly acknowledge, in the introduction to the paper, that the above procedure was done.
" [parent post]

DBLP bibliography

I just want to mention the DBLP bibliography, which is an excellent way of tracking down papers, people, and proceedings in computer science. They don't have paper contents, but they have many links to the sites that do.

I finally knuckled under..

and subscribed to ACM Portal last week. There was too much stuff behind the wall that I wanted to read.

Paying for access

Slashdot has a discussion of the publication access issue as it relates to the IEEE: Who Will Pay For Open Access? It begins "IEEE is thinking about providing everyone with free access to its publication database [...] The problem is, where will they get the money to fund the journals if not from subscriptions?"

Amusing quote from a comment by misterpies: "Reviewing a paper is not like moderating a slashdot comment."

Cost of running a journal

I know very little about the economics of academic publishing, but...

* Authors don't get paid for papers--in many cases, they are required to assign copyright to the journal as a condition of publication.
* Peer-reviewers don't get paid
* Producing journals in dead-tree format does incur costs that aren't incurred by online publishing (mainly paper, ink, and postage)
* Editors and such (the folks responsible for assembling the journal itself) do get paid, but how much?

So... where does all the money go?

All the money

Reed Elsevier, which has the largest market share in scientific publishing, has an operating margin of 34 per cent.

(from business.telegraph)

Authors are often charged for

Authors are often charged for publication. Staffing, promotion, and distribution all cost money, however, and subscriptions for the whole dead-tree journal are typically very low.

The problem here is that academic publishing is bound up with paper-based publications, which have a finite capacity to carry papers per unit time and so incur backlogs of papers to publish. Sadly, no one has yet sorted out a good way to replicate the peer-review process for a free online journal.

The advantage of paper-based publication is the redundant replication of data and the distributed timestamping of publication that it generates.

Logical Methods in Computer Science

LMCS might be of interest to LtU readers. From the purpose page:

Logical Methods in Computer Science is a fully refereed, open access, free, electronic journal. It welcomes papers on theoretical and practical areas in computer science involving logical methods, taken in a broad sense; some particular areas within its scope are listed below. Papers are refereed in the traditional way, with two or more referees per paper. Copyright is retained by the author.

Full-text access to all papers is freely available. No registration or subscription is required, and a free email notification service is available.

The editorial board has many of the PL community bigwigs: Dana Scott, Benjamin Pierce, Philip Wadler, Mitch Wand, etc.

The Scott journal

I was about to post this link when marcin beat me to it. The rumors about the contents of the first issue are rather promising, and the editorial committee is simply stellar.

Old stuff

Can't they just release the old stuff? I think computer science lacks continuity because people are deprived of access to its history. A large and hungry audience awaits the good stuff on sites like this but it's not being allowed to flow.

Today I wanted to find the marvellous Knuth paper The IBM 650: An Appreciation From the Field in the hope of posting it on LtU. I found it listed on the IEEE's computer.org at $19. That's $19 for five pages of Knuth's reminiscences written 20 years ago, with no permission to share them with your buddies. Is this what the IEEE needs to sell to support itself? What a joke.

I also wanted to read an interview with Guy Steele on Dr. Dobb's today and, money aside, I couldn't even find a way to buy access without agreeing to receive commercial spam.

This all completely sucks.

The ACM would be doing the world a tremendous favour by releasing all their content that is at least 10 or 20 years old. How can they not do this?

Is there somewhere we can spend our money where it will really do some good? I've about run out of things to buy from GNU Press.

Building a bridge between cultures?

Seems to me that the programmer and academic cultures don't have much overlap. Not that academics don't program, but programmers are more likely to look for skill improvement at the java.sun.com website than at lambda-the-ultimate. Commercial programmers don't see how research papers can be applied to their daily work.

What about a bridge between cultures?

Two ideas that come to mind are a wikipedia of applied research papers, and an academic counterpart to The Pragmatic Programmer. (I can only imagine what articles would be written by Peyton Jones, Wadler, Steele, Cardelli.)

In both cases the hook would be solutions to existing problems in existing languages. The line would point back to further reading on the same subjects in SICP, CTM, HTDP, and other high-calorie CS books. The sinker would be the elegance of those same solutions in Haskell, Scheme, Erlang, ...

As for releasing old papers: in the current business culture, companies have a death grip on their copyrights and patents (witness the recent EU patent vote). I think the best way to free information is for us to write up our own learning and studies.

Any other bridge building ideas? I'd like to find something worth putting in my time and money.

--Shae Erisson - ScannedInAvian.com

The old stuff is good stuff

I think computer science lacks continuity because people are deprived of access to its history. A large and hungry audience awaits the good stuff on sites like this but it's not being allowed to flow.

Hear, hear! I completely agree. And there is a lot of good stuff. Most ideas in computer science were invented in quite a reasonable form in the 60s and 70s. I was amazed to find the concurrent Sieve program of CTM in a 1977 paper by Kahn and MacQueen. Silly me thought I had invented it first. (I am lucky to be sitting next to a really good CS library, which has works dating back to the 40s and 50s.) The only difference is that they didn't have an implementation, but the program is there!

How about mining those good old days for new ideas? I'm sure there are still plenty there, waiting for the first person who takes the time to dig them up again. If Google were to digitize them and make them searchable, what a windfall it would be.

Compare and Contrast

Peter Van Roy: How about mining those good old days for new ideas? I'm sure there are still plenty there, waiting for the first person who takes the time to dig them up again. If Google were to digitize them and make them searchable, what a windfall it would be.

I would love to see a compare and contrast of Oz's distribution subsystem with that of the OpenCroquet project, or more generally, with the principles set forth in David Reed's 1978 Ph.D. thesis.

LtU

How do you think LtU can help with these issues?

Our general policy is to link to papers that are generally available and don't require a subscription.

Do you have any viable models for working with commercial publishers?

Who is this question addressed to?

I have an answer, which is: divide your work for publication into (Class i) the work that most furthers your involvement in a scientific conversation, and (Class ii) the work that best represents to outsiders what you are up to.

For the first class, avoid mass-audience journals for the part of your work you care about most; prefer in-house university publishers for printing, or be internet-only. It is especially important with this kind of work to keep copyright, with few restrictions on use. Look for journals whose audience most narrowly fits the profile of your ideal reader; these are the journals that will provide the best feedback. Ideally, these are the kind of journals in which publishing involves you in an ongoing conversation.

Use name publishers (OUP, MIT Press, De Gruyter are more prestigious than Elsevier) of name journals (JACM is the 800-pound gorilla here, but I'd also count I&C, MSCS, and APAL as this sort of journal; TCS has lost a lot of the cachet it used to have) for work in the second class. The aim of the exercise here is, first, to decorate your CV; second, to push up the referenceability of your work; and third, to engage people who work outside your core research interests. Avoiding copyright assignment and/or heavy restrictions on reuse would be nice, but it's not as important here.

Postscript (19 Mar 05): rewritten above, lots of new content.

LtU as in "this site"

I didn't mean each of us as an author, but rather the LtU site and community.

There are journals that go by the subscription model. Many of them are not planning on changing their basic model any time soon. Some of them might want to gain visibility by working with LtU, but I am not sure there's any workable solution they can agree to.

Hence, my question. If you have ideas, let me know (privately, if you prefer).

Technical reports and more

I rewrote the parent of this node so that it is more accessible, and used the opportunity to add a bit more information.

One thing I have noticed happening more and more is researchers using their departments' technical report series in a more and more creative manner. The significance of tech reports today is quite different to what it was 10 years ago.

The truth about scientific publishing

Do you want to know the truth about the scientific publishing market? Don't miss this:
In Oldenburg’s Long Shadow: Librarians, Research Scientists, Publishers, and the Control of Scientific Publishing

Node of the week

This forum topic is my choice for week #2 in the highly prestigious LtU node of the week series.

An Example of Open Publishing Done Right

JMLR (http://www.jmlr.org/) is the best journal in machine learning and is completely free. Check it out -- it really works.

Another example of OPDR

We have JOT, a very good journal in object technology.