
How to Volunteer
    Many thanks for your interest in volunteering for Project Gutenberg!

    WE ARE ALWAYS LOOKING FOR VOLUNTEERS!

    [*] Overview of Volunteering

    [*] Clearing an Etext for Copyright

    [*] PG's Editing Standards

    [*] The Anthology Project

    Other "Virtual Volunteering" Opportunities at:
    Impact Online Community Center


    OVERVIEW OF VOLUNTEERING

    First, you should subscribe to the Volunteer's List: 
    Send the following message from your login to: 

    listproc@prairienet.org

    The Subject should be left blank 

    Add the following to the Body of the message, where "[Firstname] [Lastname]" are your First Name and Last Name, of course. 

    sub gutvol-l [Firstname] [Lastname]

    Your return address comes from your machine. 

    In the same vein, to unsubscribe, proceed as above and replace sub with unsub.
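
    As a rough illustration of the message described above, here is a minimal Python sketch that builds (but does not send) the request. The function name build_list_request and the example sender address are made up for this example; only the server address, the list name and the sub/unsub wording come from the instructions above.

```python
from email.message import EmailMessage

LIST_SERVER = "listproc@prairienet.org"  # address given in the instructions


def build_list_request(command, list_name, sender, first_name, last_name):
    """Build a sub/unsub request for the list processor (hypothetical helper).

    The message is only constructed here; sending it is left to your own
    mail program, from the address at which you want to receive mail.
    """
    if command not in ("sub", "unsub"):
        raise ValueError("command must be 'sub' or 'unsub'")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = LIST_SERVER
    # No Subject header is set: the Subject should be left blank.
    msg.set_content(f"{command} {list_name} {first_name} {last_name}")
    return msg


request = build_list_request("sub", "gutvol-l", "jane@example.org", "Jane", "Doe")
print(request.get_content().strip())
```

    Calling the same function with "unsub" gives the matching unsubscribe request.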

    To change your return address, first unsubscribe from the old address, then subscribe from the new address. 

    Remember: you must send the email from the address at which you want to receive the Newsletters, and you must unsubscribe from that same address. Do NOT trust your sysadmins when they tell you that you do not have to use that particular address. Be sure to change addresses to your new one as early as possible, as the sysadmins are reluctant, at best, to reinstate your old email addresses for this purpose.
    As a member of the Volunteers' List [gutvol-l] you will receive copies of our In Progress File each month, along with suggestions on how to go about preparing, proofing and editing Etexts. 

    You may also want to subscribe to the Project Gutenberg Newsletter. To do so (or to unsubscribe, or to change your address), proceed as for the Volunteer's List, simply replacing gutvol-l with gutnberg.

    Whenever you are interested in a particular book, please send Michael Hart a note, with your email address, the book[s] and author[s] you are interested in and that note will go into the next In Progress File. 

    You can get started by getting our standards file. 

    Is there anything in particular you would like to work on? 
    If so, please write a few lines describing it and include your email: we will include this in our In Progress file. 

    The only real criteria are copyright and interest. Rule of thumb on copyright is before 1919 if published in the US, or 50 years after death of author if a book was published elsewhere. We will do copyright searches for you. You will probably want to make sure you are subbed to the regular Project Gutenberg Listserver, too [gutnberg].

    If you have any questions, please contact hart@pobox.com



    RULES OF THUMB FOR DETERMINING WHEN THINGS ENTER THE PUBLIC DOMAIN

    United States 

    1. Works first published before January 1, 1978 usually enter the public domain 75 years from the date copyright was first secured, which is usually 75 years from the date of first publication. (This is the rule Project Gutenberg uses most often) 

    2. Works first created on or after January 1, 1978 enter the public domain 50 years after the death of the author if the author is a natural person. (Nothing will enter the public domain under this rule until at least January 1, 2023.) 

    3. Works first created on or after January 1, 1978 which are created by a corporate author enter the public domain 75 years after publication or 100 years after creation whichever occurs first. (Nothing will enter the public domain under this rule until at least January 1, 2053.) 

    4. Works created before January 1, 1978 but not published before that date are copyrighted under rules 2 and 3 above, except that in no case will the copyright on a work not published prior to January 1, 1978 expire before December 31, 2002. (This rule copyrights a lot of manuscripts that we would otherwise think of as public domain because of their age.) 

    5. If a substantial number of copies were printed and distributed in the U.S. without a copyright notice prior to March 1, 1989, the work is in the public domain in the U.S. 

    Caveat: Every time a substantially new edition is created, especially if it is a new translation or done by a new editor, a new work is created, so you count from the creation of that edition, not from the creation of the original. 

    United Kingdom and a lot of other countries. 

    The general rule is life of the author plus 50 years. I do not know what their rules are on corporate authors, and I am pretty sure that publication without a copyright notice does not put a work in the public domain in these countries. 

    Whose law applies? 

    When we distribute in the United States, U.S. law applies. When we distribute to other countries, their law applies. That is why Peter Pan is marked for US distribution only. It is public domain in the U.S. but not in the U.K. 

    Under the 1909 Copyright Act, the original term lasted for 28 years (not 26). It was renewable for an extra 28 years for a total of 56 years. In the mid 1950's Congress started working on a major revision of the copyright act, but by 1960 it was clear that this would not be a short process. By 1962 it was clear that the new act would grant existing works a total term of 75 years. To prevent these works from losing out on the 75 year extension while Congress worked out all the other details of the new act, Congress started passing extension acts in 1962. They passed these acts from 1962 to 1976 with the result that all copyrights in existence in 1962 were extended to at least 1976 when the 75 year rule kicked in. So works published between 1917 and 1939 are not in the public domain yet. 

    A few public speeches are not copyrighted because they are not fixed in any tangible medium of expression. But if the speaker writes the speech down or authorizes it to be recorded or has someone record it at the time the speech is given, then it is fixed in a tangible medium of expression and it is copyrighted. 

    The I Have a Dream speech was written down on paper and registered with the Copyright Office. It is copyrighted and Mrs. King has at times enforced that copyright. This is also why associations which wish to tape conference presentations have to have copyright clearance from the speakers. 

    Rule of Thumb: Published at least 75 years ago in the U.S., OR, for works published in other countries, the author died at least 50 years ago. 
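
    The rule of thumb can be written out as a small Python sketch. It encodes nothing more than the two tests stated here, compares whole calendar years only, and is no substitute for a real copyright search; the function names are invented for this example. Remember to use the year of the edition you actually hold, since a new edition or translation starts its own clock.

```python
def us_rule_of_thumb(publication_year, current_year):
    """U.S. test: published at least 75 years ago (whole years only)."""
    return current_year - publication_year >= 75


def elsewhere_rule_of_thumb(author_death_year, current_year):
    """Other countries: author died at least 50 years ago (whole years only)."""
    return current_year - author_death_year >= 50


# A book published in the U.S. in 1900, checked in 1994:
print(us_rule_of_thumb(1900, 1994))
# An author who died in 1960, checked in 1994:
print(elsewhere_rule_of_thumb(1960, 1994))
```

    Years that fall right on the boundary should be checked by hand, since this sketch ignores the month and day.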

    Copyright search for Project Gutenberg Etext 

    If you have a book published before 1921 to do for Project Gutenberg, please send xerox copies of the title and copyright pages to the address below. For books published in the US before 1989 without any copyright information, we still need both sides of the title page, to prove there was no copyright notice.
    Remember: new editions or translations can get new copyrights, so use older ones.
    For copyright search, please send title and copyright pages to the address below AND. . .please include a LITERAL translation of EVERY word on the pages, if they are in a language other than English. 

    Address:
    MICHAEL STERN HART 
    405 WEST ELM STREET 
    URBANA, IL 61801-3231 USA

    Please always include your email name and address, and mark the envelope with some distinctive mark and/or color. Colored envelopes are fine. Just something so I can find it easily; the mail here is slow and deep, like snow. Please send a note to:
    hart@pobox.com for more info.



    HOW TO PREPARE AN ETEXT FOR RELEASE BY PROJECT GUTENBERG

    [Last Updated: 22 August 1994]

    This is the file "standard.gut" which contains many suggestions on how to prepare an Etext for release by Project Gutenberg.

    Remember: these are only suggestions. People send us files in a variety of formats, and we are most glad to do a little work for the purpose of getting them into an easy to read onscreen form.

    If you are interested in editing, please ask for details on an extraordinary effort we are making to prepare Etexts in manners which will enhance both the readability and searchability of an Etext by the elimination of hyphenation and of widow/orphans on a line by line basis. This takes a bit of work, but it results in an Etext much easier to read than the paper book from which it was taken. Please ask for "editing.gut".

    [editing.gut is currently appended to the bottom of this file.]

    No indentations [anywhere other than inserted letters, poems, etc.]. [Including none for contents, chapter headings, etc.]

    No CAPITALIZATION of first word in a chapter, other than first letter.

    Obviously, the first thing to do is to make sure your chosen books are clear of copyright restrictions. We will be happy to do an assortment of copyright searches and write clearance letters.

    When you start preparing the Etext, after getting the copyright clearance finished:

    Please preface the file with your name, address, phone, & email.

    Each line of your book should end with a "hard return" = cr/lf. In DOS if you save as a DOS Text File, this is the default. On Macs, each line needs to end with an "end of paragraph marker." In UNIX, each line needs to end with ^M [a carriage return].
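
    If your files come from a mix of machines, the line endings can be made uniform with a few lines of Python. This is only a sketch under the assumption that the whole book fits in memory as one string; the name to_crlf is made up for this example.

```python
import re


def to_crlf(text):
    """Rewrite every line ending (CR+LF, bare CR, or bare LF) as CR+LF."""
    # CR+LF is listed first so an existing pair is not converted twice.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


sample = "first line\nsecond line\rthird line\r\n"
print(repr(to_crlf(sample)))
```

    When writing the result to disk, open the file with newline="" so that Python does not translate the line endings a second time.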

    This is VERY important in establishing the margination, as per the new editing policy mentioned above.

    We try to average 65 characters per line, with 55 to 75 being short and long other than for emergencies, which will extend to 51 to 79.
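
    A quick way to see how a file measures up against these figures is sketched below in Python. The function name and the labels "emergency" and "outside" are invented for this example, and it makes no allowance for the last line of a paragraph, which will naturally run short.

```python
def classify_line_lengths(text, normal=(55, 75), emergency=(51, 79)):
    """Return (average_length, problems) for the non-blank lines of text.

    problems lists (line_number, length, label) for each line outside the
    normal range; label is "emergency" inside the wider range, else "outside".
    """
    lengths = []
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        length = len(line.rstrip())
        if length == 0:
            continue  # blank lines separate paragraphs and are not measured
        lengths.append(length)
        if normal[0] <= length <= normal[1]:
            continue
        if emergency[0] <= length <= emergency[1]:
            problems.append((number, length, "emergency"))
        else:
            problems.append((number, length, "outside"))
    average = sum(lengths) / len(lengths) if lengths else 0.0
    return average, problems


average, problems = classify_line_lengths("x" * 65 + "\n" + "y" * 78 + "\n\n" + "z" * 90)
print(round(average, 1), problems)
```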

    You can look over any of the Project Gutenberg Etexts to see a series of examples of how this works. You may notice how much easier it is to read the latest novels [such as Burroughs] due to the elimination of hyphenation, and the remargination of an assortment of lines that previously were split with words on the preceding or following lines that should have been on the same line. . .but were moved for the convenience of the publishers.

    The entire work should start with the title and end with "End of this Project Gutenberg Etext of Name of Book", then three returns.

    We would like page numbers at the left column for proofreading purposes.

    Priority goes to the more important type of header, i.e. from the end of a Chapter to the beginning of a Part, use the Part spacing.

    Title and Part type headers--5 returns after, 6 before.
    Chapter headers--3 returns before first line.
    Chapter ends--4 returns before next chapter header.
    Wide paragraph separation--3 returns.
    Normal paragraph separation--2 returns.
    End of line--one return.
    (These are "hard" returns, not "soft" returns.)

    Don't worry if you can't do all this, or can't do it easily. We expect to have to spend about ten hours on each book from the time we start editing it until it is ready for releasing on the networks. Adding the hard returns et al. is an easy part of that process, so don't feel obliged.

    Actually, in 1994 we will have to cut this to five hours, or your erstwhile editor will die under the strain.

    Also, for those concerned about space. . .even if an average paragraph in your book is only 100 characters, the additions of the hard returns will only make the book a percent longer in the end.

    We would like to receive these files in a PLAIN ASCII format and if compressed, please use ZIP if you can. We could help you find it, if necessary. We prefer not to use TAR and Z-- but we will if necessary. . .we would prefer to receive just one large PLAIN ASCII file and ZIP it ourselves, rather than the various chapters, subdirectories, etc. with TAR.Z files.

    Please name files with standard DOS filename.ext, that is, an eight character filename and three characters for the extension.
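
    For those who like to check such things by machine, here is a small Python sketch of the filename.ext test. The set of allowed characters (letters, digits, underscore and hyphen) is a cautious assumption made for this example, not the full DOS rule, and the name is_dos_filename is made up.

```python
import re

# Up to eight characters, a dot, then up to three characters.
# The character set is deliberately conservative (an assumption).
DOS_8_3 = re.compile(r"[A-Za-z0-9_-]{1,8}\.[A-Za-z0-9]{1,3}")


def is_dos_filename(name):
    """True if name fits the eight-dot-three pattern sketched above."""
    return DOS_8_3.fullmatch(name) is not None


for candidate in ("standard.gut", "sevengables.txt"):
    print(candidate, is_dos_filename(candidate))
```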

    General suggestions for the preparations of Project Gutenberg Etexts

    In more detail than what was presented above.

    Editing policy for margination/widows/orphans is at bottom.

    Your suggestions for rewrites of this file gratefully accepted.

    0. Please put your name, email, and other contact information INSIDE THE FILES YOU SEND, AT THE TOP. You may not believe how often we get files and cannot contact the sender to get details on the edition, etc.

    1. Let us do the copyright clearance for you.

    2. Remove vestigial traces of paper publishing.
    A. Page numbers [maybe the last thing to go, for reference] [sometimes they are required, so we leave them in]
    B. Hyphens at the end of lines, unless true hyphenated word
    C. Widows and orphans [at page, paragraph, and line levels]
    D. Remove or mark typos [but not intentional misspellings, and leave in intentionally bad grammar]

    Spacing:

    E. Two spaces after each sentence [watch for ! or ? that do NOT end sentences, then use only one space].
    F. One blank line after each paragraph [two cr/lf returns]. [If you can't do this easily, just separate each para with "**" to simulate the "hard returns"]
    G. Two blank lines after each section [wide paper breaks]
    H. Four blank lines after each chapter
    I. Three blank lines after chapter headers.
    J. Ellipses [word. . .] have no spaces before or after ".'s" unless they end a sentence with four [. . . . ] then it is a sentence ending. . .with two spaces. . . .  Next is a new sentence.
    K. Dashes will be--dashes--with no extra spaces around them [this has been discussed at great length and changed one or two times already. I have heard great argumentations from both sides [_I_ preferred the spaces] but I finally decided on not having them because more people wanted it that way and because it looked more like the books [also it saves a few spaces here and there in the files]].
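
    Removing the end of line hyphens can be started by machine, though a person still has to decide which words are truly hyphenated. The Python sketch below assumes you supply that list yourself; the function name and the sample text are invented for this example, and the lines will still need remargination afterwards.

```python
def join_line_end_hyphens(lines, true_hyphenated=frozenset()):
    """Rejoin words split by a hyphen at the end of a line.

    true_hyphenated holds lowercase words (such as "well-read") whose
    hyphen must be kept. Lines ending in a "--" dash are left alone.
    """
    pending = list(lines)
    out = []
    i = 0
    while i < len(pending):
        current = pending[i].rstrip()
        has_next = i + 1 < len(pending) and bool(pending[i + 1].strip())
        if current.endswith("-") and not current.endswith("--") and has_next:
            # Pull the first word of the next line up onto this one.
            head, _, rest = pending[i + 1].strip().partition(" ")
            first_half = current.rsplit(" ", 1)[-1]
            word = (first_half + head).strip(".,;:!?\"'()").lower()
            if word in true_hyphenated:
                current = current + head        # keep the hyphen
            else:
                current = current[:-1] + head   # drop the hyphen
            if rest:
                pending[i + 1] = rest
            else:
                del pending[i + 1]              # next line is used up
        out.append(current)
        i += 1
    return out


text = ["He was a care-", "ful and well-", "read man--", "or so he said."]
for line in join_line_end_hyphens(text, {"well-read"}):
    print(line)
```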

    3. Try for 99.9 to 99.99% accuracy.
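
    To get a feel for what these percentages mean, here is a small Python calculation. It assumes accuracy is counted per character, which is an assumption made for this example, and the 500,000 character book is an invented figure.

```python
def allowed_errors(total_characters, accuracy_percent):
    """Number of wrong characters left at a given accuracy (per character)."""
    return round(total_characters * (100 - accuracy_percent) / 100)


book_size = 500_000  # hypothetical book length in characters
for accuracy in (99.9, 99.99):
    print(accuracy, allowed_errors(book_size, accuracy))
```

    In other words, the step from 99.9 to 99.99% cuts the remaining errors by a factor of ten.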

    4. Swap proofreading with others from the volunteers list, keep your reading fresh. . .once you miss an error it is a likely thing that you will miss it again.

    5. Poems and indented quotations within paragraphs: Please try to make this look as much like the book as possible, so the reader can tell whether this is a separate part, part of the same paragraph, or what. Feel free to use indents and blank lines to accomplish this.

    6. Most people use "quotes" but those who are sticklers for ``open'' and ``close'' quotes use these. Gets hairy if you say:

    Harry said, ``'Twas the night before Christmas''

    Harry said, "'Twas the night before Christmas" is fine [not to mention that many keyboards and programs require an extra ` to get one on the screen, so right now I have to type ```` to get just `` on the screen]. When a doubt occurs, just do what you think the average searcher goes searching for. Please include a note at the top of your files indicating any of these you were unsure about.

    What we need most in proofreading are people to readjust those margins after the hyphens have been removed, and to adjust line lengths in the places where phrases, lines, and paragraphs have widows and orphans.

    We try to average 65 characters per line, with 55 to 75 being short and long other than for emergencies, which will extend to 51 to 79.

    If this is NOT what you want to do, PLEASE don't let me force you into such a thing. It is something I can do, and can probably teach others to do, but I STRONGLY prefer NOT to ask people to do slave labor. The editing of this nature makes the Etexts much easier to read and search with nearly any program and computer, which is a major part of Project Gutenberg's goal. . .to get the books to EVERYONE.

    I know that I have a particular talent for margination, that comes out without apparent effort sometimes, as you might notice in the message. That talent is probably the only reason I ever decided this editing is possible, but I CAN tell you that I can't do more than about 100 pages a day of it, and that in eight separate shifts with rest in between.

    However, when I think of the millions or billions of people who should be able to use these books only one decade from now [after 22 years on the job] it is hard for me NOT to do this editing, as I think Etext is going to be a much better medium than paper ever was and should not be relegated to "copying paper" inclusive of all the problems paper might cause as a medium [even though we are used to them]. Some scholars in the Etext and paper reprint field even feel that typographical errors, along with hyphenation and pagination, should be preserved.

    Etext as developed and distributed by Project Gutenberg since 1971 was never intended to be a copy of a paper or a parchment [remember, first Project Gutenberg Etext was typed in from parchment replicas of the US Declaration of Independence].

    The major purposes of Project Gutenberg have always been:

    1. to encourage the creation and distribution of electronic texts for the general audience.

    2. to provide these Etexts in a manner available to everyone in terms of price and accessibility [i.e. no special hardware or software], and no price tag attached to the Etexts themselves.

    3. to make the Etexts as readily usable as possible, with no forms or other paperwork required, and as easily readable to the human eyes as to computer programs, and in fact, more readable than paper.

    4. to encourage the doubling of creation and distribution every year, so as to put 10,000 Etexts into general circulation by December 31 of the year 2001.

    For those of you who are not terribly interested in the editing of the books into formats to improve onscreen reading and searching, you might want to stop here, as the following pertains mostly to editing in this new methodology. Hopefully, Etexts will allow us to exorcise the old, no longer necessary methods the publishers have used to get more words on to fewer pages, and to eliminate end of line hyphenations, and also to reconnect many phrases and sentences that were previously broken up in this same process of moving away from manuscript form. Please also realize that the examples below will look as if they originally had the ragged margination you see here, while a quick look at the paper books will show you their marginations were perfectly neat. This is part of the same process called "proportional spacing" in which the publishers make an even greater effort to adjust the words to their own formats-- a process in which the letters are squeezed more closely together, for the purpose of saving more paper, or sometimes spread further apart to eliminate a particularly awful phraseology or "widow/orphan" problem.

    Eventually authors will finally have control over their own works, and will actually be able to create their books in finished published form just the way they want them.

    For those books we already have in print and in Etext, we hope to help create editions that are more readable, by trying to do a job of "reverse engineering" to arrive at a book somewhat more resembling what authors intended in the first place. Given the information authors have given us in response to our questions about how the printed book looked in comparison to what they had intended, it is HIGHLY UNLIKELY that these efforts are going to be exactly what the authors had in mind, but this should not keep us from trying to move in that direction.

    New editing policy for margination/widows/orphans.

    Here is an example of an original paragraph from the introduction to The House of the Seven Gables, followed by two possible revisions:

    As I received it after being edited and proofed several times:

    In September of the year during the February of which Hawthorne had completed "The Scarlet Letter," he began "The House of the Seven Gables." Meanwhile, he had removed from Salem to Lenox, in Berkshire County, Massachusetts, where he occupied with his family a small red wooden house, still standing at the date of this edition, near the Stockbridge Bowl.