Sunday, February 7, 2016

YA Classics using objective data

Some time ago, someone posted a request for a list of classic teen literature on one of the book forums to which I belong. This got me mulling, particularly as I was aware of a couple of interesting new data sets that might be helpful.

While the request seems simple, it needs to be disambiguated and made objective. What do we mean by "classic," what do we mean by "young adult," and what does it mean to "read" a book? How can we measure this? We can't go by publisher hype. Teachers tend to be biased towards texts geared to the college bound. Copies sold and copies checked out of a library do not necessarily reflect copies read.

I would argue that what we should be interested in are the books young adults are actually reading and which they deem to be meaningful and influential to them over long stretches of time. So let's start putting some parameters around that concept.

First, what do we mean by Young Adult (YA)? YA is a genre in publishing. That seems straightforward enough. The problem is that most publisher research indicates that 60-80% of the readership are in fact adult women. So let's throw that out. Some people use bounds more associated with school grades, such as fifth, seventh or ninth grade, as beginning points, and others prefer to limit to high school completion or the rounded number of age 20. I propose that YA is the age category covering the decade from 12 to 21 years old. That is, it begins just as most approach all the physical and cognitive changes associated with maturity (12) and closes when individuals are accorded all the benefits and responsibilities of full adulthood (21) and when virtually all the physical and neurological changes associated with maturity are substantially complete.

These are the most tumultuous years, with the greatest variability and span of change in reading ability, physical maturity, mental maturity, degree of socialization and acculturation, academic achievement, etc. No other decade of life is quite like it, thank goodness. Young adults at the lower bound will still reach back into the ease and simplicity of picture books and chapter books. Young adults at the upper bound will stretch into harder core texts in sciences, philosophy, and fields beyond the entertainment of storytelling in written form. All in that decade between 12 and 21.

So that's the age group. What do we mean by someone having read something? Books, obviously. What about plays and poems? What about short stories? What about nonfiction? Even with books, do we mean being aware of the book, having read at least a chapter, having read excerpts, having had it assigned in class? Do we only count those books which have been read cover-to-cover? A Brief History of Time by Stephen Hawking is notorious as a book that everyone bought but virtually no one read. A seventeen-year-old who is intensely interested in biology reads three or four chapters of Charles Darwin's Origin of Species. Has she "read" Origin of Species? Does it make a difference to what is being counted if we focus only on what is assigned versus what is electively read?

All these are fair and knotty issues. I am going to go with any book with which a child between 12 and 21 has some form of meaningful engagement (usually revealed in conversation or recollection when older), principally through having read the book or portions of the book. I am not excluding books that might have been skimmed or that might have been supplemented via movies, stage productions, graphical renditions, and abridged editions.

Finally - what do we mean by classic? There are several issues tied up in this bundle. You might argue it is the book that is read by the most members of a particular cohort. For example, each year there are some 4 million youth who turn 21. We might say that a classic is a book read by X% of that cohort. The challenge here is that many children are literate but don't elect to read at all or read very little. In addition, a large percentage of reading time, particularly as they get older, is spent on books outside the traditional canon, primarily nonfiction. For example, With the Old Breed is a classic war book and it is read by a material number of YA youth, primarily boys. Is that therefore a YA classic? I would argue no, because it is not read by a large enough population and its readership is too narrow demographically.

It is hard to get a read on what represents the penetration ceiling for a book within an age cohort. From a variety of sources and studies, I suspect that it is very rare for any book to have been read by more than 10% of an age cohort by the time they turn 21. The numbers suggest that there is much greater penetration for picture books. Something like 50-75% will have had Dr. Seuss or Margaret Wise Brown read to them as a child, but at most 10% of them will have actually read Harper Lee's To Kill a Mockingbird even though it is a widely assigned classic.

There is also a very real class issue in designating a classic. Classic for whom? For the 30% that go on to college or for the 100% of the age cohort? For example, there is a reasonable chance that a good portion of those destined for college will have read some of Dante's Inferno, but I suspect that a very low percentage of the other 70% would even have much awareness of it.

There are other issues. For example, Toni Morrison's Beloved is regarded as a classic in African-American literature and it is frequently assigned in high school. But how many actually read it and are engaged by it? It does not show up in many surveys of books that were important to people.

I am going with four approaches as to what constitutes a classic - 1) It is widely acknowledged as a classic, 2) It is widely read within age cohorts across time, 3) It is of merit and interest beyond the US, and 4) It is widely recognized even if only a small percentage may have actually read it in its entirety. An example of the last might be the Philip K. Dick story behind Total Recall ("We Can Remember It for You Wholesale") - a classic of science fiction literature. It might have been read by only 5% of 21-year-olds (basically anyone interested in science fiction), but it has also been rendered in two blockbuster movies (1990 and 2012), making it widely recognized.

If these are the parameters, how do we construct such a list? No one measures which books YA actually read. I have drawn on multiple sources. The two primary sources are the Pantheon 1.0 database and The Open Syllabus Project. Pantheon gives you all authors who have Wikipedia entries in at least 25 languages (thus indicating that the authors are of interest to a wide range of people and cultures). Pantheon also gives you a measure of intensity of interest over time (page views within a time period).

Open Syllabus lets you know which books are most assigned for reading to the 30% of YA who go on to college.

Supplementary sources include a database I have built compiling library recommendations as well as public surveys of books people mention as favorites from their youth, Renaissance Learning's annual report of the top twenty most popular books read in each grade (their program being in some 25% of schools), Goodreads, and LibraryThing. Renaissance Learning is interesting because it gives insight into differences in reading preferences between the genders. It also, to some small degree, corrects for the overwhelming class bias towards those who are college bound.

Pantheon only gives you authors, so you have to make an educated guess as to which of the authors' works are most read by YA. In most instances this is pretty clear, but there are a few judgment calls. I have checked these calls against popularity rankings in Goodreads and LibraryThing.

I have taken the HPI measure (a measure of balanced interest over time) from Pantheon, sorted from high to low, and given the titles their ordinal rankings. Similarly, I sorted the Open Syllabus candidates from high to low, based on the number of syllabi in which they are assigned. For example, in college, Mary Shelley is the most frequently assigned author, cited in 2,710 syllabi, followed by Machiavelli, Shakespeare and Homer. Their ordinal rankings are, respectively, 1, 2, 3, and 4.

Pantheon and Open Syllabus diverge significantly from one another in ordinal ranking. Interestingly, if you average the two ordinal ranks, you end up with a list that likely comes closer to representing a whole population list (as opposed to only that which is of interest in college).
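To make the averaging step concrete, here is a minimal sketch in Python. The author names and the two ranks for each are copied from the list at the end of this post. The function name and the tie-breaking rule (the better syllabus rank wins when two averages are equal) are my own assumptions for illustration; the tie-break happens to reproduce the order of these particular rows but is not a documented part of the method.

```python
# Sketch: combine two ordinal rankings by averaging them.
# Each tuple is (author, HPI ordinal rank, college syllabus ordinal rank),
# taken from the list published in this post.
rows = [
    ("Mary Shelley", 73, 1),
    ("Homer", 1, 4),
    ("Franz Kafka", 22, 17),
    ("William Shakespeare", 2, 3),
    ("Niccolò Machiavelli", 166, 2),
    ("Mark Twain", 15, 11),
    ("Oscar Wilde", 6, 27),
]


def overall_order(rows):
    """Sort by the mean of the two ranks.

    Ties are broken by syllabus rank. That tie-break is an assumption
    made for this sketch, not something stated in the post.
    """
    return sorted(rows, key=lambda r: ((r[1] + r[2]) / 2, r[2]))


ranked = overall_order(rows)
for position, (author, hpi, syllabus) in enumerate(ranked, start=1):
    print(position, author, (hpi + syllabus) / 2)
```

Note how Shelley and Machiavelli, who top the syllabus ranking, fall well down the combined order because of their low Pantheon ranks. The positions printed here are relative to this seven-author sample only, not the full 166.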

Some caveats. I included several authors about whom I am dubious as to how widely they are really read by YA, even if they are well regarded among adults. Examples include Simone de Beauvoir (The Second Sex), Aleksandr Solzhenitsyn, Toni Morrison, and others. Another caveat is that, despite my best efforts, this list still appears to me to be strongly biased towards the reading interests of the 30% who go on to college. Finally, the 25 languages bar filters out a lot of writers who are definitely in the American YA canon of classics, such as Laura Ingalls Wilder, Mary Norton, Katherine Paterson, E.B. White, Lois Lowry, Louis Sachar, Richard and Florence Atwater, Robert C. O’Brien, Jean Craighead George, Norton Juster, Mildred D. Taylor, Christopher Paul Curtis, and Madeleine L’Engle. I was quite surprised by the number of American classics which apparently are not as well engaged with in other places as I would have thought.

Details: In Pantheon 1.0 there are 954 authors overall from some 50 or so countries and right back to Homer and that era. While many of these might be globally consequential, many of them are unknown in the US or known only to specialists, or did not write books likely to be read by YA. Examples: Francois Rabelais, Juvenal, Milan Kundera, Lope de Vega, etc.

There are 166 authors from the Pantheon list who have written books that are read or assigned with some frequency in the US. Of those, 17% are female, right in the range of representation normally seen (15-30%). 58% are foreign born. 27% were born after 1900 (very roughly, "modern"). 8% are People of Color.
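For a sense of scale, the percentages above can be turned back into approximate head counts. This is only a rough sketch: the percentages are already rounded, so the counts below are estimates rather than exact figures, and the variable names are mine.

```python
# Sketch: convert the quoted percentages into approximate author counts.
total_authors = 166  # size of the filtered Pantheon subset quoted above

shares = {
    "female": 0.17,
    "foreign born": 0.58,
    "born after 1900": 0.27,
    "people of color": 0.08,
}

# The quoted shares are rounded, so these counts are approximations.
approx_counts = {
    label: round(total_authors * share) for label, share in shares.items()
}

for label, count in approx_counts.items():
    print(f"{label}: about {count} of {total_authors}")
```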

The final caveat. These are all proxies for the reality that we do not know which books YA actually read and value. This list is arrived at more rigorously and with better objective data than most but it is still just a proxy. And, as noted earlier, it omits many authors apparently less well known outside the US.

Apologies for the display; a proper table is beyond my HTML skills under time constraints. Each line lists, in order: Author, Title, HPI Ordinal Rank (from Pantheon), College Syllabus Ordinal Rank, and Overall Ordinal Rank.
William Shakespeare Romeo & Juliet 2 3 1
Homer Odyssey 1 4 2
Mark Twain Tom Sawyer 15 11 3
Oscar Wilde Picture of Dorian Gray 6 27 4
Franz Kafka The Metamorphosis 22 17 5
Jane Austen Pride and Prejudice 30 13 6
Geoffrey Chaucer The Canterbury Tales 39 6 7
Daniel Defoe Robinson Crusoe 19 30 8
Jonathan Swift Gulliver's Travels 29 21 9
Joseph Conrad Heart of Darkness 46 5 10
Simone de Beauvoir The Second Sex 26 25 11
Charles Dickens A Tale of Two Cities 32 20 12
George Orwell 1984 13 39 13
Edgar Allan Poe The Tell-Tale Heart 10 43 14
Ernest Hemingway The Old Man and the Sea 16 42 15
Henry David Thoreau Walden 55 8 16
Aldous Huxley Brave New World 50 14 17
William Faulkner The Sound and The Fury 41 28 18
Arthur Conan Doyle The Adventures of Sherlock Holmes 23 47 19
Dante Alighieri Inferno 3 68 20
T. S. Eliot The Waste Land 56 16 21
Mary Shelley Frankenstein 73 1 22
Leo Tolstoy Anna Karenina 9 65 23
Erich Maria Remarque All Quiet on the Western Front 44 32 24
Aesop Fables 4 72 25
Walt Whitman Leaves of Grass 51 26 26
Albert Camus The Stranger 37 40 27
John Steinbeck The Pearl 49 29 28
J. R. R. Tolkien Lord of the Rings 31 48 29
Vladimir Vladimirovich Nabokov Lolita 47 46 30
F. Scott Fitzgerald The Great Gatsby 85 9 31
Henry James The Turn of the Screw 63 34 32
Lewis Carroll Alice in Wonderland 36 61 33
Jack Kerouac On The Road 62 36 34
Arthur Miller The Crucible 77 23 35
Fyodor Dostoyevsky Crime and Punishment 8 94 36
Jack London The Call of the Wild 28 75 37
Thomas Paine Common Sense 89 15 38
Umberto Eco The Name of the Rose 27 78 39
Isaac Asimov Foundation 21 86 40
Victor Hugo Les Miserables 5 102 41
Hans Christian Andersen Fairy Tales 17 92 42
Charlotte Bronte Jane Eyre 59 52 43
Emily Bronte Wuthering Heights 45 66 44
Nathaniel Hawthorne The Scarlet Letter 94 18 45
Tennessee Williams A Streetcar Named Desire 78 35 46
William Golding Lord of the Flies 67 49 47
Alexandre Dumas The Count of Monte Cristo 58 59 48
Toni Morrison Beloved 109 10 49
Bram Stoker Dracula 97 22 50
Robert Louis Stevenson Treasure Island 88 37 51
Samuel Taylor Coleridge The Rime of the Ancient Mariner 82 44 52
Stephen King Carrie 20 108 53
Philip K. Dick Do Androids Dream of Electric Sheep? 72 58 54
Suzanne Collins The Hunger Games 24 110 55
Herman Melville Moby-Dick 118 19 56
H. G. Wells The Time Machine 90 50 57
Elie Wiesel Night 117 24 58
Raymond Chandler The Big Sleep 70 71 59
Saul Bellow The Adventures of Augie March 79 63 60
C. S. Lewis The Chronicles of Narnia 69 73 61
Miguel de Cervantes Don Quixote 11 131 62
Rudyard Kipling The Jungle Book 38 105 63
J. D. Salinger The Catcher in the Rye 113 31 64
Italo Calvino If On A Winter's Night a Traveller 76 69 65
E.T.A. Hoffmann The Nutcracker 52 96 66
Gabriel García Márquez One Hundred Years of Solitude 18 130 67
Agatha Christie And Then There Were None 25 124 68
Alexandre Dumas The Count of Monte Cristo 91 60 69
Jules Verne Twenty Thousand Leagues Under the Sea 35 117 70
Guy de Maupassant Collected Stories 43 111 71
Thomas Pynchon Gravity's Rainbow 114 41 72
Philip Roth Portnoy's Complaint 98 57 73
Walter Scott Ivanhoe 40 115 74
Charles Perrault Mother Goose 33 122 75
Paulo Coelho The Alchemist 48 109 76
Chinua Achebe Things Fall Apart 152 7 77
Boris Pasternak Doctor Zhivago 54 106 78
James Fenimore Cooper Last of the Mohicans 101 62 79
Louisa May Alcott Little Women 111 54 80
Frank Herbert Dune 81 84 81
Thomas Mann Death in Venice 14 152 82
E. M. Forster A Room With A View 122 45 83
Washington Irving The Legend of Sleepy Hollow 116 51 84
Niccolò Machiavelli The Prince 166 2 85
Ralph Ellison Invisible Man 156 12 86
Ray Bradbury Fahrenheit 451 115 56 87
Anthony Burgess A Clockwork Orange 108 64 88
Thomas Hardy Tess of the D'Urbervilles 105 67 89
D. H. Lawrence Lady Chatterley's Lover 96 77 90
Nostradamus Prophecies 7 166 91
Dale Carnegie How to Win Friends and Influence People 87 87 92
Ambrose Bierce An Occurrence at Owl Creek Bridge 95 82 93
Khalil Gibran The Prophet 57 120 94
Hermann Hesse Siddhartha 12 165 95
Alice Walker The Color Purple 148 33 96
Pearl S. Buck The Good Earth 74 107 97
Harper Lee To Kill a Mockingbird 144 38 98
Anne Rice Interview With The Vampire 64 118 99
Mario Puzo The Godfather 68 116 100
Haruki Murakami Kafka on the Shore 65 119 101
Antoine de Saint-Exupery The Little Prince 42 142 102
Ayn Rand Atlas Shrugged 106 80 103
Georges Simenon Maigret 61 127 104
Roald Dahl Best Of Roald Dahl 60 129 105
Dashiell Hammett The Maltese Falcon 107 83 106
William S. Burroughs Naked Lunch 104 88 107
Allen Ginsberg Howl and other Poems 83 112 108
Astrid Lindgren Pippi Longstocking 34 161 109
Truman Capote In Cold Blood 53 146 110
Lucy Maud Montgomery Anne of Green Gables 102 99 111
Kurt Vonnegut Slaughterhouse-Five 75 126 112
Frances Hodgson Burnett A Little Princess 124 79 113
Stephen Crane The Red Badge of Courage 151 53 114
Dan Brown The Da Vinci Code 132 74 115
Thomas Malory King Arthur 125 81 116
Margaret Mitchell Gone With the Wind 119 90 117
Maya Angelou I Know Why The Caged Bird Sings 155 55 118
Ursula K. Le Guin A Wizard of Earthsea 112 100 119
Carlos Castaneda The Teachings of Don Juan: A Yaqui Way of Knowledge 71 141 120
L. Frank Baum The Wonderful Wizard of Oz 147 70 121
J. M. Barrie Peter Pan 134 85 122
Cormac McCarthy The Road 127 93 123
Robert A. Heinlein Stranger in a Strange Land 100 121 124
Arthur C. Clarke 2001: A Space Odyssey 99 123 125
Aleksandr Solzhenitsyn One Day in the Life of Ivan Denisovich 66 157 126
Johanna Spyri Heidi 92 132 127
O. Henry The Gift of the Magi 93 133 128
Daphne du Maurier Rebecca 129 98 129
Margaret Atwood The Handmaid's Tale 133 95 130
Thornton Wilder The Bridge of San Luis Rey 143 91 131
Philip Pullman The Golden Compass 137 97 132
Khaled Hosseini The Kite Runner 159 76 133
Chuck Palahniuk Invisible Monsters 150 89 134
Terry Pratchett The Color of Magic 86 153 135
Graham Greene The Quiet American 80 162 136
Ian Fleming Casino Royale 103 143 137
Karen Blixen Out of Africa 84 163 138
Evelyn Waugh Brideshead Revisited 145 103 139
Jacob Grimm Fairy Tales 110 140 140
Richard Bach Jonathan Livingston Seagull 123 128 141
Edgar Rice Burroughs Tarzan 141 114 142
Harriet Beecher Stowe Uncle Tom's Cabin 121 137 143
Neil Gaiman American Gods 160 101 144
Bill Bryson A Short History of Nearly Everything 157 104 145
J. K. Rowling Harry Potter 138 125 146
Patricia Highsmith Strangers on a Train 131 134 147
Michael Crichton Jurassic Park 136 136 148
Robert Ludlum The Bourne Identity 120 154 149
Hunter S. Thompson Fear and Loathing in Las Vegas 130 145 150
Suzanne Collins The Hunger Games 164 113 151
Joseph Heller Catch-22 135 144 152
Dean Koontz Odd Thomas 126 155 153
J. G. Ballard Concrete Island 146 138 154
Wilhelm Grimm Fairy Tales 128 160 155
George R. R. Martin A Song of Ice and Fire 138 151 156
Ken Kesey One Flew Over the Cuckoo's Nest 140 150 157
Eoin Colfer Artemis Fowl 162 135 158
Douglas Adams Hitchhiker's Guide to the Galaxy 149 148 159
H. Rider Haggard King Solomon's Mines 142 156 160
Stephenie Meyer Twilight 154 149 161
Christopher Paolini Eragon 165 139 162
Orson Scott Card Ender's Game 161 147 163
Alex Haley Roots 153 159 164
Cornelia Funke Inkheart 158 158 165
Rick Riordan The Lightning Thief 163 164 166
