Original Repository: https://github.com/YunYunSeoJANG/BookGNN
The data comes from the Goodreads Book Graph Datasets, which include book, user, and review IDs. This project uses the Poetry subset: 36,514 books, 2,734,350 interactions, and 154,555 detailed reviews. Note that the 'reviews' are the subset of 'interactions' whose review text is particularly lengthy.
Below are example records for "book_id": "402128".
36,514 books
{"isbn": "0151686564", "text_reviews_count": "626", "series": [], "country_code": "US", "language_code": "en-US", "popular_shelves": [{"count": "15419", "name": "to-read"}, {"count": "1762", "name": "poetry"}, {"count": "302", "name": "classics"}, {"count": "187", "name": "favorites"}, {"count": "149", "name": "fiction"}, {"count": "136", "name": "cats"}, {"count": "107", "name": "animals"}, {"count": "104", "name": "owned"}, {"count": "100", "name": "books-i-own"}, {"count": "97", "name": "humor"}, {"count": "68", "name": "childrens"}, {"count": "63", "name": "currently-reading"}, {"count": "63", "name": "children"}, {"count": "55", "name": "classic"}, {"count": "46", "name": "literature"}, {"count": "45", "name": "children-s"}, {"count": "36", "name": "default"}, {"count": "31", "name": "humour"}, {"count": "31", "name": "children-s-books"}, {"count": "29", "name": "poems"}, {"count": "28", "name": "owned-books"}, {"count": "25", "name": "library"}, {"count": "25", "name": "favourites"}, {"count": "24", "name": "english"}, {"count": "23", "name": "british"}, {"count": "21", "name": "art"}, {"count": "20", "name": "childhood"}, {"count": "19", "name": "to-buy"}, {"count": "18", "name": "illustrated"}, {"count": "18", "name": "my-library"}, {"count": "18", "name": "kids"}, {"count": "18", "name": "american"}, {"count": "18", "name": "20th-century"}, {"count": "16", "name": "british-literature"}, {"count": "15", "name": "my-books"}, {"count": "14", "name": "children-s-literature"}, {"count": "12", "name": "picture-books"}, {"count": "12", "name": "home-library"}, {"count": "12", "name": "childrens-books"}, {"count": "12", "name": "childhood-favorites"}, {"count": "12", "name": "young-adult"}, {"count": "12", "name": "edward-gorey"}, {"count": "12", "name": "non-fiction"}, {"count": "11", "name": "read-in-2016"}, {"count": "11", "name": "verse"}, {"count": "11", "name": "poes\u00eda"}, {"count": "11", "name": "fantasy"}, {"count": "11", "name": "read-in-english"}, 
{"count": "11", "name": "own-it"}, {"count": "11", "name": "nobel"}, {"count": "10", "name": "read-in-2017"}, {"count": "10", "name": "poesia"}, {"count": "10", "name": "poetry-plays"}, {"count": "10", "name": "wish-list"}, {"count": "9", "name": "nobel-prize"}, {"count": "9", "name": "1930s"}, {"count": "9", "name": "re-read"}, {"count": "9", "name": "i-own"}, {"count": "8", "name": "2017-reading-challenge"}, {"count": "8", "name": "2016-reading-challenge"}, {"count": "8", "name": "childhood-books"}, {"count": "8", "name": "funny"}, {"count": "8", "name": "books-we-own"}, {"count": "8", "name": "read-aloud"}, {"count": "8", "name": "nonfiction"}, {"count": "7", "name": "read-in-2014"}, {"count": "7", "name": "to-read-poetry"}, {"count": "7", "name": "childrens-lit"}, {"count": "7", "name": "classic-literature"}, {"count": "7", "name": "other"}, {"count": "7", "name": "mine"}, {"count": "7", "name": "general-fiction"}, {"count": "6", "name": "read-2016"}, {"count": "6", "name": "modern-classics"}, {"count": "6", "name": "nobel-laureates"}, {"count": "6", "name": "cat"}, {"count": "6", "name": "poetry-and-plays"}, {"count": "6", "name": "kids-books"}, {"count": "6", "name": "read-alouds"}, {"count": "6", "name": "short-stories"}, {"count": "6", "name": "nobel-prize-winners"}, {"count": "6", "name": "lit"}, {"count": "6", "name": "want-to-own"}, {"count": "6", "name": "plays"}, {"count": "6", "name": "american-lit"}, {"count": "6", "name": "in-english"}, {"count": "5", "name": "read-2015"}, {"count": "5", "name": "musicals"}, {"count": "5", "name": "home"}, {"count": "5", "name": "talking-animals"}, {"count": "5", "name": "read-in-2012"}, {"count": "5", "name": "have"}, {"count": "5", "name": "adult"}, {"count": "5", "name": "modernism"}, {"count": "5", "name": "want"}, {"count": "5", "name": "read-in-2011"}, {"count": "5", "name": "t-s-eliot"}, {"count": "5", "name": "animal-fiction"}, {"count": "5", "name": "children-s-lit"}, {"count": "5", "name": "england"}], 
"asin": "", "is_ebook": "false", "average_rating": "4.09", "kindle_asin": "", "similar_books": ["884306", "234", "472443", "51244", "305154", "574889", "201711", "857597", "858497", "864051", "400723", "47564", "1391333", "133380", "285151"], "description": "T. S. Eliot's playful cat poems have delighted readers and cat lovers around the world ever since they were first published in 1939. They were originally composed for his godchildren, with Eliot posing as Old Possum himself, and later inspired the legendary musical Cats.", "format": "Hardcover", "link": "https://www.goodreads.com/book/show/402128.Old_Possum_s_Book_of_Practical_Cats", "authors": [{"author_id": "18540", "role": ""}, {"author_id": "21578", "role": "Illustrator"}], "publisher": "Harcourt Brace & Company", "num_pages": "56", "publication_day": "30", "isbn13": "9780151686568", "publication_month": "8", "edition_information": "Illustrated Edition", "publication_year": "1982", "url": "https://www.goodreads.com/book/show/402128.Old_Possum_s_Book_of_Practical_Cats", "image_url": "https://images.gr-assets.com/books/1327882662m/402128.jpg", "book_id": "402128", "ratings_count": "15716", "work_id": "372536", "title": "Old Possum's Book of Practical Cats", "title_without_series": "Old Possum's Book of Practical Cats"}
2,734,350 interactions
{"user_id": "80d52f5e70f023bd0098ab96599a3530", "book_id": "402128", "review_id": "fbd6a22a155c87a84fba7537f06cc94b", "is_read": true, "rating": 4, "review_text_incomplete": "", "date_added": "Fri Apr 19 08:15:15 -0700 2013", "date_updated": "Fri Apr 19 08:15:15 -0700 2013", "read_at": "", "started_at": ""}
154,555 detailed reviews
{"user_id": "3ca7375dba942a760e53b726c472a7dd", "book_id": "402128", "review_id": "28423ff309bc896c071a8d9df4a10e8a", "rating": 5, "review_text": "I have three younger siblings and we grew up watching the musical Cats. We knew all the songs and attempted to do the dance moves too. I remember we used to get trouble for jumping off the sofa too. When I found out that Cats was based off of poems, I really wanted to read them. I asked for the book for Christmas one year and I read them all that day. The poems are beautifully written and actually tell stories, whereas some poems are just descriptions. I have no idea how T.S, Eliot came up with so creative and brilliant with something as familiar as the family cat. Eliot is a great writer and I would recommend this book to anyone who is looking for a break from all the intense, sophisticated poems/books they are usually reading. This book is fun and is guaranteed to brighten your day!", "date_added": "Tue Jun 12 08:59:04 -0700 2012", "date_updated": "Fri Jun 15 11:41:12 -0700 2012", "read_at": "", "started_at": "", "n_votes": 0, "n_comments": 0}
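Most numeric fields in the book records are stored as strings (for example "count": "15419" and "average_rating": "4.09"), so they need casting before any numeric use. The following is a minimal sketch, assuming a record shaped like the book example above; `top_shelves` is a hypothetical helper name and is not part of the repository.

```python
import json

# A trimmed copy of the book record shown above (most fields omitted).
record_line = (
    '{"book_id": "402128", "average_rating": "4.09", "ratings_count": "15716", '
    '"popular_shelves": [{"count": "15419", "name": "to-read"}, '
    '{"count": "1762", "name": "poetry"}, {"count": "302", "name": "classics"}]}'
)


def top_shelves(book, k=2, skip=("to-read",)):
    """Return the k most-used shelf names, casting string counts to int."""
    shelves = [
        (int(s["count"]), s["name"])
        for s in book["popular_shelves"]
        if s["name"] not in skip
    ]
    shelves.sort(reverse=True)  # highest count first
    return [name for _, name in shelves[:k]]


book = json.loads(record_line)
print(top_shelves(book))              # ['poetry', 'classics']
print(float(book["average_rating"]))  # 4.09
```

Skipping "to-read" by default is only an illustrative choice: that shelf reflects reading status rather than genre.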
Download the poetry datasets from the Goodreads Book Graph Datasets into the goodreads folder.
mkdir goodreads
cd goodreads
# Download
wget https://datarepo.eng.ucsd.edu/mcauley_group/gdrive/goodreads/byGenre/goodreads_books_poetry.json.gz
wget https://datarepo.eng.ucsd.edu/mcauley_group/gdrive/goodreads/byGenre/goodreads_interactions_poetry.json.gz
wget https://datarepo.eng.ucsd.edu/mcauley_group/gdrive/goodreads/byGenre/goodreads_reviews_poetry.json.gz # not used now
# unzip
gunzip *.gz
The datasets are then stored as follows:
cd goodreads
.
├── goodreads_books_poetry.json
├── goodreads_interactions_poetry.json
└── goodreads_reviews_poetry.json
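To check that the downloads are complete, the unzipped files can be streamed record by record. This is a minimal sketch, assuming each line of a file holds one JSON object (as in the sample records above) and that the script runs from the directory containing the goodreads folder; `iter_records` and `count_records` are hypothetical helper names.

```python
import json
import os


def iter_records(path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_records(path):
    """Count records without loading the whole file into memory."""
    return sum(1 for _ in iter_records(path))


# Only files that actually exist are counted, so this is safe to run anywhere.
for name in ("goodreads_books_poetry.json", "goodreads_interactions_poetry.json"):
    path = os.path.join("goodreads", name)
    if os.path.exists(path):
        print(name, count_records(path))
```

If the files are intact, the counts should agree with the figures quoted at the top (36,514 books and 2,734,350 interactions).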
For venv users (python==3.10.12 recommended)
python3.10 -m venv .bookgnn
source .bookgnn/bin/activate
pip3 install -r requirements.txt
For conda users (not yet tested)
conda create -n bookgnn python==3.10.12
conda activate bookgnn
pip3 install -r requirements.txt
Preprocess the downloaded datasets. For now, only goodreads_books_poetry.json and goodreads_interactions_poetry.json are used. The preprocessed results are saved in the datasets folder.
python3 src/preprocess.py
cd datasets
.
├── books_poetry.json
└── interactions_poetry.json
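What src/preprocess.py does in detail is not described here. Purely as an illustration of the kind of step a graph-based recommender usually needs, the sketch below maps the string user_id and book_id fields to contiguous integer node indices and turns interactions into user-book edges. The function name `build_edges`, the rule of keeping only is_read interactions, and the short ids "u2", "u3", and "b2" are all assumptions made for this example.

```python
def build_edges(interactions):
    """Map string ids to integer indices and return (user_idx, book_idx) pairs.

    Assumption: only interactions with a truthy is_read become edges.
    """
    user_index, book_index, edges = {}, {}, []
    for rec in interactions:
        if not rec.get("is_read"):
            continue
        # setdefault assigns the next free index the first time an id is seen.
        u = user_index.setdefault(rec["user_id"], len(user_index))
        b = book_index.setdefault(rec["book_id"], len(book_index))
        edges.append((u, b))
    return user_index, book_index, edges


# The first record reuses ids from the interaction example; the rest are made up.
sample = [
    {"user_id": "80d52f5e70f023bd0098ab96599a3530", "book_id": "402128", "is_read": True},
    {"user_id": "u2", "book_id": "402128", "is_read": False},
    {"user_id": "u3", "book_id": "b2", "is_read": True},
]
users, books, edges = build_edges(sample)
print(edges)  # [(0, 0), (1, 1)]
```

Integer indices are convenient because graph libraries generally expect node ids in the range 0 to N-1 rather than hash strings.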
./scripts/train.sh
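If you want parallelized training with multiple GPUs, set the CUDA_VISIBLE_DEVICES environment variable in train.sh. Note that CUDA_VISIBLE_DEVICES=0 exposes only the first GPU; a comma-separated list such as CUDA_VISIBLE_DEVICES=0,1 exposes several.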
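Before starting a long run, it can help to confirm which devices the CUDA_VISIBLE_DEVICES setting exposes to the process. A minimal sketch using only the standard library follows; `visible_gpu_ids` is a hypothetical helper, not something the repository provides, and it only reads the variable without querying the driver.

```python
import os


def visible_gpu_ids(env=None):
    """Return the device ids listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (CUDA then exposes every GPU)
    and an empty list when it is set to an empty string (no GPU is visible).
    Ids are kept as strings because the variable may also hold device UUIDs.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [part.strip() for part in raw.split(",") if part.strip()]


print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "0"}))     # ['0']
print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "0, 1"}))  # ['0', '1']
```

Calling `visible_gpu_ids()` with no argument inspects the current process environment.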
The result plots (train/val loss, train/val ROC) are saved in /train_result_plots.
python3 src/training_comics.py
wandb login
wandb sweep src/sweep_config.yaml
wandb agent pljh0906/Prometheus-GNN-Book-Recommendations/<sweep id> --count <count>
python3 src/show_list.py
python3 src/visualize.py
μ₯μ€μ (Leader) Data Preprocessing, Skeleton Code Revision, Test Code Setting, GNN Seminar
κΉμ€μ Parameter Search, Recommendation System Setting, Demo Setting (show_list.py)
κΉμ§ν Code Review & Revision (.ipynb to .py), Model Scale-up (Poetry to Comics)
λ¬Έμ¬μ Code Review, Recommendation System Setting, Website Development
λ°μ€ν Graph Visualization, Code Review, Wandb setting (Parameter Sweep, Optimization)