Remodeling the Shelf
For the past eighteen months, I kept notes on books I had read in a collection of markdown files. I used these to publish my shelf page. The main problem was that Hugo, the static site generator I use to produce my website, only has two options for a particular file: publish or ignore. This meant that when I used templates to generate a list of books from my markdown notes, I also published that many additional pages on my website, one for each markdown file. It made my page count go from the mid-twenties to nearly 100.
What I wanted was a list of books I had read and a way to compile them into one
list, published on a page. To do this, I needed to completely change the way I
had kept my notes. I wanted to make a .csv file that contained my reading list
and appropriate metadata. This would have been a chore to do manually, since I
would have had to parse the yaml frontmatter of every file by hand.
Fortunately, I’ve been toying with python for the past six months. I played around with it last night and managed to kludge together a solution. I think, given some time, it could be much better, but this was what I got together in ten minutes. This set of code:
- Makes a python list of paths to .md files, in os path format.
- Loops through each path to parse the yaml frontmatter of each file.
- Takes the title, author, date, isbn, and rating values for each file and creates a dataframe.
- Saves the dataframe to a .csv file for future use.
I also made my shelf page layout
compatible with the .csv data format.
Cataloging the Shelf
I started by making a list of paths within my book notes directory.
from pathlib import Path

# Collect the path of every markdown note in the books directory
paths = sorted(Path('books').glob('*.md'))
Once I had all of the paths for the notes, I used the load_all() method of
ruamel.yaml.YAML and iterated through each path to extract the data. I sorted
the frontmatter data for each file into a bunch of lists that contained the
values I wanted to save. In this case, these were the date I read the book, the
title and author, the isbn, and the rating I assigned to the book.
import ruamel.yaml

yaml = ruamel.yaml.YAML()

dates, titles, authors, isbns, ratings = [], [], [], [], []

for path in paths:
    # The frontmatter is the first YAML document in the file, so stop
    # after it instead of trying to load the markdown body as YAML.
    for data in yaml.load_all(path):
        break
    dates.append(data['date'])
    titles.append(data['title'])
    authors.append(data['author'])
    isbns.append(data['isbn'])
    ratings.append(data['rating'])
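The loop above leans on the way a note is laid out: the frontmatter sits between two --- lines at the top of the file, and everything after the second marker is ordinary markdown. As a rough illustration of that layout, here is a minimal sketch that separates the two parts using only the standard library. The split_frontmatter helper and the sample note are made up for this example and are not part of the original script; it only finds the frontmatter text and leaves the actual YAML parsing to a library such as ruamel.yaml.

```python
def split_frontmatter(text):
    """Return (frontmatter, body) for a note that starts with a '---' line."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != '---':
        return '', text  # no frontmatter block at all
    # Look for the closing '---' that ends the frontmatter.
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == '---':
            frontmatter = '\n'.join(lines[1:i])
            body = '\n'.join(lines[i + 1:])
            return frontmatter, body
    return '', text  # opening marker without a closing one


# Hypothetical note, invented for illustration
note = """---
title: Example Book
author: Example Author
rating: 7
---
Some thoughts that are not valid YAML: [unclosed
"""

frontmatter, body = split_frontmatter(note)
print(frontmatter)
```

The body in this sample is deliberately not valid YAML, which is the usual reason for reading only the first document of a markdown note.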
Although lists are good, I wanted my data to be stored in a file. I
converted the lists into a single pandas dataframe, then saved it as a .csv
file. This is a very nice little trick that I use at work, since it allows me to
run the actual analysis and workup separately from the data processing.
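import pandas as pd

# Build one dataframe with a column per frontmatter field
df = pd.DataFrame({
    'date': dates,
    'title': titles,
    'author': authors,
    'isbn': isbns,
    'rating': ratings,
})

# Sort chronologically, then write without the row index so the columns
# stay in the order date, title, author, isbn, rating
df = df.sort_values(by='date')
df.to_csv('out/shelf.csv', index=False)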
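For a job this small, pandas is a convenience rather than a requirement. As a hedged alternative, the sketch below writes the same kind of file with Python's built-in csv module. The records, the write_shelf helper, and the placeholder ISBNs are all invented for illustration, and it assumes the dates are stored as YYYY-MM-DD strings so that sorting them as text puts them in chronological order.

```python
import csv
import io

# Hypothetical records, standing in for the parsed frontmatter
records = [
    {'date': '2021-03-02', 'title': 'Second Book', 'author': 'B. Writer',
     'isbn': '0000000002', 'rating': 5},
    {'date': '2021-01-15', 'title': 'First Book', 'author': 'A. Writer',
     'isbn': '0000000001', 'rating': 7},
]

fields = ['date', 'title', 'author', 'isbn', 'rating']


def write_shelf(rows, handle):
    """Write rows to handle as CSV, oldest date first, with a header row."""
    writer = csv.DictWriter(handle, fieldnames=fields)
    writer.writeheader()
    # Assumes ISO-style dates, which sort correctly as plain strings
    writer.writerows(sorted(rows, key=lambda row: row['date']))


# An in-memory buffer keeps the example self-contained; for a real file,
# pass open('out/shelf.csv', 'w', newline='') instead.
buffer = io.StringIO()
write_shelf(records, buffer)
print(buffer.getvalue())
```

Collecting each book as one dictionary, rather than spreading it across five parallel lists, also means the fields of a single book cannot drift out of step with each other.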
Once everything was saved, I chucked shelf.csv into my data folder. It’s a lot
smaller than the sixty or so markdown files that I had before, but it contains
about the same amount of interesting information.
Rebuilding the Shelf
With this slew of data in a new and improved format, I set about constructing a
Hugo template to display it. .csv files aren’t Hugo’s favorite data format
(that dubious honor belongs to toml or json), so working with them is less
smooth. Hugo recommends using the data.getCSV
function, which can access the data a little more cleanly than the
transform.Unmarshal
function, so I stuck with the former.
{{ $data := getCSV "," "data/shelf.csv" }}
It does require one to specify the comma as the character to separate by, which
strikes me as a little bit odd for a function specifically for .csv files. The
only other quirk of the function is the way in which one accesses the values in
a particular row. I had to do a little bit of scouring on Stack Overflow to find
out that (index $row $column) accesses data from a particular cell. My
shelf.csv file includes several columns, all of which need to be accessed in this
manner. Once I learned that, I made a simple list to test it out:
<ul>
  {{ range $i, $r := $data }}
    {{ if (ne $i 0) }} {{/* don't print the header row */}}
      {{ $title := (index $r 1) }} {{/* get data from column 1 */}}
      {{ $author := (index $r 2) }} {{/* get data from column 2 */}}
      <li>{{ $title }} by {{ $author }}</li>
    {{ end }}
  {{ end }}
</ul>
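Because getCSV returns the file as rows of plain strings, the indexing in the template can be rehearsed in Python before touching Hugo at all. The sketch below is only an analogy, not anything Hugo runs: the sample contents are invented, and the column order (date, title, author, isbn, rating) is an assumption chosen to match the indices used in the template.

```python
import csv
import io

# Hypothetical file contents, laid out as date, title, author, isbn, rating
shelf_csv = """date,title,author,isbn,rating
2021-01-15,First Book,A. Writer,0000000001,7
2021-03-02,Second Book,B. Writer,0000000002,5
"""

# csv.reader yields each row as a list of strings, header row included
rows = list(csv.reader(io.StringIO(shelf_csv)))

entries = []
for i, row in enumerate(rows):
    if i == 0:
        continue  # skip the header row, as the template does
    title = row[1]   # same cell as (index $r 1)
    author = row[2]  # same cell as (index $r 2)
    entries.append(f'{title} by {author}')

print(entries)
```

Counting starts at zero in both places, so column 1 is the second cell of each row.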
I did a little bit more experimentation before defining my actual shelf.html
template, but this is the core of it. The final template, all polished up and
pretty, looks like this:
{{ define "main" }}

{{ .Content }}

{{ $data := getCSV "," "data/shelf.csv" }}
<ul class="shortlist">
  {{ range $i, $r := $data }}
    {{ if (ne $i 0) }}
      {{ $rating := (index $r 4) }}
      {{ $title := (index $r 1) }}
      {{ $author := (index $r 2) }}
      <li>
        {{ if ( ge $rating 6 ) }}⋆{{ end }}
        <b>{{ $title }}</b>,
        {{ $author }}.
        {{/* include commentary */}}
        {{ with (index $r 5) }}
          <ul><li>{{ . | markdownify }}</li></ul>
        {{ end }}
      </li>
    {{ end }}
  {{ end }}
</ul>

{{ end }}
What I did here is much like the previous example but with more frills. This
time I’m also including the rating (index $r 4) and any commentary I might
have (index $r 5).
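The per-row logic is easy to lose among the template braces, so here it is once more as a plain Python sketch: star the entry when the rating reaches 6, and add commentary only when there is some. The format_entry helper and the sample rows are hypothetical, and the output is plain text rather than HTML. One thing the Python version makes explicit is that every CSV cell arrives as a string, so the rating has to be converted before it is compared with a number.

```python
def format_entry(row, star_threshold=6):
    """Format one CSV row (a list of strings) as a line of plain text."""
    title, author, rating = row[1], row[2], row[4]
    # Commentary is optional: a missing or empty sixth cell means none
    commentary = row[5] if len(row) > 5 else ''
    # CSV cells are strings, so convert before comparing with a number
    star = '⋆ ' if int(rating) >= star_threshold else ''
    line = f'{star}{title}, {author}.'
    if commentary:
        line += f'\n    {commentary}'
    return line


# Hypothetical rows: date, title, author, isbn, rating, commentary
starred = ['2021-01-15', 'First Book', 'A. Writer', '0000000001', '7',
           'Worth rereading.']
plain = ['2021-03-02', 'Second Book', 'B. Writer', '0000000002', '5']

print(format_entry(starred))
print(format_entry(plain))
```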
And the accompanying styling:
ul.shortlist {
  list-style: none;
  padding-left: 0;
}

ul.shortlist li {
  padding-bottom: 0.6em;
}
Closing Thoughts
I had a lot of fun putting this together. It only took me a couple of minutes to plan out what I wanted to do, and the implementation only took another hour or so. Although I could have moved the data by hand in about the same amount of time as it took me to write the code, I had more fun doing it this way. Plus, now I understand how to use some of these tools for any future projects.
It’s also a reminder for me to keep data in formats that are more amenable to processing. Even if I don’t think that I want to work up the data right now, keeping my records in a more usable format will make the work easier for future me.
Updated by Elliott Weix.