I had a large slide deck in PDF format and wanted to add the individual slides
to Anki to revise from. There is some benefit to the learning process in doing
this manually, but it was more practical to study the slides first and then
automate the import into Anki.
The final bash command to produce an importable CSV file for Anki looks like
this:
pdftoppm my_slides.pdf my_slide \
-progress -png -f 20 -l 923 -rx 60 -ry 60 && \
cp ./*.png ~/.local/share/Anki2/my_username/collection.media/ && \
echo >| my_list.csv && \
ls *.png \
| awk '{printf("<img src=\047%s\047/>\t\tmy_tag_1 my_tag_2\n",$1)}' \
>> my_list.csv
First it uses the pdftoppm tool to create a PNG image file for each slide:
pdftoppm my_slides.pdf my_slide -progress -png -f 20 -l 923 -rx 60 -ry 60
The -f and -l options set the first and last pages to convert, because we don’t
want the introductory or closing slides. The -rx and -ry options set the
horizontal and vertical resolution in DPI. 60 is a relatively low value, but
it is high enough for revision in Anki and keeps the image files small, which
reduces the size of the media for the Anki deck. The -progress option prints
progress information as each page is generated.
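To get a feel for what 60 DPI means in pixels, here is a small arithmetic sketch. PDF page sizes are measured in points (1/72 of an inch), so the pixel size is roughly points × DPI / 72. The 960 × 540 point page size is an assumption for illustration (a common 16:9 slide size), not something taken from my slides, and points_to_pixels is a made-up helper name. pdftoppm's own rounding may differ by a pixel.

```shell
# Hypothetical helper: pixels = points * dpi / 72, rounded to the nearest integer.
# $1 = page dimension in points, $2 = resolution in DPI.
points_to_pixels() {
    echo $(( ($1 * $2 + 36) / 72 ))
}

# Assumed page size of 960 x 540 points, rendered at 60 DPI.
width_px=$(points_to_pixels 960 60)
height_px=$(points_to_pixels 540 60)
echo "${width_px}x${height_px}"
```

Doubling the DPI doubles each dimension, so the pixel count (and, roughly, the file size) grows about fourfold.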
The script then copies all of the newly created PNG files to the Anki media
folder (note that this path is for Linux, and also that my_username needs
replacing with your Anki profile name):
cp ./*.png ~/.local/share/Anki2/my_username/collection.media/
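If the profile name is wrong, cp fails with a fairly terse error, so it can help to check the folder first. The following is a minimal sketch of that idea. copy_to_media is a hypothetical helper name, and the folder is passed in as an argument rather than hard-coded.

```shell
# Hypothetical helper: copy files into an Anki media folder, failing early
# with a clear message if the folder does not exist.
# $1 = media folder, remaining arguments = files to copy.
copy_to_media() {
    media_dir=$1
    shift
    if [ ! -d "$media_dir" ]; then
        echo "media folder not found: $media_dir" >&2
        return 1
    fi
    cp -- "$@" "$media_dir"/
}
```

It would be called as copy_to_media ~/.local/share/Anki2/my_username/collection.media ./*.png in place of the plain cp.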
Next the script resets the CSV file to a single empty line, using the >|
operator to overwrite the file even if the shell's noclobber option is set:
echo >| my_list.csv
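The difference between > and >| only shows up when noclobber is switched on. Here is a small sketch that demonstrates it on a scratch file. The variable names are made up for the example, and it uses : rather than echo so that the file ends up completely empty.

```shell
demo_file=$(mktemp)            # scratch file standing in for my_list.csv
echo "old row" > "$demo_file"

set -o noclobber               # make a plain > refuse to overwrite existing files
if (echo "new row" > "$demo_file") 2>/dev/null; then
    plain_redirect=overwrote
else
    plain_redirect=refused
fi

: >| "$demo_file"              # >| overwrites anyway; ':' itself writes nothing
set +o noclobber               # restore the default behaviour

echo "plain redirect: $plain_redirect"
```

Using : >| my_list.csv in the main script would avoid the empty first line that echo leaves behind.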
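Finally it pipes the list of PNG files through awk to produce tab-separated
rows. The first column is an HTML img tag referring to the image by file name,
the second is left empty, and the third holds any Anki tags we want. The \047
escapes are awk's octal notation for a single quote, which cannot appear
literally inside the single-quoted awk program:
ls *.png \
| awk '{printf("<img src=\047%s\047/>\t\tmy_tag_1 my_tag_2\n",$1)}' \
>> my_list.csv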
It’s a good idea to give these rows a specific tag to make it easy to delete all
of them later if you need to. You could also put an “answer” side in the second
column, between the two \t characters.
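As a sketch of how the answer side and the tags could be made adjustable, the awk program can take both as variables. make_rows is a hypothetical helper name and the file names below are made-up samples. The single quotes around the file name are written as \047 so that they survive the shell's quoting. Note that awk -v interprets backslash escapes in the values it is given.

```shell
# Hypothetical helper: read file names on stdin, write one Anki row per name.
# $1 = text for the back of the card (may be empty), $2 = space-separated tags.
make_rows() {
    awk -v back="$1" -v tags="$2" \
        '{ printf("<img src=\047%s\047/>\t%s\t%s\n", $0, back, tags) }'
}

# Example with two made-up file names.
rows=$(printf '%s\n' my_slide-020.png my_slide-021.png \
    | make_rows "See lecture notes" "my_tag_1 my_tag_2")
printf '%s\n' "$rows"
```

Passing an empty string as the first argument gives the same empty second column as the original command.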
Another possibility would have been to pipe the output of pdftoppm -progress
into awk and produce the CSV that way, but separating the two steps out via
ls makes it easier to re-use different parts of this script for other
purposes.
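For comparison, here is a rough sketch of that alternative. It rests on an assumption worth checking against your own pdftoppm version: that -progress prints one line per page to standard error, with the output file name as the last space-separated field. progress_to_rows is a hypothetical helper name.

```shell
# Assumption: each progress line looks like "<page> <last page> <file name>",
# so the file name is the last whitespace-separated field.
progress_to_rows() {
    awk '{
        n = split($NF, parts, "/")      # keep only the file name, not its folder
        printf("<img src=\047%s\047/>\t\tmy_tag_1 my_tag_2\n", parts[n])
    }'
}

# Intended use (not run here): send only standard error into the pipe.
#   pdftoppm my_slides.pdf my_slide -progress -png -f 20 -l 923 -rx 60 -ry 60 \
#       2>&1 >/dev/null | progress_to_rows > my_list.csv
```

Any warnings pdftoppm prints to standard error would end up in the CSV too, which is another reason to prefer the two-step version.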
Then you can import this CSV file into Anki, making sure that the “Allow HTML
in fields” option is enabled.
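Before importing, a quick sanity check on the file can catch malformed rows. This sketch counts rows that do not have exactly three tab-separated columns, ignoring empty lines. count_bad_rows is a hypothetical helper name.

```shell
# Hypothetical check: print the number of non-empty rows that do not have
# exactly three tab-separated columns.
# $1 = path to the CSV/TSV file.
count_bad_rows() {
    awk -F'\t' 'NF > 0 && NF != 3 { bad++ } END { print bad + 0 }' "$1"
}
```

Running count_bad_rows my_list.csv should print 0 for a file produced by the command above.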