Converting Exam Questions to Flashcards

Usually when preparing for an exam, I create flashcards based on practice exams, course material, and the notes I took during class. To do this I use a great piece of free (as in freedom) software called Anki.

Anki also has mobile apps and makes use of spaced repetition. This basically comes down to the following: if you get an answer right, it takes longer before the same question is asked again, and if you get an answer wrong, the question comes back sooner. It is supposed to be a more efficient method of learning (and remembering) facts.

The one element of Anki that has always bugged me is the interface for entering the questions and answers. Nothing about it is really bad, but it is not that good either. It is a combination of small annoyances that have added up over the years.

During the previous course (Intelligent Systems) one member of my project team used a simple office document with the questions, answers, and the location in the course material put into a table. While this format is easy to work with, it (obviously) lacks the exam-style questioning and the spaced repetition. Now I could go and copy/paste all that data from the office document into Anki by hand, but I’m a lazy person and doing it manually seems like a lot of unnecessary work. So I decided to write a script to do it for me.

I’m starting out with an odt file containing one or more tables. Each table has a row of headers and three columns (question, answer, location).

Example table image

To get the data out of the odt document I used the odfpy module. Using this module it is possible to load the data from any OpenDocument format file and convert it to XML. To parse the data I used BeautifulSoup4 with lxml as the back-end. By default, the tag used for a table is table:table, which for some reason is not found by bs4. The same is true for the content in the rows, which uses text:p and text:h. To fix this issue I deleted the part before the : (the namespace prefix) using the replace function in Python. To make the data easier to work with, I created a 2D list, with each sub-list being one row of the table.
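
As a rough illustration of the prefix-stripping step and the resulting 2D list, here is a minimal sketch. It is not the actual script: to keep it runnable without extra packages it uses Python's built-in xml.etree.ElementTree instead of BeautifulSoup4, and it works on a small hand-written XML snippet rather than on the XML that odfpy produces from a real odt file. The names SAMPLE_XML, strip_prefixes, and table_rows are made up for this example, and skipping the header row is my assumption about what you would want.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the XML of an odt table (assumption: a real
# document contains much more markup around and inside the table).
SAMPLE_XML = """<body>
<table:table table:name="Questions">
<table:table-row>
<table:table-cell><text:h>Question</text:h></table:table-cell>
<table:table-cell><text:h>Answer</text:h></table:table-cell>
<table:table-cell><text:h>Location</text:h></table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell><text:p>What is A*?</text:p></table:table-cell>
<table:table-cell><text:p>A best-first search algorithm</text:p></table:table-cell>
<table:table-cell><text:p>Ch. 3</text:p></table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell><text:p>What does BFS stand for?</text:p></table:table-cell>
<table:table-cell><text:p>Breadth-first search</text:p></table:table-cell>
<table:table-cell><text:p>Ch. 2</text:p></table:table-cell>
</table:table-row>
</table:table>
</body>"""


def strip_prefixes(xml, prefixes=("table:", "text:")):
    """Delete the given namespace prefixes with plain string replacement.

    This is deliberately naive: it would also alter cell text that happens
    to contain "table:" or "text:".
    """
    for prefix in prefixes:
        xml = xml.replace(prefix, "")
    return xml


def table_rows(xml, skip_header=True):
    """Return every table row as a list of cell strings (a 2D list)."""
    root = ET.fromstring(strip_prefixes(xml))
    rows = []
    for table in root.iter("table"):
        table_row_elements = list(table.iter("table-row"))
        if skip_header:
            table_row_elements = table_row_elements[1:]
        for row in table_row_elements:
            cells = []
            for cell in row.iter("table-cell"):
                # A cell can hold several paragraphs (p) or headings (h).
                parts = ["".join(el.itertext()).strip()
                         for el in cell if el.tag in ("p", "h")]
                cells.append(" ".join(parts))
            rows.append(cells)
    return rows


if __name__ == "__main__":
    for row in table_rows(SAMPLE_XML):
        print(row)
```

Without the call to strip_prefixes, ElementTree rejects this snippet because the table: and text: prefixes are not declared anywhere in it, so the string replacement is what makes the plain tag names table, table-row, table-cell, p, and h usable here.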

Now that I have the data out of the odt file and in a workable format, I have to figure out a way to get it into a format Anki understands. This turned out to be very easy. Anki supports importing from a plain-text file in which the question and answer parts of a flashcard are separated by a tab. So the only thing I had to do was put together a string for each row (using the format [QUESTION]\t[ANSWER] (LOCATION)) and write it to a text file.
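
The export step could look roughly like the sketch below. Again, this is an illustration and not the actual script: the function names to_anki_line, anki_text, and write_anki_file are made up, and collapsing tabs and newlines inside a cell is my own assumption, added so that a stray tab in a question cannot be mistaken for the field separator.

```python
def to_anki_line(question, answer, location):
    """Build one line in the form QUESTION<tab>ANSWER (LOCATION)."""
    def clean(text):
        # Assumption: collapse tabs, newlines, and repeated spaces into
        # single spaces so each flashcard stays on exactly one line.
        return " ".join(text.split())

    return f"{clean(question)}\t{clean(answer)} ({clean(location)})"


def anki_text(rows):
    """Turn a 2D list of [question, answer, location] rows into file content."""
    return "".join(to_anki_line(q, a, loc) + "\n" for q, a, loc in rows)


def write_anki_file(rows, path):
    """Write the flashcards to a UTF-8 text file for import into Anki."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(anki_text(rows))


if __name__ == "__main__":
    example_rows = [
        ["What is A*?", "A best-first search algorithm", "Ch. 3"],
        ["What does BFS stand for?", "Breadth-first search", "Ch. 2"],
    ]
    print(anki_text(example_rows), end="")
```

Keeping the string building (anki_text) separate from the file writing makes it easy to check the output before anything is written to disk.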

There are some small improvements to be made: some symbols get converted to hex values, and the script could probably use some error handling. But for now it seems to work well enough.

The code is available here and uses the MIT license.