Solving HQ Trivia Questions with OCR and Wikipedia

David Hariri
4 min read · Nov 15, 2017

Quick note: I have never used my solver to win. That would be cheating and it would defeat the purpose of the game. I dislike cheaters as much as anyone else. My intention was to see if it could be made, not to make a buck or have an unfair advantage.

Like many people, I’ve been totally hooked on the new trivia game-show app HQ Trivia. Unfortunately, I have a very narrow range of knowledge. I know a lot about technology and startups, but almost nothing about history, American geography, popular culture, or literature. Most of what I read online is about technology, and most of the books I read are about spirituality and self-development.

Needless to say, most of my solo games end at question three or four.

Being a programmer, I started dreaming about how I could overcome my knowledge handicap with the assistance of my computer and the internet. Before I go into a diary of how I came to my solution, I’d like to present it first for those of us who don’t want to read a long article.

This is the winning answer for Question 12 from last night’s game. You can see that even when the image is cropped and filtered, Tesseract still misinterprets the word “accepting” due to the rounding and size of the font. While this screenshot has the winning answer selected, that information was ignored (obviously).

My solution gives a best guess answer in under two seconds from the moment of capture. In my early tests (with very small sample sizes), it answered correctly around 70% of the time. This is obviously a lot better than guessing and there’s still so much that could be improved in every step of the pipeline (see below).

How it works

The first step is capturing a video stream of the game from my iPhone to my computer using QuickTime. I then take screenshots of the game and, using the OCR library Tesseract, read the question and answers from the screen. I run those strings through a simple term-frequency algorithm that spits out a best guess at the right answer. The algorithm works by reading the content of the Wikipedia article that corresponds to each answer and counting the frequency of the unique set of question terms (minus stop words) in each article. This assigns a score to each answer, and the highest-scoring answer is taken as the most likely one.
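To make that concrete, here’s a minimal sketch of the scoring step in Python. This is not the gist itself: the stop-word list is truncated and the function names are my own.

    import re
    import requests

    # Tiny illustrative stop-word list; the real one should be much longer.
    STOP_WORDS = {"the", "a", "an", "of", "in", "on", "to", "is", "are",
                  "what", "which", "who", "was", "for", "and", "or", "by"}

    def fetch_article(title):
        """Fetch the plain-text extract of a Wikipedia article."""
        resp = requests.get("https://en.wikipedia.org/w/api.php", params={
            "action": "query",
            "prop": "extracts",
            "explaintext": "1",
            "redirects": "1",
            "format": "json",
            "titles": title,
        }, timeout=5)
        pages = resp.json()["query"]["pages"]
        # Pages are keyed by page ID; there is only one here.
        return next(iter(pages.values())).get("extract", "").lower()

    def score(question, answer):
        """Count how often the question's unique terms appear in the answer's article."""
        terms = set(re.findall(r"[a-z']+", question.lower())) - STOP_WORDS
        article = fetch_article(answer)
        return sum(article.count(term) for term in terms)

    def best_guess(question, answers):
        return max(answers, key=lambda a: score(question, a))

    print(best_guess("Which planet is known as the Red Planet?",
                     ["Venus", "Mars", "Jupiter"]))

The highest count wins. Ties, redirects, and missing articles are exactly the kinds of edge cases the real code has to handle.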

If you’re interested in seeing the code, here’s a gist. Go easy on me; it was hacked together in a few hours and doesn’t represent my best work.

Why OCR?

I had to use OCR to “read” the question and answers because HQ Trivia has actually done a surprisingly good job of hiding the question and answer data from proxy inspection.

Using the debugging proxy Charles, I found lots of interesting information about upcoming games, users, all the links to the live streams, and in-game comments, but the actual game data seems to be streamed somewhere else.

HQ uses Wirecast to stream the game from their studio. Wirecast obviously handles streaming the video, but it also supports simultaneous data streaming (cool!). My guess is that the game data is broadcast on the Wirecast socket over a protocol Charles can’t understand (RTP/RTSP), or that they’re using SSL pinning to defeat proxy-based inspection in the first place (Twitter and Facebook use this method).

Kudos to the team at HQ Trivia. I was impressed with this level of obfuscation for something that’s only been around publicly for a few weeks.

Improvements

Here’s a list of things I’d like to improve about my solution. I would have liked to implement them before writing this, but frankly, I’m a founder, and this project was already a dubious use of my time.

  • Capture screenshots automatically with Hammerspoon’s waitUntil by reading the pixel data where the white question card will be (thanks, Rinoc Johnson!). Taking screenshots by hand when the card showed up was stressful and error-prone. (A Python sketch of the same idea follows this list.)
  • Use aiohttp to make the batch of Wikipedia API requests happen asynchronously (sketch after this list)
  • Error handling
  • Better debug features (score inspection etc…)
  • Automated testing
  • Accuracy and efficiency improvements to the algorithm
  • Edge detection for getting the crop area automatically
  • Use the terms in the question (not just the answer options) to assist the likelihood scores by reading the contents of related Wikipedia articles.
  • Train Tesseract on round fonts (I tried to do this, but I didn’t do enough training to make a significant difference)
  • Run the question and answer strings through a quick spell check to fix any goofs by Tesseract (sketch after this list; thanks, Rinoc Johnson)
  • Use different methods for different types of questions. For example, this approach wouldn’t work well on: “What shape is a dead end sign?”
  • Handle answers that aren’t nouns (amounts, years etc…). For example: “What year did the Apollo 11 mission leave Earth?”
  • Handle negative questions. For example: “Which of the following is NOT a colour?”
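On the screenshot point: Hammerspoon’s waitUntil is Lua, and since the rest of my code is Python, the same “wait for the white card” idea could also be sketched with Pillow’s ImageGrab. The coordinates and threshold here are made up:

    import time
    from PIL import ImageGrab

    CARD_PIXEL = (360, 420)  # a point inside where the white card appears (a guess)

    def wait_for_card(threshold=240, poll=0.1):
        """Block until the pixel under the question card turns white-ish,
        then return the full frame for cropping and OCR."""
        while True:
            frame = ImageGrab.grab()
            r, g, b = frame.getpixel(CARD_PIXEL)[:3]
            if min(r, g, b) >= threshold:
                return frame
            time.sleep(poll)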
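On the aiohttp point: the three article fetches are independent, so they can run concurrently instead of one after another. A sketch, using the same API parameters as the scoring example above (the function names are mine):

    import asyncio
    import aiohttp

    API = "https://en.wikipedia.org/w/api.php"

    async def fetch_article(session, title):
        params = {"action": "query", "prop": "extracts", "explaintext": "1",
                  "redirects": "1", "format": "json", "titles": title}
        async with session.get(API, params=params) as resp:
            pages = (await resp.json())["query"]["pages"]
            return next(iter(pages.values())).get("extract", "").lower()

    async def fetch_all(titles):
        # One session, N concurrent requests instead of N sequential ones.
        async with aiohttp.ClientSession() as session:
            return await asyncio.gather(*(fetch_article(session, t) for t in titles))

    articles = asyncio.run(fetch_all(["Venus", "Mars", "Jupiter"]))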
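And on the spell check: one option (an assumption on my part, not what the gist does) is the pyspellchecker package:

    from spellchecker import SpellChecker  # pip install pyspellchecker

    spell = SpellChecker()

    def clean_ocr(text):
        """Swap words Tesseract likely garbled for their closest dictionary match."""
        fixed = [spell.correction(w) or w for w in text.split()]
        return " ".join(fixed)

    print(clean_ocr("Tne answor is Mars"))  # "the answer is mars" (corrections come back lowercased)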

Thanks to Brandon Mowat and Rinoc Johnson for helping me build this. If you’re interested in working with us, our team is hiring for a bunch of roles. If you’re into building quick prototypes as a way of exploring product features, we’d probably love to have you join us at Ada in Toronto.

Thanks for reading!
