A few months ago I decided to port my WoW data importer app from PHP to Java because the website was already built in Java. I also wanted to see if execution time could be improved in a multi-threaded app.
The PHP script stores the JSON returned from the WoW API into individual JSON files for each item. This means there are thousands of JSON cache files. There’s a second script that packs these responses into a single file, and then a third script that imports the data from the consolidated file in the second script.
The reason for this two-step process was to see the difference in time it took my script to process the individual files vs a consolidated file, as describe in my blog post on optimizing the PHP importer script.
This approach was overcomplicated and pretty inefficient. While writing the Java app, I wanted to create a cache source that didn’t require another script to pack the data, but was could be updated efficiently.
Choosing SQLite
At first, I tried Oracle BerkleyDB. Coding that up was pretty simple, and I got it working. However, I couldn’t find a good program to manage the BerkleyDB. The apps I came across either wasn’t specifically for BerkleyDBs or they required compiling, and I didn’t want to bother with that.
Since I’ve used SQLite in quite a few projects, it seemed like the next choice. While the mySQL database I’m putting the data into would have worked fine, that database could be remote, and I needed a local database to cache the API results into.
Threading Gotchas
A big challenge with SQLite is that you’re working directly with a file, and not a service like you do with mySQL. Because of this, special precautions need to be taken so that you don’t run into SQLITE_BUSY exceptions or accidentally corrupt your database by writing in two threads at once. This happened at least once while I was playing around with different configurations.
Tip #1: Share the SQLite connection between threads
Creating a shared connection should at least avoid corrupting the database. However, it’s still possible to come across the SQLITE_BUSY exception
Tip #2: Use a Lock
Sharing the connection worked very well at first. I was able to import data into a clean database without any problem. However, when the app was run again and started updated cache, I started running into SQLITE_BUSY exceptions.