Blog

  • The Making of the Library Occupancy Estimation System, Part 3 – the Server

    So the sniffer part was complete. The server was the missing piece of the puzzle.

    The server needs to do several things

    1. Receive data from the sniffer and decrypt
    2. Save sniffed data (probe requests) to a database
    3. Analyze probe requests and deduce occupancy
    4. Provide public APIs for retrieval of occupancy information

    I’m using python3 for the server since it’s one of the better-supported languages on Heroku. Unfortunately, XTEA implementations on python3 do not work well, so I reimplemented it by porting my embedded-C version. (One of the advantages of CTR mode is that the encrypting function is its own inverse. So encrypt == decrypt.) Sniffer MAC address and information of sniffed devices are recovered from decrypted data. These are stored to a DB after some simple processing (filtering out invalid MAC addresses, etc.).

    For storage of probe requests, I use sqlalchemy which is based on SQL. I didn’t use one of the now-popular no-SQL databases since I’m not storing GBs of data. Also, enforcing schemas and relationships clears up the logic of the server.

    I also made a simple linear model to deduce occupancy from probe requests. The model is run every 30 seconds and output is stored back into the DB. It is not efficient to run the model every time a user asks for occupancy information. When a user requests occupancy, pre-computed occupancy is read from DB and directly sent to the user.

    The public API is RESTful and based on simple JSON. I did not bother with pre-existing protocols like JSONAPI 1.0. These protocols are way too complicated and verbose for this project. When the cost of implementing a complicated protocol exceeds the potential benefits of that protocol, it’s generally a good idea to steer away from it.

  • The Making of the Library Occupancy Estimation System, Part 2 – the Sniffer

    So I proved that this idea is somewhat feasible. There were two major problems that need to be addressed: 1. there were too many probe requests, 2. there needed to be some way to send found devices back to a server.

    The “too many probe requests” problem was easy to fix. I implemented a bloom filter, and for each sniffing session, sniff requests from previously identified devices would be ignored. I also got rid of the Arduino environment since I need access to a newer SDK, and the Arduino environment ships with an older SDK. To set-up the SDK was quite a hassle. (SDK)

    The other problem was (in my opinion) more important: Data need to be encrypted before sent – it is a pretty bad idea to send MAC addresses in plain text. So I built a protocol around XTEA-CCM, eXtended Tiny Encryption Algorithm in CCM (CTR + CBC MAC) mode. XTEA was chosen because it is easy to implement and has no known security-breaking flaws. The weakness of XTEA is its slowness – not a problem here since data sent is usually less than 4KB: the 120MHz ESP8266 should be able to handle that easily. Now the problem is what protocol to use for transmission of this encrypted data: the ESP8266 supports TCP and UDP. I decided to go with HTTP/1.1 (which is, at its core, TCP). Using raw TCP / UDP sockets is of course convenient, but requires a standalone IP address, and server with standalone IP address is kind of expensive (a whopping $3/month!). By using HTTP, I can use services like Heroku, Google App Engine…, which are free for small projects like this. (Cost is important)

    I wrote a few wrappers around the original APIs to abstract things a bit and to make life easier. Original SDK requires you to do almost everything manually (WiFi connection management, DNS lookup, etc.), so it is nice to separate core logic from supporting code – currently user_main.c is less than 200 SLOC while the entire repository is around 1300 SLOC. Dumping 1300 lines into a single file would be a disaster.

  • The Making of the Library Occupancy Estimation System, Part 1 – the Idea

    So… Here’s a problem that most CMU students encounter on a daily basis: finding someplace to study. This process is surprisingly labor intensive: Here’s an example:

    1. You decided to find some place to study.
    2. You walk to Sorrells Library in Wean, and it was full: after the recent renovation the library has become one of the best places to study on campus, and there was no spot left.
    3. You walk to Hunt Library, climb to the 3rd floor, and it was also full. You think that Gates may have some spots left and walk to Gates…

    … and you just traversed half of the campus, wasting 15 minutes for no good reason. (Or, if you are one of the glass-half-full kind of person, it’s 15 minutes of workout.) What if there exists a web page so people can check if a library is full or not…

    The basic idea here is to estimate occupancy in a cheap and reliable way. Some ways are:

    1. Installing an IR sensor at the door of the library, and count people entering/exiting the library.
    2. Putting a camera at the door of the library, and count people entering/exiting the library.
    3. Putting cameras in the library, and count people currently in the library.
    4. Using some device to count Wi-Fi devices in the library, and estimate occupancy based on Wi-Fi devices.

    The problem with 1 and 2 is that both solutions are counting delta/difference. We need to integrate the difference to get an occupancy estimation, and error will accumulate over time. 2 and 3 both require cameras: although cameras themselves are cheap, and computer vision libraries for recognizing people exist, hardware able to process camera input is expensive. Cameras may also raise concerns about privacy: most people are not comfortable if they are “monitored”. The only cheap and reliable way is 4: using some gadget to count Wi-Fi devices.

    And that gadget is ESP8266. A Wi-Fi-enabled SoC. There are a lot of boards with this chip out there on eBay. The one I used to develop the system is this one:

    And these boards are very cheap (4-8 dollars a piece). Buy one here.

    To investigate if this is going to work, I modified some code (Github) found online and see if it is going to pick up any device.

    The code listens for probe request frames (broadcast frames from Wi-Fi enabled devices to check what access points are available nearby). I flashed the program into the ESP8266, and it worked! The problem is: there are too many probe requests…

    To be continued.

  • “Adding” a microUSB port to a HDMI recorder

    So the device is here… But it is not what I wanted…

    Apparently there are two types of this device on eBay that look identical, one with a microUSB port, one without. I, of course, accidentally bought the one without the microUSB port. So what is missing from the non-microUSB version? I suspect that if I can figure out the parts missing from the board and fill these in, I might add the missing USB-to-PC functionality I am looking for. The only way to find out is to open it up.

    Okay, the PCB layout looks OK to me, and there is no major part missing from the board. Wait a second: There is already a microUSB port there on the board, the only thing missing on this product is a cut-out for the port on the casing.

    That’s… That is one of the worst designs I’ve seen in a while. It took some serious courage to make this design decision.

    The only thing left for me to do is to dremel out a hole on the casing.

  • Hacking 100% orange juice to run multiple instances

    I recently got the game 100% Orange Juice, and as a board game it doesn’t support local multiplayer. It does have an online multiplayer though. So the idea is to run multiple instances of the game, and use the online multiplayer mode to “simulate” local multiplayer. However, it doesn’t event want to you to run multiple instances. That’s a bummer.

    So I open up the trusty OllyDbg and trace the startup of the game. 15 minutes later I found that “Game is already running” check is performed at offset 0x00B31033, almost at the very beginning of the exe. Based on the result, at 0x00B3103B, the program either jumps to 0x00B31056 to continue execution, or gives you an error message and crash. We certainly don’t want the latter.

    The operation at 0x00B3103B is JE SHORT 00B31056 (74 19), simply change that to JMP SHORT 00B31056 (EB 19) and it is fixed.

    However, I later figured out that the multiplayer mode is implemented using Steam API, which again does not allow multiple instances to go online simutanously. Hacking the Steam API or to decouple the game from Steam API is a much bigger project which I decided to step away from.

    So it turned out that I still can’t do local multiplayer with this game.