Boarding Pass Parser

The 2D barcodes found on boarding passes encode a large amount of interesting flight and passenger data. I wrote a parser to process the output from a scan of a boarding pass barcode.

Scan of paper boarding pass

Capabilities

Tools

Languages & Frameworks: Ruby on Rails (2.3.1, 4.2.3), HTML, CSS, SCSS
Version Control: Git
Hosting: Heroku (as part of my Flight Historian project)
Project Management: Trello

History

Once my Flight Historian was largely complete, I began to think about future flight data I could track with it – things such as seat numbers, fare codes, or record locators. Since I didn’t yet know what I wanted to track, I realized I would need to start saving as much information about my current flights as I could to give me retroactive data. Thus, I began scanning my boarding passes (or screenshotting my electronic ones), as they seemed to be the best source of flight information I had.

However, images of boarding passes would still require a lot of manual entry in the future, and I realized that all of that information should be stored digitally. I began researching the Bar-Coded Boarding Pass (BCBP) standard. Instead of storing images of the boarding pass, I could just scan the barcode and store its contents in a barcode data field for each flight.

Screenshot of digital boarding pass

Eventually, I decided that it would be helpful for me to write a class that could interpret the boarding pass data I’d been storing.

Design

Data Structure

Once the barcode is scanned, the data retrieved from it comes out in a string that looks like the following:

M1DOE/JOHN            EABC123 BOSJFKB6 0717 345P014C0010 147>3180 M6344BB6              29279          0 B6 B6 1234567890          ^108abcdefgh

Some of the data is self-explanatory (passenger name, airport codes), but much of it is not, and has to be interpreted using the BCBP standard.

The barcode data has four primary categories: mandatory, conditional, airline use, and security.

Additionally, though it’s rare, boarding passes can technically contain data for multiple flights for the same passenger. Thus, some data is considered repeated data (data for each specific flight, even if there is only one flight) and some data is considered unique data (data specific to the passenger or the boarding pass itself).

Category      Unique   Repeated
Mandatory     Yes      Yes
Conditional   Yes      Yes
Airline Use   No       Yes
Security      Yes      No

So there are really six groups of data: three unique and three repeated.

The mandatory groups are a string of consecutive fixed length fields (for example, the name field in Unique Mandatory is 20 characters, and is padded with spaces if necessary).
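As a rough illustration of how consecutive fixed-length fields can be sliced out of the string, here is a minimal Ruby sketch. The `UNIQUE_MANDATORY` table and the `slice_fixed_fields` helper are hypothetical names, not the parser's actual code. Apart from the 20-character name, the field names and lengths are assumptions based on the BCBP layout; they add up to the 23 characters that precede the record locator in the sample string.

```ruby
# Hypothetical field table: [field name, length in characters].
# Only the 20-character name length comes from the text above;
# the other entries are assumed from the BCBP layout.
UNIQUE_MANDATORY = [
  [:format_code, 1],
  [:number_of_legs, 1],
  [:passenger_name, 20],
  [:electronic_ticket_indicator, 1]
].freeze

# Walks the field table from a start offset, cutting one
# fixed-length slice per field and advancing the offset each time.
def slice_fixed_fields(data, fields, start = 0)
  offset = start
  fields.each_with_object({}) do |(name, length), result|
    result[name] = data[offset, length]
    offset += length
  end
end

# Build the start of the sample string; ljust pads the name with spaces.
sample = "M1" + "DOE/JOHN".ljust(20) + "E" + "ABC123 "
unique_mandatory = slice_fixed_fields(sample, UNIQUE_MANDATORY)

unique_mandatory.each do |name, raw|
  puts "#{name}: '#{raw}'"
end
```

Because every field has a known width, no delimiters are needed; the padding spaces are part of the raw value and can be stripped when the field is interpreted.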

The conditional groups are also a collection of fixed length fields, but the total length of each conditional group is defined by a two digit hexadecimal number encoded in the string, and fields are encoded until the length of the group is reached. (It is possible to have a zero-length conditional group, which means that conditional group is absent.)
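The length-prefixed behavior described above could be sketched as follows. This is only an illustration: `read_conditional_group` and the `EXAMPLE_CONDITIONAL_FIELDS` table are hypothetical, and the field names and widths in the table are placeholders rather than a complete or authoritative list from the standard.

```ruby
# Hypothetical, partial field table: [field name, length].
EXAMPLE_CONDITIONAL_FIELDS = [
  [:passenger_description, 1],
  [:source_of_check_in, 1],
  [:source_of_issuance, 1],
  [:date_of_issue, 4],
  [:document_type, 1],
  [:issuer_airline, 3]
].freeze

# Reads a two-digit hexadecimal length at `start`, then consumes
# fields from the table until that many characters have been read.
def read_conditional_group(data, start, fields)
  group_length = data[start, 2].to_i(16)
  offset = start + 2
  group_end = offset + group_length
  values = {}
  fields.each do |name, length|
    break if offset >= group_end # group exhausted; later fields are absent
    values[name] = data[offset, [length, group_end - offset].min]
    offset += length
  end
  { length: group_length, fields: values, next_start: group_end }
end

# "0B" is hexadecimal for 11, the combined width of the six fields.
full_group = "0B" + "0" + " " + "M" + "6344" + "B" + "B6 " + "REST"
full = read_conditional_group(full_group, 0, EXAMPLE_CONDITIONAL_FIELDS)

# "03" means only the first three one-character fields are present.
short = read_conditional_group("03" + "0 M" + "REST", 0, EXAMPLE_CONDITIONAL_FIELDS)

# "00" is a zero-length group: no fields at all.
empty = read_conditional_group("00" + "REST", 0, EXAMPLE_CONDITIONAL_FIELDS)
```

Returning `next_start` alongside the fields lets the caller continue with whatever follows the group, regardless of how many fields were actually present.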

The airline use group is a single, variable length field, again defined by a two-digit hexadecimal number encoded in the string.

The security group has a few fixed-length fields, then a hexadecimal number, followed by a variable-length field whose length is defined by that hexadecimal number.
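Taken together, these length rules are enough to locate every group in a string. The sketch below works through the sample string for the single-leg case only. It is not the parser's real code: the method name `single_leg_control_points`, the 23- and 37-character mandatory group sizes, and the assumption that the last two characters of the repeated mandatory group give the combined size of the conditional and airline use data are all my reading of the BCBP layout, checked only against the sample above. It also assumes conditional data is present.

```ruby
UNIQUE_MANDATORY_LENGTH = 23   # assumed size of the unique mandatory group
REPEATED_MANDATORY_LENGTH = 37 # assumed size of one leg's mandatory group

# Returns a hash of { group => { start:, length: } } for a one-leg pass.
def single_leg_control_points(data)
  points = {}
  points[:unique_mandatory] = { start: 0, length: UNIQUE_MANDATORY_LENGTH }
  leg_start = UNIQUE_MANDATORY_LENGTH
  points[:repeated_mandatory] = { start: leg_start, length: REPEATED_MANDATORY_LENGTH }

  # The final two characters of the repeated mandatory group are a hex
  # size covering everything up to the security group.
  variable_start = leg_start + REPEATED_MANDATORY_LENGTH
  variable_size = data[variable_start - 2, 2].to_i(16)
  variable_end = variable_start + variable_size

  cursor = variable_start + 2 # skip the ">" marker and version character
  unique_length = data[cursor, 2].to_i(16)
  points[:unique_conditional] = { start: cursor + 2, length: unique_length }

  cursor += 2 + unique_length
  repeated_length = data[cursor, 2].to_i(16)
  points[:repeated_conditional] = { start: cursor + 2, length: repeated_length }

  # Whatever remains before variable_end is the airline use field.
  cursor += 2 + repeated_length
  points[:airline] = { start: cursor, length: variable_end - cursor }
  points[:security] = { start: variable_end, length: data.length - variable_end }
  points
end

# Rebuild the sample string with explicit padding so the widths are visible.
sample = "M1" + "DOE/JOHN".ljust(20) + "E" +
         "ABC123".ljust(7) + "BOS" + "JFK" + "B6".ljust(3) + "0717".ljust(5) +
         "345" + "P" + "014C" + "0010".ljust(5) + "1" + "47" +
         ">3" + "18" + "0 M6344BB6".ljust(24) +
         "29" + ("279" + " " * 10 + "0" + " " + "B6 " + "B6 " + "1234567890").ljust(41) +
         "^108abcdefgh"

points = single_leg_control_points(sample)
points.each { |group, point| puts "#{group}: start #{point[:start]}, length #{point[:length]}" }
```

For this sample the airline use field comes out with a length of zero, and the security group starts at the `^` character.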

The string is ordered as follows:

Strategy

The solution I implemented works as follows:

  1. Define the list of possible fields (create_fields method)
  2. Read through the string and create a hash of the start location (within the data string) and length of each group (create_control_points method)
  3. Create a data structure to store all field data (build_structured_data method)
    1. For each group, loop through the possible fields until the length of the group is reached
      1. Store the BCBP field ID, field name, raw data, validity of the data, and an interpretation of what the data means in a hash
      2. Create a hash of these hashes for each group
    2. Create a hash of any unknown data that cannot be interpreted
    3. Store all of the groups in the following data structure (square brackets represent arrays, curly braces represent hashes):
      {
        unique {
          mandatory {}
          conditional {}
          security {}
        },
        repeated [
          leg 1 {
            mandatory {}
            conditional {}
            airline {}
          },
          ⋮
          leg N {
            mandatory {}
            conditional {}
            airline {}
          }
        ],
        unknown {}
      }
            

From there, it’s trivial to loop through the data structure to print the data and its interpretations in a tabular format.
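To make the looping concrete, here is a small Ruby sketch that walks a structure of the shape described above and produces table rows. The hash keys (`:description`, `:raw`, `:valid`, `:interpretation`), the field IDs, and the `table_rows` helper are illustrative assumptions; the real parser's key names may differ.

```ruby
# A tiny hand-built structure in the shape described above.
# Field IDs and key names here are placeholders for illustration.
structured = {
  unique: {
    mandatory: {
      11 => { description: "Passenger Name", raw: "DOE/JOHN".ljust(20),
              valid: true, interpretation: "DOE/JOHN" }
    },
    conditional: {},
    security: {}
  },
  repeated: [
    {
      mandatory: {
        26 => { description: "From City Airport Code", raw: "BOS",
                valid: true, interpretation: "BOS" }
      },
      conditional: {},
      airline: {}
    }
  ],
  unknown: {}
}

# Flattens the nested groups into one row per field.
def table_rows(structured)
  rows = []
  structured[:unique].each do |group, fields|
    fields.each do |id, field|
      rows << ["unique #{group}", id, field[:description], field[:raw], field[:interpretation]]
    end
  end
  structured[:repeated].each_with_index do |leg, index|
    leg.each do |group, fields|
      fields.each do |id, field|
        rows << ["leg #{index + 1} #{group}", id, field[:description], field[:raw], field[:interpretation]]
      end
    end
  end
  rows
end

rows = table_rows(structured)
rows.each { |row| puts row.join(" | ") }
```

Empty groups simply contribute no rows, so the same loop handles boarding passes with or without conditional, airline use, or security data.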

JSON API

It’s also trivial to convert the above data structure to JSON, so I wrote a JSON API for the boarding pass parser. All that’s required is to place the raw barcode data into the following URL:

https://www.flighthistorian.com/boarding-pass/json/RAW_DATA

If you need a callback function, use the following structure:

https://www.flighthistorian.com/boarding-pass/json/CALLBACK/RAW_DATA
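Raw barcode data contains spaces, slashes, and characters such as `>` that are not safe to place directly in a URL path, so a client will probably need to percent-encode it first. The sketch below builds both URL forms using Ruby's standard `ERB::Util.url_encode`. The `boarding_pass_json_url` helper is hypothetical, and it is an assumption on my part that the API accepts percent-encoded data; the sketch only builds the URL and does not make a request.

```ruby
require "erb"

BASE_URL = "https://www.flighthistorian.com/boarding-pass/json"

# Builds the JSON API URL for the given raw barcode data.
# Assumption: the API decodes percent-encoded path segments.
def boarding_pass_json_url(raw_data, callback: nil)
  # compact drops the callback segment when none is given
  segments = [BASE_URL, callback, ERB::Util.url_encode(raw_data)].compact
  segments.join("/")
end

plain_url = boarding_pass_json_url("M1DOE/JOHN E")
callback_url = boarding_pass_json_url("M1DOE/JOHN E", callback: "handlePass")

puts plain_url
puts callback_url
```

Encoding the slash in the passenger name matters in particular, since an unencoded `/` would otherwise be read as a path separator.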