Smiley Face Corrupted USPS Scanner Data

While mailing a package the other day, I bumped into a fellow USPS (United States Postal Service) customer who said her packages had been sent back to her by the USPS. She said that the USPS sorting machines had mistaken the smiley face sticker(s) on the envelope for a QR code or barcode. I found this both interesting and humorous. I thought I’d share it with this data quality audience because it highlights data quality from a machine’s (non-human) perspective. In an age of AI and robots, I imagine we’ll see more and more examples of this kind of confusion.

Using the Conformed Dimensions of Data Quality (CDDQ), we’d categorize this data quality issue within the Representation dimension. Under that dimension, the CDDQ includes a few Underlying Concepts that might apply:

  • Media Appropriate: The appropriate media (e.g., web-based, hardcopy, or audio, etc.) are provided.
  • Presentation Language: Data that is represented well is simple but elegantly formed with good grammar and presented in a standard way.

From the USPS scanner’s perspective, perhaps the smiley face looked like a QR code or barcode. In other words, the smiley face was inappropriate media. If the smiley face was interpreted as a code, the erroneous code value would be stored in the USPS database and might even cause the letter to be routed to the wrong destination or sent back to the sender (as previously mentioned). This could be frustrating to both the sender and the receiver, so B2C companies should be careful about decorating their packages in ways that may derail timely delivery.

Additionally, downstream business processes that try to use the code (now stored in internal databases) will be negatively affected. For example, if the mail is incorrectly scanned at step (a), when it is processed in San Francisco, but correctly scanned later at step (b) in Sacramento, the audit trail for this letter will likely stay broken until a human intervenes and corrects it. Another question the USPS may need to research is why the correct label (e.g., barcode or QR code) wasn’t found and chosen over the smiley face. The program needs a way to identify whether there are multiple codes on the letter and then, likely using probability, choose the right one.
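To make that last idea concrete, here is a minimal sketch of how a program might choose among several candidate codes found on one envelope. It is only an illustration under stated assumptions: I don’t know how the USPS software actually works, and the `Candidate` class, the `confidence` field, the digit-length format rule, and both thresholds are hypothetical names and values I made up for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence

# Placeholder format rule (an assumption for this sketch, not the USPS spec):
# a plausible code is a string of 20 to 31 digits.
EXPECTED_FORMAT = re.compile(r"^\d{20,31}$")


@dataclass
class Candidate:
    value: str         # text the scanner decoded from one region of the envelope
    confidence: float  # decoder confidence in [0, 1] (hypothetical field)


def score(candidate: Candidate) -> float:
    """Weight decoder confidence by whether the value looks like a real code."""
    format_weight = 1.0 if EXPECTED_FORMAT.match(candidate.value) else 0.1
    return candidate.confidence * format_weight


def pick_code(
    candidates: Sequence[Candidate],
    min_score: float = 0.6,   # illustrative threshold
    min_margin: float = 0.2,  # illustrative gap required over the runner-up
) -> Optional[Candidate]:
    """Return the most plausible candidate, or None to flag for manual review."""
    ranked = sorted(candidates, key=score, reverse=True)
    if not ranked:
        return None  # nothing decoded at all
    best = ranked[0]
    if score(best) < min_score:
        return None  # even the best candidate is not convincing
    if len(ranked) > 1 and score(best) - score(ranked[1]) < min_margin:
        return None  # two candidates are too close to call
    return best
```

With a sticker decoded as `Candidate("SMILEY", 0.9)` and a label decoded as `Candidate("1" * 20, 0.8)`, `pick_code` returns the label, because the sticker’s value fails the format check and its score is heavily discounted. When two well-formed codes score almost the same, the function returns `None` rather than guessing, which leaves the decision to a person.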

In an HBR article titled “AI Won’t Replace Humans — But Humans With AI Will Replace Humans Without AI,” Harvard Business School professor Karim Lakhani said that the key is to have a digital mindset when tackling this “70% organizational; 30% technical challenge” that we face when using AI. Ideally, the USPS could implement a human-in-the-loop process to catch and correct these errors in the future. As AI grows in importance, data quality issues like this will remain at the heart of the technical and business process changes that impact our lives. Depending on the volume of returns caused by this data quality issue, it could affect the bottom line of any company using the USPS (or any AI-enabled platform, for that matter).

In closing, I have to give a shout-out to our USPS postal workers, whom I have found to be knowledgeable and helpful every time I visit the post office. I hope they don’t consider these observations an affront to the awesome work they do for us every day.

Did you enjoy this blog? If so, help out by taking the annual Dimensions of Data Quality Survey, open now!