A free and open source tool called “Truffle Hog” can help developers check whether they have accidentally leaked secret keys through the projects they publish on GitHub.
Truffle Hog is a Python tool designed to search repositories, including the entire commit history and branches, for high-entropy strings that could represent secrets, such as AWS secret keys.
“This module will go through the entire commit history of each branch, and check each diff from each commit, and evaluate the Shannon entropy for both the base64 char set and hexadecimal char set for every blob of text greater than 20 characters comprised of those character sets in each diff,” explained Dylan Ayrey, the tool’s developer. “If at any point a high entropy string >20 characters is detected, it will print to the screen.”
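The entropy check Ayrey describes can be illustrated with a short Python sketch. This is not Truffle Hog’s actual code: the function names, the way text is split into candidate strings, and the two thresholds (4.5 bits per character for base64, 3.0 for hexadecimal) are assumptions chosen for illustration, and the sketch scans a single piece of text rather than walking a repository’s commit history.

```python
import math

# Character sets named in the description (the exact sets here are an assumption).
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
HEX_CHARS = "0123456789abcdefABCDEF"


def shannon_entropy(data, charset):
    """Return the Shannon entropy of data in bits per character."""
    if not data:
        return 0.0
    entropy = 0.0
    for ch in charset:
        p = data.count(ch) / len(data)
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def runs_of_charset(word, charset, min_length=20):
    """Return the maximal runs in word made only of charset characters
    that are longer than min_length."""
    runs = []
    current = ""
    for ch in word:
        if ch in charset:
            current += ch
        else:
            if len(current) > min_length:
                runs.append(current)
            current = ""
    if len(current) > min_length:
        runs.append(current)
    return runs


def find_high_entropy_strings(text, b64_threshold=4.5, hex_threshold=3.0):
    """Flag long base64-like or hex-like strings whose entropy exceeds
    the given thresholds (hypothetical values, tune as needed)."""
    findings = []
    for word in text.split():
        for run in runs_of_charset(word, BASE64_CHARS):
            if shannon_entropy(run, BASE64_CHARS) > b64_threshold:
                findings.append(run)
        for run in runs_of_charset(word, HEX_CHARS):
            if shannon_entropy(run, HEX_CHARS) > hex_threshold:
                findings.append(run)
    return findings


if __name__ == "__main__":
    sample = "password = hunter2\ntoken = ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef"
    for secret in find_high_entropy_strings(sample):
        print(secret)
```

Ordinary words and short values fall below the length cutoff, while long random-looking strings such as generated keys tend to score well above the thresholds, which is why the approach can surface secrets without knowing their format in advance.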
As Reddit users have pointed out in a discussion about Truffle Hog, bots often scan GitHub in search of secret keys that can be abused to launch AWS instances for malicious purposes. Since this type of activity has often resulted in bills of thousands of dollars that Amazon ended up refunding, the cloud services provider has taken a proactive approach and has been temporarily blocking AWS accounts whose secret keys are found in a public repository.
Truffle Hog already has more than 700 stars on GitHub, making it Ayrey’s second most popular project after Pastejack.
Security experts have often warned developers who publish their projects on GitHub about the risks of leaking sensitive data through their code. In January 2013, GitHub introduced a new internal search feature that made it easy to find passwords, encryption keys and other data. At the time, users discovered thousands of such secrets on GitHub.
More recently, experts warned Slack bot developers that they were unknowingly exposing sensitive data, including business-critical information, by publishing their Slack access tokens on GitHub.