Security and Privacy by design: PrivateString

Author: Gerret Sanders (https://twitter.com/Gerret_s)

At Simacan we are constantly thinking about security and privacy. We believe that neither should ever be an afterthought, which is why we take both seriously from the design phase of our software onward.

When dealing with sensitive data such as passwords, we do not like to keep them in plain text any longer than absolutely necessary. Still, it is impossible to authenticate a user without first passing their password to our authentication library.

This small but necessary step always comes with a risk: if a service has eager logging for debugging, passwords might end up in a log before they are hashed. This has already happened to several large companies that take their security seriously, such as Twitter and GitHub (https://arstechnica.com/information-technology/2018/05/twitter-advises-users-to-reset-passwords-after-bug-posts-passwords-to-internal-log/).

This risk is not limited to passwords. It’s just as easy to send privacy-sensitive data to logs. We have always worked from the principle that personal data such as people’s names should only be stored when absolutely necessary, so it should not end up in logs. On top of common sense, the new GDPR legislation gave us another incentive to focus on privacy: it requires us to delete personal data upon request of the person whose data is stored. If that data is in the logs, we would have to prune the logs, which is never a fun job. We would rather make sure that this information never ends up in the logs in the first place.

As I analyzed this risk, I found a simple solution that makes it much less likely to occur.

 

PrivateString

I implemented a class called PrivateString that is a logical extension of a String, but prevents its content from being accidentally printed. This class now lives in a common utilities library used by most of our microservices.

Here is the main bit of the code, in Scala:
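A minimal sketch of such a class, reconstructed from the walkthrough that follows rather than copied from our library. The use of java.util.logging, the logger name "PrivateString", the Placeholder constant and the wording of the warning are all assumptions made to keep the example self-contained.

```scala
import java.util.logging.Logger

// Sketch only: java.util.logging and the logger name are assumptions,
// not necessarily what the original library uses.
class PrivateString(content: String, warnOnToString: Boolean = true) {

  // Printing a PrivateString is usually a mistake, so warn (if enabled)
  // and return a placeholder instead of the wrapped value.
  override def toString: String = {
    if (warnOnToString) {
      PrivateString.logger.warning(
        "toString called on a PrivateString; the content was not printed")
    }
    PrivateString.Placeholder
  }

  // The only way to read the wrapped value: an explicit, easy-to-grep call.
  def secureContent: String = content

  // Two PrivateStrings are equal when their contents are equal.
  override def equals(other: Any): Boolean = other match {
    case that: PrivateString => secureContent == that.secureContent
    case _                   => false
  }

  // Kept consistent with equals.
  override def hashCode: Int = content.hashCode
}

object PrivateString {
  private val logger: Logger = Logger.getLogger("PrivateString")

  // Hypothetical constant name; the text only specifies the printed value.
  val Placeholder: String = "[PRIVATE STRING]"

  // Allows PrivateString("secret") without the new keyword.
  def apply(content: String, warnOnToString: Boolean = true): PrivateString =
    new PrivateString(content, warnOnToString)

  // Deliberately no unapply: unpacking should not be convenient.
}
```

With this sketch, PrivateString("hunter2").toString yields the placeholder and logs a warning, while PrivateString("hunter2").secureContent returns the original value.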

If this looks complicated, don’t worry. I’ll go through it step by step to explain how it works. I think you will find it will not be too hard to implement a version of this in another programming language of your choice.

You simply construct an instance of the class by passing in a String. This could be a password from an HTTP request or a full name. By default, warnOnToString is set to true.

When you print or log any Java or Scala object, its toString method is called to turn the object into a printable String. I have specifically overridden toString here to log a warning if you attempt to log a PrivateString (because this is usually a mistake). The method then does not return the actual content. Instead, it returns the placeholder "[PRIVATE STRING]", so that is all that ends up in the log.

If you actually need the content of the String, for example to show a full name to the user or to pass a password to the actual authentication step, you need to explicitly call the secureContent function to retrieve it.

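For compatibility reasons, I also overrode the standard equals and hashCode methods that every Java and Scala object inherits, so that two PrivateStrings can be compared and used in hash-based collections.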

I also made a PrivateString companion object, which is a Scala construct where you can put functions that are not specific to a class instance. In other words, it holds the equivalent of static methods in languages such as Java.

The apply() function is a Scala convenience that allows you to create instances of a class without using the new keyword. Its counterpart, the unapply() function, allows you to ‘unpack’ the contents of an object in a convenient way, typically in a pattern match. I specifically did not implement the latter because I do not want to make it convenient to unpack a PrivateString!

 

Configurable logging

Finally, there is the warnOnToString boolean that defaults to true. Why do we need this? Well, in Scala we commonly use case classes. Case classes are used to pass sets of immutable data around. As a convenience to developers, when a case class instance is printed, it prints out the entire contents of the instance by calling toString on each of its fields.

Sometimes it makes sense to have a PrivateString in a case class that you also want to log. As an example, consider an HttpRequest case class. A class like that would contain the request’s body but also the authentication. If a developer wants to debug their API, it would make sense for them to log all requests. In that case we do not want to log the passwords (hence we store them in a PrivateString within the case class), but we also do not need an extra warning every time a request is printed, because the developer designed it this way on purpose. For this use case, the warning can be turned off with this boolean setting.
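A small sketch of this situation. The HttpRequest fields, the RequestLogging helper and the condensed PrivateString below are hypothetical and only serve to make the example self-contained; java.util.logging is again an assumption.

```scala
import java.util.logging.Logger

// Condensed PrivateString sketch (assumed shape, see the description above).
class PrivateString(content: String, warnOnToString: Boolean = true) {
  override def toString: String = {
    if (warnOnToString) {
      Logger.getLogger("PrivateString")
        .warning("toString called on a PrivateString")
    }
    "[PRIVATE STRING]"
  }
  def secureContent: String = content
}

// Hypothetical request type; the field names are illustrative only.
final case class HttpRequest(path: String, body: String, password: PrivateString)

object RequestLogging {
  // Builds a request whose password is masked when printed, without
  // logging a warning, because printing the request is intentional here.
  def describe(path: String, body: String, rawPassword: String): String = {
    val request = HttpRequest(
      path,
      body,
      new PrivateString(rawPassword, warnOnToString = false))
    request.toString
  }
}
```

Calling RequestLogging.describe("/login", "{}", "hunter2") returns "HttpRequest(/login,{},[PRIVATE STRING])": the case class prints all of its fields, but the password field only contributes the placeholder.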

 

Convenient to use

There is one extra Scala convenience I added. I wanted to make it hard to unpack PrivateStrings, but easy to create them. For that reason, I added a package object containing an implicit conversion function from String to PrivateString to the package that holds the PrivateString code.

This implicit function allows developers to use a regular String in any place a PrivateString would normally need to be constructed. For example:
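The following is a sketch under a few assumptions, not the original listing. To keep it self-contained, the conversion lives in a plain object called PrivateStringConversions (a hypothetical name) instead of a package object, so it has to be imported explicitly. The Credentials case class and the fromForm helper are likewise made up for illustration.

```scala
import scala.language.implicitConversions

// Condensed PrivateString sketch (assumed shape, warning logic omitted).
class PrivateString(content: String) {
  override def toString: String = "[PRIVATE STRING]"
  def secureContent: String = content
}

// Stand-in for the package object described in the text.
object PrivateStringConversions {
  // Lets a plain String be used wherever a PrivateString is expected.
  implicit def stringToPrivateString(value: String): PrivateString =
    new PrivateString(value)
}

object LoginExample {
  import PrivateStringConversions._

  // Hypothetical type that requires the password to be a PrivateString.
  final case class Credentials(username: String, password: PrivateString)

  // password is a plain String here; the compiler inserts the
  // conversion to PrivateString when constructing Credentials.
  def fromForm(username: String, password: String): Credentials =
    Credentials(username, password)
}
```

Here LoginExample.fromForm("alice", "hunter2").toString gives "Credentials(alice,[PRIVATE STRING])", while the original value remains available through password.secureContent.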

The password is automatically packed into a PrivateString by the implicit function. Don’t worry, your IDE should tell you an implicit conversion is taking place here.

 

In conclusion

PrivateString is a class that prevents developers from accidentally printing passwords and other sensitive data to logs, consoles and other output. It does so by overriding the toString method. It is easy to implement, and a version of it can probably be made in any common programming language.

Using it would help prevent mistakes like the ones Twitter and GitHub made.

I would like to thank my coworkers who reviewed and tested my code, and the internet user with the handle Subjunctive, who gave me the original idea for the PrivateString.