Protecting JavaScript Files From Formjacking
Most web pages now consist of multiple JavaScript files that are included in various ways (via <script> or in some more dynamic fashion, bundled and minified or not). But since these scripts interact with everything on the page, they can be a security risk.
And Magecart showcased that risk – the group attacked multiple websites, including British Airways and Ticketmaster, and stole a few hundred thousand credit card numbers.
It is a simple attack called “formjacking” where attacker inserts a malicious javascript snippet into a trusted javascript file, collects credit card details entered into payment forms and sends them to an attacker-owned website. Obviously, the easy part is writing the malicious javascript; the hard part is getting it on the target website.
Many websites rely on externally hosted assets (including scripts) – be it a CDN, or a dedicated asset server (as in the case of British Airways). These externally hosted assets may be vulnerable in several ways:
- Asset servers may be less protected than the actual server, because they are just static assets, what could go wrong?
- Credentials to access CDN configuration may be leaked which can lead to an attacker replacing the original source scripts with their own
- Man-in-the-middle attacks are possible if the asset server is misconfigured (e.g. allowing TLS downgrade attack)
- The external service (e.g. CND) that was previously trusted can go rogue – that’s unlikely with big providers, but smaller and cheaper ones are less predictable
Once the attackers have replaced the script, they are silently collecting data until they are caught. And this can be a long time.
So how to protect against those attacks? A typical advice is to introduce a content security policy, which will allow scripts from untrusted domains to be executed. This is a good idea, but doesn’t help in the scenario where a trusted domain is compromised (well, it helps a little by blocking some requests to a malicious site that could transmit the data, but it can’t block redirects that pass the data through GET). There are several main approaches, and I’ll summarize them below:
- Subresource integrity – this is a browser feature that lets you specify the hash of a script file and validates that hash when the page loads. If it doesn’t match the hash of the actually loaded script, the script is blocked. This sounds great, but has several practical implications. First, it means you need to complicate your build pipeline so that it calculates the hashes of minified and bundled resources and inject those hashes in the page templates. It’s a tedious process, but it’s doable. Then there are the dynamically loaded scripts where you can’t use this feature, and there are the browsers that don’t support it fully (Edge, IE and Safari on mobile). And finally, if you don’t have a good build pipeline (which many small websites don’t), a very small legitimate change in the script can break your entire website.
- Don’t use external services – that sounds straightforward but it isn’t always. CDNs exist for a reason and optimize your site loading speeds and therefore ranking, internal policies may require using a dedicated asset server, sometimes plugins (e.g. for WordPress) may fetch external resources. An exception to this rule is allowed if you somehow sandbox the third party script (e.g. via iframe as explained in the link above)
- Secure all external servers properly – if you can do that, that’s great – upgrade the supported cipher suites, monitor for 0days, use only highly trusted CDNs. Regardless of anything, you should obviously always strive to do that. But it requires expertise and resources, which may not be available to every company and every team.
There is one more scenario that may sound strange – if an attacker hacks into your main application server(s), they can replace the scripts with whatever they want. It sounds strange at first, because if they have access to the server, it’s game over anyway. But it’s not always full access with RCE – might be a limited access. Credit card numbers are usually not stored in plain text in the database, so having access to the application server may not mean you have access to the credit card numbers. And changing the custom backend code to collect the data is much more unpredictable and time-consuming than just replacing the scripts with malicious ones. None of the options above protect against that (as in this case the attacker may be able to change the expected hash for the subresource integrity check)
Because of the limitations of the above approaches, at my company we decided to provide a tool to monitor your website for such attacks. It’s called Scriptinel.com (short for Script Sentinel) and is currently in early beta. It’s mainly targeted at small website owners who can’t get any of the above 3 points, but can be used for sophisticated websites as well.
What it does is straightforward – it scans a given URL, extracts all scripts from it (even the dynamic ones), and starts monitoring them for changes with periodic requests. If it discovers a change, it notifies the website owner so that they can react.
This means that the attacker may have a few minutes to collect data, but time is an important factor here – this is not a “SELECT *” data breach; it relies on customers using the website. So a few minutes minimizes the damage. And it doesn’t break your website (I guess we can have a script to include that blocks the page if scriptinel has found discrepancies). It also doesn’t require changes in the build process to include hashes. Of course, such a reactive approach is not perfect, especially if there is nobody to react, but monitoring is a good idea regardless of whether other approaches are used.
There is the issue of protected pages and pages that are not directly accessible via a GET request – e.g. a payment page. For that you can enter your javascript files individually, rather than having the tool scan the page. We can add a more sophisticated user journey scan, with specifying credentials and steps to reach the protected pages, but for now that seems unnecessary.
How does it solve the “main server compromised” problem? Well, nothing solves that perfectly, as the attacker can do changes that serve the legitimate version of the script to your monitoring servers (identifying them by IP) and the modified scripts to everyone else. This can be done on the compromised external asset servers as well (though not with leaked CDN credentials). However this implies the attacker knows that Scriptinel is used, knows the IP addresses of our scanners, and has gained sufficient control to server different versions based on IP. This raises the bar significantly, and can even be made impossible to pull off if we regularly change the IP addresses in a significantly large IP range.
Such functionality may be available in some enterprise security suites, though I’m not aware of it (if it exists somewhere, please let me know).
Overall, the problem is niche, but tough, and not solving it can lead to serious data breaches even if your database is perfectly protected. Scriptinel is a simple to use, good enough solution (and one that’s arguably better than the other options).
Good information security is the right combination of knowledge, implementation of best practices and tools to help you with that. And I maybe Scriptinel is one such tool.
Most web pages now consist of multiple JavaScript files that are included in various ways (via <script> or in some more dynamic fashion, bundled and minified or not). But since these scripts interact with everything on the page, they can be a security risk.
And Magecart showcased that risk – the group attacked multiple websites, including British Airways and Ticketmaster, and stole a few hundred thousand credit card numbers.
It is a simple attack called “formjacking” where attacker inserts a malicious javascript snippet into a trusted javascript file, collects credit card details entered into payment forms and sends them to an attacker-owned website. Obviously, the easy part is writing the malicious javascript; the hard part is getting it on the target website.
Many websites rely on externally hosted assets (including scripts) – be it a CDN, or a dedicated asset server (as in the case of British Airways). These externally hosted assets may be vulnerable in several ways:
- Asset servers may be less protected than the actual server, because they are just static assets, what could go wrong?
- Credentials to access CDN configuration may be leaked which can lead to an attacker replacing the original source scripts with their own
- Man-in-the-middle attacks are possible if the asset server is misconfigured (e.g. allowing TLS downgrade attack)
- The external service (e.g. CND) that was previously trusted can go rogue – that’s unlikely with big providers, but smaller and cheaper ones are less predictable
Once the attackers have replaced the script, they are silently collecting data until they are caught. And this can be a long time.
So how to protect against those attacks? A typical advice is to introduce a content security policy, which will allow scripts from untrusted domains to be executed. This is a good idea, but doesn’t help in the scenario where a trusted domain is compromised (well, it helps a little by blocking some requests to a malicious site that could transmit the data, but it can’t block redirects that pass the data through GET). There are several main approaches, and I’ll summarize them below:
- Subresource integrity – this is a browser feature that lets you specify the hash of a script file and validates that hash when the page loads. If it doesn’t match the hash of the actually loaded script, the script is blocked. This sounds great, but has several practical implications. First, it means you need to complicate your build pipeline so that it calculates the hashes of minified and bundled resources and inject those hashes in the page templates. It’s a tedious process, but it’s doable. Then there are the dynamically loaded scripts where you can’t use this feature, and there are the browsers that don’t support it fully (Edge, IE and Safari on mobile). And finally, if you don’t have a good build pipeline (which many small websites don’t), a very small legitimate change in the script can break your entire website.
- Don’t use external services – that sounds straightforward but it isn’t always. CDNs exist for a reason and optimize your site loading speeds and therefore ranking, internal policies may require using a dedicated asset server, sometimes plugins (e.g. for WordPress) may fetch external resources. An exception to this rule is allowed if you somehow sandbox the third party script (e.g. via iframe as explained in the link above)
- Secure all external servers properly – if you can do that, that’s great – upgrade the supported cipher suites, monitor for 0days, use only highly trusted CDNs. Regardless of anything, you should obviously always strive to do that. But it requires expertise and resources, which may not be available to every company and every team.
There is one more scenario that may sound strange – if an attacker hacks into your main application server(s), they can replace the scripts with whatever they want. It sounds strange at first, because if they have access to the server, it’s game over anyway. But it’s not always full access with RCE – might be a limited access. Credit card numbers are usually not stored in plain text in the database, so having access to the application server may not mean you have access to the credit card numbers. And changing the custom backend code to collect the data is much more unpredictable and time-consuming than just replacing the scripts with malicious ones. None of the options above protect against that (as in this case the attacker may be able to change the expected hash for the subresource integrity check)
Because of the limitations of the above approaches, at my company we decided to provide a tool to monitor your website for such attacks. It’s called Scriptinel.com (short for Script Sentinel) and is currently in early beta. It’s mainly targeted at small website owners who can’t get any of the above 3 points, but can be used for sophisticated websites as well.
What it does is straightforward – it scans a given URL, extracts all scripts from it (even the dynamic ones), and starts monitoring them for changes with periodic requests. If it discovers a change, it notifies the website owner so that they can react.
This means that the attacker may have a few minutes to collect data, but time is an important factor here – this is not a “SELECT *” data breach; it relies on customers using the website. So a few minutes minimizes the damage. And it doesn’t break your website (I guess we can have a script to include that blocks the page if scriptinel has found discrepancies). It also doesn’t require changes in the build process to include hashes. Of course, such a reactive approach is not perfect, especially if there is nobody to react, but monitoring is a good idea regardless of whether other approaches are used.
There is the issue of protected pages and pages that are not directly accessible via a GET request – e.g. a payment page. For that you can enter your javascript files individually, rather than having the tool scan the page. We can add a more sophisticated user journey scan, with specifying credentials and steps to reach the protected pages, but for now that seems unnecessary.
How does it solve the “main server compromised” problem? Well, nothing solves that perfectly, as the attacker can do changes that serve the legitimate version of the script to your monitoring servers (identifying them by IP) and the modified scripts to everyone else. This can be done on the compromised external asset servers as well (though not with leaked CDN credentials). However this implies the attacker knows that Scriptinel is used, knows the IP addresses of our scanners, and has gained sufficient control to server different versions based on IP. This raises the bar significantly, and can even be made impossible to pull off if we regularly change the IP addresses in a significantly large IP range.
Such functionality may be available in some enterprise security suites, though I’m not aware of it (if it exists somewhere, please let me know).
Overall, the problem is niche, but tough, and not solving it can lead to serious data breaches even if your database is perfectly protected. Scriptinel is a simple to use, good enough solution (and one that’s arguably better than the other options).
Good information security is the right combination of knowledge, implementation of best practices and tools to help you with that. And I maybe Scriptinel is one such tool.
https://tushirnitin.wordpress.com/2019/10/03/the-5-most-popular-front-end-frameworks-compared/