This is part 2 of a multi-part series covering strategies you can use to secure your digital supply chain.


Introduction

In part 1 of this series, we looked at:

  1. Moving away from static credentials and toward more secure alternatives like OIDC, IAM Identity Center and IAM Roles,
  2. Utilising secret management solutions like AWS Secrets Manager and AWS Systems Manager Parameter Store for any static credentials we do have,
  3. Monitoring IAM for potentially misconfigured access using IAM Access Analyzer,
  4. A basic introduction to Attribute-Based Access Control (ABAC).

Now, let’s cover a slightly more advanced topic: securing your pipeline by scanning your external dependencies, storing your internal packages securely, and scanning your code for vulnerabilities before deployment.

Architecture Diagram



Scanning external dependencies

To ensure your code is safe, you must also ensure your dependencies are secure. Recently, there have been more than a few occasions where external repositories or individual packages have been compromised or found to contain critical vulnerabilities.

Fortunately, scanning your external dependencies is an easy process. Many open-source tools exist which can assist.

I’ll be using PyPI packages in the following examples, but you can find tools for other languages and registries, such as npm.


Scanning for malicious dependencies

Today, popular packages from npm or PyPI are at constant risk of compromise; they are very high-value targets. Compromising one of these popular packages could yield very high rewards, as every application that depends on it would be exposed.

A big company you’ve probably heard of, Datadog, built an open-source package scanning tool called GuardDog. This tool (as with any security tool) won’t necessarily have a 100% detection rate, but it uses pretty robust heuristics to determine whether a package may be compromised.

Common malware traits, such as code obfuscation and downloading executables (as well as many more), are detected by this tool and can help identify potentially malicious packages. It’s also possible trusted packages could be flagged as malicious; although unlikely, sometimes there are valid reasons to perform specific actions such as downloading executables.

Additionally, GuardDog will look for signs of compromise in the package’s metadata, such as names similar to popular packages (typosquatting) or a maintainer email domain that has expired and could be re-registered by an attacker.


Using GuardDog

Using GuardDog is very simple; you can add the following actions to your pre-build process (example using Python).

# Add to your pre-build actions:
python3 -m pip install guarddog
guarddog pypi verify ./requirements.txt

If you’d like to output the results to a SARIF file, you can also do that using GuardDog.


Scanning for vulnerabilities

Once you’ve confirmed you’re not downloading malicious packages, the next step is to check for any vulnerabilities in your dependencies that could compromise your application.

You have many options here; a good choice for those looking for a free tool is pip-audit (npm users have the built-in npm audit command), or you could purchase a more in-depth tool such as Pyup or Snyk.

Remember that this scans dependencies for known vulnerabilities, not for new, undiscovered ones; this is fine in most cases, especially for external packages we don’t control.

To ensure your existing dependencies are frequently scanned for vulnerabilities, you should periodically scan your requirements.txt file via scheduled workflows.


Using pip-audit

Like GuardDog, pip-audit is easy to use and requires only a few commands in your pre-build actions.

# Add to your pre-build actions:
python3 -m pip install pip-audit
pip-audit -r ./requirements.txt

As stated above, it’s recommended to configure scheduled actions that run scans on your current application’s requirements file to let you know whether there are any critical vulnerabilities you should resolve.
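As a sketch, a scheduled workflow (GitHub Actions syntax shown here; the cron schedule and file path are placeholders you’d adjust for your project) might look like:

```yaml
# .github/workflows/dependency-audit.yml
name: Scheduled dependency audit
on:
  schedule:
    - cron: "0 6 * * 1"   # every Monday at 06:00 UTC
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: python3 -m pip install pip-audit
      - run: pip-audit -r ./requirements.txt
```

A failed run here signals a newly disclosed vulnerability in a dependency you’re already shipping, even if you haven’t deployed recently.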



Securing internal code

So, we’ve learned how to scan our external packages for malicious code and known vulnerabilities, but what about our own packages?

Firstly, if we’re utilising external packages in our internal packages, we should incorporate scanning them in our integration pipelines.

Secondly, we will include additional checks to look for sensitive data we may have left in our code and new vulnerabilities we may not have picked up.

Thirdly, we’ll discuss utilising an internal-only artifact repository such as AWS CodeArtifact, which gives us access to external package repositories and our internal packages from inside our environment. This allows us to restrict internet access from our development pipeline, which an adversary could otherwise use to download malicious packages from a third-party repository.


Scanning for secret leakage

It’s uncommon, but not unheard of, for developers to accidentally leave secrets in application code; this can happen for numerous reasons, but usually some ad-hoc local testing was performed with real credentials, and they were never removed.

If you’re not using static credentials, this risk should be mitigated; however, with so many different systems that need to work together these days, there are most likely one or two that still use static credentials.

You have a couple of options for scanning your code for secrets; a popular free option is Gitleaks, which contains many rules for detecting credential patterns for popular services and platforms like AWS, GitHub and PlanetScale (and many more).

For those willing to pay for a solution, you can use a tool like Bridgecrew to get a more centralised view of any secrets in your repositories.


Pre-commit

Pre-commit is the recommended place to put your secrets scanning, as you want to catch any sensitive information in your code before it gets uploaded to the repository.

The Gitleaks GitHub README has a pretty detailed explanation of setting up pre-commit hooks for detecting secrets before committing.
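As a sketch, a .pre-commit-config.yaml using the hook published in the Gitleaks repository might look like this (the rev value is an example; pin it to a release you’ve vetted):

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4   # example pin; use the release you've vetted
    hooks:
      - id: gitleaks
```

After adding this file, each developer runs pre-commit install once per clone so the hook fires on every commit.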

A solution like Bridgecrew can help discover sensitive values in your code after it’s been pushed, when every minute the issue remains unremediated counts.


Accessing packages internally

For sensitive projects, you may want to restrict internet access for your CI/CD runners and, instead, limit them to downloading packages from an internal package repository such as CodeArtifact.

The benefit of using a repository like CodeArtifact is that it lets you organise, upload and version your packages while also proxying external dependencies, all from an internal AWS service where you can control access via IAM.

If you set up your pipeline to use OIDC or IAM roles, you can integrate with CodeArtifact, making it an easy choice.
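As a sketch, pointing pip at a CodeArtifact repository from a pipeline step can be done with the login helper in the AWS CLI; the domain, account ID and repository names below are placeholders, and the step assumes the runner already has AWS credentials (e.g. via OIDC):

```shell
# Fetch a short-lived auth token and configure pip to pull from the
# CodeArtifact repository (placeholder names throughout)
aws codeartifact login --tool pip \
  --domain my-domain \
  --domain-owner 111122223333 \
  --repository my-internal-repo
```

The token this issues is temporary, so there are no long-lived repository credentials to leak from the runner.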

I won’t dive into setting up CodeArtifact in this section, as AWS has a pretty good blog article about integrating with CodeArtifact already.


Scanning our code for vulnerabilities

Now, it’s time for us to scan our code for new vulnerabilities.

We have a few options we can use for scanning our code:

  • For our infrastructure as code (IaC), we can use Checkov (free) or Snyk (paid)
  • For our application code (assuming Python), we can use Bandit (free) or Snyk (paid)

For development teams who use many languages, consider using a paid tool like Snyk instead of free tools. Free tools are often targeted at one language or framework, whereas Snyk can scan many languages.


Implementing Checkov

Checkov is quick to implement and is highly recommended for all teams. We should all be using infrastructure-as-code, and Checkov can detect misconfigurations in our configurations and report them to us before deployment.

To implement Checkov in your pipeline, perform the following pre-build steps.

# Add to your pre-build actions:
python3 -m pip install checkov
checkov -d ./terraform-modules



Tying it all together

Taking what we’ve learned, an example CI/CD pipeline might look like this…

Architecture Diagram

While I didn’t cover every tool and how to implement them in this article, I hope you’ve learned some good strategies for ensuring the security of your CI/CD pipelines.

Provided you’ve been following along so far, you’ve got tight, restricted access to your critical resources and development pipelines, visibility into any misconfigured access policies, and alerts if developers manually log in to a sensitive workload account.

In addition to these controls, you’re now scanning your external dependencies for malicious code and performing regular vulnerability scanning against them, so you can quickly remediate any critical flaws in your application. You’ve also got steps that scan your code for potentially sensitive information and find vulnerabilities and misconfigurations in your infrastructure and application code.



Closing notes

You may wonder what else we need to cover; we already have a secure supply chain. However, we must design our environments with the expectation that we will be breached, whether tomorrow or in 10 years.

Follow along with my last article in this series, where I’ll cover how to monitor our environment for malicious activity, be alerted to any deviations from our compliance standards, scan our infrastructure for vulnerabilities, and build observability into our applications.

Although it may seem unrelated, our supply chain is only as good as our infrastructure security, so we must ensure we have robust scanning to proactively find issues, monitoring to detect security incidents and strong observability to triage a potential breach.

I hope you found this article helpful. Have a great day.