<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Barstool Engineering]]></title><description><![CDATA[Redefining the Barstool Difference.]]></description><link>https://barstool.engineering/</link><image><url>https://barstool.engineering/favicon.png</url><title>Barstool Engineering</title><link>https://barstool.engineering/</link></image><generator>Ghost 4.21</generator><lastBuildDate>Thu, 16 Apr 2026 00:50:00 GMT</lastBuildDate><atom:link href="https://barstool.engineering/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Rust Wasm on Fastly Compute@Edge]]></title><description><![CDATA[<p>Recently I&apos;ve been playing with a lot of different flavors of WebAssembly using Rust, from <a href="https://github.com/leptos-rs/leptos">Leptos</a> and <a href="https://dioxuslabs.com">Dioxus</a> to, most recently, <a href="https://developer.fastly.com/learning/compute/rust">Compute@Edge</a> on Fastly. In this blog, I&apos;m going to build a simple API and show how we can deploy it to Fastly using</p>]]></description><link>https://barstool.engineering/rust-wasm-on-fastly/</link><guid isPermaLink="false">63cd80324b8e43170fa8a2b6</guid><dc:creator><![CDATA[Jude Giordano]]></dc:creator><pubDate>Tue, 24 Jan 2023 00:12:24 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2023/01/fastly-rust.png" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2023/01/fastly-rust.png" alt="Rust Wasm on Fastly Compute@Edge"><p>Recently I&apos;ve been playing with a lot of different flavors of WebAssembly using Rust. 
These range from <a href="https://github.com/leptos-rs/leptos">Leptos</a> and <a href="https://dioxuslabs.com">Dioxus</a> to, most recently, <a href="https://developer.fastly.com/learning/compute/rust">Compute@Edge</a> on Fastly. In this blog, I&apos;m going to build a simple API and show how we can deploy it to Fastly using their Compute CLI.</p><p>To get started, install the Fastly CLI by following the <a href="https://developer.fastly.com/learning/compute">documentation</a>. After creating an account and API token, you&apos;ll have to run something to the effect of:</p><pre><code>fastly profile create &lt;NAME&gt; --token=&lt;FASTLY_API_TOKEN&gt;</code></pre><p>Then, create a new Rust Compute service using:</p><pre><code>fastly compute init</code></pre><p>When prompted, simply choose <code>rust</code> as the language of choice. This will create a typical Rust binary project, with the Fastly crate included as a dependency, as well as a <code>fastly.toml</code> file in the root for configuring the Compute settings.</p><p>Before we continue, let&apos;s do some simple optimizations. If you run:</p><pre><code>fastly compute build</code></pre><p>you will see Fastly creates a <code>bin/main.wasm</code>. If you check the file size using <code>ls -lh bin/main.wasm</code>, you will see that this file is around 2.5 MB. Let&apos;s improve that at the Rust release-profile level. I&apos;m just going to update the <code>Cargo.toml</code> to the following:</p><pre><code class="language-toml">[profile.release]
opt-level = 3     # 0-3
strip = true      # strip symbols from binary
lto = true        # enable link time optimization
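# Optionally, aborting on panic instead of unwinding can shave off a bit more
# size; it has tradeoffs (no unwinding), so it is left commented as a suggestion:
# panic = "abort"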
codegen-units = 1 # maximize size reduction optimizations</code></pre><p>NOTE: <code>codegen-units = 1</code> can have some tradeoffs, the most notable of which is increased compile time; it&apos;s up to you whether to include it. Now, if we re-run <code>fastly compute build</code> and then <code>ls -lh bin/main.wasm</code>, we can see it has decreased to around 400 KB (the exact size will vary with your toolchain version, of course). Awesome!</p><p>Debugging the application is as simple as running:</p><pre><code>fastly compute serve</code></pre><p>and hitting the URL printed in your terminal; for me it&apos;s at <code>127.0.0.1:7676</code>. As you would expect, Fastly simply creates some bindings to our <code>src/main.rs</code> file using their crate (which I believe just has some convenience wrappers around the <a href="https://crates.io/crates/http">http</a> crate).</p><p>Easy! To publish our service to Fastly, all we have to do is run</p><pre><code>fastly compute deploy</code></pre><p>For a more involved example where I use Fastly&apos;s backends to serve a proxy of <a href="https://jsonplaceholder.typicode.com">jsonplaceholder</a>, check out this <a href="https://github.com/BarstoolSports/compute-rs-blog">repository</a> where I added the Tokio async runtime, as well as whipped up some custom routing, tracing, and serialization.</p><p>Keep on rusting!</p>]]></content:encoded></item><item><title><![CDATA[Fast and Efficient AWS Lambdas Built With Rust]]></title><description><![CDATA[<hr><p>As Barstool&apos;s resident <a href="https://www.rust-lang.org">Rustacean</a>, it is my obligation to push for the use of Rust, and to extol its elegance and power. 
When running services at scale, my favorite solution is just &apos;throw it in a lambda&apos;, and let AWS worry about handling request load.</p>]]></description><link>https://barstool.engineering/fast-and-efficient-aws-lambdas-built-with-rust/</link><guid isPermaLink="false">63ab6e384b8e43170fa8a0ea</guid><dc:creator><![CDATA[Jude Giordano]]></dc:creator><pubDate>Wed, 28 Dec 2022 18:22:19 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2022/12/ferris.png" medium="image"/><content:encoded><![CDATA[<hr><img src="https://barstool.engineering/content/images/2022/12/ferris.png" alt="Fast and Efficient AWS Lambdas Built With Rust"><p>As Barstool&apos;s resident <a href="https://www.rust-lang.org">Rustacean</a>, it is my obligation to push for the use of Rust, and to extol its elegance and power. When running services at scale, my favorite solution is just &apos;throw it in a lambda&apos;, and let AWS worry about handling request load.</p><p>When it comes to trimming cost, shaving milliseconds off each request is a decent angle for Rust to come in and do some heavy lifting. AWS does in fact support Rust as a <a href="https://docs.aws.amazon.com/sdk-for-rust/latest/dg/lambda.html">lambda runtime</a>; for this example I&apos;ll be using the fantastic IaC framework: <a href="https://www.serverless.com">Serverless</a>.</p><p>More specifically, the <a href="https://github.com/softprops/serverless-rust">Serverless-Rust</a> plugin. 
To get started, simply run the <a href="https://github.com/rust-lang/cargo">cargo</a> command <code>cargo init aws-example</code> to create a new Rust project.</p><figure class="kg-card kg-image-card"><img src="https://barstool.engineering/content/images/2022/12/image.png" class="kg-image" alt="Fast and Efficient AWS Lambdas Built With Rust" loading="lazy" width="707" height="295" srcset="https://barstool.engineering/content/images/size/w600/2022/12/image.png 600w, https://barstool.engineering/content/images/2022/12/image.png 707w"></figure><p>In the <code>Cargo.toml</code>, we are going to add the following new dependencies:</p><!--kg-card-begin: markdown--><pre><code class="language-toml">[dependencies]
lambda_http = &quot;0.7.2&quot;
tokio = { version = &quot;1.22.0&quot;, features = [&quot;macros&quot;, &quot;rt-multi-thread&quot;] }
serde = { version = &quot;1.0.151&quot;, features = [&quot;derive&quot;] }
serde_json = &quot;1.0.91&quot;
</code></pre>
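<p>If you prefer the command line, the same pinned versions can be added with <code>cargo add</code> (available since Cargo 1.62); this mirrors the snippet above rather than replacing it:</p>

```shell
# Add the Lambda HTTP runtime, async runtime, and serialization dependencies
cargo add lambda_http@0.7.2
cargo add tokio@1.22.0 --features macros,rt-multi-thread
cargo add serde@1.0.151 --features derive
cargo add serde_json@1.0.91
```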
<!--kg-card-end: markdown--><p>If you use Rust, you are likely familiar with most of these dependencies, save for <a href="https://github.com/awslabs/aws-lambda-rust-runtime/tree/main/lambda-http">Lambda Http</a>, which adds types and helpers to transform AWS request events. We will also need to run <code>npm init -y</code> in the root of the project, and use the following <code>package.json</code>:</p><!--kg-card-begin: markdown--><pre><code class="language-json">{
  &quot;name&quot;: &quot;aws-example&quot;,
  &quot;version&quot;: &quot;1.0.0&quot;,
  &quot;dependencies&quot;: {
    &quot;serverless&quot;: &quot;^3.24.0&quot;,
    &quot;serverless-rust&quot;: &quot;^0.3.8&quot;
  }
}
</code></pre>
<!--kg-card-end: markdown--><p>As you can see, we only need the <a href="https://github.com/serverless/serverless">Serverless</a> and <a href="https://github.com/softprops/serverless-rust">Serverless Rust</a> packages as dependencies.</p><p>We&apos;ll also add a <code>serverless.yml</code> file in the root of the project, and paste the following in:</p><!--kg-card-begin: markdown--><pre><code class="language-yml">service: aws-example

frameworkVersion: &quot;3&quot;

provider:
  name: aws
  runtime: rust
  region: us-east-1
  versionFunctions: false
  memorySize: 2048
  timeout: 30

plugins:
  - serverless-rust

package:
  individually: true

custom:
  rust:
    target: x86_64-unknown-linux-musl
    linker: clang
    dockerless: true

</code></pre>
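<p>Because <code>dockerless: true</code> builds on your host machine rather than inside Docker, the musl target referenced above must be installed locally (and <code>clang</code> must be available to act as the linker). A one-time setup with rustup looks like:</p>

```shell
# Install the Linux musl target that serverless-rust cross-compiles against
rustup target add x86_64-unknown-linux-musl
```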
<!--kg-card-end: markdown--><p>I won&apos;t go over every line of the <code>custom</code> section, but you can read all about the plugin <a href="https://github.com/softprops/serverless-rust#-experimental-local-builds">here</a>.</p><p>In our project&apos;s <code>src</code> directory, create a single Rust binary called <code>bin/ping.rs</code> (or whatever you want). Serverless can point to individual Rust binaries and deploy them as lambdas that all scale independently. (You can also delete <code>src/main.rs</code>.) Our structure now looks like this:</p><figure class="kg-card kg-image-card"><img src="https://barstool.engineering/content/images/2022/12/image-1.png" class="kg-image" alt="Fast and Efficient AWS Lambdas Built With Rust" loading="lazy" width="604" height="383" srcset="https://barstool.engineering/content/images/size/w600/2022/12/image-1.png 600w, https://barstool.engineering/content/images/2022/12/image-1.png 604w"></figure><p>The actual code is very straightforward: we simply make each binary entry-point async using the Tokio macro, and return a response using the Lambda-Http Rust package. For example:</p><pre><code class="language-rust">use lambda_http::{
    aws_lambda_events::serde_json::json, http::StatusCode, run, service_fn, Error, IntoResponse,
    Request, Response,
};

pub async fn ping(_event: Request) -&gt; Result&lt;impl IntoResponse, Error&gt; {
    let body = json!({ &quot;message&quot;: &quot;sup&quot; }).to_string();
    let response = Response::builder()
        .status(StatusCode::OK)
        .header(&quot;Content-Type&quot;, &quot;application/json&quot;)
        .body(body)
        .map_err(Box::new)?;
    Ok(response)
}

#[tokio::main]
async fn main() -&gt; Result&lt;(), Error&gt; {
    run(service_fn(ping)).await
}

</code></pre>
<!--kg-card-end: markdown--><p>You could, of course, make your own response structs that derive <code>serde::Serialize</code> and serialize <code>&amp;self</code> into the body.</p><p>Finally, all we have to do to invoke our binary is add it to our Serverless definition with whatever event we like. The syntax is <code>application-name.binary-name</code>.</p><!--kg-card-begin: markdown--><pre><code class="language-yml">functions:
  ping:
    handler: aws-example.ping
    # function url
    url: true
    events:
    # v1 REST Api
    - http:
        method: GET
        path: /api/v1/ping
    # v2 HTTP Api
    - httpApi:
        method: GET
        path: /api/v2/ping
</code></pre>
<!--kg-card-end: markdown--><p>I&apos;ve outlined different ways for our lambda to be invoked: <a href="https://www.serverless.com/framework/docs/providers/aws/events/apigateway">v1 Rest Api</a>, <a href="https://www.serverless.com/framework/docs/providers/aws/events/http-api">v2 HTTP Api</a>, and as a <a href="https://www.serverless.com/framework/docs/providers/aws/guide/functions#lambda-function-urls">Function Url</a>. The preference is simply up to you!</p><p>Finally, all that&apos;s left is to deploy and invoke our lambda. We can do this through the Serverless CLI using <code>npx serverless deploy --stage prod</code> and then hitting one of our three endpoints.</p><figure class="kg-card kg-image-card"><img src="https://barstool.engineering/content/images/2022/12/image-3.png" class="kg-image" alt="Fast and Efficient AWS Lambdas Built With Rust" loading="lazy" width="1018" height="547" srcset="https://barstool.engineering/content/images/size/w600/2022/12/image-3.png 600w, https://barstool.engineering/content/images/size/w1000/2022/12/image-3.png 1000w, https://barstool.engineering/content/images/2022/12/image-3.png 1018w" sizes="(min-width: 720px) 720px"></figure><p>Ta-da! 
A super fast, scalable, and very resource-efficient service using Rust on AWS : )</p>]]></content:encoded></item><item><title><![CDATA[A Real World Comparison between Cloudflare Workers and Fastly Compute@Edge]]></title><description><![CDATA[An end-to-end comparison of the two hottest edge computing platforms.]]></description><link>https://barstool.engineering/a-real-world-comparison-between-cloudflare-workers-and-fastly-compute-edge/</link><guid isPermaLink="false">61bb85f34b8e43170fa89b1f</guid><category><![CDATA[serverless]]></category><category><![CDATA[fastly]]></category><category><![CDATA[cloudflare]]></category><category><![CDATA[aws]]></category><category><![CDATA[lambda]]></category><category><![CDATA[streams]]></category><dc:creator><![CDATA[Andrew Barba]]></dc:creator><pubDate>Fri, 17 Dec 2021 19:02:21 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/12/70abb87e63d4.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/12/70abb87e63d4.jpeg" alt="A Real World Comparison between Cloudflare Workers and Fastly Compute@Edge"><p>Over the past couple of months, the race to build the best edge compute platform has really heated up. We&apos;re seeing companies like Cloudflare, Fastly, AWS and Fly building compelling platforms to run code as close to your users as possible. Gone are the days of single compute instances handling many requests; we&apos;re entering a new era of compute where each request gets its own isolated container and the ability to scale to thousands, even millions, of requests per second. 
</p><p>Although there are many comparisons still to be done between all platforms, I want to take this opportunity to focus on Cloudflare and Fastly as the two companies have been going <a href="https://blog.cloudflare.com/network-performance-update-full-stack-week/">back</a> and <a href="https://www.fastly.com/blog/debunking-cloudflares-recent-performance-tests">forth</a> in what I would consider a largely meaningless feud. The saga began with Cloudflare testing their JavaScript runtime against Fastly&apos;s JavaScript runtime (still in beta) in a basic hello world test. The test was simple: how fast can each runtime return a hello world response. In this case &quot;hello world&quot; simply meant replying with a JSON response of the current request headers. If you&apos;re just dying to know how fast each platform could return such a response, let me spoil it for you: <em>really fucking fast</em>!</p><p>Both platforms were returning this response in under 100ms up to the 90th percentile. I don&apos;t know about you, but we don&apos;t have many hello world endpoints in production, so this wasn&apos;t exactly going to sell us in either direction. What we wanted was a more robust example of moving a traditional server workload to the edge, and that&apos;s what I plan to show you today. Importantly, I think a comparison between any of these platforms needs to go well beyond just time-to-first-byte. We care a lot more about developer experience, framework support, CI/CD, and the other things that make development teams happy and more efficient in their everyday work. My goal is to give a comprehensive overview of the following:</p><ol><li>TypeScript Language Support</li><li>JavaScript Platform APIs</li><li>Deploying with GitHub Actions</li><li>Performance Comparison</li><li>Platform Limitations</li></ol><p>Before we get into our comparisons, it&apos;s important to understand the production workload that I&apos;ve re-written for each platform. 
Internally we call this product Pipe-Stream: its goal is to take a list of MP3 files and stream them together, in order, as a single MP3 file. We use this technology for quite a few different things, but one of the obvious benefits is to swap out ad reads dynamically as we get new ad partners throughout the year. Our Podcast API pre-chunks our MP3 files into segments so they are 100% ready to be combined into a single stream. Pipe-Stream does not know anything about the MP3 spec; its job is to simply concatenate the segments into one streaming response. And the &quot;streaming&quot; aspect of this service is really important. We have some MP3 files over 1GB in size and thus we do not want to pull all that data into memory. Streaming should be completely pass-through so we can optimize time-to-first-byte as well as runtime memory.</p><h2 id="typescript-language-support">TypeScript Language Support</h2><p>I&apos;m happy to report that both platforms have excellent support for developing your application in TypeScript. Each platform provides first-class types for their platform APIs, making it dead simple to ensure your code is always using their APIs correctly. Each platform compiles your TypeScript using webpack, and the webpack config files are nearly identical between the platforms. As of writing, Cloudflare gets the slight edge in getting started as they provide a <a href="https://github.com/cloudflare/worker-typescript-template">one-line-command</a> to create a new TypeScript workers project. Fastly provides a similar <a href="https://developer.fastly.com/solutions/starters/compute-starter-kit-javascript-default/">one-line-command</a> for a JavaScript project, but it&apos;s up to you to figure out how to add webpack and get it to build. 
Hint: copy the Cloudflare webpack file and dependencies.</p><h2 id="javascript-platform-apis">JavaScript Platform APIs</h2><p>Okay so the TypeScript support is nearly identical between the platforms, but what can we actually do in TypeScript (compiled to WASM) on each platform? There are some critical components coming from our production code running on Node.js that need to be available in each runtime:</p><ol><li>Readable Streams</li><li>Writable / Transform Streams</li><li>HTTP Requests with Streaming Bodies</li></ol><h3 id="readable-streams">Readable Streams</h3><p>A readable stream is perhaps the most important aspect of this entire project. Our goal is to stream each MP3 segment from the source (in our case Amazon S3) to the client. We <em>must</em> avoid reading the entire file into memory and instead stream the response directly to the client. This can only be achieved if the platform supports the concept of a readable stream. In Node.js this looks like:</p><pre><code class="language-typescript">import { Readable as ReadableStream } from &apos;stream&apos;</code></pre><p>In both Fastly Compute@Edge and Cloudflare Workers <code>ReadableStream</code> is a global class, so no need to import it to use it. Their implementations of <code>ReadableStream</code> follow the same spec as the <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream">Web API</a>. This conformance to the Web API is a common theme that you will see throughout this post. Each platform makes a strong effort to conform to the Web API as much as possible, but each has their own differences and trade-offs which we will cover in another section.</p><h3 id="writable-transform-streams">Writable / Transform Streams</h3><p>In order to successfully combine multiple readable streams into a single destination stream, we must pipe the streams through what&apos;s known as a <code>TransformStream</code>. 
A <code>TransformStream</code> is once again part of the Web API and it provides both a writable and a readable stream. You can write data to the writable stream and that data is made available on the readable stream. In Node.js this looks like:</p><pre><code class="language-typescript">import { PassThrough, Readable, Writable } from &apos;stream&apos;

export function combineStreams(streams: Readable[]): Readable {
  // PassThrough is a Transform whose _transform forwards chunks unchanged;
  // a bare `new Transform()` would error on write because _transform is not implemented
  const stream = new PassThrough()
  _combineStreams(streams, stream).catch((err) =&gt; stream.destroy(err))
  return stream
}

async function _combineStreams(sources: Readable[], destination: Writable) {
  for (const stream of sources) {
    await new Promise((resolve, reject) =&gt; {
      stream.pipe(destination, { end: false })
      stream.on(&apos;end&apos;, resolve)
      stream.on(&apos;error&apos;, reject)
    })
  }
  destination.end()
}</code></pre><p>This block of code is doing the following:</p><ol><li>A function called <code>combineStreams</code> accepts an array of readable streams to stitch together</li><li>It creates a new pass-through transform stream</li><li>Loops through each readable stream and pipes it to the transform stream</li><li>Prevents closing the transform stream during each pipe call</li><li>Closes the transform stream once all streams have been combined</li><li>Returns the transform stream synchronously so it can be used by the caller</li></ol><p>Node&apos;s <code>TransformStream</code> differs from the Web API in one key way: it is both a Readable and Writable stream and thus can be returned as either without the caller knowing it is one or the other. The Web API provides a slightly different spec where a <code>TransformStream</code> is actually not a stream at all, but a class that exposes both <code>readable</code> and <code>writable</code> streams as properties of the class.</p><p>Here is the exact Cloudflare implementation of the previous Node.js code:</p><pre><code class="language-typescript">export function combineStreams(streams: ReadableStream[]): ReadableStream {
  const stream = new TransformStream()
  _combineStreams(streams, stream.writable)
  return stream.readable
}

async function _combineStreams(sources: ReadableStream[], destination: WritableStream) {
  for (const stream of sources) {
    await stream.pipeTo(destination, {
      preventClose: true
    })
  }
  destination.close()
}</code></pre><p>Amazingly, this is fewer lines of code than the Node.js implementation and even comes with an async version of <code>Stream.pipe</code>, greatly cleaning up our code. So how does Fastly compare? Well, the Fastly implementation is actually identical to the Cloudflare implementation. This is a huge win for developers looking to experiment on both platforms.</p><h3 id="http-requests-with-streaming-bodies">HTTP Requests with Streaming Bodies</h3><p>At this point we have the ability to successfully combine readable streams into a single readable destination stream. Now we need a way to actually download content from Amazon S3 and return the data as a readable stream. Lucky for us, both Cloudflare and Fastly implement <code>fetch</code> from the Web API, but with one major difference.</p><p>Fetching data with Cloudflare is as easy as:</p><pre><code class="language-typescript">const urls = [...]
const requests = urls.map(url =&gt; fetch(url.href))
const responses = await Promise.all(requests)
const streams = responses.map(res =&gt; res.body)</code></pre><p>Although fetching data with Fastly is similar, there is one major difference in that the hostname of all <code>fetch</code>ed resources must be defined as <a href="https://developer.fastly.com/learning/compute/javascript/#communicating-with-backend-servers-and-the-fastly-cache">Fastly Backends</a>. This won&apos;t come as a surprise to anyone familiar with Fastly&apos;s VCL platform, but I bet this will be a major hangup for new customers coming from a more traditional web background. Assuming our hostnames are defined as backends, fetching data with Fastly is nearly as easy as Cloudflare:</p><pre><code class="language-typescript">const urls = [...]
const requests = urls.map(url =&gt; fetch(url.href, {
  backend: url.hostname
}))
const responses = await Promise.all(requests)
const streams = responses.map(res =&gt; res.body)</code></pre><h2 id="deploying-with-github-actions">Deploying with GitHub Actions</h2><p>We&apos;re finally ready to deploy our code to each platform, and at Barstool this involves setting up a GitHub Actions workflow. I&apos;m happy to report that both platforms provide a GitHub Actions step that makes it dead simple to deploy your code. Here&apos;s our workflow for Cloudflare:</p><pre><code class="language-yaml">name: Deploy Application

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Use Node.js
        uses: actions/setup-node@v2
        with:
          node-version: 16
          cache: yarn

      - run: yarn install --frozen-lockfile

      - name: Deploy to Cloudflare
        uses: cloudflare/wrangler-action@1.3.0
        with:
          apiToken: ${{ secrets.CF_API_TOKEN }}
</code></pre><p>And here&apos;s our workflow for Fastly:</p><pre><code class="language-yaml">name: Deploy Application

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Use Node.js
        uses: actions/setup-node@v2
        with:
          node-version: 16
          cache: yarn

      - run: yarn install --frozen-lockfile

      - name: Deploy to Compute@Edge
        uses: fastly/compute-actions@beta
        env:
          FASTLY_API_TOKEN: ${{ secrets.FASTLY_API_TOKEN }}
</code></pre><p>Each step correctly uses the build command in your <code>package.json</code>, which should compile your code with Webpack and then bundle it for deployment. There&apos;s technically a bit more to it than simply defining this workflow, as each platform has its own configuration file for setting up domains, environments, etc. That&apos;s slightly out of scope for this article, but I can assure you both are fairly complete and make it easy to define your infrastructure as code.</p><h2 id="performance-comparison">Performance Comparison</h2><p>Finally. Let&apos;s see how these platforms perform with a real-world use case. In order to provide a fair comparison I&apos;m going to run the tests without any CDN in front of our production workload on EC2; however, I am going to enable caching on the origin requests, as each platform supports caching <code>fetch</code> requests and this is a critical part of the performance profile for our use case. If we didn&apos;t cache the origin requests, our Amazon S3 bill would be astronomical.</p><p>The way pipe-stream works is it provides a single route <code>GET /stream</code> and requires a JWT to be passed as <code>?token=</code>. Inside the JWT is the list of URLs that need to be combined into a single stream. We use a JWT to ensure a bad actor cannot use our service to stream whatever files they want. The JWT is signed by a Barstool private key so we can validate it and ensure the JWT came from one of our services.</p><p>The first file we are going to test is a 3MB file defined by the following JWT payload:</p><pre><code class="language-json">{
  &quot;iat&quot;: 1624309200000,
  &quot;exp&quot;: 1624395600000,
  &quot;data&quot;: {
    &quot;u&quot;: [
      &quot;https://cms-media-library.s3.us-east-1.amazonaws.com/barba/splitFile-segment-0000.mp3&quot;,
      &quot;https://cms-media-library.s3.us-east-1.amazonaws.com/barba/splitFile-segment-0001.mp3&quot;,
      &quot;https://cms-media-library.s3.us-east-1.amazonaws.com/barba/splitFile-segment-0002.mp3&quot;
    ],
    &quot;c&quot;: &quot;audio/mpeg&quot;
  }
}</code></pre><p>At runtime, the platforms will fetch each file in the array and stream them back as a single combined MP3 file. Here are the test urls for each platform:</p><ol><li><a href="https://pipe-stream-origin.barstoolsports.com/stream.mp3?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpYXQiOjE2MjQzMDkyMDAwMDAsImV4cCI6MTYyNDM5NTYwMDAwMCwiZGF0YSI6eyJ1IjpbImh0dHBzOi8vY21zLW1lZGlhLWxpYnJhcnkuczMudXMtZWFzdC0xLmFtYXpvbmF3cy5jb20vYmFyYmEvc3BsaXRGaWxlLXNlZ21lbnQtMDAwMC5tcDMiLCJodHRwczovL2Ntcy1tZWRpYS1saWJyYXJ5LnMzLnVzLWVhc3QtMS5hbWF6b25hd3MuY29tL2JhcmJhL3NwbGl0RmlsZS1zZWdtZW50LTAwMDEubXAzIiwiaHR0cHM6Ly9jbXMtbWVkaWEtbGlicmFyeS5zMy51cy1lYXN0LTEuYW1hem9uYXdzLmNvbS9iYXJiYS9zcGxpdEZpbGUtc2VnbWVudC0wMDAyLm1wMyJdLCJjIjoiYXVkaW8vbXBlZyJ9fQ.DNVLvhaI-7K7v5nFiFlikoH5JlpPpU6O1wXuiAS9S54">EC2</a></li><li><a href="https://pipe-stream-workers.barstool.dev/stream.mp3?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpYXQiOjE2MjQzMDkyMDAwMDAsImV4cCI6MTYyNDM5NTYwMDAwMCwiZGF0YSI6eyJ1IjpbImh0dHBzOi8vY21zLW1lZGlhLWxpYnJhcnkuczMudXMtZWFzdC0xLmFtYXpvbmF3cy5jb20vYmFyYmEvc3BsaXRGaWxlLXNlZ21lbnQtMDAwMC5tcDMiLCJodHRwczovL2Ntcy1tZWRpYS1saWJyYXJ5LnMzLnVzLWVhc3QtMS5hbWF6b25hd3MuY29tL2JhcmJhL3NwbGl0RmlsZS1zZWdtZW50LTAwMDEubXAzIiwiaHR0cHM6Ly9jbXMtbWVkaWEtbGlicmFyeS5zMy51cy1lYXN0LTEuYW1hem9uYXdzLmNvbS9iYXJiYS9zcGxpdEZpbGUtc2VnbWVudC0wMDAyLm1wMyJdLCJjIjoiYXVkaW8vbXBlZyJ9fQ.hwtTWYf2aecgmveSVhPdhaEsz8A8LQax4aeXBEvtD50">Cloudflare</a></li><li><a 
href="https://pipe-stream-compute.barstool.dev/stream.mp3?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpYXQiOjE2MjQzMDkyMDAwMDAsImV4cCI6MTYyNDM5NTYwMDAwMCwiZGF0YSI6eyJ1IjpbImh0dHBzOi8vY21zLW1lZGlhLWxpYnJhcnkuczMudXMtZWFzdC0xLmFtYXpvbmF3cy5jb20vYmFyYmEvc3BsaXRGaWxlLXNlZ21lbnQtMDAwMC5tcDMiLCJodHRwczovL2Ntcy1tZWRpYS1saWJyYXJ5LnMzLnVzLWVhc3QtMS5hbWF6b25hd3MuY29tL2JhcmJhL3NwbGl0RmlsZS1zZWdtZW50LTAwMDEubXAzIiwiaHR0cHM6Ly9jbXMtbWVkaWEtbGlicmFyeS5zMy51cy1lYXN0LTEuYW1hem9uYXdzLmNvbS9iYXJiYS9zcGxpdEZpbGUtc2VnbWVudC0wMDAyLm1wMyJdLCJjIjoiYXVkaW8vbXBlZyJ9fQ.hwtTWYf2aecgmveSVhPdhaEsz8A8LQax4aeXBEvtD50">Fastly</a></li></ol><p>I performed all tests from my apartment in Brooklyn, NY on a 1GB Verizon FIOS connection. At the time of testing, I was consistently getting 580mbps according to <a href="https://fast.com">fast.com</a>. The benchmark was performed using a <a href="https://github.com/BarstoolSports/pipe-stream-benchmark">custom Deno script</a> which fetches each url 100 times and then computes the average time-to-first-byte (TTFB) and average time-to-download (TTD). Here are my results:</p><!--kg-card-begin: markdown--><table>
<thead>
<tr>
<th>PLATFORM</th>
<th>TTFB (MS)</th>
<th>TTD (MS)</th>
<th>SIZE (MB)</th>
</tr>
</thead>
<tbody>
<tr>
<td>ec2</td>
<td>77.02</td>
<td>151.54</td>
<td>3.33</td>
</tr>
<tr>
<td>cloudflare</td>
<td>61.98</td>
<td>130.88</td>
<td>3.33</td>
</tr>
<tr>
<td>fastly</td>
<td>38.07</td>
<td>86.19</td>
<td>3.33</td>
</tr>
</tbody>
</table>
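<p>For context on how the table values are produced: each request contributes one TTFB and one TTD sample, and the script averages them per platform. A minimal Node sketch of just that aggregation step (function names are mine, and the sample arrays are made-up illustrations, not real measurements):</p>

```javascript
// Average a list of per-request timings (in ms) into a single table figure.
function average(samples) {
  return samples.reduce((sum, ms) => sum + ms, 0) / samples.length;
}

// Collapse raw TTFB/TTD samples into one row of the results table.
function summarize(platform, ttfbSamples, ttdSamples) {
  return {
    platform,
    ttfb: Number(average(ttfbSamples).toFixed(2)),
    ttd: Number(average(ttdSamples).toFixed(2)),
  };
}

// Illustrative samples only -- the real benchmark fetches each URL 100 times.
const row = summarize('example', [40, 36], [90, 82]);
console.log(row); // { platform: 'example', ttfb: 38, ttd: 86 }
```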
<!--kg-card-end: markdown--><p>Next I wanted to test streaming a much larger file. Below are 3 more urls for a 70MB file:</p><ol><li><a href="https://pipe-stream-origin.barstoolsports.com/stream.mp3?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpYXQiOjEwMDAwMDAwMDAsImV4cCI6MTAwMDAwMDAwMDAwLCJkYXRhIjp7InUiOlsiaHR0cHM6Ly9jbXMtbWVkaWEtbGlicmFyeS5zMy5hbWF6b25hd3MuY29tL3VuaW9uLzIwMjEvMDYvMjkvc2lsZW50c3RlcmVvLmZjOWZkZDJmLjk2cy5tcDMiLCJodHRwczovL2JhcnN0b29sLXBvZGNhc3RzLnMzLmFtYXpvbmF3cy5jb20vYmFyc3Rvb2wtc3BvcnRzL21pY2tzdGFwZS9kYW5hcmF0cy4yNTdhNjBkNjkyMzYuOTYuOTZzLm1wMyJdLCJjIjoiYXVkaW8vbXBlZyJ9fQ.m7eAJ6EhE1kA6gWlwVDaSjk7Klch-KT7mZTD-MF4AlQ">EC2</a></li><li><a href="https://pipe-stream-workers.barstool.dev/stream.mp3?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpYXQiOjEwMDAwMDAwMDAsImV4cCI6MTAwMDAwMDAwMDAwLCJkYXRhIjp7InUiOlsiaHR0cHM6Ly9jbXMtbWVkaWEtbGlicmFyeS5zMy5hbWF6b25hd3MuY29tL3VuaW9uLzIwMjEvMDYvMjkvc2lsZW50c3RlcmVvLmZjOWZkZDJmLjk2cy5tcDMiLCJodHRwczovL2JhcnN0b29sLXBvZGNhc3RzLnMzLmFtYXpvbmF3cy5jb20vYmFyc3Rvb2wtc3BvcnRzL21pY2tzdGFwZS9kYW5hcmF0cy4yNTdhNjBkNjkyMzYuOTYuOTZzLm1wMyJdLCJjIjoiYXVkaW8vbXBlZyJ9fQ.m7eAJ6EhE1kA6gWlwVDaSjk7Klch-KT7mZTD-MF4AlQ">Cloudflare</a></li><li><a href="https://pipe-stream-compute.barstool.dev/stream.mp3?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpYXQiOjEwMDAwMDAwMDAsImV4cCI6MTAwMDAwMDAwMDAwLCJkYXRhIjp7InUiOlsiaHR0cHM6Ly9jbXMtbWVkaWEtbGlicmFyeS5zMy5hbWF6b25hd3MuY29tL3VuaW9uLzIwMjEvMDYvMjkvc2lsZW50c3RlcmVvLmZjOWZkZDJmLjk2cy5tcDMiLCJodHRwczovL2JhcnN0b29sLXBvZGNhc3RzLnMzLmFtYXpvbmF3cy5jb20vYmFyc3Rvb2wtc3BvcnRzL21pY2tzdGFwZS9kYW5hcmF0cy4yNTdhNjBkNjkyMzYuOTYuOTZzLm1wMyJdLCJjIjoiYXVkaW8vbXBlZyJ9fQ.m7eAJ6EhE1kA6gWlwVDaSjk7Klch-KT7mZTD-MF4AlQ">Fastly</a></li></ol><p>And here are the &#xA0;results from fetching these urls 10 times on each platform:</p><!--kg-card-begin: markdown--><table>
<thead>
<tr>
<th>PLATFORM</th>
<th>TTFB (MS)</th>
<th>TTD (MS)</th>
<th>SIZE (MB)</th>
</tr>
</thead>
<tbody>
<tr>
<td>ec2</td>
<td>87.8</td>
<td>1750.9</td>
<td>69.79</td>
</tr>
<tr>
<td>cloudflare</td>
<td>66.4</td>
<td>1832.2</td>
<td>69.79</td>
</tr>
<tr>
<td>fastly</td>
<td>32.4</td>
<td>1369.1</td>
<td>69.79</td>
</tr>
</tbody>
</table>
<!--kg-card-end: markdown--><p>The first thing to notice is that TTFB is considerably better on the Edge platforms. This shouldn&apos;t be much of a surprise, as this is precisely what the original blog posts from both Fastly and Cloudflare were showcasing. Those blog posts did a much more robust analysis, testing TTFB from a variety of locations around the world. For the sake of time I did not perform my tests anywhere other than Brooklyn, NY. Keeping this in mind, it&apos;s still hard to ignore Fastly&apos;s results when it comes to TTFB. The project we&apos;re testing is doing considerably more work in the WASM runtime than the original blog posts, yet Fastly&apos;s runtime is optimized to such a degree that you would barely notice.</p><p>Things become more interesting when it comes to downloading the entire stream. Keep in mind we&apos;re not downloading a single static file; the runtimes are stitching multiple static files together. Both Cloudflare and EC2 had similar performance characteristics, but Fastly managed to stream the entire smaller file 33% faster and the larger file 25% faster. Given this project is very IO-heavy, and we enabled caching on the origin requests, this is also testing the pure CDN performance of each platform rather than just the WASM runtimes.</p><p><strong>[2022-02-04] Update:</strong></p><p>A previous version of this blog post tested an earlier version of Fastly&apos;s SDK, which led to much slower performance than both Cloudflare and EC2. Since the initial post, the Fastly team has released the latest version of their JS SDK with a native <code>TransformStream</code> and fixes to <code>pipeTo</code>. In all of our tests using the latest version of the SDK, Fastly has outperformed both EC2 and Cloudflare.</p><h2 id="platform-limitations">Platform Limitations</h2><p>Aside from the discussed limitations of Fastly&apos;s <code>fetch</code> API, there&apos;s another limitation that&apos;s present on both platforms: Content-Length headers are not returned on streaming responses, even if we pre-compute the content length and set it ourselves. This is turning out to be a major blocker in terms of officially migrating away from EC2 and onto one of the edge platforms.</p><p>We&apos;ve spoken to both Cloudflare and Fastly about this limitation, and their teams are aware and looking for ways to fix it. Although we&apos;ve received minimal details about what it will take to implement a fix, it&apos;s clear there are issues in the Web API fetch specification that prevent setting a content-length when using chunked encoding. In our case, we&apos;re not really using chunked encoding, as we know the total number of bytes across the MP3 files, but the runtime doesn&apos;t. I found some additional details regarding the issue <a href="https://community.cloudflare.com/t/cant-set-content-length-for-get-requests/174487/6">here</a>.</p><p><strong>[2021-12-18] Update:</strong></p><p>The tech lead for Cloudflare Workers responded to us on HackerNews and showed us how to correctly stream with a Content-Length header. Cloudflare implemented a non-standard <code>FixedLengthStream</code> class that allows passing the known content length into the constructor. Updating our implementation was as simple as:</p><pre><code class="language-typescript">const responseStream = new FixedLengthStream(contentLength)

combineStreams(streams).pipeTo(responseStream.writable)

return new Response(responseStream.readable)</code></pre><p>If you view any of the Cloudflare URLs above you will now see the correct Content-Length returned in all cases.</p><h2 id="final-thoughts">Final Thoughts</h2><p>I think the future is incredibly bright for this new age of edge computing, and I&apos;m excited for both Fastly and Cloudflare to continue improving their platforms and taking market share from the three major cloud vendors. Clear competition across the industry can only mean a better product for us, the developers. If you&apos;re interested in working on projects like this, check out <a href="https://www.barstoolsports.com/jobs">barstoolsports.com/jobs</a> or shoot me an email at <a href="mailto:barba@barstoolsports.com">barba@barstoolsports.com</a>.</p>]]></content:encoded></item><item><title><![CDATA[Reporting on Data in High Volume Mongo Collections]]></title><description><![CDATA[<p>There are several services used here at Barstool that generate millions of mongo documents every day. In order to facilitate efficient reporting tools for the data and analytics team, we implemented a pattern that would yield near real-time results while not creating any long-running queries on millions of records. Our</p>]]></description><link>https://barstool.engineering/reporting-on-data-in-high-volume-mongo-collections/</link><guid isPermaLink="false">61a803b84b8e43170fa89868</guid><category><![CDATA[mongodb]]></category><dc:creator><![CDATA[Markham F Rollins IV]]></dc:creator><pubDate>Mon, 06 Dec 2021 22:34:48 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/12/e2170e60ea04-2.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/12/e2170e60ea04-2.jpeg" alt="Reporting on Data in High Volume Mongo Collections"><p>There are several services used here at Barstool that generate millions of mongo documents every day. 
In order to facilitate efficient reporting tools for the data and analytics team, we implemented a pattern that would yield near real-time results while not creating any long-running queries on millions of records. Our solution entailed using TTL on Mongo collections, Stitch to Snowflake for backup, and Mongo aggregations.</p><p>Mongo has built-in support to set <a href="https://docs.mongodb.com/manual/core/index-ttl/#timing-of-the-delete-operation" rel="noopener noreferrer">TTLs</a> on collections to act as a sort of cleanup after a set period of time. Even with proper indexing, running aggregates over a high volume of documents is non-performant. We did an assessment of our collections, determined what the most useful timeframe would be, and then created the indexes. When adding a TTL to an existing collection, if you have a large number of documents that will qualify for deletion you&apos;ll need to consider the strain this will place on the DB. In order to launch, we began by setting the <code>expireAfterSeconds</code> value high enough that only documents created far in the past would qualify, and periodically lowered it until we reached our desired TTL (<a href="https://docs.mongodb.com/manual/reference/command/collMod/" rel="noopener noreferrer">collMod to update index</a>). This minimized the number of records being deleted at once and distributed the deletions over time.</p><figure class="kg-card kg-code-card"><pre><code class="language-javascript">// Create the initial index
db.reportCollection.createIndex(
  {
    created_at: 1
  },
  {
    expireAfterSeconds: 60 * 60 * 24 * [STARTING_NUM_DAYS]
  }
)
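
// Note: Mongo&apos;s TTL monitor runs roughly once every 60 seconds, so expired
// documents are removed shortly after they qualify, not at the exact moment
// created_at passes the threshold.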

// Update the TTL
db.runCommand({
  collMod: &apos;reportCollection&apos;,
  index: {
    keyPattern: { created_at: 1 },
    expireAfterSeconds: 60 * 60 * 24 * [NUM_DAYS]
  }
})</code></pre><figcaption>TTL Index</figcaption></figure><p>After <code>NUM_DAYS</code>, based on <code>created_at</code>, the document will be deleted. To ensure proper backup we sync records via Stitch into Snowflake for long-term storage. The collection now has a manageable number of documents that we can report on. To further streamline the process and ensure Mongo doesn&apos;t get overloaded, we generate aggregates on a cron and store the results in another collection.</p><p>Working with the data and analytics team we determined what the smallest granularity of reporting would be and wrote a Mongo aggregation based on their parameters. On our end, we store the results and create records for each level of granularity to cut down on the amount of processing needed when generating reports.</p><figure class="kg-card kg-code-card"><pre><code class="language-javascript">// Aggregation
db.collection.aggregate([
  {
    $match: {
      field: &apos;foo&apos;
    }
  },
  {
    $group: {
      _id: {
        topLevelGranularity: &apos;$topLevel&apos;,
        midLevelGranularity: &apos;$midLevel&apos;,
        baseLevelGranularity: &apos;$baseLevel&apos;
      },
      sum: {
        $sum: 1
      }
    }
  }
])
                        
// Records Examples
{
  aggregation: &apos;baseLevel&apos;,
  baseLevel: &apos;blog&apos;,
  midLevel: &apos;sports&apos;,
  topLevel: &apos;barstool&apos;,
  sum: 1
}
{
  aggregation: &apos;baseLevel&apos;,
  baseLevel: &apos;video&apos;,
  midLevel: &apos;sports&apos;,
  topLevel: &apos;barstool&apos;,
  sum: 1
}
{
  aggregation: &apos;midLevel&apos;,
  midLevel: &apos;sports&apos;,
  topLevel: &apos;barstool&apos;,
  sum: 2
}
{
  aggregation: &apos;topLevel&apos;,
  topLevel: &apos;barstool&apos;,
  sum: 2
}</code></pre><figcaption>Aggregate Query</figcaption></figure><p>The above is a very basic aggregation example but can be expanded to have as many pipelines as required. Additionally, add indexes on the aggregated data to make querying data for reports performant. The front-end team uses our API to collect and present this data in various charts and graphs. We have a roadmap to further offload these reports using a few services and will post about the outcome.</p>]]></content:encoded></item><item><title><![CDATA[Better Sold Out Variant Styling On Shopify Dawn Theme]]></title><description><![CDATA[This blog will guide you through building a better UI/UX for showing sold out variants without disabling them on Shopify's Dawn theme.]]></description><link>https://barstool.engineering/shopify-dawn-soldout-variant-styling/</link><guid isPermaLink="false">61a8dee74b8e43170fa898a5</guid><dc:creator><![CDATA[Joseph Bona]]></dc:creator><pubDate>Fri, 03 Dec 2021 16:20:01 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/12/Screen-Shot-2021-12-03-at-9.11.58-AM.png" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/12/Screen-Shot-2021-12-03-at-9.11.58-AM.png" alt="Better Sold Out Variant Styling On Shopify Dawn Theme"><p>Here at Barstool we recently launched a new Shopify store for the podcast brand Call Her Daddy. We chose the Shopify built Dawn theme as a starting point for this new project and it has performed very well for us through our busy holiday season. </p><p>When building out the theme for our use case we have made some customizations like <a href="https://barstool.engineering/product-image-slider-for-shopify-dawn-theme/">creating a more traditional product image gallery for the Dawn theme</a>, and building a better way to display sold out variants for a product without disabling selection of these variants. 
This blog will guide you through completing the latter.</p><p>Our inventory changes quite frequently, especially during the holidays so we use a CTA to sign up for a back in stock email when someone is viewing an out of stock variant. Out of the box the Dawn theme disables the add to cart button and adds a subtle &apos;Sold out&apos; label. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://barstool.engineering/content/images/2021/12/Screen-Shot-2021-12-03-at-10.12.43-AM.png" class="kg-image" alt="Better Sold Out Variant Styling On Shopify Dawn Theme" loading="lazy" width="1106" height="956" srcset="https://barstool.engineering/content/images/size/w600/2021/12/Screen-Shot-2021-12-03-at-10.12.43-AM.png 600w, https://barstool.engineering/content/images/size/w1000/2021/12/Screen-Shot-2021-12-03-at-10.12.43-AM.png 1000w, https://barstool.engineering/content/images/2021/12/Screen-Shot-2021-12-03-at-10.12.43-AM.png 1106w" sizes="(min-width: 720px) 720px"><figcaption>Default sold out variant UI on the Dawn Theme (https://dawn-theme-default.myshopify.com/products/thelma-sandal)</figcaption></figure><p>What we prefer is to show the user what variants are out of stock and keep them clickable so they can open the back in stock email signup form.</p><h3 id="edit-the-js-for-the-%60variantradios%60-class">Edit the JS for the `VariantRadios` class:</h3><figure class="kg-card kg-code-card"><pre><code class="language-javascript">class VariantRadios extends VariantSelects {
  constructor() {
    super();
    // Trigger change when loaded
    this.onVariantChange()
  }

  // Overwrite updateOptions method to check for unavailable variants
  updateOptions() {
    const fieldsets = Array.from(this.querySelectorAll(&apos;fieldset&apos;));
    this.options = fieldsets.map((fieldset) =&gt; {
      return Array.from(fieldset.querySelectorAll(&apos;input&apos;)).find((radio) =&gt; radio.checked).value;
    });
    const possibleVariants = this.getVariantData().filter(variant =&gt; variant.option1 === this.options[0])
    for (let index = 0; index &lt; possibleVariants.length; index++) {
      const variant = possibleVariants[index]
      const input = document.querySelector(`[value=&quot;${variant.option2}&quot;]`)
      if (!variant.available) {
        input.classList.add(&apos;unavailable&apos;)
      } else {
        input.classList.remove(&apos;unavailable&apos;)
      }
    }
  }
}

customElements.define(&apos;variant-radios&apos;, VariantRadios);
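
// Caveat (not in the original post): the [value=&quot;...&quot;] lookup above matches
// the first element on the page with that value attribute. If other inputs on
// the page share these values, scope the query to this component (e.g.
// this.querySelector) and null-check the result before toggling classes.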
</code></pre><figcaption>Editing VariantRadios class in /assets/global.js</figcaption></figure><p>Here we are editing the <code>updateOptions</code> class method to check for unavailable variants. For example if we have a product with size and color options we want to check all colors for size: small and add a classname to any that are unavailable. We get all variants with <code>this.getVariantData()</code> and filter those results to find variants with the first selected option, in this example the selected size option. </p><p>Once we have those variants we simply check if <code>variant.available</code> is falsy and add/remove the <code>.unavailable</code> class accordingly. One thing to note is that we invoke this method in the <code>constructor</code> so this check is done when the page loads and then on each subsequent selection.</p><h3 id="adding-the-css-to-style-the-radio-buttons">Adding the CSS to style the radio buttons:</h3><figure class="kg-card kg-code-card"><pre><code class="language-css">.product-form__input input[type=&apos;radio&apos;]:disabled + label,
.product-form__input input[type=&apos;radio&apos;].unavailable + label {
  border-color: rgba(var(--color-foreground), 0.3);
  color: rgba(var(--color-foreground), 0.4);
  text-decoration: line-through;
}
.product-form__input input[type=&apos;radio&apos;].unavailable:checked + label {
  color: rgb(var(--color-background));
}</code></pre><figcaption>Add styles to assets/section-main-product.css</figcaption></figure><p>We add some CSS to designate unavailable variants with gray strikethrough text.</p><h3 id="the-finished-experience">The finished experience:</h3><figure class="kg-card kg-image-card"><img src="https://barstool.engineering/content/images/2021/12/recording.gif" class="kg-image" alt="Better Sold Out Variant Styling On Shopify Dawn Theme" loading="lazy" width="3718" height="2082"></figure><p>We are hiring a full-time Shopify Engineer to join our team. If you have experience creating custom Shopify experiences like this one please apply.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://jobs.lever.co/barstoolsports/fe2112ea-3509-420b-89d6-03ba660b7ebc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Barstool Sports - Shopify Engineer</div><div class="kg-bookmark-description">Barstool Sports is hiring a Shopify Engineer for Barstool Sports located in New York City. You will be expected to create and manage Shopify frontend features/improvements across our multiple storefronts. 
We are looking for someone with a UX background who is accustomed to working on Shopify Plus st&#x2026;</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://jobs.lever.co/favicon.ico" alt="Better Sold Out Variant Styling On Shopify Dawn Theme"><span class="kg-bookmark-author">Barstool Sports logo</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://lever-client-logos.s3-us-west-2.amazonaws.com/207e3c94-114a-4305-bf0b-4e491b29cbd9-1596042915597.png" alt="Better Sold Out Variant Styling On Shopify Dawn Theme"></div></a></figure>]]></content:encoded></item><item><title><![CDATA[Creating an SNS Fanout in Serverless]]></title><description><![CDATA[Build out a complete SNS fanout in serverless.yml]]></description><link>https://barstool.engineering/creating-an-sns-fanout-in-serverless/</link><guid isPermaLink="false">6181530b4b8e43170fa8967b</guid><category><![CDATA[serverless]]></category><category><![CDATA[aws]]></category><dc:creator><![CDATA[Markham F Rollins IV]]></dc:creator><pubDate>Tue, 16 Nov 2021 15:51:06 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/11/22549ca46444.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/11/22549ca46444.jpeg" alt="Creating an SNS Fanout in Serverless"><p>Previously we&apos;ve covered what the Barstool Queue Engine is and how we use it for long-running services. Our next problem was how to handle alerting our various services that those jobs had finished or errored. As we rely heavily on the Serverless framework, we implemented this principle using their tools. The idea of an SNS fanout is to send out messages to a single SNS topic that any SQS handler could listen to and interpret. To begin, we need a new topic, <code>MyOutputTopic</code>, created as follows that utilizes a multi-stage deployment:</p><figure class="kg-card kg-code-card"><pre><code class="language-yml">resources:
    Resources:
        MyOutputTopic:
            Type: &apos;AWS::SNS::Topic&apos;
            Properties:
                TopicName: MyOutput-${opt:stage}</code></pre><figcaption>service1/serverless.yml</figcaption></figure><p>Since our services are in separate repositories, we also need to make sure the ARN of that topic is accessible outside of this single serverless deployment. Serverless provides an <code>Output</code> implementation to direct our stack to export values:</p><figure class="kg-card kg-code-card"><pre><code class="language-yml">resources:
    Outputs:
        MyOutputTopic:
            Value: !Ref MyOutputTopic
            Export:
                Name: MyOutputTopic-${opt:stage}</code></pre><figcaption>service1/serverless.yml</figcaption></figure><p>In our other repositories, we built out SQS queues in Serverless that subscribe to a singular topic. We had to create access policies so that we had permission to read from the topic:</p><figure class="kg-card kg-code-card"><pre><code class="language-yml">resources:
    Resources:
        # Create a basic queue
        Service2Queue:
            Type: &apos;AWS::SQS::Queue&apos;
            Properties:
                QueueName: Service2-${opt:stage}
                VisibilityTimeout: 60

        # Create an SNS subscription for the queue above, importing the topic ARN exported by service1
        Service2Subscription:
            Type: &apos;AWS::SNS::Subscription&apos;
            Properties:
                TopicArn:
                    Fn::ImportValue: MyOutputTopic-${opt:stage}
                Endpoint:
                    Fn::GetAtt: [Service2Queue, Arn]
                Protocol: sqs
                RawMessageDelivery: &apos;true&apos;

        # Provide the proper permissions for Service2Queue to receive messages from MyOutputTopic-${opt:stage}
        Service2QueuePolicy:
            Type: &apos;AWS::SQS::QueuePolicy&apos;
            Properties:
                PolicyDocument:
                    Version: &apos;2012-10-17&apos;
                    Statement:
                        - Effect: Allow
                          Principal: &apos;*&apos;
                          Action: SQS:SendMessage
                          Resource:
                              - Fn::GetAtt: [Service2Queue, Arn]
                          Condition:
                              ArnEquals:
                                  AWS:SourceArn:
                                      Fn::ImportValue: MyOutputTopic-${opt:stage}
                Queues:
                    - !Ref Service2Queue</code></pre><figcaption>service2/serverless.yml</figcaption></figure><p>Now we have an SQS queue <code>Service2Queue</code> that is listening to all messages sent to the <code>MyOutputTopic-${opt:stage}</code> topic. This would have been enough if we truly cared about receiving everything, but we don&apos;t, so as an additional step we employed SNS subscription filter policies to limit what each queue would receive. When we send messages to BQE we include the following snippet as part of the message:</p><figure class="kg-card kg-code-card"><pre><code class="language-json">MessageAttributes: {
    filterKey1: {
        DataType: &apos;String&apos;,
        StringValue: `${output.filterKey1}`
    },
    filterKey2: {
        DataType: &apos;String.Array&apos;,
        StringValue: JSON.stringify(output.filterKey2)
    }
}</code></pre><figcaption>SNS Message</figcaption></figure><p>Once the above exists on a message, the Serverless definition for the subscription can be updated to include a <code>FilterPolicy</code> that limits invocations to only the messages the queue needs to handle:</p><figure class="kg-card kg-code-card"><pre><code class="language-yml">resources:
    Resources:
        Service2Subscription:
            Type: &apos;AWS::SNS::Subscription&apos;
            Properties:
                TopicArn:
                    Fn::ImportValue: MyOutputTopic-${opt:stage}
                Endpoint:
                    Fn::GetAtt: [Service2Queue, Arn]
                Protocol: sqs
                RawMessageDelivery: &apos;true&apos;
                FilterPolicy:
                    filterKey1: [&apos;foo&apos;]
                    filterKey2: [&apos;bar&apos;]
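                # Note: the FilterPolicy is evaluated against MessageAttributes,
                # not the message body. This subscription fires only when
                # filterKey1 equals &apos;foo&apos; and the filterKey2 array contains
                # &apos;bar&apos;; every other message on the topic is skipped.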
</code></pre><figcaption>service2/serverless.yml</figcaption></figure><p>The last piece of code to drive this home is to hook the subscription into a lambda function to process the messages:</p><pre><code class="language-yml">functions:
  MyOutputTopicQueue:
    handler: handlers/sqs.myOutputTopic
    events:
      - sqs:
          arn:
            Fn::GetAtt: [Service2Queue, Arn]
          batchSize: 10</code></pre><p>SNS fanouts have proven incredibly helpful for our needs. I&apos;ve put all this <code>yml</code> together in a <a href="https://gist.github.com/mrollinsiv/dfa22ccbf415afe3e70911f895f0ea61">Gist</a> for easy reference. To learn more about SNS, Amazon has additional information <a href="https://docs.aws.amazon.com/sns/latest/dg/sns-common-scenarios.html">here</a>.</p>]]></content:encoded></item><item><title><![CDATA[Getting Started: Convert a React Project to TypeScript]]></title><description><![CDATA[<p>As a superset of JavaScript, Typescript can work in conjunction with JavaScript, importing Typescript code into a JavaScript file and vice versa. This means that migrating to TypeScript can be done incrementally, far different from converting a codebase from one programming language to something unrelated.</p><p>However, it can be daunting</p>]]></description><link>https://barstool.engineering/converting-a-react-project-to-typescript/</link><guid isPermaLink="false">616714d57e644a15ce08b5bd</guid><dc:creator><![CDATA[Gabriel Zarate]]></dc:creator><pubDate>Fri, 22 Oct 2021 16:50:17 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/10/ezgif.com-gif-maker--1-.gif" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/10/ezgif.com-gif-maker--1-.gif" alt="Getting Started: Convert a React Project to TypeScript"><p>As a superset of JavaScript, Typescript can work in conjunction with JavaScript, importing Typescript code into a JavaScript file and vice versa. 
This means that migrating to TypeScript can be done incrementally, far different from converting a codebase from one programming language to something unrelated.</p><p>However, it can be daunting to convert a large codebase to TypeScript without creating days or sometimes weeks of work.</p><p>The best way to embark on this transition is to make small, incremental changes to your codebase and quickly get those changes into <code>main</code>. You don&apos;t want to have some long-running TypeScript branch while other members of your team are making changes upstream; that is setting yourself up for more work and a merge-conflict nightmare. Making focused efforts to push forward basic conversions to TypeScript will save a lot of time. Get the codebase converted to TypeScript first; you can perfect the types afterward.</p><h2 id="get-the-compiler-running">Get the Compiler Running</h2><p>In this first step, set up the TypeScript compiler with the most permissive settings. This is not the time to enable <code>strict</code> mode. In this first phase, disable <code>noImplicitAny</code> and rename all your files from <code>.js</code> to <code>.ts</code> (use this bash <a href="https://gist.github.com/afternoon/9022899">script</a>).</p><pre><code class="language-js">{ 
   &quot;compilerOptions&quot;:{
      &quot;baseUrl&quot;: &quot;src&quot;,
      &quot;target&quot;:&quot;es5&quot;,
      &quot;allowJs&quot;: true,
      &quot;skipLibCheck&quot;: true,
      &quot;noImplicitAny&quot;: false,
      &quot;moduleResolution&quot;: &quot;node&quot;,
      &quot;module&quot;: &quot;esnext&quot;,
      &quot;jsx&quot;: &quot;preserve&quot;,
      &quot;strict&quot;: false,
   },
   &quot;include&quot;:[ 
      &quot;src/**/*&quot;
   ],
   &quot;exclude&quot;:[ 
      &quot;node_modules&quot;
   ],
}</code></pre><p>At this stage, only fix what causes TypeScript compiler errors, being careful to avoid functionality changes to the codebase.</p><p>Depending on your application, many of the errors you will find at this point involve defining which function parameters are required or optional, typing onChange event handlers, typing React component props, etc.</p><pre><code>Property &apos;children&apos; is missing in type &apos;{ title: string; items: any[]; secondary: any; small: any; }&apos; but required in type &apos;Pick&lt;Pick&lt;{ items?: any[]; title?: string; showTitle?: boolean; children: any; minWidth?: string; placement?: string; ignoreBoundary?: boolean; primary: any; secondary?: boolean; small?: boolean; style: any; }, &quot;items&quot; | ... 6 more ... | &quot;small&quot;&gt; &amp; Pick&lt;...&gt; &amp; Pick&lt;...&gt;, &quot;items&quot; | ... 3 more ... | &quot;tertiary&quot;&gt;&apos;.ts(2741)</code></pre><p>Don&apos;t shy away from using the explicit <code>any</code> type at this point. You can add more meaningful types later.</p><pre><code>const App = ({ props }: any) =&gt; &lt;div&gt;{props.message}&lt;/div&gt;;</code></pre><h2 id="disable-noimplicitany">Enable noImplicitAny</h2><p>Next, set <code>noImplicitAny</code> to true. Your goal for this step is to provide more meaningful types where you can, or add an explicit <code>any</code>.</p><p>The compiler will no longer fall back to an implicit <code>any</code> in your components and functions.</p><pre><code class="language-js">function fetchData(arg) {
	return fetch(arg)
}
// Error: arg has an implicit &apos;any&apos; type

function fetchData(arg: string) {
    return fetch(arg)
}</code></pre><p>Depending on your <code>skipLibCheck</code> setting, you may need to import types for your dependencies at this stage as well.</p><h2 id="enable-strict-mode">Enable Strict Mode</h2><p>This last phase will most likely need to happen incrementally. Each team and application will have different needs as far as how <a href="https://www.typescriptlang.org/tsconfig">strict</a> you set your compiler.</p><pre><code>{ 
   &quot;compilerOptions&quot;:{
     &quot;strict&quot;: true,
     &quot;noUnusedLocals&quot;: true,
     &quot;noUnusedParameters&quot;: true,
     &quot;noImplicitReturns&quot;: true,
     &quot;forceConsistentCasingInFileNames&quot;: true
   },
   
}</code></pre><p>You can also extend <code>@typescript-eslint/parser</code> for type specific linting.</p><p>In conclusion, the TypeScript compiler is your friend. It has not always felt that way when I am in the middle of converting large amounts of code to TypeScript in the past, but I certainly miss it when I am working in a vanilla JS codebase. Also, the TypeScript documentation is a great resource. You can find helpful <a href="https://www.typescriptlang.org/docs/handbook/migrating-from-javascript.html">guides</a> to assist you in migrating to TypeScript or to learn the language for the first time.</p>]]></content:encoded></item><item><title><![CDATA[Using ACRCloud to Identify Copyrighted Audio Content]]></title><description><![CDATA[<p>Consumer applications like Shazam have been around for years, but only recently has using Automated Content Recognition (ACR) tools in enterprise software been easy and affordable. &#xA0;As a major podcast producer, the team at Barstool has to verify that any new podcast episodes we upload don&#x2019;t contain</p>]]></description><link>https://barstool.engineering/using-acr-cloud-to-identify-copyrighted-content-in-audio-files/</link><guid isPermaLink="false">6103175c7e644a15ce08b304</guid><dc:creator><![CDATA[Nick Booth]]></dc:creator><pubDate>Fri, 06 Aug 2021 21:32:00 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/07/11d51244.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/07/11d51244.jpeg" alt="Using ACRCloud to Identify Copyrighted Audio Content"><p>Consumer applications like Shazam have been around for years, but only recently has using Automated Content Recognition (ACR) tools in enterprise software been easy and affordable. &#xA0;As a major podcast producer, the team at Barstool has to verify that any new podcast episodes we upload don&#x2019;t contain copyrighted content that we aren&#x2019;t licensed to use prior to publishing. 
Up until recently, this process was manual and laborious. The engineering team set out to fix that.</p><p>The goal was to create an automated system to identify any copyrighted audio at upload-time. We could then compare the identified content with a whitelist of labels that we license for use in podcasts, and display a simple UI so producers can verify at-a-glance that we have rights to use all of the content prior to publishing. If a section of audio comes back as containing unlicensed content, the producer would then recut the episode with alternate content in its place.</p><p>After evaluating several vendors, we chose to go with <a href="https://www.acrcloud.com/">ACRCloud</a>. They&apos;re a relatively new startup, but have made a name for themselves in their short time in the industry. Their music identification service allows you to upload an audio or <a href="https://en.wikipedia.org/wiki/Acoustic_fingerprint">fingerprint</a> file for identification. While uploading a full podcast audio file was tempting, in practice the transit and processing time was too long for our purposes.</p><p>ACRCloud provides a number of tools and SDKs on their GitHub to interface with their APIs. To prove out the concept we used the <a href="https://github.com/acrcloud/acrcloud_scan_files_python3">Python Scan Tool</a>. This generates fingerprint files for every 10-second segment of the input file, then uploads them to the ACRCloud API. The output is parsed as either JSON or CSV locally. While we wouldn&apos;t use this tool directly in our infrastructure, it helped us to understand &amp; prove out the process. After vetting the data, we were ready to replace the Python script with a direct integration in our CMS.</p><p>Under the hood, the Python script uses the ACRCloud <a href="https://github.com/acrcloud/acrcloud_extr_tools">Extractor Tool</a> to generate fingerprint files. 
The fingerprint files are generated for a specific span of time based on the CLI input. &#xA0;When each fingerprint file was ready, a request was made to ACRCloud&apos;s identification API. &#xA0;These endpoints require each upload to be signed to verify the payload integrity, so each request must be accompanied by an HMAC-SHA1 signature. &#xA0;Below is the Node.js code to send the request:</p><figure class="kg-card kg-code-card"><pre><code class="language-javascript">const fs = require(&apos;fs&apos;).promises
const crypto = require(&apos;crypto&apos;)
const FormData = require(&apos;form-data&apos;)
// Assumed HTTP client; any promise-based client with a .json() method (e.g. got) works here
const http = require(&apos;got&apos;)

async function _identifyFingerprintFile({ file, offset, options }) {
  // ACRCloud requires each request to be signed to verify payload integrity
  const data = Buffer.from(await fs.readFile(file))
  let current_date = new Date()
  let timestamp = current_date.getTime() / 1000
  let stringToSign = _buildStringToSign(
    &apos;POST&apos;,
    options.endpoint,
    options.access_key,
    options.data_type,
    options.signature_version,
    timestamp
  )
  //Creating signature for the request
  let signature = crypto.createHmac(&apos;sha1&apos;, options.access_secret)
      .update(Buffer.from(stringToSign, &apos;utf-8&apos;))
      .digest(&apos;base64&apos;)
  let form = new FormData()
  form.append(&apos;sample&apos;, data)
  form.append(&apos;sample_bytes&apos;, data.length)
  form.append(&apos;access_key&apos;, options.access_key)
  form.append(&apos;data_type&apos;, options.data_type)
  form.append(&apos;signature_version&apos;, options.signature_version)
  form.append(&apos;signature&apos;, signature)
  form.append(&apos;timestamp&apos;, timestamp)
  const body = await http
    .post(&apos;https://&apos; + options.host + options.endpoint, {
      body: form
    })
    .json()
  return { file, offset, result: body }
}
</code></pre><figcaption>Uploading audio fingerprint to ACRCloud</figcaption></figure><p>Our first performance improvement was to asynchronously upload the fingerprints for identification in parallel rather than one at a time. This reduced the total processing time greatly, while adding comparatively little complexity. This would all live in <a href="https://barstool.engineering/long-jobs-with-the-barstool-queue-engine/">BQE</a> so we aren&apos;t resource constrained per-process and don&apos;t have an upper run-time limit to worry about. &#xA0; </p><p>While the actual identification was performant, we noticed the tool took exponentially longer to generate fingerprints the further into the audio file they were, meaning generating a fingerprint from 0-10 seconds was almost instant, while each subsequent fingerprint would take longer and longer to generate. &#xA0;This is common behavior with applications that require seeking to a specific location in a media file, as the underlying mp3 library has to decode every frame to make sure your seek command is frame-accurate. &#xA0;</p><p>While FFMPEG can have this same issue, there are several ways to make <a href="https://trac.ffmpeg.org/wiki/Seeking">frame-accurate seeking faster</a>. &#xA0;We&#x2019;ve used some of these methods when clipping live video so we were familiar with the problem. &#xA0;To speed up ACRCloud, our solution was to generate mp3 clips using FFmpeg&#x2019;s segment muxer with the -segment_time flag set to 10 seconds. &#xA0;</p><figure class="kg-card kg-code-card"><pre><code class="language-javascript">async createMp3Chunks({ input, eventId }) {
    return new Promise((resolve, reject) =&gt; {
    // _createAudioFfmpegProcess is a helper that creates an ffmpeg 
    //process with a standard bitrate, refresh rate &amp; audio codec
      const ffmpegProcess = _createAudioFfmpegProcess(input, eventId)
      ffmpegProcess
        .addOutputOptions([&apos;-c copy&apos;, &apos;-map 0&apos;, &apos;-segment_time 00:00:10&apos;, &apos;-f segment&apos;, &apos;-reset_timestamps 1&apos;])
        .save(`temp/${eventId}-segment-%04d.mp3`)
        .on(&apos;start&apos;, function (commandLine) {
          console.log(`Segment process spawned Ffmpeg with command: ${commandLine}`)
        })
        .on(&apos;end&apos;, async (event) =&gt; {
          console.log(&apos;event&apos;, event)
          console.log(`${eventId} -  File segmented successfully`)
          //have to read the temp directory for matching files, since
          //ffmpeg wont return output files as part of stdout
          const tempFiles = await fs.readdir(&apos;temp/&apos;)
          resolve(tempFiles.filter((file) =&gt; file.indexOf(`${eventId}-segment-`) &gt; -1))
        })
        .on(&apos;error&apos;, reject)
    })
  }</code></pre><figcaption>Generating Segmented MP3s</figcaption></figure><p>We then could take all of the generated mp3s segments and process them using the ACR extractor tool in parallel. &#xA0;Once we have an array of fingerprint files we had to tie everything together and generate a response with the specific time ranges where copyrighted content was found. &#xA0;This reduced our processing time to ~1 minute for a 1 hour long audio file.</p><p>Once we had a JSON response with all of the licensed content used in the audio file, all that was left to do was return the value back to our Podcast API. &#xA0;Below is a view of that our producers see if an audio file is uploaded with unlicensed content:</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://barstool.engineering/content/images/2021/07/Screen-Shot-2021-07-29-at-5.06.47-PM.png" class="kg-image" alt="Using ACRCloud to Identify Copyrighted Audio Content" loading="lazy" width="1934" height="1260" srcset="https://barstool.engineering/content/images/size/w600/2021/07/Screen-Shot-2021-07-29-at-5.06.47-PM.png 600w, https://barstool.engineering/content/images/size/w1000/2021/07/Screen-Shot-2021-07-29-at-5.06.47-PM.png 1000w, https://barstool.engineering/content/images/size/w1600/2021/07/Screen-Shot-2021-07-29-at-5.06.47-PM.png 1600w, https://barstool.engineering/content/images/2021/07/Screen-Shot-2021-07-29-at-5.06.47-PM.png 1934w" sizes="(min-width: 1200px) 1200px"><figcaption>Example of a Podcast Producer View&#xA0;</figcaption></figure><p></p>]]></content:encoded></item><item><title><![CDATA[Product Image Slider for Shopify Dawn Theme Using Web Components]]></title><description><![CDATA[This blog will guide you through creating a more traditional product image gallery for the Dawn theme.]]></description><link>https://barstool.engineering/product-image-slider-for-shopify-dawn-theme/</link><guid isPermaLink="false">61043a477e644a15ce08b358</guid><dc:creator><![CDATA[Joseph 
Bona]]></dc:creator><pubDate>Fri, 30 Jul 2021 20:46:37 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/07/chd-1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/07/chd-1.jpg" alt="Product Image Slider for Shopify Dawn Theme Using Web Components"><p>Among the multitude of updates announced at Shopify Unite 2021 was the introduction of a new starter theme built by Shopify called Dawn. The theme is a great resource for learning and using Shopify&apos;s Online Store 2.0 and its new features.</p><p>The team at Shopify put a lot of time and effort into the UX and you can read about their approach <a href="https://ux.shopify.com/next-generation-theme-design-5aae94f6d44c">here</a>. While I like a lot of the choices they made when designing the theme, there is one component in particular that may not be best for all stores. On the product detail page, product images are laid out in a grid rather than using a more common image slider gallery.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://barstool.engineering/content/images/2021/07/dawn.jpg" class="kg-image" alt="Product Image Slider for Shopify Dawn Theme Using Web Components" loading="lazy" width="1156" height="1817" srcset="https://barstool.engineering/content/images/size/w600/2021/07/dawn.jpg 600w, https://barstool.engineering/content/images/size/w1000/2021/07/dawn.jpg 1000w, https://barstool.engineering/content/images/2021/07/dawn.jpg 1156w" sizes="(min-width: 720px) 720px"><figcaption>Default product image gallery on Dawn</figcaption></figure><p>This blog will guide you through creating a more traditional product image gallery for the Dawn theme.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://barstool.engineering/content/images/2021/07/chd.jpg" class="kg-image" alt="Product Image Slider for Shopify Dawn Theme Using Web Components" loading="lazy" width="2000" 
height="1421" srcset="https://barstool.engineering/content/images/size/w600/2021/07/chd.jpg 600w, https://barstool.engineering/content/images/size/w1000/2021/07/chd.jpg 1000w, https://barstool.engineering/content/images/size/w1600/2021/07/chd.jpg 1600w, https://barstool.engineering/content/images/size/w2400/2021/07/chd.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>Product image gallery slideshow</figcaption></figure><p>One thing you will notice when browsing the code for the Dawn theme is the decision to use JavaScript sparingly and leverage browser APIs for progressive enhancement. One of these APIs is the <a href="https://developer.mozilla.org/en-US/docs/Web/Web_Components/Using_custom_elements">Web Components API</a> to create custom elements. We will leverage Web Components to create our product image gallery.</p><h3 id="lets-start-by-writing-our-liquid-mockup-for-the-image-slider">Let&apos;s start by writing our liquid mockup for the image slider:</h3><pre><code class="language-html">&lt;product-gallery class=&quot;product-gallery&quot;&gt;
  {%- if product.media.size &gt; 1 -%}
  &lt;ul class=&quot;product-gallery__nav&quot;&gt;
    {%- for media in product.media -%}
      &lt;li class=&quot;product-gallery__nav-item {% if media.id == product.selected_or_first_available_variant.featured_media.id %}product-gallery__nav-item--active{% endif %}&quot; data-media-id=&quot;{{ media.id }}&quot;&gt;
        {% render &apos;product-thumbnail&apos;, media: media %}
      &lt;/li&gt;
    {%- endfor -%}
  &lt;/ul&gt;
  {%- endif -%}
  &lt;div class=&quot;product-gallery__images&quot;&gt;
    {%- for media in product.media -%}
      &lt;div class=&quot;product-gallery__image {% if media.id == product.selected_or_first_available_variant.featured_media.id or product.media.size == 1 %}product-gallery__image--active{% endif %}&quot; data-media-id=&quot;{{ media.id }}&quot;&gt;
        {% render &apos;product-thumbnail&apos;, media: media %}
      &lt;/div&gt;
    {%- endfor -%}
    &lt;button type=&quot;button&quot; class=&quot;slider-button slider-button--prev&quot; name=&quot;previous&quot; aria-label=&quot;{{ &apos;accessibility.previous_slide&apos; | t }}&quot;&gt;{% render &apos;icon-caret&apos; %}&lt;/button&gt;
    &lt;button type=&quot;button&quot; class=&quot;slider-button slider-button--next&quot; name=&quot;next&quot; aria-label=&quot;{{ &apos;accessibility.next_slide&apos; | t }}&quot;&gt;{% render &apos;icon-caret&apos; %}&lt;/button&gt;
  &lt;/div&gt;
&lt;/product-gallery&gt;
</code></pre><p>First we set up our <code>product-gallery</code> custom element. If the product has multiple images we will render the thumbnail navigation element: <code>ul.product-gallery__nav</code>. We then create a <code>div.product-gallery__images</code> to hold the current image being displayed. By default these images will be hidden unless the item is the active image, which is designated with a classname <code>.product-gallery__image--active</code>. We also add navigational buttons for previous and next slide. The <code>product-thumbnail</code> snippet we use for our images is the one that comes with the theme with some minor changes to remove the modal that displays a larger image.</p><h3 id="next-lets-add-some-css">Next let&apos;s add some CSS:</h3><pre><code class="language-css">.product-gallery {
  display: flex;
}
/* Slider buttons are positioned absolutely over the active image */
.product-gallery .slider-button {
  position: absolute;
  top: 50%;
  transform: translateY(-50%);
}
.product-gallery .slider-button:not([disabled]):hover {
  border-color: rgba(var(--color-foreground), 0.3);
}
.product-gallery .slider-button:disabled {
  display: none;
}
.product-gallery .slider-button--prev {
  left: 0;
  border-left-width: 0;
}
.product-gallery .slider-button--next {
  right: 0;
  border-right-width: 0;
}
/* Thumbnail navigation will not exceed the height of the active image and will scroll overflowing elements */
.product-gallery__nav {
  width: 140px;
  list-style: none;
  margin: 0 .5rem 0 0;
  padding: 0;
  height: 100%;
  overflow-y: auto;
  display: none;
}
.product-gallery__nav::-webkit-scrollbar { 
  display: none; 
}
.product-gallery__nav-item {
  display: block;
  cursor: pointer;
}
.product-gallery__nav-item + .product-gallery__nav-item {
  margin-top: .5rem;
}
.product-gallery__nav-item img {
  width: 100%;
  display: block;
}
.product-gallery__images {
  flex-grow: 1;
  height: fit-content;
  position: relative;
}
/* Hide images unless they are the active image */
.product-gallery__image {
  display: none;
}
.product-gallery__image--active {
  display: block;
}
@media screen and (min-width: 750px) {
  .product-gallery__nav {
    display: block;
  }
}
</code></pre><p>Here we are setting up the basic layout for our gallery. Things to note are that the <code>.product-gallery__nav</code> will fill 100% of the height of its parent. The parent <code>.product-gallery</code> will have its height set programmatically to be the height of the active image. This allows the nav to not exceed the height of the image and scroll if it does. I think this is a better use of vertical space than the default image gallery, especially if you care about the viewability of recommendations, reviews or user generated content below the main product. One other note is that the thumbnail navigation is hidden on mobile. I don&apos;t think it adds anything to the mobile experience and adds more images for the user to download. Our navigational elements do a good job of letting the user know there are more images without cluttering the UI.</p><h3 id="finally-we-create-our-web-component-for-the-slider">Finally we create our Web Component for the slider:</h3><pre><code class="language-javascript">class ProductGallery extends HTMLElement {
  constructor() {
    super();
    this.init()

    // Add resize observer to update container height
    const resizeObserver = new ResizeObserver(entries =&gt; this.update());
    resizeObserver.observe(this);

    // Bind event listeners
    this.navItems.forEach(item =&gt; item.addEventListener(&apos;click&apos;, this.onNavItemClick.bind(this)))
    this.prevButton.addEventListener(&apos;click&apos;, this.onButtonClick.bind(this));
    this.nextButton.addEventListener(&apos;click&apos;, this.onButtonClick.bind(this));
    // Listen for variant selection change to make current variant image active
    window.addEventListener(&apos;message&apos;, this.onVariantChange.bind(this))
  }

  init() {
    // Set up our DOM element variables
    this.imagesContainer = this.querySelector(&apos;.product-gallery__images&apos;);
    this.navItems = this.querySelectorAll(&apos;.product-gallery__nav-item&apos;);
    this.images = this.querySelectorAll(&apos;.product-gallery__image&apos;);
    this.prevButton = this.querySelector(&apos;button[name=&quot;previous&quot;]&apos;);
    this.nextButton = this.querySelector(&apos;button[name=&quot;next&quot;]&apos;);
    // If there is no active images set the first image to active
    if (this.findCurrentIndex() === -1) {
      this.setCurrentImage(this.images[0])
    }
  }

  onVariantChange(event) {
    if (!event.data || event.data.type !== &apos;variant_changed&apos;) return 
    const currentImage = Array.from(this.images).find(item =&gt; item.dataset.mediaId == event.data.variant.featured_media.id)
    if (currentImage) {
      this.setCurrentImage(currentImage)
    }
  }

  onNavItemClick(event) {
    const mediaId = event.target.closest(&apos;li&apos;).dataset.mediaId
    this.images.forEach(item =&gt; item.classList.remove(&apos;product-gallery__image--active&apos;))
    this.setCurrentImage(Array.from(this.images).find(item =&gt; item.dataset.mediaId === mediaId))
  }

  update() {
    this.style.height = `${this.imagesContainer.offsetHeight}px`
    this.prevButton.removeAttribute(&apos;disabled&apos;)
    this.nextButton.removeAttribute(&apos;disabled&apos;)
    if (this.findCurrentIndex() === 0) this.prevButton.setAttribute(&apos;disabled&apos;, true)
    if (this.findCurrentIndex() === this.images.length - 1) this.nextButton.setAttribute(&apos;disabled&apos;, true)
  }

  setCurrentImage(elem) {
    this.images.forEach(item =&gt; item.classList.remove(&apos;product-gallery__image--active&apos;))
    elem.classList.add(&apos;product-gallery__image--active&apos;)
    this.update()
  }

  findCurrentIndex() {
    return Array.from(this.images).findIndex(item =&gt; item.classList.contains(&apos;product-gallery__image--active&apos;))
  }

  onButtonClick(event) {
    event.preventDefault();
    let index = this.findCurrentIndex()
    if (event.currentTarget.name === &apos;next&apos;) {
      index++
    } else {
      index--
    }
    this.setCurrentImage(this.images[index])
  }
}

customElements.define(&apos;product-gallery&apos;, ProductGallery);
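The component wraps `this.images` in `Array.from` before calling `find` or `findIndex` because `querySelectorAll` returns an array-like `NodeList`, not a true array. A quick illustration of why, using a plain array-like object in place of a real `NodeList` so it runs outside the browser:

```javascript
// An array-like object: indexed entries plus a length, but none of the Array methods
const nodeListLike = { 0: 'img-a', 1: 'img-b', 2: 'img-c', length: 3 }

console.log(typeof nodeListLike.findIndex) // 'undefined' — not available here

// Array.from copies the entries into a real array with the full Array API
const images = Array.from(nodeListLike)
console.log(images.findIndex((id) => id === 'img-b')) // 1
```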
</code></pre><p>This is the bulk of the functionality for our gallery. We create our web component by extending the HTMLElement class. In our constructor we set up variables for our DOM elements and bind event listeners to the component. We rely on the data attributes set in our liquid to reference which thumbnails belong to which images to help with the <code>onNavItemClick</code> method as well as our <code>onVariantChange</code> callback.</p><p>Another caveat is using <code>Array.from(this.images)</code>. The images are stored in a variable using <code>querySelectorAll</code>. This function returns a <code>NodeList</code> which is array-like but not an array. Our component uses array methods to do some of the heavy lifting so it&apos;s important to create an array from the <code>NodeList</code> and not use the <code>NodeList</code> directly.</p><h3 id="emit-an-event-when-a-variant-selection-is-made">Emit an event when a variant selection is made:</h3><p>We want to update our slider so when a variant is chosen the active image is that variant&apos;s image. To do this we will add some code to the <code>VariantSelects</code> class in <code>assets/global.js</code>.</p><pre><code class="language-javascript">onVariantChange() {
  this.updateOptions();
  this.updateMasterId();
  this.toggleAddButton(true, &apos;&apos;, false);
  this.updatePickupAvailability();

  if (!this.currentVariant) {
    this.toggleAddButton(true, &apos;&apos;, true);
    this.setUnavailable();
  } else {
    this.updateMedia();
    this.updateURL();
    this.updateVariantInput();
    this.renderProductInfo();
  }
  // When variant is changed post a message with the variant&apos;s data
  window.postMessage({
    type: &apos;variant_changed&apos;,
    variant: this.currentVariant
  }, &apos;*&apos;)
}
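For clarity, this is the contract between the two components: the gallery's `onVariantChange` handler shown earlier ignores any message whose `type` isn't `variant_changed`. A runnable sketch of that guard (the payload values are made up for illustration):

```javascript
// Mirrors the guard at the top of the gallery's onVariantChange handler:
// bail out unless the message data exists and is a variant_changed event
function shouldHandleMessage(data) {
  return Boolean(data && data.type === 'variant_changed')
}

// Hypothetical payloads; the featured_media id is illustrative, not a real value
const variantMessage = { type: 'variant_changed', variant: { featured_media: { id: 123 } } }
const unrelatedMessage = { type: 'cart_updated' }

console.log(shouldHandleMessage(variantMessage)) // true
console.log(shouldHandleMessage(unrelatedMessage)) // false
console.log(shouldHandleMessage(null)) // false
```

Because the message is posted with a `'*'` target origin, anything else on the page can also see it; filtering on `type` keeps unrelated `message` events from reaching the gallery logic.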
</code></pre><p>Using Shopify&apos;s new Dawn theme is a great way to see how Shopify thinks about theme development in 2021. When making changes for your store you should follow the patterns and conventions they are using but that doesn&apos;t mean you can&apos;t add your own features. Hopefully this blog helps get you started on that path by showing how to use a product image slider over their product gallery grid on the Dawn theme.</p><p>We are hiring a full-time Shopify Engineer to join our team. If you have experience creating custom Shopify experiences like this one please apply.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://jobs.lever.co/barstoolsports/fe2112ea-3509-420b-89d6-03ba660b7ebc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Barstool Sports - Shopify Engineer</div><div class="kg-bookmark-description">Barstool Sports is hiring a Shopify Engineer for Barstool Sports located in New York City. You will be expected to create and manage Shopify frontend features/improvements across our multiple storefronts. 
We are looking for someone with a UX background who is accustomed to working on Shopify Plus st&#x2026;</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://jobs.lever.co/favicon.ico" alt="Product Image Slider for Shopify Dawn Theme Using Web Components"><span class="kg-bookmark-author">Barstool Sports logo</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://lever-client-logos.s3-us-west-2.amazonaws.com/207e3c94-114a-4305-bf0b-4e491b29cbd9-1596042915597.png" alt="Product Image Slider for Shopify Dawn Theme Using Web Components"></div></a></figure>]]></content:encoded></item><item><title><![CDATA[Combining Sequential Streams with Node.js and Express]]></title><description><![CDATA[Asynchronously streaming bytes of data was perhaps the single most important feature of Node.js when it launched in 2009.]]></description><link>https://barstool.engineering/combining-sequential-streams-with-node-js/</link><guid isPermaLink="false">60dde5c67e644a15ce08b1bb</guid><category><![CDATA[nodejs]]></category><category><![CDATA[streams]]></category><category><![CDATA[express]]></category><dc:creator><![CDATA[Andrew Barba]]></dc:creator><pubDate>Tue, 13 Jul 2021 16:05:00 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/07/54f5b474-2.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/07/54f5b474-2.jpeg" alt="Combining Sequential Streams with Node.js and Express"><p>Asynchronously streaming bytes of data was perhaps the single most important feature of Node.js when it launched in 2009. Node provided a brilliant API that allowed developers to stream bytes through a pipeline of operations and never block the main thread. 
Streams back everything from HTTP requests and responses to child processes and a lot more.</p><p>The primary benefit of streams is they remove the need to buffer large amounts of data in memory and then perform operations once that data is fully available. With streams you can efficiently operate on data in small chunks and push it through a pipeline where you&apos;re only ever consuming a small fraction of the total data in your currently running process. For example, let&apos;s say you wanted to build a basic proxy server in Node.js for large video files. When a request comes in for a particular file, you do not want the process to download the entire video into memory and then begin returning the video file to the client. Instead, it&apos;s much more efficient to download the file in small chunks and send those chunks to the client as soon as they are received by the Node.js process. Once the chunks are written to the client Node can evict them from memory allowing us to handle many concurrent streams of data at once.</p><p>To implement this HTTP proxy server, let&apos;s start by building a basic async HTTP request function that returns a stream:</p><figure class="kg-card kg-code-card"><pre><code class="language-javascript">const https = require(&apos;https&apos;)

function fetch(src, options = {}) {
  return new Promise((resolve, reject) =&gt; {
    const url = new URL(src)
    // Named reqOptions to avoid shadowing the options parameter; spreading a
    // const into its own initializer throws a ReferenceError (temporal dead zone)
    const reqOptions = {
      hostname: url.hostname,
      port: 443,
      path: url.pathname,
      method: &apos;GET&apos;,
      ...options
    }
    const req = https.request(reqOptions, (res) =&gt; {
      if (res.statusCode &gt;= 300) {
        const error = new Error(res.statusMessage || &apos;Invalid file&apos;)
        reject(error)
        return
      }
      resolve({
        res,
        url: url.href,
        headers: res.headers
      })
    })
    req.on(&apos;error&apos;, reject)
    req.end()
  })
}</code></pre><figcaption>It might not be immediately obvious, but the resolved `res` is a Node.js Readable Stream</figcaption></figure><p>Next, let&apos;s create a small Express app that accepts all traffic and proxies to a pre-defined origin:</p><figure class="kg-card kg-code-card"><pre><code class="language-javascript">const express = require(&apos;express&apos;)
const app = express()
const origin = &apos;https://example.com&apos;

app.get(&apos;/*&apos;, async (req, res) =&gt; {
    const proxy = await fetch(`${origin}${req.path}`)
    res.writeHead(proxy.res.statusCode)
    proxy.res.pipe(res)
})

app.listen(process.env.PORT)</code></pre><figcaption>For a code-complete proxy we should also set the same response headers</figcaption></figure><p>Here&apos;s a breakdown of what this proxy route is doing:</p><ol><li>Issue a request to the defined origin + request path</li><li>Write the same status code to our response</li><li>Pipe the response from the fetch request to the current response stream</li></ol><p>At this point we can efficiently stream a single file from an origin, through our Node.js proxy, and back to the client. To make things more interesting, let&apos;s say we are a video provider that is required to insert ads into the middle of our video streams. In order to accommodate this, the video team saves our video files in chunks, split up depending on where we want to insert ads. For example, let&apos;s say we have a video file that is 1 hour long, and we want to insert ads at 20 min and 40 min into the video. Our video team would save 3 files:</p><ol><li>https://cdn.com/video_0_20.mp4</li><li>https://cdn.com/video_20_40.mp4</li><li>https://cdn.com/video_40_60.mp4</li></ol><p>In addition to this, our advertising team has provided us URLs for video ads:</p><ol><li>https://cdn.com/ad_1.mp4</li><li>https://cdn.com/ad_2.mp4</li></ol><p>Our job is to combine these 5 files into a single video stream. Of course we could have our video team manually edit these files together and store them as one file, but let&apos;s use our knowledge of Node.js streams and combine these files on the fly.</p><p>First let&apos;s gather our video files into an array in the exact order we want to stitch them together:</p><pre><code class="language-javascript">const urls = [
  &apos;https://cdn.com/video_0_20.mp4&apos;,
  &apos;https://cdn.com/ad_1.mp4&apos;,
  &apos;https://cdn.com/video_20_40.mp4&apos;,
  &apos;https://cdn.com/video_40_60.mp4&apos;,
  &apos;https://cdn.com/ad_2.mp4&apos;
]</code></pre><p>Next, let&apos;s issue a request for each video file:</p><pre><code class="language-javascript">const reqs = await Promise.all(urls.map(fetch))
const streams = reqs.map(req =&gt; req.res)</code></pre><p>Now we need to create a new function that can combine our array of streams into a single stream, maintaining the order of the bytes:</p><pre><code class="language-javascript">const { PassThrough } = require(&apos;stream&apos;)

function combineStreams(streams) {
  const stream = new PassThrough()
  _combineStreams(streams, stream).catch((err) =&gt; stream.destroy(err))
  return stream
}

async function _combineStreams(sources, destination) {
  for (const stream of sources) {
    await new Promise((resolve, reject) =&gt; {
      stream.pipe(destination, { end: false })
      stream.on(&apos;end&apos;, resolve)
      stream.on(&apos;error&apos;, reject)
    })
  }
  destination.end()
}</code></pre><p>Notice we need two functions to implement this correctly. The first function creates a new <code>PassThrough</code> stream which acts as a single container for piping the other streams into. The second function is responsible for actually writing the bytes of the source streams into the destination stream. Aside from error handling, the most important thing to remember is that <code>pipe</code> will automatically end the destination stream when the source stream ends, but lucky for us, Node provides an option to disable that behavior allowing us to continue writing bytes to the stream. Finally, once we loop through all of the streams we can manually call <code>end</code> on the destination stream.</p><p>Our complete Express route can now be implemented like so:</p><pre><code class="language-javascript">app.get(&apos;/video-ssai.mp4&apos;, async (req, res) =&gt; {
    const reqs = await Promise.all(urls.map(fetch))
    const streams = reqs.map(req =&gt; req.res)
    const combined = combineStreams(streams)
    res.writeHead(200)
    combined.pipe(res)
})</code></pre><h2 id="wrapping-up">Wrapping Up</h2><p>Combining streams in Node.js is an extremely powerful concept that can be used to accomplish sophisticated data workflows, server-side ad insertion being just one of many. One important feature missing from our Express server is the ability to support Byte Range requests. Byte Range requests are when the client requests a certain range of the file instead of the whole thing. This is a critical feature to support for any streaming media, as you do not want clients consuming more resources than necessary. If you&apos;re interested in how we built this functionality at Barstool feel free to email me at barba@barstoolsports.com or apply to an open position here: <a href="https://www.barstoolsports.com/jobs">https://www.barstoolsports.com/jobs</a></p>]]></content:encoded></item><item><title><![CDATA[Gzip Compression with AWS Lambda and API Gateway HTTP API]]></title><description><![CDATA[In late 2017 we made the decision to go all in on Serverless. Our API's would move off of EC2 and traditional load-balancers and move to AWS Lambda and API Gateway. This led to some dramatic shifts in how we think about our architecture.]]></description><link>https://barstool.engineering/gzip-compression-with-aws-lambda-and-api-gateway-http-api/</link><guid isPermaLink="false">609e9ceb7e644a15ce0895fb</guid><dc:creator><![CDATA[Andrew Barba]]></dc:creator><pubDate>Fri, 18 Jun 2021 20:54:44 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/06/GettyImages-1252346549.daa387fe.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/06/GettyImages-1252346549.daa387fe.jpg" alt="Gzip Compression with AWS Lambda and API Gateway HTTP API"><p>In late 2017 we made the decision to go all in on Serverless. Our APIs would move off of EC2 and traditional load balancers to AWS Lambda and API Gateway. 
This led to some dramatic shifts in how we think about our architecture, and also many nights of frustration dealing with the shortcomings of these cutting-edge services.</p><p>AWS has since addressed many of the issues and gaps around Serverless architecture. Some of our favorites include - much faster startup times for Lambdas in a VPC, 15 minute execution times, and a shiny new version of API Gateway with a focus on cost and performance for HTTP API&apos;s.</p><p>API Gateway HTTP API&apos;s are a new product that provide much lower latency than the traditional API Gateway REST API&apos;s. They also provide a massive cost improvement making Lambda HTTP API&apos;s perfectly viable for production workloads. However, like many new AWS products, they&apos;re missing some key features of the prior version. HTTP compression in the form of gzip and brotli has become the standard way to reduce the number of bytes clients need to download to display a website, render JSON data, etc. Most CDN&apos;s offer gzip and brotli compression out of the box, and ours does as well, but with one major caveat - only for cacheable requests. </p><p>We chose Fastly over CloudFront for its incredible performance and feature set, as well as excellent pricing across the board. But one major limitation with Fastly is its inability to compress non-cacheable content. Before moving to HTTP API&apos;s, we would compress our content with API Gateway. This allowed our Lambdas to simply return a response and let API Gateway handle the rest. However, now that we&apos;ve moved to HTTP API&apos;s, we needed a way to compress our responses in the Lambda runtime directly.</p><p>Let&apos;s start by creating a new file called <code>compression.js</code> and exporting a single function called <code>compress</code>:</p><pre><code class="language-javascript">const zlib = require(&apos;zlib&apos;)

exports.compress = (input, headers) =&gt; {
  ...
}</code></pre><p>Our function needs to take in two parameters: the input string to compress, and the headers object which will allow us to look at the <code>accept-encoding</code> header to figure out which compression algorithm to use:</p><pre><code class="language-javascript">// Parse the acceptable encoding, if any
const acceptEncodingHeader = headers[&apos;accept-encoding&apos;] || &apos;&apos;

// Build a set of acceptable encodings, there could be multiple
const acceptableEncodings = new Set(acceptEncodingHeader.toLowerCase().split(&apos;,&apos;).map(str =&gt; str.trim()))</code></pre><p>Next we need to check certain encodings in priority order, and use that encoding if it&apos;s present in our set. We will check <code>brotli</code>, <code>gzip</code>, then <code>deflate</code>, in that order:</p><pre><code class="language-javascript">// Handle Brotli compression (Only supported in Node v10 and later)
if (acceptableEncodings.has(&apos;br&apos;) &amp;&amp; typeof zlib.brotliCompressSync === &apos;function&apos;) {
  ...
}

// Handle Gzip compression
if (acceptableEncodings.has(&apos;gzip&apos;)) {
   ...
}

// Handle deflate compression
if (acceptableEncodings.has(&apos;deflate&apos;)) {
  ...
}</code></pre><p>Finally, we can call the correct compression method on the <code>zlib</code> module and return the compressed data along with the algorithm we used:</p><pre><code class="language-javascript">// Brotli
return {
  data: zlib.brotliCompressSync(input),
  contentEncoding: &apos;br&apos;
}

// Gzip
return {
  data: zlib.gzipSync(input),
  contentEncoding: &apos;gzip&apos;
}

// Deflate
return {
  data: zlib.deflateSync(input),
  contentEncoding: &apos;deflate&apos;
}

// No Match
return {
  data: input,
  contentEncoding: null
}</code></pre><p>Now that we can successfully compress any input based on the request headers, we can call our compression function right before we return our content in the lambda handler:</p><figure class="kg-card kg-code-card"><pre><code class="language-javascript">exports.handler = async (event, context) =&gt; {
  const res = await handleRequest(event, context)
  
  const { data, contentEncoding } = compression.compress(res.body, event.headers)
  
  return {
    statusCode: res.statusCode,
    body: data.toString(&apos;base64&apos;),
    headers: {
      ...res.headers,
      &apos;content-encoding&apos;: contentEncoding
    },
    isBase64Encoded: true
  }
}</code></pre><figcaption>It&apos;s important to return the correct <code>content-encoding</code> so the client knows how to correctly parse the response.</figcaption></figure><h4 id="wrapping-up">Wrapping Up</h4><p>AWS has already vowed to make the new HTTP APIs feature-complete with the old REST APIs, but they made that promise nearly 2 years ago now and they have yet to add support for compression out of the box. Hopefully this tutorial helps you fill the gap in the meantime.</p>]]></content:encoded></item><item><title><![CDATA[Intro to SWR: improving data-fetching &amp; cache for fast user-interfaces]]></title><description><![CDATA[<p>SWR is a React Hooks library created by Vercel that simplifies data-fetching logic in your application and makes it possible to implement caching and dependent querying. SWR, or &apos;stale-while-revalidate,&apos; returns data from cache first, sends the fetch request, and finally renders up-to-date data. The package is simple, lightweight,</p>]]></description><link>https://barstool.engineering/data-fetching-swr/</link><guid isPermaLink="false">60c76e257e644a15ce08a971</guid><dc:creator><![CDATA[Gabriel Zarate]]></dc:creator><pubDate>Thu, 17 Jun 2021 18:27:59 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/06/175fc572-1.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/06/175fc572-1.jpeg" alt="Intro to SWR: improving data-fetching &amp; cache for fast user-interfaces"><p>SWR is a React Hooks library created by Vercel that simplifies data-fetching logic in your application and makes it possible to implement caching and dependent querying. SWR, or &apos;stale-while-revalidate,&apos; returns data from cache first, sends the fetch request, and finally renders up-to-date data.
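</p><p>The strategy itself is straightforward to picture without any framework. Here is a minimal, hypothetical sketch of the cache-first, then-revalidate flow (an illustration only, not SWR&apos;s actual implementation):</p><pre><code class="language-JavaScript">// Serve whatever is cached immediately, then fetch and re-render.
// (Hypothetical sketch; real SWR adds request deduping, revalidation
// triggers, error handling, and much more.)
const cache = new Map()

async function swrFetch(key, fetcher, render) {
  if (cache.has(key)) render(cache.get(key)) // 1. return stale data from cache
  const fresh = await fetcher(key)           // 2. send the fetch request
  cache.set(key, fresh)                      // 3. update the cache
  render(fresh)                              // 4. render up-to-date data
}</code></pre><p>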
The package is simple, lightweight, and includes features like SSR support, pagination, scroll-position recovery, and revalidation on focus.</p><h2 id="swr-vs-axiosfetch-api">SWR vs Axios/Fetch API</h2><p>Other popular data-fetching strategies such as Axios or Fetch API make the request and return the expected response, nothing more. SWR is a layer built on top of Fetch API that provides features like caching and pagination that would otherwise need to be managed elsewhere in the codebase. As a result, SWR provides a significant advantage for React applications to have fast, reactive user interfaces. A more appropriate library to compare with SWR would be something like <a href="https://react-query.tanstack.com">react-query</a> which supports caching and pre-fetching. </p><h2 id="refactoring-data-fetching-with-swr">Refactoring data-fetching with SWR</h2><p>Earlier this year, the engineering team at Barstool launched a feature for our internal CMS to process audio files via ACR to identify copyrighted content for podcast uploads.</p><p>The client-side of this feature involves multiple successive polling requests as the files are uploaded and processed. With polling coming as a feature out of the box in SWR, we took the opportunity to refactor the data-fetching for this feature to clean things up.</p><pre><code class="language-JavaScript">const pollMedia = async (mediaId) =&gt; {
  interval.current = setInterval(async () =&gt; {
    const response = await mediaApi.findById(mediaId)

    if (response.status === &apos;ready&apos;) {
      clearInterval(interval.current)
      pollCopyright(response.id, response.url)
    }
  }, pollInterval)
}

const pollCopyright = async (mediaId, url) =&gt; {
  interval.current = setInterval(async () =&gt; {
    const response = await podcastApi.processCopyrightClaims(mediaId, url)

    if (response?.status === &apos;ready&apos;) {
      clearInterval(interval.current)
    }
  }, 5000)
}</code></pre><p>These functions poll the <code>copyrights</code> and <code>media</code> endpoints until the API service returns the copyrighted content for the audio file. Let&apos;s refactor with SWR.</p><p>The <code>useSWR</code> hook accepts a <code>key</code> string and a <code>fetcher</code> function.</p><pre><code class="language-JavaScript">const ENDPOINT_KEY = &apos;/podcast-api/admin/media&apos;

const fetcher = url =&gt; fetch(url).then(r =&gt; r.json())

function useEpisodeCopyrightClaims() {
  const { data, error } = useSWR(ENDPOINT_KEY, fetcher)

  return { data, error }
}</code></pre><p>This is the most basic setup, but we need to implement polling and a custom fetcher that returns the data once the audio file has been processed.</p><pre><code class="language-JavaScript">const ENDPOINT_KEY = &apos;/podcast-api/admin/media&apos;

async function fetchEpisodeMedia(url, mediaId) {
	return await mediaApi.findById(mediaId)
}

function useEpisodeCopyrightClaims(mediaObj) {
  const { data, error } = useSWR(
    [ENDPOINT_KEY, mediaObj.id],
    fetchEpisodeMedia,
    { refreshInterval: 5000 }
  )

  return { data, error }
}</code></pre><p>The parameters needed for our <code>fetcher</code> are passed within the array argument to <code>useSWR</code>. To manage when to stop polling, SWR includes an <code>onSuccess</code> handler:</p><pre><code class="language-JavaScript">const ENDPOINT_KEY = &apos;/podcast-api/admin/media&apos;
const interval = 5000

async function fetchEpisodeMedia(url, mediaId) {
	return await mediaApi.findById(mediaId)
}

function useEpisodeCopyrightClaims(mediaObj) {
  const [pollingInterval, setPollingInterval] = useState(interval)

  const { data, error } = useSWR(
    [ENDPOINT_KEY, mediaObj.id], 
    fetchEpisodeMedia, 
    {
      refreshInterval: pollingInterval,
      onSuccess: (data) =&gt; data?.status === &apos;ready&apos; &amp;&amp; setPollingInterval(0)
    }
  )

  return { data, error }
}</code></pre><p><code>pollingInterval</code> gets set to zero once the media is processed, canceling the polling for the request. The last portion of this refactor involves adding the second request for copyright claims processing. SWR supports dependent fetching, which makes this process easy:</p><pre><code class="language-JavaScript">const ENDPOINT_KEY = &apos;/podcast-api/admin/media&apos;
const interval = 5000

async function fetchEpisodeMedia(url, mediaId) {
	return await mediaApi.findById(mediaId)
}

async function fetchEpisodeCopyrightClaims(url, mediaId, mediaUrl) {
	return await podcastApi.processCopyrightClaims(mediaId, mediaUrl)
}

function useEpisodeCopyrightClaims(mediaObj) {
  const [intervalMedia, setIntervalMedia] = useState(interval)
  const [intervalCopyright, setIntervalCopyright] = useState(interval)

  const { data: media, error: mediaError } = useSWR(
    [ENDPOINT_KEY, mediaObj.id],
    fetchEpisodeMedia,
    {
      refreshInterval: intervalMedia,
      onSuccess: (data) =&gt; data?.status === &apos;ready&apos; &amp;&amp; setIntervalMedia(0)
    }
  )

  // the key function throws until `media` is defined, which pauses this request
  const { data: copyrightClaims, error: copyrightError } = useSWR(
    () =&gt; [ENDPOINT_KEY, media.id, media.url],
    fetchEpisodeCopyrightClaims,
    {
      refreshInterval: intervalCopyright,
      onSuccess: (data) =&gt; data?.status === &apos;ready&apos; &amp;&amp; setIntervalCopyright(0)
    }
  )

  return { copyrightClaims, error: mediaError || copyrightError }
}
</code></pre><p>The second request will not run until <code>media.id</code> is defined, which removes the complexity around handling these requests separately and allows us to list the copyright claims response as follows:</p><pre><code class="language-JavaScript">function CopyrightClaimsSummary({ mediaObj }) {
  const { copyrightClaims, error } = useEpisodeCopyrightClaims(mediaObj)

  if (error) return &lt;Error&gt;Something went wrong&lt;/Error&gt;
  if (!copyrightClaims) return &lt;Loading /&gt;

  return (
    &lt;ul&gt;
      {copyrightClaims.map((item) =&gt; (
        &lt;li key={item.id}&gt;
          {item.song.name}
        &lt;/li&gt;
      ))}
    &lt;/ul&gt;
  )
}
</code></pre><p>We are excited about how SWR simplifies data-fetching in our codebase, with all of its useful <a href="https://swr.vercel.app/#features">features</a>, especially how well it fits with Next.js. Thank you, Vercel!</p>]]></content:encoded></item><item><title><![CDATA[Creating a slider component in React]]></title><description><![CDATA[<p>As a digital media company, Barstool Sports is home to a lot of podcasts. In an average week we publish 100+ episodes and while a majority of people listen to these on platforms like Apple Podcasts or Spotify, we&apos;d like to bring those listeners to our site. </p><p>In</p>]]></description><link>https://barstool.engineering/creating-a-slider-component-in-react/</link><guid isPermaLink="false">60cb11987e644a15ce08ad93</guid><dc:creator><![CDATA[Mike Nichols]]></dc:creator><pubDate>Thu, 17 Jun 2021 17:39:43 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/06/e79d58b7.gif" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/06/e79d58b7.gif" alt="Creating a slider component in React"><p>As a digital media company, Barstool Sports is home to a lot of podcasts. In an average week we publish 100+ episodes and while a majority of people listen to these on platforms like Apple Podcasts or Spotify, we&apos;d like to bring those listeners to our site. 
</p><p>In an effort to do this, we updated the audio player component all our sites use to have a more modern look and feel.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://barstool.engineering/content/images/2021/06/new_audio_player.png" class="kg-image" alt="Creating a slider component in React" loading="lazy" width="1664" height="408" srcset="https://barstool.engineering/content/images/size/w600/2021/06/new_audio_player.png 600w, https://barstool.engineering/content/images/size/w1000/2021/06/new_audio_player.png 1000w, https://barstool.engineering/content/images/size/w1600/2021/06/new_audio_player.png 1600w, https://barstool.engineering/content/images/2021/06/new_audio_player.png 1664w" sizes="(min-width: 720px) 720px"><figcaption>Our new audio player component</figcaption></figure><!--kg-card-begin: markdown--><p>At its core it&apos;s the standard <code>audio</code> element with a React UI built on top that uses <code>Refs</code> to control the audio file. Nothing new there. Something that&apos;s somewhat unique though is how we went about presenting the range <code>input</code> elements for progress and volume control.</p>
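<p>Those sliders ultimately drive the underlying audio element through that ref. As a rough, framework-free sketch of the control layer (the names here are hypothetical, not the player&apos;s actual API):</p><pre><code class="language-JavaScript">// Hypothetical controls wrapping a native audio element (e.g. a React ref's
// .current value). The slider values map onto seekTo/setVolume below.
function createPlayerControls(audioEl) {
  return {
    toggle() {
      if (audioEl.paused) audioEl.play()
      else audioEl.pause()
    },
    seekTo(fraction) {
      // fraction is between 0 and 1, like the progress slider's value
      audioEl.currentTime = fraction * audioEl.duration
    },
    setVolume(v) {
      audioEl.volume = Math.min(1, Math.max(0, v))
    }
  }
}</code></pre>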
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://barstool.engineering/content/images/2021/06/Animated-GIF-downsized_large-1.gif" class="kg-image" alt="Creating a slider component in React" loading="lazy" width="480" height="58"><figcaption>Our Slider Component</figcaption></figure><!--kg-card-begin: markdown--><p>A good indication of progress or change needs a clear separator for before and after, something that the default range <code>input</code> doesn&apos;t do too well. I&apos;ve personally always had a strong disdain for the range <code>input</code> as I feel the element is unusable without throwing at least some CSS at it. Even then, trying to style it in a way with two colors has been something that has always stumped me. I was able to find a solution, though, that handles the problem simply. A solution so simple that I didn&apos;t believe it would work.</p>
<!--kg-card-end: markdown--><p>Before I get into that, though, let me quickly go over the logic behind the component.</p><h3 id="logic">Logic</h3><!--kg-card-begin: markdown--><p>The component, named <code>Slider</code>, takes multiple props:</p>
<ul>
<li><strong>colorAfter</strong>/<strong>colorBefore</strong>: the color of the bar before and after the current spot. The thumb of the slider will be the colorBefore.</li>
<li><strong>highlighted</strong>: the color of the thumb and the bar before it on hover</li>
<li><strong>size</strong>: height of the thumb (will grow <code>4px</code> on hover)</li>
<li><strong>value</strong>: the current value. Will be between <code>0</code> and <code>1</code></li>
</ul>
<p>It also accepts any additional <strong>props</strong>, which are passed directly to the <code>input</code> element.</p>
<!--kg-card-end: markdown--><pre><code class="language-JavaScript">const Slider = ({
  colorAfter = &apos;#E1E1E6&apos;,
  colorBefore = &apos;#A5AAB2&apos;,
  highlighted = &apos;#EB3E3E&apos;,
  size = 10,
  value,
  ...props
}) =&gt; {
  const percent = value * 100
  const growTo = size + 4

  const [hover, setHover] = useState(false)

  return (
    &lt;StyledSlider
      type=&apos;range&apos;
      onMouseOver={() =&gt; setHover(true)}
      onMouseLeave={() =&gt; setHover(false)}
      value={value}
      size={size}
      colorAfter={colorAfter}
      colorBefore={colorBefore}
      highlighted={highlighted}
      percent={percent}
      growTo={growTo}
      seeking={hover}
      {...props}
    /&gt;
  )
}</code></pre><!--kg-card-begin: markdown--><p>The <code>input</code> element here, <code>StyledSlider</code>, is actually a <a href="https://styled-components.com">styled-components</a> <code>input</code> element. This doesn&apos;t affect the element other than how the css is applied to it. A cool feature of styled-components though is that you can essentially pass props right to the css. As you&apos;ll see in the next section, most of the props for <code>StyledSlider</code> are solely to affect the look of the Slider.</p>
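<p>Before digging into the styles, here is what a hypothetical call site could look like (the <code>progress</code> value and <code>onSeek</code> handler are made up, and <code>min</code>/<code>max</code>/<code>step</code> reach the underlying <code>input</code> through the spread props):</p><pre><code class="language-JavaScript">&lt;Slider
  value={progress} // between 0 and 1, e.g. currentTime / duration
  size={12}
  min={0}
  max={1}
  step={0.01}
  onChange={(e) =&gt; onSeek(Number(e.target.value))}
/&gt;</code></pre>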
<!--kg-card-end: markdown--><h3 id="styling">Styling</h3><!--kg-card-begin: markdown--><p>Background: Initially we tried using <a href="https://www.npmjs.com/package/styled-jsx">styled-jsx</a> to style the audio player but ran into issues upon testing integration into our sites. After digging through every issue in their repo and trying the provided solutions, we decided to switch to styled-components. It uses a very similar method to styled-jsx and was very easy to get the hang of.</p>
<p>The real magic of this component is in the styles, because that&apos;s what separates it from every other boring range <code>input</code> out there. This block of CSS might look a bit overwhelming, especially if you haven&apos;t used styled-components before, so let me highlight the key points and explain.</p>
<!--kg-card-end: markdown--><pre><code class="language-JavaScript">const transition = &apos;height 0.15s 0s ease, width 0.15s 0s ease&apos;

const StyledSlider = styled.input`
  cursor: pointer;
  background: linear-gradient(
    to right,
    ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)} 0%,
    ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)}
      ${(props) =&gt; props.percent}%,
    ${(props) =&gt; props.colorAfter} ${(props) =&gt; props.percent}%,
    ${(props) =&gt; props.colorAfter} 100%
  );
  border-radius: 8px;
  height: 4px;
  width: 100%;
  outline: none;
  padding: 0;
  margin: 5px 10px;
  -webkit-transition: ${transition};
  -moz-transition: ${transition};
  -o-transition: ${transition};
  transition: ${transition};
  -webkit-appearance: none;
  &amp;::-webkit-slider-thumb {
    border: none;
    -webkit-appearance: none;
    width: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    height: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    cursor: pointer;
    background: ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)};
    border-radius: 50%;
  }
  &amp;::-ms-thumb {
    border: none;
    height: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    width: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    border-radius: 50%;
    background: ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)};
    cursor: pointer;
  }
  &amp;::-moz-range-thumb {
    border: none;
    height: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    width: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    border-radius: 50%;
    background: ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)};
    cursor: pointer;
  }
`</code></pre><p>The two-color slider comes from using a linear-gradient background with the props for the value and different colors passed in. It may look a bit confusing with all the JS sprinkled in, but it&apos;s really just specifying which colors to use and where the before bar and thumb should stop.</p><pre><code class="language-JavaScript">background: linear-gradient(
    to right,
    ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)} 0%,
    ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)}
    ${(props) =&gt; props.percent}%,
    ${(props) =&gt; props.colorAfter} ${(props) =&gt; props.percent}%,
    ${(props) =&gt; props.colorAfter} 100%
);</code></pre><p>The browser-specific styles target the slider thumb and basically do the same as mentioned above.</p><pre><code class="language-JavaScript">&amp;::-webkit-slider-thumb {
    border: none;
    -webkit-appearance: none;
    width: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    height: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    cursor: pointer;
    background: ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)};
    border-radius: 50%;
}
&amp;::-ms-thumb {
    border: none;
    height: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    width: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    border-radius: 50%;
    background: ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)};
    cursor: pointer;
}
&amp;::-moz-range-thumb {
    border: none;
    height: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    width: ${(props) =&gt; (props.seeking ? props.growTo : props.size)}px;
    border-radius: 50%;
    background: ${(props) =&gt; (props.seeking ? props.highlighted : props.colorBefore)};
    cursor: pointer;
}</code></pre><h3 id="conclusion">Conclusion</h3><!--kg-card-begin: markdown--><p>While this focused on a method of creating a slider component that fit our needs at Barstool, hopefully this inspires you to improve upon any range <code>input</code>/sliders you may have on your site already. My main goal here was to show a way of creating a unique, two-color slider, as separate colors and some animation really improve the usability of any element, but especially this one.</p>
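<p>If you want to experiment outside of styled-components, the two-color trick boils down to a single gradient string. Here is a hypothetical, framework-agnostic helper that builds it:</p><pre><code class="language-JavaScript">// Builds the hard-stop gradient used for the track: colorBefore up to
// percent, colorAfter from there on. (Illustrative helper, not from the
// component above.)
function twoColorTrack(colorBefore, colorAfter, percent) {
  return 'linear-gradient(to right, ' +
    colorBefore + ' 0%, ' +
    colorBefore + ' ' + percent + '%, ' +
    colorAfter + ' ' + percent + '%, ' +
    colorAfter + ' 100%)'
}</code></pre>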
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Errors from the BQE and beyond....using Slack]]></title><description><![CDATA[How we built a dynamic library to send formatted messages to Slack]]></description><link>https://barstool.engineering/monitoring-the-bqe-any-beyond-using-slack/</link><guid isPermaLink="false">60c3c7427e644a15ce08a88a</guid><dc:creator><![CDATA[Markham F Rollins IV]]></dc:creator><pubDate>Wed, 16 Jun 2021 17:31:41 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/06/8214bb0cc706.png" medium="image"/><content:encoded><![CDATA[<img src="https://barstool.engineering/content/images/2021/06/8214bb0cc706.png" alt="Errors from the BQE and beyond....using Slack"><p>Recently Nick Booth wrote a <a href="https://barstool.engineering/long-jobs-with-the-barstool-queue-engine/">blog</a> about our long-running service, the Barstool Queue Engine. This being a core service, we needed active error notifications to alert the team when it runs into trouble. While there are multiple options, levels, and redundant alerts to consider, I will focus on how I built a simple dynamic system to communicate with Slack.</p><p>My main goal was to build a reusable library that would accept a recognizable format and allow for dynamic content. Slack&apos;s webhooks accept JSON to structure the message, so it seemed a natural choice. Using the <a href="https://api.slack.com/block-kit">Block Kit</a> documentation, I created a base setup to be used around our various applications:</p><pre><code class="language-json">{
  &quot;attachments&quot;: [
    {
      &quot;color&quot;: &quot;#ff3d41&quot;,
      &quot;blocks&quot;: [
        {
          &quot;type&quot;: &quot;header&quot;,
          &quot;text&quot;: {
            &quot;type&quot;: &quot;plain_text&quot;,
            &quot;text&quot;: &quot;&#x26A0;&#xFE0F; {{Application}}-{{Stage}}&quot;,
            &quot;emoji&quot;: true
          }
        },
        {
          &quot;type&quot;: &quot;section&quot;,
          &quot;fields&quot;: [
            {
              &quot;type&quot;: &quot;mrkdwn&quot;,
              &quot;text&quot;: &quot;*Field 1*\n{{FIELD_1}}&quot;
            },
            {
              &quot;type&quot;: &quot;mrkdwn&quot;,
              &quot;text&quot;: &quot;*Field 2*\n{{FIELD_2}}&quot;
            }
          ]
        },
        {
          &quot;type&quot;: &quot;divider&quot;
        },
        {
          &quot;type&quot;: &quot;section&quot;,
          &quot;text&quot;: {
            &quot;type&quot;: &quot;mrkdwn&quot;,
            &quot;text&quot;: &quot;*Error Message*\n{{ERROR_MESSAGE}}&quot;
          }
        }
      ]
    }
  ]
}</code></pre><p>This JSON will render the following in Slack:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://barstool.engineering/content/images/2021/06/Base_Block_Kit.png" class="kg-image" alt="Errors from the BQE and beyond....using Slack" loading="lazy" width="657" height="200" srcset="https://barstool.engineering/content/images/size/w600/2021/06/Base_Block_Kit.png 600w, https://barstool.engineering/content/images/2021/06/Base_Block_Kit.png 657w"><figcaption>Base Block Kit Format</figcaption></figure><p>Pulling from other templating languages, I utilized <code>{{</code> <code>}}</code> to denote variable placement to support dynamic content in the messages. The core logic simply accepts a Slack webhook, a path to the template, and the data to be used when recursively replacing all of the placeholders in the template.</p><pre><code class="language-javascript">const fs = require(&apos;fs/promises&apos;)
const got = require(&apos;got&apos;)

async function sendMessage({ endpoint, messageFilepath, data }) {
  const messageBody = JSON.parse(await fs.readFile(messageFilepath))
  const messageWithData = _replaceVariables({ message: messageBody, data })
  await got.post(endpoint, {
    json: {
      ...messageWithData
    }
  })
}
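
// Hypothetical usage sketch -- the webhook URL, template path, and job fields
// are made up; the data keys match the {{...}} placeholders in the template above:
//
// await sendMessage({
//   endpoint: &apos;https://hooks.slack.com/services/T000/B000/XXXX&apos;,
//   messageFilepath: &apos;./templates/error-message.json&apos;,
//   data: { Application: &apos;bqe&apos;, Stage: &apos;production&apos;, FIELD_1: jobName, FIELD_2: jobId, ERROR_MESSAGE: err.message }
// })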

function _replaceVariables({ message, data }) {
  for (const key in message) {
    if (typeof message[key] === &apos;object&apos; &amp;&amp; message[key] !== null) {
      _replaceVariables({ message: message[key], data })
    } else if ([&apos;text&apos;, &apos;color&apos;].includes(key)) {
      Object.keys(data).forEach((dataKey) =&gt; {
        message[key] = message[key].replace(`{{${dataKey}}}`, data[dataKey])
      })
    }
  }
  return message
}</code></pre><p>The final product is a well-formatted error message in Slack that looks similar to the example below.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://barstool.engineering/content/images/2021/06/Example_Slack_Error_Message.png" class="kg-image" alt="Errors from the BQE and beyond....using Slack" loading="lazy" width="720" height="389" srcset="https://barstool.engineering/content/images/size/w600/2021/06/Example_Slack_Error_Message.png 600w, https://barstool.engineering/content/images/2021/06/Example_Slack_Error_Message.png 720w" sizes="(min-width: 720px) 720px"><figcaption>Error message as seen in Slack</figcaption></figure><p>We&apos;ve implemented this in our BQE but plan to roll it out to many of our services to stay ahead of our users when it comes to resolving issues they may run into. The flexibility allows us to send all pertinent information to enable speedy debugging.</p>]]></content:encoded></item><item><title><![CDATA[Resilient Multi-part file uploads to S3]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>Just about anyone who has spent time developing web apps has had the need to handle file uploads. We live in a (literal) web of profile pics, gifs, memes, live streams, vlogs, etc, etc. 
With the rise of services like AWS S3,  the task of handling uploads and storing file</p>]]></description><link>https://barstool.engineering/resilient-multi-part-file-uploads-to-s3/</link><guid isPermaLink="false">609efb977e644a15ce089638</guid><category><![CDATA[file uploads]]></category><category><![CDATA[media management]]></category><category><![CDATA[image management]]></category><category><![CDATA[react]]></category><category><![CDATA[aws]]></category><dc:creator><![CDATA[Zach Ward]]></dc:creator><pubDate>Sat, 05 Jun 2021 18:46:13 GMT</pubDate><media:content url="https://barstool.engineering/content/images/2021/06/1b719d6e7c20-1.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://barstool.engineering/content/images/2021/06/1b719d6e7c20-1.jpg" alt="Resilient Multi-part file uploads to S3"><p>Just about anyone who has spent time developing web apps has had the need to handle file uploads. We live in a (literal) web of profile pics, gifs, memes, live streams, vlogs, etc, etc. With the rise of services like AWS S3,  the task of handling uploads and storing file objects has, for the most part, become trivial. This is obviously a great thing and for most web apps, anything more than a basic integration with AWS S3 would be overkill. Like many other engineering teams, at Barstool we are big on keeping it simple and not over-engineering. In the (paraphrased) words of Knuth</p>
<blockquote>
<p>&quot;Premature optimization is the root of all evil&quot;</p>
</blockquote>
<p>In mid-2019, the engineering team at Barstool was in the early stages of the odyssey that has been replacing WordPress with our own in-house CMS - Barstool HQ. We were in the midst of pumping out new features almost weekly, in a race to gain internal buy-in to the new and unfamiliar platform. If we were going to enable our content team to be productive in HQ, a proper media management module was absolutely crucial. Blog posts need thumbnails, bloggers need avatars, video posts require... videos. So we quickly built the internal tools for users to upload new files, browse existing files, and attach those files wherever they might be needed.</p>
<h4 id="phase-1">Phase 1</h4>
<p>By late 2019, HQ had started to gain traction internally. Some bloggers were operating entirely within HQ and consequently usage of our file uploader skyrocketed. With increased usage, unaccounted-for edge cases and plain old bugs naturally followed. Bugs were squashed and some edge-cases covered, and work generally carried on as usual.</p>
<p>Things started to get interesting when a member of the content team made a request for bulk image uploads. It was a simple enough request and it would provide a big boost in productivity for end-users. The changes required to support multiple uploads didn&apos;t seem too difficult either. Our initial implementation had consisted of a drag-n-drop uploader component:</p>
<!--kg-card-end: markdown--><pre><code class="language-javascript">&lt;Uploader
  multiple={false}
  accept={this.props.fileType ? this.fileTypes[this.props.fileType] : &apos;*&apos;}
  errorMessage={this.state.errorMessage}
  onCancel={this.reset}
  onDrop={this.handleDrop}
  onRetry={this.reset}
  progressAmount={this.state.progressAmount}
/&gt;</code></pre><!--kg-card-begin: markdown--><p>While the bulk of the work occurred in the <code>handleDrop</code> method:</p>
<!--kg-card-end: markdown--><figure class="kg-card kg-code-card"><pre><code class="language-javascript">handleDrop = async (acceptedFiles, rejectedFiles) =&gt; {
  if (acceptedFiles.length &gt; 0) {
    const extension = acceptedFiles[0].name.split(&apos;.&apos;).pop()
    const data = await mediaApi.getSignedUrl({ extension, content_type: acceptedFiles[0].type })
    await mediaApi.uploadToS3(data.upload, acceptedFiles[0], this.onUploadProgress)
    const { provider } = this.props
    const mediaObject = await mediaApi.create({ provider, key: data.key, title: acceptedFiles[0].name })
    this.props.onCompleted(mediaObject)
    this.reset()
  } else if (rejectedFiles.length &gt; 0) {
    this.setState({
      errorMessage: &apos;Upload failed&apos;
    })
  }
}</code></pre><figcaption>Forgive the icky class component</figcaption></figure><!--kg-card-begin: markdown--><p>To add support for bulk uploads, we moved the logic responsible for communicating with the <code>mediaApi</code> service to a function <code>handleUploadFile</code> while <code>handleDrop</code> would now only be responsible for iterating through the files and passing them to as arguments to <code>handleUploadFile</code>:</p>
<!--kg-card-end: markdown--><pre><code class="language-javascript">async function handleUploadFile(file, index) {
  const { type, name, size } = file
  const key = `${name}-${index}`
  setUploadProgress(key, { loaded: 0, total: size })
  const extension = last(name.split(&apos;.&apos;))
  const data = await mediaApi.getSignedUrl({ extension, content_type: type })
  await mediaApi.uploadToS3(data.upload, file, progressData =&gt; onUploadProgress(key, progressData))
  const mediaObject = await mediaApi.create({ provider, key: data.key, title: name })
  incFilesUploadedCount()
  onFile(mediaObject)
  return mediaObject
}

async function handleDrop(acceptedFiles, rejectedFiles) {
  if (acceptedFiles.length &gt; 0) {
    setFilesToUploadCount(acceptedFiles.length)
    const mediaObjects = await Promise.all(acceptedFiles.map(handleUploadFile))
    if (onCompleted &amp;&amp; isFunction(onCompleted)) {
      onCompleted(mediaObjects)
    }
    reset()
  } else if (rejectedFiles.length &gt; 0) {
    setErrorMessage(&apos;Upload failed&apos;)
  }
}</code></pre><!--kg-card-begin: markdown--><p>During QA, this worked great. We were able to drop multiple files into the uploader and they&apos;d all get uploaded concurrently thanks to <code>Promise.all</code>. However, once we released it quickly became apparent that the simple solution above would not be adequate for a number of reasons:</p>
<ol>
<li>Due to the nature of <code>Promise.all</code>, one failed upload would cause the entire process to fail.</li>
<li>Our admin APIs have maximum concurrency limits, which it turned out were occasionally exceeded when a single user opened 40 additional connections while uploading 40 files.</li>
<li>There is no way for users to &apos;retry&apos; failed uploads; they have to start from the beginning and select all their files again, which is obviously very frustrating.</li>
</ol>
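<p>The first of these failure modes is easy to demonstrate in isolation. Here is a minimal sketch (with a hypothetical <code>fakeUpload</code> standing in for the real S3 upload) showing that a single rejection causes <code>Promise.all</code> to reject, discarding the results of uploads that actually succeeded:</p>

```javascript
// Hypothetical stand-in for an upload: resolves or rejects per file
const fakeUpload = (file) =>
  file.ok ? Promise.resolve(file.name) : Promise.reject(new Error(`${file.name} failed`))

const files = [
  { name: 'a.png', ok: true },
  { name: 'b.png', ok: false },
  { name: 'c.png', ok: true }
]

// One rejection fails the whole batch, even though a.png and c.png "uploaded" fine
Promise.all(files.map(fakeUpload))
  .then((names) => console.log('all uploaded:', names))
  .catch((err) => console.log(err.message)) // logs 'b.png failed'
```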
<h4 id="phase-2">Phase 2</h4>
<p>So we went back to the drawing board and determined that with the following improvements, we would be cooking with gas:</p>
<ol>
<li>Upload files in batches with a maximum concurrency set, preventing the maximum concurrency limit imposed by our api from being exceeded.</li>
<li>For each batch of uploads, add automated retry logic in order to mitigate non-fatal errors caused by network conditions, etc.</li>
<li>In addition to automated retry logic, track failed uploads and allow users to manually retry the failed uploads.</li>
</ol>
<!--kg-card-end: markdown--><p>Here is the helper function we came up with, <code>mapAsync</code>, which powers the batched upload logic. It takes three arguments: an array of items, a concurrency limit, and a handler that will be invoked with each item.</p><pre><code class="language-javascript">async function mapAsync(items, concurrency = 1, handler) {
  let results = []
  let failures = []
  let index = 0
  while (index &lt; items.length) {
    const batch = items.slice(index, index + concurrency)
    try {
      const _results = await Promise.all(batch.map(handler))
      results = [...results, ..._results]
    } catch (err) {
      failures = [...failures, ...batch]
    }
    index += concurrency
  }
  return { results, failures }
}</code></pre><p>For the automated retry logic, we wrote a simple but powerful helper function, <code>withRetries</code>, which takes as arguments: an async function, the number of retries to allow, and an error. Note that the <code>err</code> argument is not meant to be passed when the function is first invoked, but rather is supplied by recursive calls after an error.</p><pre><code class="language-javascript">async function withRetries(fn, retries = 3, err = null) {
  if (!retries) {
    return Promise.reject(err)
  }
  return fn().catch(err =&gt; {
    return withRetries(fn, retries - 1, err)
  })
}</code></pre><p>The callback <code>fn</code> is invoked and, if it rejects, <code>withRetries</code> is recursively called with <code>retries - 1</code>. It continues to do this until either <code>fn</code> succeeds or <code>retries</code> is exhausted, at which point the error is thrown back to the caller.</p><p>With these two helper functions we made the following modifications to the code in our <code>FileUploader</code> component:</p><pre><code class="language-javascript">async function handleRetry(files) {
  try {
    await handleDrop(files)
  } catch (err) {
    setErrorMessage(`Uploading failed again, please try again later`)
  }
}

async function handleDrop(acceptedFiles, rejectedFiles) {
  if (acceptedFiles.length &gt; 0) {
    // update progress state with all files to be uploaded prior to processing - because processing is done in batches this step is necessary beforehand in order to achieve realistic progress
    acceptedFiles.forEach(({ name, size }) =&gt; setUploadProgress(name, { loaded: 0, total: size }))
    setFilesToUploadCount(acceptedFiles.length)

    // upload files in asynchronous batches because concurrency limits can cause uploads to fail if surpassed, resulting in the entire upload hanging
    // because processing happens in batches, it is possible that only certain batches fail to upload, so there can be both successes and failures (rather than just one or the other, as when everything is wrapped in a single Promise.all)
    const { results: allMediaObjects, failures } = await mapAsync(
      acceptedFiles,
      4,
      (file, index) =&gt; withRetries(() =&gt; handleUploadFile(file, index))
    )

    // if there are failures, we could allow retrying the upload with just those items, that way we wont re-upload any successful uploads
    if (failures.length) {
      setErrorMessage(`${failures.length} files failed to upload, would you like to retry uploading these files?`)
      setFilesToRetry(failures)
    } else {
      reset()
      if (onCompleted &amp;&amp; isFunction(onCompleted)) {
        onCompleted(allMediaObjects)
      }
    }
  } else if (rejectedFiles.length &gt; 0) {
    setErrorMessage(&apos;Upload failed&apos;)
  }
}</code></pre><p>And the end result: when uploading four files and the fourth fails, the user can retry manually, and the uploader will pick back up where it left off, only attempting the fourth file again:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://barstool.engineering/content/images/2021/06/Uploader-Demo.gif" class="kg-image" alt="Resilient Multi-part file uploads to S3" loading="lazy" width="1220" height="578"><figcaption>Failure is simulated</figcaption></figure><p>The compounding effects of the batched uploads and automated retry logic actually made it rather tedious to test the manual retry logic - generally it&apos;s a pretty good sign when it&apos;s hard to break something.</p><p>After the changes were released, complaints dropped off almost entirely. We were very satisfied: without <em>too</em> much effort, we had iterated on our initial uploader, reusing most of our existing code to build a resilient uploader that could handle any number of files. I tested a bulk upload of 100 high-res images from Unsplash, and the uploader churned right through them with no failures.</p><!--kg-card-begin: markdown--><h4 id="phase-3">Phase 3</h4>
<p>For several months, no one touched the code of the uploader component. Our primary users early on were bloggers who primarily uploaded images or short video clips that they wanted to use in blog posts, and the uploader continued to work great for them.</p>
<p>During that time, however, the engineering team was working on a big project to bring all our video management functionality in-house. As part of bringing video management in-house, we&apos;d need to have the capability to handle large files, often up to and sometimes even over 10 GB.</p>
<p>We were ahead of the curve this time: before anything was released, we were aware that we had a big problem. The maximum size of an object uploaded to S3 in a single request is 5 GB. Without a workaround, this was a show-stopper for moving video management in-house.</p>
<p>S3 supports multi-part uploads, as documented in <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html">Uploading and copying objects using multi-part upload</a>, so we did some research to figure out what our exact approach would be. We browsed GitHub, NPM, etc. for any existing solutions; surely we weren&apos;t the first engineering team to run into this issue.</p>
<p>Our head of engineering, Andrew, found this really great article <a href="https://www.altostra.com/blog/multipart-uploads-with-s3-presigned-url">Multipart uploads with S3 pre-signed URLs</a> which outlines the server-side changes needed to support multi-part uploads. Using the article as a reference, we were able to very quickly implement the necessary endpoints for supporting multi-part uploads. The endpoints we implemented were as follows:</p>
<ol>
<li><code>POST /upload-multipart { content_type: String, extension: String, filename: String, parts: Number }</code>: <code>parts</code> is the number of chunks we will break the file into when uploading, the response includes <code>{ bucket: String, key: String, location: String, upload_id: String, urls: [String] }</code> where <code>urls</code> is the array of endpoints that we will use to upload each corresponding chunk.</li>
<li><code>POST /upload-multipart/complete { key, bucket, upload_id, etags }</code>: hit after all the chunks have been uploaded; its job is to inform S3 that all the parts were uploaded. By passing the etag of each part to this endpoint, S3 knows how to construct the object from the uploaded parts.</li>
</ol>
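<p>To make the second endpoint&apos;s contract concrete, here is a sketch of the payload the client ends up sending (the helper name is ours, not the production code); S3 matches each etag to its chunk via a 1-based part number:</p>

```javascript
// Build the body for POST /upload-multipart/complete from the upload results.
// S3 identifies each uploaded chunk by a 1-based PartNumber plus its ETag.
function buildCompletePayload({ key, bucket, upload_id, etags }) {
  return {
    key,
    bucket,
    upload_id,
    parts: etags.map((etag, index) => ({ ETag: etag, PartNumber: index + 1 }))
  }
}
```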
<p>Next, we came across <a href="https://github.com/muxinc/upchunk">UpChunk</a>, which just so happens to be built by our video provider, <a href="https://www.mux.com/">Mux</a>. According to the README, UpChunk is &quot;a JavaScript module for handling large file uploads via chunking and making a put request for each chunk with the correct range request headers.&quot; This is exactly the type of library we were looking for. Unfortunately, after digging deeper and even asking the Mux team about it directly, it was apparent that it would not be compatible with S3 multi-part uploads. But not all was lost - the UpChunk source is very tiny and easy to comprehend, so using it as a starting point and reference, and modifying what we needed to make it compatible with our system, we were able to build our own chunk uploader that was compatible with S3 multi-part uploads.</p>
<p>I remember very clearly that all of this discovery was done during a Friday afternoon in September of 2020. I was heads down on the code all evening and had something nearly working by around 9pm. Our new <code>ChunkUploader</code> module worked as follows:</p>
<ol>
<li>Initialize uploader with <code>file</code> to upload and <code>getEndpoints</code> which is either an array of endpoints or an async function which returns the endpoints after being invoked. Additionally, an <code>onUploadProgress</code> callback can be passed, which will receive progress events as chunks are uploaded, this is useful for displaying a progress bar.</li>
<li>Call <code>uploader.upload()</code> which, as its name implies, kicks off the upload process. <code>upload()</code> first calculates the number of chunks to break the file into and subsequently calls the <code>getEndpoints</code> method with that number of chunks to get the array of endpoints that each chunk will be uploaded to.</li>
<li>Next, a private method <code>_sendChunks</code> is called which, similar to the <code>FileUploader</code> component, uploads the chunks in batches with a maximum concurrency; each chunk can be retried up to 5 times with exponential backoff upon failure. To get each chunk, the endpoints returned from <code>getEndpoints</code> are iterated, and the file is sliced from <code>index * chunk size in bytes</code> to <code>(index + 1) * chunk size in bytes</code>, where <code>index</code> is the index of the current endpoint.</li>
<li>Upon successful upload of each chunk, the <code>etag</code> returned is stored in an array corresponding to each chunk.</li>
<li><code>uploader.upload()</code> returns an object with the shape: <code>{ key: String, bucket: String, upload_id: String, etags: [String] }</code>; this will be used to make the request to <code>POST /upload-multipart/complete</code> to finish the multipart upload process.</li>
</ol>
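<p>The chunk math in steps 2 and 3 can be sketched in a few lines (the helper names here are illustrative, not the actual <code>ChunkUploader</code> internals):</p>

```javascript
const CHUNK_SIZE = 8 * 1024 * 1024 // 8 MB per part

// Number of parts to request pre-signed endpoints for (at least one)
function countChunks(fileSize) {
  return Math.max(1, Math.ceil(fileSize / CHUNK_SIZE))
}

// Slice out the chunk for the endpoint at `index` (works on File/Blob)
function getChunk(file, index) {
  return file.slice(index * CHUNK_SIZE, (index + 1) * CHUNK_SIZE)
}
```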
<p>And here&apos;s what that code looks like:</p>
<!--kg-card-end: markdown--><pre><code class="language-javascript">async function handleUploadFile(file) {
  if (cancelUploadRef.current) {
    throw new CanceledUploadError()
  }
  const { type, name } = file
  const key = `${name}`
  const extension = last(name.split(&apos;.&apos;))

  // initialize uploader with file object and getEndpoints callback for fetching pre-signed urls for each file part
  const uploader = new ChunkUploader({
    file,
    cancelRef: cancelUploadRef,
    getEndpoints: ({ parts }) =&gt; {
      return mediaApi.requestMultipartUpload({
        extension,
        content_type: type,
        filename: name,
        parts
      })
    },
    onUploadProgress: (progressData) =&gt; {
      onUploadProgressThrottled(key, progressData)
    }
  })

  // complete the chunked upload, then notify the api so S3 can assemble the parts
  const multipartData = await uploader.upload()
  const signedUrl = await mediaApi.completeMultipartUpload(multipartData)

  const mediaObject = await mediaApi.create({ provider, key: signedUrl.key, title: name, duration: audioFile?.duration })
  incFilesUploadedCount()
  return mediaObject
}</code></pre><!--kg-card-begin: markdown--><p>To keep our codebase as simple as possible, we use this as the upload logic for all of our files, not just big ones. Files are uploaded in 8 MB chunks, so anything smaller than 8 MB is just a single chunk; nice and consistent!</p>
<p>Another nice benefit of uploading in chunks is that we can retry individual chunks in addition to retrying individual files, so we now have an extra layer to our retry logic to combat network issues. Additionally, we track upload progress per-chunk-url, which gives us progress data on a very granular level, allowing for some nice-to-haves like displaying &apos;Estimated time remaining&apos; for uploads. &apos;Estimated time remaining&apos; actually proved to be a very valuable tool for producers, who spend a lot of time waiting for files to upload: knowing the time remaining, they can continue working on other stuff and come back to the upload once the file is close to being fully uploaded.</p>
<h4 id="closing-thoughts">Closing Thoughts</h4>
<p>Since releasing the changes ~9 months ago, I can&apos;t think of a single complaint that we&apos;ve received related to failed uploads that could be attributed to our upload logic. A lack of complaints does not necessarily mean that there are no issues, but it&apos;s a promising sign considering we consistently upload several hundred videos and several thousand images per month.</p>
<h4 id="future-improvements">Future Improvements</h4>
<p>There are always improvements to be made. Given that we are a small engineering team with a ton of shit to do, we try to make those changes when we know that they are needed and that they will have a noticeable impact on the productivity of our users. Regardless, here&apos;s a list of some things that are on our radar that we&apos;d like to improve at some point:</p>
<ul>
<li>Support uploads in the background - a simple starting point would be to store a global reference to the upload process so that the user can leave the upload page and continue working while the uploader runs in the background. Upon successful upload, the user would receive a toast notification with a link to the new file.</li>
<li>Better support for changes to network conditions like online/offline events, if the browser loses connection, we should immediately pause the upload process and allow the user to continue once their connection is restored.</li>
<li>Improvements to the &apos;Estimated time remaining&apos; calculation. Our current implementation works very well for larger files. The algorithm is nice and simple: it first calculates <code>millisecondsTotal</code> (the estimated total time the upload will take) by dividing <code>millisecondsElapsed</code> (time since the start of the upload) by <code>progressAmount</code> (a decimal between 0 and 1 indicating the current progress, 50% progress being 0.5), and then subtracts <code>millisecondsElapsed</code> from <code>millisecondsTotal</code>. This works really nicely and gets more accurate with each second the upload is running; however, it always takes a few seconds to &apos;calibrate&apos;, which for smaller files means it is almost never accurate.</li>
<li>Last but not least, code improvements: a lot of love has gone into the <code>FileUploader</code> component and related <code>ChunkUploader</code>, but there&apos;s more we could do to make it reusable, or even potentially open-source it some day.</li>
</ul>
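<p>For reference, the &apos;Estimated time remaining&apos; algorithm described above reduces to a couple of lines (the function name is ours, not the production code):</p>

```javascript
// Estimate remaining upload time from elapsed time and fractional progress.
// e.g. 10s elapsed at 50% progress implies ~20s total, so ~10s remaining.
function estimateTimeRemaining(millisecondsElapsed, progressAmount) {
  if (progressAmount <= 0) return Infinity // no progress yet, no estimate possible
  return millisecondsElapsed / progressAmount - millisecondsElapsed
}
```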
<p>Hopefully you&apos;ve learned a thing or two about handling file uploads in a resilient manner, and I also hope you&apos;ve learned about the Barstool engineering team&apos;s culture and how we tackle problems. If you&apos;re interested in working on problems like this (or not, if file uploads aren&apos;t for you, we do a lot of other stuff too) check out our openings <a href="https://www.barstoolsports.com/jobs">here</a>, we&apos;re actively hiring. Thanks for reading.</p>
<!--kg-card-end: markdown-->]]></content:encoded></item></channel></rss>