Solving HTTP 404 for GCP Storage Bucket Hosted Single Page Web App

Storage buckets are a simple, low-cost solution for web hosting and are well suited to serving static websites. But what if we want to serve a single-page application (SPA) such as a ReactJS app? Here we will go over a key search engine optimization (SEO) step that may be needed to get your SPA indexed!

Photo by Taylor Vick on Unsplash

Pre-requisites

You have a web application deployed to a Google Storage Bucket similar to this and need it to be indexable by a search engine.

The Problem

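One possible way to serve a SPA hosted on GCP’s object storage is to configure the bucket as a website with index.html as both the main page and the not-found page, for example with a command along the lines of gsutil web set -m index.html -e index.html gs://<your-bucket>.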

Since the routing is performed client-side, some URLs do not map to actual files in the bucket, so the bucket answers them with an HTTP 404 status. An end user would not perceive any effect from this, but it is disastrous for search engine indexing, because crawlers treat a 404 response as a missing page.

The Solution

To address this, enter pre-rendering! The idea of pre-rendering is to have a matching HTML file for every URL instead of a single index.html on the root path.
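To see why a matching file per URL helps, here is a small JavaScript model of how a bucket configured as a website might answer a request. It is only an illustrative sketch and not Cloud Storage’s actual implementation: resolveRequest, the object sets, and the sample paths are hypothetical, and it assumes index.html is configured as both the main page suffix and the not-found page.

```javascript
// Toy model of a bucket's website lookup. NOT Cloud Storage's real logic;
// all names and paths here are hypothetical and exist only for illustration.
function resolveRequest(objects, path, config) {
  const key = path.replace(/^\//, ""); // object names have no leading slash

  // 1. Exact object match, e.g. /static/app.js
  if (key !== "" && objects.has(key)) {
    return { status: 200, object: key };
  }

  // 2. Directory-style request: try <path><mainPageSuffix>
  if (key === "" || key.endsWith("/")) {
    const indexKey = key + config.mainPageSuffix;
    if (objects.has(indexKey)) {
      return { status: 200, object: indexKey };
    }
  }

  // 3. Nothing matched: the not-found page is served, but with a 404 status
  return { status: 404, object: config.notFoundPage };
}

const config = { mainPageSuffix: "index.html", notFoundPage: "index.html" };

// A plain SPA build: a single HTML file for all routes
const spaBuild = new Set(["index.html", "static/app.js"]);

// The same app after pre-rendering: one HTML file per route
const prerendered = new Set([
  "index.html",
  "static/app.js",
  "departments/index.html",
]);

// Client-side route without a matching file: the app loads, but with a 404
console.log(resolveRequest(spaBuild, "/departments/", config));

// Pre-rendered route: a real file exists, so the status is 200
console.log(resolveRequest(prerendered, "/departments/", config));
```

In this model the user sees the app in both cases, but only the pre-rendered build returns a status that a crawler will index.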

A simple way to pre-render a React application is to use the React Snap library. To add the library, run yarn add --dev react-snap and add "postbuild": "react-snap" to the scripts section of your package.json. When you run yarn build, React Snap will then automatically crawl your application and generate a pre-rendered page for every URL it finds. The Dockerfile below builds and pre-renders a ReactJS app, provided the package.json is configured correctly.

To use the Dockerfile above you will need to add the following snippet to your package.json.

Note: There are some situations for which React Snap may not be the best choice or may require specific configurations:

  1. If your website contains a large number of URLs, pre-rendering may take a long time and be overly demanding of your backend.
  2. If not all URLs are linked, you may need to manually set additional entry points. If none are linked, disable crawling altogether and manually specify them all.
  3. If new URLs are created frequently, you may need to run the build in a cron task. Ideally, there should be a way to pre-render only the new routes.
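As a sketch of the second point, the helper below builds the object that would go under the "reactSnap" key of package.json. The option names include and crawl are taken from React Snap’s documented options as far as I know, so verify them against the version you use. buildReactSnapConfig and the routes are hypothetical.

```javascript
// Builds the object that would be placed under "reactSnap" in package.json.
// `include` and `crawl` are React Snap option names (check your version);
// the helper itself and the routes are made up for illustration.
function buildReactSnapConfig(entryPoints, { crawl = true } = {}) {
  // Always keep the root page and drop duplicate routes
  const include = Array.from(new Set(["/", ...entryPoints]));
  return { include, crawl };
}

// Some pages are not linked from anywhere: add them as extra entry points
const partiallyLinked = buildReactSnapConfig(["/departments/", "/about/"]);

// No pages link to each other: disable crawling and list every route
const unlinked = buildReactSnapConfig(
  ["/departments/", "/departments/my-department/"],
  { crawl: false }
);

// Print the fragments in the shape package.json expects
console.log(JSON.stringify({ reactSnap: partiallyLinked }, null, 2));
console.log(JSON.stringify({ reactSnap: unlinked }, null, 2));
```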

Common Issues with Pre-rendering

Issue 1: Redirect to /index.html

Google’s storage buckets are designed to redirect requests for non-directory URLs to a file created by appending /<mainPageSuffix> to the requested path, as described here. However, say you have an eCommerce website with a /departments/ page that lists specific departments, such as /departments/my-department/. If a user copies and pastes the /departments/ link and for some reason leaves out the final / (/departments), they will be redirected to /departments/index.html, which the client-side router will likely treat as a non-existent department.

A possible solution to this issue (described here) is to include the code below. It redirects users to the correct URL. Since it is placed in the head of the HTML, it should run before any of the client-side code.
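One way such a redirect could look is sketched here. It assumes that the "correct URL" is simply the same path without the trailing index.html. stripIndexHtml is a hypothetical helper name, and the script would be inlined in a script tag in the head.

```javascript
// Returns the URL to redirect to, or null if the current URL needs no change.
// Hypothetical helper; assumes the correct URL is the path without index.html.
function stripIndexHtml(pathname, search = "", hash = "") {
  const fileName = "index.html";
  if (!pathname.endsWith("/" + fileName)) return null;

  // Keep the trailing slash: /departments/index.html -> /departments/
  const cleaned = pathname.slice(0, -fileName.length);
  return cleaned + search + hash;
}

// Only runs in a browser; `window` is not defined elsewhere (e.g. in Node)
if (typeof window !== "undefined") {
  const target = stripIndexHtml(
    window.location.pathname,
    window.location.search,
    window.location.hash
  );
  if (target !== null) {
    // replace() avoids leaving the /index.html entry in the browser history
    window.location.replace(target);
  }
}
```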

Issue 2: Images with relative URLs

Some websites use third-party scripts to embed stamps that verify a website’s authenticity. These scripts tend to follow the general pattern of the sample component below. However, if this component is used in a website rendered with React Snap, the image ends up being sourced from a URL such as https://my-image.com/localhost:45678, because the URL is computed while the page is being pre-rendered on a local server. To address this, place these scripts directly in the app code with a hard-coded URL.
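The effect can be reproduced with a small stand-in for such a script. This is a hypothetical sketch and not any vendor’s actual code: sealUrlFromHost and www.example.com are assumptions, while my-image.com and localhost:45678 come from the example above.

```javascript
// Hypothetical stand-in for a third-party seal script: it derives the image
// URL from the host the page is currently being served from.
function sealUrlFromHost(host) {
  return `https://my-image.com/${host}`;
}

// During pre-rendering the page is served locally, so the URL that gets
// written into the generated HTML points at the local host.
const duringPrerender = sealUrlFromHost("localhost:45678");

// Workaround from the text: hard-code the production host in the app code.
const PRODUCTION_HOST = "www.example.com"; // assumption: use your own domain
const hardCoded = sealUrlFromHost(PRODUCTION_HOST);

console.log(duringPrerender); // https://my-image.com/localhost:45678
console.log(hardCoded); // https://my-image.com/www.example.com
```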

Issue 3: Conditional scripts

There may be situations where we don’t want a script to run during pre-rendering but do need it to run on the user’s machine. A possible solution is to wrap the script in an if statement that checks for the React Snap user agent.
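A minimal sketch of such a check follows. It assumes React Snap’s default user agent string, "ReactSnap"; if you override the userAgent option, adjust the constant. The helper names are hypothetical.

```javascript
// "ReactSnap" is React Snap's default user agent; change this constant if
// you override the userAgent option. Helper names below are hypothetical.
const REACT_SNAP_USER_AGENT = "ReactSnap";

function isPrerendering(userAgent) {
  return userAgent === REACT_SNAP_USER_AGENT;
}

// Runs `script` for real visitors and skips it while React Snap is crawling.
// Returns true if the script ran.
function runUnlessPrerendering(userAgent, script) {
  if (isPrerendering(userAgent)) return false;
  script();
  return true;
}

// Only runs in a browser; `window` is not defined elsewhere (e.g. in Node)
if (typeof window !== "undefined") {
  runUnlessPrerendering(window.navigator.userAgent, () => {
    // Placeholder for the real work, e.g. loading an analytics script
    console.log("running browser-only script");
  });
}
```

Passing the user agent in as an argument, rather than reading navigator inside the helper, keeps the check easy to test outside a browser.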

Conclusions

We have seen a possible solution to the HTTP 404 status code returned by GCP’s storage buckets when accessing client-side URLs in a single page application. This is one of many optimizations that a SPA may need to rank high with search engines. An example of a production application using this solution is bigdelivery.com.br.

