
Angular service worker and index.html caching

While there are similar posts, I can't find a clear answer on whether index.html should be cached using the Cache-Control header.

Correct me if I am wrong, but right now I am returning Cache-Control: no-store for index.html to avoid hash mismatch errors, which force the service worker into degraded mode.

I think that if index.html is served with Cache-Control: max-age=3600 and cached on a CDN server, and the app is updated before that cache expires, ngsw.json will list file hashes that differ from the script files referenced in the cached index.html, and bad things will happen. Right?

Also, just to be clear, I have noticed that some people add index.html to ngsw-config.json. That does not make sense to me either, because index.html is loaded before the service worker.
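
The header policy discussed above can be sketched as a small function that maps a request path to a Cache-Control value. This is only an illustration, not an answer from the Angular docs: the function name `cacheControlFor`, the hashed-filename pattern, and the one-year max-age for hashed bundles are assumptions. The `no-store` value for index.html is the one already used in the question; applying it to ngsw.json as well is an assumption.

```typescript
// Sketch: choose a Cache-Control value per request path.
// Assumption: bundles carry a hex content hash in the name
// (e.g. main.3f2a9c1b4d5e.js), so a given URL never changes content.
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(js|css)$/i;

function cacheControlFor(path: string): string {
  // index.html and ngsw.json decide which bundle versions get loaded,
  // so a stale copy is what produces mismatches after a deploy.
  if (path === "/" || path.endsWith("/index.html") || path.endsWith("/ngsw.json")) {
    return "no-store";
  }
  // Hashed bundles can be cached for a long time (assumed policy).
  if (HASHED_ASSET.test(path)) {
    return "public, max-age=31536000, immutable";
  }
  // Everything else may be stored but must be revalidated before reuse.
  return "no-cache";
}
```

A server or CDN rule set would call something like `cacheControlFor(requestPath)` when building the response headers. Newer CLI versions may name bundles differently, so the pattern would need adjusting.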

Zygimantas asked Oct 13 '19 16:10


2 Answers

By default, index.html is included. If you don't include it in the manifest, then it's not going to be part of the files hashed and checked. If it's not in the manifest (and subsequently, ngsw.json), changes to index.html won't trigger an event in the service worker. Of course, when you next load/refresh the site, it'll pick up the new index.html.

If you're serving index.html out of a CDN, then presumably, it's part of the distribution you built on the last deployment. It should be correctly calculated. The area you highlighted above is important to understand if you have files that don't match their hash in ngsw.json. If, for some reason, you're modifying index.html without updating your whole distro, service worker will assume the file is corrupted. It'll try again; since the file doesn't match the hash in ngsw.json, SW will assume the second try was corrupted and shut down.

In my case, the application was built with placeholder tokens that the release pipeline later replaced with Azure resource keys. At build time the hashes were correct, but after token replacement ran in the release, my main*.js files no longer matched their hash values in ngsw.json. I fixed it by adding a PowerShell step that recalculates the hashes.

Note that although the filenames themselves have a unique hash code embedded, you do not have to correct that for the service worker to work. Each filename/hash key/value pair must point to a valid file, and the SHA-1 hash of that file must match what is in ngsw.json.

The script I wrote to do post-compile validation/correction of the hashes is below. If you have some process that updates index.html independently of the entire distro, use this script to update ngsw.json and include it with your index.html push.

Notes:

  • The script accepts 3 parameters. If they're not passed, it assumes:
    • the script is being run from the root of the Angular project
    • the working directory is "./dist" (where the scripts to be checked are)
    • the input path is "<working_dir>/ngsw.json"
    • the output path is "<working_dir>/ngsw_out.json"
  • Specify the same input path and output path if you want to modify the file in place.
  • If you put this in Azure DevOps (AzDO), you'll need to check the "Use PowerShell Core" checkbox, because the script uses the ternary operator.

Powershell script begins:

param([string]$working_path = "./dist"
  , [string]$input_file_path = "$working_path/ngsw.json"
  , [string]$output_file_path = "$working_path/ngsw_out.json")

"Checking for existence of hash script..."

$fileExists = Test-Path -Path $input_file_path

if ($fileExists) {
  "Service Worker present.  Beginning hash reconciliation."
  ""
  $files_to_calc = @()
  $ngsw_json = (Get-Content $input_file_path -Raw) | ConvertFrom-Json

  "-----------------------------------------"
  "Getting list of javascript files to check"
  "-----------------------------------------"
  $found_count = 0
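  # hashTable keys are URL paths such as "/main.<hash>.js". @() keeps the result
  # an array even when the table has a single entry.
  $hash_keys = @($ngsw_json.hashTable.psobject.properties.name)
  for ($idx = 0; $idx -lt $hash_keys.Count; $idx++) {
    $current_file = $hash_keys[$idx]
    # Substring match: this also picks up names containing ".json" or ".js.map".
    if ($current_file.Contains(".js")) {
      $files_to_calc += $current_file
      "   File {$idx} $($files_to_calc[-1]) found."
      $found_count++
    }
  }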

  "---------------------------------------"
  "$($files_to_calc.count) files to check."
  "---------------------------------------"
  $replaced_count = 0
  $files_to_calc | ForEach-Object {
    $new_hash_value = (Get-FileHash -Algorithm SHA1 "$($working_path)$_").Hash.ToLower()
    $current_hash_value = $ngsw_json.hashTable.$_
    $current_index = [array]::IndexOf($ngsw_json.hashTable.psobject.properties.name, $_)
    $replaced = $false

    if ($ngsw_json.hashTable.$_ -ne $new_hash_value) {
      $ngsw_json.hashTable.$_ = "$new_hash_value"
      $replaced = $true
      $replaced_count++
    }

    "$($replaced ? '** ' : '   '){$current_index}:$_ --- Current Value: " +
    "$($current_hash_value.substring(0, 8))... New Value: " +
    "$($new_hash_value.substring(0, 8))..."

  }
  ""
  "--> Replaced $replaced_count hash values"

  $ngsw_json | ConvertTo-Json -depth 32 | set-content "$output_file_path"
}
else {
  "Service Worker missing.  Skipping."
}
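
If you'd rather only check the hashes (for example as a pipeline gate after the script above has run), a rough check-only sketch in TypeScript for Node could look like the following. The names `findHashMismatches`, `NgswHashManifest`, and `FileReader` are made up for this example, and it assumes, as the script above does, that each hashTable value is the plain SHA-1 of the file's bytes. The optional `read` parameter exists only so the function can be exercised without touching the disk.

```typescript
import { createHash } from "node:crypto";
import { existsSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Only the part of ngsw.json this sketch needs: URL path -> SHA-1 hex digest.
interface NgswHashManifest {
  hashTable: Record<string, string>;
}

// Returns the file's bytes, or undefined when the file does not exist.
type FileReader = (filePath: string) => Uint8Array | undefined;

const readFromDisk: FileReader = (filePath) =>
  existsSync(filePath) ? readFileSync(filePath) : undefined;

// Returns the hashTable keys whose file is missing or whose SHA-1 differs.
function findHashMismatches(
  distDir: string,
  manifest: NgswHashManifest,
  read: FileReader = readFromDisk,
): string[] {
  const mismatches: string[] = [];
  for (const [urlPath, expected] of Object.entries(manifest.hashTable)) {
    // Keys start with "/", e.g. "/main.js"; join() normalizes the separator.
    const bytes = read(join(distDir, urlPath));
    if (bytes === undefined) {
      mismatches.push(urlPath);
      continue;
    }
    const actual = createHash("sha1").update(bytes).digest("hex");
    if (actual !== expected.toLowerCase()) {
      mismatches.push(urlPath);
    }
  }
  return mismatches;
}
```

Typical use would be to parse ngsw.json with `JSON.parse(readFileSync("dist/ngsw.json", "utf8"))`, pass the result in with `"dist"` as the directory, and fail the release if the returned list is not empty.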

Tom Marks answered Oct 08 '22 00:10


I am not an expert on this, but I am pretty sure the following links will help clear up your doubts.

https://angular.io/guide/service-worker-getting-started#whats-being-cached

What's being cached?

Notice that all of the files the browser needs to render this application are cached. The ngsw-config.json boilerplate configuration is set up to cache the specific resources used by the CLI:

  • index.html.

  • favicon.ico.

  • Build artifacts (JS and CSS bundles).

  • Anything under assets.

  • Images and fonts directly under the configured outputPath (by default ./dist//) or resourcesOutputPath. See ng build for more information about these options.
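
To make the list above a bit more concrete, here is roughly the shape of a CLI-generated ngsw-config.json, written as a typed TypeScript object so it can be checked. Treat it as a sketch: the interfaces, the `boilerplateConfig` constant, and the `isListedLiterally` helper are made up for this example, and the exact glob patterns differ between CLI versions.

```typescript
// Sketch of the shape of a CLI-generated ngsw-config.json.
// The glob patterns are illustrative and vary between CLI versions.
interface AssetGroup {
  name: string;
  installMode: "prefetch" | "lazy";
  updateMode?: "prefetch" | "lazy";
  resources: { files?: string[]; urls?: string[] };
}

interface NgswConfig {
  index: string;
  assetGroups: AssetGroup[];
}

const boilerplateConfig: NgswConfig = {
  index: "/index.html",
  assetGroups: [
    {
      // App shell: fetched and cached up front.
      name: "app",
      installMode: "prefetch",
      resources: { files: ["/favicon.ico", "/index.html", "/*.css", "/*.js"] },
    },
    {
      // Static assets: cached lazily, refreshed when already cached.
      name: "assets",
      installMode: "lazy",
      updateMode: "prefetch",
      resources: { files: ["/assets/**"] },
    },
  ],
};

// True when some asset group lists the file literally (no glob matching here).
function isListedLiterally(config: NgswConfig, file: string): boolean {
  return config.assetGroups.some((group) => (group.resources.files ?? []).includes(file));
}
```

With this shape, `isListedLiterally(boilerplateConfig, "/index.html")` is true, which matches the first bullet: index.html is part of the cached set by default.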

The link below has information about the service worker and caching of app resources. From it, I would like you to read the sections on App versions, Update checks, and Resource integrity.

https://angular.io/guide/service-worker-devops#service-worker-and-caching-of-app-resources

I am also pasting the content of these three sections here, just to avoid making this answer "a link only answer".

App versions

In the context of an Angular service worker, a "version" is a collection of resources that represent a specific build of the Angular app. Whenever a new build of the app is deployed, the service worker treats that build as a new version of the app. This is true even if only a single file is updated. At any given time, the service worker may have multiple versions of the app in its cache and it may be serving them simultaneously. For more information, see the App tabs section below.

To preserve app integrity, the Angular service worker groups all files into a version together. The files grouped into a version usually include HTML, JS, and CSS files. Grouping of these files is essential for integrity because HTML, JS, and CSS files frequently refer to each other and depend on specific content. For example, an index.html file might have a <script> tag that references bundle.js and it might attempt to call a function startApp() from within that script. Any time this version of index.html is served, the corresponding bundle.js must be served with it. For example, assume that the startApp() function is renamed to runApp() in both files. In this scenario, it is not valid to serve the old index.html, which calls startApp(), along with the new bundle, which defines runApp().

This file integrity is especially important when lazy loading modules. A JS bundle may reference many lazy chunks, and the filenames of the lazy chunks are unique to the particular build of the app. If a running app at version X attempts to load a lazy chunk, but the server has updated to version X + 1 already, the lazy loading operation will fail.

The version identifier of the app is determined by the contents of all resources, and it changes if any of them change. In practice, the version is determined by the contents of the ngsw.json file, which includes hashes for all known content. If any of the cached files change, the file's hash will change in ngsw.json, causing the Angular service worker to treat the active set of files as a new version.
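
The idea in the paragraph above can be illustrated with a few lines of TypeScript: derive an identifier from the serialized manifest, so that changing any single file hash changes the identifier. This is a sketch of the concept only, not the service worker's actual code. The `versionId` function and the `VersionedManifest` interface are names made up here.

```typescript
import { createHash } from "node:crypto";

// Minimal stand-in for ngsw.json: URL path -> content hash.
interface VersionedManifest {
  hashTable: Record<string, string>;
}

// Derive an identifier from the whole manifest; any change to any
// hash (or to the set of files) yields a different identifier.
function versionId(manifest: VersionedManifest): string {
  return createHash("sha1").update(JSON.stringify(manifest)).digest("hex");
}
```

Two builds that differ in only one bundle therefore produce two different identifiers, which is why the docs say a single updated file already counts as a new version.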

With the versioning behavior of the Angular service worker, an application server can ensure that the Angular app always has a consistent set of files.

Update checks

Every time the user opens or refreshes the application, the Angular service worker checks for updates to the app by looking for updates to the ngsw.json manifest. If an update is found, it is downloaded and cached automatically, and will be served the next time the application is loaded.

Resource integrity

One of the potential side effects of long caching is inadvertently caching an invalid resource. In a normal HTTP cache, a hard refresh or cache expiration limits the negative effects of caching an invalid file. A service worker ignores such constraints and effectively long caches the entire app. Consequently, it is essential that the service worker gets the correct content.

To ensure resource integrity, the Angular service worker validates the hashes of all resources for which it has a hash. Typically for an app created with the Angular CLI, this is everything in the dist directory covered by the user's src/ngsw-config.json configuration.

If a particular file fails validation, the Angular service worker attempts to re-fetch the content using a "cache-busting" URL parameter to eliminate the effects of browser or intermediate caching. If that content also fails validation, the service worker considers the entire version of the app to be invalid and it stops serving the app. If necessary, the service worker enters a safe mode where requests fall back on the network, opting not to use its cache if the risk of serving invalid, broken, or outdated content is high.
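
The validate, retry, then give up flow described above can be sketched like this. It is an illustration under assumptions, not Angular's implementation: `fetchValidated`, `sha1Hex`, the `Fetcher` type, and the `cache-bust` query parameter name are all hypothetical, and the fetcher is synchronous and injected only to keep the example short and testable.

```typescript
import { createHash } from "node:crypto";

// Injected, synchronous stand-in for a network fetch (assumption for brevity).
type Fetcher = (url: string) => Uint8Array;

function sha1Hex(data: Uint8Array): string {
  return createHash("sha1").update(data).digest("hex");
}

// Fetch a resource and accept it only if its SHA-1 matches the manifest.
function fetchValidated(url: string, expectedHash: string, fetchBytes: Fetcher): Uint8Array {
  const first = fetchBytes(url);
  if (sha1Hex(first) === expectedHash) {
    return first;
  }
  // Retry once with a unique query parameter so that browser or CDN
  // caches cannot answer with the same stale copy.
  const separator = url.includes("?") ? "&" : "?";
  const second = fetchBytes(`${url}${separator}cache-bust=${Date.now()}`);
  if (sha1Hex(second) === expectedHash) {
    return second;
  }
  // Second failure: the caller should treat the whole version as invalid.
  throw new Error(`Hash mismatch for ${url} after cache-busting retry`);
}
```

A stale CDN copy is recovered by the retry, while a file that was really changed after ngsw.json was generated (as in the token replacement case from the other answer) fails both attempts.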

Hash mismatches can occur for a variety of reasons:

  • Caching layers in between the origin server and the end user could serve stale content.
  • A non-atomic deployment could result in the Angular service worker having visibility of partially updated content.
  • Errors during the build process could result in updated resources without ngsw.json being updated. The reverse could also happen resulting in an updated ngsw.json without updated resources.

HirenParekh answered Oct 08 '22 01:10