Automatically uploading new files to S3 with Golang

Pixelated golang gopher holding a light bulb

I recently needed to watch a directory for new files and have them immediately uploaded to S3. Because I needed this to run inside a Docker container, I wanted the code to be as small and lean as possible.

After some initial planning I settled on the following requirements:

  • Should be as lightweight as possible
  • As I'll be deploying via Docker, it should be 12-factor friendly and configurable via environment variables
  • Syncing must work recursively. Files within folders should be uploaded with their path relative to the watch directory preserved.
  • Syncing only needs to work for created files. Files deleted, renamed or moved do not need to be reflected in S3.

Choosing a language

I first considered writing a script in Node and using Chokidar to watch for file changes. Node seemed like a good fit given the concurrent nature of watching for changes and uploading files, but for such a simple app, bundling the complete Node runtime felt like overkill. Having ruled out bundling a runtime, the next logical choice was Go.

Go, like Node, is particularly good at running code concurrently but with the added benefit of compiling to a standalone executable binary.

Writing the app

My first task was to find a library to handle the file watching. A quick Google search led me to fsnotify on GitHub, which seemed popular at 3.4k stars. The library works by tapping into OS-specific file events. After a quick scan of the README I realised fsnotify doesn't watch files recursively by default, which didn't quite fit my requirements. This led me to the second result of my search, watcher, which has a more modest 600+ stars. Unlike fsnotify with its OS-specific file events, watcher uses polling.

Once instantiated, watcher can be configured to watch only for certain file events. Naturally, I chose to watch only for CREATE events.

// Instantiate watcher and only watch for CREATE events
w := watcher.New()
w.FilterOps(watcher.Create)

// Watch the configured directory recursively for changes
if err := w.AddRecursive(watchPath); err != nil {
  log.Fatalln(err)
}

// Start polling, checking for changes every watchInterval milliseconds
log.Printf("Watching: %v\n", watchPath)
if err := w.Start(time.Duration(watchInterval) * time.Millisecond); err != nil {
  log.Fatalln(err)
}

Watcher works by emitting events onto a channel, which you handle via a Go select statement inside a goroutine. The Event type embeds os.FileInfo, so we can use its IsDir method to ignore new directories. For each newly created file we spawn another goroutine to handle the upload to S3 (to see the actual implementation, check out the source code).

go func() {
  for {
    select {
    case event := <-w.Event:
      if event.IsDir() {
        continue
      }
      go upload(event.Path) // pseudo-code
    case err := <-w.Error:
      log.Fatalln(err)
    case <-w.Closed:
      return
    }
  }
}()
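One detail from the requirements is that the uploaded object's key must preserve the file's path relative to the watch directory. A small standard-library helper can derive it (the function name is hypothetical, not taken from the repo):

```go
package main

import (
	"path/filepath"
)

// objectKey derives the S3 object key for a newly created file,
// preserving its path relative to the watch directory.
func objectKey(watchDir, filePath string) (string, error) {
	rel, err := filepath.Rel(watchDir, filePath)
	if err != nil {
		return "", err
	}
	// S3 keys always use forward slashes, regardless of OS.
	return filepath.ToSlash(rel), nil
}
```

The upload goroutine can then pass this key to whichever S3 client it uses, such as the AWS SDK's s3manager uploader.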

Building with Docker

Finally I created a multi-stage Dockerfile which compiles the binary using the official Go image and then copies the build artifact into an image based on Alpine Linux. The final image comes out at only 18.4 MB. Nice and lean.
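A multi-stage Dockerfile along these lines produces that result (the base image tags, paths and ca-certificates step are illustrative assumptions, not copied from the repo):

```dockerfile
# Stage 1: compile a static binary with the official Go image
FROM golang:alpine AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /go-watch-s3 .

# Stage 2: copy only the build artifact into a minimal Alpine image
FROM alpine:latest
# CA certificates are needed for TLS connections to S3
RUN apk add --no-cache ca-certificates
COPY --from=build /go-watch-s3 /usr/local/bin/go-watch-s3
ENTRYPOINT ["go-watch-s3"]
```

Because the Go toolchain, source and build cache stay in the first stage, the final image carries nothing but Alpine and the single binary.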

To see the final result, check out go-watch-s3 on GitHub.


Hi, I'm Will

I'm a lead software engineer with over 13 years' experience, based in Melbourne, Australia. Got a project you'd like to discuss? Reach me below.