Automatically uploading new files to S3 with Golang
16 June, 2019
I recently needed to watch a directory for new files and have them immediately uploaded to S3. Because I needed this to run inside a Docker container I wanted the code to be as small and lean as possible.
After some initial planning I settled on the following requirements:
- Should be as lightweight as possible
- As I'll be deploying via Docker it should be 12-factor friendly by being configurable via environment variables
- Syncing must work recursively. Files within folders should be uploaded with their path relative to the watch directory preserved.
- Syncing only needs to work for created files. Files deleted, renamed or moved do not need to be reflected in S3.
Choosing a language
I first considered writing a script in Node and using Chokidar to watch for file changes. Node seemed like a good fit due to the concurrent nature of watching for file changes and uploading files; however, given the relatively simple nature of the app, bundling the complete Node runtime felt like overkill. Having ruled out bundling a runtime, the next logical choice was Go.
Go, like Node, is particularly good at running code concurrently but with the added benefit of compiling to a standalone executable binary.
Writing the app
My first task was to find a library to handle the file watching. A quick Google search led me to fsnotify on GitHub, which seemed popular at 3.4k stars. This library works by tapping into OS-specific file events. After a quick scan of the README I realised fsnotify doesn't watch files recursively by default, which didn't quite fit my requirements. This led me to the second result of my search, watcher, which has a more conservative 600+ stars. Unlike fsnotify with its OS-specific file events, watcher uses polling.
After instantiating watcher, it can be configured to watch only for certain file events. Naturally, I chose to watch only for CREATE events.
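A minimal setup along these lines might look like the following sketch (assuming the github.com/radovskyb/watcher package; the directory path and polling interval are illustrative):

```go
package main

import (
	"log"
	"time"

	"github.com/radovskyb/watcher"
)

func main() {
	w := watcher.New()

	// Only receive Create events; writes, renames and removes are ignored.
	w.FilterOps(watcher.Create)

	// Watch the directory and everything beneath it.
	if err := w.AddRecursive("./watch"); err != nil {
		log.Fatalln(err)
	}

	// Start polling for changes every 100ms (blocks until w.Close is called).
	if err := w.Start(100 * time.Millisecond); err != nil {
		log.Fatalln(err)
	}
}
```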
Watcher works by emitting events onto a channel, which you handle via a Go select statement inside a goroutine. The Event type composes os.FileInfo, so we can use the IsDir method to ignore new directories. For each newly created file we spawn another goroutine to handle the upload to S3 (to see the actual implementation, check out the source code).
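That event loop can be sketched as follows (again assuming the github.com/radovskyb/watcher package; the upload function here is a logging placeholder, not the project's real S3 code, and it uses filepath.Rel to preserve the path relative to the watch directory as the object key):

```go
package main

import (
	"log"
	"path/filepath"
	"time"

	"github.com/radovskyb/watcher"
)

// upload is a placeholder for the real S3 upload; it computes the object
// key from the path relative to the watch directory and logs it.
func upload(watchDir, path string) {
	key, err := filepath.Rel(watchDir, path)
	if err != nil {
		log.Println(err)
		return
	}
	log.Printf("uploading %s as %s", path, key)
}

func main() {
	watchDir := "./watch" // illustrative
	w := watcher.New()
	w.FilterOps(watcher.Create)

	go func() {
		for {
			select {
			case event := <-w.Event:
				// Event composes os.FileInfo, so IsDir is available directly.
				if event.IsDir() {
					continue
				}
				// Handle each new file in its own goroutine.
				go upload(watchDir, event.Path)
			case err := <-w.Error:
				log.Fatalln(err)
			case <-w.Closed:
				return
			}
		}
	}()

	if err := w.AddRecursive(watchDir); err != nil {
		log.Fatalln(err)
	}
	if err := w.Start(100 * time.Millisecond); err != nil {
		log.Fatalln(err)
	}
}
```

Spawning a goroutine per upload means a slow transfer never blocks the event loop from receiving the next file.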
Building with Docker
Finally, I created a multi-stage Dockerfile which compiles the binary using the official Go image and then copies the build artifact into an image based on Alpine Linux. The final image comes out at only 18.4 MB. Nice and lean.
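A multi-stage build along those lines might look like this (a sketch; the image tags and binary name are illustrative, not the project's actual Dockerfile):

```dockerfile
# Build stage: compile a static binary with the official Go image.
FROM golang:1.12-alpine AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /go-watch-s3 .

# Final stage: copy only the binary into a minimal Alpine image.
FROM alpine:3.9
RUN apk add --no-cache ca-certificates
COPY --from=build /go-watch-s3 /usr/local/bin/go-watch-s3
ENTRYPOINT ["go-watch-s3"]
```

Because only the compiled binary is copied into the final stage, the Go toolchain and source never ship in the runtime image, which is what keeps it small.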
To see the final result, check out go-watch-s3 on GitHub.