How we made encryption, data integrity, and authentication a priority.
In July 2016, Envato completed the move to HTTPS everywhere for Market. HTTPS is a protocol for secure communication over a computer network: it authenticates the website you are visiting and protects the privacy and integrity of the data exchanged.
This was no easy feat, since we are serving over 170 million pageviews a month on the Envato Market! That includes about 10 million products, all of which are user-generated content. Along the way, we learned some valuable lessons that we think will be helpful for anyone working on a similar HTTPS migration.
The idea for the HTTPS rollout started back in 2014, when some Envato engineers implemented a feature toggle that let staff opt in to HTTPS. For years, this functionality sat dormant and unused by most staff. Earlier this year, we decided to give it another push.
Why move to HTTPS?
HTTPS isn’t just about having a padlock or green indicator shown in the browser; it’s about creating a trusted connection between the end user and your services, via three protection layers:
– Encryption: Securing the exchanged data to prevent eavesdropping on the connections.
– Data integrity: Confidence that the data cannot be altered in transit without being detected.
– Authentication: Assurance that the website you are connecting to is the one you expect it to be.
An added benefit of migrating to HTTPS is that it unlocks HTTP/2 and features like request multiplexing and server push, which are great news for performance! In addition, Google announced in August 2014 that it uses HTTPS as a ranking signal; by migrating to HTTPS, you demonstrate your commitment to security for customers, which tells Google that its search engine can also trust your site.
The Market is built on user-managed content. The problem here is that many of our users don’t have the time or additional funds to implement things like content delivery network (CDN for short) caching. Without a CDN, most requests for user-managed content would need to hit the authors’ origin servers. This was bad for a few reasons:
– Many authors use shared hosting. During the testing phase, we generated a small amount of traffic for a particular set of assets, and some requests took over 20 seconds to complete! Load times this slow make for a very poor experience for buyers, and many people would look elsewhere because they couldn’t see previews or screenshots of the product quickly enough.
– If we intended to serve our pages under HTTPS, we needed to ensure the assets on the page were also served securely. The issue here is that it’s very unrealistic for Envato to force users to spend time (and potentially money) on updating all of their assets to be served via HTTPS to avoid seeing mixed content warnings on the item pages.
To solve both of these issues, we decided to use an approach that consisted of an image proxy and a CDN. The image proxy would rewrite all of the non-secure links at render time to point at our CDN, which would help speed up response times and cache the assets.
Initially we used camo, a program that was built by Corey Donohoe. He created it for GitHub where they needed to solve a similar issue. This worked well for us until we started trying to scale it to handle more traffic. GitHub solved the scaling issue by adding more worker processes; we tried adding clustering support so that we could utilise more of the hardware we already had in place. This didn’t solve the problem for long, and we eventually ended up back in the same position: we needed to resize our hardware to account for the additional load.
We looked for a better solution, and found go-camo, which is a Go port of Corey’s original project. For a while we ran the two implementations side by side and discovered that go-camo was able to better utilise all of the existing hardware (due to its ability to use more than a single operating system thread), and was easier to debug. After a few weeks of load testing, we decided to completely switch to go-camo.
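To make the link rewriting concrete, here is a minimal sketch of the render-time step, not our production code. It assumes the signed-URL scheme that camo documents as we understand it (an HMAC digest of the asset URL followed by the hex-encoded URL); the proxy hostname, the shared key, and the function names are all hypothetical, and a real renderer would use an HTML parser rather than a regular expression.

```python
import hashlib
import hmac
import re

# Hypothetical values for illustration only.
PROXY_BASE = "https://images.cdn.example.com"
SHARED_KEY = b"replace-with-a-real-secret"

# Matches src="http://..." attributes. This is deliberately simplistic.
INSECURE_SRC = re.compile(r'src="(http://[^"]+)"')


def proxied_url(asset_url: str, key: bytes = SHARED_KEY, base: str = PROXY_BASE) -> str:
    """Build a signed proxy URL of the form <base>/<digest>/<hex-encoded url>."""
    raw = asset_url.encode("utf-8")
    # The digest lets the proxy verify that we generated the link,
    # so it cannot be abused to fetch arbitrary URLs.
    digest = hmac.new(key, raw, hashlib.sha1).hexdigest()
    return f"{base}/{digest}/{raw.hex()}"


def rewrite_insecure_images(html: str) -> str:
    """Point every non-secure image source at the proxy; leave HTTPS sources alone."""
    return INSECURE_SRC.sub(lambda m: f'src="{proxied_url(m.group(1))}"', html)
```

Because the CDN sits in front of the proxy, repeated requests for the same signed URL can be served from cache instead of reaching the author's origin server.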
As you may know, Envato Market is built using Ruby on Rails. Rails lets you define how your cookies are handled. To continue with our incremental rollout, we needed user cookies to be accessible over both HTTP and HTTPS. We achieved this by omitting the Secure flag on cookies until, after the rollout, we were confident that we were not going to roll back.
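The effect of the Secure flag is easiest to see in the cookie header itself. The sketch below uses Python's standard library rather than Rails, purely for illustration; the cookie name and the `https_only` switch are hypothetical. In a Rails application the equivalent switch would be the `secure` option on the cookie or session store.

```python
from http.cookies import SimpleCookie


def session_cookie_header(value: str, https_only: bool) -> str:
    """Build the value of a Set-Cookie header for a hypothetical session cookie."""
    cookie = SimpleCookie()
    cookie["session_id"] = value
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["httponly"] = True
    if https_only:
        # With Secure set, browsers send the cookie over HTTPS only,
        # so a user who lands on an HTTP page would appear logged out.
        cookie["session_id"]["secure"] = True
    return cookie["session_id"].OutputString()


# During an incremental rollout: readable on both HTTP and HTTPS.
during_rollout = session_cookie_header("abc123", https_only=False)

# Once rolling back is off the table: HTTPS only.
after_rollout = session_cookie_header("abc123", https_only=True)
```

Leaving the flag off is a temporary trade-off: it keeps sessions working on both schemes, at the cost of the cookie still travelling over plain HTTP until the flag is turned on.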
One of the big concerns for teams looking to undertake an HTTPS migration is that they will incur a performance hit once it’s live. In most cases, that’s just not true. Deploying to a modern hardware and software setup and using a suitable cipher suite mitigates many of the performance bottlenecks that used to be associated with HTTPS.
In Envato’s case, we haven’t seen any performance impact, and our end user response time is consistent with the weeks prior to the HTTPS rollout.
Monitor All The Things
One of the most important things you can do during an HTTPS migration is Monitor All The Things. By having insight into the changes during the migration, you can quickly detect an issue before all your users do. During our rollout, some of the metrics we kept a very close eye on were:
– Exception rate
– Time spent in network requests
– End user response time
– Application response time
– Instance resource utilisation (CPU specifically)
– Total number of requests
– Edge network requests and the count by status code
During our rollout we identified a couple of issues, most notably a load balancer misconfiguration. We were seeing a CPU spike on a small subset of web instances. This was missed in all of our testing, but we managed to catch it before we rolled HTTPS out to all of our users.
Here are two graphs that we put together to keep everyone informed about how far through the rollout we were. The top graph is the initial rollout (mostly just staff usage), and the second is the full switch to HTTPS.
2016 has been a big year for SEO at Envato. We’ve kicked off many initiatives targeting better visibility for search engines into our author products. During early discussions, it was decided that we needed to be extra careful during the migration not to undo all of the hard work we’ve put into SEO in the last 7 months. To ensure that we didn’t go backwards, we took a couple of steps:
– We submitted both HTTP and HTTPS sitemaps to Google Webmaster Tools. In the week leading up to the swap over, we took a snapshot of our sitemaps and uploaded them into Google Webmaster Tools as a new set of sitemaps. This ensured that when we swapped over to HTTPS, Google would have access to both an HTTP and an HTTPS sitemap source, so it could continue crawling the HTTP sitemap while being led to the HTTPS version of the site.
– We updated our robots.txt files to specify the new sitemap location.
– We maintained 1:1 redirects: This helped ensure our users (and bots) still knew where to find us even though we moved to HTTPS.
– We updated internal site links to HTTPS: Don’t rely on the HTTP redirects!
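As an illustration of what a 1:1 mapping means, the sketch below changes only the scheme and keeps the host, path, query string, and fragment untouched, so every HTTP URL has exactly one HTTPS counterpart. The function name and the example hostname are hypothetical; in practice this logic usually lives in the web server or load balancer configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def https_equivalent(url: str) -> str:
    """Return the HTTPS counterpart of an HTTP URL, leaving other URLs unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url
    # Only the scheme changes; an explicit port in the host, if any, is kept as-is.
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))


# The same helper could be used to update internal links at render time,
# so visitors and crawlers are not bounced through a redirect first.
redirect_target = https_equivalent("http://market.example.com/item/123?ref=abc")
```

Redirecting every page to the HTTPS home page instead would break this property and throw away the deep links that search engines have already indexed.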
With these steps in place, 61% of the high-volume search terms we track have remained stable or improved their rankings since the HTTPS release. The remaining terms, which have moved backwards, were not on page 1 and have not actually lost us traffic or revenue.
SEO resources we suggest you keep handy (other than your in-house team): Patrick Stox’s SEO’s Guide to Securing a Website and Not Provided’s What Can Possibly Go Wrong with Migrating a Website to HTTPS.
The migration wasn’t completed overnight, and it took longer than we had planned, but we’ve managed to roll this out without any negative impacts on our users or application, which is something we are extremely proud of. Most importantly, the data on the Envato Market is safer and more secure than ever. Moving to HTTPS has a bad reputation for being a difficult undertaking, but it doesn’t have to be. We hope that this published case study of our experience will help others make the same transition!