CSS Wizardry

Time to First Byte: What It Is and Why It Matters

CSS Wizardry

I’m working on a client project at the moment and, as they’re an ecommerce site, there are a lot of facets of performance I’m keen to look into for them: load times are a good start, start render is key for customers who want to see information quickly (hint: that’s all of them), and client-specific metrics like “how quickly did the key product image load?” can all provide valuable insights.

However, one metric I feel that front-end developers overlook all too quickly is Time to First Byte (TTFB). This is understandable (forgivable, almost) when you consider that TTFB begins to move into back-end territory, but if I were to sum up the problem as succinctly as possible, I’d say:

While a good TTFB doesn’t necessarily mean you will have a fast website, a bad TTFB almost certainly guarantees a slow one.

Even though, as a front-end developer, you might not be in a position to make improvements to TTFB yourself, it’s important to know that a high TTFB will leave you on the back foot, and any efforts you make to optimise images, clear the critical path, and asynchronously load your webfonts will all be made in the spirit of playing catch-up. That’s not to say that more front-end oriented optimisations should be forgone, but there is, unfortunately, an air of closing the stable door after the horse has bolted. You really want to squish those TTFB bugs as soon as you can.

What is TTFB?

Figure: The TTFB timing entry isn’t particularly insightful.

TTFB is a little opaque, to say the least. It comprises so many different things that I often think we tend to just gloss over it. A lot of people surmise that TTFB is merely time spent on the server, but that is only a small fraction of the true extent of things.

The first thing that I want to draw your attention to, and often the most surprising for people to learn, is that TTFB counts one whole round trip of latency.
TTFB isn’t just time spent on the server; it is also the time spent getting from our device to the server and back again (carrying, that’s right, the first byte of data!).

Armed with this knowledge, we can soon understand why TTFB can often increase so dramatically on mobile. Surely, you’ve wondered before, the server has no idea that I’m on a mobile device, so how can it be increasing its TTFB?! The reason is that mobile networks are, as a rule, high-latency connections. If your Round Trip Time (RTT) from your phone to a server and back again is, say, 250ms, you’ll immediately see a corresponding increase in TTFB.

If there is one key thing I’m keen for you to take from this post, it is that TTFB is affected by latency.

But what else is TTFB? Strap yourself in; here is a non-exhaustive list presented in no particular order:

Latency: As above, we’re counting a trip out to and a return trip from the server. A trip from a device in London to a server in New York has a theoretical best-case time of 28ms over fibre, but this makes lots of very optimistic assumptions. Expect closer to 75ms. This is why serving your content from a CDN is so important: even in the internet age, being geographically closer to your customers is advantageous.

Routing: If you are using a CDN (and you should be!) a customer in Leeds might get routed to the MAN datacentre, only to find that the resource they’re requesting isn’t in that PoP’s cache. Accordingly, they’ll get routed all the way back to your origin server to retrieve it from there. If your origin is in, say, Virginia, that’s going to be a large and invisible increase in TTFB.

Filesystem reads: The server simply reading static files such as images or stylesheets from the filesystem has a cost. It all gets added to your TTFB.

Prioritisation: HTTP/2 has a (re)prioritisation mechanism whereby it may choose to stall lower-priority responses on the server while sending higher-priority responses down the wire. H/2 prioritisation issues aside, even when H/2 is running smoothly, these expected delays will contribute to your TTFB.

Application runtime: It’s kind of obvious really, but the time it takes to run your actual application code is going to be a large contributor to your TTFB.

Database queries: Pages that require data from a database will incur a cost when searching over it. More TTFB.

API calls: If you need to call any APIs (internal or otherwise) in order to populate a page, the overhead will be counted in your TTFB.

Server-Side Rendering: The cost of server-rendering a page could be trivial, but it will still contribute to your TTFB.

Cheap hosting: Hosting that is optimised for cost over performance usually means you’re sharing a server with any number of other websites, so expect degraded server performance which could affect your ability to fulfil requests, or may simply mean underpowered hardware trying to run your application.

DDoS or heavy load: In a similar vein to the previous point, increased load with no way of auto-scaling your application will lead to degraded performance where you begin to probe the limits of your infrastructure.

WAFs and load balancers: Services such as web application firewalls or load balancers that sit in front of your application will also contribute to your TTFB.

CDN features: Although a CDN is a huge net win, in certain scenarios its features (for example, request collapsing, edge-side includes, etc.) could lead to additional TTFB.

Last-mile latency: When we think of a computer in London visiting a server in New York, we tend to oversimplify that journey quite drastically, almost imagining that the two were directly connected. The reality is that there’s a much more complex series of intermediaries from our own router to our ISP; from a cell tower to an undersea cable. Last-mile latency deals with the disproportionate complexity toward the terminus of a connection.
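To make the latency item above concrete, here is a rough back-of-the-envelope sketch of where a figure like 28ms can come from. The distance, the speed of light in a vacuum, and the fibre slowdown factor are all assumed values chosen for illustration; none of them come from the article, and real routes are longer and slower than a great-circle path.

```python
# Illustrative assumptions only: none of these constants come from the article.
GREAT_CIRCLE_KM = 5_570        # approximate London to New York distance
SPEED_OF_LIGHT_KM_S = 299_792  # speed of light in a vacuum
FIBRE_SLOWDOWN = 1.5           # light in glass travels at roughly 2/3 of c


def one_way_ms(distance_km: float, slowdown: float = FIBRE_SLOWDOWN) -> float:
    """Best-case one-way propagation time over fibre, in milliseconds."""
    speed_km_s = SPEED_OF_LIGHT_KM_S / slowdown
    return distance_km / speed_km_s * 1000.0


def round_trip_ms(distance_km: float, slowdown: float = FIBRE_SLOWDOWN) -> float:
    """Best-case round trip: out to the server and back again."""
    return 2 * one_way_ms(distance_km, slowdown)


if __name__ == "__main__":
    print(f"one way:    {one_way_ms(GREAT_CIRCLE_KM):.1f}ms")
    print(f"round trip: {round_trip_ms(GREAT_CIRCLE_KM):.1f}ms")
```

Under these assumptions the one-way figure comes out at roughly 28ms, so the floor for a full round trip (which is what TTFB counts) is about double that, before any server work or last-mile overhead is added.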
It’s impossible to have a 0ms TTFB, so it’s important to note that the list above does not represent things that are necessarily bad or slowing your TTFB down. Rather, your TTFB represents any number of the items present above. My aim here is not to point fingers at any particular part of the stack, but instead to help you understand what exactly TTFB can entail. And with so much potentially taking place in our TTFB phase, it’s almost a miracle that websites load at all!

So. Much. Stuff!

Demystifying TTFB

Thankfully, it’s not all so unclear anymore! With a little bit of extra work spent implementing the Server Timing API, we can begin to measure and surface intricate timings to the front-end, allowing web developers to identify and debug potential bottlenecks previously obscured from view.

The Server Timing API allows developers to augment their responses with an additional Server-Timing HTTP header which contains timing information that the application has measured itself.

This is exactly what we did at BBC iPlayer last year:

Figure: The newly-available Server-Timing header can be added to any response.

N.B. Server Timing doesn’t come for free: you need to actually measure the aspects listed above yourself and then populate your Server-Timing header with the relevant data. All the browser does is display the data in the relevant tooling, making it available on the front-end:

Figure: Now we can see, right there in the browser, how long certain aspects of our TTFB took.

To help you get started, Christopher Sidebottom wrote up his implementation of the Server Timing API during our time optimising iPlayer.

It’s vital that we understand just what TTFB can cover, and just how critical it can be to overall performance. TTFB has knock-on effects, which can be a good thing or a bad thing depending on whether it’s starting off low or high.

If you’re slow out of the gate, you’ll spend the rest of the race playing catch-up.
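As a minimal sketch of the Server-Timing idea described above, the snippet below times a few phases of a request and assembles them into a header value. It is not the iPlayer implementation: the `ServerTimings` class and the metric names `db` and `app` are hypothetical, and it assumes descriptions are simple strings that need no escaping.

```python
import time
from contextlib import contextmanager


class ServerTimings:
    """Hypothetical helper that collects timings for a Server-Timing header."""

    def __init__(self):
        self._entries = []  # (name, duration in ms, optional description)

    def record(self, name, duration_ms, description=None):
        """Store a duration that was measured elsewhere."""
        self._entries.append((name, duration_ms, description))

    @contextmanager
    def measure(self, name, description=None):
        """Time the wrapped block and record it under the given metric name."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.record(name, elapsed_ms, description)

    def header_value(self):
        """Build the value for the Server-Timing response header."""
        parts = []
        for name, duration_ms, description in self._entries:
            part = f"{name};dur={duration_ms:.1f}"
            if description:
                # Assumes the description contains no quotes or backslashes.
                part += f';desc="{description}"'
            parts.append(part)
        return ", ".join(parts)


if __name__ == "__main__":
    timings = ServerTimings()
    timings.record("db", 53.0, "Database queries")
    timings.record("app", 47.2)
    print("Server-Timing:", timings.header_value())
```

In a real application you would wrap the relevant work in `with timings.measure("db"):` and attach the resulting string to the response as the `Server-Timing` header; how that header is set depends on your framework.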

Self-Host Your Static Assets

CSS Wizardry

One of the quickest wins—and one of the first things I recommend my clients do—to make websites faster can at first seem counter-intuitive: you should self-host all of your static assets, forgoing others’ CDNs/infrastructure.

Performance Budgets, Pragmatically

CSS Wizardry

One of the key tools that performance engineers have at their disposal is the Performance Budget: it helps us—or, more importantly, our clients—ensure that any performance-focused work is monitored and maintained after we’ve gone.

Making Cloud.typography Fast(er)

CSS Wizardry

Disclaimers: I was not approached or hired by Hoefler&Co or Cloud.typography to look into any of the following issues. I disclosed all of the below to Cloud.typography and gave them ample opportunity to work together to solve the issues at the root of the problem.

CSS and Network Performance

CSS Wizardry

Despite having been called CSS Wizardry for over a decade now, there hasn’t been a great deal of CSS-related content on this site for a while. Let me address that by combining my two favourite topics: CSS and performance.

Image Inconsistencies: How and When Browsers Download Images

CSS Wizardry

This year, I’ve been working closely with the wonderful Coingaming team out in beautiful Tallinn. We’ve been working pretty hard on making their suite of online products much faster, and I’ve been the technical consultant leading the project.

Tips for Technical Interviews

CSS Wizardry

Yesterday, I spoke at ITKonket in Kragujevac, Serbia. During the Q&A after my talk, one great and non-technical question I got was for general advice on interviewing at tech companies. I decided to write down (and expand on) my answer in the hope that it might help someone else, too. Disclaimer. I don’t claim to be an authority on interviewing. I don’t think this article is definitive or gospel.

The Three Types of Performance Testing

CSS Wizardry

A lot of companies—even if they are aware that performance is key to their business—are often unsure of how, when, or where performance testing sits within their development lifecycle. To make things worse, they’re also usually unsure whose responsibility performance measuring and monitoring is.

Bandwidth or Latency: When to Optimise for Which

CSS Wizardry

When it comes to network performance, there are two main limiting factors that will slow you down: bandwidth and latency. Bandwidth is defined as the maximum rate of data transfer across a given path.

ITCSS × Skillshare

CSS Wizardry

Back in February 2018, Scott Sullivan, Partnerships Team Lead at Skillshare, sent me an email asking if I’d be interested in collaborating on an official ITCSS video course in conjunction with them. The email was extremely well timed.

Getting to Know a Legacy Codebase

CSS Wizardry

The other day, Brad dropped me a message asking me about the topic of getting to know a brand new (specifically CSS) codebase. The kind of codebase that no one person truly understands any more; the kind of codebase that’s had a dozen different contributors over just as many years; the kind of codebase that’s never had a full-scale refactor or overhaul, but that’s grown organically over time and changed with new techniques, styles, and trends. (Un)fortunately,

What If?

CSS Wizardry

I was recently conducting some exploratory work for a potential client when I hit upon a pretty severe flaw in a design decision they’d made: They’d built a responsive image lazyloader in JavaScript which, by design, worked by: immediately applying display: none; to the ; waiting until the very last of the page’s images had arrived; once they’d arrived, removing the display: none; and gradually fading the page into visibility.

Cache-Control for Civilians

CSS Wizardry

The best request is the one that never happens: in the fight for fast websites, avoiding the network is far better than hitting the network at all. To this end, having a solid caching strategy can make all the difference for your visitors. How is your knowledge of caching and Cache-Control headers? (Harry Roberts, @csswizardry, 3 March 2019)

My Digital Music Setup

CSS Wizardry

I want to begin this post with a disclaimer: I’m not an audiophile, and I don’t claim to be particularly knowledgeable when it comes to music technology. If I sound like I don’t have a clue what I’m talking about, that’s probably because I don’t.

Identifying, Auditing, and Discussing Third Parties

CSS Wizardry

A large part of my performance consultancy work is auditing and subsequently governing third-party scripts, dependencies, and their providers.

Beam-Up Load Balancing: The Portable Next Generation App Experience

DZone

We’ve seen many of the technological advances described in the Star Trek milieu become reality over the last 50 years, from personal communication devices and instant translators to GMOs, medical robots, 3D printing and weapons that stun. But, ah yes, the matter transporter.

Best Practice for Creating Indexes on your MySQL Tables

Scalegrid

By having appropriate indexes on your MySQL tables, you can greatly enhance the performance of SELECT queries. But, did you know that adding indexes to your tables in itself is an expensive operation, and may take a long time to complete depending on the size of your tables?

New event type helps avoid unnecessary alerts for planned host downscaling

Dynatrace

Dynatrace news. Modern service infrastructure depends heavily on IT’s ability to dynamically scale the number of hosts up or down, depending on the expected workload.

Customize Dynatrace analysis timeframes as never before with the new global timeframe selector

Dynatrace

Dynatrace news. The timeframe selector is one of the most widely used UI controls in Dynatrace.

Important Health Checks for your MySQL Master-Slave Servers

Scalegrid

In a MySQL master-slave high availability (HA) setup, it is important to continuously monitor the health of the master and slave servers so you can detect potential issues and take corrective actions. In this blog post, we explain some basic health checks you can do on your MySQL master and slave nodes to ensure your setup is healthy.

Introducing Digital Business Analytics: AI-powered real-time answers for better business outcomes

Dynatrace

Dynatrace news. Traditionally, it’s critical for Dev and Ops teams to be able to quickly discover and remediate application performance and customer-facing issues.

Principles to Handle Thousands of Connections in Java Using Netty

DZone

The C10K problem is the challenge of handling ten thousand concurrent connections.

Moore's Law is not Ending Soon and the Reason May Surprise You

High Scalability

Jim Keller recently gave a fascinating and far ranging interview on the AI Podcast. You can find it at Moore's Law, Microprocessors, Abstractions, and First Principles. One of the many topics of discussion was the often predicted death of Moore's Law.

Your Guide to Automated Testing [Article and Tutorials]

DZone

It's time to automate your testing process! What Is Automated Testing?

GraphQL Search Indexing

The Netflix TechBlog

By Artem Shtatnov and Ravi Srinivas Ranganathan. Almost a year ago we described our learnings from adopting GraphQL on the Netflix Marketing Tech team. We have a lot more to share since then!

Top Automation Testing Trends To Look Out In 2020

DZone

Quality Assurance (QA) is at the point of inflection and it is an exciting time to be in the field of QA as advanced digital technologies are influencing QA practices.

Easily migrate your OneAgent from one tenant or server to another

Dynatrace

Dynatrace news. Deployment of OneAgent is really easy. You just run the installer—no parameter configurations required—and OneAgent takes care of the rest.

Improve user experience with more visibility into CDN-related HTTP errors (Part 1) 

Dynatrace

Dynatrace news. Modern web applications rely heavily on Content Delivery Networks (CDNs) and 3rd-party integrations (for example, web analytics, tag managers, chat bots, A/B testing tools, ad providers, and more).

Manage thousands of hosts with the new OneAgent on a host REST API (Preview)

Dynatrace

Dynatrace news. Dynatrace helps you monitor hyper-complex environments where tribal knowledge about entities and their relationships isn’t sufficient. To achieve this, Dynatrace provides important components such as OneAgent and ActiveGate.

Testing Asynchronous Operations in Spring With Spock and Byteman

DZone

This is the second article that describes how to test asynchronous operations with the Byteman framework in an application using the Spring framework.

Autonomous Cloud Enablement aka Scaling NoOps via Self-Service

Dynatrace

Dynatrace news. Autonomous Cloud is not another lofty marketing term. Autonomous Cloud is what enables our globally distributed development teams at Dynatrace to deliver better software faster following our NoOps approach: Fully Autonomous and as a Self-Service!

Upcoming Software Testing Trends in 2020

DZone

The projections are in! Check out these testing trends! The software development landscape continues to evolve with DevOps and Agile development methods taking over traditional approaches. The advent of these methods has led to the innovation and use of new testing techniques.

Perform 2020: Transform the way you work – Product update

Dynatrace

Dynatrace news. Across both his day one and day two mainstage presentations, Steve Tack, SVP of Product Management, described some of the investments we’re making to continue to differentiate the Dynatrace Software Intelligence Platform.

Stuff The Internet Says On Scalability For February 14th, 2020

High Scalability

Wake up! It's HighScalability time: Visualize the huge scale of Deep Time by identifying key reference points along the way. Do you like this sort of Stuff? Without your support on Patreon Stuff won't happen.

Reimagining Experimentation Analysis at Netflix

The Netflix TechBlog

By Toby Mao, Sri Sri Perangur, and Colin McFarland. Another day, another custom script to analyze an A/B test. Maybe you’ve done this before and have an old script lying around. If it’s new, it’s probably going to take some time to set up, right? Not at Netflix.

Essential Suite - Artwork Producer Assistant

The Netflix TechBlog

By Hamid Shahid & Syed Haq. Netflix continues to invest in content for a global audience with a diverse range of unique tastes and interests.

How Dynatrace and ServiceNow Event Management provide deep observability and rapid resolution

Dynatrace

Dynatrace news. Businesses know that any service disruption can have detrimental business impact.

Dynatrace & ServiceNow feed and enrich the CMDB, automatically.

Dynatrace

Dynatrace news. Modern microservices infrastructures commonly contain thousands of individual business-critical services and related dependencies. Managing highly dynamic service and application infrastructures with a CMDB can be cumbersome and error-prone.

Multidimensional analysis 2.0: Analyze microservice-based metrics without code changes (Part 2)

Dynatrace

Dynatrace news. In Part 1 of this blog series, we presented a few Dynatrace customer use cases for multidimensional analysis.

Are Times still Good for Load Testing?

Alex Podelko

My post Good Times for Load Testing was published in 2014. It is difficult to believe that 5 years passed… Are times still good for load testing? Well, yes and no. I am not so upbeat as I was in 2014.

Managing High Availability in PostgreSQL – Part III: Patroni

Scalegrid

In our previous blog posts, we discussed the capabilities and functioning of PostgreSQL Automatic Failover (PAF) by Cluster Labs and Replication Manager (repmgr) by 2ndQuadrant. In the final post of this series, we will review the last solution, Patroni by Zalando, and compare all three at the end so you can determine which high availability framework is best for your PostgreSQL hosting deployment. Managing High Availability in PostgreSQL – Part I: PostgreSQL Automatic Failover.

Selenium WebDriver and TestNG: Find Perfect Match for Automation Testing

DZone

The manual testing process has been replaced by automated testing during recent years. Selenium automation testing increases the effectiveness and efficiency of the testers and allows them to leverage various benefits at the same time.