Stuff The Internet Says On Scalability For November 15, 2019

 Wake up! It's HighScalability time:


Which exoplanet do you want to haunt? (JPL)

Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. I also wrote Explain the Cloud Like I'm 10 for all who need to understand the cloud. On Amazon it has 61 mostly 5 star reviews (137 on Goodreads). Please recommend it. You'll be a real cloud hero.

Number Stuff: 

  • 24,620,491+: free scholarly articles in an open database.
  • 20TB: Seagate hard drive to ship in 2020.
  • 160: bits per second at which Voyager 2 sends data back to mother Earth. Launched in 1977, the Voyagers were designed for a 5 year mission.
  • 75.6 GB: data stored on glass. Infrared lasers encode the data into voxels, a three-dimensional version of pixels, which are stored in the glass, and machine learning algorithms can decode the patterns to read back the data.
  • 5%: of Galileo capacity is lost to software problems likely in the Orbit Synchronization Processing Facility (OSPF), run by GMV.
  • 273bn: global 2018 spend on digital advertising. 
  • 232M: Spotify monthly active users. 500 distinct Event Types, 8M events per second at peak, 350 TB of events (raw data).
  • 5%: of IBM Mac users called tech support vs 40% for PC users. IBM saves anywhere from $273 to $543 when its end users choose Mac over PC.
  • lo: first word transmitted on the internet.
  • 60: more Starlink satellites launched by SpaceX.
  • 115,151: Backblaze hard drives. 
  • 3.2 billion: fake Facebook accounts obliterated. More than doubles the number of fake accounts taken down during the same period last year.
  • 38Gbps: delivered by Google's Snap (user space microkernel)/Pony (ground-up implementation of networking primitives) using 1.05 cores vs 22Gbps using 1.2 cores for the baseline.
  • $23 to $500: cost of a phishing attack. All attacks started with an email lure to the victim account.
  • 1.4 terabits per second: with 256 NVMe Drives
  • 66: days to hire a new tech employee, up from 55 days in 2016.
  • 500,000: workers needed to be added to the U.S. cybersecurity workforce, a 62 percent increase.
  • #1: Python now more popular than Java on GitHub. Dart was the fastest-growing language between 2018 and 2019, with usage up a massive 532%. It was followed by the Mozilla-developed Rust, which grew a respectable 235%. 
  • 40 million+: GitHub users. More than 10 million new users, 44 million repositories created and 87 million pull requests in the last 12 months. 80% of GitHub users are outside of the US.
  • 36: additional deaths per 10,000 heart attacks annually because of ransomware. 
  • 40%: productivity increase from a glorious four-day work week. 
  • 68.14 GB: bandwidth served in October by Cloudflare.

Quotable Stuff:

  • @davidgerard: ok coiner
  • Geoff Huston: We have reached a somewhat sad moment when it is clear that the DNS has been entirely co-opted into this regime. Sadder still to think that if this is a new realm of national sovereignty then our existing nation-state world order is just simply not able to engage with the new IT corporate nation-states in any manner that can curb their overarching power to defend their chosen borders. The 1648 Peace of Westphalia has much to teach us, and not all of the lesson is pleasant.
  • Jesse Frederik: The experiment continued for another eight weeks. What was the effect of pulling the ads? Almost none. For every dollar eBay spent on search advertising, they lost roughly 63 cents, according to Tadelis’s calculations. The experiment ended up showing that, for years, eBay had been spending millions of dollars on fruitless online advertising excess, and that the joke had been entirely on the company....In seven of the 15 Facebook experiments, advertising effects without selection effects were so small as to be statistically indistinguishable from zero.
  • Salvatore Sanfilippo: I think Redis is entering a new stage where there are a number of persons that now actively daily contribute to the open source. It’s not just “mostly myself”, and that’s great.
    Redis modules are playing an interesting role, we see Redis Labs creating modules, but also from the bug reports in the Github repository, I think that there are people that are writing modules to specialize Redis for their own uses, which is great.
  • @ArielDumas: I'm looking at Wayfair and my phone just rang -  an unknown number. Picked it up, and it was a Wayfair employee saying they noticed I was browsing their website so happy creepy Halloween I guess.
  • jaaron: In my experience, the most common issues with complex distributed systems are much more likely to be due to misconfiguration because of a limited understanding of the systems involved than such issues are to be caused by core, underlying bugs. And I believe that's why some engineers shy away from otherwise valuable frameworks and platforms: they have a natural and understandable bias to solve problems via engineering (writing code) than via messing with configuration parameters.
  • Jana Iyengar: In other words, QUIC is as simple as the modern internet demands, which is not very simple in absolute terms.
  • @willsalsa76: Open Core vs Open Source. I think Open Core is somewhat the Shareware of Open Source. Now also, these companies want to make money. Landscape changed: it's no longer a benevolent volunteer effort.
  • @kateMorris102: News flash: most of my code never went live as projects were constantly canned. It was a strangely lucrative, low risk, deeply unsatisfying professional experience contracting in the late 1990’s.
  • Melanie Lefkowitz: A study by Cornell University researchers found bitcoin transaction fees may neutralize the cryptocurrency's long-term usefulness and add to energy waste. As the cryptocurrency has grown, so has the wait for transactions to be added to the blockchain ledger, and users pay fees to speed that up. However, Cornell's Maureen O'Hara warns the fees' practicality could be negated by their cost. Said O'Hara, "If everybody's paying a transaction fee now, then you may end up in the same situation that you were in before—the fees got high and you have to wait anyway." The transaction backlog also requires an enormous amount of energy, due to blockchains' massive computing energy requirements. O'Hara said higher fees encourage more blockchain users or miners to compete to solve the mathematical problems that yield bitcoin, causing energy use to spike.
  • @riking27loud: This is spot on. I recently did a market review for a personal product of mine, running 4 processes that each need 400MB ram, 100millicpu, and ~120kBps network inbound. The job scheduling here is very easy: they all go on a single DO droplet.
  • @CoverosGene: Netflix does 4000 deploys per day. Up from 2000 pre-Spinnaker. 4500 pipelines running per day. @aglover #AgileDevOpsCon
  • lexeichemenda: A digital channel can deliver a solid ROI (>200%) at $10K-$50K / month. Advertiser is excited, wants to scale to $500K / month. ROI drops to 110%. Woops, not as good. So what does advertiser do? Advertiser finds the max scale they can run at to maintain an acceptable level of ROI (for ex, 140%) and that is $100K / month of spend on that channel. The interesting shift we're seeing is that historically, advertisers just went on and multiplied the number of channels, spending $10K / mo on channel 1, $50K / mo on channel 2, $500K / mo on channel 3. However, the cost of maintaining each channel and optimizing is greater than the added value. So current trend we're seeing is consolidation of this spend, and understanding that they won't be able to spend as much on ads since they still need that 140% ROI, but only on a few channels. As to measurement, incrementality measurement (usually two methods, ITT (intention to treat, divide your entire audience in 2 parts and show ads to only 1 of the group) or ghost ads (described below)) delivers a very clean metric as to whether ad spend is bringing any sort of value and how much value it actually brings. Assuming a healthy p-value is present (aka, assuming advertiser is running enough marketing spend $ that results are significant), that's your answer to how much more you should invest on the current marketing campaigns (or it will show that you need to change your campaigns because current ones are not performing)
  • @0xdabbad00: AWS historically (in my opinion) had been fairly equivalent for everyone. More and more it is becoming a world of haves and have nots. The $100M/yr customers get advance notice of things and everyone else gets the rug pulled out from under them.
  • Lifespan: If the genome were a computer, the epigenome would be the software. It instructs the newly divided cells on what type of cells they should be and what they should remain, sometimes for decades, as in the case of individual brain neurons and certain immune cells.
  • Heather Piwowar: One interesting realization from the modeling we’ve done is that when the proportion of papers that are OA [Open Access] increases, or when the OA lag decreases, the total number of views increase -- the scholarly literature becomes more heavily viewed and thus more valuable to society.
  • @jim_dowling: If you care about HA and cost in the cloud, then synchronous replication protocols come back. Reads are local to an availability zone, reducing interAZ traffic costs. Non-blocking 2PC is also a thing.
  • @joe_hellerstein: “Stability is just as important as consistency and performance, but gets way less attention”. — @MarcJBrooker #hpts2019
  • @kmett: RAFT was deliberately designed to be understandable, decomposable and able to get right. It came at the cost of more messaging overhead than Paxos variants. It also makes leader election way simpler by requiring the most current node to be made leader, and this has its own costs.
  • jessitron: We once layered by serving data to software. Now we layer by serving value to people.
  • Steve Tadelis: What Randall is trying to say is that marketeers actually believe that their marketing works, even if it doesn’t. Just like we believe our research is important, even if it isn’t.
  • @dhh: So let’s recap here: Apple offers a credit card that bases its credit assessment on a black-box algorithm that 6 different reps across Apple and GS have no visibility into. Even several layers of management. An internal investigation. IT’S JUST THE ALGORITHM!
  • @sfiscience: 4 Principles of Collective #Computation: 1 Ground truth (objective reality) 2 Effective ground truth (what we agree on, whether or not it is accurate) 3 #Information can be collectively encoded in #networks 4 Outputs are a product of collective dynamics
  • Melanie Mitchell: Can you learn cause and effect from data?
  • gt565k: Can't wait to see every Tesla become a wifi hotspot hooked up to Starlink. Traditional ISPs are in trouble if the Starlink latency is really in the sub 100ms due to the LEO distance advantage. Global coverage with decent latency!
  • Avery Segal: A 2017 survey found that 70 percent of Chinese netizens found carrying cash unnecessary. Since then, mobile payment adoption has continued to rise. Vxiaocheng’s mobile system features an in-app button where attendees can tip the bar staff and live musicians; such online tips can double a musician’s income. Because Vxiaocheng aggregates data on musicians over time, it has also become a valuable hiring platform for bar owners to browse more than 3,000 musicians.
  • @sfiscience: "Not every dynamical system is computing something. There has to be an interpreter." - @MelMitchell1
  • @brendangregg: Did Linux get slower after 4.14? Yes, if you use the KPTI defaults. How much? Between 1 and 800%, depending on your workload. I expected our workloads to slow by between 0.1 and 6%.
  • @sfiscience: "We're actually living in the *aftermath* of the #singularity. Those artificial intelligences are the institutions, the corporations...they've been in charge for a hundred years, and we're living in the #postsingularity nightmare." - @cshalizi
  • reaperducer: Google Capone: "Hey, nytimes.com, your site loads awful slow. We're gonna have to put this badge of shame on it for everyone to see. Now, if you just dumped your other ad networks and ran everything through us, I bet it would load much faster and that badge might magically disappear..."
  • bane: I've been having this hard to articulate notion that humanity hasn't really begun to fathom the changes that will come with "cheap" launch vehicles.
  • @QuinnyPig: Wow. So AWS revenue dwarfs “everything Google sells that isn’t ads.”
  • briffle: I'm overall pretty happy with GCP, but wish they would better isolate their availability zones.. I have yet to see a single AZ problem, it's often a full Region, or global, and that is not great in a cloud world..
  • @swardley: It's also why when people solely focus on money as the measure of success then I tend to measure those same people in terms of a Pablo Escobar ($30Bn) i.e. you're only worth a 1/1000th of a Pablo. People tend to get upset with that for some reason.
  • Andrei Frumusanu: SiFive’s design goals for the U8-Series [RISC-V OoO CPU Core] are quite straightforward: Compared to an Arm Cortex-A72, the U8-Series aims to be comparable in performance, while offering 1.5x better power efficiency at the same time as using half the area. The A72 is quite an old comparison point by now, however SiFive’s PPA targets are comparatively quite high, meaning the U8 should be quite competitive to Arm’s latest generation cores.
  • @jayapapaya: When ppl go like "everything is a network" or "everything is a market" or like "everything is energy" or whatever "everything" - you know they are trying to win you over to whichever religion they are part of 
  • César Hidalgo: What I try to communicate in why information grows, I don’t know if I succeed, but what I try to communicate is, at the end of the day, you have these systems that have a finite ability to accumulate knowledge, to accumulate that capacity to make. The only way that those systems can transcend that limited capacity is by developing collective phenomenon, collective systems that include multiple units. You go from single cellular organisms to multicellular organisms because you could never achieve the level of complexity of a multicellular organism with a single-celled organism, but multicellular organisms, they peak at the human.
  • @cloud_opinion: If you don't understand how big AWS is in Cloud, consider this metric: AWS makes more money on re:Invent registration fees this year than Oracle+IBM cloud revenues.
  • David J. Epstein: But the game’s strategic complexity provides a lesson: the bigger the picture, the more unique the potential human contribution. Our greatest strength is the exact opposite of narrow specialization. It is the ability to integrate broadly.
  • Donald Hoffman: Steven Pinker sums up the argument well: “We are organisms, not angels, and our minds are organs, not pipelines to the truth. Our minds evolved by natural selection to solve problems that were life-and-death matters to our ancestors, not to commune with correctness.”
  • @tracyalloway: You can replace the term "distributed ledgers" with "shared Excel sheets" in 90% of talk about blockchain and finance.
  • @decimalator: kubernetes - turning things off and on again, at scale
  • marrakech07: So clickbaity. Salesforce.com is not moving to Azure. Marketing Cloud is, which makes sense because it's built on .NET anyway.
  • Stephanie Sherriff: The quick answer is we moved to using gRPC microservices with protobuf schemas. Kafka was cut completely from the stack. The benefits have been considerable.
  • @raphaelsoeiro: When I was a pre sales engineer at Google, I would always pitch the Titan chip as a security benefit/advantage of GCP. Now yet another service/product being open sourced. In my opinion this just diminishes the value add and importance of GCP..
  • dogfish182: For my next project I would want to go all in on one cloud and try to max out your availability on that one cloud, multicloud is a difficulty multiplier
  • Karl Bode: ISPs Cut Back 2020 Investment Despite Tax Breaks, Death Of Net Neutrality
  • tabtab: The physical hardware for the computers of 70's probes had more parts and complexity. Voyager used magnetic tape recorders, for example. Newer tech has allowed for simpler computer hardware, but does shift problems into the realm of software and file system management. Both Spirit (Mars) and New Horizons (Pluto) had down days as issues with file system management puzzled the IT staff. The New Horizons case was a crazy mad-scramble, as the probe was scheduled to pass by Pluto in a few days whether the probe was working or not. There was no re-do. Dozens of choice careers were in the balance. Probe chips are still not very powerful by today's standards because they are designed to work in the harsh conditions of space. Thus, they are more comparable to a 1980's PC, and may mostly stay that way, since smaller components don't handle radiation well. I read somewhere it's estimated that even surface-reaching cosmic radiation fouls up the typical desktop PC roughly once a year. Most just grumble at Microsoft and reboot.
  • Luke Wagner: WebAssembly is changing the web, but we believe WebAssembly can play an even bigger role in the software ecosystem as it continues to expand beyond browsers. This is a unique moment in time at the dawn of a new technology, where we have the opportunity to fix what’s broken and build new, secure-by-default foundations for native development that are portable and scalable. But we need to take deliberate, cross-industry action to ensure this happens in the right way.

Useful Stuff:

  • Is cloud infrastructure a zero-sum game? Serverless: Is It The Kubernetes Killer? says no: "Serverless isn't here to destroy Kubernetes. The cloud infrastructure space race isn't a zero-sum game. Kubernetes is an obvious evolution following OpenStack and can be run successfully inside of it. Serverless is another tool in the belt of forward-thinking development teams." I'm not so sure. Technological evolution sees one technology supplant another without complete replacement. We still have radio and TV. We still have TV and streaming. We still have horses and cars. Yet nobody would say radio, TV, and horses are growing industries.

  • Serverlessconf New York videos are now available. The dark mode theme is mostly unreadable so you'll just have to go look for yourself.

  • Starlink is a very big deal
    • Starlink was born conceptually in 2012 when SpaceX realized that its customers, primarily comsat providers, had better margins than they did...Because SpaceX can launch their satellites for about a tenth of the price (per kg) of the original Iridium constellation, they’re able to address a substantially more inclusive market.
    • Starlink’s world-spanning internet will bring high quality internet access to every corner of the globe. For the first time, internet availability will depend not on how close a particular country or city comes to a strategic fiber route, but on whether it can view the sky.
    • Assuming the antenna can support 100 separate beams, and each beam can transmit at 100MB per second using advanced coding such as 4096QAM, the satellite generates $1000 of revenue per orbit, assuming a subscriber cost of $1/GB. This is sufficient to earn back the $100k deployment cost in only a week, greatly simplifying the capital structure. The remaining 29,900 orbits are profit, once fixed costs are accounted for. (A back-of-the-envelope check of this arithmetic follows this list.)
    • Even taking into account its ludicrously low usage fraction, a Starlink satellite can deliver 30 PB of data over its lifetime at an amortized cost of $0.003/GB, with practically no marginal cost increase for transmission over a longer distance.
    • It’s not obvious that internet satellites are the way to go. SpaceX, and only SpaceX, is in a position to rapidly build out an enormous internet constellation, because only SpaceX had the vision to spend a decade struggling to break the government-military monopoly on space launch. 
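
    Taking the article's figures at face value, the arithmetic hangs together: $1000 per orbit at $1/GB implies roughly 1 TB delivered per orbit, and a five year life gives close to 30,000 orbits. A minimal sanity check, where the ~95 minute orbit period and 5 year lifetime are my assumptions, not the article's:

    ```python
    # Back-of-the-envelope check of the Starlink figures quoted above.
    # Assumptions (mine, not the article's): ~5 year life, ~95 minute orbit.
    ORBITS = round(5 * 365.25 * 24 * 60 / 95)   # ~27,700; the article implies ~30,000
    REVENUE_PER_ORBIT = 1_000                   # dollars, from the article
    PRICE_PER_GB = 1.0                          # dollars, from the article
    DEPLOY_COST = 100_000                       # dollars, from the article

    gb_per_orbit = REVENUE_PER_ORBIT / PRICE_PER_GB       # 1,000 GB = 1 TB per orbit
    lifetime_pb = ORBITS * gb_per_orbit / 1e6             # ~28 PB, i.e. the quoted ~30 PB
    breakeven_orbits = DEPLOY_COST / REVENUE_PER_ORBIT    # 100 orbits, about a week
    amortized = DEPLOY_COST / (ORBITS * gb_per_orbit)     # ~$0.0036/GB

    print(f"{lifetime_pb:.0f} PB over life, break even in {breakeven_orbits:.0f} "
          f"orbits, ${amortized:.4f}/GB amortized")
    ```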

  • AWS Savings Plans is nothing less than a complete overhaul of the AWS compute pricing model. AWS Begins Sunsetting RIs; Replaces Them With Something Much, Much Better: At a high level, you no longer need to purchase RIs for a given instance type. Instead, you commit to a baseline level of spend per hour on compute that you’ll pay regardless of actual use. Anything at or below that usage level is included; anything above it you’ll pay at the existing on-demand rates....Compute Savings Plans come with a high level of flexibility. They aren’t tied to any specific region. You decide your spend commitment per hour across all of your accounts and that’s that. You pay that much regardless of your usage; any usage beyond that baseline commitment gets charged at normal on-demand rates.
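
    To make the billing mechanics concrete, here is a toy model of an hour's bill under a Savings Plan commitment. The 28% discount and all dollar figures are invented for illustration; real rates vary by term and payment option:

    ```python
    # A toy model of the Savings Plans billing described above.
    def hourly_bill(on_demand_usage: float, commit: float, discount: float = 0.28) -> float:
        """on_demand_usage: what this hour's compute would cost at on-demand rates.
        commit: the per-hour spend you committed to (always paid, used or not).
        Usage is valued at the discounted Savings Plan rate; anything the
        commitment doesn't cover is billed at normal on-demand rates."""
        sp_valued = on_demand_usage * (1 - discount)              # usage at the plan rate
        overflow = max(sp_valued - commit, 0.0) / (1 - discount)  # back to on-demand $
        return commit + overflow

    print(hourly_bill(on_demand_usage=10.0, commit=7.2))   # fully covered: pay $7.20
    print(hourly_bill(on_demand_usage=20.0, commit=7.2))   # $7.20 + $10.00 overflow
    ```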

  • You've had a very successful experience running in a co-lo. What would make you finally move to the cloud? And would you move everything to the cloud or take a hybrid approach? Those questions and many more are answered by FreeAgent in Head In The Clouds.
    • Co-locating has been a terrific win for us over the years, providing us with a cost-effective, high performance compute platform that has allowed us to scale to over 95,000 customers with close to 5 9's reliability.
    • Growth often acts as a forcing function with regards to infrastructure. Head count has doubled. Customer count is growing quickly. 
    • Desire for new features is another forcing function. They wanted more datacenters to increase resilience. They were reaching hardware limitations. The ops team was pressed and it was challenging to find ops engineers with the right skills. They were experimenting with ML. Serverless was becoming a go-to for production. They wanted to improve deployment. And scaling the database was a challenge.
    • Experiments were run to research moving to AWS: Granted, any infrastructure migration would be expensive, the project complex and it would come with many challenges, but the advantages and opportunities that a full cloud migration would open up in the future were undeniable.
    • The decision was made to migrate to AWS!
    • Early on in the R&D phase we became customers of Gruntwork.io and have relied heavily on their Infrastructure as Code library and training to accelerate the project.

  • Episode #21: Getting Started with Serverless (Special Episode). Fun tour through the highlights of different episodes. One of the big ideas goes counter to the lots of little functions advice you usually get with lambda. At Bustle they don't prematurely optimize. They have very few very large lambdas and they do billions of invocations monthly. Latencies are very low. They webpack JavaScript into one single file so there are no file system operations. It's minified. That's easier than managing many tiny functions. Functions aren't zero cost. Try things out first. Don't stress too much about being perfect. There are many best practices.
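
    The "few large functions" idea is easy to picture: one handler that routes internally instead of one function per endpoint. A minimal sketch in Python (Bustle's stack is JavaScript, and the routes and payloads here are invented):

    ```python
    import json

    # One "fat" Lambda: a single deployed function dispatching many routes.
    ROUTES = {
        "GET /posts": lambda event: {"posts": []},
        "POST /posts": lambda event: {"created": True},
    }

    def handler(event, context):
        # API Gateway proxy events carry the method and path; route in-process.
        route = ROUTES.get(f"{event['httpMethod']} {event['path']}")
        if route is None:
            return {"statusCode": 404, "body": "not found"}
        return {"statusCode": 200, "body": json.dumps(route(event))}
    ```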

  • Looks interesting. Fastly has entered private beta for Compute@Edge. The pitch: Compute@Edge provides a powerful, flexible, and scalable way to build serverless applications at the network edge — with unprecedented performance and security. Powered by our open-source WebAssembly runtime and compiler Lucet, Compute@Edge allows you to reimagine what’s possible. You can build complex applications that enable personalized user experiences and interactions. And those applications execute in microseconds, running across our globally distributed platform with 64 POPs and 52 Tbps of network capacity. Now, with Compute@Edge, you can serve GraphQL from our network edge — and deliver way more personalized experiences. With Compute@Edge, you can develop your own customized API protection logic. Think: authentication, encryption, caching, and beyond.

  • Notes from Redecentralize 2019 and some videos and more notes. Lloyd: Overall, my take was that interoperability is seen as a more important focus than decentralization for its own sake. There were conversations about standards, models, public policy and UX patterns. There was concern in the room about how to deal with personal and group abuse effectively. There was a healthy mix of light-hearted joking and serious talk about important issues.

  • Excellent trip report from JSConf Budapest 2019. You might like: Essential JavaScript debugging tools for the modern detective by Rebecca Hill or Mastering UIs with Finite State Machines by Rubén Sospedra. 

  • What happens when demand exceeds your highest expectations? Disney+ hit by technical glitches on launch day. Which is a reminder that when Netflix flipped the switch to operate at a global scale, it was a real accomplishment.

  • I still miss Cfront. Trip report: Autumn ISO C++ standards meeting (Belfast) and another trip report. C++20 generated 378 comments from national bodies. 200 experts attended the meeting. It has been a 3 year release cycle. Some important new features: modules, coroutines, concepts, ranges. A lot of interesting papers were put forward: fiber_context - fibers without scheduler; Text Parsing; A Unified Executors Proposal for C++. Also, Comparing parallel Rust and C++.

  • Once again, we need standard tests and we need continuous testing for each and every car software release. Remember the Uber self-driving car that killed a woman crossing the street? The AI had no clue about jaywalkers.

  • Does everyone really need to work in the same open office space? InVision has over 850 remote employees. They don't even have an office. So it can work, if you: go all in, set ground rules, use technology, invest big in employee onboarding, hire people with great EQ, and nurture work-life balance. For work-life balance InVision: encourages employees to use the flexibility remote work offers them. Every InVisioner gets a home-office stipend and coffee-shop vouchers to encourage them to work in other locales some of the time. The company also offers unlimited vacation and encourages personal-interest Slack channels for employees to connect over non-work interests.

  • A curated list of Favorite Talks From Strange Loop 2019: Probabilistic Scripts for Automating Common-Sense Tasks; How to Teach Programming (and Other Things)?; New programming constructs for probabilistic AI; Uptime 15,364 days - The Computers of Voyager.

  • Here's why processes periodically need restarting. Why does my App's Memory Use Grow Over Time?: Total memory use goes up as the number of threads increases; Memory use for an individual thread is a factor of the largest possible request it will ever serve; Memory use across all threads is based on a distribution of how likely that maximum request is to be hit simultaneously by all existing threads; As your application executes over time, it is expected and natural that your memory requirements will increase until they hit a steady-state.
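
    A toy simulation of that argument, assuming a heavy-tailed request-size distribution: each thread's footprint is the largest request it has ever served, so total memory climbs quickly at first and then flattens toward a steady state.

    ```python
    import random

    THREADS = 16
    high_water = [0.0] * THREADS   # per-thread peak memory in MB (assumed model)

    for i in range(10_000):
        t = random.randrange(THREADS)          # a random thread serves this request
        request_mb = random.paretovariate(3)   # heavy-tailed request size (made up)
        high_water[t] = max(high_water[t], request_mb)
        if i + 1 in (100, 1_000, 10_000):
            print(f"after {i + 1:>6} requests: {sum(high_water):6.1f} MB resident")
    # Memory grows fast early, then barely moves: the "natural" steady state.
    ```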

  • I hope 2020 is more interesting than this. Forrester: The 5 ways cloud computing will change in 2020: IBM and Oracle retreat to familiar territory; Alibaba threatens Google; SaaS vendors exit proprietary platforms and move to the hyperscale leaders; High performance computing (HPC) use in public cloud grows to 40%; Open source cloud native development battles target service meshes and serverless; Cloud management players tackle cloud security. 

  • Another trip report. This time it's Hot SRE trends in 2019 (brought to you from SREcon EMEA). You might like: Advanced Napkin Math: Estimating System Performance from First Principles; Building a Scalable Monitoring System; Fault Tree Analysis Applied to Apache Kafka.

  • Space is hard. Why make it harder with poor incident management, especially for a week long outage? The July Galileo Outage: What happened and why. When you have to upload configuration data to a satellite many times a day there are a lot of opportunities for those things that should never happen to actually happen: The outage in the ephemeris provisioning happened because simultaneously: The backup system was not available; New equipment was being deployed and mishandled during an upgrade exercise; There was an anomaly in the Galileo system reference time system; Which was then also in a non-normal configuration. Why did this happen? The operation of Galileo is spread out over a large number of organizations and companies.

  • Our goal is to move towards a completely zero trust platform. This means that in theory, we'd be able to run malicious code inside our platform with no risk. We built network isolation for 1,500 services to make Monzo more secure: We've moved from every service being able to call 1,500 others, to every service being only able to call six others on average, with review of every new pairing. Solving the human issues around managing these rules was a particularly interesting challenge. And it's been an incredible win for security and a huge step towards a zero trust platform...To start we wrote a tool called rpcmap. This would read all the Go code in our platform, and attempt to find code that looked like it was making a request to another service. In doing so, the tool maps out the connections between our services and the services they call...We actually implement our Kubernetes network policies using Calico, which is networking software that lets our services talk to each other. 
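
    A greatly simplified sketch of what a tool like rpcmap might do. The real tool understands Go code properly; this version just pattern-matches an assumed calling convention:

    ```python
    import collections
    import pathlib
    import re

    # Assumed convention for service calls; not Monzo's actual code.
    CALL_RE = re.compile(r'Call\(\s*"service\.([a-z0-9.-]+)"')

    def rpc_map(root: str) -> dict:
        """Map each service directory to the set of services it appears to call."""
        calls = collections.defaultdict(set)
        base = pathlib.Path(root)
        for path in base.rglob("*.go"):
            service = path.relative_to(base).parts[0]   # assume root/<service>/...
            for target in CALL_RE.findall(path.read_text(errors="ignore")):
                calls[service].add(target)
        return calls

    # Each (caller, callee) pair found becomes a reviewed allow-list entry;
    # anything not on the list is denied by default.
    ```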

  • It's databases all the way down. System design hack: Postgres is a great pub/sub & job server
    • There are very few use cases where you'd need a dedicated pub/sub server like Kafka. Postgres can easily handle 10,000 insertions per second, and it can be tuned to even higher numbers. It's rarely a mistake to start with Postgres and then switch out the most performance critical parts of your system when the time comes...it turns out that Postgres generally supersedes job servers as well. You can have your workers "watch" the "new events" channel and try to claim a job whenever a new one is pushed. As a bonus, Postgres lets other services watch the status of the events with no added complexity. 
    • An instance of an API server creates a run by inserting a row into a "Runs" table in Postgres...How do the workers "claim" a job? By setting the job status atomically...Finally, we can use a trigger and a channel to notify the workers that there might be new work available...All the workers have to do is "listen" on this status channel and try to claim a job whenever a job's status changes. (A minimal worker sketch follows this list.)
    • jordic: We use a lot this kind of tooling.. say, you need to check 20k URLs and you want to rate limit them.. add them to a Pg table (with state and result fields). A single thread worker that just takes a row (marks it as pending) and later updates it. With select for update and skip tricks you can horizontal scale it to the number of workers you need. I had seen it also for soft that sends massmail (our case around 100k/day).. it's state is a postgres queue. We also use Pg for transactional mail. We insert it on a table. (There is a process that sends the row mails).. the so nice part is that the mail is joining the dB transaction for free.. (all or nothing)
    • colinchartier: This pattern falls down if you need to poll the database, because if you have 3 queues and 100 workers you're making 300 queries per poll interval. The feature of postgres that makes this viable in comparison to most other databases is the "channel"
    • Also, graphile/worker, rudderlabs/rudder-server, and An Opinionated Approach to Developing Event-Driven Microservice Applications with Kafka and Web-Sockets.
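
    A minimal worker sketch of the claim-and-listen pattern described above, using psycopg2. The table, columns, and channel name are invented for illustration:

    ```python
    import select

    import psycopg2

    conn = psycopg2.connect("dbname=jobs")

    def claim_one_job():
        """Atomically claim a pending run. FOR UPDATE SKIP LOCKED lets many
        workers poll the same table without blocking on each other's locks."""
        with conn, conn.cursor() as cur:
            cur.execute("""
                UPDATE runs SET status = 'claimed'
                WHERE id = (SELECT id FROM runs
                            WHERE status = 'pending'
                            ORDER BY created_at
                            FOR UPDATE SKIP LOCKED
                            LIMIT 1)
                RETURNING id""")
            row = cur.fetchone()
            return row[0] if row else None

    # Workers block on NOTIFY instead of tight polling.
    listener = psycopg2.connect("dbname=jobs")
    listener.autocommit = True
    with listener.cursor() as cur:
        cur.execute("LISTEN run_status;")
    while True:
        if select.select([listener], [], [], 5.0)[0]:   # wake on NOTIFY or timeout
            listener.poll()
            listener.notifies.clear()
        while claim_one_job():
            pass                                        # drain available work
    ```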

  • Murat with a great set of notes on SOSP'19 (Symposium on Operating Systems Principles). Day 0, Day 1, Day 2, plus a paper review of SOSP19 Verifying Concurrent, Crash-safe Systems with Perennial. Prepare for some deep reading.

  • Building a Large-scale Distributed Storage System Based on Raft: The core of a distributed storage system is nothing more than two points: one is the sharding strategy, and the other is metadata storage. Keeping applications transparent and consistent in the sharding process is crucial to a storage system with elastic scalability...If a storage system only has a static data sharding strategy, it is hard to elastically scale with application transparency. Such systems include MySQL static routing middleware like Cobar, Redis middleware like Twemproxy, and so on. All these systems are difficult to scale seamlessly...Sharding is a database partitioning strategy that splits your datasets into smaller parts and stores them in different physical nodes. The unit for data movement and balance is a sharding unit. Each physical node in the cluster stores several sharding units. Two commonly-used sharding strategies are range-based sharding and hash-based sharding...In TiKV, each range shard is called a Region. Because we need to support scanning and the stored data generally has a relational table schema, we want the data of the same table to be as close as possible. Each Region in TiKV uses the Raft algorithm to ensure data security and high availability on multiple physical nodes.
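
    Range-based sharding is simple enough to show in a few lines: keys map to Regions via sorted split points, so scans over adjacent keys touch few shards. A toy sketch with invented split keys:

    ```python
    import bisect

    # Split points define 3 Regions: [-inf, "g"), ["g", "p"), ["p", +inf).
    SPLIT_KEYS = ["g", "p"]
    REGIONS = ["region-1", "region-2", "region-3"]

    def region_for(key: str) -> str:
        return REGIONS[bisect.bisect_right(SPLIT_KEYS, key)]

    assert region_for("apple") == "region-1"
    assert region_for("grape") == "region-2"   # adjacent keys land together,
    assert region_for("melon") == "region-2"   # which is what makes scans cheap
    ```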

  • We chose to prioritize liveness over lateness. In other words, a noisy, broken, or blocked Event Type will not halt the rest of the system. Spotify’s Event Delivery – Life in the Cloud.
    • Our system consists of close to 15 different microservices that are deployed on around 2500 VMs. 
    • At our scale, we learned that completely abstracting cost away from engineers and data scientists can create waste.
    • Intermittent and short cost saving sprints have been a great mechanism to cut waste, while allowing unconstrained spending on new projects.
    • Now GDPR is a prime requirement whenever designing a system that handles data. 
    • We observed that data grows an order of magnitude faster than service traffic. Growth is a multidimensional function on the dimensions of DAU and organization growth. Whereas DAU is quite obvious, organizational growth is not. There will be an increasing number of engineers and teams introducing new features and instrumenting them. Capturing more data means a need for more data engineers and scientists looking into that data to gain more insights. More insights means more features, and the growth compounds.
    • The Event Type is defined by a producer, it has a name, a schema and metadata. From the operational difficulties with our Kafka-based system, we learned that not all events are equal and that we can leverage this in our favor.
    • Event Types are prioritized and may differ based on some of the properties listed below. Business Impact – some Event Types are used to pay royalties to labels and artists, and some are used to calculate company key metrics; these are subject to external SLAs both for timely delivery and quality. Volume – some Event Types are emitted a few hundred times an hour, and some are emitted 1M+ times a second. Size – event size varies between a few bytes and tens of kilobytes.
    • In order to prevent high volume or noisy events disrupting the business-critical data, we chose to isolate event streams as soon as possible. Event Types are separated right after the Event Service which is the entry point to our infrastructure.
    • Event Types are distinguished by importance: high, medium, and low, and we have separate priorities and Service Level Objectives (SLOs) for each importance level. This allows us to prioritize work and resources during incidents in order to deliver the most critical events first.
    • Delivered events are partitioned hourly; this means that each Event Type has an immutable hourly bucket where events are stored. (A tiny sketch of the idea follows this list.)
    • As part of Spotify’s move to the cloud, the strategy has been to outsource time consuming problems that are not core to our business to Google and GCP. Particularly, we take advantage of managed messaging queues, data processing, and storage. The backbone of our system is Cloud Pub/Sub. Cloud Storage (GCS) is the main storage for both the final datasets and intermediate data. The ETL is built on Compute Engine (GCE) instances (a cluster per Event Type, using Regional Managed Instance Groups), CloudSQL for metadata, and Dataproc for deduplication. We use Dataflow jobs for encryption of sensitive data in events. We use BigQuery for data warehousing, and this tool has become a favourite for data engineers, data scientists, analysts, product managers, and most who wish to interact with event data.
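
    The hourly bucketing mentioned above is simple to sketch: each (Event Type, hour) pair maps to one immutable bucket. The path layout and event type name here are invented:

    ```python
    from datetime import datetime, timezone

    def bucket_for(event_type: str, ts: float) -> str:
        """Return the immutable hourly bucket an event lands in."""
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d/%H")
        return f"gs://events/{event_type}/{hour}/"

    print(bucket_for("PlaybackStarted", 1573833600.0))
    # gs://events/PlaybackStarted/2019-11-15/16/
    ```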

  • You can’t avoid error handling code, not at scale. Scaling in the presence of errors—don’t ignore them: The secret to error handling at scale isn’t giving up, ignoring the problem, or even trying again—it is structuring a program for recovery, making errors stand out, allowing other parts of the program to make decisions. Techniques like fail-fast, crash-only-software, process supervision, but also things like clever use of version numbers, and occasionally the odd bit of statelessness or idempotence. What these all have in common is that they’re all methods of recovery. Recovery is the secret to handling errors. Especially at scale. Giving up early so other things have a chance, continuing on so other things can catch up, restarting from a clean state to try again, saving progress so that things do not have to be repeated. That, or put it off for a while. Buy a lot of disks, hire a few SREs, and add another graph to the dashboard.
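
    One of the recovery techniques named above, idempotence plus retry, in miniature. All names and the failure model are invented:

    ```python
    import random
    import time

    processed = {}   # idempotency key -> result; stands in for a durable store

    def apply_charge(key: str, amount: int) -> int:
        if key in processed:          # a replayed retry: return the prior result
            return processed[key]
        if random.random() < 0.3:     # simulated transient downstream failure
            raise TimeoutError("flaky downstream")
        processed[key] = amount
        return amount

    def charge_with_retries(key: str, amount: int, attempts: int = 5) -> int:
        for attempt in range(attempts):
            try:
                return apply_charge(key, amount)
            except TimeoutError:
                time.sleep(0.01 * 2 ** attempt)   # back off, then try again
        raise RuntimeError("gave up; surface the error so something else recovers")

    print(charge_with_retries("order-42", 100))   # retries are safe: one charge
    ```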

  • Great story well told. How Much of a Genius-Level Move Was Using Binary Space Partitioning in Doom?: Still, Carmack found himself faced with a novel problem—”How can we make a first-person shooter run on a computer with a CPU that can’t even do floating-point operations?”—did his research, and proved that BSP trees are a useful data structure for real-time video games. I still think that is an impressive feat, even if the BSP tree had first been invented a decade prior and was pretty well theorized by the time Carmack read about it. Perhaps the accomplishment that we should really celebrate is the Doom game engine as a whole, which is a seriously nifty piece of work.
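
    The trick the article celebrates, shrunk to a few lines: once a BSP tree is built, an in-order walk keyed off which side of each split the viewer is on yields a correct back-to-front draw order with no per-frame sorting. One-dimensional "walls" stand in for Doom's geometry:

    ```python
    # A miniature BSP tree: each node splits space at x and stores a wall.
    class Node:
        def __init__(self, x, wall, left=None, right=None):
            self.x, self.wall = x, wall
            self.left, self.right = left, right

    def back_to_front(node, viewer_x, out):
        if node is None:
            return
        if viewer_x < node.x:   # viewer on the left: the right side is farther
            back_to_front(node.right, viewer_x, out)
            out.append(node.wall)
            back_to_front(node.left, viewer_x, out)
        else:                   # viewer on the right: the left side is farther
            back_to_front(node.left, viewer_x, out)
            out.append(node.wall)
            back_to_front(node.right, viewer_x, out)

    tree = Node(5, "wall@5", Node(2, "wall@2"), Node(8, "wall@8"))
    order = []
    back_to_front(tree, viewer_x=0, out=order)
    print(order)   # ['wall@8', 'wall@5', 'wall@2'] -- farthest drawn first
    ```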

  • Another thoughtful discussion: Exponent EPISODE 177 — PRINCIPLE STACKS
    • They take on the whole idea that free speech laws apply only to the government. The consensus is Facebook is taking a principled stand in allowing all kinds of speech. They postulate a principle stack for making decisions based on a defined stack of principles. Since Facebook is putting free speech at the top of their stack they are covered.
    • An interesting idea is that the US constitution was a form of permissionless government and that's one reason why the US was so successful. We saw AWS take off because it was a form of permissionless IT. What the podcast missed is that AWS allowed you to sign up, but in doing so you opted in to a very restrictive social contract. Misbehave and you are deported swiftly and unceremoniously out of AWS country. So Facebook is in no way prevented from setting rules—if they want to.
    • Another point the podcast missed is many states made the passing of a Bill of Rights contingent on their ratifying the constitution. That was part of the deal. So permissionless only goes so far. Nobody trusts or should trust an open ended stack of principles. Remember Matthew 23:27? Woe unto you, scribes and Pharisees, hypocrites! for ye are like unto whited sepulchres, which indeed appear beautiful outward, but are within full of dead men's bones, and of all uncleanness. There's the law then there's the spirit of the law. The idea that rights not enumerated in the constitution would still be preserved is fanciful at best. Do you have, for example, the right for a profit on your beaver pelts? That was an actual suggested right. The whole idea of rights essentially does not exist without enumeration and definition.
    • Another point the podcast missed is that the US constitution is already essentially a stack of principles. If you've taken any history course on the Supreme Court and constitutional law you'll realize how much deliberate interpretation goes into making sense of principles. That's the idea of case law and precedent. A principle stack is not a programming language. No matter how detailed, interpretation is key. That's why we have an entire legal system and government structure just to create, interpret, and administer those principles. Do you think Facebook has any of that? No. They have an algorithm and a profit motive. 
    • Another point the podcast missed is that a stack implies a strict hierarchy. In practice stacks never work that way. Each stack layer sends its tendrils up and down the stack obeying a neuronic impulse to excite and inhibit all the layers it can recruit. That's the interpretation layer mentioned in the previous section. 
    • We like the nice clean abstraction of a stack. We like the nice clean abstraction of applying a well defined principle. But in practice there's nothing nice or clean about any of them. You will find no refuge there. Applying the US constitution resulted in a Civil War. We mostly avoid further wars by a system of laws, courts, checks and balances. These encode duties and obligations. Until the same applies to Facebook (et al) the battles will continue. 

Soft Stuff:


  • comby-tools/comby (video): tool for changing code across many languages.

  • Bytecode Alliance: an open source community dedicated to creating secure new software foundations, building on standards such as WebAssembly and WebAssembly System Interface (WASI). The Bytecode Alliance is a new industry partnership coming together to forge WebAssembly’s outside-the-browser future by collaborating on implementing standards and proposing new ones. Our founding members are Mozilla, Fastly, Intel, and Red Hat, and we’re looking forward to welcoming many more.

  • tikv/tikv (article): TiKV is an open-source, distributed, and transactional key-value database. Unlike other traditional NoSQL systems, TiKV not only provides classical key-value APIs, but also transactional APIs with ACID compliance. Built in Rust and powered by Raft, TiKV was originally created to complement TiDB, a distributed HTAP database compatible with the MySQL protocol.

  • TiDB: A Golang Database is compiled into WebAssembly so it can run in the browser.

  • The Update Framework (TUF): helps developers maintain the security of a software update system, even against attackers that compromise the repository or signing keys. TUF provides a flexible framework and specification that developers can adopt into any software update system.

  • zephyrproject: a scalable real-time operating system (RTOS) supporting multiple hardware architectures, optimized for resource constrained devices, and built with safety and security in mind.

Pub Stuff: 


  • Efficient and Scalable Thread-Safety Violation Detection --- Finding thousands of concurrency bugs during testing: This paper presents TSVD, a thread-safety violation detector that addresses these challenges through a new design point in the domain of active testing. Unlike previous techniques that inject delays randomly or employ expensive synchronization analysis, TSVD uses lightweight monitoring of the calling behaviors of thread-unsafe methods, not any synchronization operations, to dynamically identify bug suspects. It then injects corresponding delays to drive the program towards thread-unsafe behaviors, actively learns from its ability or inability to do so, and persists its learning from one test run to the next. TSVD is deployed and regularly used in Microsoft and it has already found over 1000 thread-safety violations from thousands of projects. It detects more bugs than state-of-the-art techniques, mostly with just one test run.
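
    The paper's core idea, reduced to a toy: a latent check-then-act race that almost never fires on its own becomes near-certain once a small delay is injected before the suspect operation. This only illustrates the concept; TSVD itself instruments thread-unsafe .NET method calls and learns where delays pay off:

    ```python
    import threading
    import time

    cache, writes = {}, []

    def put_once(key, value, delay):
        if key not in cache:      # check...
            time.sleep(delay)     # TSVD-style injected delay widens the window
            cache[key] = value    # ...then act: a classic thread-safety violation
            writes.append(value)

    def trial(delay):
        cache.clear(); writes.clear()
        threads = [threading.Thread(target=put_once, args=("k", i, delay))
                   for i in range(8)]
        for t in threads: t.start()
        for t in threads: t.join()
        return len(writes)        # more than one write means the bug fired

    print("no delay:  ", trial(0))      # usually 1: the bug hides
    print("with delay:", trial(0.01))   # usually 8: the bug is exposed
    ```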

  • Snap: a microkernel approach to host networking: This paper describes the networking stack, Snap, that has been running in production at Google for the last three years+. It’s been clear for a while that software designed explicitly for the data center environment will increasingly want/need to make different design trade-offs to e.g. general-purpose systems software that you might install on your own machines. But wow, I didn’t think we’d be at the point yet where we’d be abandoning TCP/IP! You need a lot of software engineers and the willingness to rewrite a lot of software to entertain that idea. Enter Google!

  • An analysis of performance evolution of Linux's core operations: To our surprise, the study shows that the performance of many core [Linux] operations has worsened or fluctuated significantly over the years. For example, the select system call is 100% slower than it was just two years ago.