AI, startup hacks, and engineering miracles from your friends at Faraday

Finding public S3 objects

Seamus Abshere on

There's a lot of concern over public objects in S3 buckets. Amazon now gives you a way to lock down entire buckets, but what if you legitimately have a mix of public and private objects?

Ruby script to find public S3 objects

First get 2 gems:

source 'https://rubygems.org' do
  gem 'aws-sdk'
  gem 'thread'
end

Then the script itself:

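# find_public_s3_objects.rb
# Prints the key of every object whose ACL includes a grant to the AllUsers group.
require 'aws-sdk-s3'
require 'thread/pool'
BUCKET = ARGV[0] or raise("expected bucket")
s3 = Aws::S3::Resource.new(region: 'us-east-1')
count = 0
pool = Thread.pool 8 # check ACLs on 8 worker threads
mutex = Mutex.new    # guards the shared counter and stdout
s3.bucket(BUCKET).objects.each do |object|
  pool.process do
    grants = object.acl.grants # one ACL lookup per object
    mutex.synchronize do
      count += 1
      # progress goes to stderr so stdout holds only object keys
      if count % 100 == 0
        $stderr.write "#{count}..."
      end
    end
    if grants.map { |x| x.grantee.uri }.any? { |x| x =~ /AllUsers/ }
      mutex.synchronize do
        puts object.key
      end
    end
  end
end
pool.shutdown # wait for queued checks to finish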

Then you run it like this:

bundle exec ruby find_public_s3_objects.rb my-bucket-name

It's much faster than the bash-based example on StackOverflow.
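
As a rough sketch, the "is this object public?" test at the heart of the script can be pulled out and exercised without touching AWS. The Grantee and Grant structs below are stand-ins (an assumption for illustration) shaped like what object.acl.grants returns, and public_grants? is a hypothetical helper name, not part of aws-sdk-s3.

```ruby
# Stand-in structs (assumptions) mimicking the shape of aws-sdk-s3 grant objects.
Grantee = Struct.new(:type, :uri)
Grant = Struct.new(:grantee, :permission)

# URI of the predefined S3 group that represents anonymous access.
ALL_USERS_URI = 'http://acs.amazonaws.com/groups/global/AllUsers'

# Hypothetical helper: true when any grant is made to the AllUsers group.
# Canonical-user grantees have a nil uri, so they never match.
def public_grants?(grants)
  grants.any? { |grant| grant.grantee.uri == ALL_USERS_URI }
end

owner_only = [Grant.new(Grantee.new('CanonicalUser', nil), 'FULL_CONTROL')]
world_readable = owner_only + [Grant.new(Grantee.new('Group', ALL_USERS_URI), 'READ')]

puts public_grants?(owner_only)     # prints false
puts public_grants?(world_readable) # prints true
```

Comparing against the full group URI is slightly stricter than the regex match in the script; either should flag the same grants in practice.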

Third-party data in marketing: Why it's important and how to protect consumers' privacy

Alexis Hughes on

The use of third-party data isn't new, but it's becoming a necessity as businesses shift to a customer-centric approach. Marketers everywhere are realizing that first-party data is only the tip of the iceberg when it comes to understanding their customers.

Pitney Bowes and Forrester report that 92% of marketing and data analytics professionals agree that the rise in digital technologies and interactions has increased the need for bringing outside data into their companies. While it may be as simple as understanding your existing customers' household and financial standing, with third-party data you can also gain deep insights into your leads on a known-identity basis.

Overall, leveraging third-party and first-party data adds breadth and depth to your customer insights, informing better branding and messaging decisions, personalization efforts across channels, and taking much of the guesswork out of targeted and location-based marketing campaigns.

Customer identity resolution

The capability to bring together first-party data from various sources and third-party data to compile a unified picture of each customer is increasingly valuable. In 2018, spending dedicated to identity data assets grew by more than 50%, to a total of $846MM.

Personally identifiable information (PII) is crucial to connecting the dots across multiple touchpoints (online and offline) to form a holistic picture of your customers and prospects so you can reach them with content that will resonate with them through the channels that matter.

Using third-party data to improve personalization

In a time when almost all consumers prefer ads and products that are relevant to them, knowing how and when to serve your target market is crucial to effectively engaging your customers. According to Forbes, right now, 46% of marketing executives are not where they want to be in terms of delivering personalization, and 57% say that they hope to advance their personalization initiatives over the next year.

Enriching your first-party data with third-party information on your customers' financial, household, transactional, or lifestyle categories gives you a clearer idea of what your customers look like in real life.

To take it a step further, enriched customer data helps machine learning algorithms build better models. Cluster models, for example, sort individuals into distinct groups based on common attributes. These groups, or personas, can affirm your intuitions about what your customer base looks like, but may also surprise you by revealing significant groups you hadn't known about or previously considered important.
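
To make the idea of a cluster model concrete, here is a minimal k-means sketch. The two attributes (age and household income in thousands), the sample values, and the choice of k-means itself are assumptions for illustration, not a description of any particular production model. Real data would also need feature scaling.

```ruby
# Squared Euclidean distance between two attribute vectors.
def distance_squared(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Mean of each attribute across a group of points.
def centroid(points)
  points.transpose.map { |dimension| dimension.sum / dimension.size.to_f }
end

# Plain k-means: assign each point to its nearest centroid, then recompute centroids.
def k_means(points, initial_centroids, iterations: 10)
  centroids = initial_centroids
  assignments = []
  iterations.times do
    assignments = points.map do |point|
      centroids.each_index.min_by { |i| distance_squared(point, centroids[i]) }
    end
    centroids = centroids.each_index.map do |i|
      members = points.each_index.select { |j| assignments[j] == i }.map { |j| points[j] }
      members.empty? ? centroids[i] : centroid(members)
    end
  end
  assignments
end

# Hypothetical customers: [age, household income in thousands of dollars].
customers = [[25, 40], [27, 45], [24, 38], [52, 120], [55, 130], [50, 115]]
personas = k_means(customers, [customers[0], customers[3]])
puts personas.inspect # prints [0, 0, 0, 1, 1, 1]
```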

How third-party data helps optimize ad targeting and nurture campaigns

If you're using predictive lead scoring capabilities, the accuracy and precision of those scores improve with the quality and quantity of third-party data that you introduce. This not only benefits your leads' experiences, as they'll receive content that better aligns with their interests and where they are in their customer journeys, but it will save you time and money when you're deciding how to prioritize your outreach.

A popular way of leveraging third-party data with predictive modeling for digital marketing is Facebook lookalike audiences. But since the platform shut down advertisers' access to this data in April of 2018, gaining deep insights from Facebook has been a struggle for many businesses that are trying to better understand their audiences and target the right consumers.

If you enrich your first-party data with third-party attributes, you can gain insights on a granular level that you aren't getting with Facebook and other marketing channels. You can then segment known-identity audiences based on certain attributes — or develop in-depth personas for segmentation — and push the appropriate ad creative to those individual segments with the goal of increased, personalized engagement.

This goes not only for new leads you're nurturing, but also for the customers you're actively engaging. Pitney Bowes and Forrester's survey showed that 83% of respondents recognize that timelier and more contextually relevant customer experiences are a high priority when it comes to optimizing their marketing strategies. With nurture campaigns, timeliness and relevance are key components for success.

Another advantage of leveraging third-party data is the ability to match your site's captured emails to real people. This allows you to bucket leads into certain personas for more personalized nurture campaigns that will pull them through their customer journeys with you. Your leads are more likely to receive messaging that aligns with where they are in their individual journeys, and it will save your team time and money when you decide how to prioritize your lead outreach.

Enhancing geospatial analyses with third-party data

Geographic data has obvious benefits, but incorporating additional third-party data in geographic analyses improves the quality of the insights. Knowing where your ideal customers live is one piece of the geomarketing puzzle, but gaining an understanding of how your audiences will best interact with your ads offline is achieved by knowing more than just their geographic location.

Attributes like household information, transportation preferences, and lifestyle data help you prioritize where you place ads, as well as where you might establish new brick-and-mortar presences. A subway ad or billboard placed in your customers' neighborhoods can be a great way to advertise your e-commerce business for those who don't live near a storefront, but for customers who are in your brick-and-mortar trade zones, perhaps a direct mailer advertising an in-store sale would better drive business.

From a creative perspective, audiences can vary widely depending on their physical locations, so the messaging and images chosen for your campaigns should be curated for specific audiences in particular geographies. Providing relevant content increases your chances of audience engagement, so capitalizing on the insights provided by third-party data is a necessity.

This is the case with SEM advertising, which segments audiences by geographic location. With the insights gained from third-party data, you can optimize your SEM spend by targeting areas where higher-value customers reside, spending less on areas that aren't as likely to bring in revenue.

Using third-party data ethically

With the rise of regulations like GDPR, privacy and validity of data are prominent concerns when using third-party data and PII. However, there are ways to ensure that your customers' and prospects' privacy is protected. You want to be sure your data comes from a verified source, isn't dangerously invasive, and is leveraged to benefit the customers you're targeting.

Social scraping is a common tactic, and it can provide large categories of data on behavioral tendencies, consumer habits, and some personal information, though this data may not always be entirely accurate. At Faraday we avoid social scraping in favor of canonical sources of information from established third-party data vendors. Vendors like Epsilon verify the accuracy of the high-quality data we license and leverage for our partners.

Faraday's Director of Data Science, William Morris, notes that we sidestep many consumer privacy issues by dealing with datasets that categorize data, such as consumer purchases, by type of product and frequency, rather than the exact stores where those purchases are made. This leads to a significant lift in machine learning models' predictive capabilities without being invasive from a privacy standpoint. Additionally, each of our clients' customer data is siloed using industry-standard practices to protect the privacy of each client's customers.

Faraday also actively works to build models that are agnostic to a person's membership in protected classes. We exclude potentially harmful categories like ethnicity, religion, or primary language data, as these can disenfranchise certain groups of the consumer population. As an additional safeguard, we employ a suite of tools for post-model checking to ensure that attributes that could indicate a person's race or class don't make the models biased or adversely impact a population.

Taking into consideration these types of ethical practices is important when exploring your own use of third-party data.

What is predictive geospatial analysis, and how does it impact location-based marketing initiatives?

Alexis Hughes on

It's no secret that marketers need to make the most of their customer data, especially considering the importance of creating relevant, enjoyable omnichannel experiences for today's consumers. Location is a pivotal component of an effective omnichannel strategy — the challenge is identifying where your investments will yield the greatest returns for your business.

Location intelligence refers to a suite of geospatial analysis techniques enhanced with predictive customer insights. Whether you're considering increasing investment in your existing markets, expanding to new ones, or looking to drive foot traffic to specific retail or branch locations, incorporating propensity scores into the following geospatial analyses enables your business to identify which areas have the best opportunity to maximize ROI on geographical expansion, geo-targeted marketing campaigns, and much more.

Three predictive geospatial analysis techniques for your location-based marketing strategy

Every geography is different when it comes to the types of customers who live there and how they choose to engage with your brand, products or services, and marketing initiatives. Sometimes these geographic differences are negligible, but often there are important distinctions that can help you improve out-of-home campaigns or store placement. Here are three predictive geospatial analysis techniques that can help maximize your returns.

Trade zone analysis

"Who lives near my business's location, and how far are they willing to travel to get there?"
Predictive trade zone analysis considers the location of an existing or proposed business site. The analysis uses customer and geographic data to establish which customers and prospects live within a certain radius of the location and how far they're willing to travel to get there.

Leveraging the results of a trade zone analysis, you can better identify which customers within a given area you should target, whether that's with a direct mailer, a phone call, or even an email drip.
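
The "within a certain radius" part of a trade zone analysis can be sketched with a great-circle distance filter. This is only the fixed-radius piece, assuming latitude and longitude are known for each household; estimating how far people are willing to travel is a separate modeling step. The coordinates, ids, and 10-mile radius below are hypothetical.

```ruby
EARTH_RADIUS_MILES = 3958.8

# Great-circle (haversine) distance between two lat/lon points, in miles.
def haversine_miles(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a))
end

# Keep only the households that fall inside the radius around the site.
def within_trade_zone(site, households, radius_miles)
  households.select do |household|
    haversine_miles(site[:lat], site[:lon], household[:lat], household[:lon]) <= radius_miles
  end
end

# Hypothetical site and households (approximate coordinates in Vermont).
site = { lat: 44.4759, lon: -73.2121 }
households = [
  { id: 'near', lat: 44.4914, lon: -73.1857 }, # a couple of miles away
  { id: 'far',  lat: 44.2601, lon: -72.5754 }  # roughly 35 miles away
]

nearby = within_trade_zone(site, households, 10)
puts nearby.map { |household| household[:id] }.inspect # prints ["near"]
```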

Market and hotspot analyses

"How can I efficiently expand into new markets?"
Predictive market analysis, also known as market sizing, is used when you want to understand the level of opportunity and/or site-suitability in a particular city or neighborhood. Though not limited to offline initiatives — you can also identify high-value areas of online shoppers to optimize your keyword bidding strategy for SEM campaigns — this analysis is often useful when establishing an offline experience (e.g. new storefront, pop-up shop, or billboard).

A hotspot analysis — a component of a predictive market analysis — identifies high-density areas of your target customers and ranks the geographies where your business would perform best. Depending on your growth goals, the scale of the prospective area could be as large as the entire nation or as small as a specific neighborhood.

Penetration analysis

“What is the realistic opportunity for growth in a given geography?”
Unlike the two previous geospatial analyses, a penetration analysis does not predict market performance so much as it reports on existing performance to give you insight into how you should prioritize your resources. The most valuable insight derived from a penetration analysis is the opportunity index, which suggests the percentage of likely-to-act consumers in a given market. It essentially identifies markets where your business has the greatest potential to maximize returns on future investments.

A penetration analysis can be supplemented with a subsequent market analysis to educate you further about customers and new opportunities in smaller areas of interest, like specific zip codes or neighborhoods.

Importance of physical retail in effective omnichannel commerce

Establishing a brick-and-mortar presence can be expensive, and it's often difficult to execute efficiently. In the past, retail siting and market sizing efforts have been outsourced to slow-moving consultants or have been founded on educated guesses, rather than data-driven predictions.

Today we're hearing a lot about the “retail apocalypse,” where longstanding big-box stores are shuttering thousands of stores as consumers increasingly shop online and engage with direct-to-consumer brands over traditional retail. But as larger stores close locations, digitally native brands are turning towards brick-and-mortar retail as an additional revenue source — and they're finding a high rate of success. According to Forbes, these brands have plans to open more than 850 new locations in the next few years.

The same report notes that “when a retailer opens a new store, on average, that brand's website traffic increases by 37%, relative share of web traffic goes up by 27% and the retailer's overall brand image is enhanced.” This so-called “halo effect” is often improved with location intelligence.

If you're looking to expand your own business into new areas, temporary retail initiatives — pop-up shops, showrooms, and multi-brand partnerships — provide you with the opportunity to explore different markets at a lower risk. But because there's not always the same draw as a full-fledged brick-and-mortar presence, you still need to be intentional about where you place these temporary offline experiences and how you curate them for each geography. What may be successful with consumers in one city may not engage a different city's consumers as effectively. Employing predictive geospatial analysis techniques can help you understand the nuances in your customers' offline shopping behaviors.

Impact of location intelligence on marketing and advertising

Location intelligence isn't limited to physical retail initiatives. The insights generated from these predictive geospatial analyses are often applicable to both traditional and digital marketing initiatives.

If your business operates in a variety of locations, it's often safe to assume the kinds of customers you serve will vary from place to place.

For example, if you renovate homes and build marketing audiences around the kinds of houses your potential customers live in across Texas, those same audience criteria probably won't work in New Jersey. Why? Most houses in Texas don't have basements. If you use those criteria to target homeowners in New Jersey, where many houses do have basements, you could miss out on a number of potential customers.

With direct mail campaigns, canvassing, or even phone banking, leveraging household details — from family size to type of home to a roof's solar suitability — on your leads can reduce your customer acquisition costs. Instead of targeting neighborhoods or cities en masse, you can identify specific homes that meet the criteria of your target audience.

A similar effect happens when you leverage geospatial insights on your customer base if you're thinking about adding subway ads, bus ads, billboards, or any other out-of-home experience to your marketing strategy. Identifying high-opportunity areas and understanding how consumers in those areas compare to their neighbors or the greater population enables you to tailor your ads to those who are likely to engage with them.

Online initiatives like search engine marketing (SEM) can also be improved with the use of geographic and customer data. For example, putting bid modifiers on high- and low-opportunity zip codes helps optimize your overall spend.

Grow intelligently

As consumers continue to engage and shop across a variety of channels, failing to fully explore both online and offline opportunities can be the Achilles' heel of companies with even the highest potential for growth.

Location intelligence is just one piece of an optimized omnichannel strategy, but a vital one. Right now there is more access than ever to consumer and geographic data, and the businesses that are scaling efficiently in today's competitive markets are capitalizing on the insights that data provides. How is your business growing intelligently?


How leading DTC growth experts built their brands, and how they plan to grow in the future.

Alexis Hughes on

Exploring the future of DTC at Direct Currents

At Direct Currents this past April, Faraday gathered some of the most forward-thinking minds in the DTC space to facilitate conversations around brand-building and growth marketing strategies.

Over drinks and hors d'oeuvres, attendees from brands like Birchbox, Crabtree & Evelyn, Hubble, and Vroom chatted about their various marketing strategies, what was working and what wasn't, and how they're looking to expand in the future.

Later in the evening, we heard from Maria Molland, CEO of THINX, and a panel of executives from leading DTC brands Warby Parker, Burrow, Away, and Leesa. Moderated by Digiday's Aditi Sangal, the speakers discussed mission-driven brand strategies, omnichannel growth, and the importance of leveraging data.

Building a mission-driven brand

In a time when people are inundated with ads from DTC companies across every channel, it's not enough to simply put product ads in front of consumers and expect them to convert. Brands that are growing intelligently have realized that the most effective way to reach new customers is to lead with their brand missions, rather than traditionally straightforward product marketing.

Direct Currents keynote speaker Maria Molland spoke on how THINX is focused on revolutionizing the feminine care industry by making sustainable menstrual products that can benefit women on a global level. More than just a feminine care brand, THINX has made their mission the basis of their marketing strategy.

THINX and the other brands represented on the panel focus heavily on social media as channels to help build their brands because they realize that developing a strong presence on those platforms can have a massive influence on consumer trends. The media versatility and wide audience on Instagram in particular have proven to be useful resources for mission-driven companies looking to grow a loyal customer base. But more importantly, social media gives a brand the ability to convey that their customers shouldn't just invest in their products as commodities — they should invest in the brand as a lifestyle.

Direct Currents panelist Alex Kubo, Head of Intelligence at Burrow, said, "We're taking a much more conservative approach to building our brand and building a resonance — and kind of an aura — around the brand, rather than just preaching products and value props." Showing consumers that buying a couch can be more than just buying a functional piece of furniture has proven to be key to growing their brand.

Implementing omnichannel growth

Of course, mission-based marketing on social media can't be the sole driver of any brand's revenue if sustainable growth is the ultimate goal. Implementing and optimizing omnichannel campaign performance is instrumental to a successful growth strategy.

Intelligently growing brands have been diversifying their marketing channels as they hit the ceiling of what Facebook's audiences can do for them in an increasingly costly and competitive ad space. Mattress company Leesa has been leveraging direct TV ads as an acquisition channel, originally thinking it would be a "first-click equivalent." Nick Stafford, former COO of Leesa and recent founder of DTC growth agency Belay, noted that the addition of TV to their marketing strategy is actually "a very important part of the customer journey … More people were engaging with us in some other form, but then the TV was the thing that drove the purchase."

Brands are continuing to find value in podcasts as well. Different mediums allow for varying types of engagement — while Instagram is visual and often video-based for brand and product marketing, podcasts bring an audio perspective that Kubo believes "can really help create more of the voice of the brand, whereas channels like social and search are kind of like your opportunity to pitch one value prop."

Leveraging offline initiatives is becoming an increasingly important piece of DTC brands' brand-building and omnichannel growth. Burrow, Away, Leesa, and Warby Parker have all opened brick-and-mortar stores or partnered with larger retailers to push their products offline and give customers unique experiences with their brands.

Implementing offline tactics has come under fire recently, though, as traditional retailers have shuttered thousands of stores over the last few years while in-store business continues to decline. In light of this so-called "retail apocalypse," it makes sense to question why the brick-and-mortar approach to retail is becoming a standard practice for DTC brands.

Panelist Brian Magida, Warby Parker's Director of Performance Marketing, explained how the company's first customers would come to the founders' apartment to try on their first few pairs of glasses. What felt like a ridiculous thing to ask of a potential customer became an intimate experience their initial base found valuable. Years later, having opened almost 100 retail locations nationwide, Warby Parker continues to offer personable offline experiences that have helped to grow their customer base and establish their prominence in the eyewear market.

Unlike a traditional storefront, pop-up shops and showrooms give brands the opportunity to explore new markets at a lower risk. Recently, female-founded brands LIVELY, FUR, and Blume collaborated to put together a shoppable pop-up event in New York City, featuring a panel of the brands' founders.

Despite the success these kinds of pop-up shops and events have had, brands still need to be mindful about where they place these temporary offline initiatives, because there's not always the same draw as a full-fledged brick-and-mortar presence.

This intentionality around omnichannel initiatives extends to partnerships as well. Retail partnerships can be a significant source of revenue (and often built-in brand marketing) for DTC brands, particularly those without a brick-and-mortar presence of their own. Partnerships provide the legitimacy of an established brand backing their products, an alternative to going it alone with a pop-up shop or physical retail launch.

These days, partnerships between larger retailers and DTC brands can come in many forms. Nordstrom partnerships have become a great avenue for growth in brand awareness and in revenue for brands like THINX, Dagne Dover, and Bonobos. Meanwhile, Leesa has teamed up with West Elm as their exclusive mattress partner, placing more permanent products in West Elm stores across the country.

Perhaps the newest iteration of brand partnerships has come in the form of temporary retail that provides a single space for multiple vendors. Texas-based Neighborhood Goods and New York's Showfields are havens for consumers seeking out DTC brands in a retail setting. These spaces house a rotating selection of brands, from Rothy's to Hims to Solé bicycles to Eight Sleep, offering companies that may not have a traditional brick-and-mortar strategy the chance to gauge the reception of a physical retail presence among consumers.

No matter how brands go about leveraging offline initiatives — whether through partnerships or a physical retail presence or even a subway ad — implementing a true omnichannel strategy doesn't mean throwing money into every channel with the hope that there will be a significant ROI based on the pure volume of the retail or marketing efforts. It must be executed intelligently, with intention.

The importance of capitalizing on data

Leveraging customer data and market research helps brands make smarter decisions and scale their growth more efficiently. Data drives personalization in ad content, where brands place retail locations, what product lines are introduced, and so much more. And while many like to say they're leveraging their customer data to the fullest extent, the truth is that most brands have only hit the tip of the iceberg.

To effectively build out a mission-driven brand with a loyal customer base and sustainably growing revenue, brands must focus on understanding who engages with them, identifying where there are opportunities for expansion, and accurately measuring the impact of their marketing and growth strategies.

Brands that have a loyal following have built trust with their customers, whether that's through selling high-quality products, serving up relevant ads, or providing helpful customer support. This trust requires that brands know who their customers are and what they want as consumers.

Panelist Mark Chou spoke to how Away leverages customer insights and primary market research to expand product lines that align with their customers' interests and create marketing content that resonates with their audiences. Both efforts can greatly increase customers' trust in a brand.

Similarly, Burrow works hard to understand what drives people to engage with their brand. According to Kubo, "Part of [brand-building] is making sure we're speaking to the right people, identifying the right audiences, and hitting them with the right value props at the right time." This goes for marketing initiatives that aim to scale both online and offline revenue.

In line with this thinking, Magida advises brands to "really focus on measurement, and be true about measurement," particularly when it comes to evaluating campaign and retail performance. When brands intelligently leverage data, they're making sure to consistently measure the impact of their efforts. Experiment and see what works — and more importantly, figure out what doesn't. It's imperative that brands make an effort to learn from the data they collect from various marketing campaigns and retail initiatives, and to employ those learnings effectively.

Often, customer data shows brands that their audiences are more likely to engage with them if they don't have to seek the brand out. Meeting customers where they are both in their journeys and in real life can have a significant impact on a brand's growth strategy.

As mentioned earlier, Leesa's use of direct TV ads was initially expected to drive immediate purchases. But because Leesa was smart about attribution and measuring their marketing efforts, they figured out how crucial this channel of outreach was to engaging customers who were actually already on their way toward making a purchase. The TV ad that found them right where they were at home was the final push they needed.

All in all, a brand can have a great mission and push omnichannel growth, but without carefully considering customer data, third-party insights, or primary market research, success is often limited.

Is DTC more than just a launch strategy?

There's been plenty of talk about the direct-to-consumer business model ultimately being unsustainable — great for a brand's initial launch, but not necessarily a smart long-term growth strategy.

The Direct Currents panelists' responses to these claims varied — Stafford doesn't believe any brand has to be a DTC "purist" and encourages companies to seek out growth opportunities that are "positive from a margin point of view," while Chou thinks it's possible to effectively grow an online-only DTC business. And these differing views are obvious in their brands' approaches to growth.

In the end, it seems only time will tell whether or not a strict DTC path can lead brands to success and stability.



Open source at Faraday

Seamus Abshere on

Here are Faraday's contributions to open source that we use every day in production.

(beta release) 3rd gen batch processing on k8s: falconeri

falconeri is a distributed batch job runner for Kubernetes (k8s). It is compatible with Pachyderm pipeline definitions, but is simpler and handles autoscaling, etc. properly.

(beta release) Seamless transfer between Postgres/Citus and BigQuery: dbcrossbar

dbcrossbar handles all the details of transferring tables and data to and from Postgres and Google BigQuery. Additionally, it knows about Citus, the leading Postgres horizontal sharding solution, so it can do highly efficient transfers between Citus clusters and BigQuery.

A new standard for secrets: Secretfile

secret_garden (Ruby), vault-env (JS), and credentials-to-env (Rust) all implement a standard we call Secretfile(s):

# /app/Secretfile
DATABASE_URL secrets/database/$VAULT_ENV:url
REDIS_URL secrets/redis/$VAULT_ENV:url

Then you use it like this: SecretGarden.fetch('DATABASE_URL').
Clients implementing this standard are meant to first check the environment for DATABASE_URL, then failing that look up the secret in Hashicorp Vault (interpolating $VAULT_ENV into production, staging, etc. first). It's very useful for development where your DATABASE_URL is just postgres://seamus@127.0.0.1:5432/myapp - you can save this in a local .env file and only mess with Vault in production/staging.
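
The lookup order described above (environment first, then Vault) can be sketched roughly as follows. This is not the secret_garden implementation: parse_secretfile and fetch_secret are hypothetical names, and the Vault call is replaced by a lambda so the sketch runs on its own.

```ruby
# Parses Secretfile text into { "NAME" => ["vault/path", "key"] },
# interpolating $VARS such as $VAULT_ENV from the given environment.
def parse_secretfile(text, env)
  text.each_line.with_object({}) do |line, entries|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    name, location = line.split(/\s+/, 2)
    location = location.gsub(/\$(\w+)/) { env.fetch(Regexp.last_match(1)) }
    path, key = location.split(':', 2)
    entries[name] = [path, key]
  end
end

# Checks the environment first; only falls back to the Vault reader if unset.
def fetch_secret(name, entries, env, vault_reader)
  return env[name] if env.key?(name)
  path, key = entries.fetch(name)
  vault_reader.call(path, key)
end

secretfile = <<~'TEXT'
  # /app/Secretfile
  DATABASE_URL secrets/database/$VAULT_ENV:url
  REDIS_URL secrets/redis/$VAULT_ENV:url
TEXT

# Stand-in for a real Vault client (assumption): echoes what it was asked for.
fake_vault = ->(path, key) { "vault:#{path}:#{key}" }
env = { 'VAULT_ENV' => 'staging', 'DATABASE_URL' => 'postgres://seamus@127.0.0.1:5432/myapp' }

entries = parse_secretfile(secretfile, env)
puts fetch_secret('DATABASE_URL', entries, env, fake_vault) # prints postgres://seamus@127.0.0.1:5432/myapp
puts fetch_secret('REDIS_URL', entries, env, fake_vault)    # prints vault:secrets/redis/staging:url
```

Passing ENV itself instead of the hash should also work, since ENV responds to key?, fetch, and [].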

Lightning fast CSV processing: catcsv and scrubcsv

catcsv is a very fast CSV concatenation tool that gracefully handles headers and compression. It also supports Google's Snappy compression. We store everything on S3 and GCS szip'ed using burntsushi's szip.

$ cat a.csv
city,state
burlington,vt

$ cat b.csv
city,state
madison,wi

$ szip a.csv

$ ls
a.csv.sz
b.csv

$ catcsv a.csv.sz b.csv
city,state
burlington,vt
madison,wi
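
For readers who want to see what "handles headers" means, here is a rough Ruby sketch of the same idea without the speed or the compression support. cat_csv is a hypothetical helper, and raising on a header mismatch is a choice made for this sketch rather than a statement about catcsv's behavior.

```ruby
require 'csv'

# Concatenate CSV strings, emitting the shared header only once.
def cat_csv(*sources)
  header = nil
  rows = []
  sources.each do |source|
    table = CSV.parse(source)
    next if table.empty?
    first, *rest = table
    header ||= first
    # Assumption for this sketch: refuse to mix files with different headers.
    raise ArgumentError, "header mismatch: #{first.inspect}" unless first == header
    rows.concat(rest)
  end
  return '' if header.nil?
  CSV.generate { |csv| ([header] + rows).each { |row| csv << row } }
end

a = "city,state\nburlington,vt\n"
b = "city,state\nmadison,wi\n"
combined = cat_csv(a, b)
```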

Of course, before you cat files, sometimes you need to clean them up with scrubcsv:

$ scrubcsv giant.csv > scrubbed.csv
3000001 rows (1 bad) in 51.58 seconds, 72.23 MiB/sec

Lightning-fast fixed-width to CSV: fixed2csv

fixed2csv converts fixed-width files to CSV very fast. You start with this:

first     last      middle
John      Smith     Q
Sally     Jones

You should be able to run:

$ fixed2csv -v 10 10 6 < input.txt
first,last,middle
John,Smith,Q
Sally,Jones,
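
The conversion itself is simple to picture. Below is a minimal Ruby sketch of slicing by column widths; fixed_to_csv is a hypothetical helper that illustrates the idea and is not how fixed2csv is implemented.

```ruby
require 'csv'

# Slice each line into columns of the given widths, strip the padding,
# and emit one CSV row per input line.
def fixed_to_csv(text, widths)
  CSV.generate do |csv|
    text.each_line(chomp: true) do |line|
      offset = 0
      fields = widths.map do |width|
        field = (line[offset, width] || '').strip
        offset += width
        # nil is written as an empty field; '' would be written as "".
        field.empty? ? nil : field
      end
      csv << fields
    end
  end
end

input = <<~INPUT
  first     last      middle
  John      Smith     Q
  Sally     Jones
INPUT

output = fixed_to_csv(input, [10, 10, 6])
```

Short lines, like Sally's, simply produce empty trailing fields.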

World's fastest geocoder: node_smartystreets

node_smartystreets is the world's fastest geocoder client. We shell out to its binary rather than using it as a library. It will do 10k records/second against the SmartyStreets geocoding API. If you don't have an Unlimited plan, use it with extreme caution.

Better caching: lock_and_cache

lock_and_cache (Ruby) and lock_and_cache_js (JS) go beyond normal caching libraries: they lock the calculation while it's being performed. Most caching libraries don't do locking, meaning that more than one process can be calculating a cached value at the same time. Since you presumably cache things because they cost CPU, database reads, or money, doesn't it make sense to lock while caching?

def expensive_thing
  @expensive_thing ||= LockAndCache.lock_and_cache("expensive_thing/#{id}", expires: 30) do
    # do expensive calculation
  end
end

It uses Redis for distributed caching and locking, so this is not only cross-process but also cross-machine.
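
To see why locking matters, here is a single-process sketch of the lock-then-cache idea using threads and a Mutex. LocalLockAndCache is a hypothetical class for illustration only; the real library takes its lock and stores its values in Redis.

```ruby
# In-process illustration: one lock per key, so only the first caller
# computes and everyone else waits and then reads the cached value.
class LocalLockAndCache
  def initialize
    @values = {}
    @locks = {}
    @guard = Mutex.new # protects the two hashes above
  end

  def fetch(key, expires:)
    lock = @guard.synchronize { @locks[key] ||= Mutex.new }
    lock.synchronize do
      entry = @guard.synchronize { @values[key] }
      return entry[:value] if entry && entry[:expires_at] > now
      value = yield
      @guard.synchronize { @values[key] = { value: value, expires_at: now + expires } }
      value
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

cache = LocalLockAndCache.new
calculations = 0

# Eight threads ask for the same value at the same time.
results = 8.times.map do
  Thread.new do
    cache.fetch('expensive_thing/1', expires: 30) do
      sleep 0.05 # simulate expensive work
      calculations += 1
      :result
    end
  end
end.map(&:value)
```

Without the per-key lock, several of the eight threads would miss the cache at once and each would run the expensive block.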

Better state machine: status_workflow

status_workflow handles state transitions with distributed locking using Redis. Most state machine libraries either don't do locking or use Postgres advisory locks.

class Document < ActiveRecord::Base
  include StatusWorkflow
  status_workflow(
    archive_requested: [:archiving],
    archiving: [:archived],
  )
end

Then you can do

document.enter_archive_requested!

It's safe to use in a horizontally sharded environment because it uses distributed locking: the second process that tries to do this will get an InvalidTransition error, even if it tries in the same microsecond.
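
The race can be pictured with a single-process sketch in which a Mutex plays the role of the Redis lock. LocalWorkflow and enter! are hypothetical names, and the transition table is copied from the example above; this is an assumption-laden illustration, not the status_workflow internals.

```ruby
# In-process illustration: check and update the status under one lock,
# so two racing callers cannot both make the same transition.
class LocalWorkflow
  class InvalidTransition < StandardError; end

  TRANSITIONS = {
    archive_requested: [:archiving],
    archiving: [:archived]
  }.freeze

  attr_reader :status

  def initialize(status)
    @status = status
    @lock = Mutex.new
  end

  def enter!(next_status)
    @lock.synchronize do
      allowed = TRANSITIONS.fetch(@status, [])
      unless allowed.include?(next_status)
        raise InvalidTransition, "#{@status} -> #{next_status}"
      end
      @status = next_status
    end
  end
end

document = LocalWorkflow.new(:archive_requested)

# Two workers race to start archiving the same document.
outcomes = 2.times.map do
  Thread.new do
    document.enter!(:archiving)
    :ok
  rescue LocalWorkflow::InvalidTransition
    :rejected
  end
end.map(&:value)
```

Exactly one worker wins; the other sees the status already at archiving, from which archiving is not an allowed next step.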

Rust build tools: rust-musl-builder and heroku-buildpack-rust

rust-musl-builder is how we build Rust apps on top of Alpine. It also drives heroku-buildpack-rust, the preeminent way of running Rust on Heroku.

Minimal postgres for node: simple-postgres

simple-postgres (JS) is just the essentials to talk to Postgres from Node. We particularly love its use of template literals for apparently magical escaping:

let account = await db.row`
  SELECT *
  FROM accounts
  WHERE id = ${id}
`

Yes, that's safe!

Minimal HTTP server: srvr

srvr (JS) is a small HTTP server that speaks for itself:

  • everything express does
  • better
  • less code
  • no dependencies
  • websockets

Proper Docker API support for Rust: boondock

boondock is a rewrite of rust-docker to be more correct.

Coordinate docker-compose: cage

cage boots multiple docker-compose.ymls, each as a pod. It's sort of like a local k8s. You configure it with a bunch of docker-compose files:

pods/
├── admin.yml (a pod containing adminweb and horse)
├── common.env (common env vars)
├── donkey.yml (a pod containing donkey)
├── placeholders.yml (development-only pod with redis, db, etc.)
[...]

Local development looks like this:

$ cage pull
==== Fetching secrets from vault into config/secrets.yml
==== Logging into ECR
Fetching temporary AWS 'administrator' credentials from vault
Pulling citus        ... done
Pulling citusworker1 ... done
Pulling citusworker2 ... done
Pulling queue        ... done
Pulling redis        ... done
Pulling s3           ... done
Pulling smtp         ... done
Pulling vault        ... done
Pulling horse        ... done
Pulling adminweb     ... done
[...]
$ cage up
Starting fdy_citusworker2_1 ... done
Starting fdy_smtp_1         ... done
Starting fdy_citus_1        ... done
Starting fdy_vault_1        ... done
Starting fdy_citusworker1_1 ... done
Starting fdy_queue_1        ... done
Starting fdy_s3_1           ... done
Starting fdy_redis_1        ... done
Starting fdy_horse_1 ... done
Starting fdy_adminweb_1 ... done
[...]
$ cage stop
Stopping fdy_citusworker2_1 ... done
Stopping fdy_vault_1        ... done
Stopping fdy_citus_1        ... done
Stopping fdy_s3_1           ... done
[...]

Fixed up rust crates: rust-amqp

rust-amqp@tokio (Rust) is our rewrite of the internals of the rust-amqp crate in proper tokio. It is much more reliable and needs to be merged upstream.

Conclusion

We love open source at Faraday!