Existential Consistency - Measuring and Understanding Consistency at Facebook

Paper digest - Existential Consistency: Measuring and Understanding Consistency at Facebook

Motivation

  • Provides real world insights on how to quantify trade offs between performance and consistency wrt replicated storage systems.
  • Describes a practical consistency monitoring system that tracks a new consistency metric ideally suited for health monitoring.

Introduction

  • Stronger consistency implementation increases latency and/or decrease throughput.
  • Eventual consistency (usually) provides good latency and/or throughput at larger scale. However there are drawbacks:
    • User visible anomalies: e.g. out order comments on social network post.
    • Programming complexity: complicated cases to reason about on weaker consistency model.
  • So lots of recent work focus on providing stronger consistency models to overcome drawbacks of a weaker consistency model.
  • But, not much work is done to quantify the trade offs, which this paper addresses.
  • Within a cluster, per-object sequential and read-afterwrite consistency are provided.
  • Across the entire system, eventual consistency is provided.
  • Consistency is measured by running offline consistency checker (that checks various consistency models) on a logs that gathers random samples of social graph system.

Background

  • Facebook’s replicated storage
    • Data model is directed graph.
    • User data is persistent in relational database, which is sharded, geo-replicated.
    • Each shard has a single maser region. Replication to slave region is async.
    • Database is cached (two-level) for the full replica per region. Root / leaf caches. Reads served by leaf cache.
    • Write path to cache: invalidation is async.
  • Consistency models:
    • The usual ones, mentioned again (linearizability, sequential, per object sequential, read-after-write, eventual.)
    • Facebook’s consistency: per-object sequential consistency and read-after-write consistency within a cache, and eventual consistency across caches.
    • When user session spreads across multiple leaf caches (e.g. load balanced, or leaf cache failed): eventual consistency.

Principled Consistency Analysis

  • Trace
    • Collect trace to identify violation of consistency models.
    • Trace only requests to a subset of vertices / edges stored in system for practical feasibility / avoid overhead.
    • Requests are logged on the web server that issuing the requests.
  • Deal with clock skew
    • Solved by adding offsets (We account for this clock skew by expanding the invocation time and response time, i.e., we subtract 35 ms from all invocation times and add 35 ms to all response times.)
  • Deal with losing logs
    • Use a secondary trace from Wormhole system which is lossy as well but combined with the writes from both trace has good coverage.
  • The checker
    • Anomaly Checkers
      • Maintaining a directed graph.
      • Vertices represent the state of an object.
      • Edges represent the constraints on the ordering.
      • Check state transition order observed by reads is consistent with these constraints.
    • Linearizability Checker
      • Model operation as vertices, edges as constraints.
      • Check cycles.
      • Linearizability requires that there exists a total order that is legal.
      • Merging read vertices into the write vertices they observe requires matching reads to write.
    • Per-Object Sequential and Read-After-Write Checkers
      • As an add on of linearlizability checker which is superset of sequential / RAW consistency.
  • Directions on future research into systems with stronger consistency
    • Build system were non-anomalous types have negligible overhead or
    • Provide stronger consistency for a small subset of a larger system.
    • While the latter would not prevent all anomalies, it would allow incremental deployment of these systems and significantly reduce the rate of anomalies.
    • Interestingly, such a subsystem that does provide linearizability is used within Facebook for a small set of object types, e.g., passwords.

Practical Consistency Analysis

  • Why - need real time (for monitoring) instead of principal analysis (can be only done offline).
  • φ-consistency

Conclusion

  • TAO is highly consistent.
  • There were anomalies under all of the consistency models we studied.