USACO 5.3 Milk Measuring

Brute-force DFS with heavy pruning. The search runs in passes: the first pass allows 1 pail, then 2 pails, and so on up to N pails. Each pass uses pruning to avoid exploring unnecessary search space. The passes are deepened via DFS, and we stop after the first pass that finds a solution, since the question asks for the smallest set of pails.

#include <vector>
#include <iostream>
#include <fstream>
#include <algorithm>

using namespace std;

int Q, N, total;
vector<int> pails, ans, cur;
bool found = false;

void dfs(int depth, int index, int left) {
    if (depth == total) {
        if (!left) {
            found = true;
            // dfs does not guarantee smallest order, so
            // keep this and compare with a future solution.
            ans = cur;
        }
        return;
    }
    // not feasible: out of pails, or the remaining amount is smaller
    // than every remaining pail (pails are sorted ascending).
    // note: this bounds check must happen before reading pails[index].
    if (index >= N || left < pails[index]) return;
    int current = pails[index];
    // current search is no smaller than the existing solution.
    if (found && current > ans[depth]) return;
    cur[depth] = current;
    // use this pail i times (at least once)...
    for (int i = 1; i <= left / current; i++) {
        dfs(depth + 1, index + 1, left - i * current);
    }
    // ...or skip it entirely.
    dfs(depth, index + 1, left);
}

int main() {
    ifstream fin("milk4.in");
    ofstream fout("milk4.out");
    fin >> Q >> N;

    pails = cur = vector<int>(N, 0);
    for (int i = 0; i < N; i++) {
        fin >> pails[i];
    }
    sort(pails.begin(), pails.end());

    for (total = 1; total <= N; ++total) {
        dfs(0, 0, Q);
        if (found) break;
    }
    fout << total; cout << total;
    for (int i = 0; i < total; i++) {
        fout << " " << ans[i]; cout << " " << ans[i];
    }
    fout << endl; cout << endl;
    return 0;
}

Paper Digest - Megastore: Providing Scalable, Highly Available Storage for Interactive Services

Abstract

  • Blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS.
  • Provides both strong consistency guarantees and high availability.
  • Provides fully serializable ACID semantics within fine-grained partitions of data.

Motivation

  • Online services are forcing the storage community to meet new demands.
    • Highly scalable.
    • Rapid development.
    • Low latency.
    • Consistent view of data.
    • Highly available.

Design Highlights

  • Partition data. Replicate unit is a partition.
  • Full ACID semantics within partitions only.
  • Across partitions, limited consistency guarantees (Spanner later improves this).
  • Provides database features (e.g., secondary indexes), but without scalability guarantees (can only scale within user-tolerable latency limits.)
  • Innovative usage of Paxos: use it for replication instead of its traditional usage (locking, leader election, replicating config data - e.g. ZooKeeper/ZAB.)

Availability with Scalability

  • Replication
    • Replicate a write-ahead log over a group of symmetric peers.
    • Any node can initiate reads and writes.
    • Each log append blocks on acknowledgments from a majority of replicas.
    • Optimizations on Paxos
      • Allow local reads.
      • Single roundtrip writes.
    • Multiple replicated logs instead of single log for better availability.
  • Key Concept - Entity Groups
    • Entity group is a collection of data and unit of replication.
    • Within a single entity group: ACID semantic.
    • Across entity groups:
      • Two phased commit (strong consistency)
      • Or async communications through message queues.
  • Physical Layout of data
    • Use BigTable for data storage.
    • Letting applications control the placement of data.
      • Applications try to keep data near users and replicas near each other.
      • The data for an entity group are held in contiguous ranges of Bigtable rows.

API Design

  • Cost-transparent APIs: runtime costs that match application developers’ intuitions.
  • Eliminate needs for joins: offer fine-grained control over physical locality.
  • Joins, when required, are implemented in application code.

Data Model

  • Lies between the abstract tuples of an RDBMS and the concrete row-column storage of NoSQL.
  • Data model is declared in a schema and is strongly typed.

Transactions and Concurrency Control

  • Use MVCC. Readers and writers don’t block each other, and reads are isolated from writes for the duration of a transaction.
  • Read semantics
    • current (single entity group)
    • snapshot (single entity group)
    • inconsistent (ignore state of log)
  • Queues
    • Cross-group batched operation delivery.
  • 2 Phased Commit for atomic update across entity groups.

Paxos based Replication (key innovations)

  • Pitfall of master-slave based Paxos
    • Master failover can require a complicated state machine.
    • A series of timers must elapse before service is restored.
    • It is difficult to avoid user-visible outages.
  • Fast Reads
    • Local read.
    • Coordinator tracks if a specific replica has observed the full state of writes so it can serve local reads.
  • Fast Writes
    • Single round trip writes.
    • Pre-preparing optimization: each successful write includes an implied prepare message granting the master the right to issue accept messages for the next log position.
    • If the write succeeds, the prepares are honored, and the next write skips directly to the accept phase.
    • Multi-Paxos: an independent instance of the Paxos algorithm for each log position.
  • Replica Types
    • Full Replica: contains all entity and index data and can serve current reads.
    • Witness Replica: votes and stores the WAL, but does not apply it (kind of like an observer in ZooKeeper, but not strictly the same: a ZK observer does not vote.)
    • Read-Only Replica: does not vote but contains full data as of a point in time in the past (more like a ZK observer, though a ZK observer is more up to date WRT its state.)

Coordinator

  • The coordinator is a simple process with no external dependencies and no persistent storage.
  • Failure Detection
    • Use an out of band protocol to identify when other coordinators are up, healthy, and generally reachable.
    • Coordinators obtain specific Chubby locks in remote datacenters at startup.
    • Reverts its state to a conservative default when it loses a majority of its locks and/or during partitions.
    • Brief and rare outage risk.
    • Liveness protocol is vulnerable to asymmetric network partitions.

Operation / Metrics / Experience - omitted.

Paper Digest - Paxos Made Simple

Motivation

  • Solve consensus problem on a single value (basic Paxos).
  • The consensus problem for a series of values that form a log is solved using Multi-Paxos.

Consensus Problem

  • A consensus algorithm ensures that a single one among the proposed values is chosen.
  • Chosen: meaning a consensus has been reached.
  • Safety properties of consensus:
    • Only a value that has been proposed may be chosen.
    • Only a single value can be chosen.
    • A process never learns that a value has been chosen unless it actually has been.
  • Liveness properties of consensus:
    • One of many proposed values is eventually chosen.
    • If a value is chosen, processes eventually learn about it.

Failure Models

  • Fail Stop: servers may crash, and stop forever.
  • Fail Recovery: servers may crash, but finally will restart and recover.
  • No Byzantine failures: messages can be delayed but they can’t be corrupted; servers can fail but they can’t behave maliciously.

Terms

  • Proposer: proposes values. A proposer proposes a value by sending a proposal, which contains the value (among other things), to acceptors.
  • Acceptor: accepts values proposed by proposers. Accepting means acknowledging to the proposer that the proposed value is accepted, by sending a response to the proposer.
  • Chosen: a value is chosen when a majority of acceptors accept the value.
  • A process can be both a proposer and an acceptor.

Choose a Value

  • Single acceptor:
    • Easiest and most intuitive approach :)
    • However, it is not fault tolerant. If the single acceptor crashes, the system will not be able to make progress.
  • Multiple acceptors:
    • Solve the SPOF problem for single acceptor case.
    • A value is chosen when this value is accepted by a majority of acceptors.
    • This majority of acceptors is called a quorum.
    • Why majority? We need to ensure that any two quorums overlap. By the pigeonhole principle, if a quorum is a majority, then any two majorities will have at least one acceptor in common.
    • Now why do we need to ensure that any two quorums overlap? This is for fault tolerance / recovery purposes, so that we always have at least one acceptor that has the complete history of the states (of the Paxos state machine.)
    • Now why do we need the complete history? We need this so only a single value is chosen (an important safety property). Why? Elaboration next.
  • How to choose a value
    • Case 1: single proposed value - only a single value is proposed from proposers.
      • If there is only a single value proposed ever, then this value has to be chosen.
      • This implies that an acceptor must accept the first proposal that it receives.
      • Conclusion: An acceptor must accept the first proposal that it receives. [P1]
    • Case 2: multiple proposed values - multiple values are proposed by proposers.
      • This raises a problem. With multiple proposed values, it is possible that no single value is accepted by a quorum. So without additional constraints, consensus may never be reached.
        • A sample case that consensus never reached:
          • 2 processes P1 and P2. P1 proposes V1 and P2 proposes V2.
          • V1 is accepted by P1 (or P2), V2 is accepted by P2 (or P1), because an acceptor must accept the first value it receives.
          • No proposal would ever get majority in such case.
      • To address the problem, where P1 must be true and yet a consensus has to be reached somehow, one additional constraint should be added:
        • An acceptor must be allowed to accept more than one proposal.
        • Since there are multiple proposals now, there is a need to distinguish between different proposals. The mechanism that serves this purpose is a proposal number.
        • A proposal number is a number attached to each proposal. How it is generated depends on the actual implementation.
      • With the new constraint that an acceptor can accept more than one proposal, there is a new problem: we may end up with multiple values chosen; this violates the safety property.
        • A sample case multiple values are chosen:
          • 5 processes, P1 through P5.
          • P1 proposes V1, P5 proposes V5.
          • V1 is accepted by P1, P2, and P3.
          • Assume there is a delay, and then V5 arrives to acceptor P3.
          • Since an acceptor can accept multiple values, P3 is happily accepting V5.
          • Meanwhile, V5 is accepted by P4, and P5. (It could also be accepted by P1 and P2 of course.)
          • The end result violates safety: V1 and V5 both gain a majority, so two values are chosen.
        • To satisfy safety property of only a single value is chosen, an additional constraint should be added.
      • If a proposal with value v is chosen, then every higher numbered proposal that is chosen has value v. [P2]
        • Because proposals are ordered (use a proposal number), this ensures only a single value is chosen.
        • To be chosen a proposal must be accepted by at least one acceptor. So, we can satisfy P2 by satisfying:
        • If a proposal with value v is chosen, then every higher numbered proposal accepted by any acceptor has value v. [P2a]
      • Constraint of P2 introduces a new problem that requires a strengthened constraint:
        • Now assume a value V is already chosen. Because choosing a value only requires a majority of acceptors, it is possible that the proposal for the chosen value never reached one or more acceptors.
        • Because an acceptor can also be a proposer, one of those unreached acceptors (now acting as a proposer) may propose a new value with a higher proposal number.
        • Because of P1, an acceptor must accept the first proposal it receives. So an acceptor that receives this new proposal, which contains a different value than the chosen one, must accept it.
        • This violates P2a, so we need to add more constraints.
      • Maintaining both P1 and P2a requires strengthening P2a to P2b: If a proposal with value v is chosen, then every higher numbered proposal issued by any proposer has value v.
        • This means before a proposer proposes a value, it has to look around and see if there is any existing chosen value.
        • But how to satisfy this new constraint P2b?
      • We can satisfy P2b by maintaining invariance of P2c.
      • For any v and n, if a proposal with value v and number n is issued, then there is a set S consisting of a majority of acceptors such that either (a) no acceptor in S has accepted any proposal numbered less than n, or (b) v is the value of the highest-numbered proposal among all proposals numbered less than n accepted by the acceptors in S. [P2c]
      • To satisfy P2c, now comes my favorite part of the paper:
        • To maintain the invariance of P2c, a proposer that wants to issue a proposal numbered n must learn the highest-numbered proposal with number less than n, if any, that has been or will be accepted by each acceptor in some majority of acceptors. Learning about proposals already accepted is easy enough; predicting future acceptances is hard. Instead of trying to predict the future, the proposer controls it by extracting a promise that there won't be any such acceptances. In other words, the proposer requests that the acceptors not accept any more proposals numbered less than n.

Proposer Algorithm

  • Two phases: prepare and propose.
  • Prepare: A proposer chooses a new proposal number n and sends a request to each member of some set of acceptors, asking it to respond with:
    • a. A promise never again to accept a proposal numbered less than n, and
    • b. The proposal with the highest number less than n that it has accepted, if any.
  • Propose: If the proposer receives the requested responses from a majority of the acceptors, then it can issue a proposal with number n and value v, where v is the value of the highest-numbered proposal among the responses, or is any value selected by the proposer if the responders reported no proposals.

Acceptor Algorithm

  • An acceptor can ignore any request without compromising safety.
  • We need to say only when it is allowed to respond to a request.
  • It can always respond to a prepare request.
  • It can respond to an accept request, accepting the proposal, iff it has not promised not to.
  • Optimization: an acceptor can ignore prepare requests with a lower proposal number than one it has already responded to, as well as accept requests for proposals it has already accepted.

Learning a Chosen Value

  • Use distinguished learners to reduce message complexity from the product of the numbers of acceptors and learners to their sum.

Implementation

  • A single Paxos instance is not so useful on its own, as it only reaches consensus on a single value.
  • Multiple instances of Paxos - called Multi-Paxos - allow consensus on a sequence of values. Very useful for building RSM (replicated state machine) systems.