Design Hit Counter

Design a hit counter which counts the number of hits received in the past 5 minutes.

Each function accepts a timestamp parameter (in seconds granularity) and you may assume that calls are being made to the system in chronological order (ie, the timestamp is monotonically increasing). You may assume that the earliest timestamp starts at 1.

It is possible that several hits arrive roughly at the same time.

Example:

HitCounter counter = new HitCounter();

// hit at timestamp 1.
counter.hit(1);

// hit at timestamp 2.
counter.hit(2);

// hit at timestamp 3.
counter.hit(3);

// get hits at timestamp 4, should return 3.
counter.getHits(4);

// hit at timestamp 300.
counter.hit(300);

// get hits at timestamp 300, should return 4.
counter.getHits(300);

// get hits at timestamp 301, should return 3.
counter.getHits(301);

Follow up:
What if the number of hits per second could be very large? Does your design scale?

Analysis

这篇博文很好:http://blog.gainlo.co/index.php/2016/09/12/dropbox-interview-design-hit-counter/

这其实是一个系统设计问题。从simple case入手,不考虑concurrency,scalability,先得到一个可行解。

Queue / Linked List

用Queue,LinkedList都可以直观地实现,依次存入timestamp,当超出时间窗口time window后,就把最早的节点/元素移除。这样getHits()和hit()都能

但是这个方法有个问题在于如果时间窗口time window里,突然有大量hit()请求,queue/linked list都会变得很大,这样空间复杂度space不够优化。

Window Buckets

更进一步的,是因为已知时间窗口,可以设定若干bucket,比如60 spots for 60 seconds。这样每一秒的hit都能归入一个bucket,只需要增加这个bucket内count的数字,而不用担心buckets[]所占用空间大小的剧增。姑且把它称为Window Buckets方法。


考虑Concurrent, Distributed

Solution

Straightforward Queue Inplementation

class HitCounter {
    ArrayDeque trac;
    /** Initialize your data structure here. */
    private final int FIVE_MINUTES = 300;
    public HitCounter() {
        trac = new ArrayDeque<Integer>();
    }

    /** Record a hit.
        @param timestamp - The current timestamp (in seconds granularity). */
    public void hit(int timestamp) {
        trac.addLast(timestamp);
    }

    /** Return the number of hits in the past 5 minutes.
        @param timestamp - The current timestamp (in seconds granularity). */
    public int getHits(int timestamp) {
        while(trac.size() > 0 && ( int) trac.getFirst() + FIVE_MINUTES <= timestamp) {
            trac.removeFirst();
        } return trac.size();
    }
}

Queue

class HitCounter {
    Queue<Integer> q;
    /** Initialize your data structure here. */
    private final int FIVE_MINUTES = 300;
    public HitCounter() {
        q = new LinkedList<Integer>();
    }


    /** Record a hit. 
        @param timestamp - The current timestamp  (in seconds granularity). 
    */
    public void hit(int timestamp) 
    { 
        q.offer(timestamp); 
    } 

    // Time Complexity : O(1) 

    /** Return the number of hits in the past 5 minutes. 
        @param timestamp - The current timestamp (in seconds 
        granularity). 
    */
    public int getHits(int timestamp) 
    { 
        while (!q.isEmpty() && (timestamp - q.peek() >= FIVE_MINUTES)) { 
            q.poll(); 
        } 
        return q.size(); 
    } 
    // Time Complexity : O(n) 

}

/**
 * Your HitCounter object will be instantiated and called as such:
 * HitCounter obj = new HitCounter();
 * obj.hit(timestamp);
 * int param_2 = obj.getHits(timestamp);
 */

Window Buckets 固定大小的窗口数组 times, hits @grandyang

定义了两个大小为300的一维数组times和hits,分别用来保存时间戳和点击数,在点击函数中,将时间戳对300取余,然后看此位置中之前保存的时间戳和当前的时间戳是否一样,一样说明是同一个时间戳,那么对应的点击数自增1,如果不一样,说明已经过了五分钟了,那么将对应的点击数重置为1。那么在返回点击数时,我们需要遍历times数组,找出所有在5分中内的位置,然后把hits中对应位置的点击数都加起来即可,参见代码如下:

  • Time: hit() - O(1), getHits() - O(M) where the M is the window size
  • Space: O(M) - the M is window size
class HitCounter {
    int[] hits;
    int[] times; 
    int WINDOW_SIZE = 300;

    /** Initialize your data structure here. */
    public HitCounter() {
        hits = new int[WINDOW_SIZE];
        times = new int[WINDOW_SIZE];
    }

    /** Record a hit.
        @param timestamp - The current timestamp (in seconds granularity). */
    public void hit(int timestamp) {
        int idx = timestamp % WINDOW_SIZE;
        if (times[idx] != timestamp) {
            times[idx] = timestamp;
            hits[idx] = 1;
        } else {
            hits[idx] += 1;
        }

    }

    /** Return the number of hits in the past 5 minutes.
        @param timestamp - The current timestamp (in seconds granularity). */
    public int getHits(int timestamp) {
        int count = 0;
        for (int i = 0; i < WINDOW_SIZE; i++) {
            if (timestamp - times[i] < 300) {
                count += hits[i];
            }
        }
        return count;
    }
}

/**
 * Your HitCounter object will be instantiated and called as such:
 * HitCounter obj = new HitCounter();
 * obj.hit(timestamp);
 * int param_2 = obj.getHits(timestamp);
 */

Reference

Dropbox Interview - Design Hit Count

https://www.geeksforgeeks.org/design-a-hit-counter/

http://www.cnblogs.com/grandyang/p/5605552.html

results matching ""

    No results matching ""