Estimation Algorithms assignment

Algorithms homework attached 

Get Help With a similar task to - Estimation Algorithms assignment

Login to view and/or buy answers.. or post an answer
Additional Instructions:

cs4102 Fall 2020 2 Randomness and Sorting Exercise 1 A Matching Pair For Problems 1 and 2 below we will be exploring algorithms to solve the following: Professor Brunelle has n socks in his drawer, but he never pairs up his socks after washing them. Let’s help him find the best way find a matching pair in the morning. Fortunately, Professor Brunelle knows that exactly 3n4 + 1 of his socks are identical, and so picking any two from that group would make a matched pair. Unfortunately, none of his other socks will match at all! (Where on earth could those other socks have gone?!). We’ll call M the set of all the identical socks of Professor Brunelle’s. We’ll call U the set of all unique socks. Note that |M |> 3 · |U | and |M |+|U |= n. You may assume n is a multiple of 4. Creative Commons BY-NC 4.0 Nathan Brunelle and David Wu Hasan Mukati Hasan Mukati cs4102 Fall 2020 3 Randomness and Sorting Exercise 1 Problem 1: Linear Time Expected Currently, this is the algorithm Professor Brunelle is using to select socks: repeat until there’s a match: s1 = random sock repeat n4 + 1 times: s2 = a random sock if s1 matches s2 then wear them, break return all socks to the drawer In other words, he picks a first sock at random, checks n4 + 1 different socks with it to see if any match. If there is no match, he replaces all of the socks, picks a new first sock, and repeats. We will show that this algorithm will require an expected Θ(n) comparisons by considering separately the expected number of iterations of the inner loop and the outer loop (i.e. the expected number of socks chosen as s2 and s1 respectively). Inner Loop: We’ll begin by focusing on how many times the inner loop might iterate for each choice of s1 (i.e., the expected number of times we select a different sock for s2 for a single choice of s1). Note that we expect different behaviors for the inner loop depending on if s1 ∈ U vs. if s1 ∈M . Consider these cases separately: Case 1: Assuming s1 ∈ U , show that the expected number of times the inner loop will iterate is Θ(n). Case 2: Assuming s1 ∈ M , show that the expected number of times the inner loop will iterate is Θ(1). Hint: Bound the probability that the algorithm continues after iteration i (i.e., the first time that s2 ∈M occurs after iteration i). Outer Loop: Next we’ll take a look at the outer loop. Observe that this outer loop will continue iterating until s1 ∈M . Once this event occurs, that is guaranteed to be the last iteration through the outer loop. For this reason, to determine the expected number of iterations of the outer loop, we will focus on finding the expected number of times s1 will come from U . Show that the expected number of times the outer loop will iterate is Θ(1). Hint: Bound the probability that the algorithm continues after iteration i (i.e., the first time that s1 ∈M occurs is after iteration i). Putting it together: Use all three of your answers above to find the overall expected running time of this algorithm in terms of the number of comparisons. Creative Commons BY-NC 4.0 Nathan Brunelle and David Wu Hasan Mukati Hasan Mukati cs4102 Fall 2020 4 Randomness and Sorting Exercise 1 Problem 2: A Better Algorithm Suggest a new algorithm that will allow Professor Brunelle to find a matching pair in a constant number of comparisons (in expectation). Algorithm: Describe your algorithm here. Make it clear where randomness is being used. Pseudo-code is acceptable if that’s what you prefer. Correctness: Justify how you know your algorithm is going to find a matched pair (eventually). Running Time: Show that the expected number of comparisons that will be done by your algorithm is Θ(1). Creative Commons BY-NC 4.0 Nathan Brunelle and David Wu Hasan Mukati Hasan Mukati cs4102 Fall 2020 5 Randomness and Sorting Exercise 1 Problem 3: The Count-Min Sketch and Frequency Estimation Google is interested in collecting some frequency statistics on user search queries and identify trending queries over time.1 One way that Google can do so is to maintain a hash table mapping search queries to counts. However, if there are n unique queries, this will require maintaining a table of size Ω(n), which at Google-scale, is rather infeasible. In this problem, we will explore the concept of a count-min sketch data structure which allows us to compute approximate frequencies (i.e., counts) with much less space. Let U be the universe of possible search queries (i.e., the set of all strings up to length 1000). We begin by allocating an array of counters (c1, . . . , cM ) where each counter is initialized to 0. We also sample a hash function h from a family of universal hash functions from U → {1, 2, . . . , M}. Then, whenever a user makes a search query q ∈ U , we compute i = h(q) and increment ci = ci + 1. To estimate the frequency of a query q ∈ U , we again compute i = h(q) and return the count ci. While this method is very simple, we show in this problem that it can be used to provide good frequency estimates. Let q1, . . . , qn be the queries that have been made so far, and for each i, let fi > 0 denote the number of times query qi has been made. Let N = f1 + · · ·+ fn be the total number of queries that have been made. Let Xi be a random variable for the count of qi reported by the above algorithm (i.e., Xi = ch(qi)). Expected value of estimate. Show that fi ≤ E[Xi] ≤ fi + N/M . Hint: Try defining an indicator random variable Xij = 1 if h(xi) = h(xj) and 0 otherwise. This shows that the expected error of this approach is roughly N/M . Quality of estimates. Show that Pr[Xi ≥ fi + 2N/M ] ≤ 1/2. Hint: You may use without proof that for a random variable X taking on non-negative values, Pr[X ≥ t] ≤ E[X]/t. This shows that with probability 1/2, the ’‘error” of this approach is within an additive factor of 2N/M . Reducing the error. We can reduce the error by using multiple independent copies of the above data structure. Namely, we initialize multiple arrays of counters C1 = (c1,1, . . . , c1,M ), . . . , Ck = (ck,1, . . . , ck,M ) and multiple independent hash functions h1, . . . , hk (each sampled independently from the family of universal hash functions). When we obtain a query q, we increment all of the counters c1,h1(q), . . . , ck,hk(q). To obtain an estimate for a query q, we take the minimum count across all of the arrays of counters: Xi = mini=1,...,k ci,hi(q). Show that Pr[Xi ≥ fi + 2N/M ] ≤ 1/2k. This shows that we can decrease the error probability exponentially using only linearly more space! The count-min sketch data structure has many applications for estimating frequencies in settings where it is infeasible to store all of the data (e.g., streaming environments). 1Google has implemented such a service: https://trends.google.com/trends/trendingsearches/daily. Creative Commons BY-NC 4.0 Nathan Brunelle and David Wu Hasan Mukati Hasan Mukati

Related Questions

Similar orders to Estimation Algorithms assignment
31
Views
0
Answers
Advance topics in machine learning (Kernels) 2 questions.
Any 2/3 Questions need answering on Advance topics in machine learning (Kernels). relevant notes can be shared upon request....
17
Views
0
Answers
Using R programming to perform functions and do work.
The expectations are to use R programming to answer the questions attached the file, and the result is also to be a .r file....
13
Views
0
Answers
write up a report using c programming.
Using the knowledge acquired from this module write a descriptive report to solve the programming problems listed below. The word count limit is 500 words You must include the full code implementation as appendix! I recommend you use Courier New size 1...
16
Views
0
Answers
Analyse requirements and select appropriate solutions. Design programmes that use appropriate data structures
The assignment requires you to select and implement appropriate data structures, design and implement algorithms and create the relevant software applications that will allow a user to store, update and manipulate the data relating to the operations of an...