QuestionQueue.txt
Addition of 2 big numbers (represented by 2 int arrays, assuming each integer can only be 0-9)
========
find the median of million rows in each of the 1000 servers.
==============
What was the most challenging aspect of your previous job?
==============
Write a class that iterates through a list of lists with methods next and hasNext. Next returns the next item in the list, and hasNext returns True or False depending on whether or not we are at the end of the list.
Example:
i = Iterator([[1, 2], [3, 4]])
i.next() returns 1
i.next() returns 2
i.hasNext() returns True
i.next() returns 3
i.next() returns 4
i.hasNext() returns False
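The iterator described above can be sketched in Python like this (a minimal sketch; the class and method names mirror the example, and empty sublists are assumed possible):

```python
class Iterator:
    """Walks a list of lists one item at a time, as in the example above."""

    def __init__(self, lists):
        self.lists = lists
        self.outer = 0  # index of the current sublist
        self.inner = 0  # index within the current sublist
        self._advance()

    def _advance(self):
        # Skip past exhausted or empty sublists so hasNext stays accurate.
        while (self.outer < len(self.lists)
               and self.inner >= len(self.lists[self.outer])):
            self.outer += 1
            self.inner = 0

    def hasNext(self):
        return self.outer < len(self.lists)

    def next(self):
        item = self.lists[self.outer][self.inner]
        self.inner += 1
        self._advance()
        return item
```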
=================
Given an unlimited stream of input, output the median value at any moment.
============
What was the most challenging issue/bug you came across and how did you resolve it?
==============
Millions of cache entries, each with a unique key, stored in several servers. how to sort all entries whose key falls within a specified range using Map Reduce? What if one server has a very small range of keys (say 1 to 100) and another server has a very large range of keys (say 100 to 1000000)? Sorry I don't remember the answer.
=================
BST running time (creation/updating/traversal)
=====================
Given a string and a list of character mappings of the form (a=>j,k), (d=>r), (t=>r,q,e,y), print out all the possible variants by applying the mapping to the original string.
example:
string="foo", mappings = "o=>1,2", "k=>d,l,w"
output:
foo
fo1
fo2
f1o
f11
f12
f2o
f21
f22
===============
Describe a current project. What was challenging about it? What did you learn from it?
==================
A quad tree is used to represent a black/white image. If you are provided with two such image representations, write a function to create a third tree that represents the merged image. (Black overrides white and mixed; mixed overrides white.)
=====================
Implement a random number generator such that the random number generated is always in a particular range. Perform a thorough time complexity analysis of it. How would you improve the solution?
=======================
If a person dials a sequence of numbers on the telephone, what possible words/strings can be formed from the letters associated with those numbers?
=====================
Please tell us about the most satisfying project you worked on in the last year?
=================
Please tell us about a personal challenge you had with someone who managed you?
======================
Find a repeated number in a list of numbers
Naive way: brute force, O(n^2) time
Better way: insert each number into a hashtable/hashmap/dictionary and stop at the first duplicate, O(n) time
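A sketch of the hash-based approach (assuming "find a repeated number" means return any value that occurs twice):

```python
def first_repeat(nums):
    """Return the first value seen twice, or None. O(n) time, O(n) space."""
    seen = set()
    for x in nums:
        if x in seen:
            return x
        seen.add(x)
    return None
```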
====================
How do you check whether a URL is bad really fast on a Google server? The point is for the user not to notice any lag from the check.
Use a Bloom filter. I don't really know how to do this.
I don't think a Bloom filter works here. It works as an approximate cache, but when you're talking about arbitrary URLs, you'll end up missing the cache a huge percentage of the time, hence not valuable.
================================
I was asked to determine whether two nodes were connected in a graph.
use union-find data structure (also called disjoint sets)
=================================
Design a structure that has these two functions: insert and getMedian. Discuss the time complexity.
Use a self-balancing binary search tree (BST) whose left and right subtrees from the root have the same size (or differ by one).
getMedian: O(1)
since the tree is balanced the median will always be the root node
insert: O(logN)
BST gives us O(logN) insertion, and rebalancing is O(logN) as well. It works in the following way (this is only the most difficult case): if upon insertion the left subtree has two more nodes than the right subtree, we need to rebalance the BST. Take the largest value in the left subtree (its rightmost node) and make that the new root; that works because this value is greater than everything in the left subtree and less than everything in the right subtree. Then take the old root value and insert it into the right subtree. The tree is balanced again and the median value is the root, ta-da! :)
I am not sure that the median is at the root. You can draw a tree that is balanced by depth (the depth of the left child minus the depth of the right child <= 1) where the root is not the median; depth balance does not keep the subtree sizes equal.
You can solve this problem by using 2 heaps:
- one max-heap for the 1/2 lowest elements;
- one min-heap for the 1/2 highest elements;
On insertion, insert into the appropriate heap and rebalance: if the difference in sizes is greater than 1, remove the maximum (minimum) from one heap and insert it into the other. This is done in O(log n).
To get the median: if the sizes of the 2 heaps are the same, take the average of the max-heap's top and the min-heap's top; otherwise, return the top element of the heap with the higher number of elements.
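The two-heap design above can be sketched in Python with the standard heapq module (which only provides a min-heap, so the lower half is stored negated to act as a max-heap):

```python
import heapq

class RunningMedian:
    """Two heaps: max-heap of the lower half (negated), min-heap of the upper half."""

    def __init__(self):
        self.lo = []  # max-heap via negation; may hold one extra element
        self.hi = []  # min-heap

    def insert(self, x):
        # O(log n): push through the low side, then rebalance sizes.
        heapq.heappush(self.lo, -x)
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def get_median(self):
        # O(1): the median sits at the heap tops.
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2
```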
===========================
Try to figure out the top unique search queries from a database with 100 billion entries
Use a hash table then sort the counting results
Not sure if this is a trick question.
Are you trying to find the most queried search term? If so, then how is it "unique"?
Assuming it's not unique, this is a typical application of the map/reduce paradigm. Assign each query a value of 1 (map) and then count the number of occurrences (reduce). Google sort of popularized the shift towards map/reduce anyway.
==================
Sample uniformly on a circle centered at zero with radius 1.
Randomly generate an angle X uniformly in [0, 2*pi).
We know that cos(X)^2 + sin(X)^2 = 1, so just use x = cos(X) and y = sin(X).
I think that was the whole point of the question (and I don't like it very much)
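A minimal sketch of that answer, assuming the intent is to draw the angle uniformly from [0, 2*pi):

```python
import math
import random

def random_point_on_unit_circle():
    """Pick an angle uniformly; (cos t, sin t) is then uniform on the circle."""
    t = random.uniform(0.0, 2.0 * math.pi)
    return math.cos(t), math.sin(t)
```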
=======================
Write code to check whether every element of an array has been visited (you are given a starting position and each position stores a pointer to a subsequent position)
================
You have a genealogy: 1) Describe a data structure to represent it. 2) Given any two people within the genealogy, describe an algorithm to determine if they share a common ancestor. You just need to return true/false, not all ancestors.
1) Each person is represented by an object, with two pointers: "mom" and "dad" that point to their respective parent.
2) Choose one of the two people arbitrarily. Perform depth-first traversal and create a set of ancestors reachable from that person. Perform depth-first traversal on the 2nd person and for each node visited check if they are in the set; if yes, return true. Use a hash-set for best performance.
If you're optimizing for the worst case, a hash set is O(n) per search. You'd do better in the worst case with a sorted structure. You'd do even better if you store a "visited" flag at each node. Alternatively, you can number each node and keep an array of visited flags.
Since depth-first search might follow a long chain through "dad" before checking "mom", you're better off with breadth-first search: anything reachable in one hop is seen before anything reachable in two hops at best.
Yes. Good points. The second traversal should be breadth first. The first traversal it doesn't matter, as you need to visit all ancestors anyways.
The use of a visited flag is a good optimization. Based upon the way the question was worded, it wasn't clear if you could design the data structure to optimize the algorithm or not.
=====================
Print the m-th to last elements of the linked list
=============================
Create a binary tree based on a sorted array
If you already have a sorted array, then you could say that it is already a minimum heap and that is a binary tree.
Use binary search recursively: make the middle element the root and build the two halves as subtrees, giving a balanced tree.
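A sketch of the binary-search construction: recursively take the middle element as the root, and the two halves become the subtrees (the node class is illustrative):

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def build_balanced(arr, lo=0, hi=None):
    """Build a height-balanced BST from a sorted array, binary-search style."""
    if hi is None:
        hi = len(arr)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return TreeNode(arr[mid],
                    build_balanced(arr, lo, mid),
                    build_balanced(arr, mid + 1, hi))
```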
===================================
Assume you are writing numbers in ascending order to an array of constant size. Once you reach the end of the array, you start writing from the beginning, thus writing over the oldest entries. Write an algorithm for finding a specific number in this array.
Start with the beginning of the array.
If n < the first element,
start from the end and look for the element until you see a bounce in the ordering;
else
look in the first half until you see the bounce in the ordering.
By bounce I mean increasing and then a sudden decrease, or vice versa.
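The "bounce" idea can be tightened into a binary search: in any window of a rotated sorted array, at least one half is in order, and that tells you which half can contain the target. A sketch, assuming distinct values:

```python
def find_rotated(arr, target):
    """Binary search in a sorted array rotated at an unknown pivot.
    Returns an index or -1. O(log n), assumes distinct values."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        if arr[lo] <= arr[mid]:          # left half is sorted
            if arr[lo] <= target < arr[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:                            # right half is sorted
            if arr[mid] < target <= arr[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1
```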
===================================
You have a sequence where each number is a product of powers of 2 and 5 (so: 2^i * 5^j). He gave the beginning of the sequence as 1, 2, 4, 5, 8, 10, 16... and asked me to find an algorithm to calculate the next number in the sequence.
Set-I is powers of 2 {2, 4, 8, 16, 32, 64....}
Set-II is powers of 5 {5, 25, 125....}
Set-III is composite function of both the above set {10, 20, 40,50, 80.......}
Now start with Set-I and keep displaying its elements while they are less than the first element of Set-II and Set-III; otherwise display the element of Set-II or Set-III, whichever is less.
Note: Set-III is formed by multiplying elements of Set-I and Set-II in sorted order. (A code snippet is required to generate it.)
public class googleAnswer {
    public static void main(String args[]) {
        int[] sequence = {1, 2, 4, 5, 8, 10, 16, 20, 25, 32};
        System.out.println(getNextNumber(sequence));
    }

    public static int getNextNumber(int[] sequence) {
        int curNum = sequence[sequence.length - 1];
        System.out.println(curNum);
        boolean found = false;
        while (!found) {
            curNum++;
            int tempNum = curNum;
            while (tempNum % 5 == 0) {
                tempNum = tempNum / 5;
            }
            while (tempNum % 2 == 0) {
                tempNum = tempNum / 2;
            }
            if (tempNum == 1) {
                found = true;
            }
        }
        return curNum;
    }
}
The interviewer said something like multiply each number by 2 and also by 5 (two separate operations). This would give you:
1 x 2 = 2
1 x 5 = 5
then look at the smallest unused number and use that for the next set of operations:
2 x 2 = 4
2 x 5 = 10
and so on ...
I like the way of thinking about it as three sets better. Too bad I didn't think of that during the interview! I was hung up looking for a pattern in the exponents since he phrased it as 2^i * 5^j. It seemed like there was one until he gave me sample values after 20.
This question really threw me off because it's not like the other google interview questions I read about. It would have been interesting to pursue it past the math sequence portion (merging sets of streams of numbers).
Thanks for the feedback, though!
Here's a short Python solution. It doesn't print the first element in the sequence (1):
#!/usr/bin/env python
from operator import itemgetter

a = 2
b = 5
x = 0
y = 0
ptr1 = 2
ptr2 = 5
for i in range(30):
    h = {a**(x+1) * b**y: (lambda x, y, ptr1, ptr2: [x+1, y, ptr1, ptr2]),
         a**x * b**(y+1): (lambda x, y, ptr1, ptr2: [x, y+1, ptr1, ptr2]),
         a * ptr1 * ptr2: (lambda x, y, ptr1, ptr2: [x, y, ptr1*a, ptr2]),
         ptr1 * b * ptr2: (lambda x, y, ptr1, ptr2: [x, y, ptr1, ptr2*b])}
    for k, v in sorted(h.items(), key=itemgetter(0)):
        if k > 0:
            print(k)
            x, y, ptr1, ptr2 = v(x, y, ptr1, ptr2)
            break
Well, we can make several observation:
* a number multiplied by 8 (2^3) is always larger than that multiplied by 5.
* if we seed the list with 2 consecutive multiple of 2 (e.g. 2 and 4), we do not need to multiply the number by 2 (2*2 = 4 is already in the list) and only need to multiply by 4 (see that 2*4 = 8, then 4*2 = 8, oops already in the list)
* if we have a number that is divisible by 5, we simply need to multiply it by 5 since the results of multiplying the number by 2 or 4 would have been generated previously (we note that 5n * 4 is equivalent to 4n * 5, obviously 4n < 5n; similar argument for multiplication by 2).
Hence, here is a very simple algorithms to generate the first n numbers in the sequence:
1. we start with a seed of [1, 2]
2. we iterate through the list until length of list = n, for each number k we see,
(a) if k is divisible by 5, append to the list 5k, otherwise
(b) append 4k and 5k
def createPowerList(n):
    lst = [1, 2]
    i = 0
    while len(lst) < n:
        currentValue = lst[i]
        if currentValue % 5 == 0:
            lst.append(currentValue * 5)
        else:
            lst.append(currentValue * 4)
            lst.append(currentValue * 5)
        i = i + 1
    return lst
Forget that. It makes a small mistake. Instead this simpler version would work much better. :)
def createPowerList2(n):
    lst = [1]
    power_of_2 = 2
    multiply_by_5 = 0
    while len(lst) < n:
        if power_of_2 < (lst[multiply_by_5] * 5):
            lst.append(power_of_2)
            power_of_2 *= 2
        else:
            lst.append(lst[multiply_by_5] * 5)
            multiply_by_5 += 1
    return lst
============================
Find the longest word in a dictionary, such that the word can be built one character at a time, and all substrings are words in the dictionary. A new character can be added anywhere.
=============================
Searching a binary tree using a queue (i.e., breadth-first): what is the worst-case memory requirement?
===========================
Given a large web query log file, how to find the most frequent K queries
============================
Easier questions 1) Two variations of a program exist, A and B. When run on machine 1, A runs twice as fast as B. When run on machine 2, A runs 10x as fast as B. How would you investigate this? -same datasets -same versions of OS and libraries -same hardware 2) Had to code a method that calculated the factorial, first simply, then a second one recursively.
One reason is that the program has some randomized behavior, and that affects its speed.
Perhaps second machine had more memory, so no swapping was necessary. Perhaps it had more CPUs, and one program was able to take care of that (using multi-threading) but the other one could not. Perhaps second machine had SSE instructions, and the programs running were doing something like decoding movies, perhaps the second machine had a more powerful GPU and one program could take advantage of that.
Answer 1: Processor cache was smaller on slower machine
=========================================
Trickier question: code a method, given the following method signature, that will print out any numbers that appear in both arrays of numbers. Example arrays: // 4, 18, 25, 40, 411 // 20, 25, 40, 320, 1009, 1100
void intersect(int[] arr1, int len1, int[] arr2, int len2) {
    // assumes both arrays are sorted in ascending order
    int startIndex = 0;
    // outer loop over the first array
    for (int i = 0; i < len1; i++) {
        // inner loop over the second array, resuming where the last match left off
        for (int j = startIndex; j < len2; j++) {
            if (arr1[i] == arr2[j]) {
                System.out.println(arr1[i] + "\t");
                startIndex = j + 1;
                break;
            } else if (arr1[i] < arr2[j]) {
                // arr2[j] is already past arr1[i]; move on to the next arr1 element
                break;
            } else {
                // arr2[j] is smaller than arr1[i]; it cannot match any later arr1 element either
                startIndex = j + 1;
            }
        }
    }
}
Even if they're not sorted, the first commenter's answer is suboptimal. Assume array sizes are n and m with n > m. Then sorting both arrays is O(n log n). Once sorted, you just need to make one pass over each array --- O(n + m) --- to find common elements. Just have two pointers at the head of each list, and use these to move in sorted order through the union of lists. Whenever both pointers point to the same value, you have an intersection.
If you want a very simple answer, use hashtable to store the contents of the first list (O(n)). Then, for each member of the second list, check whether the value is in the hashtable (each check is O(1)). We have O(n + m) algorithm that is extremely simple to code.
if sorted then you can avoid the nsquare with binary search
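The two-pointer pass described above can be sketched in Python (returning the common values rather than printing them; both inputs are assumed sorted):

```python
def intersect_sorted(a, b):
    """Walk two sorted lists with one pointer each; emit matches. O(n + m)."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1        # a[i] is too small to ever match
        else:
            j += 1        # b[j] is too small to ever match
    return out
```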
============================================
Give 2 coding solutions for returning an array with duplicates removed. One solution in O(n^2) and the other linear.
Create a new array. Run through the original array, on each element insert to new array in sorted order. O(n^2)
Hash-based deduplication is O(n) on average. But what if the hash function is "return 0"? The table degenerates into a linked list.
Not sure how you can do this in O(n). If the values are integers of a set size, you can use a lookup table or sort in O(size*n) . Otherwise sort O(nlogn)
You can use a hash. Either assume that you have a good hash function or write one (I doubt the interviewer would ask you to do a very complex hashing, but he'd appreciate you telling him about the assumption you made).
Instead of the solution by venk T, I'd suggest an easier one. For every value you have, push it into the hashtable; a standard hashtable will usually tell you whether the value was already present. If the value is not already in the hashtable, output it in your answer; otherwise don't. This cuts out the last step of reading all the values back off the hashtable.
======================================
Describe the implementation (along with data structures) involved in making a program that: 1. inserts a number 2. returns the median of all unique numbers
=====================================
A very long singly linked list: how to find the n-th entry from the end.
This is a frequently asked question. The answer is to use two pointers to track the locations, but I didn't realize at first why this is better than another method:
scan the whole list once and calculate the length, N, then move a pointer from the head by N-n steps.
The answer is I/O efficiency. If n is small, an n-entry window of the list can be kept in cache, and in the two-pointer method the second pointer can read directly from cache.
Set two pointers, first and second:
1. the second pointer steps forward n steps while the first pointer stays at the head of the list
2. the first and second pointers then step forward together until the second pointer hits the tail
3. the first pointer is your answer
Interview Candidate is right: it's the same amount of work.
In the two-pointer method, you are still iterating over the entire list with that second pointer, and the first pointer iterates N-n times.
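A sketch of the two-pointer version in Python (the Node class is illustrative; n = 1 means the tail):

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def nth_from_end(head, n):
    """Two pointers kept n steps apart; when the lead pointer falls off the
    end, the trailing pointer sits at the n-th entry from the end."""
    lead = trail = head
    for _ in range(n):
        if lead is None:
            return None  # list is shorter than n
        lead = lead.next
    while lead is not None:
        lead = lead.next
        trail = trail.next
    return trail
```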
====================================
Difference between function overloading and overriding.
I explained that overriding relies on runtime polymorphism through inheritance, while overloading is resolved from the argument types at compile time. He said right, that's what he wanted to know.
=============================
How would you find the maximum element in an array of numbers?
2 loops, comparing each element with all the others, since it's not a sorted array.
Honestly, something like this is probably the best (assuming an unsorted array):
int getBest(int[] nums) {
    int max = nums[0]; // assumes nums is non-empty; ensures max is always initialized
    for (int i = 1; i < nums.length; i++) {
        if (nums[i] > max) {
            max = nums[i];
        }
    }
    return max;
}
You have to look at every element once, because if you skip an element you can't be sure it isn't the biggest. The lowest bound we can do is O(n). However, two loops makes this an O(n^2) problem, doing much more work without helping us solve the problem.
sort the list. quicksort gives average of O(nlogn) complexity.
return the last element in the list O(1) complexity.
===============================
How would you find the maximum element in a list of items which implements a comparator?
Unless we want to know something else, this is the same as the array question. Iterate through the list, keeping track of a "max" item, and compare each element to the max so far. The comparator is a red herring, probably meant to get you to think about sorting.
Sorting before we search is not the answer here. For a comparison based sort, the best you can do is O(n log n), which isn't as good as linear search over the unsorted array. Sorting wins if we need to make arbitrary queries after the first.
I guess then that there is no one, correct answer, since we are defining the question. I'd define the "best" element as the element which was scored as "better" than the most others, and compare each element with each other one. For that, the best you can do is O(n^2). You could, though, redefine this - for instance, the "best" element is the element which beat the most other elements that it lost to. It's a more vague problem than originally described.
I think the analogy I'd use would be sports teams over the course of a regular season, where team A might beat every other team except team B, who beats team A but loses several other games. I would definitely ask what the heck this guy was talking about, though - because using words like "maximum", "best" and "comparison" definitely imply some sort of logical ranking.
=============================
A base class has a method that takes an Object as its argument, and the inherited class "overrides" it with a version that takes a String parameter. If the method is called with an Object argument, which method will be invoked?
Method lookup starts with the instance's class and falls back to the base class when no signature matches the arguments. So the answer is that the base-class method will be called, because only the base class has a method matching an Object argument (the String version is an overload, not an override).
He was happy.
===========================
Give us an example where you solved and handled some complex issue in your current project.
I explained one scenario involving a design change. I felt he was looking for something else, but based on my current job, that's what I do.
====================
Find if there are two members of an array that sum to 10 (10 and 0 count, but 10 alone does not).
My first guess is something like the following O(n^2) approach, where 10 would be passed as the sum parameter:
int[] findSum(int[] nums, int sum) {
    for (int i = 0; i < nums.length; i++) {
        int target = sum - nums[i];
        for (int j = 0; j < nums.length; j++) {
            if (j != i && nums[j] == target) { // j != i so that 10 alone doesn't count
                return new int[] { nums[i], target };
            }
        }
    }
    return null; // no suitable pair found
}
I would use an array indexed by value to count how often the numbers between 0 and sum show up, then go through it to see if we have two that match. It is O(n + sum), and doesn't work if sum is large or the numbers can be negative.
int[] findSum(int[] nums, int sum) {
    int[] counts = new int[sum + 1];
    // count how many times each value in [0, sum] occurs
    for (int i = 0; i < nums.length; i++) {
        if (nums[i] >= 0 && nums[i] <= sum) {
            counts[nums[i]]++;
        }
    }
    for (int i = 0; i <= sum / 2; i++) {
        if (i == sum - i) {
            // e.g. 5 + 5 = 10 requires two copies of the same value
            if (counts[i] >= 2) {
                return new int[] { i, i };
            }
        } else if (counts[i] > 0 && counts[sum - i] > 0) {
            return new int[] { i, sum - i };
        }
    }
    return null;
}
Basically sort the elements in n log n . Then for every element k search for (Sum - k) in the array using a binary search. Hence total complexity is n log n.
You can use a hashmap from value to number of occurrences. Enter all numbers into the hashmap (O(n)), then go through the array, calculate 10 - n for each member, and check whether that number has occurrences > 0 in the hashmap. If so, you have a match; don't forget to decrease the occurrence count by 1 (O(n)). Total O(n).
Note that if duplicates are allowed (such that 5 + 5 = 10 is possible), you may need to check whether n = sum - n; if so, you must make sure the number of occurrences is > 1 and then reduce the occurrences by 2 instead.
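A sketch of the count-based approach in Python (a dict plays the hashmap; the duplicate case 5 + 5 = 10 is handled as described, and an element is never paired with itself):

```python
def find_pair_with_sum(nums, target):
    """Return a pair of elements summing to target, or None. O(n)."""
    counts = {}
    for x in nums:
        counts[x] = counts.get(x, 0) + 1
    for x in nums:
        need = target - x
        if need == x:
            if counts[x] >= 2:       # need two copies of the same value
                return (x, x)
        elif counts.get(need, 0) > 0:
            return (x, need)
    return None
```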
================================
What cool things did you do in your previous job. Basically the interviewer wants to know if there can be a good fit between me and the team.
================================
You're given a binary tree and pointers to two nodes in the tree. Describe the fastest algorithm you can come up with to determine the closest common ancestor.
This one is covered a lot on the web and in technical books, but every one I've seen assumes it's a binary search tree. Here he didn't want me to assume it was a binary search tree.
I think I would go with a modified DFS which terminates early. I would need a variable outside the function, lowest, to hold the closest ancestor:
Node lowest = null; // set to the closest common ancestor once it is found

// returns 0 - none found, 1 - a found, 2 - b found, 3 - both found
int modifiedDFS(Node node, Node a, Node b) {
    if (node == null || lowest != null) {
        return 0; // empty subtree, or the answer was already found: terminate early
    }
    int res = 0;
    if (node == a) res |= 1;
    if (node == b) res |= 2;
    res |= modifiedDFS(node.left, a, b);
    res |= modifiedDFS(node.right, a, b);
    if (res == 3 && lowest == null) {
        lowest = node; // the first node to see both a and b is the closest common ancestor
    }
    return res;
}
That may or may not work (I didn't trace through the code), but there's a much simpler and faster way to do it. Worst-case running time is O(lg n) for a balanced tree. With yours, worst-case running time is O(n).
Remember: you're given pointers to the nodes in question. Traversing from a leaf node to the root is worst-case O(lg n) operations in a balanced tree. Think of something you could do to take advantage of that.
If there is parent pointer, the answer is straightforward. The simplest of which is just to traverse up from 1 node and record all the nodes, then traverse up from the other node and find the first node that match the recorded nodes. This is O(lg n) in a balanced tree, O(n) in worst-case.
If there is no parent pointer, I'd use tree traversal (pre-, in-, or post- doesn't matter much). The closest common ancestor would be the only node that find one node on its left children (or itself) and the other node on its right children (or itself). This is O(n).
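The no-parent-pointer case described above can be sketched as a post-order recursion in Python (the node class is illustrative): a node is the closest common ancestor exactly when the two targets are found in different subtrees, or one of them is the node itself.

```python
class TreeNode:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def lowest_common_ancestor(root, a, b):
    """Closest common ancestor in a plain binary tree. O(n)."""
    if root is None or root is a or root is b:
        return root
    left = lowest_common_ancestor(root.left, a, b)
    right = lowest_common_ancestor(root.right, a, b)
    if left and right:
        return root          # a and b sit on opposite sides
    return left or right     # both on one side, or not found in this subtree
```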
=============================
You're writing an application that receives a stream of individual items of data. The stream may be very long or very short, but you have no way of knowing how long it is (i.e. there's no trick to figuring out the size of the stream of data). How would you go about choosing m items such that any subset of m items was equally likely? (Not an even distribution of values, but just that any m items are equally likely to be chosen). So for example, m=1000, and the number of items in the stream, n, may be 1000, or 10000, or 100000000, or much much larger; there is no way to know how many.
The algorithm is rather straightforward:
1. Pick the first m members in the stream (if there is less than m, just return the entire list).
2. For each subsequent member of the stream, pick it with probability m/n, where n is the number of elements seen so far. If the member is picked, replace one member chosen at random from the already selected m members with the new member.
We can prove that this is uniform via induction. If there are m members (or fewer), each member is kept with probability 1, which satisfies the requirement that the probability is m/N. For the induction step, assume each of the first k-1 members is picked with probability m/(k-1); at step k we have 2 cases to consider:
(a) the k-th member has m/k probability of being picked, so the induction step holds for the k-th member.
(b) for every previous member, the probability of being picked is now P(k-th member not picked)*P(original) + P(k-th member picked)*P(original)*P(member is not dropped) = (k-m)/k * m/(k-1) + m/k * m/(k-1) * (m-1)/m = m/(k-1) * [(k-m)/k + (m-1)/k] = m/k, which is the required probability.
Another idea: assign each member a random number as it arrives and keep the m members with the smallest random numbers at any time. The worst case is O(n*log(m)) runtime, but after a long run the kept m members have very small random numbers, so a newly generated number is very likely larger and most comparisons finish quickly.
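The m-element algorithm above is reservoir sampling; a minimal Python sketch:

```python
import random

def reservoir_sample(stream, m):
    """Keep the first m items; thereafter item number n (1-based) replaces a
    random slot with probability m/n. Every size-m subset is equally likely."""
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= m:
            reservoir.append(item)
        else:
            j = random.randrange(n)   # uniform in [0, n)
            if j < m:                 # happens with probability m/n
                reservoir[j] = item
    return reservoir
```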
=================================
Given a stream of integers of unknown (possibly large) length, how would you pick one at random? Now prove it's random.
Question was not worded very well, however we eventually reached mutual understanding.
Answer: Store one integer at most. For each element pulled from the stream, decide whether or not to store it in the one slot, with probability 1/N (N is the current count of integers encountered).
proof is easy:
probability of an object appearing (being chosen) to be in some previous slot K starts with 1/K as per the algorithm. To "survive" to the Nth slot, the probability is
(let M be distance of K to N: M = N - K or N = K + M)
1/K * K/(K+1) * (K+1)/(K+2) * ... * (K+M-1)/(K+M) , the last denominator simplifies to N
All the numerators and denominators cancel except 1/N.
You can prove by induction, probably simpler. If the list only contains 1 member, probability is 1/1 = 1, which is correct. For the induction step, consider what happen when k-th member is encountered:
case(a): for the k-th member, probability of being picked is 1/k, which is what we want.
case(b): for each of the previous (k-1) members, probability of being picked become P(k-th not picked) * P(original, i.e. 1/(k-1)) = (k-1)/k * 1/(k-1) = 1/k, which is what we want.
[]
It is more fun if we were to pick m members at random; then the proof becomes less trivial (though almost equally straightforward).
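The m-member variant discussed above is standard reservoir sampling. A minimal Java sketch (class and method names are my own, not from the post):

```java
import java.util.Random;

// Reservoir sampling sketch: keeps m items from a stream so that every
// item ends up in the sample with probability m/k after k items are seen.
class Reservoir {
    public static int[] sample(int[] stream, int m) {
        Random rnd = new Random();
        int[] picked = new int[m];
        for (int k = 0; k < stream.length; k++) {
            if (k < m) {
                picked[k] = stream[k];      // fill the reservoir first
            } else {
                int j = rnd.nextInt(k + 1); // uniform in [0, k]
                if (j < m) {                // happens with probability m/(k+1)
                    picked[j] = stream[k];  // replace a uniformly chosen slot
                }
            }
        }
        return picked;
    }
}
```

With m = 1 this reduces exactly to the single-slot algorithm above.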
======================
Define binary search tree. Develop a procedure to verify a binary search tree.
@Bala
Recursively:
Check node.left < node.value and node.right > node.value. Also check that all values in the subtree of node.left are < node.value and that all values in the subtree of node.right are > node.value. Make sure you handle the edge cases (e.g. leaf nodes).
A recursive solution is a good option, and it is definitely easy to code. Note that a BFS that only compares each node against its immediate children is not sufficient: a node deep in the left subtree can still violate the BST property with respect to an ancestor, so each node must be checked against min/max limits inherited from all of its ancestors, as the code below does.
public static boolean checkTree(BSTNode<Integer> node, Integer leftLimit, Integer rightLimit)
{
    if (node == null)
        return true;
    if (leftLimit != null && leftLimit > node.info)
        return false;
    if (rightLimit != null && rightLimit < node.info)
        return false;
    return checkTree(node.left, leftLimit, node.info) && checkTree(node.right, node.info, rightLimit);
}
==================================
Write a function to find intersection of 2 sorted arrays.
// NOTE: this version returns only the first contiguous run of common elements.
void intersection(int* arr1, int* arr2, int len1, int len2, int** res, int& len)
{
    *res = NULL;
    len = 0;
    if (len1 == 0 || len2 == 0)
        return;
    int i1 = 0;
    int i2 = 0;
    while (1)
    {
        if (arr1[i1] < arr2[i2])
        {
            // check the index before reading arr1[i1]
            while ((i1 < len1) && (arr1[i1] < arr2[i2])) i1++;
            if (i1 == len1) return;
        }
        if (arr1[i1] == arr2[i2]) break;
        if ((arr1[i1] > arr2[i2]) && (i2 < len2))
        {
            while ((i2 < len2) && (arr2[i2] < arr1[i1])) i2++;
            if (i2 == len2) return;
        }
        if (arr1[i1] == arr2[i2]) break;
    }
    // i1, i2 - positions in arr1, arr2 where arr1[i1] == arr2[i2]
    int i3 = i1;
    int i4 = i2;
    while ((i3 < len1) && (i4 < len2) && (arr1[i3] == arr2[i4]))
    {
        i3++; i4++;
        len++;
    }
    if (len > 0)
    {
        *res = new int[len];
        for (int i = 0; i < len; i++)
            (*res)[i] = arr1[i1 + i];
    }
}
void intersection(int *a, int la, int *b, int lb, int **res, int *lres) {
    int min_n = la < lb ? la : lb;
    (*res) = new int[min_n];
    int ca = 0;
    int cb = 0;
    *lres = 0;
    while (ca < la && cb < lb) {
        // check the index before reading a[ca]
        while (ca < la && a[ca] < b[cb])
            ca++;
        if (ca < la && a[ca] == b[cb]) {
            (*res)[*lres] = a[ca];
            (*lres)++;
            ca++;   // advance both so the result never exceeds min_n entries
        }
        cb++;
    }
}
public static void main(String[] args) {
    int[] A = {1, 2, 3, 4};
    int[] B = {4, 5};
    System.out.println(checkIntersection(A, B));
}

private static String checkIntersection(int[] A, int[] B) {
    int curA = 0, curB = 0; // indices for A and B
    String result = "No intersection";
    while (curA < A.length && curB < B.length && result.compareTo("No intersection") == 0) {
        if (A[curA] < B[curB]) {
            curA++;
        } else if (A[curA] > B[curB]) {
            curB++;
        } else {
            result = String.valueOf(A[curA]); // reports only the first common element
        }
    }
    return result;
}
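For reference, a compact variant of the same two-pointer sweep that collects every common element rather than just the first (class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Two-pointer sweep over both sorted arrays; O(len1 + len2) time.
class SortedIntersection {
    public static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;               // a is behind, advance it
            } else if (a[i] > b[j]) {
                j++;               // b is behind, advance it
            } else {
                out.add(a[i]);     // common element
                i++;
                j++;
            }
        }
        return out;
    }
}
```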
=============================================
Find the lowest common ancestor for BST
One solution is to traverse the BST from top to bottom, checking each node's children, and recursively going down the chain until you reach a leaf.
In BST you use the ordering property of BST:
http://rmandvikar.blogspot.com/2008/08/lowest-common-ancestor-lca-in-binary.html
The lowest common ancestor will be between the two values, so perform a dfs until you reach a node that is between the two values.
We note that the lowest common ancestor must have a value between the 2 values you are given. Any other ancestors higher than that will have a number that is either greater or smaller than _both_ values. Hence the problem is simplified a lot. We can now simply traverse from the root, as if we are searching for the smallest of the 2 numbers. For each traversal, check whether the number is between the 2 values. If it is, return that node.
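The traversal described above can be sketched as follows (the Node class is an assumption, not from the original answers):

```java
// Walks down from the root: the first node whose value lies between the
// two given values (inclusive) is the lowest common ancestor in a BST.
class LcaBst {
    static class Node {
        int val;
        Node left, right;
        Node(int v) { val = v; }
    }

    public static Node lca(Node root, int a, int b) {
        int lo = Math.min(a, b), hi = Math.max(a, b);
        Node cur = root;
        while (cur != null) {
            if (cur.val > hi)      cur = cur.left;  // both values are smaller
            else if (cur.val < lo) cur = cur.right; // both values are larger
            else return cur;                        // lo <= cur.val <= hi
        }
        return null;
    }
}
```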
===========================================
What is the relationship between equals(Object o) and hashCode() in Java? How would you implement each method?
A class which overrides one must override the other such that objects which .equals() each other generate identical hash codes.
Typically, what this would mean for implementation is that every field or attribute (including type depending on the constraints - maybe you want two objects of a given parent type to be equal to each other, so you check the parent type instead of the exact type) which is compared in the .equals() method must be used as input in the hashCode() algorithm. Something which doesn't affect object equality should never be hashed.
You need to be very careful to preserve transitivity if you allow a subclass to be equal to a superclass. The superclass needs to be designed for this, and you have to rely on all other subclasses also being implemented correctly.
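A minimal sketch of the contract, using a hypothetical Point value class: the same fields that decide equals() feed hashCode(), so equal objects hash identically.

```java
import java.util.Objects;

// Hypothetical value class illustrating the equals()/hashCode() contract.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // instanceof check; use getClass() instead for strict type equality
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y); // exactly the fields compared in equals()
    }
}
```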
==============================================
What is your favorite data structure? What operations does it have? [Then some detailed questions about the algorithms based on this data structure]
I will go for hash and heap.
============================
Design and describe a system/application that will most efficiently produce a report of the top 1 million Google search requests. You are given 12 servers to work with. They are all dual-processor machines with 4 GB of RAM, 4x400 GB hard drives, and networked together (basically, nothing more than high-end PCs). The log data has already been cleaned for you. It consists of 100 billion log lines, broken down into twelve 320 GB files of 40-byte search terms, one per line. You can use only custom-written applications or available free open-source software.
This is a great question. I would probably use some sort of map-reduce algorithm. Divide the job up between all of the servers (2 jobs per server, since they're dual-processor machines), and calculate how large each job can be from the size of the dataset and the memory size of each server.
Have each server count the occurrences of each search request, then send up the counts. Map-reduce will take care of the rest.
Map-reduce is the solution. But one should try to have a tree structure (bottom-up) of map tasks, so that you can filter out at every step.
There are 12 machines, and 12 files of 320 GB each. Presumably there is one file on each machine.
Since hard-drives are slow, loading a chunk of data should take much more time than processing it. So let's do lots of processing.
On each machine:
1. Load a big chunk of 40-byte entries at a time, say 0.5GB each. Compress each entry, say by using in-memory zip.
2. Put the entries in the chunk in buckets, using a hash function whose range is a big number multiple of 12 (say 3 times a power of 2). The reason for this comes later. Use exactly the same hash function on all machines at all times.
3. Once that chunk is processed, append it to 12 files on disk, by taking the hash function value mod 12.
4. At the end we have 12 files on each machine, each having a (compressed) 40-byte query and its frequency. The same query-frequency pair can show up multiple times in each file, since we append to each file multiple times.
5. Each of those files is at most 3GB (assuming that our hash function has good randomness) and each file could be much less due to compression.
6. Process each file again to combine any repetitions.
7. Now, from each machine, send first file to first machine (by compressing it perhaps), send second file to second machine, etc. The beauty here is that due to the same hashing employed at all times a query will hash to exactly one of the 12 files, the same file on each machine.
8. So, after each machine gets 12 files each, all it needs to do is combine them once again. We will find the frequency of each query.
9. The last step is to output the one million most frequent queries. I did not think of this yet, but will require some more communication among the machines.
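Steps 2, 3 and 7 above hinge on every machine using the same hash function, so that a given query always maps to the same partition index on every machine and per-partition counts can be merged. A minimal sketch of that routing step (names are illustrative):

```java
// Every machine routes a query to one of 12 partition files using the same
// hash function; since the index is stable across machines, step 7 can
// merge per-partition counts by shipping file i to machine i.
class QueryPartitioner {
    static final int PARTITIONS = 12;

    public static int partitionOf(String query) {
        // Math.floorMod guards against negative hashCode() values.
        return Math.floorMod(query.hashCode(), PARTITIONS);
    }
}
```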
=========================================
How would you find the most searched for phrase in Google, assuming that you could use 10000 computer in parallel?
Distribute phrases across the 10000 computers by hashing each phrase (so that all occurrences of a given phrase land on the same machine), count the occurrences of each phrase on each computer, and find the phrase that came up the most times on each computer. Then compare those top candidates across the computers. Note that if phrases were distributed randomly instead, a globally popular phrase could be split across machines and missed.
If you have 10000 computers, then the data for each map job becomes N/10000, and each map emits its counts. In the reduce step, you merge the 10000 partial results and emit the most frequent phrase.
===============================================
How do you delete a given node from a single linked list?
template<class T>
class LList
{
public:
    T val;
    LList<T>* next;

    static void Delete(LList<T>** head, T& node)
    {
        if (NULL == (*head))
            return;
        LList<T>* next;
        // the case when the element to delete is the head of the list
        if ((*head)->val == node)
        {
            next = (*head)->next;
            delete (*head);
            (*head) = next;
        }
        else
        {
            LList<T>* cur = *head;
            while (cur->next)
            {
                if (node == cur->next->val)
                {
                    next = cur->next->next;
                    delete cur->next;
                    cur->next = next;
                    break;
                }
                cur = cur->next;
            }
        }
    }
};
==========================
Implement an LRU cache.
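One common Java sketch builds on LinkedHashMap in access order, which evicts the least-recently-used entry when removeEldestEntry returns true (a production cache would more likely use a hash map plus a hand-rolled doubly linked list):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache sketch: LinkedHashMap with accessOrder = true keeps entries
// ordered by recency of access; the eldest entry is the LRU candidate.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict once over capacity
    }
}
```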
==========================
You have a bag of random numbers which can be negative, positive or zero, and it might contain duplicates. Describe an algorithm for finding all pairs whose sum equals some value m.
Divide and conquer algorithm.
Assume that m > 0.
1. Divide all numbers into 5 buckets: 1st {n < 0}; 2nd {n == 0}; 3rd {n > 0 && n < m}; 4th {n == m}; 5th {n > m}.
2. To check whether two numbers sum to m, work with the buckets this way:
a) if the 2nd and 4th are not empty: the sum of any number from the 2nd with any number from the 4th equals m;
b) if the 1st and 5th are not empty, check which pairs of these negative and positive numbers sum to m;
c) apply the divide and conquer algorithm recursively to the 3rd bucket.
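A simpler alternative to the bucket scheme above is a single pass with a count map, which finds every pair (duplicates included) in O(n) expected time. A sketch, with illustrative names:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// For each number n, every earlier copy of (m - n) forms one pair with it,
// so duplicates in the bag yield the right number of pairs.
class PairSum {
    public static List<int[]> pairs(int[] nums, int m) {
        List<int[]> out = new ArrayList<>();
        Map<Integer, Integer> seen = new HashMap<>(); // value -> count so far
        for (int n : nums) {
            int need = m - n;
            int count = seen.getOrDefault(need, 0);
            for (int i = 0; i < count; i++) {
                out.add(new int[]{need, n}); // one pair per earlier copy
            }
            seen.merge(n, 1, Integer::sum);
        }
        return out;
    }
}
```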
==================================
Describe how you would reverse a single linked list.
template<class T>
class LList
{
public:
    T val;
    LList<T>* next;

    static void Reverse(LList<T>** head)
    {
        if (NULL == (*head))
            return;
        LList<T>* headNew = NULL;
        LList<T>* cur = *head;
        LList<T>* next;
        while (cur)
        {
            // insert the current node at the head of the new list
            next = cur->next;
            cur->next = headNew;
            headNew = cur;
            cur = next;
        }
        *head = headNew;
    }
};
and the same thing, but recursive (pseudocode):
reverse(head)
{
    if (head.next == null)
        return head
    newHead = reverse(head.next)
    head.next.next = head
    head.next = null
    return newHead
}
======================
I am playing a card game called 24. Cards ace to king are numbered 1 to 13. During a given round, I am provided four cards to play with from the shuffled pack. If the numbers from the four cards result in 24 then I win the round if I shout '24' first. How would you code a function for this?
Four numbers and three operators at a time.
Consider permutations of such expressions in postfix notation.
Evaluate these permutations till you get 24.
It's a naturally recursive operation. Take the four cards, A, B, C, D.
In the different steps you have (target = 24):
(target) : using cards A, B, C, D
(target - A): using cards B, C, D
(target - A - B): using cards C, D
(target - A - B - C): using card D
Repeat for each of the other cards in the step's grouping if the target isn't met, ie.
(target) : using cards B, C, D
(target - B): using cards C, D
(target - B - C): using card D
And so on.
Pass true up the stack when any of the target values reach zero.
Once you have that down, it's trivial to write the function.
If we know the rules won't change (always four cards, always summing to 24) and speed REALLY matters, then the correct answer is to ignore complexity and elegance and let the CPU burn through this. Just add the first two cards and check the sum. If sum == 24, shout (return). If sum < 24, add third card and check. If sum < 24, add fourth card and check.
Sorting or structuring will take too much time even though it could be a big win in the general case - someone else will shout while we're moving our cards around.
You only considered the + operator, which is not enough in the 24 points game. There are four operators: +, -, * and /.
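To cover all four operators, a brute-force sketch can repeatedly pick two of the remaining numbers, combine them with each operator (both orders for - and /), and recurse on the shorter list. Names and the epsilon tolerance are my own choices:

```java
// Brute-force 24 solver: collapse any two numbers into one candidate value
// and recurse until a single number remains; doubles + epsilon handle /.
class TwentyFour {
    static final double EPS = 1e-6;

    public static boolean canMake24(double[] cards) {
        return solve(cards);
    }

    private static boolean solve(double[] nums) {
        if (nums.length == 1) {
            return Math.abs(nums[0] - 24.0) < EPS;
        }
        for (int i = 0; i < nums.length; i++) {
            for (int j = i + 1; j < nums.length; j++) {
                double a = nums[i], b = nums[j];
                // copy the untouched numbers, leaving one slot for the result
                double[] rest = new double[nums.length - 1];
                int k = 0;
                for (int t = 0; t < nums.length; t++) {
                    if (t != i && t != j) rest[k++] = nums[t];
                }
                double[] candidates = {a + b, a - b, b - a, a * b,
                                       b != 0 ? a / b : Double.NaN,
                                       a != 0 ? b / a : Double.NaN};
                for (double c : candidates) {
                    if (Double.isNaN(c)) continue; // skip division by zero
                    rest[rest.length - 1] = c;
                    if (solve(rest)) return true;
                }
            }
        }
        return false;
    }
}
```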
=======================================
Given the list of points of the skyline of a city in order (from East to West), find the maximal rectangle contained in this skyline. I was asked to write the code. I managed to find the algorithm, but it was not sufficient.
Here's an O(n^3) dynamic programming solution. It's pretty ugly, but it seems to work. More elegant/efficient solutions would be greatly appreciated.
--------------------------------------------
import java.util.ArrayList;

public class Skyline {
    private int maxX;
    private int maxY;

    public String[] findLargestRectangle(String[] points) {
        // First translate the points into a boolean matrix.
        // The cell [i][j] in the matrix should be true if
        // it appears under the skyline and false otherwise.
        boolean[][] cells = this.pointsToMatrix(points);
        System.out.println("Matrix is: ");
        this.printBooleanMatrix(cells, maxX, maxY);
        // Now we calculate rectangle size. Define
        // s[i][j] to be the size of the rectangle
        // with (i,j) at its upper left corner.
        int[][] s = new int[maxX + 1][maxY + 1];
        int maxSizeX = 0;
        int maxSizeY = 0;
        int maxSize = 0;
        for (int i = 0; i < maxX; i++) {
            for (int j = 1; j < maxY + 1; j++) {
                if (!cells[i][j]) {
                    s[i][j] = 0;
                    continue;
                }
                int maxHSize = 1;
                int maxVSize = 1;
                int i1 = i + 1;
                int j1 = j - 1;
                // Add one to the maximum horizontal size for
                // each cell that is true towards the right.
                while (cells[i1][j]) {
                    i1++;