Find top N elements in an Array











up vote
27
down vote

favorite
14












What would be the best solution to find top N (say 10) elements in an unordered list (of say 100).



The solution which came in my head was to 1. sort it using quick sort, 2. get top 10.



But is there any better alternative?










share|improve this question
























  • if N is small run finding biggest value for N time
    – Saeed Amiri
    Nov 3 '10 at 9:00










  • That gives you the max N times, not top N elements.
    – piccolbo
    Nov 4 '10 at 5:55










  • how do you define 'top'?
    – Anil Katti
    Nov 27 '10 at 3:33










  • first n elements in a descending sorted array
    – zengr
    Nov 27 '10 at 5:38








  • 1




    please see answers in this thread stackoverflow.com/questions/251781/…
    – didxga
    Jul 13 '11 at 2:42















up vote
27
down vote

favorite
14












What would be the best solution to find top N (say 10) elements in an unordered list (of say 100).



The solution which came in my head was to 1. sort it using quick sort, 2. get top 10.



But is there any better alternative?










share|improve this question
























  • if N is small run finding biggest value for N time
    – Saeed Amiri
    Nov 3 '10 at 9:00










  • That gives you the max N times, not top N elements.
    – piccolbo
    Nov 4 '10 at 5:55










  • how do you define 'top'?
    – Anil Katti
    Nov 27 '10 at 3:33










  • first n elements in a descending sorted array
    – zengr
    Nov 27 '10 at 5:38








  • 1




    please see answers in this thread stackoverflow.com/questions/251781/…
    – didxga
    Jul 13 '11 at 2:42













up vote
27
down vote

favorite
14









up vote
27
down vote

favorite
14






14





What would be the best solution to find top N (say 10) elements in an unordered list (of say 100).



The solution which came in my head was to 1. sort it using quick sort, 2. get top 10.



But is there any better alternative?










share|improve this question















What would be the best solution to find top N (say 10) elements in an unordered list (of say 100).



The solution which came in my head was to 1. sort it using quick sort, 2. get top 10.



But is there any better alternative?







java algorithm sorting






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 3 '10 at 6:13









Yin Zhu

13k1061104




13k1061104










asked Nov 3 '10 at 5:47









zengr

25.8k31108176




25.8k31108176












  • if N is small run finding biggest value for N time
    – Saeed Amiri
    Nov 3 '10 at 9:00










  • That gives you the max N times, not top N elements.
    – piccolbo
    Nov 4 '10 at 5:55










  • how do you define 'top'?
    – Anil Katti
    Nov 27 '10 at 3:33










  • first n elements in a descending sorted array
    – zengr
    Nov 27 '10 at 5:38








  • 1




    please see answers in this thread stackoverflow.com/questions/251781/…
    – didxga
    Jul 13 '11 at 2:42


















  • if N is small run finding biggest value for N time
    – Saeed Amiri
    Nov 3 '10 at 9:00










  • That gives you the max N times, not top N elements.
    – piccolbo
    Nov 4 '10 at 5:55










  • how do you define 'top'?
    – Anil Katti
    Nov 27 '10 at 3:33










  • first n elements in a descending sorted array
    – zengr
    Nov 27 '10 at 5:38








  • 1




    please see answers in this thread stackoverflow.com/questions/251781/…
    – didxga
    Jul 13 '11 at 2:42
















if N is small run finding biggest value for N time
– Saeed Amiri
Nov 3 '10 at 9:00




if N is small run finding biggest value for N time
– Saeed Amiri
Nov 3 '10 at 9:00












That gives you the max N times, not top N elements.
– piccolbo
Nov 4 '10 at 5:55




That gives you the max N times, not top N elements.
– piccolbo
Nov 4 '10 at 5:55












how do you define 'top'?
– Anil Katti
Nov 27 '10 at 3:33




how do you define 'top'?
– Anil Katti
Nov 27 '10 at 3:33












first n elements in a descending sorted array
– zengr
Nov 27 '10 at 5:38






first n elements in a descending sorted array
– zengr
Nov 27 '10 at 5:38






1




1




please see answers in this thread stackoverflow.com/questions/251781/…
– didxga
Jul 13 '11 at 2:42




please see answers in this thread stackoverflow.com/questions/251781/…
– didxga
Jul 13 '11 at 2:42












12 Answers
12






active

oldest

votes

















up vote
22
down vote



accepted










The time could be reduced to linear time:




  1. Use the selection algorithm, which effectively find the k-th element in a un-sorted array in linear time. You can either use a variant of quick sort or more robust algorithms.


  2. Get the top k using the pivot got in step 1.







share|improve this answer

















  • 2




    This is a very nice way. But I'll mention that linear (deterministic) selection is actually pretty slow -- using quickselect with randomly chosen pivots is likely to be much faster on typical problem sizes, but it can take quadratic time.
    – j_random_hacker
    Nov 3 '10 at 7:10


















up vote
8
down vote













How about delegating everything to Java ;)



function findTopN(Array list, int n)
{
Set sortedSet<Integer> = new TreeSet<>(Comparators.naturalOrder());

// add all elements from list to sortedSet

// return the first n from sortedSet
}


I am not trying to say that this is the best way. I still think Yin Zhu's method of finding the kth largest element is the best answer.






share|improve this answer























  • neat............
    – zengr
    Nov 3 '10 at 22:43






  • 11




    TreeSet will remove duplicates which may not be desired.
    – Stefan
    Jan 18 '13 at 16:14










  • A simple and clean solution, but this will be O(nlogn) average/worst case compared to a selection algorithm (such as quickselect) which can select the top k elements in O(n) average case and o time where n is the size of the input array.
    – Pavel
    Feb 7 '14 at 1:22


















up vote
7
down vote













If you're dealing with simple elements like fixed-length integers, then provided you can spare a memory buffer of the same size as the input data, sorting can be done in O(n) time using bucket or radix sorts, and this will be the fastest.



Although there are linear-time selection algorithms, the hidden constant is very high -- around 24. That means an O(nlog n) algorithm will be typically faster for fewer than several million elements.



Otherwise, in the general case when you can only compare 2 elements and determine which is greater, the problem is best solved by a heap data structure.



Suppose you want the top k of n items. All solutions based on fully sorting the data require O(nlog n) time, while using a heap requires only O(nlog k) time -- just build a heap on the first k elements, then keep adding an element and removing the maximum. This will leave you with a heap containing the smallest k elements.






share|improve this answer




























    up vote
    4
    down vote













    Yes, you can do it in O(n) by just keeping a (sorted) running list of the top N. You can sort the running list using the regular library functions or a sorting network. E.g. a simple demo using 3, and showing which elements in the running list change each iteration.



    5 2 8 7 9



    i = 0
    top[0] <= 5

    i = 1
    top[1] <= 2

    i = 2
    top[2] <= top[1] (2)
    top[1] <= top[0] (5)
    top[0] <= 8

    i = 3
    top[2] <= top[1] (5)
    top[1] <= 7

    i = 4
    top[2] <= top[1] (7)
    top[1] <= top[0] (8)
    top[0] <= 9





    share|improve this answer



















    • 1




      This is good if the number of selected items is small. For selecting something like 10000 top elements from a million this is no longer optimal.
      – Rafał Dowgird
      Nov 3 '10 at 8:22










    • This is somehow like Insertion sort.
      – Gumbo
      Nov 3 '10 at 8:37


















    up vote
    2
    down vote













    The best solution is to use whatever facilities your chosen language provides which will make your life easier.



    However, assuming this was a question more related to what algorithm you should choose, I'm going to suggest a different approach here. If you're talking about 10 from 100, you shouldn't generally worry too much about performance unless you want to do it many times per second.



    For example, this C code (which is about as inefficient as I can make it without being silly) still takes well under a tenth of a second to execute. That's not enough time for me to even think about going to get a coffee.



    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define SRCSZ 100
    #define DSTSZ 10

    int main (void) {
    int unused[SRCSZ], source[SRCSZ], dest[DSTSZ], i, j, pos;

    srand (time (NULL));
    for (i = 0; i < SRCSZ; i++) {
    unused[i] = 1;
    source[i] = rand() % 1000;
    }

    for (i = 0; i < DSTSZ; i++) {
    pos = -1;
    for (j = 0; j < SRCSZ; j++) {
    if (pos == -1) {
    if (unused[j]) {
    pos = j;
    }
    } else {
    if (unused[j] && (source[j] > source[pos])) {
    pos = j;
    }
    }
    }
    dest[i] = source[pos];
    unused[pos] = 0;
    }

    printf ("Source:");
    for (i = 0; i < SRCSZ; i++) printf (" %d", source[i]);
    printf ("nDest:");
    for (i = 0; i < DSTSZ; i++) printf (" %d", dest[i]);
    printf ("n");

    return 0;
    }


    Running it through time gives you (I've formatted the output a bit to make it readable, but haven't affected the results):



    Source: 403 459 646 467 120 346 430 247 68 312 701 304 707 443
    753 433 986 921 513 634 861 741 482 794 679 409 145 93
    512 947 19 9 385 208 795 742 851 638 924 637 638 141
    382 89 998 713 210 732 784 67 273 628 187 902 42 25
    747 471 686 504 255 74 638 610 227 892 156 86 48 133
    63 234 639 899 815 986 750 177 413 581 899 494 292 359
    60 106 944 926 257 370 310 726 393 800 986 827 856 835
    66 183 901
    Dest: 998 986 986 986 947 944 926 924 921 902

    real 0m0.063s
    user 0m0.046s
    sys 0m0.031s


    Only once the quantities of numbers become large should you usually worry. Don't get me wrong, I'm not saying you shouldn't think about performance. What you shouldn't do is spend too much time optimising things that don't matter - YAGNI and all that jazz.



    As with all optimisation questions, measure don't guess!






    share|improve this answer























    • but is it a suitable interview answer??
      – zengr
      Nov 3 '10 at 6:32










    • Yes, I believe it would be. By far the most dangerous person to hire is the over-engineer. This is someone who, when asked to provide a function to calculate the sum of a list of items, will give you a 900-line monstrosity that's configurable to handle integers, floats, complex numbers and can also perform pre-calculations on each item with a callback function. This is not the sort of person you want working on something that has a definite delivery date :-) In addition, it gives valuable insight that the candidate can think for themself. Any monkey can cut code.
      – paxdiablo
      Nov 3 '10 at 6:37






    • 2




      I would hire someone like that on the spot, provided they explain why they did it that way and what are the consequences. See also stackoverflow.com/questions/903572/… and stackoverflow.com/questions/445425/… although there were ... shall we say, animated ... discussions on the merits :-)
      – paxdiablo
      Nov 3 '10 at 6:40






    • 1




      Hehe, +1 for practicality... But re: what kind of answer is "best" I think the crucial thing is for the interviewee to ask: (1) What are the typical problem sizes? (2) How fast does it need to be? (And BTW if performance on big problems isn't important then the best answer is "Use whatever built-in sort() the language has :-P)
      – j_random_hacker
      Nov 3 '10 at 7:17


















    up vote
    1
    down vote













    Well, you can create a heap from an unsorted array in O(n) time, and you can get the top element from the heap in O(log(n)) time. So your total runtime is O(n + k*log(n)).






    share|improve this answer




























      up vote
      1
      down vote













      Written below both selection sort and insertion sort implementations. For larger data set I suggest insetion sort better than selection sort



      public interface FindTopValues
      {
      int findTopNValues(int data, int n);
      }


      Insertion Sort Implementation:



      public class FindTopValuesInsertionSortImpl implements FindTopValues {  

      /**
      * Finds list of the highest 'n' values in the source list, ordered naturally,
      * with the highest value at the start of the array and returns it
      */
      @Override
      public int findTopNValues(int values, int n) {

      int length = values.length;
      for (int i=1; i<length; i++) {
      int curPos = i;
      while ((curPos > 0) && (values[i] > values[curPos-1])) {
      curPos--;
      }

      if (curPos != i) {
      int element = values[i];
      System.arraycopy(values, curPos, values, curPos+1, (i-curPos));
      values[curPos] = element;
      }
      }

      return Arrays.copyOf(values, n);
      }

      }


      Selection Sort Implementation:



      public class FindTopValuesSelectionSortImpl implements FindTopValues {

      /**
      * Finds list of the highest 'n' values in the source list, ordered naturally,
      * with the highest value at the start of the array and returns it
      */
      @Override
      public int findTopNValues(int values, int n) {
      int length = values.length;

      for (int i=0; i<=n; i++) {
      int maxPos = i;
      for (int j=i+1; j<length; j++) {
      if (values[j] > values[maxPos]) {
      maxPos = j;
      }
      }

      if (maxPos != i) {
      int maxValue = values[maxPos];
      values[maxPos] = values[i];
      values[i] = maxValue;
      }
      }
      return Arrays.copyOf(values, n);
      }
      }





      share|improve this answer






























        up vote
        1
        down vote













        You can use List and can guava's Comparators class to get the desired results. It is a highly optimized solution. Please see a sample below, which gets top 5 numbers. Api can be found here.



        import java.util.Comparator;
        import java.util.List;
        import java.util.stream.Collector;

        import org.junit.Test;

        import com.google.common.collect.Comparators;
        import com.google.common.collect.Lists;

        public class TestComparator {

        @Test
        public void testTopN() {
        final List<Integer> numbers = Lists.newArrayList(1, 3, 8, 2, 6, 4, 7, 5, 9, 0);
        final Collector<Integer, ?, List<Integer>> collector = Comparators.greatest(5,
        Comparator.<Integer>naturalOrder());
        final List<Integer> top = numbers.stream().collect(collector);
        System.out.println(top);
        }

        }


        Output: [9, 8, 7, 6, 5]






        share|improve this answer




























          up vote
          0
          down vote













          Yes there is a way to do better than quicksort. As pointed by Yin Zhu, you can search for kth largest element first and then use that element value as your pivot to split the array






          share|improve this answer




























            up vote
            0
            down vote













            I was asked for the same algorithm on the interview.
            I done that, if somebody can compare that with fastest algorithm in Java - will be very useful.



                public int findTopNValues(int anyOldOrderValues, int n) {
            if (n < 0) {
            return new int{};
            }
            if (n == 1) {
            return new int{findMaxValue(anyOldOrderValues)};
            }

            int result = new int[n + 1];
            for (int i = 0; i < Math.min(n, anyOldOrderValues.length); i++) {
            result[i] = anyOldOrderValues[i];
            }
            Arrays.sort(result);

            int max = result[0];
            for (int i = n - 1; i < anyOldOrderValues.length; i++) {
            int value = anyOldOrderValues[i];
            if (max < value) {
            result[n] = value;
            Arrays.sort(result);
            int result1 = new int[n + 1];
            System.arraycopy(result, 1, result1, 0, n);
            result = result1;
            max = result[0];
            }
            }
            return convertAndFlip(result, n);
            }

            public static int convertAndFlip(int integers, int n) {
            int result = new int[n];
            int j = 0;
            for (int i = n - 1; i > -1; i--) {
            result[j++] = integers[i];
            }
            return result;
            }


            and test for that:



            public void testFindTopNValues() throws Exception {
            final int N = 100000000;
            final int MAX_VALUE = 100000000;
            final int returnArray = 1000;
            final int repeatTimes = 5;

            FindTopValuesArraySorting arraySorting = new FindTopValuesArraySorting();

            int randomArray = createRandomArray(N, MAX_VALUE);
            for (int i = 0; i < repeatTimes; i++) {

            long start = System.currentTimeMillis();
            int topNValues = arraySorting.findTopNValues(randomArray, returnArray);
            long stop = System.currentTimeMillis();

            System.out.println("findTopNValues() from " + N + " elements, where MAX value=" + (MAX_VALUE - 1) + " and return array size " + returnArray + " elements : " + (stop - start) + "msec");
            // System.out.println("Result list = " + Arrays.toString(topNValues));
            }
            }

            private static int createRandomArray(int n, int maxValue) {
            Random r = new Random();
            int arr = new int[n];
            for (int i = 0; i < n; i++) {
            arr[i] = r.nextInt(maxValue);
            }
            return arr;
            }


            Result is something like:



            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 395msec
            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 311msec
            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 473msec
            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 380msec
            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 406msec


            ~400msc average result, for getting 1000 max integers from array of 100.000.000 initial elements.
            not bad!



            Just tried that set from above:



            findTopNValues() from 101 elements and return array size 10 elements : 1msec
            Result list = [998, 986, 986, 986, 947, 944, 926, 924, 921, 902]
            Original list = [403, 459, 646, 467, 120, 346, 430, 247, 68, 312, 701, 304, 707, 443, 753, 433, 986, 921, 513, 634, 861, 741, 482, 794, 679, 409, 145, 93, 512, 947, 19, 9, 385, 208, 795, 742, 851, 638, 924, 637, 638, 141, 382, 89, 998, 713, 210, 732, 784, 67, 273, 628, 187, 902, 42, 25, 747, 471, 686, 504, 255, 74, 638, 610, 227, 892, 156, 86, 48, 133, 63, 234, 639, 899, 815, 986, 750, 177, 413, 581, 899, 494, 292, 359, 60, 106, 944, 926, 257, 370, 310, 726, 393, 800, 986, 827, 856, 835, 66, 183, 901]





            share|improve this answer






























              up vote
              0
              down vote













              The best Algorithm would by large depend on the size of K.
              If K is small then by simply following BubbleSort Algorithm and iterating the outer loop K times
              would give the top K values.
              The complexity will be O(n*k).



              However for values of K close to n the complexity will approach O(n^2). In such scenario quicksort might be a good alternative.






              share|improve this answer




























                up vote
                0
                down vote













                public class FindTopValuesSelectionSortImpl implements FindTopValues {

                /**
                * Finds list of the highest 'n' values in the source list, ordered naturally,
                * with the highest value at the start of the array and returns it
                */
                @Override
                public int findTopNValues(int values, int n) {
                int length = values.length;

                for (int i=0; i<=n; i++) {
                int maxPos = i;
                for (int j=i+1; j<length; j++) {
                if (values[j] > values[maxPos]) {
                maxPos = j;
                }
                }

                if (maxPos != i) {
                int maxValue = values[maxPos];
                values[maxPos] = values[i];**strong text**
                values[i] = maxValue;
                }
                }
                return Arrays.copyOf(values, n);
                }
                }





                share|improve this answer





















                  Your Answer






                  StackExchange.ifUsing("editor", function () {
                  StackExchange.using("externalEditor", function () {
                  StackExchange.using("snippets", function () {
                  StackExchange.snippets.init();
                  });
                  });
                  }, "code-snippets");

                  StackExchange.ready(function() {
                  var channelOptions = {
                  tags: "".split(" "),
                  id: "1"
                  };
                  initTagRenderer("".split(" "), "".split(" "), channelOptions);

                  StackExchange.using("externalEditor", function() {
                  // Have to fire editor after snippets, if snippets enabled
                  if (StackExchange.settings.snippets.snippetsEnabled) {
                  StackExchange.using("snippets", function() {
                  createEditor();
                  });
                  }
                  else {
                  createEditor();
                  }
                  });

                  function createEditor() {
                  StackExchange.prepareEditor({
                  heartbeatType: 'answer',
                  convertImagesToLinks: true,
                  noModals: true,
                  showLowRepImageUploadWarning: true,
                  reputationToPostImages: 10,
                  bindNavPrevention: true,
                  postfix: "",
                  imageUploader: {
                  brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                  contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                  allowUrls: true
                  },
                  onDemand: true,
                  discardSelector: ".discard-answer"
                  ,immediatelyShowMarkdownHelp:true
                  });


                  }
                  });














                  draft saved

                  draft discarded


















                  StackExchange.ready(
                  function () {
                  StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f4084495%2ffind-top-n-elements-in-an-array%23new-answer', 'question_page');
                  }
                  );

                  Post as a guest















                  Required, but never shown

























                  12 Answers
                  12






                  active

                  oldest

                  votes








                  12 Answers
                  12






                  active

                  oldest

                  votes









                  active

                  oldest

                  votes






                  active

                  oldest

                  votes








                  up vote
                  22
                  down vote



                  accepted










                  The time could be reduced to linear time:




                  1. Use the selection algorithm, which effectively find the k-th element in a un-sorted array in linear time. You can either use a variant of quick sort or more robust algorithms.


                  2. Get the top k using the pivot got in step 1.







                  share|improve this answer

















                  • 2




                    This is a very nice way. But I'll mention that linear (deterministic) selection is actually pretty slow -- using quickselect with randomly chosen pivots is likely to be much faster on typical problem sizes, but it can take quadratic time.
                    – j_random_hacker
                    Nov 3 '10 at 7:10















                  up vote
                  22
                  down vote



                  accepted










                  The time could be reduced to linear time:




                  1. Use the selection algorithm, which effectively find the k-th element in a un-sorted array in linear time. You can either use a variant of quick sort or more robust algorithms.


                  2. Get the top k using the pivot got in step 1.







                  share|improve this answer

















                  • 2




                    This is a very nice way. But I'll mention that linear (deterministic) selection is actually pretty slow -- using quickselect with randomly chosen pivots is likely to be much faster on typical problem sizes, but it can take quadratic time.
                    – j_random_hacker
                    Nov 3 '10 at 7:10













                  up vote
                  22
                  down vote



                  accepted







                  up vote
                  22
                  down vote



                  accepted






                  The time could be reduced to linear time:




                  1. Use the selection algorithm, which effectively find the k-th element in a un-sorted array in linear time. You can either use a variant of quick sort or more robust algorithms.


                  2. Get the top k using the pivot got in step 1.







                  share|improve this answer












                  The time could be reduced to linear time:




                  1. Use the selection algorithm, which effectively find the k-th element in a un-sorted array in linear time. You can either use a variant of quick sort or more robust algorithms.


                  2. Get the top k using the pivot got in step 1.








                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Nov 3 '10 at 5:51









                  Yin Zhu

                  13k1061104




                  13k1061104








                  • 2




                    This is a very nice way. But I'll mention that linear (deterministic) selection is actually pretty slow -- using quickselect with randomly chosen pivots is likely to be much faster on typical problem sizes, but it can take quadratic time.
                    – j_random_hacker
                    Nov 3 '10 at 7:10














                  • 2




                    This is a very nice way. But I'll mention that linear (deterministic) selection is actually pretty slow -- using quickselect with randomly chosen pivots is likely to be much faster on typical problem sizes, but it can take quadratic time.
                    – j_random_hacker
                    Nov 3 '10 at 7:10








                  2




                  2




                  This is a very nice way. But I'll mention that linear (deterministic) selection is actually pretty slow -- using quickselect with randomly chosen pivots is likely to be much faster on typical problem sizes, but it can take quadratic time.
                  – j_random_hacker
                  Nov 3 '10 at 7:10




                  This is a very nice way. But I'll mention that linear (deterministic) selection is actually pretty slow -- using quickselect with randomly chosen pivots is likely to be much faster on typical problem sizes, but it can take quadratic time.
                  – j_random_hacker
                  Nov 3 '10 at 7:10












                  up vote
                  8
                  down vote













                  How about delegating everything to Java ;)



                  function findTopN(Array list, int n)
                  {
                  Set sortedSet<Integer> = new TreeSet<>(Comparators.naturalOrder());

                  // add all elements from list to sortedSet

                  // return the first n from sortedSet
                  }


                  I am not trying to say that this is the best way. I still think Yin Zhu's method of finding the kth largest element is the best answer.






                  share|improve this answer























                  • neat............
                    – zengr
                    Nov 3 '10 at 22:43






                  • 11




                    TreeSet will remove duplicates which may not be desired.
                    – Stefan
                    Jan 18 '13 at 16:14










                  • A simple and clean solution, but this will be O(nlogn) average/worst case compared to a selection algorithm (such as quickselect) which can select the top k elements in O(n) average case and o time where n is the size of the input array.
                    – Pavel
                    Feb 7 '14 at 1:22















                  up vote
                  8
                  down vote













                  How about delegating everything to Java ;)



                  function findTopN(Array list, int n)
                  {
                  Set sortedSet<Integer> = new TreeSet<>(Comparators.naturalOrder());

                  // add all elements from list to sortedSet

                  // return the first n from sortedSet
                  }


                  I am not trying to say that this is the best way. I still think Yin Zhu's method of finding the kth largest element is the best answer.






                  share|improve this answer























                  • neat............
                    – zengr
                    Nov 3 '10 at 22:43






                  • 11




                    TreeSet will remove duplicates which may not be desired.
                    – Stefan
                    Jan 18 '13 at 16:14










                  • A simple and clean solution, but this will be O(nlogn) average/worst case compared to a selection algorithm (such as quickselect) which can select the top k elements in O(n) average case and o time where n is the size of the input array.
                    – Pavel
                    Feb 7 '14 at 1:22













                  up vote
                  8
                  down vote










                  up vote
                  8
                  down vote









                  How about delegating everything to Java ;)



                  function findTopN(Array list, int n)
                  {
                  Set sortedSet<Integer> = new TreeSet<>(Comparators.naturalOrder());

                  // add all elements from list to sortedSet

                  // return the first n from sortedSet
                  }


                  I am not trying to say that this is the best way. I still think Yin Zhu's method of finding the kth largest element is the best answer.






                  share|improve this answer














                  How about delegating everything to Java ;)



                  function findTopN(Array list, int n)
                  {
                  Set sortedSet<Integer> = new TreeSet<>(Comparators.naturalOrder());

                  // add all elements from list to sortedSet

                  // return the first n from sortedSet
                  }


                  I am not trying to say that this is the best way. I still think Yin Zhu's method of finding the kth largest element is the best answer.







                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited Sep 21 '16 at 18:47

























                  answered Nov 3 '10 at 20:33









                  nuaavee

                  79221128




                  79221128












                  • neat............
                    – zengr
                    Nov 3 '10 at 22:43






                  • 11




                    TreeSet will remove duplicates which may not be desired.
                    – Stefan
                    Jan 18 '13 at 16:14










                  • A simple and clean solution, but this will be O(nlogn) average/worst case compared to a selection algorithm (such as quickselect) which can select the top k elements in O(n) average case and o time where n is the size of the input array.
                    – Pavel
                    Feb 7 '14 at 1:22


















                  • neat............
                    – zengr
                    Nov 3 '10 at 22:43






                  • 11




                    TreeSet will remove duplicates which may not be desired.
                    – Stefan
                    Jan 18 '13 at 16:14










                  • A simple and clean solution, but this will be O(nlogn) average/worst case compared to a selection algorithm (such as quickselect) which can select the top k elements in O(n) average case and o time where n is the size of the input array.
                    – Pavel
                    Feb 7 '14 at 1:22
















                  neat............
                  – zengr
                  Nov 3 '10 at 22:43




                  neat............
                  – zengr
                  Nov 3 '10 at 22:43




                  11




                  11




                  TreeSet will remove duplicates which may not be desired.
                  – Stefan
                  Jan 18 '13 at 16:14




                  TreeSet will remove duplicates which may not be desired.
                  – Stefan
                  Jan 18 '13 at 16:14












                  A simple and clean solution, but this will be O(nlogn) average/worst case compared to a selection algorithm (such as quickselect) which can select the top k elements in O(n) average case and o time where n is the size of the input array.
                  – Pavel
                  Feb 7 '14 at 1:22




                  A simple and clean solution, but this will be O(nlogn) average/worst case compared to a selection algorithm (such as quickselect) which can select the top k elements in O(n) average case and o time where n is the size of the input array.
                  – Pavel
                  Feb 7 '14 at 1:22










                  up vote
                  7
                  down vote













                  If you're dealing with simple elements like fixed-length integers, then provided you can spare a memory buffer of the same size as the input data, sorting can be done in O(n) time using bucket or radix sorts, and this will be the fastest.



                  Although there are linear-time selection algorithms, the hidden constant is very high -- around 24. That means an O(nlog n) algorithm will be typically faster for fewer than several million elements.



                  Otherwise, in the general case when you can only compare 2 elements and determine which is greater, the problem is best solved by a heap data structure.



                  Suppose you want the top k of n items. All solutions based on fully sorting the data require O(nlog n) time, while using a heap requires only O(nlog k) time -- just build a heap on the first k elements, then keep adding an element and removing the maximum. This will leave you with a heap containing the smallest k elements.






                  share|improve this answer

























                    up vote
                    7
                    down vote













                    If you're dealing with simple elements like fixed-length integers, then provided you can spare a memory buffer of the same size as the input data, sorting can be done in O(n) time using bucket or radix sorts, and this will be the fastest.



                    Although there are linear-time selection algorithms, the hidden constant is very high -- around 24. That means an O(nlog n) algorithm will be typically faster for fewer than several million elements.



                    Otherwise, in the general case when you can only compare 2 elements and determine which is greater, the problem is best solved by a heap data structure.



                    Suppose you want the top k of n items. All solutions based on fully sorting the data require O(nlog n) time, while using a heap requires only O(nlog k) time -- just build a heap on the first k elements, then keep adding an element and removing the maximum. This will leave you with a heap containing the smallest k elements.






                    share|improve this answer























                      up vote
                      7
                      down vote










                      up vote
                      7
                      down vote









                      If you're dealing with simple elements like fixed-length integers, then provided you can spare a memory buffer of the same size as the input data, sorting can be done in O(n) time using bucket or radix sorts, and this will be the fastest.



                      Although there are linear-time selection algorithms, the hidden constant is very high -- around 24. That means an O(nlog n) algorithm will be typically faster for fewer than several million elements.



                      Otherwise, in the general case when you can only compare 2 elements and determine which is greater, the problem is best solved by a heap data structure.



                      Suppose you want the top k of n items. All solutions based on fully sorting the data require O(nlog n) time, while using a heap requires only O(nlog k) time -- just build a heap on the first k elements, then keep adding an element and removing the maximum. This will leave you with a heap containing the smallest k elements.






                      share|improve this answer












                      If you're dealing with simple elements like fixed-length integers, then provided you can spare a memory buffer of the same size as the input data, sorting can be done in O(n) time using bucket or radix sorts, and this will be the fastest.



                      Although there are linear-time selection algorithms, the hidden constant is very high -- around 24. That means an O(nlog n) algorithm will be typically faster for fewer than several million elements.



                      Otherwise, in the general case when you can only compare 2 elements and determine which is greater, the problem is best solved by a heap data structure.



                      Suppose you want the top k of n items. All solutions based on fully sorting the data require O(nlog n) time, while using a heap requires only O(nlog k) time -- just build a heap on the first k elements, then keep adding an element and removing the maximum. This will leave you with a heap containing the smallest k elements.







                      share|improve this answer












                      share|improve this answer



                      share|improve this answer










                      answered Nov 3 '10 at 6:56









                      j_random_hacker

                      43k879140




                      43k879140






















                          up vote
                          4
                          down vote













                          Yes, you can do it in O(n) by just keeping a (sorted) running list of the top N. You can sort the running list using the regular library functions or a sorting network. E.g. a simple demo using 3, and showing which elements in the running list change each iteration.



                          5 2 8 7 9



                          i = 0
                          top[0] <= 5

                          i = 1
                          top[1] <= 2

                          i = 2
                          top[2] <= top[1] (2)
                          top[1] <= top[0] (5)
                          top[0] <= 8

                          i = 3
                          top[2] <= top[1] (5)
                          top[1] <= 7

                          i = 4
                          top[2] <= top[1] (7)
                          top[1] <= top[0] (8)
                          top[0] <= 9





                          share|improve this answer



















                          • 1




                            This is good if the number of selected items is small. For selecting something like 10000 top elements from a million this is no longer optimal.
                            – Rafał Dowgird
                            Nov 3 '10 at 8:22










                          • This is somehow like Insertion sort.
                            – Gumbo
                            Nov 3 '10 at 8:37















                          up vote
                          4
                          down vote













                          Yes, you can do it in O(n) by just keeping a (sorted) running list of the top N. You can sort the running list using the regular library functions or a sorting network. E.g. a simple demo using 3, and showing which elements in the running list change each iteration.



                          5 2 8 7 9



                          i = 0
                          top[0] <= 5

                          i = 1
                          top[1] <= 2

                          i = 2
                          top[2] <= top[1] (2)
                          top[1] <= top[0] (5)
                          top[0] <= 8

                          i = 3
                          top[2] <= top[1] (5)
                          top[1] <= 7

                          i = 4
                          top[2] <= top[1] (7)
                          top[1] <= top[0] (8)
                          top[0] <= 9





                          share|improve this answer



















                          • 1




                            This is good if the number of selected items is small. For selecting something like 10000 top elements from a million this is no longer optimal.
                            – Rafał Dowgird
                            Nov 3 '10 at 8:22










                          • This is somehow like Insertion sort.
                            – Gumbo
                            Nov 3 '10 at 8:37













                          up vote
                          4
                          down vote










                          up vote
                          4
                          down vote









                          Yes, you can do it in O(n) by just keeping a (sorted) running list of the top N. You can sort the running list using the regular library functions or a sorting network. E.g. a simple demo using 3, and showing which elements in the running list change each iteration.



                          5 2 8 7 9



                          i = 0
                          top[0] <= 5

                          i = 1
                          top[1] <= 2

                          i = 2
                          top[2] <= top[1] (2)
                          top[1] <= top[0] (5)
                          top[0] <= 8

                          i = 3
                          top[2] <= top[1] (5)
                          top[1] <= 7

                          i = 4
                          top[2] <= top[1] (7)
                          top[1] <= top[0] (8)
                          top[0] <= 9





                          share|improve this answer














                          Yes, you can do it in O(n) by just keeping a (sorted) running list of the top N. You can sort the running list using the regular library functions or a sorting network. E.g. a simple demo using 3, and showing which elements in the running list change each iteration.



                          5 2 8 7 9



                          i = 0
                          top[0] <= 5

                          i = 1
                          top[1] <= 2

                          i = 2
                          top[2] <= top[1] (2)
                          top[1] <= top[0] (5)
                          top[0] <= 8

                          i = 3
                          top[2] <= top[1] (5)
                          top[1] <= 7

                          i = 4
                          top[2] <= top[1] (7)
                          top[1] <= top[0] (8)
                          top[0] <= 9






                          share|improve this answer














                          share|improve this answer



                          share|improve this answer








                          edited Nov 3 '10 at 6:00

























                          answered Nov 3 '10 at 5:50









                          Matthew Flaschen

                          217k37435500




                          217k37435500








                          • 1




                            This is good if the number of selected items is small. For selecting something like 10000 top elements from a million this is no longer optimal.
                            – Rafał Dowgird
                            Nov 3 '10 at 8:22










                          • This is somehow like Insertion sort.
                            – Gumbo
                            Nov 3 '10 at 8:37














                          • 1




                            This is good if the number of selected items is small. For selecting something like 10000 top elements from a million this is no longer optimal.
                            – Rafał Dowgird
                            Nov 3 '10 at 8:22










                          • This is somehow like Insertion sort.
                            – Gumbo
                            Nov 3 '10 at 8:37








                          1




                          1




                          This is good if the number of selected items is small. For selecting something like 10000 top elements from a million this is no longer optimal.
                          – Rafał Dowgird
                          Nov 3 '10 at 8:22




                          This is good if the number of selected items is small. For selecting something like 10000 top elements from a million this is no longer optimal.
                          – Rafał Dowgird
                          Nov 3 '10 at 8:22












                          This is somehow like Insertion sort.
                          – Gumbo
                          Nov 3 '10 at 8:37




                          This is somehow like Insertion sort.
                          – Gumbo
                          Nov 3 '10 at 8:37










                          up vote
                          2
                          down vote













                          The best solution is to use whatever facilities your chosen language provides which will make your life easier.



                          However, assuming this was a question more related to what algorithm you should choose, I'm going to suggest a different approach here. If you're talking about 10 from 100, you shouldn't generally worry too much about performance unless you want to do it many times per second.



                          For example, this C code (which is about as inefficient as I can make it without being silly) still takes well under a tenth of a second to execute. That's not enough time for me to even think about going to get a coffee.



                          #include <stdio.h>
                          #include <stdlib.h>
                          #include <time.h>

                          #define SRCSZ 100
                          #define DSTSZ 10

                          int main (void) {
                          int unused[SRCSZ], source[SRCSZ], dest[DSTSZ], i, j, pos;

                          srand (time (NULL));
                          for (i = 0; i < SRCSZ; i++) {
                          unused[i] = 1;
                          source[i] = rand() % 1000;
                          }

                          for (i = 0; i < DSTSZ; i++) {
                          pos = -1;
                          for (j = 0; j < SRCSZ; j++) {
                          if (pos == -1) {
                          if (unused[j]) {
                          pos = j;
                          }
                          } else {
                          if (unused[j] && (source[j] > source[pos])) {
                          pos = j;
                          }
                          }
                          }
                          dest[i] = source[pos];
                          unused[pos] = 0;
                          }

                          printf ("Source:");
                          for (i = 0; i < SRCSZ; i++) printf (" %d", source[i]);
                          printf ("nDest:");
                          for (i = 0; i < DSTSZ; i++) printf (" %d", dest[i]);
                          printf ("n");

                          return 0;
                          }


                          Running it through time gives you (I've formatted the output a bit to make it readable, but haven't affected the results):



                          Source: 403 459 646 467 120 346 430 247 68 312 701 304 707 443
                          753 433 986 921 513 634 861 741 482 794 679 409 145 93
                          512 947 19 9 385 208 795 742 851 638 924 637 638 141
                          382 89 998 713 210 732 784 67 273 628 187 902 42 25
                          747 471 686 504 255 74 638 610 227 892 156 86 48 133
                          63 234 639 899 815 986 750 177 413 581 899 494 292 359
                          60 106 944 926 257 370 310 726 393 800 986 827 856 835
                          66 183 901
                          Dest: 998 986 986 986 947 944 926 924 921 902

                          real 0m0.063s
                          user 0m0.046s
                          sys 0m0.031s


                          Only once the quantities of numbers become large should you usually worry. Don't get me wrong, I'm not saying you shouldn't think about performance. What you shouldn't do is spend too much time optimising things that don't matter - YAGNI and all that jazz.



                          As with all optimisation questions, measure don't guess!






                          share|improve this answer























                          • but is it a suitable interview answer??
                            – zengr
                            Nov 3 '10 at 6:32










                          • Yes, I believe it would be. By far the most dangerous person to hire is the over-engineer. This is someone who, when asked to provide a function to calculate the sum of a list of items, will give you a 900-line monstrosity that's configurable to handle integers, floats, complex numbers and can also perform pre-calculations on each item with a callback function. This is not the sort of person you want working on something that has a definite delivery date :-) In addition, it gives valuable insight that the candidate can think for themself. Any monkey can cut code.
                            – paxdiablo
                            Nov 3 '10 at 6:37






                          • 2




                            I would hire someone like that on the spot, provided they explain why they did it that way and what are the consequences. See also stackoverflow.com/questions/903572/… and stackoverflow.com/questions/445425/… although there were ... shall we say, animated ... discussions on the merits :-)
                            – paxdiablo
                            Nov 3 '10 at 6:40






                          • 1




                            Hehe, +1 for practicality... But re: what kind of answer is "best" I think the crucial thing is for the interviewee to ask: (1) What are the typical problem sizes? (2) How fast does it need to be? (And BTW if performance on big problems isn't important then the best answer is "Use whatever built-in sort() the language has :-P)
                            – j_random_hacker
                            Nov 3 '10 at 7:17















                          up vote
                          2
                          down vote













                          The best solution is to use whatever facilities your chosen language provides which will make your life easier.



                          However, assuming this was a question more related to what algorithm you should choose, I'm going to suggest a different approach here. If you're talking about 10 from 100, you shouldn't generally worry too much about performance unless you want to do it many times per second.



                          For example, this C code (which is about as inefficient as I can make it without being silly) still takes well under a tenth of a second to execute. That's not enough time for me to even think about going to get a coffee.



                          #include <stdio.h>
                          #include <stdlib.h>
                          #include <time.h>

                          #define SRCSZ 100
                          #define DSTSZ 10

                          int main (void) {
                          int unused[SRCSZ], source[SRCSZ], dest[DSTSZ], i, j, pos;

                          srand (time (NULL));
                          for (i = 0; i < SRCSZ; i++) {
                          unused[i] = 1;
                          source[i] = rand() % 1000;
                          }

                          for (i = 0; i < DSTSZ; i++) {
                          pos = -1;
                          for (j = 0; j < SRCSZ; j++) {
                          if (pos == -1) {
                          if (unused[j]) {
                          pos = j;
                          }
                          } else {
                          if (unused[j] && (source[j] > source[pos])) {
                          pos = j;
                          }
                          }
                          }
                          dest[i] = source[pos];
                          unused[pos] = 0;
                          }

                          printf ("Source:");
                          for (i = 0; i < SRCSZ; i++) printf (" %d", source[i]);
                          printf ("nDest:");
                          for (i = 0; i < DSTSZ; i++) printf (" %d", dest[i]);
                          printf ("n");

                          return 0;
                          }


                          Running it through time gives you (I've formatted the output a bit to make it readable, but haven't affected the results):



                          Source: 403 459 646 467 120 346 430 247 68 312 701 304 707 443
                          753 433 986 921 513 634 861 741 482 794 679 409 145 93
                          512 947 19 9 385 208 795 742 851 638 924 637 638 141
                          382 89 998 713 210 732 784 67 273 628 187 902 42 25
                          747 471 686 504 255 74 638 610 227 892 156 86 48 133
                          63 234 639 899 815 986 750 177 413 581 899 494 292 359
                          60 106 944 926 257 370 310 726 393 800 986 827 856 835
                          66 183 901
                          Dest: 998 986 986 986 947 944 926 924 921 902

                          real 0m0.063s
                          user 0m0.046s
                          sys 0m0.031s


                          Only once the quantities of numbers become large should you usually worry. Don't get me wrong, I'm not saying you shouldn't think about performance. What you shouldn't do is spend too much time optimising things that don't matter - YAGNI and all that jazz.



                          As with all optimisation questions, measure don't guess!






                          share|improve this answer























                          • but is it a suitable interview answer??
                            – zengr
                            Nov 3 '10 at 6:32










                          • Yes, I believe it would be. By far the most dangerous person to hire is the over-engineer. This is someone who, when asked to provide a function to calculate the sum of a list of items, will give you a 900-line monstrosity that's configurable to handle integers, floats, complex numbers and can also perform pre-calculations on each item with a callback function. This is not the sort of person you want working on something that has a definite delivery date :-) In addition, it gives valuable insight that the candidate can think for themself. Any monkey can cut code.
                            – paxdiablo
                            Nov 3 '10 at 6:37






                          • 2




                            I would hire someone like that on the spot, provided they explain why they did it that way and what are the consequences. See also stackoverflow.com/questions/903572/… and stackoverflow.com/questions/445425/… although there were ... shall we say, animated ... discussions on the merits :-)
                            – paxdiablo
                            Nov 3 '10 at 6:40






                          • 1




                            Hehe, +1 for practicality... But re: what kind of answer is "best" I think the crucial thing is for the interviewee to ask: (1) What are the typical problem sizes? (2) How fast does it need to be? (And BTW if performance on big problems isn't important then the best answer is "Use whatever built-in sort() the language has :-P)
                            – j_random_hacker
                            Nov 3 '10 at 7:17













                          up vote
                          2
                          down vote










                          up vote
                          2
                          down vote









                          The best solution is to use whatever facilities your chosen language provides which will make your life easier.



                          However, assuming this was a question more related to what algorithm you should choose, I'm going to suggest a different approach here. If you're talking about 10 from 100, you shouldn't generally worry too much about performance unless you want to do it many times per second.



                          For example, this C code (which is about as inefficient as I can make it without being silly) still takes well under a tenth of a second to execute. That's not enough time for me to even think about going to get a coffee.



                          #include <stdio.h>
                          #include <stdlib.h>
                          #include <time.h>

                          #define SRCSZ 100
                          #define DSTSZ 10

                          int main (void) {
                          int unused[SRCSZ], source[SRCSZ], dest[DSTSZ], i, j, pos;

                          srand (time (NULL));
                          for (i = 0; i < SRCSZ; i++) {
                          unused[i] = 1;
                          source[i] = rand() % 1000;
                          }

                          for (i = 0; i < DSTSZ; i++) {
                          pos = -1;
                          for (j = 0; j < SRCSZ; j++) {
                          if (pos == -1) {
                          if (unused[j]) {
                          pos = j;
                          }
                          } else {
                          if (unused[j] && (source[j] > source[pos])) {
                          pos = j;
                          }
                          }
                          }
                          dest[i] = source[pos];
                          unused[pos] = 0;
                          }

                          printf ("Source:");
                          for (i = 0; i < SRCSZ; i++) printf (" %d", source[i]);
                          printf ("nDest:");
                          for (i = 0; i < DSTSZ; i++) printf (" %d", dest[i]);
                          printf ("n");

                          return 0;
                          }


                          Running it through time gives you (I've formatted the output a bit to make it readable, but haven't affected the results):



                          Source: 403 459 646 467 120 346 430 247 68 312 701 304 707 443
                          753 433 986 921 513 634 861 741 482 794 679 409 145 93
                          512 947 19 9 385 208 795 742 851 638 924 637 638 141
                          382 89 998 713 210 732 784 67 273 628 187 902 42 25
                          747 471 686 504 255 74 638 610 227 892 156 86 48 133
                          63 234 639 899 815 986 750 177 413 581 899 494 292 359
                          60 106 944 926 257 370 310 726 393 800 986 827 856 835
                          66 183 901
                          Dest: 998 986 986 986 947 944 926 924 921 902

                          real 0m0.063s
                          user 0m0.046s
                          sys 0m0.031s


                          Only once the quantities of numbers become large should you usually worry. Don't get me wrong, I'm not saying you shouldn't think about performance. What you shouldn't do is spend too much time optimising things that don't matter - YAGNI and all that jazz.



                          As with all optimisation questions, measure don't guess!






                          share|improve this answer














                          The best solution is to use whatever facilities your chosen language provides which will make your life easier.



                          However, assuming this was a question more related to what algorithm you should choose, I'm going to suggest a different approach here. If you're talking about 10 from 100, you shouldn't generally worry too much about performance unless you want to do it many times per second.



                          For example, this C code (which is about as inefficient as I can make it without being silly) still takes well under a tenth of a second to execute. That's not enough time for me to even think about going to get a coffee.



                          #include <stdio.h>
                          #include <stdlib.h>
                          #include <time.h>

                          #define SRCSZ 100
                          #define DSTSZ 10

                          int main (void) {
                          int unused[SRCSZ], source[SRCSZ], dest[DSTSZ], i, j, pos;

                          srand (time (NULL));
                          for (i = 0; i < SRCSZ; i++) {
                          unused[i] = 1;
                          source[i] = rand() % 1000;
                          }

                          for (i = 0; i < DSTSZ; i++) {
                          pos = -1;
                          for (j = 0; j < SRCSZ; j++) {
                          if (pos == -1) {
                          if (unused[j]) {
                          pos = j;
                          }
                          } else {
                          if (unused[j] && (source[j] > source[pos])) {
                          pos = j;
                          }
                          }
                          }
                          dest[i] = source[pos];
                          unused[pos] = 0;
                          }

                          printf ("Source:");
                          for (i = 0; i < SRCSZ; i++) printf (" %d", source[i]);
                          printf ("nDest:");
                          for (i = 0; i < DSTSZ; i++) printf (" %d", dest[i]);
                          printf ("n");

                          return 0;
                          }


                          Running it through time gives you (I've formatted the output a bit to make it readable, but haven't affected the results):



                          Source: 403 459 646 467 120 346 430 247 68 312 701 304 707 443
                          753 433 986 921 513 634 861 741 482 794 679 409 145 93
                          512 947 19 9 385 208 795 742 851 638 924 637 638 141
                          382 89 998 713 210 732 784 67 273 628 187 902 42 25
                          747 471 686 504 255 74 638 610 227 892 156 86 48 133
                          63 234 639 899 815 986 750 177 413 581 899 494 292 359
                          60 106 944 926 257 370 310 726 393 800 986 827 856 835
                          66 183 901
                          Dest: 998 986 986 986 947 944 926 924 921 902

                          real 0m0.063s
                          user 0m0.046s
                          sys 0m0.031s


                          Only once the quantities of numbers become large should you usually worry. Don't get me wrong, I'm not saying you shouldn't think about performance. What you shouldn't do is spend too much time optimising things that don't matter - YAGNI and all that jazz.



                          As with all optimisation questions, measure don't guess!







                          share|improve this answer














                          share|improve this answer



                          share|improve this answer








                          edited Feb 4 '17 at 5:46

























                          answered Nov 3 '10 at 6:27









                          paxdiablo

                          624k16812361656




                          624k16812361656












                          • but is it a suitable interview answer??
                            – zengr
                            Nov 3 '10 at 6:32










                          • Yes, I believe it would be. By far the most dangerous person to hire is the over-engineer. This is someone who, when asked to provide a function to calculate the sum of a list of items, will give you a 900-line monstrosity that's configurable to handle integers, floats, complex numbers and can also perform pre-calculations on each item with a callback function. This is not the sort of person you want working on something that has a definite delivery date :-) In addition, it gives valuable insight that the candidate can think for themself. Any monkey can cut code.
                            – paxdiablo
                            Nov 3 '10 at 6:37






                          • 2




                            I would hire someone like that on the spot, provided they explain why they did it that way and what are the consequences. See also stackoverflow.com/questions/903572/… and stackoverflow.com/questions/445425/… although there were ... shall we say, animated ... discussions on the merits :-)
                            – paxdiablo
                            Nov 3 '10 at 6:40






                          • 1




                            Hehe, +1 for practicality... But re: what kind of answer is "best" I think the crucial thing is for the interviewee to ask: (1) What are the typical problem sizes? (2) How fast does it need to be? (And BTW if performance on big problems isn't important then the best answer is "Use whatever built-in sort() the language has :-P)
                            – j_random_hacker
                            Nov 3 '10 at 7:17


















                          • but is it a suitable interview answer??
                            – zengr
                            Nov 3 '10 at 6:32










                          • Yes, I believe it would be. By far the most dangerous person to hire is the over-engineer. This is someone who, when asked to provide a function to calculate the sum of a list of items, will give you a 900-line monstrosity that's configurable to handle integers, floats, complex numbers and can also perform pre-calculations on each item with a callback function. This is not the sort of person you want working on something that has a definite delivery date :-) In addition, it gives valuable insight that the candidate can think for themself. Any monkey can cut code.
                            – paxdiablo
                            Nov 3 '10 at 6:37






                          • 2




                            I would hire someone like that on the spot, provided they explain why they did it that way and what are the consequences. See also stackoverflow.com/questions/903572/… and stackoverflow.com/questions/445425/… although there were ... shall we say, animated ... discussions on the merits :-)
                            – paxdiablo
                            Nov 3 '10 at 6:40






                          • 1




                            Hehe, +1 for practicality... But re: what kind of answer is "best" I think the crucial thing is for the interviewee to ask: (1) What are the typical problem sizes? (2) How fast does it need to be? (And BTW if performance on big problems isn't important then the best answer is "Use whatever built-in sort() the language has :-P)
                            – j_random_hacker
                            Nov 3 '10 at 7:17
















                          but is it a suitable interview answer??
                          – zengr
                          Nov 3 '10 at 6:32




                          but is it a suitable interview answer??
                          – zengr
                          Nov 3 '10 at 6:32












                          Yes, I believe it would be. By far the most dangerous person to hire is the over-engineer. This is someone who, when asked to provide a function to calculate the sum of a list of items, will give you a 900-line monstrosity that's configurable to handle integers, floats, complex numbers and can also perform pre-calculations on each item with a callback function. This is not the sort of person you want working on something that has a definite delivery date :-) In addition, it gives valuable insight that the candidate can think for themself. Any monkey can cut code.
                          – paxdiablo
                          Nov 3 '10 at 6:37




                          Yes, I believe it would be. By far the most dangerous person to hire is the over-engineer. This is someone who, when asked to provide a function to calculate the sum of a list of items, will give you a 900-line monstrosity that's configurable to handle integers, floats, complex numbers and can also perform pre-calculations on each item with a callback function. This is not the sort of person you want working on something that has a definite delivery date :-) In addition, it gives valuable insight that the candidate can think for themself. Any monkey can cut code.
                          – paxdiablo
                          Nov 3 '10 at 6:37




                          2




                          2




                          I would hire someone like that on the spot, provided they explain why they did it that way and what are the consequences. See also stackoverflow.com/questions/903572/… and stackoverflow.com/questions/445425/… although there were ... shall we say, animated ... discussions on the merits :-)
                          – paxdiablo
                          Nov 3 '10 at 6:40




                          I would hire someone like that on the spot, provided they explain why they did it that way and what are the consequences. See also stackoverflow.com/questions/903572/… and stackoverflow.com/questions/445425/… although there were ... shall we say, animated ... discussions on the merits :-)
                          – paxdiablo
                          Nov 3 '10 at 6:40




                          1




                          1




                          Hehe, +1 for practicality... But re: what kind of answer is "best" I think the crucial thing is for the interviewee to ask: (1) What are the typical problem sizes? (2) How fast does it need to be? (And BTW if performance on big problems isn't important then the best answer is "Use whatever built-in sort() the language has :-P)
                          – j_random_hacker
                          Nov 3 '10 at 7:17




                          Hehe, +1 for practicality... But re: what kind of answer is "best" I think the crucial thing is for the interviewee to ask: (1) What are the typical problem sizes? (2) How fast does it need to be? (And BTW if performance on big problems isn't important then the best answer is "Use whatever built-in sort() the language has :-P)
                          – j_random_hacker
                          Nov 3 '10 at 7:17










                          up vote
                          1
                          down vote













                          Well, you can create a heap from an unsorted array in O(n) time, and you can get the top element from the heap in O(log(n)) time. So your total runtime is O(n + k*log(n)).






                          share|improve this answer

























                            up vote
                            1
                            down vote













                            Well, you can create a heap from an unsorted array in O(n) time, and you can get the top element from the heap in O(log(n)) time. So your total runtime is O(n + k*log(n)).






                            share|improve this answer























                              up vote
                              1
                              down vote










                              up vote
                              1
                              down vote









                              Well, you can create a heap from an unsorted array in O(n) time, and you can get the top element from the heap in O(log(n)) time. So your total runtime is O(n + k*log(n)).






                              share|improve this answer












                              Well, you can create a heap from an unsorted array in O(n) time, and you can get the top element from the heap in O(log(n)) time. So your total runtime is O(n + k*log(n)).







                              share|improve this answer












                              share|improve this answer



                              share|improve this answer










                              answered Oct 23 '12 at 5:56









                              Charles Munger

                              1,2551011




                              1,2551011






















                                  up vote
                                  1
                                  down vote













                                  Written below both selection sort and insertion sort implementations. For larger data set I suggest insetion sort better than selection sort



                                  public interface FindTopValues
                                  {
                                  int findTopNValues(int data, int n);
                                  }


                                  Insertion Sort Implementation:



                                  public class FindTopValuesInsertionSortImpl implements FindTopValues {  

                                  /**
                                  * Finds list of the highest 'n' values in the source list, ordered naturally,
                                  * with the highest value at the start of the array and returns it
                                  */
                                  @Override
                                  public int findTopNValues(int values, int n) {

                                  int length = values.length;
                                  for (int i=1; i<length; i++) {
                                  int curPos = i;
                                  while ((curPos > 0) && (values[i] > values[curPos-1])) {
                                  curPos--;
                                  }

                                  if (curPos != i) {
                                  int element = values[i];
                                  System.arraycopy(values, curPos, values, curPos+1, (i-curPos));
                                  values[curPos] = element;
                                  }
                                  }

                                  return Arrays.copyOf(values, n);
                                  }

                                  }


                                  Selection Sort Implementation:



                                  public class FindTopValuesSelectionSortImpl implements FindTopValues {

                                  /**
                                  * Finds list of the highest 'n' values in the source list, ordered naturally,
                                  * with the highest value at the start of the array and returns it
                                  */
                                  @Override
                                  public int findTopNValues(int values, int n) {
                                  int length = values.length;

                                  for (int i=0; i<=n; i++) {
                                  int maxPos = i;
                                  for (int j=i+1; j<length; j++) {
                                  if (values[j] > values[maxPos]) {
                                  maxPos = j;
                                  }
                                  }

                                  if (maxPos != i) {
                                  int maxValue = values[maxPos];
                                  values[maxPos] = values[i];
                                  values[i] = maxValue;
                                  }
                                  }
                                  return Arrays.copyOf(values, n);
                                  }
                                  }





                                  share|improve this answer



























                                    up vote
                                    1
                                    down vote













                                    Written below both selection sort and insertion sort implementations. For larger data set I suggest insetion sort better than selection sort



                                    public interface FindTopValues
                                    {
                                    int findTopNValues(int data, int n);
                                    }


                                    Insertion Sort Implementation:



                                    public class FindTopValuesInsertionSortImpl implements FindTopValues {  

                                    /**
                                    * Finds list of the highest 'n' values in the source list, ordered naturally,
                                    * with the highest value at the start of the array and returns it
                                    */
                                    @Override
                                    public int findTopNValues(int values, int n) {

                                    int length = values.length;
                                    for (int i=1; i<length; i++) {
                                    int curPos = i;
                                    while ((curPos > 0) && (values[i] > values[curPos-1])) {
                                    curPos--;
                                    }

                                    if (curPos != i) {
                                    int element = values[i];
                                    System.arraycopy(values, curPos, values, curPos+1, (i-curPos));
                                    values[curPos] = element;
                                    }
                                    }

                                    return Arrays.copyOf(values, n);
                                    }

                                    }


                                    Selection Sort Implementation:



                                    public class FindTopValuesSelectionSortImpl implements FindTopValues {

                                    /**
                                    * Finds list of the highest 'n' values in the source list, ordered naturally,
                                    * with the highest value at the start of the array and returns it
                                    */
                                    @Override
                                    public int findTopNValues(int values, int n) {
                                    int length = values.length;

                                    for (int i=0; i<=n; i++) {
                                    int maxPos = i;
                                    for (int j=i+1; j<length; j++) {
                                    if (values[j] > values[maxPos]) {
                                    maxPos = j;
                                    }
                                    }

                                    if (maxPos != i) {
                                    int maxValue = values[maxPos];
                                    values[maxPos] = values[i];
                                    values[i] = maxValue;
                                    }
                                    }
                                    return Arrays.copyOf(values, n);
                                    }
                                    }





                                    share|improve this answer

























                                      up vote
                                      1
                                      down vote










                                      up vote
                                      1
                                      down vote









                                      Written below both selection sort and insertion sort implementations. For larger data set I suggest insetion sort better than selection sort



                                      public interface FindTopValues
                                      {
                                      int findTopNValues(int data, int n);
                                      }


                                      Insertion Sort Implementation:



                                      public class FindTopValuesInsertionSortImpl implements FindTopValues {  

                                      /**
                                      * Finds list of the highest 'n' values in the source list, ordered naturally,
                                      * with the highest value at the start of the array and returns it
                                      */
                                      @Override
                                      public int findTopNValues(int values, int n) {

                                      int length = values.length;
                                      for (int i=1; i<length; i++) {
                                      int curPos = i;
                                      while ((curPos > 0) && (values[i] > values[curPos-1])) {
                                      curPos--;
                                      }

                                      if (curPos != i) {
                                      int element = values[i];
                                      System.arraycopy(values, curPos, values, curPos+1, (i-curPos));
                                      values[curPos] = element;
                                      }
                                      }

                                      return Arrays.copyOf(values, n);
                                      }

                                      }


                                      Selection Sort Implementation:



                                      public class FindTopValuesSelectionSortImpl implements FindTopValues {

                                      /**
                                      * Finds list of the highest 'n' values in the source list, ordered naturally,
                                      * with the highest value at the start of the array and returns it
                                      */
                                      @Override
                                      public int findTopNValues(int values, int n) {
                                      int length = values.length;

                                      for (int i=0; i<=n; i++) {
                                      int maxPos = i;
                                      for (int j=i+1; j<length; j++) {
                                      if (values[j] > values[maxPos]) {
                                      maxPos = j;
                                      }
                                      }

                                      if (maxPos != i) {
                                      int maxValue = values[maxPos];
                                      values[maxPos] = values[i];
                                      values[i] = maxValue;
                                      }
                                      }
                                      return Arrays.copyOf(values, n);
                                      }
                                      }





                                      share|improve this answer














                                      Written below both selection sort and insertion sort implementations. For larger data set I suggest insetion sort better than selection sort



                                      public interface FindTopValues
                                      {
                                      int findTopNValues(int data, int n);
                                      }


                                      Insertion Sort Implementation:



                                      public class FindTopValuesInsertionSortImpl implements FindTopValues {  

                                      /**
                                      * Finds list of the highest 'n' values in the source list, ordered naturally,
                                      * with the highest value at the start of the array and returns it
                                      */
                                      @Override
                                      public int findTopNValues(int values, int n) {

                                      int length = values.length;
                                      for (int i=1; i<length; i++) {
                                      int curPos = i;
                                      while ((curPos > 0) && (values[i] > values[curPos-1])) {
                                      curPos--;
                                      }

                                      if (curPos != i) {
                                      int element = values[i];
                                      System.arraycopy(values, curPos, values, curPos+1, (i-curPos));
                                      values[curPos] = element;
                                      }
                                      }

                                      return Arrays.copyOf(values, n);
                                      }

                                      }


                                      Selection Sort Implementation:



                                      public class FindTopValuesSelectionSortImpl implements FindTopValues {

                                      /**
                                      * Finds list of the highest 'n' values in the source list, ordered naturally,
                                      * with the highest value at the start of the array and returns it
                                      */
                                      @Override
                                      public int findTopNValues(int values, int n) {
                                      int length = values.length;

                                      for (int i=0; i<=n; i++) {
                                      int maxPos = i;
                                      for (int j=i+1; j<length; j++) {
                                      if (values[j] > values[maxPos]) {
                                      maxPos = j;
                                      }
                                      }

                                      if (maxPos != i) {
                                      int maxValue = values[maxPos];
                                      values[maxPos] = values[i];
                                      values[i] = maxValue;
                                      }
                                      }
                                      return Arrays.copyOf(values, n);
                                      }
                                      }






                                      share|improve this answer














                                      share|improve this answer



                                      share|improve this answer








                                      edited Apr 26 '15 at 14:13









                                      Ashish Chopra

                                      1,021618




                                      1,021618










                                      answered Jul 19 '13 at 14:18









                                      Srihari Konakanchi

                                      512




                                      512






















                                          up vote
                                          1
                                          down vote













                                          You can use List and can guava's Comparators class to get the desired results. It is a highly optimized solution. Please see a sample below, which gets top 5 numbers. Api can be found here.



                                          import java.util.Comparator;
                                          import java.util.List;
                                          import java.util.stream.Collector;

                                          import org.junit.Test;

                                          import com.google.common.collect.Comparators;
                                          import com.google.common.collect.Lists;

                                          public class TestComparator {

                                          @Test
                                          public void testTopN() {
                                          final List<Integer> numbers = Lists.newArrayList(1, 3, 8, 2, 6, 4, 7, 5, 9, 0);
                                          final Collector<Integer, ?, List<Integer>> collector = Comparators.greatest(5,
                                          Comparator.<Integer>naturalOrder());
                                          final List<Integer> top = numbers.stream().collect(collector);
                                          System.out.println(top);
                                          }

                                          }


                                          Output: [9, 8, 7, 6, 5]






                                          share|improve this answer

























                                            up vote
                                            1
                                            down vote













                                            You can use List and can guava's Comparators class to get the desired results. It is a highly optimized solution. Please see a sample below, which gets top 5 numbers. Api can be found here.



                                            import java.util.Comparator;
                                            import java.util.List;
                                            import java.util.stream.Collector;

                                            import org.junit.Test;

                                            import com.google.common.collect.Comparators;
                                            import com.google.common.collect.Lists;

                                            public class TestComparator {

                                            @Test
                                            public void testTopN() {
                                            final List<Integer> numbers = Lists.newArrayList(1, 3, 8, 2, 6, 4, 7, 5, 9, 0);
                                            final Collector<Integer, ?, List<Integer>> collector = Comparators.greatest(5,
                                            Comparator.<Integer>naturalOrder());
                                            final List<Integer> top = numbers.stream().collect(collector);
                                            System.out.println(top);
                                            }

                                            }


                                            Output: [9, 8, 7, 6, 5]






                                            share|improve this answer























                                              up vote
                                              1
                                              down vote










                                              up vote
                                              1
                                              down vote









                                              You can use List and can guava's Comparators class to get the desired results. It is a highly optimized solution. Please see a sample below, which gets top 5 numbers. Api can be found here.



                                              import java.util.Comparator;
                                              import java.util.List;
                                              import java.util.stream.Collector;

                                              import org.junit.Test;

                                              import com.google.common.collect.Comparators;
                                              import com.google.common.collect.Lists;

                                              public class TestComparator {

                                              @Test
                                              public void testTopN() {
                                              final List<Integer> numbers = Lists.newArrayList(1, 3, 8, 2, 6, 4, 7, 5, 9, 0);
                                              final Collector<Integer, ?, List<Integer>> collector = Comparators.greatest(5,
                                              Comparator.<Integer>naturalOrder());
                                              final List<Integer> top = numbers.stream().collect(collector);
                                              System.out.println(top);
                                              }

                                              }


                                              Output: [9, 8, 7, 6, 5]






                                              share|improve this answer












                                              You can use List and can guava's Comparators class to get the desired results. It is a highly optimized solution. Please see a sample below, which gets top 5 numbers. Api can be found here.



                                              import java.util.Comparator;
                                              import java.util.List;
                                              import java.util.stream.Collector;

                                              import org.junit.Test;

                                              import com.google.common.collect.Comparators;
                                              import com.google.common.collect.Lists;

                                              public class TestComparator {

                                              @Test
                                              public void testTopN() {
                                              final List<Integer> numbers = Lists.newArrayList(1, 3, 8, 2, 6, 4, 7, 5, 9, 0);
                                              final Collector<Integer, ?, List<Integer>> collector = Comparators.greatest(5,
                                              Comparator.<Integer>naturalOrder());
                                              final List<Integer> top = numbers.stream().collect(collector);
                                              System.out.println(top);
                                              }

                                              }


                                              Output: [9, 8, 7, 6, 5]







                                              share|improve this answer












                                              share|improve this answer



                                              share|improve this answer










                                              answered Jun 28 '17 at 8:19









                                              Pritesh Mhatre

                                              1,61411320




                                              1,61411320






















                                                  up vote
                                                  0
                                                  down vote













                                                  Yes there is a way to do better than quicksort. As pointed by Yin Zhu, you can search for kth largest element first and then use that element value as your pivot to split the array






                                                  share|improve this answer

























                                                    up vote
                                                    0
                                                    down vote













                                                    Yes there is a way to do better than quicksort. As pointed by Yin Zhu, you can search for kth largest element first and then use that element value as your pivot to split the array






                                                    share|improve this answer























                                                      up vote
                                                      0
                                                      down vote










                                                      up vote
                                                      0
                                                      down vote









                                                      Yes there is a way to do better than quicksort. As pointed by Yin Zhu, you can search for kth largest element first and then use that element value as your pivot to split the array






                                                      share|improve this answer












                                                      Yes there is a way to do better than quicksort. As pointed by Yin Zhu, you can search for kth largest element first and then use that element value as your pivot to split the array







                                                      share|improve this answer












                                                      share|improve this answer



                                                      share|improve this answer










                                                      answered Nov 3 '10 at 5:55









                                                      CodeKata

                                                      53747




                                                      53747






















                                                          up vote
                                                          0
                                                          down vote













                                                          I was asked for the same algorithm on the interview.
                                                          I done that, if somebody can compare that with fastest algorithm in Java - will be very useful.



                                                              public int findTopNValues(int anyOldOrderValues, int n) {
                                                          if (n < 0) {
                                                          return new int{};
                                                          }
                                                          if (n == 1) {
                                                          return new int{findMaxValue(anyOldOrderValues)};
                                                          }

                                                          int result = new int[n + 1];
                                                          for (int i = 0; i < Math.min(n, anyOldOrderValues.length); i++) {
                                                          result[i] = anyOldOrderValues[i];
                                                          }
                                                          Arrays.sort(result);

                                                          int max = result[0];
                                                          for (int i = n - 1; i < anyOldOrderValues.length; i++) {
                                                          int value = anyOldOrderValues[i];
                                                          if (max < value) {
                                                          result[n] = value;
                                                          Arrays.sort(result);
                                                          int result1 = new int[n + 1];
                                                          System.arraycopy(result, 1, result1, 0, n);
                                                          result = result1;
                                                          max = result[0];
                                                          }
                                                          }
                                                          return convertAndFlip(result, n);
                                                          }

                                                          public static int convertAndFlip(int integers, int n) {
                                                          int result = new int[n];
                                                          int j = 0;
                                                          for (int i = n - 1; i > -1; i--) {
                                                          result[j++] = integers[i];
                                                          }
                                                          return result;
                                                          }


                                                          and test for that:



                                                          public void testFindTopNValues() throws Exception {
                                                          final int N = 100000000;
                                                          final int MAX_VALUE = 100000000;
                                                          final int returnArray = 1000;
                                                          final int repeatTimes = 5;

                                                          FindTopValuesArraySorting arraySorting = new FindTopValuesArraySorting();

                                                          int randomArray = createRandomArray(N, MAX_VALUE);
                                                          for (int i = 0; i < repeatTimes; i++) {

                                                          long start = System.currentTimeMillis();
                                                          int topNValues = arraySorting.findTopNValues(randomArray, returnArray);
                                                          long stop = System.currentTimeMillis();

                                                          System.out.println("findTopNValues() from " + N + " elements, where MAX value=" + (MAX_VALUE - 1) + " and return array size " + returnArray + " elements : " + (stop - start) + "msec");
                                                          // System.out.println("Result list = " + Arrays.toString(topNValues));
                                                          }
                                                          }

                                                          private static int createRandomArray(int n, int maxValue) {
                                                          Random r = new Random();
                                                          int arr = new int[n];
                                                          for (int i = 0; i < n; i++) {
                                                          arr[i] = r.nextInt(maxValue);
                                                          }
                                                          return arr;
                                                          }


                                                          Result is something like:



                                                          findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 395msec
                                                          findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 311msec
                                                          findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 473msec
                                                          findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 380msec
                                                          findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 406msec


                                                          ~400msc average result, for getting 1000 max integers from array of 100.000.000 initial elements.
                                                          not bad!



                                                          Just tried that set from above:



                                                          findTopNValues() from 101 elements and return array size 10 elements : 1msec
                                                          Result list = [998, 986, 986, 986, 947, 944, 926, 924, 921, 902]
                                                          Original list = [403, 459, 646, 467, 120, 346, 430, 247, 68, 312, 701, 304, 707, 443, 753, 433, 986, 921, 513, 634, 861, 741, 482, 794, 679, 409, 145, 93, 512, 947, 19, 9, 385, 208, 795, 742, 851, 638, 924, 637, 638, 141, 382, 89, 998, 713, 210, 732, 784, 67, 273, 628, 187, 902, 42, 25, 747, 471, 686, 504, 255, 74, 638, 610, 227, 892, 156, 86, 48, 133, 63, 234, 639, 899, 815, 986, 750, 177, 413, 581, 899, 494, 292, 359, 60, 106, 944, 926, 257, 370, 310, 726, 393, 800, 986, 827, 856, 835, 66, 183, 901]





                                                          share|improve this answer



























                                                            up vote
                                                            0
                                                            down vote













                                                            I was asked for the same algorithm on the interview.
                                                            I done that, if somebody can compare that with fastest algorithm in Java - will be very useful.



                                                                public int findTopNValues(int anyOldOrderValues, int n) {
                                                            if (n < 0) {
                                                            return new int{};
                                                            }
                                                            if (n == 1) {
                                                            return new int{findMaxValue(anyOldOrderValues)};
                                                            }

                                                            int result = new int[n + 1];
                                                            for (int i = 0; i < Math.min(n, anyOldOrderValues.length); i++) {
                                                            result[i] = anyOldOrderValues[i];
                                                            }
                                                            Arrays.sort(result);

                                                            int max = result[0];
                                                            for (int i = n - 1; i < anyOldOrderValues.length; i++) {
                                                            int value = anyOldOrderValues[i];
                                                            if (max < value) {
                                                            result[n] = value;
                                                            Arrays.sort(result);
                                                            int result1 = new int[n + 1];
                                                            System.arraycopy(result, 1, result1, 0, n);
                                                            result = result1;
                                                            max = result[0];
                                                            }
                                                            }
                                                            return convertAndFlip(result, n);
                                                            }

                                                            public static int convertAndFlip(int integers, int n) {
                                                            int result = new int[n];
                                                            int j = 0;
                                                            for (int i = n - 1; i > -1; i--) {
                                                            result[j++] = integers[i];
                                                            }
                                                            return result;
                                                            }


                                                            and test for that:



                                                            public void testFindTopNValues() throws Exception {
                                                            final int N = 100000000;
                                                            final int MAX_VALUE = 100000000;
                                                            final int returnArray = 1000;
                                                            final int repeatTimes = 5;

                                                            FindTopValuesArraySorting arraySorting = new FindTopValuesArraySorting();

                                                            int randomArray = createRandomArray(N, MAX_VALUE);
                                                            for (int i = 0; i < repeatTimes; i++) {

                                                            long start = System.currentTimeMillis();
                                                            int topNValues = arraySorting.findTopNValues(randomArray, returnArray);
                                                            long stop = System.currentTimeMillis();

                                                            System.out.println("findTopNValues() from " + N + " elements, where MAX value=" + (MAX_VALUE - 1) + " and return array size " + returnArray + " elements : " + (stop - start) + "msec");
                                                            // System.out.println("Result list = " + Arrays.toString(topNValues));
                                                            }
                                                            }

                                                            private static int createRandomArray(int n, int maxValue) {
                                                            Random r = new Random();
                                                            int arr = new int[n];
                                                            for (int i = 0; i < n; i++) {
                                                            arr[i] = r.nextInt(maxValue);
                                                            }
                                                            return arr;
                                                            }


                                                            Result is something like:



                                                            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 395msec
                                                            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 311msec
                                                            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 473msec
                                                            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 380msec
                                                            findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 406msec


                                                            ~400msc average result, for getting 1000 max integers from array of 100.000.000 initial elements.
                                                            not bad!



                                                            Just tried that set from above:



                                                            findTopNValues() from 101 elements and return array size 10 elements : 1msec
                                                            Result list = [998, 986, 986, 986, 947, 944, 926, 924, 921, 902]
                                                            Original list = [403, 459, 646, 467, 120, 346, 430, 247, 68, 312, 701, 304, 707, 443, 753, 433, 986, 921, 513, 634, 861, 741, 482, 794, 679, 409, 145, 93, 512, 947, 19, 9, 385, 208, 795, 742, 851, 638, 924, 637, 638, 141, 382, 89, 998, 713, 210, 732, 784, 67, 273, 628, 187, 902, 42, 25, 747, 471, 686, 504, 255, 74, 638, 610, 227, 892, 156, 86, 48, 133, 63, 234, 639, 899, 815, 986, 750, 177, 413, 581, 899, 494, 292, 359, 60, 106, 944, 926, 257, 370, 310, 726, 393, 800, 986, 827, 856, 835, 66, 183, 901]





                                                            share|improve this answer

























                                                              up vote
                                                              0
                                                              down vote










                                                              up vote
                                                              0
                                                              down vote









                                                              I was asked for the same algorithm on the interview.
                                                              I done that, if somebody can compare that with fastest algorithm in Java - will be very useful.



                                                                  public int findTopNValues(int anyOldOrderValues, int n) {
                                                              if (n < 0) {
                                                              return new int{};
                                                              }
                                                              if (n == 1) {
                                                              return new int{findMaxValue(anyOldOrderValues)};
                                                              }

                                                              int result = new int[n + 1];
                                                              for (int i = 0; i < Math.min(n, anyOldOrderValues.length); i++) {
                                                              result[i] = anyOldOrderValues[i];
                                                              }
                                                              Arrays.sort(result);

                                                              int max = result[0];
                                                              for (int i = n - 1; i < anyOldOrderValues.length; i++) {
                                                              int value = anyOldOrderValues[i];
                                                              if (max < value) {
                                                              result[n] = value;
                                                              Arrays.sort(result);
                                                              int result1 = new int[n + 1];
                                                              System.arraycopy(result, 1, result1, 0, n);
                                                              result = result1;
                                                              max = result[0];
                                                              }
                                                              }
                                                              return convertAndFlip(result, n);
                                                              }

                                                              public static int convertAndFlip(int integers, int n) {
                                                              int result = new int[n];
                                                              int j = 0;
                                                              for (int i = n - 1; i > -1; i--) {
                                                              result[j++] = integers[i];
                                                              }
                                                              return result;
                                                              }


                                                              and test for that:



                                                              public void testFindTopNValues() throws Exception {
                                                              final int N = 100000000;
                                                              final int MAX_VALUE = 100000000;
                                                              final int returnArray = 1000;
                                                              final int repeatTimes = 5;

                                                              FindTopValuesArraySorting arraySorting = new FindTopValuesArraySorting();

                                                              int randomArray = createRandomArray(N, MAX_VALUE);
                                                              for (int i = 0; i < repeatTimes; i++) {

                                                              long start = System.currentTimeMillis();
                                                              int topNValues = arraySorting.findTopNValues(randomArray, returnArray);
                                                              long stop = System.currentTimeMillis();

                                                              System.out.println("findTopNValues() from " + N + " elements, where MAX value=" + (MAX_VALUE - 1) + " and return array size " + returnArray + " elements : " + (stop - start) + "msec");
                                                              // System.out.println("Result list = " + Arrays.toString(topNValues));
                                                              }
                                                              }

                                                              private static int createRandomArray(int n, int maxValue) {
                                                              Random r = new Random();
                                                              int arr = new int[n];
                                                              for (int i = 0; i < n; i++) {
                                                              arr[i] = r.nextInt(maxValue);
                                                              }
                                                              return arr;
                                                              }


                                                              Result is something like:



                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 395msec
                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 311msec
                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 473msec
                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 380msec
                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 406msec


                                                              ~400msc average result, for getting 1000 max integers from array of 100.000.000 initial elements.
                                                              not bad!



                                                              Just tried that set from above:



                                                              findTopNValues() from 101 elements and return array size 10 elements : 1msec
                                                              Result list = [998, 986, 986, 986, 947, 944, 926, 924, 921, 902]
                                                              Original list = [403, 459, 646, 467, 120, 346, 430, 247, 68, 312, 701, 304, 707, 443, 753, 433, 986, 921, 513, 634, 861, 741, 482, 794, 679, 409, 145, 93, 512, 947, 19, 9, 385, 208, 795, 742, 851, 638, 924, 637, 638, 141, 382, 89, 998, 713, 210, 732, 784, 67, 273, 628, 187, 902, 42, 25, 747, 471, 686, 504, 255, 74, 638, 610, 227, 892, 156, 86, 48, 133, 63, 234, 639, 899, 815, 986, 750, 177, 413, 581, 899, 494, 292, 359, 60, 106, 944, 926, 257, 370, 310, 726, 393, 800, 986, 827, 856, 835, 66, 183, 901]





                                                              share|improve this answer














                                                              I was asked for the same algorithm on the interview.
                                                              I done that, if somebody can compare that with fastest algorithm in Java - will be very useful.



                                                                  public int findTopNValues(int anyOldOrderValues, int n) {
                                                              if (n < 0) {
                                                              return new int{};
                                                              }
                                                              if (n == 1) {
                                                              return new int{findMaxValue(anyOldOrderValues)};
                                                              }

                                                              int result = new int[n + 1];
                                                              for (int i = 0; i < Math.min(n, anyOldOrderValues.length); i++) {
                                                              result[i] = anyOldOrderValues[i];
                                                              }
                                                              Arrays.sort(result);

                                                              int max = result[0];
                                                              for (int i = n - 1; i < anyOldOrderValues.length; i++) {
                                                              int value = anyOldOrderValues[i];
                                                              if (max < value) {
                                                              result[n] = value;
                                                              Arrays.sort(result);
                                                              int result1 = new int[n + 1];
                                                              System.arraycopy(result, 1, result1, 0, n);
                                                              result = result1;
                                                              max = result[0];
                                                              }
                                                              }
                                                              return convertAndFlip(result, n);
                                                              }

                                                              public static int convertAndFlip(int integers, int n) {
                                                              int result = new int[n];
                                                              int j = 0;
                                                              for (int i = n - 1; i > -1; i--) {
                                                              result[j++] = integers[i];
                                                              }
                                                              return result;
                                                              }


                                                              and test for that:



                                                              public void testFindTopNValues() throws Exception {
                                                              final int N = 100000000;
                                                              final int MAX_VALUE = 100000000;
                                                              final int returnArray = 1000;
                                                              final int repeatTimes = 5;

                                                              FindTopValuesArraySorting arraySorting = new FindTopValuesArraySorting();

                                                              int randomArray = createRandomArray(N, MAX_VALUE);
                                                              for (int i = 0; i < repeatTimes; i++) {

                                                              long start = System.currentTimeMillis();
                                                              int topNValues = arraySorting.findTopNValues(randomArray, returnArray);
                                                              long stop = System.currentTimeMillis();

                                                              System.out.println("findTopNValues() from " + N + " elements, where MAX value=" + (MAX_VALUE - 1) + " and return array size " + returnArray + " elements : " + (stop - start) + "msec");
                                                              // System.out.println("Result list = " + Arrays.toString(topNValues));
                                                              }
                                                              }

                                                              private static int createRandomArray(int n, int maxValue) {
                                                              Random r = new Random();
                                                              int arr = new int[n];
                                                              for (int i = 0; i < n; i++) {
                                                              arr[i] = r.nextInt(maxValue);
                                                              }
                                                              return arr;
                                                              }


                                                              Result is something like:



                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 395msec
                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 311msec
                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 473msec
                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 380msec
                                                              findTopNValues() from 100000000 elements, where MAX value=99999999 and return array size 1000 elements : 406msec


                                                              ~400msc average result, for getting 1000 max integers from array of 100.000.000 initial elements.
                                                              not bad!



                                                              Just tried that set from above:



                                                              findTopNValues() from 101 elements and return array size 10 elements : 1msec
                                                              Result list = [998, 986, 986, 986, 947, 944, 926, 924, 921, 902]
                                                              Original list = [403, 459, 646, 467, 120, 346, 430, 247, 68, 312, 701, 304, 707, 443, 753, 433, 986, 921, 513, 634, 861, 741, 482, 794, 679, 409, 145, 93, 512, 947, 19, 9, 385, 208, 795, 742, 851, 638, 924, 637, 638, 141, 382, 89, 998, 713, 210, 732, 784, 67, 273, 628, 187, 902, 42, 25, 747, 471, 686, 504, 255, 74, 638, 610, 227, 892, 156, 86, 48, 133, 63, 234, 639, 899, 815, 986, 750, 177, 413, 581, 899, 494, 292, 359, 60, 106, 944, 926, 257, 370, 310, 726, 393, 800, 986, 827, 856, 835, 66, 183, 901]






                                                              share|improve this answer














                                                              share|improve this answer



                                                              share|improve this answer








                                                              edited Jun 27 '13 at 23:01

























                                                              answered Jun 27 '13 at 22:50









                                                              Dmitri Algazin

                                                              2,3921718




                                                              2,3921718






















                                                                  up vote
                                                                  0
                                                                  down vote













                                                                  The best Algorithm would by large depend on the size of K.
                                                                  If K is small then by simply following BubbleSort Algorithm and iterating the outer loop K times
                                                                  would give the top K values.
                                                                  The complexity will be O(n*k).



                                                                  However for values of K close to n the complexity will approach O(n^2). In such scenario quicksort might be a good alternative.






                                                                  share|improve this answer

























                                                                    up vote
                                                                    0
                                                                    down vote













                                                                    The best Algorithm would by large depend on the size of K.
                                                                    If K is small then by simply following BubbleSort Algorithm and iterating the outer loop K times
                                                                    would give the top K values.
                                                                    The complexity will be O(n*k).



                                                                    However for values of K close to n the complexity will approach O(n^2). In such scenario quicksort might be a good alternative.






                                                                    share|improve this answer























                                                                      up vote
                                                                      0
                                                                      down vote










                                                                      up vote
                                                                      0
                                                                      down vote









                                                                      The best Algorithm would by large depend on the size of K.
                                                                      If K is small then by simply following BubbleSort Algorithm and iterating the outer loop K times
                                                                      would give the top K values.
                                                                      The complexity will be O(n*k).



                                                                      However for values of K close to n the complexity will approach O(n^2). In such scenario quicksort might be a good alternative.






                                                                      share|improve this answer












                                                                      The best Algorithm would by large depend on the size of K.
                                                                      If K is small then by simply following BubbleSort Algorithm and iterating the outer loop K times
                                                                      would give the top K values.
                                                                      The complexity will be O(n*k).



                                                                      However for values of K close to n the complexity will approach O(n^2). In such scenario quicksort might be a good alternative.







                                                                      share|improve this answer












                                                                      share|improve this answer



                                                                      share|improve this answer










                                                                      answered Jun 12 '14 at 13:51









                                                                      user3734336

                                                                      1




                                                                      1






















                                                                          up vote
                                                                          0
                                                                          down vote













                                                                          public class FindTopValuesSelectionSortImpl implements FindTopValues {

                                                                          /**
                                                                          * Finds list of the highest 'n' values in the source list, ordered naturally,
                                                                          * with the highest value at the start of the array and returns it
                                                                          */
                                                                          @Override
                                                                          public int findTopNValues(int values, int n) {
                                                                          int length = values.length;

                                                                          for (int i=0; i<=n; i++) {
                                                                          int maxPos = i;
                                                                          for (int j=i+1; j<length; j++) {
                                                                          if (values[j] > values[maxPos]) {
                                                                          maxPos = j;
                                                                          }
                                                                          }

                                                                          if (maxPos != i) {
                                                                          int maxValue = values[maxPos];
                                                                          values[maxPos] = values[i];**strong text**
                                                                          values[i] = maxValue;
                                                                          }
                                                                          }
                                                                          return Arrays.copyOf(values, n);
                                                                          }
                                                                          }





                                                                          share|improve this answer

























                                                                            up vote
                                                                            0
                                                                            down vote













                                                                            public class FindTopValuesSelectionSortImpl implements FindTopValues {

                                                                            /**
                                                                            * Finds list of the highest 'n' values in the source list, ordered naturally,
                                                                            * with the highest value at the start of the array and returns it
                                                                            */
                                                                            @Override
                                                                            public int findTopNValues(int values, int n) {
                                                                            int length = values.length;

                                                                            for (int i=0; i<=n; i++) {
                                                                            int maxPos = i;
                                                                            for (int j=i+1; j<length; j++) {
                                                                            if (values[j] > values[maxPos]) {
                                                                            maxPos = j;
                                                                            }
                                                                            }

                                                                            if (maxPos != i) {
                                                                            int maxValue = values[maxPos];
                                                                            values[maxPos] = values[i];**strong text**
                                                                            values[i] = maxValue;
                                                                            }
                                                                            }
                                                                            return Arrays.copyOf(values, n);
                                                                            }
                                                                            }





                                                                            share|improve this answer























                                                                              up vote
                                                                              0
                                                                              down vote










                                                                              up vote
                                                                              0
                                                                              down vote









                                                                              public class FindTopValuesSelectionSortImpl implements FindTopValues {

                                                                              /**
                                                                              * Finds list of the highest 'n' values in the source list, ordered naturally,
                                                                              * with the highest value at the start of the array and returns it
                                                                              */
                                                                              @Override
                                                                              public int findTopNValues(int values, int n) {
                                                                              int length = values.length;

                                                                              for (int i=0; i<=n; i++) {
                                                                              int maxPos = i;
                                                                              for (int j=i+1; j<length; j++) {
                                                                              if (values[j] > values[maxPos]) {
                                                                              maxPos = j;
                                                                              }
                                                                              }

                                                                              if (maxPos != i) {
                                                                              int maxValue = values[maxPos];
                                                                              values[maxPos] = values[i];**strong text**
                                                                              values[i] = maxValue;
                                                                              }
                                                                              }
                                                                              return Arrays.copyOf(values, n);
                                                                              }
                                                                              }





                                                                              share|improve this answer












                                                                              public class FindTopValuesSelectionSortImpl implements FindTopValues {

                                                                              /**
                                                                              * Finds list of the highest 'n' values in the source list, ordered naturally,
                                                                              * with the highest value at the start of the array and returns it
                                                                              */
                                                                              @Override
                                                                              public int findTopNValues(int values, int n) {
                                                                              int length = values.length;

                                                                              for (int i=0; i<=n; i++) {
                                                                              int maxPos = i;
                                                                              for (int j=i+1; j<length; j++) {
                                                                              if (values[j] > values[maxPos]) {
                                                                              maxPos = j;
                                                                              }
                                                                              }

                                                                              if (maxPos != i) {
                                                                              int maxValue = values[maxPos];
                                                                              values[maxPos] = values[i];**strong text**
                                                                              values[i] = maxValue;
                                                                              }
                                                                              }
                                                                              return Arrays.copyOf(values, n);
                                                                              }
                                                                              }






                                                                              share|improve this answer












                                                                              share|improve this answer



                                                                              share|improve this answer










                                                                              answered Sep 23 '16 at 10:37









                                                                              HARISH

                                                                              13




                                                                              13






























                                                                                  draft saved

                                                                                  draft discarded




















































                                                                                  Thanks for contributing an answer to Stack Overflow!


                                                                                  • Please be sure to answer the question. Provide details and share your research!

                                                                                  But avoid



                                                                                  • Asking for help, clarification, or responding to other answers.

                                                                                  • Making statements based on opinion; back them up with references or personal experience.


                                                                                  To learn more, see our tips on writing great answers.





                                                                                  Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                                                                                  Please pay close attention to the following guidance:


                                                                                  • Please be sure to answer the question. Provide details and share your research!

                                                                                  But avoid



                                                                                  • Asking for help, clarification, or responding to other answers.

                                                                                  • Making statements based on opinion; back them up with references or personal experience.


                                                                                  To learn more, see our tips on writing great answers.




                                                                                  draft saved


                                                                                  draft discarded














                                                                                  StackExchange.ready(
                                                                                  function () {
                                                                                  StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f4084495%2ffind-top-n-elements-in-an-array%23new-answer', 'question_page');
                                                                                  }
                                                                                  );

                                                                                  Post as a guest















                                                                                  Required, but never shown





















































                                                                                  Required, but never shown














                                                                                  Required, but never shown












                                                                                  Required, but never shown







                                                                                  Required, but never shown

































                                                                                  Required, but never shown














                                                                                  Required, but never shown












                                                                                  Required, but never shown







                                                                                  Required, but never shown







                                                                                  Popular posts from this blog

                                                                                  Coverage of Google Street View

                                                                                  Full-time equivalent

                                                                                  Surfing