I wrote a program to sort an array for the HackerRank median problem. My code passes 3 of 4 test cases; the last one fails with a timeout when the array has 10001 elements. How can I optimize this code to prevent timeouts with larger inputs?

C++
#include<iostream>
#include<cstdlib>   // malloc, free
using namespace std;
int main() {
    int n,j,i,tmp,med;
    cin >> n;
    int *a = (int*) malloc(n * sizeof(int));
    for(i=0; i<n; i++) 
        cin >> a[i];
    for(i=0; i<n-1; i++) {
        for(j=0; j<n-i-1; j++) {
            if(a[j] > a[j+1]) {
                tmp = a[j];
                a[j] = a[j+1];
                a[j+1] = tmp;
            }
        }
    }
    med = a[n/2];
    free(a);
    cout << med;
    return 0;
}


What I have tried:

I tried changing the integer to long int but the time complexity remains the same.
Updated 13-May-19 3:17am

If you only want to find the median, not the fully ordered array of inputs, sorting is suboptimal. A simple trick to reduce the number of comparisons is to partition your array into values that are greater and smaller than some given test value, and then recurse the process within the 'correct' sub array.

You can find a better description of this concept - with an additional twist - here[^]. According to this site, the algorithm is guaranteed to be linear in time (i.e. O(N)), and therefore shouldn't have a hard time with a list of size N=10001.

Compared to that, comparison-based sorting algorithms need on the order of N*log(N) comparisons, both in the worst case and on average; only specialized integer sorts can do better.
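To make the "partition, then recurse into the correct sub array" idea concrete, here is a minimal sketch of a plain quickselect. The function name `quickselect` and the choice of the middle element as the test value are my own assumptions, not something from the linked description. Note that this simple version is only linear on average; with an unlucky pivot sequence it can degrade to quadratic time, which is what the median-of-medians refinement avoids.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the k-th smallest element (0-based) of v.
// v is taken by value, so the caller's data is not reordered.
int quickselect(std::vector<int> v, std::size_t k) {
    assert(k < v.size());
    std::size_t lo = 0, hi = v.size();          // active range is [lo, hi)
    while (hi - lo > 1) {
        // Use the middle element as the test value and park it at the end.
        std::size_t mid = lo + (hi - lo) / 2;
        std::swap(v[mid], v[hi - 1]);
        int pivot = v[hi - 1];

        // Move everything smaller than the pivot to the front of the range.
        std::size_t store = lo;
        for (std::size_t i = lo; i + 1 < hi; ++i) {
            if (v[i] < pivot) {
                std::swap(v[i], v[store]);
                ++store;
            }
        }
        std::swap(v[store], v[hi - 1]);          // pivot lands at its final index

        if (k == store) return v[store];
        if (k < store) hi = store;               // continue in the left part
        else           lo = store + 1;           // continue in the right part
    }
    return v[lo];
}
```

For the original program, `quickselect(values, n / 2)` would give the same element as `a[n/2]` after a full sort.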

For those who don't like clicking on links from unknown sources, check google or wikipedia for "Median of medians", or take this short step by step description
Quote:
Median-of-medians Algorithm

The algorithm takes in a list and an index: median-of-medians(A, i). Assume that all elements of A are distinct (though the algorithm can be further generalized to allow for duplicate elements).

1. Divide the list into sublists each of length five (if there are fewer than five elements available for the last list, that is fine).

2. Sort each sublist and determine the median. If the list has an even number of elements, take the floor of the length of the list divided by 2 to find the index of the median.

3. Use the median-of-median algorithm to recursively determine the median of the set of all the medians.

4. Use this median as the pivot element, x. The pivot is an approximate median of the whole list and then each recursive step hones in on the true median.

5. Reorder the list such that all elements less than x are to the left of x, and all elements that are greater than x are to the right. This is called partitioning. The elements are in no particular order once they are placed on either side of x.

6. Let k be the “rank” of x meaning, for a set of numbers S, x is the kth smallest number in S.

7a. If i==k, then return x.
7b. If i<k, then recurse using median-of-medians to find the ith number of the left sub array
7c. If i>k, recurse using the median-of-medians algorithm to find the (i-k)th number of the right sub array.
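The steps above could be sketched in C++ roughly as follows. This is an illustration under my own assumptions, not the quoted site's code: the function name `median_of_medians` is hypothetical, the rank `i` is 1-based as in the description, lists of five or fewer elements are simply sorted, and duplicates are handled by counting the elements equal to the pivot. It copies sub lists into new vectors for clarity rather than partitioning in place.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the element of rank i in a (1-based: i == 1 is the smallest).
int median_of_medians(std::vector<int> a, std::size_t i) {
    assert(i >= 1 && i <= a.size());

    // Base case: a tiny list is cheapest to sort directly.
    if (a.size() <= 5) {
        std::sort(a.begin(), a.end());
        return a[i - 1];
    }

    // Steps 1-2: sort each group of (up to) five and collect its median.
    std::vector<int> medians;
    for (std::size_t start = 0; start < a.size(); start += 5) {
        std::size_t end = std::min(start + 5, a.size());
        std::sort(a.begin() + start, a.begin() + end);
        medians.push_back(a[start + (end - start) / 2]);
    }

    // Steps 3-4: the median of the medians becomes the pivot x.
    std::size_t middle_rank = (medians.size() + 1) / 2;
    int x = median_of_medians(std::move(medians), middle_rank);

    // Step 5: partition into smaller / equal / greater.
    std::vector<int> less, greater;
    std::size_t equal = 0;
    for (int value : a) {
        if (value < x)      less.push_back(value);
        else if (value > x) greater.push_back(value);
        else                ++equal;
    }

    // Steps 6-7: x occupies ranks less.size()+1 .. less.size()+equal.
    if (i <= less.size())         return median_of_medians(std::move(less), i);
    if (i <= less.size() + equal) return x;
    return median_of_medians(std::move(greater), i - less.size() - equal);
}
```

To reproduce the original program's `a[n/2]`, the call would be `median_of_medians(values, n / 2 + 1)`, since the rank here is 1-based.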
 
Quote:
How can I optimize this code to prevent timeouts with higher numbers?

I agree with OG and Rick: you are using the simplest possible Bubble sort, with no refinement, and that is a bad idea because it is really inefficient.
First, you need to understand how your code behaves. Instrument it with two counters like this:
C++
int cnt_test= 0;
int cnt_swap= 0;
for(i=0; i<n-1; i++) {
    for(j=0; j<n-i-1; j++) {
        cnt_test++;
        if(a[j] > a[j+1]) {
            cnt_swap++;
            tmp = a[j];
            a[j] = a[j+1];
            a[j+1] = tmp;
        }
    }
}
// and then print the 2 counters

This way, you will be able to see the workload.
Then run your code with sample data:
1 2 3 4 5 6 7 8 9
9 1 2 3 4 5 6 7 8
2 3 4 5 6 7 8 9 1
1 2 6 5 4 3 7 8 9
Then repeat with similar data sets of 20 and 30 values.

Then try changes to make your code sensitive to data and see how the counters evolve.
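As one possible example of such a change, here is a sketch that keeps the two counters and stops as soon as a full pass makes no swap. The names `SortCounters` and `bubble_sort_early_exit` are made up for this illustration, and it uses a `std::vector` instead of the raw array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper type: the two counters from the snippet above.
struct SortCounters {
    long tests = 0;   // number of comparisons
    long swaps = 0;   // number of exchanges
};

// Bubble sort that stops early when a pass performs no swap.
SortCounters bubble_sort_early_exit(std::vector<int>& a) {
    SortCounters c;
    if (a.size() < 2) return c;
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        bool swapped = false;
        for (std::size_t j = 0; j + 1 < a.size() - i; ++j) {
            ++c.tests;
            if (a[j] > a[j + 1]) {
                ++c.swaps;
                std::swap(a[j], a[j + 1]);
                swapped = true;
            }
        }
        if (!swapped) break;   // nothing moved: the array is already sorted
    }
    assert(std::is_sorted(a.begin(), a.end()));
    return c;
}
```

With the nine-value samples above, the unmodified loops always perform 36 tests. This variant needs 8 tests and 0 swaps for `1 2 3 4 5 6 7 8 9`, and 15 tests and 8 swaps for `9 1 2 3 4 5 6 7 8`, but still 36 tests for `2 3 4 5 6 7 8 9 1`, so it only helps on some inputs.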

A little study of sort algorithms will help you choose a better one.
 
Change the looping to look like this :
C++
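for( i = 0; i < n - 1; i++ )        // outer loop stops one short of the end
{
    for( j = i + 1; j < n; j++ )    // inner loop runs to the end of the array
    {
        if( a[j] < a[i] )           // smaller value found: move it to position i (ascending order)
        {
            tmp = a[j];
            a[j] = a[i];
            a[i] = tmp;
        }
    }
}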
The outer loop runs to one less than the end, and the inner loop runs from one after the outer loop's index to the end of the array. After each outer pass, position i holds its final value, so the inner loop never revisits it. Also, this code does not compare adjacent entries only: it compares the array at the i and j indexes, which won't always be adjacent. The number of comparisons is still roughly N*N/2, so expect a modest speedup at best.
 
Comments
Rick York 11-May-19 21:25pm    
I did some testing with this algorithm. It was only slightly better than the OP's. I also found that std::sort was way, way better (when the array was a vector) and qsort was even better than that. Oh well.

Simple: change the algorithm.
That's a simple Bubble sort: it's not a "quick" algorithm by any means, and there are much faster versions: Sorting algorithm - Wikipedia[^]

But you can dramatically improve Bubble just by "remembering" where you last swapped and ignoring the already sorted areas, and by doing it bidirectionally, so that one pass moves the highest value to the end and the next pass moves the lowest to the beginning.
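A rough sketch of both refinements combined (often called cocktail shaker sort) might look like this. The function name `cocktail_sort` is my own, and it works on a `std::vector` rather than the raw array. It is still quadratic in the worst case, so treat it as an illustration of the idea rather than a fix for the timeout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Bidirectional bubble sort that remembers the position of the last swap
// in each direction and shrinks the unsorted region [lo, hi] accordingly.
void cocktail_sort(std::vector<int>& a) {
    if (a.size() < 2) return;
    std::size_t lo = 0, hi = a.size() - 1;
    while (lo < hi) {
        // Forward pass: pushes the largest remaining value towards the end.
        std::size_t last = lo;
        for (std::size_t j = lo; j < hi; ++j) {
            if (a[j] > a[j + 1]) {
                std::swap(a[j], a[j + 1]);
                last = j;
            }
        }
        hi = last;                 // everything after 'last' is in place
        if (lo >= hi) break;

        // Backward pass: pushes the smallest remaining value to the front.
        last = hi;
        for (std::size_t j = hi; j > lo; --j) {
            if (a[j - 1] > a[j]) {
                std::swap(a[j - 1], a[j]);
                last = j;
            }
        }
        lo = last;                 // everything before 'last' is in place
    }
    assert(std::is_sorted(a.begin(), a.end()));
}
```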

But changing the algorithm for a time efficient one will be the biggest improvement as data sizes increase.
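If the standard library is allowed, the least effort is probably to let it do the work. The sketch below uses `std::nth_element`, which only guarantees that the requested position holds the value it would hold after a full sort; the wrapper name `median_by_nth_element` is just for illustration. Swapping the call for `std::sort(v.begin(), v.end())` would also work and should be fast enough for 10001 values.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the element that would sit at index n/2 after sorting,
// matching the original program's a[n/2].
int median_by_nth_element(std::vector<int> v) {
    assert(!v.empty());
    auto mid = v.begin() + v.size() / 2;
    std::nth_element(v.begin(), mid, v.end());   // partial reordering only
    return *mid;
}
```

In the original program this would mean reading the input into a `std::vector<int>` and printing `median_by_nth_element(values)`.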
 
Comments
Richard MacCutchan 11-May-19 12:41pm    
Do you get the feeling there is a class of students all working on this problem?
OriginalGriff 11-May-19 14:24pm    
Not "working" on it, no ... :laugh:

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)