Matrix Median

Problem

Given a row wise sorted matrix of size R*C where R and C are always odd, find the median of the matrix. (link)

Example 1:

Input:
R = 3, C = 3
M = [[1, 3, 5], 
     [2, 6, 9], 
     [3, 6, 9]]
Output: 5
Explanation: Sorting matrix elements gives 
us {1,2,3,3,5,6,6,9,9}. Hence, 5 is median.

Example 2:

Input:
R = 3, C = 1
M = [[1], [2], [3]]
Output: 2
Explanation: Sorting matrix elements gives 
us {1,2,3}. Hence, 2 is median.

Your Task: You don't need to read input or print anything. Your task is to complete the function median() which takes the integers R and C along with the 2D matrix as input parameters and returns the median of the matrix.

Solution

Brute Force Approach

  • Store all the matrix elements into a 1D array.

  • Sort the 1D array.

  • Return the index (m*n)/2 value. (Median)

This approach is akin to applying binary search on potential answers. First, we find out the answer range. Our answer will lie in between the minimum and maximum elements present in the matrix. Since each row of the matrix is sorted we can get the minimum in first column and maximum in last column. Now, we apply binary search on the answer range [min, max]. What we should do now where will our answer lies?

Since median is the mid element of the all the elements when we put them in sorted order, and number of rows (m) and number of columns (n) are odd, then m*n is also odd. So, to the left and right of the median, there will be exactly (m*n)/2.

So, if we calculate the number of elements for which median is greater, it is (m*n)/2. So, the frequency of median is greater than (m*n)/2 (Including the median). So, we calculate the frequency of each potential answer and compare it with the required frequency, which is (m*n)/2.

Initially, low = min and high = max. We move the low and high based on the following conditions

  • low = mid + 1

    • Move low to right when frequency(mid) is less than or equal to requried frequency.

    • We move low to right because we want a frequency which is greater than the required frequency.

  • high = mid - 1:

    • Move high to the left when frequency(mid) is greater than required.

    • We move high to left because we want a frequency which is near to (m*n)/2 but just greater than required.

To determine the frequency of a potential answer, we perform the following steps:

  • We iterate through each row and identify the upper bound of the potential answer.

  • Since each row is sorted, the number of elements greater than the potential answer corresponds to the index of the upper bound.

Does our binary search algorithm return a value that doesn’t exist in the matrix?

No, it doesn’t.

Consider this scenario: we have a 1D array derived from a matrix with elements like 1 1 1 3 3 3 9 9 9. Here’s the frequency of each number:

  • 1: frequency - 3

  • 2: frequency - 3

  • 3: frequency - 6

  • 4: frequency - 6

  • 5: frequency - 6

  • 6: frequency - 6

  • 7: frequency - 6

  • 8: frequency - 6

  • 9: frequency - 9

Now, let’s perform a binary search. We start with low = 1 and high = 9. The mid value is 5, which has a frequency of 6. So, we move high to mid - 1 = 4.

Next, with low = 1 and high = 4, the mid value is 4, which also has a frequency of 6. So, we move high to mid - 1 = 3.

Then, with low = 1 and high = 3, the mid value is 2, which has a frequency of 3. So, we move low to mid + 1 = 3.

Now, with low = 3 and high = 3, the mid value is 3, which has a frequency of 6. So, we move high to mid - 1 = 2.

At this point, low = 3 remains constant and the loop breaks.

The key takeaway here is that low always points to an element that exists in the list. It will always point to the first element in the range that has the same frequency as all the other elements in that range. In our case, the numbers 3, 4, 5, 6, 7 all have a frequency of 6, but low ends up pointing to the first one (3), which is present in the list. Therefore, low cannot point to a number that is not present in the list.

LowHighMidFrequency of MidAction
1956Move high to mid - 1 = 4
1423Move low to mid + 1 = 3
3436Move high to mid - 1 = 2
32--Loop breaks, low remains at 3

In this process, low always points to an element that exists in the list. It will always point to the first element in the range that has the same frequency as all the other elements in that range. Therefore, low cannot point to a number that is not present in the list.

class Solution {
    int median(int matrix[][], int R, int C) {

        int m = matrix.length;
        int n = matrix[0].length;

        int low = getMinOfCol(matrix, 0);
        int high = getMaxOfCol(matrix, n-1);

        int req = (m*n)/2;

        while(low<=high){
            int mid = (low+high)/2;

            int count = getFreqOfElementsLessThan(matrix,mid);

            if(count<=req){
                low = mid+1;
            }
            else{
                high = mid-1;
            }
        }
        return low;
    }

    private int getFreqOfElementsLessThan(int[][] matrix, int mid){
        int m = matrix.length;
        int count = 0;
        for(int i=0; i<m; i++){
            count+=upperBound(matrix[i], mid);
        }
        return count;
    }

    private int upperBound(int[] row, int x){
        int low = 0;
        int high = row.length-1;

        while(low<=high){
            int mid = (low+high)/2;

            if(row[mid]<=x){
                low = mid+1;
            }
            else{
                high = mid-1;
            }
        }
        return low;
    }

    private int getMinOfCol(int[][] matrix, int col){
        int min = -1;

        for(int i=0; i<matrix.length; i++){
            min = Math.min(matrix[i][col], min);
        }

        return min;
    }

    private int getMaxOfCol(int[][] matrix, int col){
        int max = -1;

        for(int i=0; i<matrix.length; i++){
            max = Math.max(matrix[i][col], max);
        }

        return max;
    }
}