Python Data Structure and Algorithm Tutorial – Bucket Sort Algorithm

Bucket Sort Algorithm

 

In this tutorial, you will learn how bucket sort works. Also, you will find working examples of bucket sort in Python.

Bucket Sort is a sorting technique that sorts the elements by first dividing the elements into several groups called buckets. The elements inside each bucket are sorted using any of the suitable sorting algorithms or recursively calling the same algorithm.

Several buckets are created. Each bucket is filled with a specific range of elements. The elements inside the bucket are sorted using any other algorithm. Finally, the elements of the bucket are gathered to get the sorted array.

The process of bucket sort can be understood as a scatter-gather approach. The elements are first scattered into buckets then the elements of buckets are sorted. Finally, the elements are gathered in order.

Bucket Sort Working
Working of Bucket Sort

How Bucket Sort Works?

  1. Suppose, the input array is:
    Bucket Sort Steps
    Input array

    Create an array of size 10. Each slot of this array is used as a bucket for storing elements.

    Bucket Sort Steps
    Array in which each position is a bucket
  2. Insert elements into the buckets from the array. The elements are inserted according to the range of the bucket.In our example code, we have buckets each of ranges from 0 to 1, 1 to 2, 2 to 3,…… (n-1) to n.
    Suppose, an input element is .23 is taken. It is multiplied by size = 10 (ie. .23*10=2.3). Then, it is converted into an integer (ie. 2.3≈2). Finally, .23 is inserted into bucket-2.

    Bucket Sort Steps
    Insert elements into the buckets from the array

    Similarly, .25 is also inserted into the same bucket. Everytime, the floor value of the floating point number is taken.

    If we take integer numbers as input, we have to divide it by the interval (10 here) to get the floor value.

    Similarly, other elements are inserted into their respective buckets.

    Bucket Sort Steps
    Insert all the elements into the buckets from the array
  3. The elements of each bucket are sorted using any of the stable sorting algorithms. Here, we have used quicksort (inbuilt function).
    Bucket Sort Steps
    Sort the elements in each bucket
  4. The elements from each bucket are gathered.It is done by iterating through the bucket and inserting an individual element into the original array in each cycle. The element from the bucket is erased once it is copied into the original array.
    Bucket Sort Steps
    Gather elements from each bucket

Bucket Sort Algorithm

bucketSort()
  create N buckets each of which can hold a range of values
  for all the buckets
    initialize each bucket with 0 values
  for all the buckets
    put elements into buckets matching the range
  for all the buckets 
    sort elements in each bucket
  gather elements from each bucket
end bucketSort

Python Examples

/* Bucket Sort in Python */

def bucketSort(array):
    bucket = []

    /* Create empty buckets */
    for i in range(len(array)):
        bucket.append([])

    /* Insert elements into their respective buckets */
    for j in array:
        index_b = int(10 * j)
        bucket[index_b].append(j)

    /* Sort the elements of each bucket */
    for i in range(len(array)):
        bucket[i] = sorted(bucket[i])

    /* Get the sorted elements */
    k = 0
    for i in range(len(array)):
        for j in range(len(bucket[i])):
            array[k] = bucket[i][j]
            k += 1
    return array


array = [.42, .32, .33, .52, .37, .47, .51]
print("Sorted Array in descending order is")
print(bucketSort(array))

Complexity

  • Worst Case Complexity: O(n2)
    When there are elements of close range in the array, they are likely to be placed in the same bucket. This may result in some buckets having more number of elements than others.
    It makes the complexity depend on the sorting algorithm used to sort the elements of the bucket.
    The complexity becomes even worse when the elements are in reverse order. If insertion sort is used to sort elements of the bucket, then the time complexity becomes O(n2).
  • Best Case Complexity: O(n+k)
    It occurs when the elements are uniformly distributed in the buckets with a nearly equal number of elements in each bucket.
    The complexity becomes even better if the elements inside the buckets are already sorted.
    If insertion sort is used to sort elements of a bucket then the overall complexity in the best case will be linear ie. O(n+k)O(n) is the complexity for making the buckets and O(k) is the complexity for sorting the elements of the bucket using algorithms having linear time complexity at the best case.
  • Average Case Complexity: O(n)
    It occurs when the elements are distributed randomly in the array. Even if the elements are not distributed uniformly, bucket sort runs in linear time. It holds true until the sum of the squares of the bucket sizes is linear in the total number of elements.

 


Bucket Sort Applications

Bucket sort is used when:

  • input is uniformly distributed over a range.
  • there are floating point values

 

Python Example for Beginners

Two Machine Learning Fields

There are two sides to machine learning:

  • Practical Machine Learning:This is about querying databases, cleaning data, writing scripts to transform data and gluing algorithm and libraries together and writing custom code to squeeze reliable answers from data to satisfy difficult and ill defined questions. It’s the mess of reality.
  • Theoretical Machine Learning: This is about math and abstraction and idealized scenarios and limits and beauty and informing what is possible. It is a whole lot neater and cleaner and removed from the mess of reality.

Data Science Resources: Data Science Recipes and Applied Machine Learning Recipes

Introduction to Applied Machine Learning & Data Science for Beginners, Business Analysts, Students, Researchers and Freelancers with Python & R Codes @ Western Australian Center for Applied Machine Learning & Data Science (WACAMLDS) !!!

Latest end-to-end Learn by Coding Recipes in Project-Based Learning:

Applied Statistics with R for Beginners and Business Professionals

Data Science and Machine Learning Projects in Python: Tabular Data Analytics

Data Science and Machine Learning Projects in R: Tabular Data Analytics

Python Machine Learning & Data Science Recipes: Learn by Coding

R Machine Learning & Data Science Recipes: Learn by Coding

Comparing Different Machine Learning Algorithms in Python for Classification (FREE)

Disclaimer: The information and code presented within this recipe/tutorial is only for educational and coaching purposes for beginners and developers. Anyone can practice and apply the recipe/tutorial presented here, but the reader is taking full responsibility for his/her actions. The author (content curator) of this recipe (code / program) has made every effort to ensure the accuracy of the information was correct at time of publication. The author (content curator) does not assume and hereby disclaims any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from accident, negligence, or any other cause. The information presented here could also be found in public knowledge domains.  

Google –> SETScholars