How to remove duplicate strings or ints from a slice in Go

Tags: arrays, slice, go

Let's say I have a list of student cities whose size could be 100 or 1000, and I want to filter out all duplicate cities.

I want a generic solution that I can use to remove all duplicate strings from any slice.

I am new to Go, so I tried to do it by looping over the slice and checking whether each element already exists using a second loop.

Students' Cities List (Data):

studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

Functions that I created, which do the job:

func contains(s []string, e string) bool {
    for _, a := range s {
        if a == e {
            return true
        }
    }
    return false
}

func removeDuplicates(strList []string) []string {
    list := []string{}
    for _, item := range strList {
        // Keep only the first occurrence of each item.
        if !contains(list, item) {
            list = append(list, item)
        }
    }
    return list
}

Testing my solution:

func main() {
    studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

    uniqueStudentsCities := removeDuplicates(studentsCities)
    
    fmt.Println(uniqueStudentsCities) // Expected output [Mumbai Delhi Ahmedabad Bangalore Kolkata Pune]
}

I believe the solution above is not optimal, because contains rescans the result slice for every element. What is the fastest way to remove duplicates from a slice?

I checked StackOverflow and could not find this question asked yet, so I didn't find any solution.

Riyaz Khan asked Mar 15 '21 18:03


4 Answers

I found Burak's and Fazlan's solutions helpful. Based on them, I implemented simple functions that remove duplicate data from slices of strings, integers, or other types using a generic approach.

Here are my three functions: the first is generic, the second is for slices of strings, and the last is for slices of integers. You pass in your data and get back all the unique values, in their original order.

Generic solution (Go 1.18+):

func removeDuplicate[T string | int](sliceList []T) []T {
    // allKeys records every value that has already been kept.
    allKeys := make(map[T]bool)
    list := []T{}
    for _, item := range sliceList {
        // seen is false the first time a value is encountered.
        if _, seen := allKeys[item]; !seen {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}

To remove duplicate strings from a slice:

func removeDuplicateStr(strSlice []string) []string {
    allKeys := make(map[string]bool)
    list := []string{}
    for _, item := range strSlice {
        if _, seen := allKeys[item]; !seen {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}

To remove duplicate integers from a slice:

func removeDuplicateInt(intSlice []int) []int {
    allKeys := make(map[int]bool)
    list := []int{}
    for _, item := range intSlice {
        if _, seen := allKeys[item]; !seen {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}

For the generic version, you can extend the type constraint to cover other element types, and it will filter out duplicates in the same way.

Here is the GoPlayground link: https://go.dev/play/p/iyb97KcftMa
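
As a possible variation on the generic function above, the constraint can be widened to the built-in comparable, since anything usable as a map key would work. The sketch below assumes Go 1.18 or later; the name uniqueOrdered is hypothetical and not part of the original answer. It also uses map[T]struct{} as the set, which is a common idiom rather than a requirement.

```go
package main

import "fmt"

// uniqueOrdered is a hypothetical helper name, not taken from the answer.
// It returns a new slice holding the first occurrence of each element,
// in the original order. Any comparable type can serve as the map key.
func uniqueOrdered[T comparable](items []T) []T {
	seen := make(map[T]struct{}, len(items))
	result := make([]T, 0, len(items))
	for _, item := range items {
		if _, ok := seen[item]; ok {
			continue // already kept an earlier occurrence
		}
		seen[item] = struct{}{}
		result = append(result, item)
	}
	return result
}

func main() {
	cities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
	fmt.Println(uniqueOrdered(cities))               // [Mumbai Delhi Ahmedabad Bangalore Kolkata Pune]
	fmt.Println(uniqueOrdered([]int{3, 1, 3, 2, 1})) // [3 1 2]
}
```

Because the input slice is only read, the caller's data is left unchanged.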

Riyaz Khan answered Oct 16 '22 15:10


You can deduplicate in place, guided by a map:

processed := map[string]struct{}{}
w := 0
for _, s := range cities {
    if _, exists := processed[s]; !exists {
        // If this city has not been seen yet, add it to the list
        processed[s] = struct{}{}
        cities[w] = s
        w++
    }
}
cities = cities[:w]
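
One thing worth keeping in mind with the in-place approach is that it overwrites the backing array of the input. The sketch below wraps the same loop in a function (dedupeInPlace is a hypothetical name chosen for illustration) and prints the caller's slice afterwards to show the effect.

```go
package main

import "fmt"

// dedupeInPlace is a hypothetical wrapper around the loop above.
// It reuses the backing array of cities, so the caller's original
// slice is overwritten.
func dedupeInPlace(cities []string) []string {
	processed := map[string]struct{}{}
	w := 0 // next write position
	for _, s := range cities {
		if _, exists := processed[s]; !exists {
			processed[s] = struct{}{}
			cities[w] = s
			w++
		}
	}
	return cities[:w]
}

func main() {
	original := []string{"Mumbai", "Delhi", "Mumbai", "Pune"}
	unique := dedupeInPlace(original)
	fmt.Println(unique)   // [Mumbai Delhi Pune]
	fmt.Println(original) // [Mumbai Delhi Pune Pune]
}
```

If the original contents are still needed, copying the slice before deduplicating avoids this side effect, at the cost of an extra allocation.
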
Burak Serdar answered Oct 16 '22 15:10


Adding this answer, which worked for me. Note that it sorts the slice in place, so the original order of the elements is not preserved.

func removeDuplicateStrings(s []string) []string {
    if len(s) < 1 {
        return s
    }

    sort.Strings(s)
    prev := 1
    for curr := 1; curr < len(s); curr++ {
        if s[curr-1] != s[curr] {
            s[prev] = s[curr]
            prev++
        }
    }

    return s[:prev]
}

For fun, I tried using generics! (Go 1.18+ only)

type SliceType interface {
    ~string | ~int | ~float64 // add more ordered types as needed (they must support <)
}

func removeDuplicates[T SliceType](s []T) []T {
    if len(s) < 1 {
        return s
    }

    // sort
    sort.SliceStable(s, func(i, j int) bool {
        return s[i] < s[j]
    })

    prev := 1
    for curr := 1; curr < len(s); curr++ {
        if s[curr-1] != s[curr] {
            s[prev] = s[curr]
            prev++
        }
    }

    return s[:prev]
}

Go Playground Link with tests: https://go.dev/play/p/bw1PP1osJJQ
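
On newer toolchains the same sort-then-compact idea can probably be expressed with the standard library alone. The sketch below assumes Go 1.21 or later, where the slices and cmp packages are available; sortedUnique is a hypothetical name, and the extra slices.Clone call is my own addition so that the caller's slice is not reordered.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// sortedUnique is a hypothetical helper name. It copies the input so the
// caller's slice is left untouched, sorts the copy, and then drops
// adjacent duplicates with slices.Compact.
func sortedUnique[T cmp.Ordered](s []T) []T {
	out := slices.Clone(s)
	slices.Sort(out)
	return slices.Compact(out)
}

func main() {
	cities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
	fmt.Println(sortedUnique(cities)) // [Ahmedabad Bangalore Delhi Kolkata Mumbai Pune]
}
```

As with the hand-written version, the result comes back sorted rather than in first-seen order.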

snassr answered Oct 16 '22 17:10


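Simple to understand. Note that map iteration order is random in Go, so the result does not keep the original order of the elements.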

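func RemoveDuplicate(array []string) []string {
    // Use the map as a set: each distinct string becomes one key.
    m := make(map[string]struct{})
    for _, x := range array {
        m[x] = struct{}{}
    }
    // Collect the keys; their order is not deterministic.
    var cleared []string
    for x := range m {
        cleared = append(cleared, x)
    }
    return cleared
}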
Spoofed answered Oct 16 '22 16:10