Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Remove All but First Occurrence of a Character in a List of Strings

I have a list of names, and I need to output a single string that shows the letters from the names in the order they appear without the duplicates (e.g. If the list is ["John"; "James"; "Jack"], the output string should be Johnamesck). I've got a solution (folding all the names into a string then parse), but I feel like I'm cheesing it a bit by making my string mutable.

I also want to state this is not a school assignment, just an exercise from a work colleague as I'm coming into F# from only ever knowing Java Web stuff.

Here is my working solution (for insight purposes):

let lower = ['a' .. 'z']
let upper = ['A' .. 'Z']
let mutable concatedNames = ["John"; "James"; "Jack"] |> List.fold (+) ""

let greaterThanOne (length : int) = length > 1
let stripChars (str : string) letter =
    let parts = str.Split([| letter |])
    match greaterThanOne (Array.length parts) with
    | true -> seq {
                    yield Array.head parts
                    yield string letter
                    yield! Array.tail parts
                  }
                  |> String.concat ""
    | _ -> str

let killAllButFirstLower = lower |> List.iter (fun letter -> concatedNames <- (stripChars concatedNames letter))
let killAllButFirstUpper = upper |> List.iter ( fun letter -> concatedNames <- (stripChars concatedNames letter))
printfn "All names with duplicate letters removed: %s" concatedNames

I originally wanted to do this explicitly with functions alone and had a solution previous to above

let lower = ['a' .. 'z']
let upper = ['A' .. 'Z']
:
:
:
let lowerStripped = [""]
let killLowerDuplicates = lower |> List.iter (fun letter -> 
                                        match lowerStripped.Length with
                                        | 1 ->
                                                (stripChars concatedNames letter)::lowerStripped |> ignore

                                        | _ ->  (stripChars (List.head lowerStripped) letter)::lowerStripped |> ignore

                                 )

let upperStripped = [List.head lowerStripped]
let killUpperDuplicates = lower |> List.iter ( fun letter -> (stripChars (List.head upperStripped) letter)::upperStripped |> ignore )
let strippedAll = List.head upperStripped
printfn "%s" strippedAll

But I couldn't get this working because I realized the consed lists weren't going anywhere (not to mention this is probably inefficient). The idea was that by doing it this way, once I parsed everything, the first element of the list would be the desired string.

I understand it may be strange asking a question I already have a solution to, but I feel like using mutable is just me not letting go of my Imperative habits (as I've read it should be rare to need to use it) and I want to more reinforce pure functional. So is there a better way to do this? Is the second solution a feasible route if I can somehow pipe the result somewhere?

like image 533
LumberSzquatch Avatar asked Jun 06 '18 19:06

LumberSzquatch


2 Answers

You can use Seq.distinct to remove duplicates and retain ordering, so you just need to convert the list of strings to a single string, which can be done with String.concat "":

let distinctChars s = s |> String.concat ""
                        |> Seq.distinct
                        |> Array.ofSeq
                        |> System.String

If you run distinctChars ["John"; "James"; "Jack"], you will get back:

"Johnamesck"
like image 162
Chad Gilbert Avatar answered Nov 15 '22 10:11

Chad Gilbert


This should do the trick:

let removeDuplicateCharacters strings =
    // Treat each string as a seq<char>, flattening them into one big seq<char> 
    let chars = strings |> Seq.collect id  // The id function (f(x) = x) is built in to F#
                                           // We use it here because we want to collect the characters themselves
    chars
    |> Seq.mapi (fun i c -> i,c) // Get the index of each character in the overall sequence
    |> Seq.choose (fun (i,c) ->  
        if i = (chars |> Seq.findIndex ((=) c))  // Is this character's index the same as the first occurence's index?
        then Some c                              // If so, return (Some c) so that `choose` will include it,
        else None)                               // Otherwise, return None so that `choose` will ignore it
    |> Seq.toArray // Convert the seq<char> into a char []
    |> System.String // Call the new String(char []) constructor with the choosen characters

Basically, we just treat the list of strings as one big sequence of characters, and choose the ones where the index in the overall sequence is the same as the index of the first occurrence of that character.

Running removeDuplicateCharacters ["John"; "James"; "Jack";] gives the expected output: "Johnamesck".

like image 41
Aaron M. Eshbach Avatar answered Nov 15 '22 11:11

Aaron M. Eshbach