# Ocaml Beginner: How do I check if there are duplicate elements in a list?

Hi! So I’m working on this assignment, essentially I’m having some trouble thinking about a way to go about this. The problem is to identify if a list has duplicate elements, and return true if it does. For example, [1;2;3;4] would be false, and [1;2;2;3] would return true. In java I would probably just take the first element then check if the rest are equal to the first element, and move on until the end of the list. How would I get started in OCaml?

My current code isnt much, but I know that for pattern matching, obviously I have the []->false, but to traverse the list I’m confused how to go about it

I would approach this problem by creating a secondary list(initially []) and check each head element against it and then build up the secondary list with the head elements… Basically something like this…

``````let dupExist lst =
let rec dupExistAux lst b =
match lst with
| [] -> false
| hd::tl ->
if (List.exists (fun x -> x = hd) b)
then
true
else
dupExistAux tl (hd::b) in
dupExistAux lst []
``````

What was I thinking(duh!)? Here’s a better way!

``````let rec exist elem lst =
match lst with
| [] -> false
| hd::tl -> elem = hd || exist elem tl

let rec dupExist lst =
match lst with
| [] -> false
| hd::tl -> (exist hd tl) || dupExist tl``````
1 Like

I would probably use a hash table here to internally keep track of items that have already been seen. For each element in the input list, add a key-value pair of `element, ()` to the hash table and simultaneously update a list length counter. At the end, check if the list length counter is different from the hash table length (which is O(1)). If they’re different, you have duplicate elements in the list.

Relevant functions here would be `List.iter`, `Hashtbl.create`, and `Hashtbl.replace`.

3 Likes

It might be nice to have a way to break on the first duplicate encountered. You could fold over the list `l`, building a list `u` of unique elements as you go. For each new element in `l` you will have to compare it to all elements in `u`; if it is equal to one of them, return `true`, otherwise add it to the list `u` and continue.

1 Like

That’s true. `List.exists` would probably be good for that, used in an imperative way with a hash table.

A variant of G4143 's solution with the List.exists function :

``````let rec dup_exist = function
| [] -> false
| hd::tl -> List.exists ((=) hd) tl || dup_exist tl``````
1 Like

We can also short-circuit a bit more on every iteration. In lists with upto two elements we can tell quite easily that there are no duplicates:

``````let dupExist = function
| [] | [_] -> false
| [a1; a2] -> a1 = a2
| list -> ...
``````

For larger lists we can do some logic to walk the list and check for duplicates.

2 Likes

However, hash tables mean mutation. Unless performance is critical, maps allow a pure functional solution.

Mutation doesn’t mean it’s not pure-functional. If the mutation is happening inside the scope of a single function call and not leaking out across function calls, it’s completely pure. The critical point is, what side effects are observable outside the scope of a function call.

Anyway, to me, using a hash table wasn’t really about performance but more about convenience. With maps, you need to instantiate a concrete map module using the type of map keys. For a potentially polymorphic input list, you don’t know the type of map keys. This can, probably, be overcome using a locally abstract type, but with a hash table you don’t need such tricks.

3 Likes

Your mileage may vary, but I always find that unless something really requires mutation, even inside a sealed box where it can’t leak out, I’m always happier without it. The more mutation in my code, the more mistakes I seem to make.

3 Likes