3

I was doing some coding on CodeSignal to refresh my knowledge.

I was asked this question:

Suppose you've got a list of words, let's say ['apple', 'banana', 'apple', 'mango', 'banana']. Each word could be repeated an arbitrary number of times. Think of this list as a conveyor belt in a space-age fruit factory. Now, your task is to identify the last unique fruit on the belt, i.e., the one that didn't repeat. If all the fruits are repeating, then there ain't any unique fruit, and your function should return an empty string ('').

Your function should take a list of strings (the conveyor belt of fruits) as input. Now, a string can be any word, not just a fruit name, and the list can have any number of strings. There could also be an edge case where the list has no strings at all (Empty conveyor belt, eh?). For output, your function should return the last unique string in the list or an empty string if there are not any of them.

To solve this task, you are expected to use sets. Sets are efficient for tracking seen and duplicate elements due to their fast membership testing capability.

and my solution is:

def find_unique_string(words):
    last_uniq = ''
    seen = set()
    for w in words:
        if last_uniq == w:
           last_uniq = ''
        if w not in seen:
           last_uniq = w
        seen.add(w)
    return last_uniq

and the test cases are:

print(find_unique_string(['apple', 'banana', 'apple', 'mango', 'banana']))  # It should print: 'mango'
print(find_unique_string(['hello', 'world', 'hello']))  # It should print: 'world'
print(find_unique_string(['hello', 'world', 'hello', 'world']))  # It should print: ''
print(find_unique_string([]))  # It should print: ''
print(find_unique_string(['apple', 'banana', 'apple', 'kiwi', 'banana', 'kiwi'])) # it should print ''

The CodeSignal AI doesn't accept my code as a solution and reports that it is not correct, but it can not provide any cases in which the result of my code is wrong.

Now my question:

Who is right? Me or CodeSignal AI?

If I am right, how can I prove to CodeSignal AI that I am right so I can pass the test and go to the next one?

If AI is right, can you give me a sample test case that breaks my code and doesn't generate the expected output?

Note:

I do not want you to write a solution which would have two sets, one for seen and one for duplicate as this is the result that AI is expecting (I think), but I want to know if my code is correct and if it is, how to prove that it is, or you have a test case to prove that my code is wrong.

0

8 Answers 8

3

I believe CodeSignal AI is correct in identifying an issue with your solution. Consider this test case: ["apple", "banana", "banana"]. Your code currently returns "", but the answer should be "apple".

1

Your code fails to find the correct last unique string.

The reason is that you overwrite last_uniq each time when it matches the last unique string, consider the test case ['hello', 'world', 'world'], when your loop iterating on the first world, it will consider that word be your last unique string. Then when it matches the second world, it will miss the first unique string hello.

More easy way to solve this problem is to use a map to record the element and its occurrence times. With python, you can use Counter to do that.

New contributor
Rookie is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct.
0

One option would be to do this in two steps. First, use Counter to find the counts of each list entry. Then, use a list comprehension to filter off any list element appearing more than once.

from collections import Counter

inp = ['apple', 'banana', 'apple', 'mango', 'banana']
counts = Counter(inp)
u_list = [item for item in inp if counts[item] == 1]
print(u_list[-1])  # mango
0

As others have commented there are counter examples for your proposed solution. I think the issue stems from the fact that your are using a set to track strings that have been seen. I believe in this case it is more sensible to use a set to track all the words that are unique. With this approach the worse case scenario is iterating over words twice. That being said, the best case scenario iterates over words once. I am not sure if there is a more optimal solution, most probably - optimisation isn't my forte.

0
def find_unique_string(words):
    array_input = words.copy()
    if len(words) == 0:
        return ""
    sets = set(words)
    for item in sets:
        array_input.remove(item)

    for w in words[::-1]:
        if w not in array_input:
            return w
    return ""

Main idea is that sets will remove duplicates, and when we proceed to start removing items on the original array, a unique value should be completely erased. The last for loop checks for that missing value in reverse and immediately ends when it finds one.

Hope this helps.

Tested on these tests:

['apple', 'banana', 'apple', 'mango', 'banana']
['hello', 'world', 'hello']'world'
['hello', 'world', 'hello', 'world']
[]
['apple', 'banana', 'apple', 'kiwi', 'banana', 'kiwi']
['apple', 'berry', 'berry', 'coke', 'coke']
['apple', 'berry', 'berry', 'coke', 'mango', 'coke']
['apple', 'berry', 'berry', 'berry', 'coke', 'mango', 'coke']
New contributor
lemon8de is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct.
0

order matters, frequency matters.

In your solution, even though you have a set, you are not keeping track if it was seen more than twice or whatever, I didn't check your code that much.

take a look at my solution that is O(n) without using anything but a dict.

def find_unique_string(str_list):
  print(str_list)

  buckets = {} # {word: (freq, first_time)}
  i = 0
  for word in str_list:
      if buckets.get(word):
          buckets[word] = (buckets[word][0] +1, buckets[word][1]) # increment freq but keep first time seen
      else:
          buckets[word] = (1, i)

      i += 1

  print(buckets)

  word = ('', 0) # (word, last_time)
  for key, value in buckets.items():
      # value = (freq, first_time)
      if (value[0] != 1):
          continue

      if (value[1] > word[1]):
          word = (key, word[1])

  return word[0]

print(find_unique_string(['apple', 'banana', 'apple', 'mango', 'banana']))  # It should print: 'mango'
print(find_unique_string(['hello', 'world', 'hello']))  # It should print: 'world'
print(find_unique_string(['hello', 'world', 'hello', 'world']))  # It should print: ''
print(find_unique_string([]))  # It should print: ''
print(find_unique_string(['apple', 'banana', 'apple', 'kiwi', 'banana', 'kiwi'])) # it should print ''

I have kept the prints so it is easy to follow. Let me know if you have any questions

0

From Version 3.7 of Python, Dictionaries are ordered. We can use this Fact to complete the Program in O(n) time.

ls = ['apple', 'banana', 'apple', 'mango', 'banana'] # Set this list as whatever you want
frequency = {}
for l in ls:
    frequency[l] = frequency.get(l, 0) + 1

uniqueFruit = ''
for key, value in frequency.items():
    if value == 1:
        uniqueFruit = key
print(uniqueFruit)
New contributor
Kalp Shah is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct.
0

If you wanted to do this without any supporting modules then:

ls = ['apple', 'banana', 'apple', 'mango', 'banana']

d = dict()

for w in ls:
    d[w] = d.get(w, 0) + 1

for k, v in reversed(d.items()):
    if v == 1:
        print(k)
        break
else:
    print("No unique values found")

This builds a dictionary where the keys are the words from the list and the associated values are a count of the number of occurrences of that word. (One would usually use Counter for this)

We can take advantage of the fact that the dictionary key order is retained in current Python. Therefore, if we examine the key/value pairs in reverse then the first value equal to 1 is what we're looking for.

As a general solution we need to allow for the fact that there may not be any unique words in the list.

1
  • Minor note but you don't need to use dict() as that's equivalent to {} (only empty sets need to be constructed with set()). Commented 7 hours ago

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Not the answer you're looking for? Browse other questions tagged or ask your own question.