Finding the last unique string in a list

Question

I was doing some coding on CodeSignal to refresh my knowledge.

I was asked this question:

Suppose you've got a list of words, let's say ['apple', 'banana', 'apple', 'mango', 'banana']. Each word could be repeated an arbitrary number of times. Think of this list as a conveyor belt in a space-age fruit factory. Now, your task is to identify the last unique fruit on the belt, i.e., the one that didn't repeat. If all the fruits are repeating, then there ain't any unique fruit, and your function should return an empty string ('').

Your function should take a list of strings (the conveyor belt of fruits) as input. Now, a string can be any word, not just a fruit name, and the list can have any number of strings. There could also be an edge case where the list has no strings at all (Empty conveyor belt, eh?). For output, your function should return the last unique string in the list or an empty string if there are not any of them.

To solve this task, you are expected to use sets. Sets are efficient for tracking seen and duplicate elements due to their fast membership testing capability.

and my solution is:

def find_unique_string(words):
    last_uniq = ''
    seen = set()
    for w in words:
        if last_uniq == w:
           last_uniq = ''
        if w not in seen:
           last_uniq = w
        seen.add(w)
    return last_uniq

and the test cases are:

print(find_unique_string(['apple', 'banana', 'apple', 'mango', 'banana']))  # It should print: 'mango'
print(find_unique_string(['hello', 'world', 'hello']))  # It should print: 'world'
print(find_unique_string(['hello', 'world', 'hello', 'world']))  # It should print: ''
print(find_unique_string([]))  # It should print: ''
print(find_unique_string(['apple', 'banana', 'apple', 'kiwi', 'banana', 'kiwi'])) # it should print ''

The CodeSignal AI doesn't accept my code as a solution and reports that it is not correct, but it can not provide any cases in which the result of my code is wrong.

Now my question:

Who is right? Me or CodeSignal AI?

If I am right, how can I prove to CodeSignal AI that I am right so I can pass the test and go to the next one?

If AI is right, can you give me a sample test case that breaks my code and doesn't generate the expected output?

Note:

I do not want you to write a solution which would have two sets, one for seen and one for duplicate as this is the result that AI is expecting (I think), but I want to know if my code is correct and if it is, how to prove that it is, or you have a test case to prove that my code is wrong.

RSIMB GO · Accepted Answer · 2024-12-14 02:54:59Z

3

I believe CodeSignal AI is correct in identifying an issue with your solution. Consider this test case: ["apple", "banana", "banana"]. Your code currently returns "", but the answer should be "apple".

answered yesterday

RSIMB GO

611 silver badge6 bronze badges

Add a comment |

Rookie · Accepted Answer · 2024-12-14 03:02:25Z

1

Your code fails to find the correct last unique string.

The reason is that you overwrite last_uniq each time when it matches the last unique string, consider the test case ['hello', 'world', 'world'], when your loop iterating on the first world, it will consider that word be your last unique string. Then when it matches the second world, it will miss the first unique string hello.

More easy way to solve this problem is to use a map to record the element and its occurrence times. With python, you can use Counter to do that.

answered yesterday

Rookie

191 bronze badge

New contributor

Add a comment |

Tim Biegeleisen · Accepted Answer · 2024-12-14 02:54:01Z

0

One option would be to do this in two steps. First, use Counter to find the counts of each list entry. Then, use a list comprehension to filter off any list element appearing more than once.

from collections import Counter

inp = ['apple', 'banana', 'apple', 'mango', 'banana']
counts = Counter(inp)
u_list = [item for item in inp if counts[item] == 1]
print(u_list[-1])  # mango

answered yesterday

Tim Biegeleisen

520k29 gold badges310 silver badges386 bronze badges

Add a comment |

Zeta-Squared · Accepted Answer · 2024-12-14 03:51:07Z

As others have commented there are counter examples for your proposed solution. I think the issue stems from the fact that your are using a set to track strings that have been seen. I believe in this case it is more sensible to use a set to track all the words that are unique. With this approach the worse case scenario is iterating over words twice. That being said, the best case scenario iterates over words once. I am not sure if there is a more optimal solution, most probably - optimisation isn't my forte.

lemon8de · Accepted Answer · 2024-12-14 04:07:31Z

def find_unique_string(words):
    array_input = words.copy()
    if len(words) == 0:
        return ""
    sets = set(words)
    for item in sets:
        array_input.remove(item)

    for w in words[::-1]:
        if w not in array_input:
            return w
    return ""

Main idea is that sets will remove duplicates, and when we proceed to start removing items on the original array, a unique value should be completely erased. The last for loop checks for that missing value in reverse and immediately ends when it finds one.

Hope this helps.

Tested on these tests:

['apple', 'banana', 'apple', 'mango', 'banana']
['hello', 'world', 'hello']'world'
['hello', 'world', 'hello', 'world']
[]
['apple', 'banana', 'apple', 'kiwi', 'banana', 'kiwi']
['apple', 'berry', 'berry', 'coke', 'coke']
['apple', 'berry', 'berry', 'coke', 'mango', 'coke']
['apple', 'berry', 'berry', 'berry', 'coke', 'mango', 'coke']

Vinicius Kamakura · Accepted Answer · 2024-12-14 04:09:25Z

order matters, frequency matters.

In your solution, even though you have a set, you are not keeping track if it was seen more than twice or whatever, I didn't check your code that much.

take a look at my solution that is O(n) without using anything but a dict.

def find_unique_string(str_list):
  print(str_list)

  buckets = {} # {word: (freq, first_time)}
  i = 0
  for word in str_list:
      if buckets.get(word):
          buckets[word] = (buckets[word][0] +1, buckets[word][1]) # increment freq but keep first time seen
      else:
          buckets[word] = (1, i)

      i += 1

  print(buckets)

  word = ('', 0) # (word, last_time)
  for key, value in buckets.items():
      # value = (freq, first_time)
      if (value[0] != 1):
          continue

      if (value[1] > word[1]):
          word = (key, word[1])

  return word[0]

print(find_unique_string(['apple', 'banana', 'apple', 'mango', 'banana']))  # It should print: 'mango'
print(find_unique_string(['hello', 'world', 'hello']))  # It should print: 'world'
print(find_unique_string(['hello', 'world', 'hello', 'world']))  # It should print: ''
print(find_unique_string([]))  # It should print: ''
print(find_unique_string(['apple', 'banana', 'apple', 'kiwi', 'banana', 'kiwi'])) # it should print ''

I have kept the prints so it is easy to follow. Let me know if you have any questions

Kalp Shah · Accepted Answer · 2024-12-14 06:33:54Z

0

From Version 3.7 of Python, Dictionaries are ordered. We can use this Fact to complete the Program in O(n) time.

ls = ['apple', 'banana', 'apple', 'mango', 'banana'] # Set this list as whatever you want
frequency = {}
for l in ls:
    frequency[l] = frequency.get(l, 0) + 1

uniqueFruit = ''
for key, value in frequency.items():
    if value == 1:
        uniqueFruit = key
print(uniqueFruit)

answered 22 hours ago

Kalp Shah

11

New contributor

Add a comment |

SIGHUP · Accepted Answer · 2024-12-14 10:05:51Z

If you wanted to do this without any supporting modules then:

ls = ['apple', 'banana', 'apple', 'mango', 'banana']

d = dict()

for w in ls:
    d[w] = d.get(w, 0) + 1

for k, v in reversed(d.items()):
    if v == 1:
        print(k)
        break
else:
    print("No unique values found")

This builds a dictionary where the keys are the words from the list and the associated values are a count of the number of occurrences of that word. (One would usually use Counter for this)

We can take advantage of the fact that the dictionary key order is retained in current Python. Therefore, if we examine the key/value pairs in reverse then the first value equal to 1 is what we're looking for.

As a general solution we need to allow for the fact that there may not be any unique words in the list.

Minor note but you don't need to use dict() as that's equivalent to {} (only empty sets need to be constructed with set()). — Unmitigated, Commented 7 hours ago

Collectives™ on Stack Overflow

Finding the last unique string in a list

Note:

8 Answers 8

Your Answer

Not the answer you're looking for? Browse other questions tagged
python
algorithm
or ask your own question.

Hot Network Questions

Collectives™ on Stack Overflow

Note:

8 Answers 8

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged pythonalgorithm or ask your own question.

Related

Not the answer you're looking for? Browse other questions tagged
python
algorithm
or ask your own question.