v2.3 MME Guidelines

Table of Contents

Purpose
Types of Content Comparisons
Glossary of Key Terms
Basic Outline of the 3-Step Rating Process
Disqualification Criteria
Individual Component Level Comparison
    Tips for Dealing with Component Based Labels
    Image to Image Comparison
    Common Cases for Image Comparisons
    Caption to Caption Comparison
    Common Cases for Captions
    URL to URL Comparison
    Common Cases for URLs
    Video to Video Comparison
    Common Cases for Videos
Overall Holistic Comparison
    Rating Process
    How to Use Link to Fact Check
    Labels for Holistic Comparison
    Common Cases for Holistic Matching
    Tips for Holistic Matching
Examples

Purpose
In this project you will perform pairwise content matching. For each job, you will be presented with two examples of content. You must review each content pair, think about the main point of each example, and then tell us whether they match one another.

Your job is to tell us the true relationship between each Source and Match Candidate. It is imperative that you follow the process closely so we can measure the precision of our systems and ultimately improve them.

Types of Content Comparisons


• Image to Image Comparison
• Caption to Caption Comparison
• URL to URL Comparison
• Video to Video Comparison

Glossary of Key Terms

Job: The whole process of reviewing whether a Match Candidate and Source Content match.
Source Content: The piece of content that has been rated by our Third-Party Fact Checkers.
Match Candidate: The piece of content that our systems predicted to be a match to the Source Content.
Claim: A statement of fact that can be supported or contradicted (or somewhere in between).
Central Claim: A statement of fact that is important to the content's main point or purpose.
Claim Under Review: The main claim that was investigated and rated by the fact checkers. This usually represents the "main point" of the Source Content.
Link to Fact Check: The link that serves as the source of truth for the claim under review and the central claim in the Source Content.

Basic Outline of the 3-Step Rating Process


Step 1
Determine if the job meets the criteria to be rated.
• If No, disqualify and move to the next job.
• If Yes, see below.

Step 2
Component Matching
• After making sure the job qualifies, go through each component, guided by the questions given in SRT, and decide on a label for each individual part.


Step 3
Holistic Matching (or Overall Comparison)
• Read through the Claim Under Review.
  o If the claim is vague or unclear, click on the link and read the headline of the fact check article. The headline is usually about the central claim of the Source Content. If it is still unclear, skim through the article until you have a good understanding of the fact-checked claim.
• Make sure to take all components into account and look at the content holistically (for example, consider how the caption, the image, and the overlaid text combined convey the intended meaning of the content).
• Determine whether the claim made in the Match Candidate matches the main claim being made in the Source Content.

Disqualification Criteria
If the job contains any of the following, then it should be disqualified:
• Missing Content
o The SRT asks for information on content not present in the result.

• More than one foreign-language word
  o Proper nouns are not considered foreign language (example: Cabo)
• Videos containing audio or subtitles in a language other than English
• Corrupted Content
  o You cannot access one or more of the URLs in the job as intended: they take you to an error page, log-in screen, paywall, or broken link (404, expired domain, server error). Note that this is different from 'missing content'.
• The Source Content is not related to the Claim Under Review
  o Quickly read the claim under review, click the link, skim the headline of the article, and skim over the first couple of sentences. If the Source Content is not about the same subject, it does not qualify.
• Issues involving SRT
  o Tooltip: Job submit issue or any other related SRT issues
Immediate Escalation: 
If the content contains imagery or text/voice indicating or soliciting Child Exploitation or Child Nudity,
escalate the Job ID immediately to your manager.
• Child Exploitive Imagery (CEI) refers to imagery (images, videos) depicting the sexual
exploitation of a child.

IMPORTANT: CEI should *never* be screenshotted or replicated in any way, as doing so only further exacerbates the issue. Always use the task/job number for issue identification.

If you are not comfortable reviewing the content on the webpage for any other reason, please escalate
the job ID to your manager via ticket and skip to the next job.

Once you have determined that the job meets the criteria to be rated (SRT Question 1),
you will then:
• Determine how well the content matches on a component level (SRT Question 2)
• Determine how well the content agrees holistically (SRT Question 3).

Think of these tasks as narrowly defined, standalone content-matching questions. After the individual
component-level review is complete, you will look at the bigger picture, consider the examples
holistically (or Overall Comparison), and tell us if they share the same meaning overall.

Individual Component Level Comparison
The instructions for component-level matching are different for each content type (image, caption, URL, and video). Read through the instructions below and use the SRT's tooltips while rating to refresh your memory when needed.

Tips for Dealing with Component Based Labels

Generally, the labels are as follows:
• Near Duplicates: There are either no differences, or if there are differences, they do not change
or have the potential to change the meaning of the component.
• Near Matches: The differences are greater than the criteria for the "Near Duplicates" label, but
the content still conveys the same overall message.

• Do Not Match: The Match Candidate and Source Content are either about different subject matters (and therefore they're unrelated) or they are about the same subject matter but clearly do not share the same meaning.
• Unsure: If you are unsure whether something should be labeled as Near Match or Do Not Match, then choose Unsure.

NOTE: There are differences that are repeated throughout these guidelines. For instance, Trivial Differences in Text are found in all of the content types. For the most part, these are the exact same criteria, with slight additions for photos and videos.
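
For illustration only, the ordering of these labels can be summarized as a small decision procedure. The sketch below is a hypothetical Python helper (not part of the SRT); the rater still makes every judgment, and the inputs are simplified yes/no answers to the criteria above.

```python
def component_label(same_subject: bool,
                    same_meaning: bool,
                    only_trivial_differences: bool,
                    unsure: bool = False) -> str:
    """Hypothetical helper mirroring the label order described above."""
    if unsure:
        # Cannot decide between Near Match and Do Not Match.
        return "Unsure"
    if not same_subject or not same_meaning:
        # Different subject matter, or same subject but a different meaning.
        return "Do Not Match"
    if only_trivial_differences:
        # No differences, or only differences that cannot change the meaning.
        return "Near Duplicates"
    # Same overall message, but the differences exceed the trivial criteria.
    return "Near Match"

# e.g. component_label(True, True, False) -> "Near Match"
```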

Image to Image Comparison
Once you have determined the job meets the criteria to be rated, you will be asked to determine how
well the images match on an individual component level.

Note: Do not consider the overall meaning of the Source or Match Candidate when completing these subtasks. Think of these tasks as narrowly defined, standalone content-matching questions. After the component-level review is complete, you will look at the bigger picture, consider the examples holistically, and tell us if they share the same meaning overall.

Decide if the Images match


When you are evaluating this part of the question, it is important to understand the difference between the text captions and text overlay. Do not consider the text overlay during the initial image comparison. Only compare the images. We will look at the text overlay in the second part of the evaluation.

1. Text Caption – Any text that is added to enhance the post. This can be placed above or below the
image and is not part of the image.
2. Text Overlay – Any text placed within the image box that acts as part of the image.

Based on your visual assessment of the images, determine the correct label from the list below:

Near Duplicates: These are identical or almost identical with the following trivial differences:
• cropping, tint, color, brightness
• screenshots
• rotations
• stretching
• padding
• pixelation
• Trivial Imagery: These are differences that include added or subtracted imagery that is trivial in
amount, such as watermarks, arrows or circles.

• Trivial difference to overlaid text: These are differences that include added or subtracted overlay text that is trivial in amount, such as:
o Different spellings or character substitutions (deliberate or not)
▪ Examples:
• ✅ The cat's fur really soft | The kats fur is very sofft
• ✅ The cat's fur really soft | Thè cätš für is vêrÿ søft
o The addition or subtraction of emojis for emphasis
o Videos with English audio and accurate English subtitles
  o Comments that don't change the meaning of the component
      ▪ Examples: "Wow", "Amazing!"


NOTE: These differences do not change the meaning of the component.
Near Match: The Match Candidate and Source Content are a near match when:
• The differences are greater than the criteria for Near Duplicates, but the components
share a similar message.
They Do Not Match: The Match Candidate and Source Content do not match when:
• They don’t refer to the same subject matter
• They make different claims
Unsure: If you are unsure whether something should be labeled as Near Match or Do Not Match, then
choose Unsure.

One or both images are missing: Select if either or both images are missing.

Common Cases for Image Comparisons


1. Photo within Albums: For an image to image comparison, an album (multiple photos) is matched to a single photo.
   a. How to handle: Focus on the most relevant image and use the criteria for image to image matches.
2. Link to Image: This is when a link appears instead of a photo, but based on the questions given in SRT, it is a photo comparison.
   a. How to handle: If the link takes you only to an image, use the criteria for labeling the image component.
      i. If the photos are an exact match when you paste the link into the browser, the photos are considered Near Duplicates.

Caption to Caption Comparison
Once you have determined the job meets the criteria to be rated, you will be asked to determine how
well the captions match on an individual component level.

Decide if the Captions match
Read through the text components for both the Source Content and Match Candidate and determine how well they match. For overly long texts, read through the first five paragraphs and skim the rest.
  o You may stop reading long texts when it becomes clear the components are unrelated/do not match.
  o When coming across a job where the Match Candidate does not have an image, but the Source Content does, compare the captions and ignore the single image.

Based on your assessment, determine the correct description from the list of choices below:

Near Duplicates: These are identical or almost identical with the following trivial differences:
• Trivial Differences in Variance: 10% variance in text (see the sketch after this list).
  o Example: 100 words of text with a candidate that has 10 words that are different.
    ▪ The captions can be the same length but have a 10% difference in overall text, or
    ▪ The captions can have a 10% difference in length.
• Trivial Differences in Formatting: These are differences that include:
o spacing or text formatting
o punctuation
o the addition or subtraction of citations or copyright claims
o linking strategies.
• Trivial Differences to text: These are differences that include:
o Different spellings
  o Character substitutions (deliberate or not)
      ▪ Examples:
        • ✅ The cat's fur really soft | The kats fur is very sofft
        • ✅ The cat's fur really soft | Thè cätš für is vêrÿ søft
  o The addition or subtraction of emojis for emphasis.
      ▪ Examples:
        • Adding an emoji after a joke.
        • Adding an emoji after a question.
NOTE: These differences do not change the meaning of the component.
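
As a rough aid for the 10% variance rule referenced above, the sketch below is a hypothetical helper (not part of the rating tool) that estimates the fraction of words that differ between two captions. It is only an approximation; you should still judge whether the differences change the meaning.

```python
from difflib import SequenceMatcher

def text_variance(source: str, candidate: str) -> float:
    """Approximate fraction of words that differ between two captions."""
    src_words = source.split()
    cand_words = candidate.split()
    matched = sum(block.size for block in
                  SequenceMatcher(None, src_words, cand_words).get_matching_blocks())
    longest = max(len(src_words), len(cand_words)) or 1
    return 1 - matched / longest

# A 100-word caption with roughly 10 changed words comes out near 0.10,
# i.e. at the edge of the trivial-variance threshold.
source = "the quick brown fox jumps over the lazy dog"
candidate = "the quick brown fox leaps over the lazy dog"
print(f"variance: {text_variance(source, candidate):.0%}")  # ~11% (1 of 9 words changed)
```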

Near Match: The Match Candidate and Source Content are a near match when:
• The differences are greater than the criteria for Near Duplicates, but the components
share a similar message.
They Do Not Match: The Match Candidate and Source Content do not match when:
• They don’t refer to the same subject matter
• They make different claims
Unsure: If you are unsure whether something should be labeled as Near Match or Do Not Match, then
choose Unsure.

One or both captions are missing: Select if either or both of the captions are missing.

Common Cases for Captions


1. Small remarks of agreement: This is when someone agrees with the post before or after it by adding a couple of words or sentences that add no additional meaning besides agreement (or disagreement, depending on the job).
   a. How to handle: This is usually less than a 10% difference and does not change the meaning of the component.
      i. According to the guidelines, this would be considered a Near Duplicate.
2. Copy and Paste: This is when the post author urges others to reshare by copying or pasting the
content.
   a. How to handle: This is usually less than a 10% difference and does not change the meaning of the component.
      i. According to the guidelines, this would be considered a Near Duplicate.
3. Attribution: This case is when someone replaces the person attributed to a quote or the person who is the subject of a story.
   a. How to handle: Make sure to read the Claim Under Review first, since it matters. If the claim is about person A saying something (person A's quote), but in the Match Candidate the quote is attributed to someone else, it would be considered Do Not Match. However, if the claim is about the content itself, the Match Candidate and the Source Content can be a Near Duplicate, depending on the difference.

URL to URL Comparison
Once you have determined the job meets the criteria to be rated, you will be asked to determine how
well the URL Links match on an individual component level.

Decide if the URLs Match
Compare the Match Candidate and Source Content; how would you describe the relationship between the two URL links?

1. Open each link in a new tab; then review the body of the destination page. The SRT links are not
currently clickable. You must manually copy and paste each URL into a new browser tab. Please take
great care to copy the entire URL string.
• Ignore advertisements, menu bars, comments, and related articles. Focus solely on the title and
body of the URL’s destination (blog post, article, etc.)
• If the focal point of the target page is a video (e.g. the URL is a YouTube link), watch the first 2 minutes and skim through the remainder.

2. Read through the first 5 paragraphs, then skim through the rest.
• You may stop reading long texts when it becomes clear the components are unrelated/do not
match.
3. Decide how well the Match Candidate matches the Source Content, using the labels outlined below.

Note: It is easy to accidentally paste the same link into both tabs; ensure you are comparing the two distinct URLs provided.

Near Duplicates: These are identical or almost identical with the following trivial differences:
• Trivial Differences in Variance: 10% variance in text.
  o Example: 100 words of text with a candidate that has 10 words that are different.
    ▪ The captions can be the same length but have a 10% difference in overall text, or
    ▪ The captions can have a 10% difference in length.
• Trivial Differences in Formatting: Differences that include:
o spacing or text formatting
o punctuation
o the addition or subtraction of citations or copyright claims
  o linking strategies (see the sketch after this list).
      ▪ Examples:
        • The link https://nyti.ms/3cXHiDa is equivalent to https://nytimes.com
        • Spelling out a link after a hypertext reference is trivial; you may ignore differences such as these:
          o Visit this link
          o Visit this link (http://nytimes.com)
• Trivial Differences to Text: Differences that include:
o Different spellings
o Character substitutions (deliberate or not)
▪ Examples:
• ✅ The cat's fur really soft | The kats fur is very sofft
• ✅ The cat's fur really soft | Thè cätš für is vêrÿ søft
o The addition or subtraction of emojis for emphasis.

▪ Examples:
• Adding an emoji after a joke.
• Adding an emoji after a question.
NOTE: These differences do not change the meaning of the component.
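
When judging whether a shortened link (such as the nyti.ms example above) and a full link point to the same place, the destination is what matters, not the raw string. The sketch below is illustrative only and assumes the third-party requests library; in practice you simply paste both links into a browser. Some sites do not answer HEAD requests or sit behind paywalls, so treat this as a rough check.

```python
from urllib.parse import urlparse

import requests  # third-party; used here only to follow redirects

def final_host(url: str) -> str:
    """Follow redirects (e.g. a nyti.ms short link) and return the destination host."""
    response = requests.head(url, allow_redirects=True, timeout=10)
    return urlparse(response.url).netloc.lower().removeprefix("www.")

def same_destination(url_a: str, url_b: str) -> bool:
    """True if both URLs ultimately resolve to the same site."""
    return final_host(url_a) == final_host(url_b)

# e.g. same_destination("https://nyti.ms/3cXHiDa", "https://www.nytimes.com/")
# compares the resolved hosts rather than the raw link strings.
```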

Near Match: The Match Candidate and Source Content are a near match when:

• The differences are greater than the criteria for Near Duplicates, but the components
share a similar message.

They Do Not Match: The Match Candidate and Source Content do not match when:
• They don’t refer to the same subject matter
• They make different claims

Unsure: If you are unsure whether something should be labeled as Near Match or Do Not Match, then choose Unsure.

Missing URLs: One or both of the URL components is missing.


Common Cases for URLs
1. Link to a video or page where there is another content type (for example, a link that takes you to an article that has a video).
   a. If the URL links return video content, rather than text content, then evaluate the job using the criteria outlined under the Video section.

Video to Video Comparison
Once you have determined the job meets the criteria to be rated, you will be asked to determine how
well the videos match on an individual component level.

Note: Do not consider the overall meaning of the Source or Match Candidate when completing these subtasks. Think of these tasks as narrowly defined, standalone content-matching questions. After the component-level review is complete, you will look at the bigger picture, consider the examples holistically, and tell us if they share the same meaning overall.

Decide if the videos match

Near Duplicates: These are identical or almost identical videos with the following trivial differences:
• Trivial differences in Formatting: These are differences that include:
  o cropping, tint, color, brightness
  o screenshots
  o rotation
  o stretching
  o padding
  o pixelation
• Trivial differences in Imagery: These are differences that include added or subtracted imagery
that is trivial in amount, such as watermarks, arrows or circles.
• Trivial difference to overlaid text: These are differences that include added or subtracted overlay text that is trivial in amount, such as:
o Different spellings or character substitutions (deliberate or not)
▪ Examples:
• ✅ The cat's fur really soft | The kats fur is very sofft
• ✅ The cat's fur really soft | Thè cätš für is vêrÿ søft
  o The addition or subtraction of emojis for emphasis
  o Accurate closed captioning (in English)
  o Comments that don't change the meaning of the component
      ▪ Examples: "Wow", "Amazing!"
• Trivial difference in Content or Length: Differences include (see the sketch after this list):
  o A difference in the length of the videos of either a couple of seconds or up to 10% of a difference in seconds, whichever is longer, between the candidate video and the source video.
      ▪ Example: a 3-minute video with a candidate that is ~18 seconds different
      ▪ Example: a 30-second video with a 36-second video
  o A 10% difference in content
      ▪ Example: The length of the video is the same, but 10% of it is different content
• Trivial Differences in Audio:
o The audio is compressed (playback is of slightly lower or higher quality)
o Has a different volume level
o Music has been changed, added, or removed in a way that does not affect meaning
o The audio track has been silenced without affecting meaning
o The audio tracks feature different speakers, but the original meaning is not affected
o The audio contains sound-effects that do not affect the meaning of the video

NOTE: These differences do not change the meaning of the component.
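
A minimal sketch of the length tolerance described above, assuming "a couple seconds" means roughly 2–3 seconds and taking the 10% relative to the source video (the guideline does not pin down either point precisely):

```python
def length_difference_is_trivial(source_seconds: float,
                                 candidate_seconds: float,
                                 couple_of_seconds: float = 3.0) -> bool:
    """True if the duration gap is within 'a couple seconds' or 10% of the
    source video's length, whichever is longer."""
    allowed = max(couple_of_seconds, 0.10 * source_seconds)
    return abs(source_seconds - candidate_seconds) <= allowed

# A 3-minute (180 s) source with a candidate ~18 seconds shorter is within 10%.
print(length_difference_is_trivial(180, 162))  # True
```

Even when the length difference is within this tolerance, the content that differs must itself be trivial; an edit that changes the meaning is not a Near Duplicate regardless of length.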

Near Match: The Match Candidate and Source Content are near matches when:
• The differences are greater than the criteria for Near Duplicates, but the
components share a similar message.

They Do Not Match: The Match Candidate and Source Content do not match when:

• They don’t refer to the same subject matter
• They make different claims
• Differences in Text or Formatting that change the meaning of the component

Unsure: If you are unsure whether something should be labeled as Near Match or Do Not Match, then
choose Unsure.

One or both videos are missing: Select if either or both videos are missing.

Common Cases for Videos
1. Differences in Length: Right off the bat, you can look at the length of both videos. A common
case in the queues is when two videos are similar but differ in length.
   a. How to handle: Referring to the guidelines, if the difference in content or length is up to 10% (or a couple of seconds, whichever is longer) and the meaning of the video does not change, the videos are considered Near Duplicates.
2. Edited Videos: Videos can appear to be identical or very similar, but they can also be edited to
mislead the viewers.
   a. How to handle: If the videos are edited in a manner that changes the meaning of the video with respect to the claim under review, it cannot be a Near Duplicate.
      i. Choose Do Not Match.

Overall Holistic Comparison
Looking at all of the components together, you will then decide how well the Match Candidate aligns
with the Source Content with respect to the “central claim” being expressed.

When you are evaluating this part of the question, it is important to understand the “central claim” and
“claim under review”.

Rating Process
1. Review the Claim Under Review
a. If the Claim Under Review is ambiguous, vague, or not well written, click the link to skim
the headline and the first couple of sentences of the Fact Check Article.
2. Identify the central claim of the Source Content.
   a. Recall: If the Source Content does not relate to the Claim Under Review, disqualify the job.
3. Make sure to take all components into account and look at the content holistically (for example, consider how the caption, the image, and the overlaid text combined convey the intended meaning of the content).
4. Determine if the claim made in the Match Candidate matches the main claim being made in the
Source Content.
How to Use Link to Fact Check
The link to the Fact Check Article is the source of truth for the associated Claim Under Review or the
Central Claim within the Source Content. The link to the Fact Check Article should first be used to
determine if it is related to the Source Content and the Claim Under Review. Afterwards, it should be
used if you need further clarification about the Claim Under Review while determining if the Source
Content and Match Candidate are holistic matches.

Labels for Holistic Comparison


• Agrees: The Match Candidate agrees with the Source Content's Central Claim.
• Does Not Agree: The Match Candidate and Source Content refer to the same subject matter, but they disagree with respect to the central claim. In these cases, the Source Content and Match Candidate are mutually exclusive: i.e. they cannot both be true.
• Unrelated: The main subject or main claims of the Match Candidate and Source Content are not the same.
• Unsure: The Match Candidate does not take an obvious position on the Claim Under Review, or too much information or context has been taken away or added.
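
As with the component labels, the holistic labels can be read as a small decision procedure. The sketch below is a hypothetical restatement of the list above (not part of the SRT); the rater still weighs all components together.

```python
def holistic_label(same_subject: bool,
                   agrees_with_central_claim: bool,
                   position_is_clear: bool = True) -> str:
    """Hypothetical restatement of the holistic labels described above."""
    if not same_subject:
        # The main subject or main claims are not the same.
        return "Unrelated"
    if not position_is_clear:
        # No obvious position on the Claim Under Review, or too much
        # context has been added or removed.
        return "Unsure"
    # Same subject: the contents either agree on the central claim or are
    # mutually exclusive with respect to it.
    return "Agrees" if agrees_with_central_claim else "Does Not Agree"
```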

Common Cases for Holistic Matching
1. Debunking: This is when a Match Candidate debunks the claim made in the Source Content or vice versa.
   a. How to handle: If the Match Candidate debunks the Source Content's central claim, it therefore has a different meaning and should be labeled as Do Not Match.
2. Ambiguous Meaning:

a. How to handle: If it is unclear whether the Match Candidate agrees with the Source
Content or vice versa, choose Unsure.
3. Submit Button Issues:
a. How to handle: If this happens, go back to the first question and disqualify the job by
choosing Issues involving SRT.

Tips for Holistic Matching:
• Two posts may appear to be identical, but with the addition of one or a couple of words, they
have completely different meanings.
o Example:
▪ Caption A: “you can cure covid in the following ways: taking magnesium,
drinking apple juice...”
▪ Caption B: “The following is FALSE.... you can cure covid in the following ways:
taking magnesium, drinking apple juice...”
• Be sure that you understand the Claim Under Review.
  o NOTE: If the Claim Under Review as written is vague, open the link under the Claim
Under Review and find out what part of the Source Content the article is focused on.

Examples
1) Trivial Differences in Formatting: Text


Spacing and punctuation: This is an example where there is a 10% variance in text; however, the added sentence does not change the meaning of the text/component.

The different spacing and punctuation would be considered a trivial difference (and not change the
meaning of the component).

Although the Match Candidate creates new paragraphs and does not have quotation marks, the
meaning of the text stays the same.

2) Trivial Differences in Formatting: Text


According to the guidelines, if the difference in length is up to a 10% difference or a couple of words
(whichever is longer), and the meaning remains the same, it is considered Near Duplicate.

In this example, the additional sentence at the end of the Match Candidate does not make up more than 10% of the total text, and it does not change the meaning of the text. Therefore, it would be considered a Near Duplicate.

3) Multiple Differences in Text


Spacing and punctuation: The spacing and punctuation are different, but the differences are trivial since they don't affect the meaning of the text.

Difference in Length: The additional sentences at the beginning of the Match Candidate do not change the meaning.

Attribution: The quotation marks are not correctly used in the Match Candidate, but it's still clear the statement is attributed to Trey Gowdy. Therefore, both of the captions are attributed to the same person.

Copy and Paste: The post author adds "Copy and paste if you dare" at the end. This does not affect the meaning.

Label: Near Duplicate


Differences for Images
4) Identical Images


Exact same: these images are exactly the same or near-exactly the same upon observation.

Component Label: Near Duplicate



5) Identical Images

Exact same: these images are exactly the same or near-exactly the same upon observation.
Component Label: Near Duplicate

6) Trivial Differences in Formatting: Image



Trivial formatting: this is trivial because the full text is included in both images and any cropping does
not change the image’s meaning. There are no substantive changes as cropping the bottom only
removes whitespace and does not change the user’s understanding of the content.

Component Label: Near Duplicate

7) Trivial Differences in Formatting: Image

Trivial formatting: Buffer or whitespace added that does not change the meaning or substance of the
image.

Component Label: Near Duplicate



8) Trivial Differences in Formatting: Image



Trivial overlay/text: the overlay does not make a substantive change to the meaning of the image. The added overlays are related to the original text and do not change the user's understanding. If the overlays were a politician's face, a company slogan, etc., then the meaning would be changed.

Component Label: Near Duplicate

9) Trivial Differences in Formatting: Image


Trivial formatting: Screenshots of content that do not show additional comments that change the meaning of the content are considered Near Duplicates (if they are the same picture with trivial differences like cropping or other differences described in the Near Duplicate criteria).

Component Label: Near Duplicate

10) Trivial Differences in Formatting: Images


Trivial formatting: Screenshots of content that do not show additional comments that change the
meaning of the content are considered Near Duplicates (if they are the same picture with trivial
differences like cropping or other differences described in the Near Duplicate criteria).

Component Label: Near Duplicate

11) Trivial Differences in Formatting: Images

Trivial formatting: cropping and trivial overlay that don’t change the meaning or message of the image.
The reshare text (generic account) does not indicate endorsement or comment by a person of interest that would change the image's message.

Component Label: Near Duplicate


12) Substantive Differences in Formatting: Image

Substantive Overlay/Text: If we ignore the fact that the text is non-English, the overlays and text on the
image add a value judgment and change the meaning of the photo. This particular example is a quasi-
political statement added to the original image.

Component Label: Do Not Match

13) Substantive Differences in Formatting: Image

Substantive formatting: the cropping removes key elements of the image that change its meaning.
Cropping to remove an individual that is key to the context (here holding the chain) influences user
perception and does not qualify as a match.
(The watermark alone would not have changed this rating; the cropping is what changes the meaning.)

Component Label: Do Not Match

14) Substantive Differences in Formatting: Image



Substantive formatting: The cropping changes the image's meaning by removing context and words integral to the message ("Corona virus"). The user would not have the same understanding when key parts of the message are removed.

Component Label: Do Not Match

15) Common Cases for Captions: Small Remarks of Agreement


Small remarks of agreement: This is when someone agrees with the post before or after it by adding a few words or sentences (usually less than a 10% difference in length) that should NOT change the meaning.
• Example: In this case, the first couple of sentences give an additional sense of agreement to the caption that follows but do not change or add to the meaning.

Label: Near Duplicate

16) Common Cases for Captions: Copy and Paste


Copy and Paste:


This is when the post author urges others to reshare by copying or pasting the content.
According to the guidelines, this would be considered a Near Duplicate.

17) Common Cases for Images: Link to Image


Link to Image: This is when a link appears instead of a photo, but by looking at the questions given in
SRT, it’s a photo comparison.

Given that the photos are the same photos with only trivial differences (cropping) when you paste the link into the browser, the photos are considered Near Duplicates.

18) Common Cases for Images: Photo to Album (Multiple photos) Comparison

[Example screenshots: an album of multiple photos compared to a single photo; see "Photo within Albums" above.]
