Image Vision (AI) models and features

Implio comes with a wide range of Image Vision models carefully crafted by our team of Data Scientists and moderation experts. These models help you make sense of user-generated images and moderate them based on what they contain.

This page describes all available models and how to use corresponding predictions.

Note that these models need to be enabled before they can be used. To do so, please contact our support team.

Types of model

There are two types of models: classification and detection.

  • Classification models yield predictions on the entire image.
  • Detection models, on the other hand, detect individual objects/elements within the image.

Detection models come with additional features – see the Detection models features section below for more information on how to use their unique features.

List of available models

The following table lists all currently available models, organized by topic.

Each model may yield one or several tags per image. The last column explains how each tag should be interpreted:

| Topic | Model | Type | Tags | How to interpret tags |
| --- | --- | --- | --- | --- |
| Nudity | Nudity | Classification | nudity | Image contains full-on nudity. |
| | Suggestive | Classification | suggestive | Image is suggestive/racy/sexy. |
| | Sex toy | Classification | sex_toy | Image contains a sex toy. |
| Vulgarity | Middle finger | Classification | middle_finger | Image contains a hand with its middle finger raised. |
| | Tongue out | Classification | tongue_out | Image contains a person with their tongue out. |
| Violence & extremism | Gore | Classification | gore | Image contains elements of gore or violence. |
| | Weapon | Classification | weapon | Image contains a weapon. |
| | Nazi | Classification | nazi | Image contains Nazi symbols, figures or propaganda. |
| | Communism | Classification | communism | Image contains communist symbols, figures or propaganda. |
| | Daech | Classification | daech | Image contains Daech/ISIS symbols, figures or propaganda. |
| | Terrorist | Classification | terrorist | Image contains a terrorist. |
| Substances | Drug | Classification | drug | Image contains drugs/substances that may be prohibited. |
| | Marijuana | Classification | marijuana | Image contains marijuana or references to it. |
| | Tobacco | Classification | tobacco | Image contains tobacco products or people smoking. |
| | Alcohol | Classification | alcohol | Image contains alcoholic beverages or people drinking alcohol. |
| Faces | Face | Detection | face | Image contains a human face (face is visible). |
| | Gender | Classification | male, female | One of the faces detected looks like a male/female. |
| | Minor | Classification | minor | One of the faces detected looks like a minor (less than 18 years old). |
| People | Child | Classification | child | Image contains a child (face is not necessarily visible). |
| | Ski mask | Classification | ski_mask | Image contains a ski mask or a person wearing one. |
| Fake / misrepresentation | Stock photo | Classification | stock_photo | Image looks like a stock photo. |
| | Model | Classification | male_model, female_model | Image looks like a photo of a male/female model. |
| 🆕 Art | Artwork | Classification | artwork | Image contains artwork: a painting, drawing, or other artistic work. |
| | Painting | Classification | painting | Image contains a painting. |
| | Drawing | Classification | drawing | Image contains a drawing. |
| | Manga | Classification | manga | Image contains a manga (book) or manga art. |
| | Comic art | Classification | comic_art | Image contains a comic book or comic art. |
| | Statue | Classification | statue | Image contains a statue. |
| Text & logos | Text | Detection | text | Image contains text (handwritten or typed). |
| | Handwritten text | Classification | handwritten_text | Image contains handwritten text. |
| | Watermark | Classification | watermark | Image contains a watermark (text or logo). |
| PII & contact information | License plate | Classification | license_plate | Image contains a license plate. |
| | Contact info | Classification | contact_info | Image contains contact information. |
| | Phone number | Classification | phone_number | Image contains a phone number. |
| | QR code | Classification | qr_code | Image contains a QR code. |
| | Facebook profile | Classification | facebook_profile | Image looks like a Facebook profile. |
| | Instagram profile | Classification | instagram_profile | Image looks like an Instagram profile. |
| | User profile | Classification | user_profile | Image looks like a user profile of some sort. |
| | Social profile | Classification | social_profile | Image looks like a social media profile of some sort. |
| Image characteristics | Low quality | Classification | low_quality | Image is of poor quality (under/overexposed, grainy, etc.). |
| | Orientation | Classification | misoriented | Orientation of image is incorrect (off by 90, 180 or 270 degrees). |
| | Solid color | Classification | solid_white, solid_black | Image is a solid white/black image. |
| | Screenshot | Classification | screenshot | Image is a screenshot. |
| 🆕 Vehicles | Car | Classification | car | Image contains a car. |
| | Car interior | Classification | car_interior | Image contains a car interior. |

Making use of predictions in automation rules

This section describes how to leverage the output of the above-listed models.

Image tags

Each of the above-listed models can output one or several image tags for each of the images contained in an item submitted to Implio for moderation.

These tags are exposed via the following automation variables:

  • $images[0].tags
  • $images[1].tags

These tags can then be queried using the CONTAINS operator. For instance, you can determine whether the item's first image contains nudity or is suggestive using the following expression:

$images[0].tags CONTAINS ("nudity", "suggestive")

In addition, the $images.tags variable contains the concatenation of tags across all images of an item.

For instance, the following expression checks whether any of the item's images contains nudity or is suggestive:

$images.tags CONTAINS ("nudity", "suggestive")
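To make these matching semantics concrete, here is a small Python sketch. It is purely illustrative (Implio evaluates rule expressions server-side, and the function name is hypothetical), but it models how CONTAINS matches any of the listed tags, and how $images.tags concatenates tags across images:

```python
# Illustrative model of the CONTAINS semantics described above.
# This is NOT Implio's implementation; names and data are made up.

def contains(tags, candidates):
    """CONTAINS matches if at least one candidate tag is present."""
    return any(tag in tags for tag in candidates)

# Hypothetical item with two images and their predicted tags.
item_images = [
    {"tags": ["suggestive", "watermark"]},  # $images[0].tags
    {"tags": ["face"]},                     # $images[1].tags
]

# $images.tags is the concatenation of tags across all of the item's images.
all_tags = [t for image in item_images for t in image["tags"]]

print(contains(item_images[0]["tags"], ("nudity", "suggestive")))  # True
print(contains(item_images[1]["tags"], ("nudity", "suggestive")))  # False
print(contains(all_tags, ("nudity", "suggestive")))                # True
```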

Those tags are calibrated for high precision (precision is typically around 90%). In other words, you should get a relatively low proportion of false positives.

Uncertain image tags

In addition, image models will output specific tags when the confidence of the prediction is low.

Those tags are exposed via the $images[0].uncertain_tags, $images[1].uncertain_tags and $images.uncertain_tags automation variables.

They are used in the exact same way as regular (high precision) image tags.  
For instance, the following expression will match lower-precision nudity or suggestive images:

$images[0].uncertain_tags CONTAINS ("nudity", "suggestive")

You should use uncertain tags if you need maximum recall (i.e. to catch as many images containing the notion you are looking for as possible) at the expense of precision. In other words, you will get a higher proportion of false positives using uncertain tags compared to normal (high-precision) tags.

Uncertain tags are typically sent for manual review, so moderators can have a closer look at images and determine how corresponding items should be handled.

You can create separate automation rules for normal and uncertain image tags and set different actions for them. For instance, you could automatically reject items with normal tags, and send those with uncertain tags to a manual moderation queue for closer inspection. 
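This two-tier setup can be sketched as follows. The routing function below is hypothetical (rule evaluation and actions happen inside Implio), but it mirrors the reject/manual-review split described above:

```python
# Hypothetical routing based on normal vs. uncertain tags.
# Mirrors the two-rule setup described above; NOT Implio code.

def route(tags, uncertain_tags, watched=("nudity", "suggestive")):
    if any(tag in tags for tag in watched):
        return "reject"         # high-precision hit: act automatically
    if any(tag in uncertain_tags for tag in watched):
        return "manual_review"  # low-confidence hit: let a moderator decide
    return "approve"

print(route(["nudity"], []))      # reject
print(route([], ["suggestive"]))  # manual_review
print(route(["watermark"], []))   # approve
```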

Detection models features

Unlike classification models which predict whether the entire image contains the desired notion, detection models detect individual objects within images.

Additional pieces of information – object count and area – are available via specific automation variables.

Object count

The number of objects detected in images is exposed via the following variables:

| Variable name | Possible values | Description |
| --- | --- | --- |
| $images[n].<tag>.count | integer | Number of <tag> objects detected in image n. For instance, $images[0].face.count contains the number of faces detected in the item's first image. |
| $images.<tag>.count | integer | Number of <tag> objects detected across all of the item's images. For instance, for an item with 2 images containing 3 faces each, $images.face.count will equal 6. |
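As a minimal sketch of that aggregation (illustrative Python, not Implio's implementation), the item-level count is simply the sum of the per-image counts:

```python
# Per-image face counts for a hypothetical item with 2 images, 3 faces each.
face_counts = [3, 3]  # $images[0].face.count, $images[1].face.count

# $images.face.count is the sum across all of the item's images.
print(sum(face_counts))  # 6
```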

Object area

The following variables can be used to match images based on the proportion represented by <tag> object(s) over the total image area.

The area for a given tag object (e.g. a face) is calculated as the number of pixels covered by the object's bounding box, divided by the total number of pixels in the image. The result is a float between 0 and 1.
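The calculation can be sketched in Python as follows. This is illustrative only: the function and the (left, top, right, bottom) bounding-box layout are assumptions, not Implio's actual representation:

```python
# Illustrative computation of an object's area fraction (NOT Implio code).
# A bounding box is assumed to be (left, top, right, bottom) in pixels.

def area_fraction(box, image_width, image_height):
    left, top, right, bottom = box
    box_pixels = (right - left) * (bottom - top)
    return box_pixels / (image_width * image_height)

# A 100x50 box inside a 1000x500 image covers 1% of it.
print(area_fraction((0, 0, 100, 50), 1000, 500))  # 0.01
```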

| Variable name | Possible values | Description |
| --- | --- | --- |
| $images[n].<tag>.area | float [0-1]; 0 if image n doesn't contain any <tag> object | Proportion of image area represented by all <tag> objects in image n. For instance, if image 0 contains two faces, one representing 10% of the image area and the other 5%, then $images[0].face.area will equal 0.15. |
| $images[n].<tag>.minObjectArea | float [0-1]; n/a (no value) if image n doesn't contain any <tag> object | Proportion of image area represented by the smallest <tag> object in image n. With the same two faces, $images[0].face.minObjectArea will equal 0.05. |
| $images[n].<tag>.maxObjectArea | float [0-1]; n/a (no value) if image n doesn't contain any <tag> object | Proportion of image area represented by the largest <tag> object in image n. With the same two faces, $images[0].face.maxObjectArea will equal 0.1. |
| $images.<tag>.area | float [0-1]; 0 if none of the item's images contain a <tag> object | Average area represented by <tag> objects across all of the item's images. For instance, if an item contains 2 images with one face each, representing respectively 10% of the first image's area and 5% of the second's, then $images.face.area will equal 0.075. |
| $images.<tag>.minArea | float [0-1]; 0 if none of the item's images contain a <tag> object | Minimum area represented by a <tag> object across all of the item's images. With the same two faces, $images.face.minArea will equal 0.05. |
| $images.<tag>.maxArea | float [0-1]; 0 if none of the item's images contain a <tag> object | Maximum area represented by a <tag> object across all of the item's images. With the same two faces, $images.face.maxArea will equal 0.10. |
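The item-level aggregates can be sketched in a few lines of illustrative Python (not Implio's implementation), reproducing the worked example of two images with one face each at 10% and 5%:

```python
# Illustrative aggregation of per-image face areas (NOT Implio code).
# Hypothetical item with 2 images, one face each: 10% and 5% of image area.
face_areas = [0.10, 0.05]

print(round(sum(face_areas) / len(face_areas), 3))  # 0.075 -> $images.face.area
print(min(face_areas))                              # 0.05  -> $images.face.minArea
print(max(face_areas))                              # 0.1   -> $images.face.maxArea
```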

Sample rule expressions

For instance, the following expression will match items whose first image has text objects representing over 50% of the total image area:

$images[0].text.area > 0.5

Similarly, the following expression will match items where the largest text object across all of the item's images represents over 50% of the image where it is found:

$images.text.maxArea > 0.5
