Battle of the Brands

TL;DR: This video pretty much summarizes what we have done!

We were curious about the total sponsorship that the IPL managed to attract even during the pandemic.

Seeing early-stage startups pour enormous amounts of money into it, we wondered: if we were an early-stage startup, would we sponsor the IPL?

If we were planning to do so, how would we smartly choose which title to sponsor?

We put together an independent analysis report to explore these questions. To know more, please click on the following link.

Note on the accuracy of the system: this post was created to get early feedback on what we are trying to achieve.

Setting up the project

In this section we will give an overview of how we set up the project and got the

Object Detection:

What is object detection?

Object detection is the process of classifying (what) and localizing (where) the objects in an image. It generally takes an image as input and returns one or more bounding boxes, with a class label attached to each bounding box.

What is one-shot object detection?

Modern machine learning algorithms are data-hungry: the object detection task typically requires tens of thousands of labelled images per class.

One-shot object detection sits at the other extreme: it requires just ONE or very few images per class. One-shot detection is best suited to two-dimensional objects like road signs, logos, etc., and doesn't work well with three-dimensional objects like faces, cars, animals, etc.

Approaches

How does it work?

There are various ways to do one-shot object detection; we have hand-picked the ones we liked.

(i) Create synthetic data from the given data and train an object detection model on top of it

Synthetic Data: Given a logo, and background images(optional), the model generates images by augmenting the logo (rotating, stretching, etc) and then placing it on top of the background image.

Model: You can use this synthetic data to train a deep learning or machine learning model such as YOLO.

Remember that some models require their inputs to be in a specific format for training. You can read about these formats here[1].
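As a rough sketch of the synthetic-data step, the snippet below picks a random stretch and position for a logo on a background and writes the matching label line in the YOLO text format (class index, then box center and size normalized to the image). The function name, the scale range and the image sizes are our own assumptions for illustration; rotation and the actual pixel compositing (which would need an imaging library) are left out.

```python
import random

def place_logo(bg_w, bg_h, logo_w, logo_h, class_id, rng):
    """Pick a random size/position for a logo; return (box, yolo_label).

    box is (x_min, y_min, width, height) in pixels. yolo_label is the line
    'class x_center y_center width height' with values normalized to [0, 1].
    """
    # Independent x/y scales imitate the "stretching" augmentation (assumed range).
    scale_x = rng.uniform(0.5, 1.5)
    scale_y = rng.uniform(0.5, 1.5)
    # Clamp the scaled logo so it always fits inside the background.
    w = max(1, min(int(logo_w * scale_x), bg_w))
    h = max(1, min(int(logo_h * scale_y), bg_h))
    # Random top-left corner that keeps the whole logo on the background.
    x = rng.randint(0, bg_w - w)
    y = rng.randint(0, bg_h - h)
    x_center = (x + w / 2) / bg_w
    y_center = (y + h / 2) / bg_h
    label = f"{class_id} {x_center:.6f} {y_center:.6f} {w / bg_w:.6f} {h / bg_h:.6f}"
    return (x, y, w, h), label

# Hypothetical sizes: a 640x360 frame and an 80x40 logo, class index 0.
rng = random.Random(0)
box, label = place_logo(640, 360, 80, 40, class_id=0, rng=rng)
print(box, label)
```

Calling this once per generated image gives both the paste location and the annotation line, so the training labels come for free with the synthetic images.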

(ii) Query Image on the input Image

Ref Link: [3]

(iii) By matching features

Feature Extraction: Extract features from the image using ResNet

Correlation Matching: Match the features of the input image with class images

Spatial Alignment:

Computing Output: Annotate the matches.

Ref links: [4]
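To make the correlation-matching step a little more concrete, here is a toy sketch that compares feature vectors with cosine similarity and keeps the best class above a threshold. It is not the OS2D implementation: the short hand-written vectors stand in for real ResNet features, and the function names, the dictionary layout and the 0.7 threshold are assumptions for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_matches(input_features, class_features, threshold):
    """Map each input location to (class_name, score), or None if below threshold."""
    matches = {}
    for location, vector in input_features.items():
        scored = [(cosine_similarity(vector, ref), name)
                  for name, ref in class_features.items()]
        score, name = max(scored)
        matches[location] = (name, score) if score >= threshold else None
    return matches

# Toy stand-ins for extracted features (hypothetical values).
class_features = {"paytm": [1.0, 0.0, 0.0], "ceat": [0.0, 1.0, 0.0]}
input_features = {(0, 0): [0.9, 0.1, 0.0], (0, 1): [0.1, 0.1, 0.9]}
print(best_matches(input_features, class_features, threshold=0.7))
```

Locations whose best score stays under the threshold are dropped, which is the same knob we tune later in the post.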

Did we implement all the things that we said above?

Though we would have loved to implement this from scratch, and it would have been good learning, we didn't have enough time and did not want to reinvent the wheel. We studied the approaches mentioned above (and a few more) to find the one that would require minimal effort to implement.

We ended up using One-Stage One-Shot Object Detection by Matching Anchor Features (OS2D) for this task; you can test out the others too.

Training models is a luxury we don't have. We tried using Google Colab for training, and it turned out to be frustrating.

Dataset

In this section we explain the input parameters for the model.

What does your data look like?

Logos

Input Logos/Classes: We took a screenshot of each logo from random match images. The logos that were given as input to the model are displayed below.

Match Images

Link to the match: https://www.hotstar.com/in/sports/cricket/indian-premier-league/delhi-capitals-vs-mumbai-indians-m701710/match-clips/replay-mi-vs-dc-final/1540002359

Total time for the match: 6Hr18min (378 min)

First Half: 1:26 - 3:12 (106 min)

Second Half: 3:25 - 4:55 (190 min)

1 image per second

⇒ (106+190)*60

⇒ ~17,000 images in total

We used low-resolution images, as most users watch the game on a mobile device and it was faster to run the model on these images. This means that some of the classes (logos) might not be identified: in the above image, the Paytm logo on the umpire's T-shirt is hardly visible and might not be detected.

Note: We didn't remove the advertisements which were shown during the match. Plus points to the brands which managed to show their logos during the advertisements.

Modeling/Training

How did you train your system?

We didn't train the model ourselves; we used the OS2D V2-train model trained by the authors of OS2D. You can download the model from Google Drive[5].

What's your system accuracy? There must be a catch!

Since we didn't have the ground truth data it was not possible for us to calculate the accuracy of the model. However, we ended up calculating the precision of the model.

The mAP of the model reported in the paper is ~90%.

What are precision and recall?

Source: Wikipedia

Let's say we are building an object detection model to identify all the Paytm logos in the image.

Image (i) has 100% precision and recall, as all the identified logos are of Paytm and all the logos are identified.

Image (ii) has lower precision (75%) because there is a false positive: a CEAT logo is identified as Paytm.

Image (iii) has 100% precision as all the identified logos are correct, and 33% recall as two of the Paytm logos were not identified.

Image (iv) has 0% precision and recall, as none of the logos are identified correctly.
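The four cases above can be checked with a few lines of code. This is a minimal sketch: the true-positive, false-positive and false-negative counts are hypothetical, chosen so that each image contains three Paytm logos and the percentages match the ones quoted above. Precision and recall are reported as 0 when nothing is detected or nothing is present, which is a convention we picked for this example.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw detection counts."""
    detected = true_positives + false_positives   # everything the model flagged
    actual = true_positives + false_negatives     # everything really in the image
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Hypothetical (TP, FP, FN) counts, assuming three Paytm logos per image.
examples = {
    "(i)": (3, 0, 0),    # all three found, nothing else flagged
    "(ii)": (3, 1, 0),   # all three found, plus one CEAT logo flagged by mistake
    "(iii)": (1, 0, 2),  # only one of the three found
    "(iv)": (0, 1, 3),   # nothing found correctly
}
for name, counts in examples.items():
    p, r = precision_recall(*counts)
    print(f"{name}: precision={p:.0%}, recall={r:.0%}")
```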

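[Figure: 2x2 grid of example images. High* precision, high* recall: (i). Low* precision, high* recall: (ii). High* precision, low* recall: (iii). Low* precision, low* recall: (iv). *Relatively]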

Hacking/Tweaking part: Hyper-parameters are always HYPER!

There were two approaches that we tried for collecting the information.

Approach 1

Reduce the threshold for feature comparison. We tried evaluating the system with a threshold of 40%. The result was terrible: we ended up getting a lot of random predictions (roughly 10% were correct). To give you some examples, this is what we got.

You can guess which class each of the above predictions belongs to; for each image, there is a 1 in 6 chance that you are right!

Approach 2

An extension of Approach 1. After a few experiments, we found that when we set the threshold to 70%, we were getting 100% precision, i.e. all the images predicted as a class were indeed that class. The problem was that the model was weeding out a lot of predictions that scored below the threshold.

Now that we had many variations of each class, we randomly took a subset of the predictions and used it as the input class images. For every class, we created one one-shot detection model that used such a subset as its input class images. The subset was picked randomly, and the number of images per input class was set to 20.

Through this approach we were able to get our model to predict better.
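The selection step of Approach 2 can be sketched roughly as follows: keep only the predictions at or above the 70% threshold, group them by class, and randomly pick up to 20 per class to serve as the new input class images. The tuple layout `(frame_id, class_name, score)`, the function name and the fixed seed are assumptions made for this example, not the format OS2D produces.

```python
import random
from collections import defaultdict

def pick_class_examples(detections, threshold=0.70, per_class=20, seed=0):
    """Sample up to per_class confident detections for each class.

    detections: iterable of (frame_id, class_name, score) tuples
    (a hypothetical format used only in this sketch).
    """
    by_class = defaultdict(list)
    for frame_id, class_name, score in detections:
        if score >= threshold:          # keep only high-confidence predictions
            by_class[class_name].append(frame_id)
    rng = random.Random(seed)           # fixed seed so a run can be repeated
    return {
        name: rng.sample(frames, min(per_class, len(frames)))
        for name, frames in by_class.items()
    }

# Toy detections: 50 confident CRED hits, 50 weak CEAT hits, 5 confident Paytm hits.
detections = (
    [(i, "cred", 0.90) for i in range(50)]
    + [(i, "ceat", 0.50) for i in range(50)]
    + [(i, "paytm", 0.75) for i in range(5)]
)
chosen = pick_class_examples(detections)
print({name: len(frames) for name, frames in chosen.items()})
```

Changing the seed gives a different random subset, which is one simple way to repeat the experiment several times.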

We repeated the experiment 3 times, separately for the first half and the second half. For one of the iterations, the precision computation is shown below.

Precision

Name | Identified | Correct | Precision
 | 578 | 100 |
CEAT | 456 | 454 |

Summarizing the above table, we managed to get an average precision of ## across the brands.

Below is what our model was able to predict.

[IMAGE WITH PREDICTION]

Alright! You ended up getting data. What to do with this data? Who won the IPL?

We were planning to use this data to train an ML-based logo identifier and use it to get better results. (We might have one post coming up soon!)

The results and inferences may or may not be right; they are accurate only to a certain extent.

ML systems are supposed to be 99% accurate, but ours is not.

Results

Before we can say which brand won the game, we need to define the various parameters associated with the game.

IPL Revenue:

Last year, the overall IPL ecosystem was valued at 47,500 crore[1]. Total sponsorship revenue for IPL 2020 is 400 crore[2], which came from the sponsors Dream11, Unacademy, CRED, Tata Altroz, Paytm and CEAT.

Title Sponsor - 55.5% (222 Crore)

Official partners - 30% (120 Crore)

Umpire sponsor - 7% (28 Crore)

Timeout partner - 7.5% (30 Crore)

Metrics for evaluation

Here we define the metrics on which we track the impact of the brands.

Viewership:

20 Cr people watched the IPL 2020 opening match[3]. This is based on statistics from Hotstar and Star Sports. The real numbers are probably higher, as some people like to watch sports in groups.

Since it was difficult for us to obtain these numbers from the official sources, we ended up discarding this information.

Screen Time

Total Time the brand was visible on the screen:

Splitting up the Pie

Out of the total match, for how many seconds was the brand's name on the screen? The more time on screen, the better a person remembers it. The image below shows the share of the pie held by each brand.
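Since we sampled one image per second, screen time can be approximated by counting the frames in which a brand was detected. The sketch below does that and turns the counts into shares of the pie. The input layout (one set of brand names per frame) and the function name are assumptions for illustration.

```python
from collections import Counter

def screen_time_share(frame_brands):
    """Return (seconds_per_brand, share_per_brand).

    frame_brands: one collection of detected brand names per sampled frame,
    where each frame stands for one second of video.
    """
    seconds = Counter()
    for brands in frame_brands:
        seconds.update(set(brands))   # count a brand at most once per frame
    total = sum(seconds.values())
    shares = {brand: count / total for brand, count in seconds.items()} if total else {}
    return dict(seconds), shares

# Toy input: five sampled frames (hypothetical detections).
frames = [{"cred"}, {"cred", "paytm"}, set(), {"paytm"}, {"cred"}]
seconds, shares = screen_time_share(frames)
print(seconds, shares)
```

The shares are relative to the total brand-seconds, so they add up to 1 even when several brands share a frame.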

Solo-Time

We go together

Which brands occur together?

First Half

Second half

From the first and second halves we can see that, in every minute, CRED appeared on screen more often than the other brands. Towards the end of the match the counts of all brands went down, because it is a crucial time with very little advertising. Even then, CRED was the highest.
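Solo time and co-occurrence can be counted from the same per-frame detections: a frame with exactly one brand adds a second of solo time, and every pair of brands in a frame adds one to that pair's count. This is a minimal sketch under the assumption that each sampled frame is represented as a set of brand names; the function name is ours.

```python
from collections import Counter
from itertools import combinations

def solo_and_pairs(frame_brands):
    """Return (solo_seconds, pair_counts) over all sampled frames."""
    solo = Counter()
    pairs = Counter()
    for brands in frame_brands:
        unique = sorted(set(brands))          # sorted so (a, b) keys are stable
        if len(unique) == 1:
            solo[unique[0]] += 1              # brand had the screen to itself
        for a, b in combinations(unique, 2):
            pairs[(a, b)] += 1                # brands seen in the same frame
    return solo, pairs

# Toy input: four sampled frames (hypothetical detections).
frames = [{"cred"}, {"cred", "paytm"}, {"cred", "paytm", "ceat"}, {"ceat"}]
solo, pairs = solo_and_pairs(frames)
print(dict(solo), dict(pairs))
```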

Area of Interest

For a better impact, it is crucial to understand where the logo appears on average. A logo that continuously appears where the viewer doesn't look generates less impact.

This heat map shows how the screen was occupied by different brands. This helps in analyzing which areas of the screen were covered. Camera angle plays a vital role in this. Below are the results for the first half and the second half.

First Half- Heat Map

Second Half - Heat Map

From the first-half and second-half heat maps we can see that the center region of the screen is not prominent, because the camera is always trying to focus on the player at the center. The most prominent areas are at the top of the screen: if you remember, most of the advertising banners are at the far end of the field. Banners near the camera are very few.
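A heat map of this kind can be built by dividing the frame into a coarse grid and counting, for every cell, how many detected boxes cover it. The sketch below is one simple way to do it; the box format `(x_min, y_min, x_max, y_max)` in pixels, the 16x9 grid and the frame size are assumptions for illustration.

```python
def accumulate_heatmap(boxes, frame_w, frame_h, grid_w=16, grid_h=9):
    """Count, per grid cell, how many bounding boxes overlap it.

    boxes: iterable of (x_min, y_min, x_max, y_max) in pixels, with x_max and
    y_max exclusive (a hypothetical format used only in this sketch).
    """
    heat = [[0] * grid_w for _ in range(grid_h)]
    cell_w = frame_w / grid_w
    cell_h = frame_h / grid_h
    for x_min, y_min, x_max, y_max in boxes:
        # Convert pixel bounds to inclusive cell indices, clamped to the grid.
        col_start = max(0, min(grid_w - 1, int(x_min // cell_w)))
        col_end = max(0, min(grid_w - 1, int((x_max - 1) // cell_w)))
        row_start = max(0, min(grid_h - 1, int(y_min // cell_h)))
        row_end = max(0, min(grid_h - 1, int((y_max - 1) // cell_h)))
        for row in range(row_start, row_end + 1):
            for col in range(col_start, col_end + 1):
                heat[row][col] += 1
    return heat

# Toy input: two boxes near the top-left corner of a 160x90 frame.
heat = accumulate_heatmap([(0, 0, 20, 10), (0, 0, 10, 10)], frame_w=160, frame_h=90)
print(heat[0][:4])
```

Summing such grids over all frames of a half, per brand or for all brands together, gives the values that a plotting library can then render as colors.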