Trending YouTube videos in Canada

YouTube is the second most-visited website in Canada and in the world, after Google Search, according to Alexa rankings. It is an American online video-sharing platform and the new reality TV. In this article, I’ll be analyzing trending YouTube videos in Canada, using a dataset sourced from Kaggle. Trending videos are the videos most popular among viewers at a given time; they may relate to current affairs or simply be viral videos that caught viewers’ attention.

The goal of this article is to find out what type of content to upload and market in order to make a channel a hit.

The Data

Tools used to analyze data

Dashboard to view videos based on categories with maximum trending videos

Which categories take less time to trend and appear on the trend list more than once

Correlations between a video’s likes/dislikes and other metrics

What factors can lead a video to trend?

Popular channels

Network graph for Categories and Channels

Code Snippets for Data Wrangling


The Data

The Trending YouTube videos dataset contains data for 24,427 distinct videos that were on the trend list at least once between 14 November 2017 and 14 June 2018 in Canada. Each record contains the video ID, the timestamp when the video was published, and the trending date, plus other details such as comments, views, description, and title. In total there are 16 columns, and the dataset is 61.1 MB in size. Below is a sneak peek at the data.

First three rows of the dataset

The dataset gives only a category_id, so to find the category name for each ID I referred to the YouTube API video category ID list on GitHub. Only 17 categories appear in the dataset used for this analysis: 1: “Film & Animation”, 2: “Autos & Vehicles”, 10: “Music”, 15: “Pets & Animals”, 17: “Sports”, 19: “Travel & Events”, 20: “Gaming”, 22: “People & Blogs”, 23: “Comedy”, 24: “Entertainment”, 25: “News & Politics”, 26: “Howto & Style”, 27: “Education”, 28: “Science & Technology”, 29: “Nonprofits & Activism”, 30: “Movies”, 43: “Shows”.
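As a rough illustration of that lookup, the mapping can be written as a plain Python dictionary. The IDs and names are the 17 listed above; the `category_name` helper and the sample rows are hypothetical and are not taken from the notebook.

```python
# Category IDs and names exactly as listed in the article.
CATEGORY_NAMES = {
    1: "Film & Animation", 2: "Autos & Vehicles", 10: "Music",
    15: "Pets & Animals", 17: "Sports", 19: "Travel & Events",
    20: "Gaming", 22: "People & Blogs", 23: "Comedy",
    24: "Entertainment", 25: "News & Politics", 26: "Howto & Style",
    27: "Education", 28: "Science & Technology",
    29: "Nonprofits & Activism", 30: "Movies", 43: "Shows",
}


def category_name(category_id):
    """Return the category name, or "Unknown" for an ID not in the list."""
    return CATEGORY_NAMES.get(int(category_id), "Unknown")


# Hypothetical rows standing in for the real dataset.
rows = [
    {"video_id": "a1", "category_id": "24"},
    {"video_id": "b2", "category_id": "10"},
]
for row in rows:
    row["category"] = category_name(row["category_id"])
```

If the data is loaded into pandas, the same idea would typically be expressed as `df["category_id"].map(CATEGORY_NAMES)`.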

For 2017 and 2018, data is not available for every month, and there are no trending videos for 10 and 11 January 2018 or for 8 to 13 April 2018.
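Gaps like these can also be found programmatically. The sketch below is one possible way to do it, assuming the trending date is stored as a `yy.dd.mm` string (for example `18.09.01` for 9 January 2018). That layout is an assumption about the Kaggle file, and the function names are made up for illustration.

```python
from datetime import date, datetime, timedelta


def parse_trending_date(text):
    """Parse a trending date, ASSUMING the 'yy.dd.mm' layout."""
    return datetime.strptime(text, "%y.%d.%m").date()


def missing_dates(trending_dates):
    """Return each calendar day between the first and last date that never appears."""
    seen = {parse_trending_date(t) for t in trending_dates}
    day, last = min(seen), max(seen)
    gaps = []
    while day <= last:
        if day not in seen:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


# Hypothetical sample: 9 and 12 January 2018 are present, 10 and 11 are not.
sample = ["18.09.01", "18.09.01", "18.12.01"]
print(missing_dates(sample))
```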

We can observe this missing data directly by looking at the monthly graph for 2017 and 2018. The bar for February is low because the month had only 28 days, as 2018 was not a leap year.

Plot 1: Number of videos trending each month

Also, although November 2017 and June 2018 differ by only two days of coverage, the difference in their video counts is noticeable. Let’s check the reason with the graph below.

Plot 2: Number of trending videos each day

From plot 2 we can see that some days have fewer than 200 videos, which I believe is because some videos stayed on the trend list for longer.

Tools used to analyze data

The analysis was performed in a Jupyter Notebook.

The Python libraries used for data wrangling and visualization are listed below:

I also used Tableau and Gephi to visualize the data.

Dashboard to view videos based on categories with maximum trending videos

On YouTube, each video is uploaded under a specific category, which should be chosen to match the video’s content. The dashboard below, created using Tableau, shows that “Entertainment” is the most popular category in terms of number of trending videos, followed by “News & Politics”. To show a video on the dashboard, I used URL<Video Id>.

The purpose of the dashboard is to show the type of video content under each category, together with past details from when the videos were on the trend list and current details from the video itself. For example, the term “Entertainment” can be confusing, because we also consider music and comedy to be sources of entertainment. In practice, the category contains content like dance, drama, fun, and stories, and the dashboard can help you relate the two. For music and comedy content, you can check the “Music” and “Comedy” categories according to their ranking by number of views.

Dashboard to view videos based on categories with maximum trending videos

Which categories take less time to trend and appear on the trend list more than once

Does the age of a video matter for getting on the trend list? Video content that appeals to a wide range of viewers will reach the trend list in a short time.

The plot below shows the average number of days taken to get on the trend list. Because many videos were on the trend list more than once, for this plot I selected, for each video ID, the record with the minimum views.
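The selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row layout, field names, and sample values are hypothetical, and the dates are assumed to be parsed already.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: one per (video, trending day), with dates already parsed.
rows = [
    {"video_id": "a", "category": "Music", "views": 100,
     "published": date(2018, 1, 1), "trending": date(2018, 1, 3)},
    {"video_id": "a", "category": "Music", "views": 900,
     "published": date(2018, 1, 1), "trending": date(2018, 1, 4)},
    {"video_id": "b", "category": "Music", "views": 50,
     "published": date(2018, 1, 2), "trending": date(2018, 1, 6)},
    {"video_id": "c", "category": "Movies", "views": 10,
     "published": date(2018, 1, 5), "trending": date(2018, 1, 5)},
]


def average_days_to_trend(rows):
    # Keep, for each video, the row with the fewest views (its earliest snapshot).
    first = {}
    for row in rows:
        best = first.get(row["video_id"])
        if best is None or row["views"] < best["views"]:
            first[row["video_id"]] = row
    # Collect days from publication to that trending date, per category.
    days = defaultdict(list)
    for row in first.values():
        days[row["category"]].append((row["trending"] - row["published"]).days)
    return {cat: sum(d) / len(d) for cat, d in days.items()}


averages = average_days_to_trend(rows)
```

Using the minimum-views row as a stand-in for a video's first appearance works because view counts only grow over time.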

Plot 3: Average days taken to trend for each category

The plot shows that videos in the Movies category start trending on the day they are published, followed by Shows, which take almost one day to trend. I was surprised to see that Entertainment, the category with the most videos on the trend list, takes more time than others to get there; some Entertainment videos took years to become famous. The Nonprofits & Activism category takes a week or more on average to reach the trend list.

Category alone cannot determine how long a video will take to become a hit. For example, one music video took more than nine years to gain viewers’ attention, as it appeared on the trend list on the singer’s death.

Some videos can attract viewers for days. The plot below gives the count of videos in each category that were on the trend list more than once. For this plot I selected the video IDs that were duplicated in the dataset.
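One way to count such repeat videos is sketched below. It treats a video as a repeat when it appears on more than one distinct trending date, which is an assumption about what "duplicate" means here. The tuple layout and sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def repeat_trenders_by_category(rows):
    """Count, per category, the videos seen on more than one trending date."""
    dates = defaultdict(set)   # video_id -> distinct trending dates
    category = {}              # video_id -> category name
    for video_id, cat, trending_date in rows:
        dates[video_id].add(trending_date)
        category[video_id] = cat
    return Counter(category[v] for v, d in dates.items() if len(d) > 1)


# Hypothetical (video_id, category, trending_date) rows.
rows = [
    ("a", "Entertainment", "18.01.02"), ("a", "Entertainment", "18.02.02"),
    ("b", "Entertainment", "18.01.02"), ("c", "Movies", "18.01.02"),
    ("d", "Music", "18.01.02"), ("d", "Music", "18.03.02"),
]
repeats = repeat_trenders_by_category(rows)
```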

Plot 4: Count of videos in each category which were on trend list more than once

The plot shows that the Entertainment category may take more time on average than other categories, but it can entertain viewers for a long period. On the other hand, while Movies become hits easily, people prefer to watch them only once.

Correlations between video’s likes/dislikes and other metrics

For trending videos, I was interested to see the correlations between different metrics. Do likes increase as views increase, and how does the dislike count behave? Correlation coefficients range between -1.0 and 1.0. A value of exactly 1.0 means there is a perfect positive linear relationship between the two variables. A value of -1.0 means there is a perfect negative linear relationship between the two variables. If the correlation between two variables is 0, there is no linear relationship between them.
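To make the coefficient concrete, here is a minimal sketch of the Pearson correlation in plain Python. The sample numbers are invented purely to show the two extreme cases and do not come from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented values, not taken from the dataset.
views = [100, 200, 300, 400]
rising = [10, 20, 30, 40]    # moves in step with views
falling = [40, 30, 20, 10]   # moves against views
```

Here `pearson(views, rising)` is a perfect positive relationship and `pearson(views, falling)` a perfect negative one. In pandas, `DataFrame.corr()` computes the same coefficient for every pair of numeric columns by default.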

To plot the correlation heatmap, I used the Seaborn library in the Jupyter Notebook.

Plot 5: Correlation Heatmap for trending videos metrics

The heatmap shows that likes, views and comment_count are positively related: if one increases, the others increase with it. There is also a relation between likes, views and dislikes: videos with more likes or views also have more dislikes. If ratings are disabled, then chances are that comments will also be disabled. For the other metrics the correlation coefficient is almost zero, suggesting no linear relation.

Let’s visualize these relations for the top 5 categories by number of trending videos in the scatterplots below.

Plot 6: Scatterplot between Trending videos metrics

From the above we can observe a somewhat positive trend between the metrics. If we look only at the scatterplots for the Comedy category, it is clear that the numbers of likes and comments are higher than the number of dislikes.

What factors can lead a video to trend?

A video’s title, thumbnail, and description are important pieces of metadata for video search. These main pieces of information help viewers decide which videos to watch, so it is important to give a title that is easy to search for and easy to read.

YouTube has a 100-character limit for titles, and anything longer than 70 characters will be truncated in most search results.

Plot 7: Count of characters in titles of trending videos

The average title length is 54.38 characters, and the plot shows that the character count ranges from 1 to 100 for trending videos.
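Title statistics like these are simple to compute. The sketch below uses made-up titles and a hypothetical helper name. The 70-character threshold is the truncation point mentioned above.

```python
def title_length_stats(titles):
    """Summarize title lengths: shortest, longest, mean, and count over 70 chars."""
    lengths = [len(t) for t in titles]
    return {
        "min": min(lengths),
        "max": max(lengths),
        "mean": sum(lengths) / len(lengths),
        "over_70": sum(1 for n in lengths if n > 70),
    }


# Made-up titles for illustration only.
titles = ["Short title", "x" * 80, "A medium-length video title for testing"]
stats = title_length_stats(titles)
```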

Does adding tags help videos to trend? Yes, tags can be useful if the content of your video is commonly misspelled. To find out which tags are commonly used for videos, I created the bar graph below.

Plot 8: Famous tags used for trending videos

The most common tags were ones like “funny”, “comedy”, and “news”. “Donald Trump” was also popular among tags in 2017 and 2018.
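A tag count like this could be produced as in the sketch below. It assumes each video's tags are stored as a single string separated by `|`, with optional double quotes around each tag and `[none]` for videos without tags. That layout is an assumption about the Kaggle file, and the sample strings are invented.

```python
from collections import Counter


def count_tags(tag_strings):
    """Count tags case-insensitively across all videos."""
    counts = Counter()
    for text in tag_strings:
        for tag in text.split("|"):
            # Drop surrounding whitespace and quotes, then normalize case.
            tag = tag.strip().strip('"').lower()
            if tag and tag != "[none]":
                counts[tag] += 1
    return counts


# Invented tag strings in the assumed layout.
sample = ['funny|"comedy"|"news"', '"Funny"|"Donald Trump"', "[none]"]
top = count_tags(sample).most_common(1)
```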

A good description plays an important role for videos, and some creators also use it to promote other content. It helps viewers understand what to expect from the channel or video.

Plot 9: Count of characters in description of trending videos under different categories

We can see that descriptions are short for Shows and long for Travel & Events.

Popular channels

The bar chart below shows the top 15 channels, sorted by the number of videos that were on the trend list at least once. It shows that the channel “Vikatan TV” has the largest number of videos that were hits in the past.
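A ranking of this kind needs each video counted once per channel, however many days it trended. A minimal sketch, with hypothetical channel names and row layout:

```python
from collections import Counter


def top_channels(rows, n=15):
    """Rank channels by number of distinct videos that trended at least once."""
    # A set removes repeated (channel, video) pairs from multi-day trenders.
    unique_pairs = {(channel, video_id) for channel, video_id in rows}
    return Counter(channel for channel, _ in unique_pairs).most_common(n)


# Hypothetical (channel, video_id) rows; "v1" trended on two days.
rows = [
    ("Channel A", "v1"), ("Channel A", "v1"),
    ("Channel A", "v2"), ("Channel B", "v3"),
]
ranking = top_channels(rows, n=2)
```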

Plot 10: Channels with number of trending videos

Now we’ll look at channels whose videos went viral more than once. Below are the top 15 channels by number of videos that were on the trend list at least twice, that is, on at least two different days. To plot this graph, I selected videos for which the distinct count of trending days was more than 1.

Plot 11: Channels with number of trending videos which went viral more than once

From the above we can observe that the “REACT” channel has the most videos that went viral at least twice.

Network graph for Categories and Channels

Do channels always post videos under the same category, or do they have a variety of content? To answer these questions, I created a network graph using Gephi.

The graph below shows the interconnectivity between channels and categories in a single component.

Plot 12: Network graph for Categories and Channels

If we look at the degree chart below, it shows that the Entertainment category has videos from 1325 different channels. Here Category ID 24 is Entertainment; the list of categories with their IDs is given above in “The Data” section. The degree chart is sorted by degree in descending order.

From the highlighted channel title we can see that it has videos in eight different categories. All eight categories for the “PewDiePie” channel are shown in the table on the right.

Gephi screenshot + Tableau table
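The degree figures discussed above can be reproduced outside Gephi. In a channel-category graph, a category's degree is the number of distinct channels posting in it, and a channel's degree is the number of distinct categories it posts in. The sketch below is illustrative only, with hypothetical channel names.

```python
from collections import defaultdict


def node_degrees(rows):
    """Build unique channel-category edges and return the degree of each node."""
    edges = {(channel, category_id) for channel, category_id in rows}
    degree = defaultdict(int)
    for channel, category_id in edges:
        degree[("channel", channel)] += 1
        degree[("category", category_id)] += 1
    return dict(degree)


# Hypothetical (channel, category_id) rows, one per trending video record.
rows = [("Ch A", 24), ("Ch A", 24), ("Ch A", 10), ("Ch B", 24)]
degrees = node_degrees(rows)
```

Nodes are keyed by a `(kind, name)` pair so that a channel and a category can never collide.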

Code Snippets for Data Wrangling

To add a “Category” field to the Tableau dataset corresponding to the “Category ID” column, I used the code below in a calculated field.

Categories respective to Category ID

To convert the trending date string into a date format in Tableau, I used the code below.

To select Day

I did the same for year and month, and then combined all three to make a date using the code below.

To make trend date

To find the number of days taken to trend for each video in Tableau, I used the code below.

Total days taken to trend
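For readers working in Python rather than Tableau, a comparable calculation can be sketched as follows. It assumes the publish time is an ISO-style UTC timestamp and the trending date uses a `yy.dd.mm` layout. Both formats are assumptions about the Kaggle file, so check them against your copy. This is not the Tableau calculation itself.

```python
from datetime import datetime


def days_to_trend(publish_time, trending_date):
    """Whole days between the publish date and the trending date."""
    # ASSUMED formats: '2017-11-10T17:00:03.000Z' and '17.14.11' (yy.dd.mm).
    published = datetime.strptime(publish_time, "%Y-%m-%dT%H:%M:%S.%fZ").date()
    trended = datetime.strptime(trending_date, "%y.%d.%m").date()
    return (trended - published).days


# Published 10 November 2017, trending 14 November 2017.
print(days_to_trend("2017-11-10T17:00:03.000Z", "17.14.11"))  # 4
```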

To remove duplicate videos, I used code like the one below at various points.

To filter duplicate videos

In Conclusion

If you’re looking to create your own YouTube channel and make it a hit, I hope you now know which categories give you the best chances, how long you can expect to wait before getting on the trend list for the very first time, and what kind of tags, title and description you can choose for your videos. Posting a good new movie on your channel (provided you have the copyright) can make your channel go viral in no time. But keep in mind that this is short-term fame, since most people prefer not to watch a movie twice (although it can bring you a good number of subscribers). You can also upload a variety of content in different categories on your channel.

For future work, I am interested in exploring:

  • Most popular N-grams in titles
  • Comparison of trending and non-trending videos
  • Whether tag preferences change with time
  • Predicting whether a video will trend or not based on different video features

Like, Comment and Subscribe for updates!

GitHub Link

GitHub link to download Python code and Tableau workbook used for data wrangling and visualizations.