Data Facts: Top 100 Movies of all time

Yulu

Jul 25, 2024

This dataset contains information about the top 100 movies, including attributes such as title, genre, rating, year, and description. With Powerdrill AI, we can explore these movies and draw some interesting conclusions from the data. Take a look!

Source: Kaggle

Given the dataset, Powerdrill detects and analyzes the metadata, then suggests these relevant questions:

  • What are the top 5 highest-rated movies in the dataset? 

  • Which genres are most common among the top 100 movies? 

  • What are the similarities in the themes of the top 100 movies?

  • What is the average rating of movies released in the 20th and 21st centuries?

  • How does the average rating of movies vary by decade? 

  • What is the distribution of movie ratings in the dataset?

What are the top 5 highest-rated movies in the dataset? 

Based on the analysis of the dataset provided, here are the top 5 highest-rated movies:

The Shawshank Redemption

  • Rating: 9.3

  • Year: 1994

  • Genre: Drama

  • IMDb Link: The Shawshank Redemption

  • Description: Two imprisoned men bond over a number of years, finding solace and eventual redemption through acts of common decency.

The Godfather

  • Rating: 9.2

  • Year: 1972

  • Genre: Crime, Drama

  • IMDb Link: The Godfather

  • Description: The aging patriarch of an organized crime dynasty in postwar New York City transfers control of his clandestine empire to his reluctant youngest son.

The Dark Knight

  • Rating: 9.0

  • Year: 2008

  • Genre: Action, Crime, Drama

  • IMDb Link: The Dark Knight

  • Description: When the menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman must accept one of the greatest psychological and physical tests of his ability to fight injustice.

The Godfather Part II

  • Rating: 9.0

  • Year: 1974

  • Genre: Crime, Drama

  • IMDb Link: The Godfather Part II

  • Description: The early life and career of Vito Corleone in 1920s New York City is portrayed, while his son, Michael, expands and tightens his grip on the family crime syndicate.

12 Angry Men

  • Rating: 9.0

  • Year: 1957

  • Genre: Crime, Drama

  • IMDb Link: 12 Angry Men

  • Description: The jury in a New York City murder trial is frustrated by a single member whose skeptical caution forces them to more carefully consider the evidence before jumping to a hasty verdict.
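A ranking like the one above can be reproduced with a simple sort. The snippet below is a minimal sketch, not Powerdrill's actual method: the field names (title, rating, year) and the helper top_rated are assumptions, and the sample rows are just the five movies listed in this section.

```python
from operator import itemgetter

# Sample rows copied from the five movies listed above. The real dataset has
# 100 rows, and these field names are assumed for illustration.
movies = [
    {"title": "12 Angry Men", "rating": 9.0, "year": 1957},
    {"title": "The Godfather", "rating": 9.2, "year": 1972},
    {"title": "The Dark Knight", "rating": 9.0, "year": 2008},
    {"title": "The Shawshank Redemption", "rating": 9.3, "year": 1994},
    {"title": "The Godfather Part II", "rating": 9.0, "year": 1974},
]


def top_rated(rows, n=5):
    """Return the n highest-rated rows (hypothetical helper).

    Python's sort is stable, so rows with equal ratings keep their
    original input order.
    """
    return sorted(rows, key=itemgetter("rating"), reverse=True)[:n]


for movie in top_rated(movies, n=3):
    print(movie["title"], movie["rating"], movie["year"])
```

Three of the five movies share a rating of 9.0, so their relative order depends on the tie-breaking rule. The dataset does not state one; this sketch simply keeps the input order.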

Which genres are most common among the top 100 movies? 

Based on the analysis of the 'genre' column from the dataset of the top 100 movies, the following conclusions can be drawn regarding the popularity of movie genres:

  • Most Popular Genre: Drama is the most common genre, with a frequency of 70. This indicates that it appears in 70 out of the top 100 movies, making it significantly prevalent.

  • Other Popular Genres: Following Drama, the genres Adventure and Action are also popular, with frequencies of 27 and 23 respectively. This shows a strong preference for high-energy and expansive cinematic experiences among the top movies.

  • Variety in Genres: Other genres such as Crime and Mystery also appear in the list with notable frequencies of 20 and 14. This diversity reflects a range of thematic interests among the top movies.

  • Statistical Overview: The average frequency of genres among the top 100 movies is 12.40, with a standard deviation of 15.38. This variation indicates that while some genres are extremely common, others are less frequently represented.

In summary, Drama clearly dominates the genre landscape among the top 100 movies, with Adventure and Action also being significantly represented. This suggests a strong audience preference for emotionally gripping narratives as well as visually dynamic and exciting content.
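Because a single movie can list several genres (for example "Action, Crime, Drama"), the frequencies above count each genre once per movie in which it appears. A minimal sketch of that counting, assuming the 'genre' column holds comma-separated strings; the helper name genre_frequencies is hypothetical, and the sample values are the genres of the five movies listed earlier:

```python
from collections import Counter
from statistics import mean, stdev

# Genre strings of the five top-rated movies listed earlier (sample only).
genre_column = [
    "Drama",
    "Crime, Drama",
    "Action, Crime, Drama",
    "Crime, Drama",
    "Crime, Drama",
]


def genre_frequencies(genre_strings):
    """Count how many movies each genre appears in (hypothetical helper)."""
    counts = Counter()
    for value in genre_strings:
        # A set ensures a genre is counted at most once per movie.
        counts.update({g.strip() for g in value.split(",") if g.strip()})
    return counts


freq = genre_frequencies(genre_column)
print(freq.most_common())

# Summary statistics over the per-genre counts. Whether the figures in the
# article use the sample or the population standard deviation is not stated;
# stdev() here is the sample version.
print(mean(freq.values()), stdev(freq.values()))
```

Since one movie contributes to several genres, the frequencies can sum to more than the number of movies.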

What are the similarities in the themes of the top 100 movies?

Ultimate Conclusion: Common Themes in Top 100 Movie Plots

Based on the analysis of the descriptions from the top 100 movies, several common themes and characteristics have been identified. The analysis involved preprocessing the 'description' column, followed by text analysis including tokenization, stopword removal, and frequency analysis.
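The pipeline described above (tokenization, stopword removal, frequency analysis) can be sketched roughly as follows. This is an illustration under assumptions: the stopword list is a tiny hand-picked set rather than the one Powerdrill used, the helper word_frequencies is hypothetical, and the two sample descriptions are taken from the Godfather entries earlier in this article.

```python
import re
from collections import Counter

# A deliberately small, illustrative stopword list (an assumption; a real
# analysis would use a fuller list from an NLP library).
STOPWORDS = {
    "the", "a", "an", "and", "of", "in", "on", "to", "is", "his", "while",
}

descriptions = [
    "The aging patriarch of an organized crime dynasty in postwar New York "
    "City transfers control of his clandestine empire to his reluctant "
    "youngest son.",
    "The early life and career of Vito Corleone in 1920s New York City is "
    "portrayed, while his son, Michael, expands and tightens his grip on "
    "the family crime syndicate.",
]


def word_frequencies(texts, stopwords=STOPWORDS):
    """Lowercase, tokenize, drop stopwords, and count the remaining words."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts


freq = word_frequencies(descriptions)
print(freq.most_common(5))
```

Lowercasing and splitting on non-alphanumeric characters is also why a token such as 'ii' can show up: "World War II" becomes the tokens world, war, and ii.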

Key Themes Identified:

  • War: The word 'war' appears frequently (12 times), indicating that conflict, especially on a large scale, is a prevalent theme in these top movies.

  • Family: The theme of family is also significant, with a frequency of 7, suggesting that familial relationships and dynamics are central to many of the plots.

  • Help: Appearing 12 times, this theme underscores the importance of assistance, support, or rescue in the narratives of these films.

  • Life: With a frequency of 11, this theme highlights the exploration of life’s complexities and the human condition in movie plots.

  • New: Mentioned 7 times, indicating themes of new beginnings, changes, or experiences.

Additional Observations:

  • The presence of the word 'ii' (frequency: 6) might indicate sequels or historical films related to World War II, reflecting a specific interest in historical or continuation themes.

  • The word 'son' (frequency: 9) from the initial preprocessing suggests that relationships, particularly paternal ones, might be a focal point in several plots.

Conclusion:

The common themes in the plots of the top 100 movies predominantly revolve around war, family, help, life, and new experiences. These themes suggest that the top movies often tackle broad, relatable issues that resonate with a wide audience, encompassing both personal and societal conflicts.

What is the average rating of movies released in the 20th and 21st centuries?

Conclusion

The top 100 movies were released between the years 1931 and 2023.

Average Movie Ratings by Century:

20th Century:

  • Average Rating: 8.52

21st Century:

  • Average Rating: 8.52

Observation:

The average ratings for movies released in the 20th and 21st centuries are remarkably similar, both rounding to approximately 8.52. This indicates a consistent quality and appreciation of movies across the two centuries.
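Grouping by century comes down to mapping each release year to a century and averaging the ratings in each group. A minimal sketch, assuming the convention that the 20th century runs from 1901 to 2000 (the analysis does not say which convention it used); the helper names and field names are hypothetical, and the sample rows are the five movies listed earlier:

```python
from collections import defaultdict
from statistics import mean

# Sample rows from the five movies listed earlier (not the full dataset).
movies = [
    {"title": "The Shawshank Redemption", "rating": 9.3, "year": 1994},
    {"title": "The Godfather", "rating": 9.2, "year": 1972},
    {"title": "The Dark Knight", "rating": 9.0, "year": 2008},
    {"title": "The Godfather Part II", "rating": 9.0, "year": 1974},
    {"title": "12 Angry Men", "rating": 9.0, "year": 1957},
]


def century_of(year):
    """Map a year to its century, assuming 1901-2000 is the 20th century."""
    return (year - 1) // 100 + 1


def average_rating_by_century(rows):
    """Return {century: mean rating} for the given rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[century_of(row["year"])].append(row["rating"])
    return {century: mean(ratings) for century, ratings in sorted(groups.items())}


for century, avg in average_rating_by_century(movies).items():
    print(f"Century {century}: {avg:.2f}")
```

If the year 2000 is instead treated as part of the 21st century, any movie released that year moves to the other group, which can shift both averages slightly.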

How does the average rating of movies vary by decade? 

Conclusion on Average Movie Ratings Across Decades

Overview

The analysis of average movie ratings across various decades reveals a fluctuating trend in how movies have been rated over time. The data spans from the 1930s to the 2020s.

Key Observations from Data

  • Highest Average Rating: The 1970s showcased the highest average movie rating at approximately 8.72.

  • Lowest Average Rating: The lowest average rating occurred in the 1930s, with a rating of about 8.43.

  • General Trend: There is a notable variability in ratings across decades, with peaks in the 1970s and a significant dip in the 1980s.

Visual Analysis

The line chart provided visualizes the average movie ratings over the decades:

  • Peak Performance: The 1970s stand out with a peak, indicating a period where movies were rated exceptionally high.

  • Volatility: Post-1970s, there is a sharp decline in the 1980s followed by fluctuations in subsequent decades.

  • Recent Trends: The 2020s show an upward trend in ratings, suggesting a possible resurgence in higher-quality movie productions or changes in rating behaviors.

Conclusion

The average movie ratings by decade indicate that viewer preferences, industry standards, and possibly the quality of movie productions have varied significantly over the past century. The 1970s were the golden era in terms of high ratings, while other decades experienced more moderate evaluations. This historical perspective can be useful for understanding shifts in the film industry and audience reception over time.
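The per-decade averages can be computed by flooring each release year to its decade and averaging within each group. The sketch below is illustrative only: the helper names and field names are assumptions, and the sample uses the five movies listed earlier, so its numbers differ from the full-dataset figures above.

```python
from collections import defaultdict
from statistics import mean

# (year, rating) pairs of the five movies listed earlier (sample only).
year_ratings = [(1994, 9.3), (1972, 9.2), (2008, 9.0), (1974, 9.0), (1957, 9.0)]


def decade_of(year):
    """Floor a year to the start of its decade, e.g. 1974 -> 1970."""
    return year // 10 * 10


def average_rating_by_decade(pairs):
    """Return {decade: mean rating} in chronological order."""
    groups = defaultdict(list)
    for year, rating in pairs:
        groups[decade_of(year)].append(rating)
    return {decade: mean(ratings) for decade, ratings in sorted(groups.items())}


by_decade = average_rating_by_decade(year_ratings)
best = max(by_decade, key=by_decade.get)
worst = min(by_decade, key=by_decade.get)
print(by_decade)
print(f"Highest average: {best}s, lowest average: {worst}s")
```

With only 100 movies spread over ten decades, some decades contain very few titles, so a single movie can move a decade's average noticeably.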

What is the distribution of movie ratings in the dataset? 

Conclusion on the Distribution of Movie Ratings

Overview of Rating Statistics:

  • Mean Rating: 8.77

  • Standard Deviation: 0.33

  • Minimum Rating: 8.30

  • Maximum Rating: 9.30

Histogram Analysis:

  • The histogram visualizes the frequency of movie ratings in the dataset.

  • Most Frequent Rating Range: The 8.3 to 8.5 range has the highest frequency, peaking at 8.4 with 27 movies.

  • Decreasing Frequency: There is a general trend of decreasing frequency as the rating increases, with fewer movies achieving higher ratings.

Key Observations:

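  • High Average Rating: The reported average of 8.77 is high, as expected for a selection of top-rated movies. It cannot be reconciled with the 8.52 average reported for both centuries above, so one of the two figures should be treated with caution.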

  • Narrow Spread of Ratings: The standard deviation is low (0.33), showing that the ratings fall within a narrow band between 8.3 and 9.3.

  • Popular Ratings: Ratings between 8.3 and 8.6 are more common, suggesting a concentration of movies in this high-quality range.

Visual Representation:

  • The histogram effectively shows the distribution with a clear right-skewed pattern, where lower ratings (though still high on an absolute scale) are more frequent than very high ratings.

  • This analysis provides a comprehensive view of how movie ratings are distributed within the dataset, highlighting a collection of generally high-rated films with a tendency towards ratings just above 8.3.
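The summary statistics and histogram counts in this section can be reproduced with a few lines of code. This is a minimal sketch under assumptions: the helper name rating_summary is hypothetical, the sample holds only the ratings of the five movies listed earlier, and stdev() is the sample standard deviation (the analysis does not say which variant it used).

```python
from collections import Counter
from statistics import mean, stdev

# Ratings of the five movies listed earlier (sample only, not all 100).
ratings = [9.3, 9.2, 9.0, 9.0, 9.0]


def rating_summary(values):
    """Return basic descriptive statistics plus a count per distinct rating."""
    return {
        "mean": mean(values),
        "std": stdev(values),  # sample standard deviation (assumed variant)
        "min": min(values),
        "max": max(values),
        "counts": Counter(values),
    }


summary = rating_summary(ratings)
print(f"mean={summary['mean']:.2f} std={summary['std']:.2f}")

# A text histogram: one '#' per movie at each rating, lowest rating first.
for rating, count in sorted(summary["counts"].items()):
    print(f"{rating:.1f} {'#' * count}")
```

Note that the mean must be taken over every movie's rating. Averaging only the distinct rating values gives a different, usually higher, number when most movies sit at the low end of the range.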

